Overview

Developers can access Run:ai through various programmatic interfaces.

API Architecture

Run:ai is composed of a single, multi-tenant control plane. Each tenant can be connected to one or more GPU clusters. See Run:ai system components for detailed information.

The following programming interfaces are available:

| API | Description | Purpose |
|---|---|---|
| Control Plane API | Access the control plane to get and modify business objects | The API most commonly used by system developers. It is also used by the Run:ai user interface and by the new command-line interface |
| Cluster API | Submit Workloads directly to the Cluster | A YAML-based API that allows submission of Workloads directly to the Cluster. As of Run:ai 2.18, this API is replaced by a Control Plane API to submit jobs, which is now the recommended method |
| Metrics API (deprecated) | Get cluster metrics | Get utilization metrics directly from the monitoring agent (Prometheus). This API is in the process of being deprecated and is replaced by metric-specific Control Plane APIs |

Control Plane API

Allows you to add, delete, modify, and list Run:ai metadata objects such as Projects, Departments, and Users. For clusters running Run:ai 2.18 and above, it also allows you to submit Workloads.

The API is provided as a REST API and is accessible via the control plane endpoint.

For more information see Control Plane REST API.
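As a minimal sketch of what calling a REST API on the control plane endpoint can look like, the Python snippet below builds an authenticated JSON request without sending it. The base URL, the path, and the use of a bearer token are illustrative assumptions, not values taken from the API reference; consult the Control Plane REST API and API Authentication documents for the real endpoints and the supported authentication flow.

```python
import json
import urllib.request

# Assumptions (placeholders, not from the Run:ai API reference):
BASE_URL = "https://runai.example.com"    # hypothetical control plane endpoint
EXAMPLE_PATH = "/api/v1/example-objects"  # hypothetical resource path


def build_request(base_url, path, token, method="GET", payload=None):
    """Build (but do not send) a JSON request carrying a bearer token."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        base_url.rstrip("/") + path, data=data, method=method
    )
    # Assumption: the API accepts a bearer token in the Authorization header.
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Accept", "application/json")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Usage: build a request, then send it with urllib.request.urlopen(req).
list_request = build_request(BASE_URL, EXAMPLE_PATH, token="<your-token>")
```

Separating request construction from sending keeps the sketch easy to inspect and test; sending is a single `urllib.request.urlopen(list_request)` call once real values are filled in.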

Important

The endpoints and fields specified in the API reference are the ones that are officially supported by Run:ai. Endpoints and fields that are not listed in the API reference are not supported.

Run:ai does not recommend using API endpoints and fields marked as deprecated and will not add functionality to them. Once an API endpoint or field is marked as deprecated, Run:ai will stop supporting it after 2 major releases for self-hosted deployments, and after 6 months for SaaS deployments.

For details, see the Deprecation notifications.
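The support windows above can be turned into a rough planning aid. The sketch below is illustrative only. It assumes that Run:ai versions are written as `2.N` and that each increment of `N` counts as one major release, and it reads "after 2 major releases" as "support may end in the release two increments later". Both function names are hypothetical, and the Deprecation notifications remain the authoritative source.

```python
import calendar
from datetime import date


def saas_support_end(deprecated_on: date) -> date:
    """Return the date six months after the deprecation date (SaaS rule)."""
    month_index = deprecated_on.month - 1 + 6
    year = deprecated_on.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day so that, for example, August 31 maps to the end of February.
    day = min(deprecated_on.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def self_hosted_support_end(deprecated_in: str) -> str:
    """Return the version two major releases later (self-hosted rule).

    Assumption: versions look like "2.18" and the second number is the
    major release counter.
    """
    prefix, release = deprecated_in.split(".")[:2]
    return f"{prefix}.{int(release) + 2}"


print(saas_support_end(date(2024, 1, 15)))
print(self_hosted_support_end("2.18"))
```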

Cluster API

The Cluster API allows you to submit Workloads directly to the cluster and to delete them.

The API is provided as a Kubernetes API.

The Cluster API is accessible through the GPU cluster itself. As a result, each cluster has its own endpoint.

Important

  • This API is replaced by a Control Plane API to submit jobs, which is now the recommended method for cluster versions 2.18 and above.
  • If you want to automate tasks with older versions of Run:ai, it is best to use the Run:ai command-line interface, which provides forward compatibility.
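The version guidance in the notes above can be expressed as a small helper for automation scripts. This is a sketch under stated assumptions: the function names and the returned labels are hypothetical, and the cluster version is assumed to be a dotted numeric string such as `2.18`, optionally prefixed with `v`.

```python
def parse_version(version: str) -> tuple:
    """Turn "2.18" or "v2.18.1" into a tuple of integers for comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def recommended_submission_interface(cluster_version: str) -> str:
    """Pick the interface suggested for submitting Workloads.

    The returned strings are illustrative labels, not Run:ai identifiers.
    """
    if parse_version(cluster_version) >= (2, 18):
        return "control-plane-api"
    # Older clusters: the command-line interface provides forward compatibility.
    return "command-line-interface"


print(recommended_submission_interface("2.18"))
print(recommended_submission_interface("2.9"))
```

Comparing integer tuples rather than raw strings matters here: as strings, `"2.9"` sorts after `"2.18"`, which would misclassify older clusters.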

Metrics API

Retrieve metrics from multiple GPU clusters.

See the Metrics API document.

API Authentication

See API Authentication for information on how to gain authenticated access to Run:ai APIs.