Network requirements

The following network requirements apply to installing and using the NVIDIA Run:ai components.

Installation

Inbound rules

| Name | Description | Source | Destination | Port |
|---|---|---|---|---|
| Installation via BCM | SSH access | Installer machine | NVIDIA Base Command Manager head nodes | 22 |

Outbound rules

| Name | Description | Source | Destination | Port |
|---|---|---|---|---|
| Container Registry | Pull NVIDIA Run:ai images | All Kubernetes nodes | runai.jfrog.io | 443 |
| Helm repository | NVIDIA Run:ai Helm repository for installation | Installer machine | runai.jfrog.io | 443 |

The NVIDIA Run:ai installation has software requirements that call for additional components to be installed on the cluster. This article includes simple, optional installation examples for those components. If you use them, the following cluster outbound ports must be open:

| Name | Description | Source | Destination | Port |
|---|---|---|---|---|
| Kubernetes Registry | Ingress Nginx image repository | All Kubernetes nodes | registry.k8s.io | 443 |
| Google Container Registry | GPU Operator and Knative image repository | All Kubernetes nodes | gcr.io | 443 |
| Red Hat Container Registry | Prometheus Operator image repository | All Kubernetes nodes | quay.io | 443 |
| Docker Hub Registry | Training Operator image repository | All Kubernetes nodes | docker.io | 443 |
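
The installation outbound rules above can be spot-checked from a node before installing. The following is a minimal sketch and not part of NVIDIA Run:ai: the names `INSTALL_ENDPOINTS`, `tcp_connect`, and `check_endpoints` are hypothetical, and the host list simply copies the destinations from the tables above. A successful TCP connection only shows that the port is reachable. It does not prove that an image pull or Helm download will succeed through a proxy or a TLS-inspecting firewall.

```python
import socket

# Destinations and port copied from the outbound tables above.
INSTALL_ENDPOINTS = [
    ("runai.jfrog.io", 443),
    ("registry.k8s.io", 443),
    ("gcr.io", 443),
    ("quay.io", 443),
    ("docker.io", 443),
]


def tcp_connect(host, port, timeout=5.0):
    """Open and immediately close a TCP connection; raises OSError on failure."""
    with socket.create_connection((host, port), timeout=timeout):
        pass


def check_endpoints(endpoints, connect=tcp_connect):
    """Return {(host, port): True/False} describing which endpoints accepted a connection."""
    results = {}
    for host, port in endpoints:
        try:
            connect(host, port)
            results[(host, port)] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[(host, port)] = False
    return results
```

Calling `check_endpoints(INSTALL_ENDPOINTS)` on a Kubernetes node returns a dictionary that maps each `(host, port)` pair to `True` or `False`. The `connect` parameter exists only so the network call can be swapped out, for example in a test.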

External access

The domains to whitelist and the ports to open for installing, upgrading, and using the application and its management are listed below.

Note

Ensure the inbound and outbound rules are correctly applied to your firewall.

Inbound rules

To allow your organization’s NVIDIA Run:ai users to interact with the cluster using the NVIDIA Run:ai command-line interface, or to access specific UI features, certain inbound ports must be open:

| Name | Description | Source | Destination | Port |
|---|---|---|---|---|
| NVIDIA Run:ai control plane | HTTPS entrypoint | 0.0.0.0 | NVIDIA Run:ai system nodes | 443 |
| NVIDIA Run:ai cluster | HTTPS entrypoint | RFC1918 private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) | NVIDIA Run:ai system nodes | 443 |
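
When reviewing firewall rules or client addresses against the cluster entrypoint rule above, it can help to test whether a given source address falls inside the three RFC1918 ranges. The following is a minimal sketch using Python's standard `ipaddress` module. The names `RFC1918_NETWORKS` and `is_rfc1918` are hypothetical helpers, not part of any NVIDIA Run:ai tooling.

```python
import ipaddress

# The three RFC1918 private ranges listed in the inbound rules table.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address lies in one of the RFC1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


print(is_rfc1918("10.1.2.3"))        # True
print(is_rfc1918("172.31.255.255"))  # True (last address in 172.16.0.0/12)
print(is_rfc1918("172.32.0.1"))      # False (just outside 172.16.0.0/12)
print(is_rfc1918("8.8.8.8"))         # False
```

The explicit network list is used instead of the `is_private` attribute because `is_private` also covers ranges such as loopback and link-local addresses, which the table does not list.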

Outbound rules

Note

These outbound rules apply to the NVIDIA Run:ai cluster component only. If the NVIDIA Run:ai cluster is installed together with the NVIDIA Run:ai control plane, the NVIDIA Run:ai cluster FQDN refers to the NVIDIA Run:ai control plane FQDN.

For the NVIDIA Run:ai cluster installation and usage, certain outbound ports must be open:

| Name | Description | Source | Destination | Port |
|---|---|---|---|---|
| Cluster sync | Sync NVIDIA Run:ai cluster with NVIDIA Run:ai control plane | NVIDIA Run:ai system nodes | NVIDIA Run:ai control plane FQDN | 443 |
| Metric store | Push NVIDIA Run:ai cluster metrics to NVIDIA Run:ai control plane's metric store | NVIDIA Run:ai system nodes | NVIDIA Run:ai control plane FQDN | 443 |

Internal network

Ensure that all Kubernetes nodes can communicate with each other across all necessary ports. Kubernetes assumes full interconnectivity between nodes, so you must configure your network to allow this seamless communication. Specific port requirements may vary depending on your network setup.
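
Full interconnectivity means that every node must be able to reach every other node, in both directions. As a rough illustration of what a connectivity test plan has to cover, the sketch below enumerates every ordered pair of distinct nodes. The function name `node_pairs` and the node names are hypothetical. The ports to test for each pair are deliberately left out, since they depend on your network setup.

```python
from itertools import permutations


def node_pairs(nodes):
    """Return every ordered (source, destination) pair of distinct nodes."""
    return list(permutations(nodes, 2))


# Hypothetical node names, for illustration only.
pairs = node_pairs(["node-a", "node-b", "node-c"])
print(len(pairs))  # 6, that is n * (n - 1) for n = 3
```

Each direction is listed separately because a firewall may allow traffic from one node to another while blocking the reverse path.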