Network requirements¶

The following network requirements apply to the installation and usage of the NVIDIA Run:ai components.

Installation¶

Inbound rules¶

Name	Description	Source	Destination	Port
Installation via BCM	SSH access	Installer machine	NVIDIA Base Command Manager head nodes	22

Outbound rules¶

Name	Description	Source	Destination	Port
Container Registry	Pull NVIDIA Run:ai images	All Kubernetes nodes	runai.jfrog.io	443
Helm repository	NVIDIA Run:ai Helm repository for installation	Installer machine	runai.jfrog.io	443

The NVIDIA Run:ai installation has software requirements that call for additional components to be installed on the cluster. This article includes simple, optional installation examples for those components. If you use them, the following outbound ports must be open on the cluster:

Name	Description	Source	Destination	Port
Kubernetes Registry	Ingress Nginx image repository	All Kubernetes nodes	registry.k8s.io	443
Google Container Registry	GPU Operator and Knative image repository	All Kubernetes nodes	gcr.io	443
Red Hat Container Registry	Prometheus Operator image repository	All Kubernetes nodes	quay.io	443
Docker Hub Registry	Training Operator image repository	All Kubernetes nodes	docker.io	443
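
Before installing, it can help to confirm from a Kubernetes node that the outbound destinations in the tables above accept TCP connections on port 443. The following is a minimal Python sketch, not part of NVIDIA Run:ai. The names `OUTBOUND_ENDPOINTS`, `tcp_reachable`, and `check_endpoints` are hypothetical and introduced only for illustration. A successful TCP connection shows that the firewall allows the port. It does not prove that an image pull or Helm download will succeed, for example when a proxy or TLS inspection is in place.

```python
import socket

# Outbound destinations copied from the tables above; all use TCP port 443.
OUTBOUND_ENDPOINTS = [
    ("runai.jfrog.io", 443),
    ("registry.k8s.io", 443),
    ("gcr.io", 443),
    ("quay.io", 443),
    ("docker.io", 443),
]


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_endpoints(endpoints, probe=tcp_reachable):
    """Map each "host:port" string to the result of probing it.

    The probe is injectable so the logic can be exercised without
    network access.
    """
    return {f"{host}:{port}": probe(host, port) for host, port in endpoints}
```

Calling `check_endpoints(OUTBOUND_ENDPOINTS)` on a node returns a dictionary such as `{"runai.jfrog.io:443": True, ...}`. Any entry mapped to `False` points to a destination that the firewall, DNS, or routing is likely blocking.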

External access¶

The domains to whitelist and the ports to open for installation, upgrade, usage, and management of the application are listed below.

Note

Ensure the inbound and outbound rules are correctly applied to your firewall.

Inbound rules¶

To allow your organization’s NVIDIA Run:ai users to interact with the cluster using the NVIDIA Run:ai Command-line interface, or access specific UI features, certain inbound ports need to be open:

Name	Description	Source	Destination	Port
NVIDIA Run:ai control plane	HTTPS entrypoint	0.0.0.0	NVIDIA Run:ai system nodes	443
NVIDIA Run:ai cluster	HTTPS entrypoint	RFC1918 private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)	NVIDIA Run:ai system nodes	443
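
The cluster entrypoint rule above limits the source to the RFC1918 private ranges. When auditing firewall rules or client addresses, a quick membership check can be useful. The sketch below uses Python's standard `ipaddress` module. The names `RFC1918_NETWORKS` and `is_rfc1918` are hypothetical and introduced only for illustration.

```python
import ipaddress

# The three RFC1918 private ranges listed in the inbound rules table.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address falls inside one of the RFC1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)
```

For example, `is_rfc1918("172.31.255.255")` is true because 172.16.0.0/12 spans 172.16.0.0 through 172.31.255.255, while `is_rfc1918("172.32.0.1")` is false.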

Outbound rules¶

Note

These outbound rules apply to the NVIDIA Run:ai cluster component only. If the NVIDIA Run:ai cluster is installed together with the NVIDIA Run:ai control plane, the NVIDIA Run:ai cluster FQDN refers to the NVIDIA Run:ai control plane FQDN.

For the NVIDIA Run:ai cluster installation and usage, certain outbound ports must be open:

Name	Description	Source	Destination	Port
Cluster sync	Sync NVIDIA Run:ai cluster with NVIDIA Run:ai control plane	NVIDIA Run:ai system nodes	NVIDIA Run:ai control plane FQDN	443
Metric store	Push NVIDIA Run:ai cluster metrics to NVIDIA Run:ai control plane's metric store	NVIDIA Run:ai system nodes	NVIDIA Run:ai control plane FQDN	443

Internal network¶

Ensure that all Kubernetes nodes can communicate with each other across all necessary ports. Kubernetes assumes full interconnectivity between nodes, so you must configure your network to allow this seamless communication. Specific port requirements may vary depending on your network setup.