Installing Red Hat OpenShift AI Self-Managed 3.4(latest) in Red Hat Openshift 4.22(latest) on AWS: A Shell-Based Guide

A compact, annotated shell steps to install Red Hat Openshift AI on OpenShift 4.22 (latest version) on AWS - with a ready-to-edit install-config.yaml, sensible defaults, and the trade-offs you should know before you press enter.

openshift
cloud
aws
devops
kubernetes
Author
Affiliation

Independent

Published

June 10, 2026

Modified

June 10, 2026

Keywords

OpenShift, RHAI, AWS, IPI, install-config, kubernetes, devops, NVIDIA, AI, self-managed

TL;DR — Save the shell script below, edit OCP_VERSION and your install-config.yaml, then run openshift-install create cluster --dir ocp-install. The full annotated walkthrough is below.

This post documents a compact, repeatable shell-based workflow used to install OpenShift 4.x on AWS. I extracted the working script, annotated each step, and included a sample install-config.yaml for a minimal IPI installation.

Why this matters

Repeatable installs are the difference between a cluster you understand and one you merely possess.

  • Repeatability — encapsulates download, install, and bootstrap steps.
  • Auditability — the full script appears below so you can review before running.
  • Tunable — variables like OCP_VERSION and ARCH are exposed for quick edits.
Figure 1: High-level installation flow.

AWS prerequisites — at a glance

Before you run the installer, you need an EC2 host to drive the install and a public Route 53 hosted zone matching your baseDomain.

Figure 2: Create the installer EC2 host.
Figure 3: Create the public Route 53 hosted zone.

Both screenshots show the minimum configuration that satisfies the OpenShift installer’s IAM and DNS expectations.

TipPick a specific version

Avoid latest-4.22 for production runs — pin to a release like 4.22.3 so the install is reproducible months later when latest has moved on.

Prerequisites

The script (annotated)

Rather than dropping the full ocp-install.sh here as one block, this section walks through the script one command group at a time so you can copy, adapt, and run each step independently.

The whole file is also available as a single download: ocp-install.sh. The sections below correspond 1:1 to the blocks inside it.

1. Patch the host

A predictable starting state. Run a full package update so the build host isn’t carrying half-applied kernel or curl patches when the installer hits the network.

sudo yum update -y

2. Working directory and pinned versions

Everything below assumes a dedicated tools/ directory and three exported variables. Pinning OCP_VERSION to a specific release (not latest-*) is what makes the install reproducible weeks later.

mkdir -p tools && cd tools/

export OCP_VERSION="latest-4.22"   # pin to e.g. 4.22.3 for production
export ARCH="x86_64"
export HOST_ARCH="$(uname -m)"

3. Install the oc and kubectl clients

Download the client tarball from the Red Hat mirror, extract the two binaries, and move them onto PATH. The -fsSL flags make curl fail loudly on HTTP errors instead of silently writing an HTML error page into oc.tar.gz.

curl -fsSLk \
  "https://mirror.openshift.com/pub/openshift-v4/${HOST_ARCH}/clients/ocp/${OCP_VERSION}/openshift-client-linux.tar.gz" \
  -o oc.tar.gz

tar zxf oc.tar.gz
rm -f oc.tar.gz README.md
chmod +x oc kubectl
sudo mv oc kubectl /usr/local/bin/

Verify before moving on:

oc version --client
kubectl version --client
Client Version: 4.22.0
Kustomize Version: v5.7.1
Client Version: v1.35.2
Kustomize Version: v5.7.1

4. Install the openshift-install binary

Same pattern, different tarball. This is the installer that orchestrates the bootstrap node, control plane, and AWS resources.

curl -fsSLk \
  "https://mirror.openshift.com/pub/openshift-v4/${HOST_ARCH}/clients/ocp/${OCP_VERSION}/openshift-install-linux.tar.gz" \
  -o openshift-install-linux.tar.gz

tar zxf openshift-install-linux.tar.gz
rm -f openshift-install-linux.tar.gz README.md
chmod +x openshift-install
sudo mv openshift-install /usr/local/bin/

Verify the version matches OCP_VERSION:

[ec2-user@ip-172-31-27-222 ~]$ openshift-install version
openshift-install 4.22.0
built from commit 92cb4e39665966fd3128abd6256b13aa56523eeb
release image quay.io/openshift-release-dev/ocp-release@sha256:283887f2860a745387608d106e70e5be2314df2497ee08c69e7bc669ca091340
release architecture amd64

5. Generate the cluster SSH key

The installer bakes this public key into every node so you can ssh core@… later for diagnostics. Use a dedicated key per cluster — don’t reuse your personal one.

ssh-keygen -t ed25519 -N '' -f ~/.ssh/rhoai-demo
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/rhoai-demo

~/.ssh/rhoai-demo is a private key. Add it to .gitignore and never copy it into a container image or CI artifact.

6. Launch the install

Drop the install-config.yaml from the next section into ./ocp-install/, then hand control over to the installer. The --dir flag is what makes re-runs, gather, and destroy work consistently against the same state.

mkdir -p ocp-install
# Place install-config.yaml inside ./ocp-install/ first, then:
openshift-install create cluster --dir ocp-install --log-level=info

Expect 30–45 minutes for an AWS IPI install. Watch progress with: ## Prepare the install-config.yaml file:

Adapt baseDomain, instance types, and pullSecret before use.

[ec2-user@ip-172-31-27-222 ~]$ mkdir -p ocp-install
# Place install-config.yaml inside ./ocp-install/ first, then:
[ec2-user@ip-172-31-27-222 ~]$ cat ocp-install/install-config.yaml
apiVersion: v1
baseDomain: belowthestack.dev
compute:
  - architecture: amd64
    hyperthreading: Enabled
    name: worker
    platform:
      aws:
        type: m6i.4xlarge
    replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    aws:
      type: g6.12xlarge
      rootVolume:
        size: 1000
  replicas: 1
metadata:
  name: rhoai-demo
networking:
  clusterNetwork:
    - cidr: 10.128.0.0/14
      hostPrefix: 23
  machineNetwork:
    - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
    - 172.30.0.0/16
platform:
  aws:
    region: eu-west-3
publish: External
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"<token>","email":"<email>"}}}'
sshKey: 'ssh-ed25519 AAAA...redacted...'

What a successful install looks like

After the bootstrap phase finishes and the cluster operators settle, the installer prints a final summary block. If you see something close to the output below, the cluster is up and the web console is reachable.

 [ec2-user@ip-172-31-27-222 ~]$ openshift-install create cluster --dir ocp-install/ --log-level=info
{.text filename="installer output"}
INFO Waiting up to 30m0s (until 3:53AM UTC) to ensure each cluster operator has finished progressing...
INFO All cluster operators have completed progressing
INFO Checking to see if there is a route at openshift-console/console...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run
INFO     export KUBECONFIG=/home/ec2-user/ocp-install/auth/kubeconfig
INFO Access the OpenShift web-console here: https://console-openshift-console...
INFO Login to the console with user: "kubeadmin", and password: "UoGgI-xxxx"
INFO Time elapsed: 29s

Two things worth doing immediately:

# 1. Wire up oc/kubectl against the new cluster
export KUBECONFIG=/home/ec2-user/ocp-install/auth/kubeconfig
oc get nodes
oc get clusteroperators
# 2. Save (and then rotate) the kubeadmin credentials
cat ocp-install/auth/kubeadmin-password

The kubeadmin account is a temporary bootstrap credential. As soon as you’ve wired up a real identity provider (OIDC, htpasswd, LDAP…) and granted cluster-admin to a real user, delete the kubeadmin secret:

oc delete secret kubeadmin -n kube-system
WarningNever commit secrets

pullSecret and the contents of ~/.ssh/rhoai-demo are credentials. Add install-config.yaml, *.pem, and auth/ to your .gitignore before running the installer — the installer mutates this file and writes auth/kubeadmin-password and auth/kubeconfig next to it.

Inside the OpenShift web console

Browsing to the URL printed by the installer (https://console-openshift-console.apps.rhoai-demo.xxxx.dev) and logging in with kubeadmin lands on the cluster Overview. This is the first place I check after every install — it confirms the control plane, operators, and Insights all came up green before any workloads are scheduled.

Figure 4: Red Hat OpenShift 4.22 web console — cluster Overview right after install.

A few things worth noticing in Figure 4:

  • Top banner“You are logged in as a temporary administrative user…” is the cluster reminding you that kubeadmin is bootstrap-only and you should configure a real identity provider via cluster OAuth.
  • Status tiles — Control Plane, Operators, and Insights are all green. Single control plane node matches the controlPlane.replicas: 1 lab topology from the install-config.
  • Details panel — confirms OpenShift version: 4.22.0, Infrastructure provider: AWS, and the cluster API endpoint https://api.rhoai-demo.xxxx.dev:6443 used by oc/kubectl.
  • AlertmanagerReceiversNotConfigured — expected on a fresh cluster. Wire up a receiver (PagerDuty, Slack, email…) before treating any workload as production.

The console’s Getting started resources panel (“Add identity providers”, “Configure alert receivers”, “Take console tour”) is a sensible day-0 checklist — work through it before installing the RHOAI operator on top.

Installing Red Hat OpenShift AI 3.4

With OpenShift 4.22 healthy, RHOAI installs as a stack of operators plus two custom resources — a DSCInitialization (one-time bootstrap) and a DataScienceCluster (the components you actually want enabled). The flow below uses the CLI end-to-end so it is scriptable and easy to re-run.

1. Install the prerequisite operators

Each prerequisite below gets its own namespace, its own OperatorGroup, and a pinned Subscription. This is more verbose than dropping everything into openshift-operators, but it makes each operator’s blast radius and upgrade lifecycle explicit — and gives you something concrete to delete if you ever want to uninstall one cleanly.

Save each manifest to a file and apply with oc apply -f <file>. The OperatorGroup scope (empty spec for AllNamespaces, explicit targetNamespaces for OwnNamespace) matches each operator’s supported install mode.

1.1 Red Hat OpenShift Serverless

Serverless (Knative) is what backs KServe model serving in RHOAI 3.4. Without it, the kserve component will refuse to come up healthy.

op-serverless.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: serverless-operator
  namespace: openshift-serverless
spec:
  channel: stable
  name: serverless-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
oc apply -f op-serverless.yaml

1.2 Red Hat OpenShift Service Mesh

Service Mesh (Istio) provides the ingress gateway and mTLS plumbing that KServe + the RHOAI dashboard sit behind.

op-servicemesh.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: servicemeshoperator3
  namespace: openshift-operators
spec:
  channel: stable
  installPlanApproval: Automatic
  name: servicemeshoperator3
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: servicemeshoperator3.v3.3.3
oc apply -f op-servicemesh.yaml

1.3 Red Hat OpenShift Pipelines

Pipelines (Tekton) is the engine behind RHOAI’s Data Science Pipelines component.

op-pipelines.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-pipelines-operator-rh
  namespace: openshift-operators
spec:
  channel: latest
  installPlanApproval: Automatic
  name: openshift-pipelines-operator-rh
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: openshift-pipelines-operator-rh.v1.22.2
oc apply -f op-pipelines.yaml

1.4 Authorino

Authorino handles the auth layer in front of KServe inference endpoints.

op-authorino.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: authorino-operator
  namespace: openshift-operators
spec:
  channel: stable
  installPlanApproval: Automatic
  name: authorino-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: authorino-operator.v1.2.2
oc apply -f op-authorino.yaml

1.5 Red Hat cert-manager Operator

cert-manager issues and rotates the TLS certificates that KServe uses for its inference endpoints and that Service Mesh consumes for ingress gateway termination. RHOAI 3.4 will refuse to bring kserve to Ready if no cert-manager CRDs are present in the cluster.

op-cert-manager.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-cert-manager-operator
  namespace: cert-manager-operator
spec:
  targetNamespaces:
    - openshift-nfd
  upgradeStrategy: Default
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-cert-manager-operator
  namespace: cert-manager-operator
spec:
  channel: stable-v1
  name: openshift-cert-manager-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
oc apply -f op-cert-manager.yaml

After the operator is up, create a ClusterIssuer (Let’s Encrypt, an internal CA, or a self-signed one for lab clusters) so KServe and other RHOAI components have somewhere to ask for certificates. You can do this after the RHOAI install — KServe will pick it up when it’s there.

1.6 JobSet Operator

Batch-style group scheduling for distributed training. RHOAI’s Training Operator (PyTorchJob, RayJob, etc.) hands work off to JobSet so all workers come up together (gang scheduling) instead of straggling.

op-jobset.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-jobset-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: jobset-operators
  namespace: openshift-jobset-operator
spec:
  targetNamespaces:
    - openshift-jobset-operator
  upgradeStrategy: Default
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: jobset-operator
  namespace: openshift-jobset-operator
spec:
  channel: stable-v1.0
  installPlanApproval: Automatic
  name: job-set
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: jobset-operator.v1.0.0
oc apply -f op-jobset.yaml

You will need to create an instance of the JobSetOperator CR with name cluster for the RHOAI operator to successfully work.

op-jobset-instance.yaml
---
apiVersion: operator.openshift.io/v1
kind: JobSetOperator
metadata:
  name: cluster
spec:
  logLevel: Normal
  operatorLogLevel: Normal
  managementState: Managed
oc apply -f op-jobset-instance.yaml

1.7 Leader Worker Set Operator

LWS is the workload abstraction RHOAI uses for multi-host inference — splitting a single large model (Llama-405B, Granite-MoE) across multiple GPU nodes with one leader pod orchestrating N worker pods.

op-lws.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-lws-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-lws-operator
  namespace: openshift-lws-operator
spec:
  targetNamespaces:
    - openshift-lws-operator
  upgradeStrategy: Default
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: leader-worker-set
  namespace: openshift-lws-operator
spec:
  channel: stable-v1.0
  installPlanApproval: Automatic
  name: leader-worker-set
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: leader-worker-set.v1.0.0
oc apply -f op-lws.yaml

2. Enable GPUs (Node Feature Discovery + NVIDIA GPU Operator)

Skip this section on a CPU-only cluster. On a GPU worker pool, the NFD operator labels nodes that expose PCI GPU devices, and the NVIDIA GPU operator deploys the driver, container toolkit, and DCGM exporter.

2.1 Node Feature Discovery

NFD ships in AllNamespaces install mode — empty OperatorGroup spec.

op-nfd.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-nfd
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: nfd-operators
  namespace: openshift-nfd
spec:
  targetNamespaces:
    - openshift-nfd
  upgradeStrategy: Default
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nfd
  namespace: openshift-nfd
spec:
  channel: stable
  name: nfd
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
oc apply -f op-nfd.yaml

Leave the default name and settings, and click Create NodeFeatureDiscovery and wait for 5–10 minutes for GPU-labeled nodes to appear.

2.2 NVIDIA GPU Operator

The NVIDIA GPU operator ships in OwnNamespace install mode, so the OperatorGroup must list its own namespace under targetNamespaces.

op-nvidia-gpu.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: nvidia-gpu-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: nvidia-gpu-operators
  namespace: nvidia-gpu-operator
spec:
  targetNamespaces:
    - nvidia-gpu-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: gpu-operator-certified
  namespace: nvidia-gpu-operator
spec:
  channel: stable
  name: gpu-operator-certified
  source: certified-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
oc apply -f op-nvidia-gpu.yaml

2.3 Roll out the GPU driver — ClusterPolicy

Once both CSVs reach Succeeded, you create a ClusterPolicy — the CR that actually rolls out the driver DaemonSet, container toolkit, and DCGM exporter onto every GPU-labeled node.

Leave the default name and settings, and click Create ClusterPolicy and wait for 15 to 20 minutes for the driver to roll out.

3. Install the Red Hat OpenShift AI operator

3.1 Add a 4.21 CatalogSource for rhods-operator

Why this step is necessary. The default redhat-operators CatalogSource shipped with OpenShift 4.22 (registry.redhat.io/redhat/ redhat-operator-index:v4.22) does not yet contain a rhods-operator package. Until that lands, point a new CatalogSource at the 4.21 index — which does ship rhods-operator.v3.4.x — and have the RHOAI Subscription consume it.

This adds a parallel catalog (it does not replace the default redhat-operators), so other operators on the cluster keep tracking the 4.22 index as usual.

catalogsource-rhoai.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: redhat-operators-v4-21
  namespace: openshift-marketplace
spec:
  displayName: Red Hat Operators v4.21
  publisher: Red Hat
  sourceType: grpc
  image: registry.redhat.io/redhat/redhat-operator-index:v4.21
  updateStrategy:
    registryPoll:
      interval: 45m
oc apply -f catalogsource-rhoai.yaml

Remove this CatalogSource (and switch the Subscription below back to redhat-operators) once a future OpenShift 4.22.z ships rhods-operator in the default 4.22 index — otherwise you’ll be tracking a frozen index.

3.2 Subscribe to rhods-operator

RHOAI lives in redhat-ods-operator. Create the namespace, an OperatorGroup, and a pinned Subscription on channel stable-3.4 pointing at the 4.21 CatalogSource you just created.

rhoai-operator.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: redhat-ods-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: rhods-operator
  namespace: redhat-ods-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: rhods-operator
  namespace: redhat-ods-operator
spec:
  channel: stable-3.x # Make sure to use the latest 3.x channel for RHOAI 3.4.
  installPlanApproval: Automatic
  name: rhods-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace-v4.21
  startingCSV: rhods-operator.3.4.0
oc apply -f rhoai-operator.yaml

4. Create components — DataScienceCluster

This is the CR you’ll edit most often: it’s where you turn individual components on or off. The example below enables the full set used in a typical RHOAI demo (dashboard, workbenches, pipelines, model serving, training, Ray, Kueue).

dsc.yaml
apiVersion: datasciencecluster.opendatahub.io/v2
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    aipipelines:
      argoWorkflowsControllers:
        managementState: Removed 
      managementState: Removed
    dashboard:
      managementState: Removed
    feastoperator:
      managementState: Removed
    kserve:
      managementState: Removed
    kueue:
      defaultClusterQueueName: default
      defaultLocalQueueName: default
      managementState: Removed
    llamastackoperator:
      managementState: Removed
    modelregistry:
      managementState: Removed
      registriesNamespace: rhoai-model-registries
    ray:
      managementState: Removed
    trainingoperator:
      managementState: Removed
    trustyai:
      managementState: Removed
    workbenches:
      managementState: Removed
      workbenchNamespace: rhods-notebooks 
oc apply -f dsc.yaml

# Watch components come up
oc get datasciencecluster default-dsc \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

Start narrow — dashboard, workbenches, and datasciencepipelines is enough to take the platform for a spin. Flip components from Removed to Managed later by re-applying this CR.

6. Open the RHOAI dashboard

The operator publishes a Route in redhat-ods-applications. Print it and open it in your browser:

oc get route -n redhat-ods-applications rhods-dashboard \
  -o jsonpath='https://{.spec.host}{"\n"}'

Log in with the same kubeadmin you used for the OpenShift console — once RHOAI sees you have cluster-admin, the dashboard exposes Settings → User management where you can grant data scientist / admin roles to real users from your IdP.

Don’t rely on kubeadmin for RHOAI day-to-day. RHOAI ties workbench ownership and pipeline runs to the logged-in user — when you eventually delete the kubeadmin secret, anything created by it becomes orphaned. Add an IdP before sharing the cluster.

7. Add an HTPasswd identity provider

On a lab cluster, the quickest way to stop leaning on kubeadmin is an HTPasswd identity provider — a flat file of user/bcrypt pairs that OpenShift authenticates against natively. Production clusters should prefer OIDC/LDAP/GitHub, but for a demo this is two minutes of work and unblocks the RHOAI dashboard.

The RHOAI 3.x dashboard refuses to fully render for kubeadmin because that account has no User object in user.openshift.io — it’s synthesised on the fly by OAuth. RHOAI’s workbench/pipeline ownership queries need a real User, hence the need for an HTPasswd (or any other) identity provider before the dashboard “loads normally”.

a. Create the HTPasswd file and the secret it backs

htpasswd -c -B ./htpasswd mladmin
# bcrypt the password interactively, then ship the file to OpenShift:

oc create secret generic htpasswd-secret \
   --from-file=htpasswd=./htpasswd \
   -n openshift-config

The -B flag forces bcrypt (required by OpenShift) and -c creates the file fresh. Drop -c on subsequent runs so you don’t blow away existing entries.

b. Wire the secret into the cluster OAuth CR

oauth-htpasswd.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: local_users
      mappingMethod: claim
      type: HTPasswd
      htpasswd:
        fileData:
          name: htpasswd-secret
oc apply -f oauth-htpasswd.yaml

c. Grant cluster-admin to the new user

oc adm policy add-cluster-role-to-user cluster-admin mladmin

d. Wait for the OAuth pods to roll, then re-login

The cluster operator rotates the OAuth server pods after every change to the OAuth CR — it takes 1–2 minutes:

oc get pods -n openshift-authentication -w
# Ctrl-C once the new oauth-openshift-* pods are Running

Log out of kubeadmin in the OpenShift console (top-right → Log out), then log back in as mladmin. The RHOAI dashboard will now load normally and mladmin will own anything it creates.

Once mladmin (or any real user) has cluster-admin, delete the bootstrap account so it can never be used again:

oc delete secret kubeadmin -n kube-system

8. Inside the RHOAI dashboard

Logging back in as mladmin and hitting the rhods-dashboard Route lands on the RHOAI Home view. This is the first place to sanity-check that every DataScienceCluster component you enabled in step 4 came up — each section in the left rail corresponds to one managementState: Managed entry from dsc.yaml.

Figure 5: Red Hat OpenShift AI 3.4 dashboard — landing page right after install, logged in as mladmin.

A few things worth noticing in Figure 5:

  • Left navigationData science projects, Workbenches, Pipelines, Distributed workloads, Models, Resources, Settings. A missing entry means the matching DataScienceCluster component is Removed instead of Managed — re-apply dsc.yaml to enable it.
  • Top-right user menu — should read mladmin, not kubeadmin. If you still see kubeadmin, the OAuth pods from step 7 haven’t rolled yet (oc get pods -n openshift-authentication -w).
  • Settings → User management — only visible because mladmin has cluster-admin. This is where you grant data scientist / admin roles to other HTPasswd users without giving them full cluster privileges.

The first thing most RHOAI walkthroughs do from this screen is Data science projects → Create project and then Create workbench with a PyTorch image — that exercises the workbench controller, the PVC provisioner, and (if you picked a GPU image) the NVIDIA device plugin in one shot.

Important notes & best practices

  • Replace OCP_VERSION with a specific release for reproducible runs.
  • Always pass --dir so re-running and gather commands work consistently.
  • Run downloads from a bastion with stable network connectivity.
  • The single-master controlPlane.replicas: 1 above is a lab topology — production needs three.
ImportantSingle-node OpenShift ≠ Production HA

A controlPlane.replicas: 1 install is fine for a demo cluster but should never carry workloads you can’t lose. For HA, use three control-plane nodes and at least two workers across availability zones.

Troubleshooting

Table 1: Common failure modes and where to look.
Symptom First thing to check
Installer hangs on bootstrap openshift-install gather bootstrap --dir ocp-install
mv fails moving binaries Re-run the mv lines with sudo
AWS API quota errors Verify the EC2/EBS quotas in your target region
DNS resolution fails post-install Verify Route 53 NS records match the hosted zone

Wrap-up

If you treat the script in this post as a starting point rather than a black box — pin versions, audit IAM, and review the install-config.yaml before each run — you’ll get a reproducible OpenShift install you actually understand.

Found a sharp edge? Open an issue at josephassiga/josephassiga.github.io — I keep this post updated as the installer evolves.