Upgrade the existing Kaptain installation to a newer version.
Prerequisites
Before you begin:
- Install Kaptain 1.2.0 on a Konvoy cluster.
- Ensure the existing cluster meets the criteria listed in the installation documentation.
- Download the kubeflow-1.4.0_1.3.0.tgz tarball from the support website.
- Ensure the following base addons that are needed by Kaptain are enabled in your Konvoy cluster:

    - configRepository: https://github.com/mesosphere/kubernetes-base-addons
      configVersion: stable-1.20-4.3.0
      addonsList:
        - name: istio
          enabled: true
        - name: dex
          enabled: true
        - name: cert-manager
          enabled: true
        - name: prometheus
          enabled: true

- Ensure the Kaptain addon repository is present in your Konvoy cluster.yaml:

    - configRepository: https://github.com/mesosphere/kubeaddons-kaptain
      configVersion: stable-1.20-1.4.0
      addonsList:
        - name: knative
          enabled: true
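The addon prerequisites above can be checked with a small script before starting the upgrade. The following is a minimal sketch, not part of Konvoy or Kaptain: the helper names `addon_enabled` and `check_required_addons` are hypothetical, and the parsing assumes every addon entry is written as a `- name: <addon>` line followed by an `enabled: <bool>` line, as in the snippets above.

```shell
# Sketch: verify that the addons Kaptain needs are enabled in cluster.yaml.
# Assumption: each addon is a "- name: X" line followed by "enabled: true|false".

# addon_enabled FILE NAME -> exit status 0 if NAME is listed with enabled: true
addon_enabled() {
  awk -v want="$2" '
    $1 == "-" && $2 == "name:" { current = $3; next }
    $1 == "enabled:" {
      if (current == want && $2 == "true") found = 1
      current = ""
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# check_required_addons FILE -> one status line per addon; non-zero if any is missing
check_required_addons() {
  local file="$1" missing=0 addon
  for addon in istio dex cert-manager prometheus knative; do
    if addon_enabled "$file" "$addon"; then
      echo "ok: $addon"
    else
      echo "missing or disabled: $addon"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_required_addons cluster.yaml
```

This is a plain text match rather than a YAML parser, so it does not tell the two addon repositories apart; treat a passing result as a quick sanity check only.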
Upgrade Kaptain
Run the upgrade according to your Konvoy version:
For Konvoy 1.x:
- Add these properties to your params.yaml file:
dkpPlatformVersion: "1"
installMinioOperator: "true"
- Upgrade Kaptain:
kubectl kudo upgrade --instance kaptain --namespace kubeflow ./kubeflow-1.4.0_1.3.0.tgz -P params.yaml
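For Konvoy 1.x, the params.yaml step can be scripted so that it is safe to run more than once. The sketch below is illustrative only: `ensure_param` is a hypothetical helper, and it assumes the two properties are top-level keys in params.yaml, as shown in the step above.

```shell
# Sketch: add a top-level property to params.yaml only if it is not there yet.

# ensure_param FILE KEY VALUE
# Appends KEY: "VALUE" to FILE unless a line starting with "KEY:" already exists.
ensure_param() {
  local file="$1" key="$2" value="$3"
  touch "$file"
  if grep -q "^${key}:" "$file"; then
    echo "kept existing ${key}"
    return 0
  fi
  # Make sure the file ends with a newline before appending.
  if [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  printf '%s: "%s"\n' "$key" "$value" >> "$file"
  echo "added ${key}"
}

# Example, using the two properties from the step above:
# ensure_param params.yaml dkpPlatformVersion 1
# ensure_param params.yaml installMinioOperator true
```

The helper never overwrites an existing value, so a property that is already set to something else still needs to be reviewed by hand.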
For Konvoy 2.x:
- Upgrade Kaptain:
kubectl kudo upgrade --instance kaptain --namespace kubeflow ./kubeflow-1.4.0_1.3.0.tgz
- Monitor the upgrade process by running:
kubectl kudo plan status --instance kaptain --namespace kubeflow
- Log in to Kaptain after the upgrade completes.
- Locate the cluster endpoint and copy it to your clipboard.
If you are running Kaptain on-premises, use this command:
kf_uri=$(kubectl get svc kubeflow-ingressgateway --namespace kubeflow -o jsonpath="{.status.loadBalancer.ingress[*].ip}") && echo "https://${kf_uri}"
If you are running Kaptain on AWS, use this command instead:
kf_uri=$(kubectl get svc kubeflow-ingressgateway --namespace kubeflow -o jsonpath="{.status.loadBalancer.ingress[*].hostname}") && echo "https://${kf_uri}"
- Obtain the login credentials from Konvoy to authenticate:
konvoy get ops-portal
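The monitoring and endpoint steps above can be wrapped in a few shell functions. This is a sketch under stated assumptions, not an official tool: the function names are hypothetical, and `plan_state` assumes that the plan status output marks plans and phases with bracketed states such as `[COMPLETE]`, `[IN_PROGRESS]`, `[PENDING]`, `[ERROR]`, and `[FATAL_ERROR]`. The `kubectl` commands themselves are the ones shown in the steps.

```shell
# plan_state: reads plan status output on stdin and prints
# failed, in-progress, complete, or unknown.
# Assumption: states appear in brackets, e.g. "[COMPLETE]".
plan_state() {
  local out
  out=$(cat)
  case "$out" in
    *"[FATAL_ERROR]"*|*"[ERROR]"*) echo failed ;;
    *"[IN_PROGRESS]"*|*"[PENDING]"*) echo in-progress ;;
    *"[COMPLETE]"*) echo complete ;;
    *) echo unknown ;;
  esac
}

# wait_for_upgrade [TRIES] [DELAY_SECONDS]
# Polls the plan status; returns 0 on complete, 1 on failure, 2 on timeout.
wait_for_upgrade() {
  local tries="${1:-60}" delay="${2:-10}" state
  while [ "$tries" -gt 0 ]; do
    state=$(kubectl kudo plan status --instance kaptain --namespace kubeflow | plan_state)
    case "$state" in
      complete) return 0 ;;
      failed) return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 2
}

# kubeflow_url: prints the HTTPS endpoint of the ingress gateway, trying the
# load balancer IP first (on-premises) and then the hostname (AWS).
kubeflow_url() {
  local field addr
  for field in ip hostname; do
    addr=$(kubectl get svc kubeflow-ingressgateway --namespace kubeflow \
      -o jsonpath="{.status.loadBalancer.ingress[*].${field}}")
    if [ -n "$addr" ]; then
      echo "https://${addr}"
      return 0
    fi
  done
  return 1
}

# Example: wait_for_upgrade 60 10 && kubeflow_url
```

Because `plan_state` is a simple text match, a plan that has not started yet may briefly still read as complete; confirm the result by checking the plan status output directly.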
Workload behavior during upgrades
- When upgrading from Kaptain version 1.2.0 to 1.3.0, the following workloads do not need to be stopped and continue without interruption during the upgrade: Jupyter Notebooks, Training Jobs (TFJob, PyTorchJob, MXNetJob), Katib Experiments and Trials, SparkApplications, and Kubeflow Pipelines.