Kubeflow Deployment

Introduction

This document provides instructions for deploying Kubeflow on a Tanzu Kubernetes cluster.

Scope and Steps

Kubeflow provides components for each stage in the machine learning lifecycle, from exploration through training and deployment. Operators can choose what is best for their users; there is no requirement to deploy every component of Kubeflow.

Prerequisites

NOTE: All prerequisites must be installed and configured before creating the Tanzu Kubernetes cluster.

Perform the following steps:

  1. Download and install kubectl for vSphere. In our validation, Kubeflow 1.5 requires kubectl v1.21 or later.
  2. Create a Tanzu Kubernetes cluster and install GPU Operator on it, as described in the configuration section.
  3. Install Kustomize, which is used for the Kubeflow installation.
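
As a quick sanity check of the prerequisites above, the following shell sketch verifies that kustomize is on the PATH and that the kubectl client is v1.21 or later. The function names version_at_least and check_prerequisites are illustrative and not part of any VMware or Kubeflow tooling. The sketch assumes that kubectl version --client -o json reports a gitVersion field, as standard kubectl builds do.

```shell
# Succeed if version $1 (for example "v1.23.4") is at least major.minor $2 (for example "1.21").
version_at_least() {
  have=${1#v}; want=${2#v}
  have_major=${have%%.*}; rest=${have#*.}; have_minor=${rest%%.*}
  want_major=${want%%.*}; rest=${want#*.}; want_minor=${rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Check that kustomize exists and that the kubectl client is new enough.
check_prerequisites() {
  command -v kustomize >/dev/null 2>&1 || { echo "kustomize not found" >&2; return 1; }
  # Extract the client gitVersion (for example v1.21.0) from the JSON output.
  client=$(kubectl version --client -o json |
    sed -n 's/.*"gitVersion": *"\([^"]*\)".*/\1/p' | head -n 1)
  [ -n "$client" ] || { echo "could not determine kubectl version" >&2; return 1; }
  version_at_least "$client" 1.21 ||
    { echo "kubectl $client is older than v1.21" >&2; return 1; }
}
```

Run check_prerequisites on the workstation you will use to deploy Kubeflow; a non-zero exit status means one of the tools is missing or too old.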

Deploy Kubeflow

We used the manifests for installation. Perform the following steps to deploy Kubeflow 1.5.0 on your Tanzu Kubernetes cluster:

  1. Create a ClusterRoleBinding that grants authenticated users access to run a privileged set of workloads using the default PSP vmware-system-privileged:

kubectl create clusterrolebinding default-tkg-admin-privileged-binding --clusterrole=psp:vmware-system-privileged --group=system:authenticated

  2. Set the default storage class for the persistent volume claims of Kubeflow components such as MinIO and MySQL. Replace selectedstorageclassname with the name of your storage class:

kubectl patch storageclass selectedstorageclassname -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class":"true"}}}'

Figure 1: Set Default Storageclass

  3. Download the scripts to deploy Kubeflow by cloning the GitHub repository and checking out the v1.5 branch:

git clone https://github.com/kubeflow/manifests.git
cd manifests
git checkout v1.5-branch

  4. Install the official Kubeflow components using either of two options: install with a single command, or install individual components. Note: Individual components may have dependencies. If all the individual commands are executed, the result is the same as the single-command installation.
  5. Verify that all the pods are running. The kubectl apply commands may fail on the first try. This is inherent in how Kubernetes and kubectl work (for example, a custom resource can only be created after its CRD is ready). Rerun the command until it succeeds.
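
The rerun-until-success behavior described above can be scripted. The following is a minimal sketch of a generic retry helper; the function name retry and its attempt and delay parameters are illustrative. The commented example assumes the single-command installation from the upstream manifests README (kustomize build example piped to kubectl apply), run from the root of the cloned repository.

```shell
# Re-run a command until it succeeds or the attempt limit is reached.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max=$1; delay=$2; shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (assumption: run from the root of the cloned manifests repository):
# retry 30 10 sh -c 'kustomize build example | kubectl apply -f -'
```

Bounding the number of attempts keeps a genuinely broken manifest from looping forever, which an unbounded loop would do.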

     To check that all Kubeflow-related pods are ready, use the following commands:

kubectl get pods -n cert-manager

kubectl get pods -n istio-system

kubectl get pods -n auth

kubectl get pods -n knative-eventing

kubectl get pods -n knative-serving

kubectl get pods -n kubeflow

kubectl get pods -n kubeflow-user-example-com
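
Reading seven pod listings by eye is error-prone, so the check can be condensed. The sketch below filters the output of kubectl get pods --no-headers and prints only the pods that are not fully ready. The function names not_ready_pods and check_kubeflow_pods are illustrative, and the sketch assumes the default column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the names of
# pods whose READY column is not n/n. Pods in Completed status are skipped.
not_ready_pods() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Completed" && ready[1] != ready[2]) print $1
  }'
}

# Print namespace/pod for every not-ready pod in the Kubeflow-related namespaces.
check_kubeflow_pods() {
  for ns in cert-manager istio-system auth knative-eventing knative-serving \
            kubeflow kubeflow-user-example-com; do
    kubectl get pods -n "$ns" --no-headers | not_ready_pods | sed "s|^|$ns/|"
  done
}
```

If check_kubeflow_pods prints nothing, every pod in the listed namespaces reports all of its containers as ready.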

The following output shows the pods deployed in the istio-system namespace:

kubectl get pod -n istio-system

NAME                                     READY   STATUS    RESTARTS   AGE
authservice-0                            1/1     Running   0          23h
cluster-local-gateway-7796d7bc87-9qb5v   1/1     Running   0          24h
istio-ingressgateway-64b7899489-ft5gn    1/1     Running   0          24h
istiod-5d9bb9cb4-5zvzz                   1/1     Running   0          24h

Figure 2: Pods in istio-system Namespace

Figure 3 shows the pods deployed in the kubeflow namespace:

Figure 3: Pods in Kubeflow Namespace

  6. Access the Kubeflow central dashboard:
  • Option 1: Port forward. The default way of accessing Kubeflow is through port-forward:

  kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80

  After running the command, access the dashboard at http://localhost:8080.

  • Option 2: NodePort, LoadBalancer, or Ingress. Because many of the Kubeflow web apps (for example, Tensorboard Web App, Jupyter Web App, and Katib UI) use secure cookies, HTTPS must be set up for these options.

In our example, we access the dashboard using the LoadBalancer external IP address.

  • Change the type of the istio-ingressgateway service to LoadBalancer:

kubectl -n istio-system patch service istio-ingressgateway -p '{"spec": {"type": "LoadBalancer"}}'

kubectl get svc -n istio-system

NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)
authservice             ClusterIP      10.100.82.68     <none>         8080/TCP
cluster-local-gateway   ClusterIP      10.101.213.134   <none>         15020/TCP,80/TCP
istio-ingressgateway    LoadBalancer   10.104.45.33     172.16.20.72   15021:32506/TCP,80:31917/TCP,443:32332/TCP,314
istiod                  ClusterIP      10.103.211.151   <none>         15010/TCP,15012/TCP,443/TCP,15014/TCP
knative-local-gateway   ClusterIP      10.111.221.131   <none>         80/TCP

Figure 4: Change istio-ingressgateway Service Type to Loadbalancer
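
The external IP address is needed in several later steps, so it can help to read it programmatically instead of copying it from the table. The following is a minimal sketch using the standard kubectl jsonpath output. The function names ingress_ip and wait_for_ingress_ip are illustrative, and the sketch assumes the load balancer reports an IP address (some providers report a hostname instead).

```shell
# Print the external IP of the istio-ingressgateway service (empty if not yet assigned).
ingress_ip() {
  kubectl -n istio-system get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Poll until an external IP is assigned.
# Usage: wait_for_ingress_ip [tries] [delay_seconds]
wait_for_ingress_ip() {
  tries=${1:-30}; delay=${2:-5}
  while [ "$tries" -gt 0 ]; do
    ip=$(ingress_ip)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "no external IP assigned to istio-ingressgateway" >&2
  return 1
}
```

For example, INGRESS_IP=$(wait_for_ingress_ip) stores the address for use in the HTTPS configuration that follows.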

Then make the changes described in the next section to set up HTTPS.

Configure HTTPS

Make the following changes:

  • Update the Istio Gateway to expose port 443 with HTTPS and redirect port 80 to 443:

kubectl -n kubeflow edit gateways.networking.istio.io kubeflow-gateway

servers:
- hosts:
  - "*"
  port:
    name: http
    number: 80
    protocol: HTTP
  tls:
    httpsRedirect: true
- hosts:
  - "*"
  port:
    name: https
    number: 443
    protocol: HTTPS
  tls:
    mode: SIMPLE
    privateKey: /etc/istio/ingressgateway-certs/tls.key
    serverCertificate: /etc/istio/ingressgateway-certs/tls.crt

Figure 5: Update Istio Gateway Attributes

  • Change the REDIRECT_URL in the oidc-authservice-parameters configmap. In our example, 172.16.20.72 is the IP address of the istio-ingressgateway.

kubectl -n istio-system edit configmap oidc-authservice-parameters

OIDC_SCOPES: profile email groups
PORT: '"8080"'
REDIRECT_URL: https://172.16.20.72/login/oidc
SKIP_AUTH_URI: /dex
STORE_PATH: /var/lib/authservice/data.db

Figure 6: Change REDIRECT_URL to Loadbalancer IP Address

Append the same URL to the redirectURIs list in the dex configmap:

kubectl -n auth edit configmap dex
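
As a non-interactive alternative to editing the oidc-authservice-parameters configmap by hand, the REDIRECT_URL key can be set with a merge patch. This is a sketch under the assumption that the configmap stores REDIRECT_URL as a plain data key, as shown in Figure 6; the function name set_redirect_url is illustrative. The dex configmap holds its settings in a single embedded YAML document, so the redirectURIs entry still has to be edited manually.

```shell
# Set REDIRECT_URL in the oidc-authservice-parameters configmap.
# Usage: set_redirect_url <istio-ingressgateway IP address>
set_redirect_url() {
  ip=$1
  [ -n "$ip" ] || { echo "usage: set_redirect_url <ingress-ip>" >&2; return 1; }
  kubectl -n istio-system patch configmap oidc-authservice-parameters \
    --type merge \
    -p "{\"data\":{\"REDIRECT_URL\":\"https://${ip}/login/oidc\"}}"
}

# Example with the address used in this document:
# set_redirect_url 172.16.20.72
```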

  • Restart authservice and dex so that they pick up the changes:

kubectl -n istio-system rollout restart statefulset authservice
kubectl -n auth rollout restart deployment dex

  • Create a certificate.yaml file with the YAML in Figure 7, then apply it to create a self-signed certificate:

kubectl -n istio-system apply -f certificate.yaml

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: istio-ingressgateway-certs
  namespace: istio-system
spec:
  commonName: istio-ingressgateway.istio-system.svc
  ipAddresses:
  - 172.16.20.72
  isCA: true
  issuerRef:
    kind: ClusterIssuer
    name: kubeflow-self-signing-issuer
  secretName: istio-ingressgateway-certs

Figure 7: Create istio-ingressgateway Certificate

  • Access the Kubeflow Central Dashboard at https://<IP address of the istio-ingressgateway>. In our example, this is https://172.16.20.72.

Figure 8: Kubeflow Login Page

Log in with the default user's credentials. The default email address is user@example.com and the default password is 12341234. The default user's namespace is kubeflow-user-example-com.

Figure 9: Kubeflow Central Dashboard

Add New Users

Users are managed by the Kubeflow Profiles module. To add a new user, create a Profile:

cat <<EOF | kubectl apply -f -
apiVersion: kubeflow.org/v1beta1
kind: Profile
metadata:
  name: newuser-namespace   # replace with the name of the profile (namespace) you want
spec:
  owner:
    kind: User
    name: newuser@example.com   # replace with the user email
EOF
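
When several users need to be added, the Profile manifest can be generated from parameters. The following sketch wraps the manifest above in a shell function; the function name profile_manifest and the example values team-a and alice@example.com are hypothetical.

```shell
# Print a Profile manifest for the given profile (namespace) name and owner email.
# Usage: profile_manifest <profile-name> <owner-email>
profile_manifest() {
  [ "$#" -eq 2 ] || { echo "usage: profile_manifest <profile-name> <owner-email>" >&2; return 2; }
  cat <<EOF
apiVersion: kubeflow.org/v1beta1
kind: Profile
metadata:
  name: $1
spec:
  owner:
    kind: User
    name: $2
EOF
}

# Example (hypothetical values):
# profile_manifest team-a alice@example.com | kubectl apply -f -
# kubectl get namespace team-a   # the profile controller creates a namespace with the profile name
```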

Add the user credentials to Dex for basic authentication. Generate a bcrypt hash of the user's password, then edit the dex configmap:

kubectl edit cm dex -o yaml -n auth

Add the new user under the staticPasswords section: 

- email: newuser@example.com
  hash: $2y$12$4K/VkmDd1q1Orb3xAt82zu8gk7Ad6ReFR4LCP9UeYE90NLiN9Df72
  username: newuser

Figure 10: Add New User in Dex Configmap
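
One way to generate the bcrypt hash is the htpasswd utility from the Apache tools. This is a sketch under that assumption, not the only option; the function name bcrypt_hash and the placeholder username passed to htpasswd are illustrative. The hash value differs on every run because bcrypt uses a random salt.

```shell
# Read a password from stdin and print its bcrypt hash (cost 12) for staticPasswords.
bcrypt_hash() {
  command -v htpasswd >/dev/null 2>&1 ||
    { echo "htpasswd not found (usually in apache2-utils or httpd-tools)" >&2; return 1; }
  # -B: bcrypt, -i: read password from stdin, -n: print to stdout, -C 12: cost factor.
  # The output is "user:<hash>"; keep only the hash.
  htpasswd -BinC 12 user | cut -d: -f2 | tr -d '\n'
}

# Example (prompts for the password without echoing it):
# stty -echo; IFS= read -r pw; stty echo; printf '%s\n' "$pw" | bcrypt_hash
```

Paste the resulting string into the hash field of the new staticPasswords entry, then restart dex as described in the HTTPS section so that it reloads the configmap.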

For more information, refer to Kubeflow Getting Started.
