Kubeflow series: One-click installation of Kubeflow in China using Alibaba Cloud image mirrors

Environment preparation
kustomize
Process of using machine learning kit
Modify the kustomize file
One-click Installation

Environment preparation

Kubeflow places heavy demands on its environment. The official requirements call for at least one worker node with a minimum of:

  • 4 CPU
  • 50 GB storage
  • 12 GB memory

You can still install on a node with less than this, but you will run into resource problems later, because this is the full package.
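
As a quick sanity check before installing, the minimums listed above can be compared against what a worker node actually has. This is only a minimal sketch: the function name missing_resources and its parameters are made up for illustration, and the numbers you pass in have to come from your own node.

```python
# Minimums quoted above for a single worker node.
MIN_CPU_CORES = 4
MIN_STORAGE_GB = 50
MIN_MEMORY_GB = 12


def missing_resources(cpu_cores, storage_gb, memory_gb):
    """Return a list of shortfalls; an empty list means the node meets the minimums."""
    shortfalls = []
    if cpu_cores < MIN_CPU_CORES:
        shortfalls.append(f"CPU: {cpu_cores} < {MIN_CPU_CORES} cores")
    if storage_gb < MIN_STORAGE_GB:
        shortfalls.append(f"storage: {storage_gb} < {MIN_STORAGE_GB} GB")
    if memory_gb < MIN_MEMORY_GB:
        shortfalls.append(f"memory: {memory_gb} < {MIN_MEMORY_GB} GB")
    return shortfalls


# Example values only; replace them with the figures of your own node.
for problem in missing_resources(cpu_cores=2, storage_gb=100, memory_gb=8):
    print("below minimum:", problem)
```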

A Kubernetes cluster must already be installed. Here I use a cluster installed with Rancher:

sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher

Here I chose Kubernetes 1.14, because Kubeflow is only compatible with certain Kubernetes versions (see the official compatibility description). My Kubeflow is version 0.6.

You can also create an Alibaba Cloud (Aliyun) Kubernetes cluster directly (remember to choose version 1.14).

If you just want to install right away, skip ahead to the One-click Installation section.

kustomize

Download the kustomize file

The official tutorial installs with kfctl. Because kfctl essentially installs through kustomize, here I download the kustomize files directly and install them after replacing the images with mirrors.

Download the official kustomize files:

git clone https://github.com/kubeflow/manifests
cd manifests
git checkout v0.6-branch
cd <target>/base
kubectl kustomize . | tee <output file>

There are many files. They can be exported one by one with a script, or generated with the kfctl command kfctl generate all -V:

kustomize/
├── ambassador.yaml
├── api-service.yaml
├── argo.yaml
├── centraldashboard.yaml
├── jupyter-web-app.yaml
├── katib.yaml
├── metacontroller.yaml
├── minio.yaml
├── mysql.yaml
├── notebook-controller.yaml
├── persistent-agent.yaml
├── pipelines-runner.yaml
├── pipelines-ui.yaml
├── pipelines-viewer.yaml
├── pytorch-operator.yaml
├── scheduledworkflow.yaml
├── tensorboard.yaml
└── tf-job-operator.yaml

ambassador: microservice gateway
argo: task workflow orchestration
centraldashboard: the Kubeflow dashboard page
tf-job-operator: deep learning framework engine, a CRD built on TensorFlow; its resource kind is TFJob
katib: hyperparameter tuning server
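
The per-component export mentioned above can be sketched in Python as follows. This is an illustration under assumptions, not the script I actually used: the helper names are made up, and the mapping from component name to base directory is something you have to fill in from your own checkout of the manifests repository. The only external command it relies on is kubectl kustomize, the same one shown earlier.

```python
import subprocess
from pathlib import Path


def kustomize_command(base_dir):
    """Build the command that renders one kustomize base to stdout."""
    return ["kubectl", "kustomize", str(base_dir)]


def export_targets(targets, out_dir="kustomize", run=subprocess.run):
    """Render each base in `targets` (name -> base directory) to <out_dir>/<name>.yaml.

    `run` is injectable so the function can be exercised without kubectl.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for name, base_dir in targets.items():
        result = run(
            kustomize_command(base_dir),
            check=True,           # stop on the first failing render
            capture_output=True,  # collect the rendered YAML
            text=True,
        )
        target_file = out_path / f"{name}.yaml"
        target_file.write_text(result.stdout)
        written.append(target_file)
    return written


# Hypothetical usage; look up the real directory names in your checkout:
# export_targets({"argo": "manifests/<target>/base"})
```

Writing one file per component keeps the output in the same layout as the tree above, which makes it easier to apply or edit a single component later.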

Process of using machine learning kit

Modify the kustomize file

Modify the kustomize images

Replace the gcr.io images with their mirrors:

grc_image = [
    "gcr.io/kubeflow-images-public/ingress-setup:latest",
    "gcr.io/kubeflow-images-public/admission-webhook:v20190520-v0-139-gcee39dbc-dirty-0d8f4c",
    "gcr.io/kubeflow-images-public/kubernetes-sigs/application:1.0-beta",
    "gcr.io/kubeflow-images-public/centraldashboard:v20190823-v0.6.0-rc.0-69-gcb7dab59",
    "gcr.io/kubeflow-images-public/jupyter-web-app:9419d4d",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/katib-controller:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/katib-manager:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/katib-manager-rest:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-bayesianoptimization:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-grid:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-hyperband:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-nasrl:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-random:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/katib/v1alpha2/katib-ui:v0.6.0-rc.0",
    "gcr.io/kubeflow-images-public/metadata:v0.1.8",
    "gcr.io/kubeflow-images-public/metadata-frontend:v0.1.8",
    "gcr.io/ml-pipeline/api-server:0.1.23",
    "gcr.io/ml-pipeline/persistenceagent:0.1.23",
    "gcr.io/ml-pipeline/scheduledworkflow:0.1.23",
    "gcr.io/ml-pipeline/frontend:0.1.23",
    "gcr.io/ml-pipeline/viewer-crd-controller:0.1.23",
    "gcr.io/kubeflow-images-public/notebook-controller:v20190603-v0-175-geeca4530-e3b0c4",
    "gcr.io/kubeflow-images-public/profile-controller:v20190619-v0-219-gbd3daa8c-dirty-1ced0e",
    "gcr.io/kubeflow-images-public/kfam:v20190612-v0-170-ga06cdb79-dirty-a33ee4",
    "gcr.io/kubeflow-images-public/pytorch-operator:v1.0.0-rc.0",
    "gcr.io/google_containers/spartakus-amd64:v1.1.0",
    "gcr.io/kubeflow-images-public/tf_operator:v0.6.0.rc0",
    "gcr.io/arrikto/kubeflow/oidc-authservice:v0.2"
]

doc_image = [
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.ingress-setup:latest",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.admission-webhook:v20190520-v0-139-gcee39dbc-dirty-0d8f4c",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.kubernetes-sigs.application:1.0-beta",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.centraldashboard:v20190823-v0.6.0-rc.0-69-gcb7dab59",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.jupyter-web-app:9419d4d",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.katib-controller:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.katib-manager:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.katib-manager-rest:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.suggestion-bayesianoptimization:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.suggestion-grid:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.suggestion-hyperband:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.suggestion-nasrl:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.suggestion-random:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.katib.v1alpha2.katib-ui:v0.6.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.metadata:v0.1.8",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.metadata-frontend:v0.1.8",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/ml-pipeline.api-server:0.1.23",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/ml-pipeline.persistenceagent:0.1.23",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/ml-pipeline.scheduledworkflow:0.1.23",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/ml-pipeline.frontend:0.1.23",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/ml-pipeline.viewer-crd-controller:0.1.23",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.notebook-controller:v20190603-v0-175-geeca4530-e3b0c4",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.profile-controller:v20190619-v0-219-gbd3daa8c-dirty-1ced0e",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.kfam:v20190612-v0-170-ga06cdb79-dirty-a33ee4",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.pytorch-operator:v1.0.0-rc.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/google_containers.spartakus-amd64:v1.1.0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/kubeflow-images-public.tf_operator:v0.6.0.rc0",
    "registry.cn-shenzhen.aliyuncs.com/shikanon/arrikto.kubeflow.oidc-authservice:v0.2"
]

Modify PVC to use dynamic storage

Change the PVC storage to use local-path-provisioner, which provisions PVs dynamically.

Install local-path-provisioner:

kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

If you want Kubeflow to use it without further changes, you also need to make this StorageClass the default one:

...
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
  annotations: # Add as default StorageClass
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
...
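
If the StorageClass is already installed, the same annotation can likely be added with kubectl patch instead of editing the YAML. The sketch below only builds and prints that command; it does not run it. The helper names are made up for illustration, and the annotation key is the one used in the YAML above.

```python
import json
import shlex

# Annotation key taken from the StorageClass definition above.
DEFAULT_CLASS_ANNOTATION = "storageclass.beta.kubernetes.io/is-default-class"


def default_class_patch(is_default=True):
    """Build the merge patch that marks a StorageClass as default (or not)."""
    value = "true" if is_default else "false"
    return {"metadata": {"annotations": {DEFAULT_CLASS_ANNOTATION: value}}}


def patch_command(storage_class, is_default=True):
    """Build the kubectl patch command as an argument list."""
    patch = json.dumps(default_class_patch(is_default))
    return ["kubectl", "patch", "storageclass", storage_class, "-p", patch]


# Print a shell-quoted command line that can be reviewed before running it.
print(shlex.join(patch_command("local-path")))
```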

When finished, you can try creating a PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-path-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Note: If you do not set a default StorageClass, you need to bind the PVC explicitly by adding storageClassName: local-path to its spec.
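
To make the note concrete, here is a small sketch that builds the same PVC as a Python dict and adds storageClassName only when a class is given. The function name build_pvc is hypothetical. The output is JSON, which kubectl apply -f accepts as well as YAML.

```python
import json


def build_pvc(name, size, namespace="default", storage_class=None):
    """Build a PersistentVolumeClaim manifest; set storage_class when no default exists."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Explicit binding, needed when local-path is not the default StorageClass.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# Same claim as above, bound explicitly to the local-path class.
print(json.dumps(build_pvc("local-path-pvc", "2Gi", storage_class="local-path"), indent=2))
```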

One-click Installation

I have put together a project that deploys Kubeflow with one click using image mirrors hosted inside China:
https://github.com/shikanon/kubeflow-manifests

4 January 2020, 14:53 | Views: 5794
