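Stateful deployments too slow? Using OpenKruise to give containerized applications fixed IDs
Background
While moving our business applications onto containers, we ran into the following problems:
- Application pods deployed through a Deployment change ID frequently, so after service restarts monitoring becomes hard to maintain. Monitoring is only the entry point here; in practice, many other applications also need to be tightly bound to an ID.
- A StatefulSet solves the problem above but introduces a new one: to preserve the guarantees that stateful applications need, it starts the application Pods in strict order by default, that is, serially. For a large number of pods the startup takes too long, which is unacceptable.
To solve these problems, we introduced OpenKruise into our container platform.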
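To make the cost of serial startup concrete, here is a rough back-of-the-envelope model. It assumes every pod takes the same time to become ready and ignores scheduling and image-pull variance; the function name and the 30-second figure are illustrative, not measurements from our platform.

```python
def startup_seconds(replicas: int, per_pod_seconds: float, parallel: bool) -> float:
    """Estimate total startup time under a deliberately simplified model.

    Serial (ordered) startup waits for each pod before creating the next,
    so the time grows linearly with the replica count. Parallel startup
    creates all pods at once, so the time is roughly that of a single pod.
    """
    if replicas <= 0:
        return 0.0
    if parallel:
        return float(per_pod_seconds)
    return float(replicas * per_pod_seconds)


if __name__ == "__main__":
    # 30 seconds per pod is an assumed value, chosen only for illustration.
    for replicas in (3, 100, 500):
        serial = startup_seconds(replicas, 30, parallel=False)
        parallel = startup_seconds(replicas, 30, parallel=True)
        print(f"{replicas} pods: serial ~{serial:.0f}s, parallel ~{parallel:.0f}s")
```

Under these assumptions the serial case scales linearly with the number of pods, which is why the problem only becomes painful at scale.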
Introduction to OpenKruise
Project address: https://github.com/openkruise/kruise
For a detailed description, see this article:
https://yq.aliyun.com/articles/706442
According to the documentation currently on GitHub, OpenKruise supports five enhanced controllers:
- CloneSet: CloneSet is a workload that mainly focuses on managing stateless applications. It provides full features for more efficient, deterministic and controlled deployment, such as inplace update, specified pod deletion, configurable priority/scatter update, preUpdate/postUpdate hooks.
- Advanced StatefulSet: An enhanced version of default StatefulSet with extra functionalities such as inplace-update, pause and MaxUnavailable.
- SidecarSet: A controller that injects sidecar containers into the Pod spec based on selectors and also is able to upgrade the sidecar containers.
- UnitedDeployment: This controller manages application pods spread in multiple fault domains by using multiple workloads.
- BroadcastJob: A job that runs Pods to completion across all the nodes in the cluster.
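UnitedDeployment is a higher-level abstraction built on top of StatefulSet: a single resource description manages multiple StatefulSet instance groups, enabling canary releases and rolling upgrades across those groups.
BroadcastJob essentially runs a one-off Job on every node, in the manner of a DaemonSet. SidecarSet is used for sidecar injection and management.
What we need is the Advanced StatefulSet feature. A more detailed description of Advanced StatefulSet follows:
Advanced StatefulSet extends the official Kubernetes StatefulSet. Where the native controller can only recreate a Pod when rolling out a change, the rolling update here accepts a podUpdatePolicy that keeps recreate and adds two in-place policies: InPlaceIfPossible and InPlaceOnly. InPlaceIfPossible upgrades the application in place whenever it can (only image changes are supported; if other fields in the YAML are modified, an in-place upgrade is not guaranteed). InPlaceOnly guarantees that the upgrade happens in place, but it also supports only image changes; if other fields in the YAML are modified, an error is raised directly. In addition, the native StatefulSet starts Pods serially by default, whereas Advanced StatefulSet can start them in parallel.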
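The decision rules described above can be sketched as a small model. This is not OpenKruise's actual code: the function names are hypothetical, pod templates are represented as plain dictionaries, and the fallback to recreate under InPlaceIfPossible is an assumption about what "not guaranteed" means in practice.

```python
import copy


def differs_only_in_images(old_template: dict, new_template: dict) -> bool:
    """Return True if two pod templates are equal once container images are ignored."""
    old = copy.deepcopy(old_template)
    new = copy.deepcopy(new_template)
    for template in (old, new):
        for container in template.get("spec", {}).get("containers", []):
            container.pop("image", None)
    return old == new


def plan_update(policy: str, old_template: dict, new_template: dict) -> str:
    """Model which kind of update each pod update policy would perform."""
    in_place_possible = differs_only_in_images(old_template, new_template)
    if policy == "ReCreate":
        return "recreate"
    if policy == "InPlaceIfPossible":
        # Assumption: when an in-place update is not possible, fall back to recreate.
        return "in-place" if in_place_possible else "recreate"
    if policy == "InPlaceOnly":
        if not in_place_possible:
            raise ValueError("InPlaceOnly supports image changes only")
        return "in-place"
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    v1 = {"spec": {"containers": [{"name": "app", "image": "app:v1"}]}}
    v2 = {"spec": {"containers": [{"name": "app", "image": "app:v2"}]}}
    print(plan_update("InPlaceIfPossible", v1, v2))
```

The point of the model is the asymmetry between the two in-place policies: one degrades gracefully when more than the image changes, the other refuses the change.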
Deploying OpenKruise
The official installation documentation is available here:
https://github.com/openkruise/kruise/tree/master/docs/tutorial
Briefly, the installation steps are:
- wget https://github.com/openkruise/kruise/releases/download/v0.4.0/kruise-chart.tgz
- tar xf kruise-chart.tgz
- cd kruise
- helm install openkruise ./ -n kube-system
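OpenKruise has since been updated to v0.5.0. It can also be installed directly from the Alibaba Cloud application catalog.
A more detailed installation walkthrough follows:
1. Fetch the helm chart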
- helm repo add incubator http://aliacs-k8s-cn-beijing.oss-cn-beijing.aliyuncs.com/app/charts-incubator/
- helm search repo ack-kruise
- helm fetch incubator/ack-kruise
- tar xf ack-kruise-0.5.0.tgz
- cd ack-kruise
Modify the values.yaml file as follows:
    # Default values for kruise.
    revisionHistoryLimit: 3
    manager:
      # settings for log print
      log:
        # log level for kruise-manager
        level: "4"
      # image settings
      image:
        # repository for kruise-manager image
        repository: hub.example.com/library/kruise-manager
        # tag for kruise-manager image
        tag: v0.5.0
      # resources of kruise-manager container
      resources:
        limits:
          cpu: 500m
          memory: 1Gi
        requests:
          cpu: 500m
          memory: 1Gi
      metrics:
        addr: localhost
        port: 8080
      custom_resource_enable: StatefulSet
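Only two things are actually changed here:
- image: the default points to Docker Hub; I changed it to a private image registry
- custom_resource_enable: specifies which resources to enable. If it is not set, all five resources supported by OpenKruise are enabled. I only use StatefulSet, so only that one is enabled here
Then run the installation: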
- helm install ack-kruise -n kube-system ./
After installation, the following five CRDs are created:
- # kubectl get crds |grep kruise
- broadcastjobs.apps.kruise.io 2020-04-26T10:29:28Z
- clonesets.apps.kruise.io 2020-04-26T10:29:28Z
- sidecarsets.apps.kruise.io 2020-04-26T10:29:28Z
- statefulsets.apps.kruise.io 2020-04-26T10:29:28Z
- uniteddeployments.apps.kruise.io 2020-04-26T10:29:28Z
A kruise-system namespace is also created, with one pod running in it:
- # kubectl get pods -n kruise-system
- NAME READY STATUS RESTARTS AGE
- kruise-controller-manager-0 1/1 Running 0 55m
Verify that the webhook for the statefulset resource was created correctly:
    # kubectl get mutatingwebhookconfiguration -o yaml
    apiVersion: v1
    items:
    - apiVersion: admissionregistration.k8s.io/v1
      kind: MutatingWebhookConfiguration
      metadata:
        creationTimestamp: "2020-04-26T10:29:28Z"
        generation: 3
        name: kruise-mutating-webhook-configuration
        resourceVersion: "622944921"
        selfLink: /apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations/kruise-mutating-webhook-configuration
        uid: 303a7b7f-3a62-49d7-8ef6-082ea288eeb2
      webhooks:
      - admissionReviewVersions:
        - v1beta1
        clientConfig:
          caBundle: xxxxx
          service:
            name: kruise-webhook-server-service
            namespace: kruise-system
            path: /mutating-create-update-statefulset
            port: 443
        failurePolicy: Fail
        matchPolicy: Exact
        name: mutating-create-update-statefulset.kruise.io
        namespaceSelector:
          matchExpressions:
          - key: control-plane
            operator: DoesNotExist
        objectSelector: {}
        reinvocationPolicy: Never
        rules:
        - apiGroups:
          - apps.kruise.io
          apiVersions:
          - v1alpha1
          operations:
          - CREATE
          - UPDATE
          resources:
          - statefulsets
          scope: '*'
        sideEffects: Unknown
        timeoutSeconds: 30
      ......
Also make sure that the mutating webhooks for the other, unused resources are disabled. In our testing, the SidecarSet mutating webhook could prevent newly created pods from coming up.
These webhooks are essentially Kubernetes admission controllers. Once installed, even if you never use them, every relevant request still has to pass through all of them. If an admission controller itself fails, requests can be left without a response. Webhook-based admission controllers also slow down responses.
Usage example
Below is an official example of a deployment manifest based on the statefulset resource provided by OpenKruise:
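One way to check which mutating webhooks are active is to inspect the JSON form of the listing (`kubectl get mutatingwebhookconfiguration -o json`). The sketch below assumes that output has already been parsed into a dictionary; the function names are hypothetical, and the sample data, including the second webhook's name, is made up for illustration.

```python
from typing import Dict, List, Set


def webhook_resources(config_list: dict) -> Dict[str, List[str]]:
    """Map each webhook name to the resources its rules apply to."""
    result: Dict[str, List[str]] = {}
    for config in config_list.get("items", []):
        for hook in config.get("webhooks", []):
            resources: List[str] = []
            for rule in hook.get("rules", []):
                resources.extend(rule.get("resources", []))
            result[hook["name"]] = sorted(set(resources))
    return result


def unexpected_webhooks(config_list: dict, wanted: Set[str]) -> List[str]:
    """Return names of webhooks that touch any resource outside `wanted`."""
    return sorted(
        name
        for name, resources in webhook_resources(config_list).items()
        if not set(resources) <= wanted
    )


# Made-up sample shaped like `kubectl get mutatingwebhookconfiguration -o json`.
SAMPLE = {
    "items": [
        {
            "webhooks": [
                {
                    "name": "mutating-create-update-statefulset.kruise.io",
                    "rules": [{"resources": ["statefulsets"]}],
                },
                {
                    "name": "example-pod-webhook.example.com",  # hypothetical
                    "rules": [{"resources": ["pods"]}],
                },
            ]
        }
    ]
}

if __name__ == "__main__":
    for name in unexpected_webhooks(SAMPLE, wanted={"statefulsets"}):
        print("unexpected webhook:", name)
```

Note that the listing covers every mutating webhook in the cluster, not only those installed by OpenKruise, so a flagged name is a prompt to investigate rather than proof of a misconfiguration.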
    apiVersion: apps.kruise.io/v1alpha1
    kind: StatefulSet
    metadata:
      name: demo-v1-guestbook-kruise
      labels:
        app.kubernetes.io/name: guestbook-kruise
        app.kubernetes.io/instance: demo-v1
    spec:
      replicas: 3
      serviceName: demo-v1-guestbook-kruise
      selector:
        matchLabels:
          app.kubernetes.io/name: guestbook-kruise
          app.kubernetes.io/instance: demo-v1
      template:
        metadata:
          labels:
            app.kubernetes.io/name: guestbook-kruise
            app.kubernetes.io/instance: demo-v1
        spec:
          readinessGates:
            # A new condition that ensures the pod remains at NotReady state while the in-place update is happening
            - conditionType: InPlaceUpdateReady
          containers:
            - name: guestbook-kruise
              image: openkruise/guestbook:v1
              imagePullPolicy: Always
              ports:
                - name: http-server
                  containerPort: 3000
      podManagementPolicy: Parallel  # allow parallel updates, works together with maxUnavailable
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          # Do in-place update if possible, currently only image update is supported for in-place update
          podUpdatePolicy: InPlaceIfPossible
          # Allow parallel updates; at most maxUnavailable instances may be unavailable at once
          maxUnavailable: 3
After the deployment is applied, the started pods look like this:
- # kubectl get pods |grep demo-v1
- demo-v1-guestbook-kruise-0 1/1 Running 0 62s
- demo-v1-guestbook-kruise-1 1/1 Running 0 62s
- demo-v1-guestbook-kruise-2 1/1 Running 0 62s
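The pod names in this output follow the StatefulSet naming convention of the resource name plus an ordinal, which is what gives each pod a fixed ID. A minimal sketch of that rule, with a hypothetical helper name:

```python
from typing import List


def pod_names(statefulset_name: str, replicas: int) -> List[str]:
    """Return the stable pod names for a StatefulSet: <name>-0 .. <name>-(replicas-1)."""
    return [f"{statefulset_name}-{ordinal}" for ordinal in range(replicas)]


if __name__ == "__main__":
    for name in pod_names("demo-v1-guestbook-kruise", 3):
        print(name)
```

Because the names depend only on the resource name and the replica count, monitoring targets and other ID-bound configuration can be derived ahead of time and stay valid across restarts.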
The resource status can also be checked as follows:
- # kubectl get sts.apps.kruise.io
- NAME DESIRED CURRENT UPDATED READY AGE
- demo-v1-guestbook-kruise 3 3 3 3 56s
Note: the statefulset resource provided by OpenKruise is addressed as sts.apps.kruise.io
For more detailed usage, see:
Advanced StatefulSet usage guide: https://github.com/openkruise/kruise/blob/master/docs/concepts/astatefulset/README.md
Advanced StatefulSet example file: https://github.com/openkruise/kruise/blob/master/docs/tutorial/v1/guestbook-statefulset.yaml
UnitedDeployment usage guide: https://github.com/openkruise/kruise/blob/master/docs/tutorial/uniteddeployment.md
UnitedDeployment example file: https://raw.githubusercontent.com/kruiseio/kruise/master/docs/tutorial/v1/uniteddeployment.yaml