StatefulSet

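A StatefulSet is similar to a ReplicaSet, but it also controls the order in which Pods start and gives each Pod a unique, persistent identity so that its state can be preserved. It provides:

  • Stable, unique network identifiers

  • Stable, persistent storage

  • Ordered, graceful deployment and scaling

  • Ordered, graceful deletion and termination

  • Ordered, automated rolling updates

DNS format of a Pod in a StatefulSet: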

statefulSetName-{0..N-1}.serviceName.namespace.svc.cluster.local

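where

  • serviceName is the name of the Headless Service

  • 0..N-1 is the Pod's ordinal, running from 0 to N-1

  • statefulSetName is the name of the StatefulSet

  • namespace is the namespace of the Service; the Headless Service and the StatefulSet must be in the same namespace

  • cluster.local is the cluster domain (the default; a cluster may be configured with a different one)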
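The naming rule above can be sketched as a small shell function. This is only an illustration: the function name pod_dns_names and the values "mysql" and "default" are assumptions, not names taken from the chart.

```shell
#!/bin/sh
# Print the stable DNS name of every Pod in a StatefulSet.
# Arguments: StatefulSet name, Headless Service name, namespace, replica count,
# and optionally the cluster domain (defaults to cluster.local).
pod_dns_names() {
    sts_name=$1
    svc_name=$2
    namespace=$3
    replicas=$4
    cluster_domain=${5:-cluster.local}

    ordinal=0
    while [ "$ordinal" -lt "$replicas" ]; do
        echo "${sts_name}-${ordinal}.${svc_name}.${namespace}.svc.${cluster_domain}"
        ordinal=$((ordinal + 1))
    done
}

# Hypothetical values: a StatefulSet and a Headless Service both named "mysql".
pod_dns_names mysql mysql default 3
# mysql-0.mysql.default.svc.cluster.local
# mysql-1.mysql.default.svc.cluster.local
# mysql-2.mysql.default.svc.cluster.local
```

Because the names depend only on the StatefulSet name, the Service name, and the ordinal, a replica keeps the same address after it is rescheduled.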

1 Practice

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ template "fullname" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    app: {{ template "fullname" . }}
    chart: {{ template "xdb.chart" . }}
    release: {{ .Release.Name | quote }}
    heritage: {{ .Release.Service | quote }}
  {{- with .Values.statefulsetAnnotations }}
  annotations:
{{ toYaml . | indent 4 }}
  {{- end }}

spec:
  serviceName: {{ template "fullname" . }}
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ template "fullname" . }}
      release: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ template "fullname" . }}
        release: {{ .Release.Name }}
        role: candidate
        {{- with .Values.podLabels }}
{{ toYaml . | indent 8 }}
        {{- end }}
      annotations:
        checksum/config: {{include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
{{ toYaml . | indent 8 }}
        {{- end }}
    spec:
      {{- if .Values.schedulerName }}
      schedulerName: "{{ .Values.schedulerName }}"
      {{- end }}
      serviceAccountName: {{ template "serviceAccountName" . }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
{{ toYaml . | indent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
{{ toYaml . | indent 8 }}
      {{- end }}
      affinity:
      {{- if .Values.affinity }}
{{ toYaml .Values.affinity  | indent 8 }}
      {{- end }}
        # The anti-affinity rule below ensures that database Pods are never scheduled onto the same node
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "release"
                operator: In
                values:
                - {{ .Release.Name }}
              - key: app
                operator: In
                values:
                - {{ template "fullname" . }}
            topologyKey: "kubernetes.io/hostname"
      initContainers:
      - name: init-mysql
        image: "{{ .Values.mysql.image }}:{{ .Values.mysql.tag }}"
        imagePullPolicy: {{ .Values.imagePullPolicy | quote }}
        resources:
{{ toYaml .Values.resources | indent 10 }}
        command: ['sh','-c']
        args:
          - |
            # Workaround for kylin4/10's docker start with umask 027.
            umask 0022
            # Generate mysql server-id from the pod ordinal index (the hostname suffix after the last "-").
            ordinal=$(hostname | sed 's/.*-//')
            # Copy server-id.conf adding offset to avoid reserved server-id=0 value.
            cat /mnt/config-map/server-id.cnf | sed s/@@SERVER_ID@@/$((100 + $ordinal))/g > /mnt/conf.d/server-id.cnf
            # Copy appropriate conf.d files from config-map to config mount.
            cp -f /mnt/config-map/node.cnf /mnt/conf.d/
            cp -f /mnt/config-map/*.sh /mnt/scripts/
            chmod +x /mnt/scripts/*
            /init-container.sh {{ .Release.Name }}
            {{- if .Values.persistence.enabled }}
            # remove lost+found.
            rm -rf /mnt/data/lost+found
            {{- end }}
        volumeMounts:
          - name: conf
            mountPath: /mnt/conf.d
          - name: scripts
            mountPath: /mnt/scripts
          - name: config-map
            mountPath: /mnt/config-map
          {{- if .Values.persistence.enabled }}
          - name: data
            mountPath: /mnt/data
          {{- end }}
      containers:
      - name: mysql
        image: "{{ .Values.mysql.image }}:{{ .Values.mysql.tag }}"
        imagePullPolicy: {{ .Values.imagePullPolicy | quote }}
        lifecycle:
          preStop:
            exec:
              command:
                - sh
                - -c
                - /mysql/bin/mysqladmin --defaults-file=/mysql/etc/user.root.cnf shutdown && sleep 10
        command:
          - sh
          - -c
          - umask 0022 && sleep 30 && /docker-entrypoint.sh mysqld {{ .Release.Name }}
        {{- with .Values.mysql.args }}
        args:
        {{- range . }}
          - {{ . | quote }}
        {{- end }}
        {{- end }}
        resources:
{{ toYaml .Values.mysql.resources | indent 10 }}
        env:
        # namespace
        - name: {{ (print $.Release.Name "_NAMESPACE") }}
          value: {{ .Release.Namespace }}
        # HAStrategy
        - name: {{ (print $.Release.Name "_SQL_RUNNING_ABNORMAL_JOIN_TOPO") }}
          {{- if not .Values.mysql.HAStrategy.sqlRunningAbnormalJoinTopo }}
          value: "true"
          {{- else }}
          value: {{ .Values.mysql.HAStrategy.sqlRunningAbnormalJoinTopo | quote }}
          {{- end }}
        - name: {{ (print $.Release.Name "_GTID_ABNORMAL_JOIN_TOPO") }}
          {{- if not .Values.mysql.HAStrategy.gtidAbnormalJoinTopo }}
          value: "true"
          {{- else }}
          value: {{ .Values.mysql.HAStrategy.gtidAbnormalJoinTopo | quote }}
          {{- end }}
        - name: {{ (print $.Release.Name "_NOT_IN_TOPO_JOIN_EXCHANGE") }}
          {{- if not .Values.mysql.HAStrategy.notInTopoJoinExchange }}
          value: "true"
          {{- else }}
          value: {{ .Values.mysql.HAStrategy.notInTopoJoinExchange | quote }}
          {{- end }}
        - name: {{ (print $.Release.Name "_MASTER_SWITCH_TO_SLAVE_KILL_SESSION") }}
          {{- if not .Values.mysql.HAStrategy.masterSwitchToSlaveKillSession }}
          value: "false"
          {{- else }}
          value: {{ .Values.mysql.HAStrategy.masterSwitchToSlaveKillSession | quote }}
          {{- end }}
        # resource params
        - name: {{ (print $.Release.Name "_CPU_LIMIT") }}
          {{- if not .Values.mysql.resources.limits }}
          value: "0"
          {{- else if not .Values.mysql.resources.limits.cpu }}
          value: "0"
          {{- else }}
          value: {{ .Values.mysql.resources.limits.cpu | quote }}
          {{- end }}
        - name: {{ (print $.Release.Name "_MEMORY_LIMIT") }}
          {{- if not .Values.mysql.resources.limits }}
          value: "0"
          {{- else if not .Values.mysql.resources.limits.memory }}
          value: "0"
          {{- else }}
          value: {{ .Values.mysql.resources.limits.memory | quote }}
          {{- end }}
        - name: {{ (print $.Release.Name "_DISK_LIMIT") }}
          {{- if not .Values.persistence.enabled }}
          value: "0"
          {{- else if not .Values.persistence.size }}
          value: "0"
          {{- else }}
          value: {{ .Values.persistence.size | quote }}
          {{- end }}
        {{- if .Values.timezone }}
        - name: TZ
          value: {{ .Values.timezone }}
        {{- end }}
        {{- if .Values.mysql.extraEnvVars }}
{{ tpl .Values.mysql.extraEnvVars . | indent 8 }}
        {{- end }}
        - name: {{ (print $.Release.Name "_MYSQL_ROOT_PASSWORD") }}
          valueFrom:
            secretKeyRef:
              name: {{ template "fullname" . }}
              key: mysql-root-password
        # backup env params
        - name: {{ (print $.Release.Name "_BACKUP_ENABLED") }}
          value: {{ .Values.backup.enabled | quote }}
        - name: {{ (print $.Release.Name "_SCHEDULE") }}
          value: {{ .Values.backup.cronjob.schedule | quote }}
        - name: {{ (print $.Release.Name "_LOCAL_ENABLED") }}
          {{- if .Values.backup.storage.hostPath.enabled  }}
          value: "true"
          {{- else }}
          value: "false"
          {{- end }}
        - name: {{ (print $.Release.Name "_LOCAL_PATH") }}
          value: "/data/backup"
        - name: {{ (print $.Release.Name "_LOCAL_SAVE_HOURS") }}
          {{- if .Values.backup.storage.hostPath.enabled }}
          value: {{ .Values.backup.storage.hostPath.saveHours | quote }}
          {{- else if .Values.backup.storage.persistence.enabled }}
          value: {{ .Values.backup.storage.persistence.saveHours | quote }}
          {{- else }}
          value: "0"
          {{- end}}
        - name: {{ (print $.Release.Name "_S3_ENABLED") }}
          {{- if .Values.backup.storage.s3.enabled }}
          value: "true"
          {{- else }}
          value: "false"
          {{- end }}
        - name: {{ (print $.Release.Name "_S3_ADDRESS") }}
          value: {{ .Values.backup.storage.s3.address  | quote }}
        - name: {{ (print $.Release.Name "_S3_BUCKET") }}
          value: {{ .Values.backup.storage.s3.bucket | quote }}
        - name: {{ (print $.Release.Name "_S3_PATH") }}
          value: {{ .Release.Name }}
        - name: {{ (print $.Release.Name "_S3_AK") }}
          value: {{ .Values.backup.storage.s3.accessKey  | quote }}
        - name: {{ (print $.Release.Name "_S3_SK") }}
          value: {{ .Values.backup.storage.s3.secretKey  | quote }}
        - name: {{ (print $.Release.Name "_S3_SAVE_HOURS") }}
          value: {{ .Values.backup.storage.s3.saveHours | quote }}
        ports:
        - name: mysql
          containerPort: 3306
        - name: xagent
          containerPort: 8500
        - name: xagent-sync
          containerPort: 8501
        volumeMounts:
        - name: data
          mountPath: /mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        - name: logs
          mountPath: /var/log/mysql
        {{- if .Values.mysql.initializationFiles }}
        - name: initialization
          mountPath: /docker-entrypoint-initdb.d
        {{- end }}
        {{- if .Values.backup.enabled }}
        - name: backup-data
          mountPath: /data/backup
        {{- end }}
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - /mysql/bin/mysqladmin --defaults-file=/mysql/etc/user.root.cnf ping
          initialDelaySeconds: {{ .Values.mysql.livenessProbe.initialDelaySeconds }}
          periodSeconds: {{ .Values.mysql.livenessProbe.periodSeconds }}
          timeoutSeconds: {{ .Values.mysql.livenessProbe.timeoutSeconds }}
          successThreshold: {{ .Values.mysql.livenessProbe.successThreshold }}
          failureThreshold: {{ .Values.mysql.livenessProbe.failureThreshold }}
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - /mysql/bin/mysql --defaults-file=/mysql/etc/user.root.cnf -e "SELECT 1"
          initialDelaySeconds: {{ .Values.mysql.readinessProbe.initialDelaySeconds }}
          periodSeconds: {{ .Values.mysql.readinessProbe.periodSeconds }}
          timeoutSeconds: {{ .Values.mysql.readinessProbe.timeoutSeconds }}
          successThreshold: {{ .Values.mysql.readinessProbe.successThreshold }}
          failureThreshold: {{ .Values.mysql.readinessProbe.failureThreshold }}
      volumes:
      - name: conf
        emptyDir: {}
      - name: scripts
        emptyDir: {}
      - name: logs
        emptyDir: {}
      - name: config-map
        configMap:
          name: {{ template "fullname" . }}
      {{- if and .Values.persistence.enabled .Values.persistence.hostPath }}
      - name: data
        hostPath:
          path: {{ (print .Values.persistence.hostPath "/" $.Release.Name) }}
          type: DirectoryOrCreate
      {{- else if not .Values.persistence.enabled }}
      - name: data
        emptyDir: {}
      {{- end }}
      {{- if .Values.backup.enabled }}
      - name: backup-data
        {{- if .Values.backup.storage.hostPath.enabled }}
        hostPath:
          path: {{ (print $.Values.backup.storage.hostPath.hostPath "/" $.Release.Name) }}
          type: DirectoryOrCreate
        {{- else if not .Values.backup.storage.persistence.enabled }}
        emptyDir: {}
        {{- end }}
      {{- end }}
  volumeClaimTemplates:
  {{- if and .Values.persistence.enabled (not .Values.persistence.hostPath) }}
  - metadata:
      name: data
      annotations:
      {{- range $key, $value := .Values.persistence.annotations }}
        {{ $key }}: {{ $value }}
      {{- end }}
    spec:
      accessModes:
      {{- range .Values.persistence.accessModes }}
      - {{ . | quote }}
      {{- end }}
      resources:
        requests:
          storage: {{ .Values.persistence.size | quote }}
      {{- if .Values.persistence.storageClass }}
      {{- if (eq "-" .Values.persistence.storageClass) }}
      storageClassName: ""
      {{- else }}
      storageClassName: "{{ .Values.persistence.storageClass }}"
      {{- end }}
      {{- end }}
  {{- end }}
  {{- if and .Values.backup.enabled .Values.backup.storage.persistence.enabled }}
  - metadata:
      name: backup-data
    spec:
      accessModes:
      {{- range .Values.backup.storage.persistence.accessModes }}
      - {{ . | quote }}
      {{- end }}
      resources:
        requests:
          storage: {{ .Values.backup.storage.persistence.size }}
      storageClassName: {{ .Values.backup.storage.persistence.storageClass }}
  {{- end }}

This is a Helm template for a Kubernetes StatefulSet that deploys a stateful, highly available MySQL cluster (or a similar database service). The key parts are explained below.

1.1 Metadata

metadata:
  name: {{ template "fullname" . }}  # full name generated by a template (usually includes the Release name)
  namespace: {{ .Release.Namespace }}  # namespace to deploy into
  labels:  # labels identify and select resources
    app: {{ template "fullname" . }}
    chart: {{ template "xdb.chart" . }}  # Helm chart name
    release: {{ .Release.Name | quote }}  # Helm release name
    heritage: {{ .Release.Service | quote }}  # deployment tool (e.g. Helm)
  {{- with .Values.statefulsetAnnotations }}
  annotations:  # optional annotations
{{ toYaml . | indent 4 }}
  {{- end }}

1.2 StatefulSet core configuration

Basic parameters

spec:
  serviceName: {{ template "fullname" . }}  # name of the associated Headless Service
  replicas: {{ .Values.replicaCount }}  # number of replicas (Pods)
  selector:  # selector; must match the Pod template labels
    matchLabels:
      app: {{ template "fullname" . }}
      release: {{ .Release.Name }}

Pod template (Template)

  • Labels and annotations

    metadata:
      labels:
        app: {{ template "fullname" . }}
        release: {{ .Release.Name }}
        role: candidate  # marks the Pod's role (e.g. a candidate for master)
      annotations:
        checksum/config: {{include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}  # triggers a rolling update when the configuration changes

  • Scheduling

    spec:
      schedulerName: "{{ .Values.schedulerName }}"  # custom scheduler (optional)
      nodeSelector: {{ toYaml .Values.nodeSelector | nindent 8 }}  # node selector
      tolerations: {{ toYaml .Values.tolerations | nindent 8 }}  # tolerations for node taints
      affinity:  # affinity / anti-affinity rules
        podAntiAffinity:  # required anti-affinity: never place two of these Pods on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: release
                operator: In
                values:
                - {{ .Release.Name }}
              - key: app
                operator: In
                values:
                - {{ template "fullname" . }}
            topologyKey: "kubernetes.io/hostname"
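The checksum/config annotation works because any change to the rendered ConfigMap produces a different hash, which changes the Pod template and so causes the StatefulSet to roll its Pods. A minimal sketch of that idea, assuming the sha256sum utility from coreutils is available; the function name config_checksum and the config contents are made up for illustration:

```shell
#!/bin/sh
# Hash a rendered config string, as a checksum annotation would.
config_checksum() {
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

old_config='max_connections=500'   # hypothetical ConfigMap content
new_config='max_connections=800'   # the same config after an edit

old_sum=$(config_checksum "$old_config")
new_sum=$(config_checksum "$new_config")

if [ "$old_sum" != "$new_sum" ]; then
    echo "annotation changed: Pods will be rolled"
else
    echo "annotation unchanged: no restart"
fi
```

Without such an annotation, editing only the ConfigMap leaves the Pod template untouched, so running Pods would keep the old configuration until they are restarted for another reason.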

1.3 Container configuration

Init containers (InitContainers)

initContainers:
- name: init-mysql
  image: "{{ .Values.mysql.image }}:{{ .Values.mysql.tag }}"
  command: ['sh','-c']
  args:
    - |
      # Generate a unique server-id from the Pod ordinal (the hostname suffix after the last "-")
      ordinal=$(hostname | sed 's/.*-//')
      cat /mnt/config-map/server-id.cnf | sed s/@@SERVER_ID@@/$((100 + $ordinal))/g > /mnt/conf.d/server-id.cnf
      # Copy the configuration file
      cp -f /mnt/config-map/node.cnf /mnt/conf.d/
      {{- if .Values.persistence.enabled }}
      rm -rf /mnt/data/lost+found  # remove the default lost+found directory
      {{- end }}
  volumeMounts:  # mount the configuration and data volumes
    - name: config-map
      mountPath: /mnt/config-map
    - name: data
      mountPath: /mnt/data
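Taking the ordinal from the suffix after the last hyphen keeps it independent of any digits in the StatefulSet name itself (a name such as xdbmysql57001 contains several). The sketch below shows one way to do this and to fill the @@SERVER_ID@@ placeholder. The function names and the one-line template 'server-id=@@SERVER_ID@@' are assumptions, since the real server-id.cnf is not shown here.

```shell
#!/bin/sh
# Extract the ordinal from a StatefulSet Pod hostname: everything after the last "-".
pod_ordinal() {
    echo "${1##*-}"
}

# Fill the @@SERVER_ID@@ placeholder; the offset of 100 avoids the reserved server-id=0.
render_server_id() {
    ordinal=$(pod_ordinal "$1")
    # Hypothetical template content standing in for server-id.cnf.
    echo 'server-id=@@SERVER_ID@@' | sed "s/@@SERVER_ID@@/$((100 + ordinal))/g"
}

render_server_id xdbmysql57001-0    # server-id=100
render_server_id xdbmysql57001-10   # server-id=110
```

Inside the init container the argument would be the output of hostname, which for a StatefulSet Pod equals the Pod name.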

Main container (MySQL)

containers:
- name: mysql
  image: "{{ .Values.mysql.image }}:{{ .Values.mysql.tag }}"
  env:  # environment variables (names are generated from the Release name)
    - name: {{ (print $.Release.Name "_NAMESPACE") }}
      value: {{ .Release.Namespace }}
    - name: {{ (print $.Release.Name "_MYSQL_ROOT_PASSWORD") }}
      valueFrom:  # read the password from a Secret
        secretKeyRef:
          name: {{ template "fullname" . }}
          key: mysql-root-password
  ports:  # exposed ports
    - name: mysql
      containerPort: 3306
  volumeMounts:  # mount the data, configuration, log, and backup directories
    - name: data
      mountPath: /mysql
    - name: conf
      mountPath: /etc/mysql/conf.d
    - name: logs
      mountPath: /var/log/mysql
    - name: backup-data
      mountPath: /data/backup
  livenessProbe:  # liveness check: is mysqld responding?
    exec:
      command:  # run through sh -c, because exec does not invoke a shell
        - sh
        - -c
        - /mysql/bin/mysqladmin --defaults-file=/mysql/etc/user.root.cnf ping
  readinessProbe:  # readiness check: can MySQL execute a query?
    exec:
      command:
        - sh
        - -c
        - /mysql/bin/mysql --defaults-file=/mysql/etc/user.root.cnf -e "SELECT 1"

After the container starts, it has the following mounts:

"/data/lib/kubelet/pods/pod_xxx/volumes/kubernetes.io~empty-dir/conf:/etc/mysql/conf.d",
"/data/lib/kubelet/pods/pod_xxx/volumes/kubernetes.io~empty-dir/logs:/var/log/mysql",

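An emptyDir volume is a temporary directory bound to the lifecycle of the Pod, similar to a volume mount in Docker: it is created when the Pod is assigned to a node and deleted together with the Pod when the Pod is removed.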

Volumes

      volumes:
      - name: conf
        emptyDir: {}
      - name: scripts
        emptyDir: {}
      - name: logs
        emptyDir: {}
      - name: config-map
        configMap:
          name: {{ template "fullname" . }}
      {{- if and .Values.persistence.enabled .Values.persistence.hostPath }}
      - name: data
        hostPath:
          path: {{ (print .Values.persistence.hostPath "/" $.Release.Name) }}
          type: DirectoryOrCreate
      {{- else if not .Values.persistence.enabled }}
      - name: data
        emptyDir: {}
      {{- end }}
      {{- if .Values.backup.enabled }}
      - name: backup-data
        {{- if .Values.backup.storage.hostPath.enabled }}
        hostPath:
          path: {{ (print $.Values.backup.storage.hostPath.hostPath "/" $.Release.Name) }}
          type: DirectoryOrCreate
        {{- else if not .Values.backup.storage.persistence.enabled }}
        emptyDir: {}
        {{- end }}
      {{- end }}

1.4 Storage configuration

Volume claims (VolumeClaimTemplates)

volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: {{ toYaml .Values.persistence.accessModes | nindent 4 }}
    resources:
      requests:
        storage: {{ .Values.persistence.size | quote }}  # size of the persistent storage requested dynamically
    storageClassName: {{ .Values.persistence.storageClass }}  # storage class (optional)

Backup volume (Backup)

{{- if and .Values.backup.enabled .Values.backup.storage.persistence.enabled }}
- metadata:
    name: backup-data
  spec:
    accessModes: {{ toYaml .Values.backup.storage.persistence.accessModes | nindent 4 }}
    resources:
      requests:
        storage: {{ .Values.backup.storage.persistence.size }}
{{- end }}
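Each entry in volumeClaimTemplates yields one PVC per Pod, named <claim template name>-<StatefulSet name>-<ordinal>. The following sketch lists the names to expect; the function name pvc_names and the replica count are illustrative assumptions.

```shell
#!/bin/sh
# List the PVC names a StatefulSet creates from its volumeClaimTemplates.
# Arguments: StatefulSet name, replica count, then one or more claim template names.
pvc_names() {
    sts_name=$1
    replicas=$2
    shift 2
    for claim in "$@"; do
        ordinal=0
        while [ "$ordinal" -lt "$replicas" ]; do
            echo "${claim}-${sts_name}-${ordinal}"
            ordinal=$((ordinal + 1))
        done
    done
}

pvc_names xdbmysql57001 2 data backup-data
# data-xdbmysql57001-0
# data-xdbmysql57001-1
# backup-data-xdbmysql57001-0
# backup-data-xdbmysql57001-1
```

Since the name is derived from the Pod identity, a recreated Pod with the same ordinal binds to the same PVC again and finds its previous data.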

After the container starts, it has the following mounts (xdbmysql57001 is the StatefulSet name, and xdbmysql57001-1 is the Pod name):

"/data/local-path-provisioner/pvc-xx1_default_data-xdbmysql57001-1:/mysql",
"/data/local-path-provisioner/pvc-xx2_default_backup-data-xdbmysql57001-1:/data/backup",

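1.5 Key features

High availability

  • podAntiAffinity ensures that Pods are spread across different nodes.

  • The Headless Service (serviceName) provides stable network identities.

Dynamic configuration

  • Resource names, labels, and environment variables are generated by Helm templates.

  • Configuration can be injected from a ConfigMap and a Secret.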
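Because the environment variable names are prefixed with the Release name, a script running in the container has to build the variable name before reading it. A minimal sketch of such a lookup, assuming printenv is available in the image; the function release_env and the release name "xdb1" are hypothetical:

```shell
#!/bin/sh
# Look up a release-prefixed variable such as <release>_NAMESPACE.
release_env() {
    release=$1
    suffix=$2
    printenv "${release}_${suffix}"
}

# Simulate what the Pod spec would inject for a hypothetical release "xdb1".
export xdb1_NAMESPACE=default
export xdb1_BACKUP_ENABLED=true

release_env xdb1 NAMESPACE        # default
release_env xdb1 BACKUP_ENABLED   # true
```

printenv is used instead of shell variable expansion so that the lookup also works for names that are not valid shell identifiers, for example when a release name contains a hyphen.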

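Persistent storage

  • volumeClaimTemplates request PVCs dynamically.

  • Both local paths (hostPath) and cloud storage are supported.

Backup support

  • S3 or a local path can be configured as the backup storage.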
