kubernetes - How to implement Kubernetes horizontal pod autoscaling with scale-up/scale-down policies?

Tags: kubernetes kubernetes-helm amazon-eks hpa

Kubernetes v1.19 on AWS EKS.
I am trying to implement horizontal pod autoscaling in my EKS cluster, mimicking what we currently do with ECS. With ECS we do something like the following:

  • Scale up when CPU >= 90% for 3 consecutive 1-minute sampling periods
  • Scale down when CPU <= 60% for 5 consecutive 1-minute sampling periods
  • Scale up when memory >= 85% for 3 consecutive 1-minute sampling periods
  • Scale down when memory <= 70% for 5 consecutive 1-minute sampling periods

  • I am trying to use the HorizontalPodAutoscaler kind, and helm create gave me this template. (Note: I modified it to suit my needs, but the metrics stanza remains.)
    {{- if .Values.autoscaling.enabled }}
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: {{ include "microserviceChart.Name" . }}
      labels:
        {{- include "microserviceChart.Name" . | nindent 4 }}
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: {{ include "microserviceChart.Name" . }}
      minReplicas: {{ include "microserviceChart.minReplicas" . }}
      maxReplicas: {{ include "microserviceChart.maxReplicas" . }}
      metrics:
        {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
        {{- end }}
        {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
        - type: Resource
          resource:
            name: memory
            targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
        {{- end }}
    {{- end }}
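A note on the template above: the autoscaling/v2beta1 API and its targetAverageUtilization field are deprecated (v2beta1 was removed in Kubernetes 1.22). On clusters that serve autoscaling/v2beta2, the same metrics stanza can be written as below; this is a sketch assuming the same values keys, with apiVersion also changed to autoscaling/v2beta2:

```yaml
# Sketch: the metrics stanza from the template above in autoscaling/v2beta2
# form (per-resource "target" block instead of targetAverageUtilization).
metrics:
  {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
  {{- end }}
  {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
  {{- end }}
```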
    
But how do I fit the scale-up/scale-down behavior shown in Horizontal Pod Autoscaling into the template above, to match the behavior I want?

Best Answer

The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed metrics (such as CPU and memory).
There is an official walkthrough focused on the HPA and how it scales:

  • Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Walkthrough

  • The algorithm that scales the number of replicas is as follows:
  • desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
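Applied once, the formula can be checked with simple integer arithmetic, for example in shell (the replica count and percentages below are made-up values for illustration):

```shell
# desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
# Example: 5 replicas at 55% average utilization with a 75% target.
current_replicas=5
current_metric=55
desired_metric=75
# Integer ceiling division: ceil(a/b) = (a + b - 1) / b
desired_replicas=$(( (current_replicas * current_metric + desired_metric - 1) / desired_metric ))
echo "$desired_replicas"   # ceil[5 * 55/75] = ceil[3.67] -> 4, i.e. scale down to 4
```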

  • An example YAML that implements the (already presented) autoscaling could look like this:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: HPA-NAME
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: DEPLOYMENT-NAME
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 75
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 75
    

    A side note!

    The HPA will calculate both metrics and choose the one that yields the bigger desiredReplicas!


    To address the comment I wrote under the question:

    I think we misunderstood each other. It's perfectly okay to "scale up when CPU >= 90" but due to the logic behind the formula I don't think it will be possible to say "scale down when CPU <= 70". According to the formula it would be something in the midst of: scale up when CPU >= 90 and scale down when CPU <= 45.


    This example could be misleading and is not 100% true in all scenarios. Take a look at the example below:
  • HPA set to an averageUtilization of 75%.

    Quick calculations with some degree of approximation (the default tolerance for the HPA is 0.1):
  • 2 replicas:
      • scale-up (by 1) should happen when currentMetricValue is >= 80%:
        x = ceil[2 * (80/75)], x = ceil[2.1(3)], x = 3
      • scale-down (by 1) should happen when currentMetricValue is <= 33%:
        x = ceil[2 * (33/75)], x = ceil[0.88], x = 1

  • 8 replicas:
      • scale-up (by 1) should happen when currentMetricValue is >= 76%:
        x = ceil[8 * (76/75)], x = ceil[8.10(6)], x = 9
      • scale-down (by 1) should happen when currentMetricValue is <= 64%:
        x = ceil[8 * (64/75)], x = ceil[6.82(6)], x = 7

  • Following this example, 8 replicas with a currentMetricValue of 55 (and desiredMetricValue set to 75) should scale down to 6 replicas.
    More information describing the decision-making of the HPA (for example why it's not scaling) can be found by running:
  • $ kubectl describe hpa HPA-NAME
  • Name:                                                     nginx-scaler
    Namespace:                                                default
    Labels:                                                   <none>
    Annotations:                                              <none>
    CreationTimestamp:                                        Sun, 07 Mar 2021 22:48:58 +0100
    Reference:                                                Deployment/nginx-scaling
    Metrics:                                                  ( current / target )
      resource memory on pods  (as a percentage of request):  5% (61903667200m) / 75%
      resource cpu on pods  (as a percentage of request):     79% (199m) / 75%
    Min replicas:                                             1
    Max replicas:                                             10
    Deployment pods:                                          5 current / 5 desired
    Conditions:
      Type            Status  Reason              Message
      ----            ------  ------              -------
      AbleToScale     True    ReadyForNewScale    recommended size matches current size
      ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
      ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
    Events:
      Type     Reason                   Age                   From                       Message
      ----     ------                   ----                  ----                       -------
      Warning  FailedGetResourceMetric  4m48s (x4 over 5m3s)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
      Normal   SuccessfulRescale        103s                  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
      Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
      Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
    
    The way the HPA scales can be modified by changes introduced in Kubernetes version 1.18:

    Support for configurable scaling behavior

    Starting from v1.18 the v2beta2 API allows scaling behavior to be configured through the HPA behavior field. Behaviors are specified separately for scaling up and down in scaleUp or scaleDown section under the behavior field. A stabilization window can be specified for both directions which prevents the flapping of the number of the replicas in the scaling target. Similarly specifying scaling policies controls the rate of change of replicas while scaling.

    Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Support for configurable scaling behavior


    I think you could use the newly introduced fields like behavior and stabilizationWindowSeconds to tune your workload to your specific needs.
    I'd also encourage you to check the EKS documentation for more references, supported metrics and examples.
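As a sketch of what that could look like for the requirements from the question: the HPA cannot express separate up/down thresholds directly (the scale-down point follows from the formula above), but the behavior field can approximate the "N consecutive 1-minute periods" part. The values below are illustrative assumptions, not a drop-in config:

```yaml
# Sketch (autoscaling/v2beta2, Kubernetes >= 1.18); windows and policies are
# assumptions approximating "3 x 1-minute up / 5 x 1-minute down" from ECS.
spec:
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 180   # roughly 3 consecutive 1-minute periods
      policies:
      - type: Pods
        value: 1                        # add at most 1 pod per period
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # roughly 5 consecutive 1-minute periods
      policies:
      - type: Pods
        value: 1                        # remove at most 1 pod per period
        periodSeconds: 60
```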

    Regarding "kubernetes - How to implement Kubernetes horizontal pod autoscaling with scale-up/scale-down policies?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66485722/
