We run an on-premises/private cloud cluster compatible with Kubernetes (OKD 3.11), in which backend applications communicate with low-latency Redis databases used as caches and K/V stores. The new architecture design splits the worker nodes evenly between two geographically separated data centers ("regions"). We can assume a static pairing between node names and regions, and we have now also added node labels carrying the region names.
What is the recommended way to protect the low-latency communication with the in-memory databases, so that client applications stick to the databases in the same region they are allowed to use? Spinning up additional database replicas is feasible, but by itself it does not prevent round-robin routing between the two regions...
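As groundwork for either approach discussed below, the nodes need the standard topology labels. A minimal sketch (node names and label values here are hypothetical, not from the question; note that on clusters as old as OKD 3.11 / Kubernetes 1.11 the legacy labels `failure-domain.beta.kubernetes.io/region` and `failure-domain.beta.kubernetes.io/zone` may be required instead of the `topology.kubernetes.io/*` ones):

```shell
# Label each worker with its region/zone so the scheduler and Istio
# can read the standard topology labels (hypothetical node names).
kubectl label node worker-1 topology.kubernetes.io/region=region-a
kubectl label node worker-1 topology.kubernetes.io/zone=zone-a-1

# Verify the labels across all nodes.
kubectl get nodes -L topology.kubernetes.io/region,topology.kubernetes.io/zone
```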
Best answer
Posting this as a community wiki answer for better visibility; feel free to edit and expand it.
The best option to solve this is Istio - Locality Load Balancing. Key points from the link:
A locality defines the geographic location of a workload instance within your mesh. The following triplet defines a locality:
Region: Represents a large geographic area, such as us-east. A region typically contains a number of availability zones. In Kubernetes, the label topology.kubernetes.io/region determines a node’s region.
Zone: A set of compute resources within a region. By running services in multiple zones within a region, failover can occur between zones within the region while maintaining data locality with the end-user. In Kubernetes, the label topology.kubernetes.io/zone determines a node’s zone.
Sub-zone: Allows administrators to further subdivide zones for more fine-grained control, such as “same rack”. The sub-zone concept doesn’t exist in Kubernetes. As a result, Istio introduced the custom node label topology.istio.io/subzone to define a sub-zone.
That means that a pod running in zone bar of region foo is not considered to be local to a pod running in zone bar of region baz.
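A sketch of what enabling this can look like (the service host, rule name, and region values below are illustrative assumptions, not taken from the question). For the failover mode, Istio requires outlier detection to be configured on the destination, otherwise locality load balancing stays inactive:

```yaml
# Hypothetical DestinationRule: prefer local-region endpoints for the
# Redis cache service, failing over to the other region only when the
# local endpoints are ejected as unhealthy.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: redis-cache
spec:
  host: redis-cache.default.svc.cluster.local   # illustrative service host
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
        - from: region-a    # illustrative region names
          to: region-b
    outlierDetection:       # required for locality failover to take effect
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```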
Another option, suggested in the comments, that can be considered for traffic-balancing adjustments: use nodeAffinity to keep pods consistently scheduled onto the nodes of a specific "zone".
There are currently two types of node affinity, called requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution. You can think of them as "hard" and "soft" respectively, in the sense that the former specifies rules that must be met for a pod to be scheduled onto a node (similar to nodeSelector but using a more expressive syntax), while the latter specifies preferences that the scheduler will try to enforce but will not guarantee. The "IgnoredDuringExecution" part of the names means that, similar to how nodeSelector works, if labels on a node change at runtime such that the affinity rules on a pod are no longer met, the pod continues to run on the node. In the future we plan to offer requiredDuringSchedulingRequiredDuringExecution which will be identical to requiredDuringSchedulingIgnoredDuringExecution except that it will evict pods from nodes that cease to satisfy the pods' node affinity requirements.
Thus an example of requiredDuringSchedulingIgnoredDuringExecution would be "only run the pod on nodes with Intel CPUs" and an example preferredDuringSchedulingIgnoredDuringExecution would be "try to run this set of pods in failure zone XYZ, but if it's not possible, then allow some to run elsewhere".
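A sketch of the "soft" variant applied to this scenario (the deployment name, app label, image, and zone value are illustrative assumptions):

```yaml
# Hypothetical Deployment that prefers, but does not require, scheduling
# its pods onto nodes labeled with a specific zone.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100            # highest preference for the local zone
            preference:
              matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - zone-a           # illustrative zone name
      containers:
      - name: backend
        image: example/backend:latest   # illustrative image
```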
Update: based on @mirekphd's comment, it still does not work fully in the way that was requested:
It turns out that in practice Kubernetes does not really let us switch off the secondary zone, as soon as we spin up a realistic number of pod replicas (just a few is enough to see it)... they keep at least some pods in the other zone/DC/region by design (which is clever when you realize that it removes the dependency on the docker registry's survival, at least under the default imagePullPolicy for tagged images), GitHub issue #99630 - NodeAffinity preferredDuringSchedulingIgnoredDuringExecution doesn't work well
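Given that observation, if pods must never land in the other region, the "hard" requiredDuringSchedulingIgnoredDuringExecution variant quoted above can be used instead, at the cost of pods staying Pending when the required zone has no capacity. A sketch of just the affinity stanza (zone value illustrative):

```yaml
# Hypothetical hard constraint: pods are only ever scheduled onto nodes
# in the named zone; no fallback to the other region.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - zone-a    # illustrative zone name
```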
Original question on Stack Overflow: "kubernetes - Best method to keep client-server traffic in the same zone in Kubernetes/Openshift?", https://stackoverflow.com/questions/70006961/