Istio stops load balancing when outlier detection configuration is added

Question

Hi team,

I have been using Istio 1.7 with outlier detection, and I ran into some strange behavior. Here is my vs-dr.yaml:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
    - "recommendation-demo.com"
  gateways:
    - istio-system/monitoring-gateway
  http:
  - name: "other-account-route"
    route:
    - destination:
        host: recommendation
        subset: v2
      weight: 100
    - destination:
        host: recommendation
        subset: v1
      weight: 0
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recomm-dr
spec:
  host: recommendation
  subsets:
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
      connectionPool:
        tcp: {}
        http: {}
      outlierDetection:
        consecutiveErrors: 2
        interval: 1s
        baseEjectionTime: 30s
        maxEjectionPercent: 10
  - name: v1
    labels:
      version: v1

If outlier detection is not configured in the destination rule, load balancing works as expected:

kubectl -n micro exec -it $CLIENT_POD -c istio-proxy -- sh -c 'while true; do curl -L recommendation-demo.com; sleep 1; done'
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 45
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 851
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 44
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 46
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 852
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 45
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 47
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 853
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 46
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 48
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 47
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 854

But after I add this part:

outlierDetection:
  consecutiveErrors: 2
  interval: 1s
  baseEjectionTime: 30s
  maxEjectionPercent: 50

the only responses I get come from a single pod:

recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1321
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1322
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1323
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1324
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1325
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1326
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1327

By the way, after I add the outlier configuration and then scale the deployment, the youngest pod does receive traffic:

recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 32
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 33
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1364
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 34
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1365
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 35
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1366
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 36

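So my questions are:

  1. Is this the expected behavior? In this case, say we have 3 pods in one ReplicaSet and then apply the newer configuration; requests are then routed only to the youngest pod, recommendation-v2-57ddf9cd95-skkgd.
  2. If we already have the ReplicaSet and the outlier configuration, and then add extra pods to the ReplicaSet, will they be load balanced successfully?
  3. Has anyone configured outlier detection successfully? Thank you very much for your replies.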

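Answer

I built the example below based on this video on YouTube and the video's GitHub repository.

It is based on one deployment, with a service, the appropriate gateway, a virtual service, and a destination rule.

Tested on GKE with Istio 1.7.4.

Example YAMLs: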

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: recommendation
    version: v2
  name: recommendation-v2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: recommendation
      version: v2
  template:
    metadata:
      labels:
        app: recommendation
        version: v2
      annotations:
        sidecar.istio.io/inject: "true"
    spec:
      containers:
      - env:
        - name: JAVA_OPTIONS
          value: -Xms15m -Xmx15m -Xmn15m
        name: recommendation
        image: quay.io/rhdevelopers/istio-tutorial-recommendation:v2.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 8778
          name: jolokia
          protocol: TCP
        - containerPort: 9779
          name: prometheus
          protocol: TCP
        resources:
          requests:
            memory: "80Mi"
            cpu: "200m" # 1/5 core
          limits:
            memory: "120Mi"
            cpu: "500m"
        livenessProbe:
          exec:
            command:
            - curl
            - localhost:8080/health/live
          initialDelaySeconds: 5
          periodSeconds: 4
          timeoutSeconds: 1
        readinessProbe:
          exec:
            command:
            - curl
            - localhost:8080/health/ready
          initialDelaySeconds: 6
          periodSeconds: 5
          timeoutSeconds: 1
        securityContext:
          privileged: false

---


apiVersion: v1
kind: Service
metadata:
  name: recommendation
  labels:
    app: recommendation
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: recommendation


---            

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
      - "*"

---

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
    - "*"
  gateways:
    - "my-gateway"
  http:
  - name: "other-account-route"
    route:
    - destination:
        host: recommendation
        subset: v2
      weight: 100


---

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recomm-dr
spec:
  host: recommendation
  subsets:
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
      outlierDetection:
        consecutiveErrors: 1
        interval: 1s
        baseEjectionTime: 60s
        maxEjectionPercent: 100

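1. Is this the expected behavior? In this case, say we have 3 pods in one ReplicaSet and then apply the newer configuration; requests are then routed only to the youngest pod, recommendation-v2-57ddf9cd95-skkgd.

No. After outlierDetection is applied, load balancing should work as before, unless the pods return 503.

2. If we already have the ReplicaSet and the outlier configuration, and then add extra pods to the ReplicaSet, will they be load balanced successfully?

Yes, they should be load balanced successfully.


Here are the tests I ran with the YAMLs above.

I added the following outlierDetection: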

  outlierDetection:
    consecutiveErrors: 1
    interval: 10s
    baseEjectionTime: 90s
    maxEjectionPercent: 100

With 2 replicas and outlier detection:

recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 4
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 4
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 5
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 5
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 6

With 2 replicas and outlierDetection, after adding 2 more replicas with kubectl scale deployment recommendation-v2 --replicas=4:

recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 15
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 17
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 16
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 18
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 17
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 19
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 18
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 20
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 19
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 20
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 21
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 21
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 22
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 22
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 23
recommendation v2 from 'recommendation-v2-7f76b4c8cc-kvqjk': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 23
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 24
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 4
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 5
recommendation v2 from 'recommendation-v2-7f76b4c8cc-kvqjk': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-kvqjk': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 24
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 25

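The 2 new replicas, ml9m7 and kvqjk, were added and receive traffic.


3. Has anyone configured outlier detection successfully? Thank you very much for your replies.

If I understand correctly how it is supposed to work, the example above behaves as intended: if you manually make one pod return 503, that pod is ejected from the pool and added back after 90 seconds.

The video mentioned above shows a way to make a recommendation replica return 503: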

kubectl exec -ti recommendation-v2-7f76b4c8cc-6tvmj -c recommendation -- /bin/bash
bash-4.4# curl localhost:8080/misbehave
Following requests to / will return a 503

Once you start sending traffic, you can check the logs of that replica, which should show it returning 503, with:

kubectl logs recommendation-v2-7f76b4c8cc-6tvmj -c recommendation --tail 10

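A few requests reach that replica roughly every 90 seconds: once Istio detects a 503, outlierDetection ejects the pod, and after 90s Istio tries sending traffic to it again, repeating the cycle.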
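
To make the ejection cycle easier to follow, here is an annotated sketch of the outlierDetection block used in the test. The values are the ones from the test above; the comments reflect my reading of the Istio and Envoy documentation and are not something verified in this test.

```yaml
# Annotated sketch of the subset-level policy from the test above.
trafficPolicy:
  outlierDetection:
    # Eject a pod after this many consecutive errors. Over HTTP, a 502, 503
    # or 504 counts as an error, so a single 503 is enough here.
    # Note: the Istio API reference marks consecutiveErrors as deprecated in
    # favor of consecutive5xxErrors / consecutiveGatewayErrors.
    consecutiveErrors: 1
    # How often the ejection analysis sweep runs.
    interval: 10s
    # Minimum time an ejected pod stays out of the pool; the actual time
    # grows with the number of times the pod has been ejected.
    baseEjectionTime: 90s
    # Upper bound on the share of pods in the pool that may be ejected.
    maxEjectionPercent: 100
```

With maxEjectionPercent: 100 every replica can be ejected at once, which is convenient for a demo but may be too aggressive elsewhere.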


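So the issue seems to be related to the locality load balancer.

When outlierDetection is not defined, locality failover is disabled -> so locality is not used. That is why the load balancer works fine.

But after outlierDetection is set, locality failover is enabled by default -> so requests are load balanced within one locality.

-> If you want to make sure: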

failover
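
One way to test this hypothesis, as a hedged sketch: the Istio DestinationRule API has a localityLbSetting block under trafficPolicy.loadBalancer, and setting enabled: false there is documented to turn locality load balancing off for that host. The rule below is the questioner's recomm-dr with that one block added. It assumes the pods are spread across more than one locality, and it is not necessarily the snippet the original answer contained.

```yaml
# Sketch: recomm-dr with locality load balancing switched off for the v2
# subset, so that enabling outlier detection no longer activates it.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recomm-dr
spec:
  host: recommendation
  subsets:
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
        # Assumption: disabling locality LB is the check the answer intended.
        localityLbSetting:
          enabled: false
      outlierDetection:
        consecutiveErrors: 2
        interval: 1s
        baseEjectionTime: 30s
        maxEjectionPercent: 10
```

If requests are spread across all three pods again after applying this, locality load balancing was the cause of the single-pod routing.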