elasticsearch 无法解析主机 [elasticsearch-discovery]

标签 elasticsearch kubernetes kubernetes-helm amazon-eks

我将 elastichsearch 部署到我的 AWS EKS 以进行日志记录,使用 stable/elasticsearch 图表,使用以下命令:

helm install stable/elasticsearch --namespace logging --name elasticsearch --set data.terminationGracePeriodSeconds=0

安装后,所有在日志记录下运行的 pod 都处于 Running 但未处于就绪状态

NAME                                    READY   STATUS    RESTARTS   AGE
elasticsearch-client-64bb574bff-85lp9   0/1     Running   0          41s
elasticsearch-client-64bb574bff-t4h6r   0/1     Running   0          44m
elasticsearch-data-0                    0/1     Pending   0          44m
elasticsearch-master-0                  0/1     Pending   0          44m

我从 elasticsearch pod 日志中得到了这个警告

[elasticsearch-client-647c67f49d-npjp4] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again
[2018-11-27T22:53:55,009][WARN ][o.e.d.z.UnicastZenPing   ] 
[elasticsearch-client-647c67f49d-npjp4] failed to resolve host 
[elasticsearch-discovery] java.net.UnknownHostException: elasticsearch-discovery: Name or service not known

更新

helm 模板 ./charts/stable/elasticsearch --namespace logging --name elasticsearch --set data.terminationGracePeriodSeconds=0 > deployment.yaml

这是helm install的模板

---
# Source: elasticsearch/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch
  labels:
    app: elasticsearch
    chart: "elasticsearch-1.14.1"
    release: "elasticsearch"
    heritage: "Tiller"
data:
  elasticsearch.yml: |-
    cluster.name: elasticsearch

    node.data: ${NODE_DATA:true}
    node.master: ${NODE_MASTER:true}
    node.ingest: ${NODE_INGEST:true}
    node.name: ${HOSTNAME}

    network.host: 0.0.0.0
    # see https://github.com/kubernetes/kubernetes/issues/3595
    bootstrap.memory_lock: ${BOOTSTRAP_MEMORY_LOCK:false}

    discovery:
      zen:
        ping.unicast.hosts: ${DISCOVERY_SERVICE:}
        minimum_master_nodes: ${MINIMUM_MASTER_NODES:2}

    # see https://github.com/elastic/elasticsearch-definitive-guide/pull/679
    processors: ${PROCESSORS:}

    # avoid split-brain w/ a minimum consensus of two masters plus a data node
    gateway.expected_master_nodes: ${EXPECTED_MASTER_NODES:2}
    gateway.expected_data_nodes: ${EXPECTED_DATA_NODES:1}
    gateway.recover_after_time: ${RECOVER_AFTER_TIME:5m}
    gateway.recover_after_master_nodes: ${RECOVER_AFTER_MASTER_NODES:2}
    gateway.recover_after_data_nodes: ${RECOVER_AFTER_DATA_NODES:1}
  log4j2.properties: |-
    status = error
    appender.console.type = Console
    appender.console.name = console
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%m%n
    rootLogger.level = info
    rootLogger.appenderRef.console.ref = console
    logger.searchguard.name = com.floragunn
    logger.searchguard.level = info
  pre-stop-hook.sh: |-
    #!/bin/bash
    exec &> >(tee -a "/var/log/elasticsearch-hooks.log")
    NODE_NAME=${HOSTNAME}
    echo "Prepare to migrate data of the node ${NODE_NAME}"
    echo "Move all data from node ${NODE_NAME}"
    curl -s -XPUT -H 'Content-Type: application/json' 'elasticsearch-client:9200/_cluster/settings' -d "{
      \"transient\" :{
          \"cluster.routing.allocation.exclude._name\" : \"${NODE_NAME}\"
      }
    }"
    echo ""

    while true ; do
      echo -e "Wait for node ${NODE_NAME} to become empty"
      SHARDS_ALLOCATION=$(curl -s -XGET 'http://elasticsearch-client:9200/_cat/shards')
      if ! echo "${SHARDS_ALLOCATION}" | grep -E "${NODE_NAME}"; then
        break
      fi
      sleep 1
    done
    echo "Node ${NODE_NAME} is ready to shutdown"
  post-start-hook.sh: |-
    #!/bin/bash
    exec &> >(tee -a "/var/log/elasticsearch-hooks.log")
    NODE_NAME=${HOSTNAME}
    CLUSTER_SETTINGS=$(curl -s -XGET "http://elasticsearch-client:9200/_cluster/settings")
    if echo "${CLUSTER_SETTINGS}" | grep -E "${NODE_NAME}"; then
      echo "Activate node ${NODE_NAME}"
      curl -s -XPUT -H 'Content-Type: application/json' "http://elasticsearch-client:9200/_cluster/settings" -d "{
        \"transient\" :{
            \"cluster.routing.allocation.exclude._name\" : null
        }
      }"
    fi
    echo "Node ${NODE_NAME} is ready to be used"

---
# Source: elasticsearch/templates/client-serviceaccount.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "client"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-client

---
# Source: elasticsearch/templates/data-serviceaccount.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "data"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-data

---
# Source: elasticsearch/templates/master-serviceaccount.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "master"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-master

---
# Source: elasticsearch/templates/client-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "client"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-client

spec:
  ports:
    - name: http
      port: 9200
      targetPort: http
  selector:
    app: elasticsearch
    component: "client"
    release: elasticsearch
  type: ClusterIP

---
# Source: elasticsearch/templates/master-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "master"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-discovery
spec:
  clusterIP: None
  ports:
    - port: 9300
      targetPort: transport
  selector:
    app: elasticsearch
    component: "master"
    release: elasticsearch

---
# Source: elasticsearch/templates/client-deployment.yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "client"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-client
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: elasticsearch
        component: "client"
        release: elasticsearch
    spec:
      serviceAccountName: elasticsearch-client
      securityContext:
        fsGroup: 1000
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: "elasticsearch"
                  release: "elasticsearch"
                  component: "client"
      initContainers:
      # see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
      # and https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html#mlockall
      - name: "sysctl"
        image: "busybox:latest"
        imagePullPolicy: "Always"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch
        env:
        - name: NODE_DATA
          value: "false"
        - name: NODE_MASTER
          value: "false"
        - name: DISCOVERY_SERVICE
          value: elasticsearch-discovery
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        - name: ES_JAVA_OPTS
          value: "-Djava.net.preferIPv4Stack=true -Xms512m -Xmx512m "
        - name: MINIMUM_MASTER_NODES
          value: "2"
        resources:
            limits:
              cpu: "1"
            requests:
              cpu: 25m
              memory: 512Mi

        readinessProbe:
          httpGet:
            path: /_cluster/health
            port: 9200
          initialDelaySeconds: 5
        livenessProbe:
          httpGet:
            path: /_cluster/health?local=true
            port: 9200
          initialDelaySeconds: 90
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.0"
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: transport
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
      volumes:
      - name: config
        configMap:
          name: elasticsearch

---
# Source: elasticsearch/templates/data-statefulset.yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "data"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-data
spec:
  serviceName: elasticsearch-data
  replicas: 2
  template:
    metadata:
      labels:
        app: elasticsearch
        component: "data"
        release: elasticsearch
    spec:
      serviceAccountName: elasticsearch-data
      securityContext:
        fsGroup: 1000
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: "elasticsearch"
                  release: "elasticsearch"
                  component: "data"
      initContainers:
      # see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
      # and https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html#mlockall
      - name: "sysctl"
        image: "busybox:latest"
        imagePullPolicy: "Always"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: "chown"
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.0"
        imagePullPolicy: "IfNotPresent"
        command:
        - /bin/bash
        - -c
        - chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data &&
          chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/logs
        securityContext:
          runAsUser: 0
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
      containers:
      - name: elasticsearch
        env:
        - name: DISCOVERY_SERVICE
          value: elasticsearch-discovery
        - name: NODE_MASTER
          value: "false"
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        - name: ES_JAVA_OPTS
          value: "-Djava.net.preferIPv4Stack=true -Xms1536m -Xmx1536m "
        - name: MINIMUM_MASTER_NODES
          value: "2"
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.0"
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 9300
          name: transport

        resources:
            limits:
              cpu: "1"
            requests:
              cpu: 25m
              memory: 1536Mi

        readinessProbe:
          httpGet:
            path: /_cluster/health?local=true
            port: 9200
          initialDelaySeconds: 5
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
        - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
        - name: config
          mountPath: /pre-stop-hook.sh
          subPath: pre-stop-hook.sh
        - name: config
          mountPath: /post-start-hook.sh
          subPath: post-start-hook.sh
        lifecycle:
          preStop:
            exec:
              command: ["/bin/bash","/pre-stop-hook.sh"]
          postStart:
            exec:
              command: ["/bin/bash","/post-start-hook.sh"]
      terminationGracePeriodSeconds: 0
      volumes:
      - name: config
        configMap:
          name: elasticsearch
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: "30Gi"

---
# Source: elasticsearch/templates/master-statefulset.yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  labels:
    app: elasticsearch
    chart: elasticsearch-1.14.1
    component: "master"
    heritage: Tiller
    release: elasticsearch
  name: elasticsearch-master
spec:
  serviceName: elasticsearch-master
  replicas: 3
  template:
    metadata:
      labels:
        app: elasticsearch
        component: "master"
        release: elasticsearch
    spec:
      serviceAccountName: elasticsearch-master
      securityContext:
        fsGroup: 1000
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: "elasticsearch"
                  release: "elasticsearch"
                  component: "master"
      initContainers:
      # see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
      # and https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html#mlockall
      - name: "sysctl"
        image: "busybox:latest"
        imagePullPolicy: "Always"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: "chown"
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.0"
        imagePullPolicy: "IfNotPresent"
        command:
        - /bin/bash
        - -c
        - chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data &&
          chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/logs
        securityContext:
          runAsUser: 0
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
      containers:
      - name: elasticsearch
        env:
        - name: NODE_DATA
          value: "false"
        - name: DISCOVERY_SERVICE
          value: elasticsearch-discovery
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        - name: ES_JAVA_OPTS
          value: "-Djava.net.preferIPv4Stack=true -Xms512m -Xmx512m "
        - name: MINIMUM_MASTER_NODES
          value: "2"
        resources:
            limits:
              cpu: "1"
            requests:
              cpu: 25m
              memory: 512Mi

        readinessProbe:
          httpGet:
            path: /_cluster/health?local=true
            port: 9200
          initialDelaySeconds: 5
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.0"
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 9300
          name: transport

        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
        - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
      volumes:
      - name: config
        configMap:
          name: elasticsearch
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: "4Gi"


---
# Source: elasticsearch/templates/client-pdb.yaml


---
# Source: elasticsearch/templates/data-pdb.yaml


---
# Source: elasticsearch/templates/master-pdb.yaml


---
# Source: elasticsearch/templates/podsecuritypolicy.yaml


---
# Source: elasticsearch/templates/role.yaml


---
# Source: elasticsearch/templates/rolebinding.yaml

最佳答案

我修复了添加一个 PVC 和一个 StorageClass,因为来自 master-0 和 data-0 的事件日志是 PVC 未绑定(bind)

关于elasticsearch 无法解析主机 [elasticsearch-discovery],我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/53509648/

相关文章:

logging - Logstash 从日志消息中过滤时间戳

azure - 将 ado 变量传递给 Azure ADO 上的 pv、pvc、配置映射和部署文件

networking - 无法删除法兰绒插件

amazon-ec2 - 在Kubernetes从节点运行的Pod处于ContainerCreating状态

kubernetes - 无法通过Kubernetes使用Helm安装Nginx

azure-devops - 多行 overrideValues Helm 安装 Azure DevOps

java - 有没有办法在 Elasticsearch 中连接同一对象之间的属性?

python - Elasticsearch - field_value_factor,缺少参数

elasticsearch - 能否确保Elasticsearch内部匹配项返回所有受影响的子级?

kubernetes-helm - 添加到 ACR 的 Helm 图表的索引 URL 是什么?