linux - Reindexing Elasticsearch data for all indices matching a specific pattern

Tags: linux shell elasticsearch sh

I have multiple Elasticsearch indices whose names follow a specific format. Here are two examples:

  • abc_2ab99742-94d2-43f8-a582-ce10a0f031dc;

  • abc_e8241182-1a40-410b-a95d-c883472444f4;

Now I need to reindex all the data in these indices. To do that, I wrote a shell script that performs the following steps.

1. Loop over all matching indices
  1.1 Create a temporary index, e.g. abc_2ab99742-94d2-43f8-a582-ce10a0f031dc_tmp.
  1.2 Reindex all the data from the original index into the temporary index.
  1.3 Delete and re-create the original index.
  1.4 Reindex the data from the temporary index back into the original index.
  1.5 Delete the temporary index.

Below is the shell script, which I wrote myself.

#!/bin/bash

ES_HOST="localhost"
ES_PORT="9200"
TMP="_tmp"

indices=$(curl -s "http://${ES_HOST}:${ES_PORT}/_cat/indices/abc_*?h=index" | egrep 'abc_[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{8}*')

# do for all abc elastic indices
for index in $indices
do
    echo "Reindex process starting for index: $index"
    tmp_index=$index${TMP}
    output=$(curl -X PUT "http://${ES_HOST}:${ES_PORT}/$tmp_index" -H 'Content-Type: application/json' -d'
    {
        "settings" : {
            "index" : {
                "number_of_shards" : 16,
                "number_of_replicas" : 1
            }
        }
    }')
    echo "Temporary index: $tmp_index created with output: $output"
    echo "Starting reindexing elastic data from original index:$index to temporary index:$tmp_index"
    output=$(curl -X POST "http://${ES_HOST}:${ES_PORT}/_reindex" -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": '"$index"'
      },
      "dest": {
        "index": '"$tmp_index"'
      }
    }
    ')
    echo "Reindexing completed from original index:$index to temporary index:$tmp_index with output: $output"
    echo "Deleting $index"
    output=$(curl -X DELETE "http://${ES_HOST}:${ES_PORT}/$index")
    echo "$index deleted with status: $output"
    echo "Creating index: $index"
    output=$(curl -X PUT "http://${ES_HOST}:${ES_PORT}/$index" -H 'Content-Type: application/json' -d'
    {
        "settings" : {
            "index" : {
                "number_of_shards" : 16,
                "number_of_replicas" : 1
            }
        }
    }')
    echo "Index: $index creation status: $output"
    echo "Starting reindexing elastic data from temporary index:$tmp_index to original index:$index"
    output=$(curl -X POST "http://${ES_HOST}:${ES_PORT}/_reindex" -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": '"$tmp_index"'
      },
      "dest": {
        "index": '"$index"'
      }
    }
    ')
    echo "Reindexing completed from temporary index:$tmp_index to original index:$index with output: $output"
    echo "Deleting $tmp_index"
    output=$(curl -X DELETE "http://${ES_HOST}:${ES_PORT}/$tmp_index")
    echo "$tmp_index deleted with status: $output"
done

However, the reindex command fails with an exception. Here is the output:

Reindexing completed from original index:abc_58b888be-a90f-e3be-838d-88877aee572c to temporary index:abc_58b888be-a90f-e3be-838d-88877aee572c_tmp with output: {"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [source]","line":4,"col":9}],"type":"parsing_exception","reason":"[reindex] failed to parse field [source]","line":4,"col":9,"caused_by":{"type":"json_parse_exception","reason":"Unrecognized token 'abc_58b888be': was expecting ('true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@39913ba; line: 4, column: 31]"}},"status":400}

Can anyone help me? I am not very good at shell scripting.

Best answer

The problem is in your shell script. Look at the following part:

output=$(curl -X POST "http://${ES_HOST}:${ES_PORT}/_reindex" -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": '"$index"'
      },
      "dest": {
        "index": '"$tmp_index"'
      }
    }
    ')

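You intend to POST JSON here, but the JSON is invalid: the shell consumes the double quotes around $index and $tmp_index, so the index names reach Elasticsearch as bare tokens rather than JSON strings. That is why the error reports "Unrecognized token 'abc_58b888be'". Change the call as follows (and apply the same change to the second _reindex call, from the temporary index back to the original), and your script will work: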

output=$(curl -XPOST "http://${ES_HOST}:${ES_PORT}/_reindex" -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": "'"$index"'"
      },
      "dest": {
        "index": "'"$tmp_index"'"
      }
    }
    ')



"index": "'"$index"'"

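This produces a valid JSON key-value pair: the literal double quotes sit inside the single-quoted parts of the string, so they are sent to Elasticsearch, while "$index" is still expanded by the shell.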
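To see the difference without contacting Elasticsearch, the two quoting styles can be printed side by side. The snippet below is only a minimal sketch: the shortened JSON body and the `reindex_body` helper are illustrative and not part of the original script, and the helper assumes index names contain no characters that need JSON escaping.

```shell
#!/bin/bash
# Example index name taken from the question; any name works here.
index="abc_2ab99742-94d2-43f8-a582-ce10a0f031dc"

# Original quoting: '"$index"' closes the single-quoted string, expands the
# variable inside shell double quotes, then reopens the single-quoted string.
# The double quotes are shell syntax, so none end up in the JSON.
broken='{"source": {"index": '"$index"'}}'

# Corrected quoting: an extra literal " is kept inside each single-quoted
# part, so the expanded value is wrapped in JSON double quotes.
fixed='{"source": {"index": "'"$index"'"}}'

echo "broken: $broken"
echo "fixed:  $fixed"

# Hypothetical helper: build the whole _reindex body with printf, which
# avoids the quote juggling entirely.
reindex_body() {
    printf '{"source": {"index": "%s"}, "dest": {"index": "%s"}}' "$1" "$2"
}

echo "helper: $(reindex_body "$index" "${index}_tmp")"
```

If the helper suits your script, its output could be passed to curl as the request body, for example with -d "$(reindex_body "$index" "$tmp_index")".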

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/50465751/
