amazon-web-services - Elasticsearch shows only 1 in docs.count after migrating data with Logstash

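I am trying to move data from S3 (the contents of a .csv file) to an Elasticsearch cluster using Logstash with a custom template.
But when I check in Kibana with the following query, it shows docs.count = 1 and the rest of the records as docs.deleted: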

GET /_cat/indices?v

My first question is:

Why is only one record (the last one) transferred, while all the others are deleted?

Now, when I query that index in Kibana with the following query:

GET /my_file_index/_search
{
  "query": {
    "match_all": {}
  }
}

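I get only one record, with all the comma-separated data in the "message" field, so my second question is:

How can I get the data with column names, as in the CSV, given that I have specified all the column mappings in the template file that is fed into Logstash?

I also tried setting the columns option in the Logstash csv filter, but had no luck.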

 columns => ["col1", "col2",...]

Any help would be appreciated.

Edit-1: Below is my logstash.conf file:

input {
 s3{
     access_key_id => "xxx"
     secret_access_key => "xxxx"
     region => "eu-xxx-1"
     bucket => "xxxx"
     prefix => "abc/stocks_03-jul-2018.csv"
   }
}
filter {
  csv {
      separator => ","
      columns => ["AAA","BBB","CCC"]
  }
}
output {
    amazon_es {
        index => "my_r_index"
        document_type => "my_r_index"
        hosts => "vpc-totemdev-xxxx.eu-xxx-1.es.amazonaws.com"
        region => "eu-xxxx-1"
        aws_access_key_id => 'xxxxx'
        aws_secret_access_key => 'xxxxxx+xxxxx'
        document_id => "%{id}"
        template => "templates/template_2.json"
        template_name => "my_r_index"
 }
}

Note:
Logstash version: 6.3.1
Elasticsearch version: 6.2

Edit-2: Adding the template_2.json file along with a sample CSV header:

1. Mapping file:

{ 
    "template" : "my_r_index", 
    "settings" : {
        "index" : {
            "number_of_shards" : 50,
            "number_of_replicas" : 1
         },
         "index.codec" : "best_compression",
         "index.refresh_interval" : "60s"
      },
    "mappings" : { 
        "_default_" : { 
            "_all" : { "enabled" : false },
       "properties" : { 
        "SECURITY" : {
            "type" : "keyword"
        },
        "SERVICEID" : {
            "type" : "integer"
        },
        "MEMBERID" : {
            "type" : "integer"
        },
        "VALUEDATE" : {
            "type" : "date"
        },
        "COUNTRY" : {
            "type" : "keyword"
        },
        "CURRENCY" : {
            "type" : "keyword"
        },
        "ABC" : {
            "type" : "integer"
        },
        "PQR" : {
            "type" : "keyword"
        },
        "KKK" : {
            "type" : "keyword"
        },
        "EXPIRYDATE" : {
            "type" : "text",
            "index" : "false"
        },
        "SOMEID" : {
            "type" : "double",
            "index" : "false"
        },
        "DDD" : {
            "type" : "double",
            "index" : "false"
        },
        "EEE" : {
            "type" : "double",
            "index" : "false"
        },
        "FFF" : {
            "type" : "double",
            "index" : "false"
        },
        "GGG" : {
            "type" : "text",
            "index" : "false"
        },
        "LLL" : {
            "type" : "double",
            "index" : "false"
        },
        "MMM" : {
            "type" : "double",
            "index" : "false"
        },
        "NNN" : {
            "type" : "double",
            "index" : "false"
        },
        "OOO" : {
            "type" : "double",
            "index" : "false"
        },
        "PPP" : {
            "type" : "text",
            "index" : "false"
        },
        "QQQ" : {
            "type" : "integer",
            "index" : "false"
        },
        "RRR" : {
            "type" : "double",
            "index" : "false"
        },
        "SSS" : {
            "type" : "double",
            "index" : "false"
        },
        "TTT" : {
            "type" : "double",
            "index" : "false"
        },
        "UUU" : {
            "type" : "double",
            "index" : "false"
        },
        "VVV" : {
            "type" : "text",
            "index" : "false"
        },
        "WWW" : {
            "type" : "double",
            "index" : "false"
        },
        "XXX" : {
            "type" : "double",
            "index" : "false"
        },
        "YYY" : {
            "type" : "double",
            "index" : "false"
        },
        "ZZZ" : {
            "type" : "double",
            "index" : "false"
        },
        "KNOCKORWARD" : {
            "type" : "text",
            "index" : "false"
        },
        "RANGEATSSPUT" : {
            "type" : "double",
            "index" : "false"
        },
        "STDATMESSPUT" : {
            "type" : "double",
            "index" : "false"
        },
        "CONSENSUPUT" : {
            "type" : "double",
            "index" : "false"
        },
        "CLIENTLESSPUT" : {
            "type" : "double",
            "index" : "false"
        },
        "KNOCKOUESSPUT" : {
            "type" : "text",
            "index" : "false"
        },
        "RANGACTOR" : {
            "type" : "double",
            "index" : "false"
        },
        "STDDACTOR" : {
            "type" : "double",
            "index" : "false"
        },
        "CONSCTOR" : {
            "type" : "double",
            "index" : "false"
        },
        "CLIENTOR" : {
            "type" : "double",
            "index" : "false"
        },
        "KNOCKOACTOR" : {
            "type" : "text",
            "index" : "false"
        },
        "RANGEPRICE" : {
            "type" : "double",
            "index" : "false"
        },
        "STANDARCE" : {
            "type" : "double",
            "index" : "false"
        },
        "NUMBERICE" : {
            "type" : "integer",
            "index" : "false"
        },
        "CONSECE" : {
            "type" : "double",
            "index" : "false"
        },
        "CLIECE" : {
            "type" : "double",
            "index" : "false"
        },
        "KNOCICE" : {
            "type" : "text",
            "index" : "false"
        },
        "SKEWICE" : {
            "type" : "text",
            "index" : "false"
        },
        "WILDISED" : {
            "type" : "text",
            "index" : "false"
        },
        "WILDATUS" : {
            "type" : "text",
            "index" : "false"
        },
        "RRF" : {
            "type" : "double",
            "index" : "false"
        },
        "SRF" : {
            "type" : "double",
            "index" : "false"
        },
        "CNRF" : {
            "type" : "double",
            "index" : "false"
        },
        "CTRF" : {
            "type" : "double",
            "index" : "false"
        },
        "RANADDLE" : {
            "type" : "double",
            "index" : "false"
        },
        "STANDANSTRADDLE" : {
            "type" : "double",
            "index" : "false"
        },
        "CONSLE" : {
            "type" : "double",
            "index" : "false"
        },
        "CLIDLE" : {
            "type" : "double",
            "index" : "false"
        },
        "KNOCKOADDLE" : {
            "type" : "text",
            "index" : "false"
        },
        "RANGEFM" : {
            "type" : "double",
            "index" : "false"
        },
        "SMIUM" : {
            "type" : "double",
            "index" : "false"
        },
        "CONIUM" : {
            "type" : "double",
            "index" : "false"
        },
        "CLIEEMIUM" : {
            "type" : "double",
            "index" : "false"
        },
        "KNOREMIUM" : {
            "type" : "text",
            "index" : "false"
        },
        "COT" : {
            "type" : "double",
            "index" : "false"
        },
        "CLIEEDSPOT" : {
            "type" : "double",
            "index" : "false"
        },
        "IME" : {
            "type" : "keyword"
        },
        "KKE" : {
            "type" : "keyword"
        }
        } 
    }
    }     
}

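2. My Excel content:

Header: the actual header is very long because there are many columns; please assume the other column names are similar to the ones below.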

  SECURITY | SERVICEID  | MEMBERID | VALUEDATE...

First rows: again, some of the columns below have blank values; the real template file covering all the columns is given above (in the mapping file).

KKK-LMN 2 1815 6/25/2018
PPL-ORL 2 1815 6/25/2018
SLB-ORD 2 1815 6/25/2018

3. Kibana query output
Query:

GET /my_r_index/_search
{
  "query": {
    "match_all": {}
  }
}

Output:

{
        "_index": "my_r_index",
        "_type": "my_r_index",
        "_id": "IjjIZWUBduulDsi0vYot",
        "_score": 1,
        "_source": {
          "@version": "1",
          "message": "XXX-XXX-XXX-USD,2,3190,2018-07-03,UNITED STATES,USD,300,60,Put,2042-12-19,,,,.009108041,q,,,,.269171754,q,,,,,.024127966,q,,,,68.414017367,q,,,,.298398645,q,,,,.502677959,q,,,,,0.040880692400344164,q,,,,,,,159.361792143,,,,.631296636,q,,,,.154877384,q,,42.93,N,Y,\n",
          "@timestamp": "2018-08-23T07:56:06.515Z"
        }
      },

...and other similar records like the one above.

EDIT-3:
Sample output after using autodetect_column_names => true:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "indr",
        "_type": "logs",
        "_id": "hAF1aWUBS_wbCH7ZG4tW",
        "_score": 1,
        "_source": {
          "2": "2",
          "1815": "1815",
          "message": """
PPL-ORD-XNYS-USD,2,1815,6/25/2018,UNITED STATES

""",
          "SLB-ORD-XNYS-USD": "PPL-ORD-XNYS-USD",
          "6/25/2018": "6/25/2018",
          "@timestamp": "2018-08-24T01:03:26.436Z",
          "UNITED STATES": "UNITED STATES",
          "@version": "1"
        }
      },
      {
        "_index": "indr",
        "_type": "logs",
        "_id": "kP11aWUBctDorPcGHICS",
        "_score": 1,
        "_source": {
          "2": "2",
          "1815": "1815",
          "message": """
SLBUSD,2,1815,4/22/2018,UNITEDSTATES

""",
          "SLB-ORD-XNYS-USD": "SLBUSD",
          "6/25/2018": "4/22/2018",
          "@timestamp": "2018-08-24T01:03:26.436Z",
          "UNITED STATES": "UNITEDSTATES",
          "@version": "1"
        }
      },
      {
        "_index": "indr",
        "_type": "logs",
        "_id": "j_11aWUBctDorPcGHICS",
        "_score": 1,
        "_source": {
          "2": "SERVICE",
          "1815": "CLIENT",
          "message": """
UNDERLYING,SERVICE,CLIENT,VALUATIONDATE,COUNTRY

""",
          "SLB-ORD-XNYS-USD": "UNDERLYING",
          "6/25/2018": "VALUATIONDATE",
          "@timestamp": "2018-08-24T01:03:26.411Z",
          "UNITED STATES": "COUNTRY",
          "@version": "1"
        }
      }
    ]
  }
}

Best answer

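I'm fairly sure your single document has the ID %{id}. The first problem comes from the fact that nothing in your CSV file is extracted into a field named id, yet that is what you reference in document_id => "%{id}". Logstash leaves the unresolved reference as it is, so every row is indexed with the literal ID %{id} and each indexing operation overwrites the previous one. In the end you have a single document that has been indexed as many times as there are rows in your CSV, and the overwritten versions are what show up under docs.deleted.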
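The overwrite can be illustrated with a small Python sketch. It is a toy model, not Logstash's or Elasticsearch's actual code: `render_document_id` and `bulk_index` are hypothetical helpers, the index is a plain dict keyed by document ID, and the rows are illustrative stand-ins for the CSV lines.

```python
import re

def render_document_id(template, event):
    """Replace %{field} references; fields missing from the event stay as literal text."""
    def substitute(match):
        field = match.group(1)
        return str(event[field]) if field in event else match.group(0)
    return re.sub(r"%\{([^}]+)\}", substitute, template)

def bulk_index(events, id_template):
    """Toy index: a dict keyed by _id, counting writes that replaced an existing document."""
    index, overwritten = {}, 0
    for event in events:
        doc_id = render_document_id(id_template, event)
        if doc_id in index:
            overwritten += 1  # the older version would be marked as deleted
        index[doc_id] = event
    return index, overwritten

# Illustrative events: the whole CSV line sits in "message" and there is no "id" field.
rows = [
    {"message": "KKK-LMN,2,1815,6/25/2018"},
    {"message": "PPL-ORL,2,1815,6/25/2018"},
    {"message": "SLB-ORD,2,1815,6/25/2018"},
]

index, overwritten = bulk_index(rows, "%{id}")
print(len(index), overwritten, list(index))  # 1 2 ['%{id}']
```

With three rows and no id field, the model ends with one live document and two replaced versions, matching the docs.count = 1 pattern in the question. Removing document_id from the output (so that Elasticsearch generates IDs) or pointing it at a field that really exists and is unique per row would avoid the collision.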

Regarding your second question, you need to fix the filter section as follows:

filter {
  csv {
      separator => ","
      autodetect_column_names => true
  }
  date {
    match => [ "VALUATIONDATE", "M/dd/yyyy" ]
  }
}
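As a rough Python sketch of what header autodetection amounts to: the first line the filter sees supplies the column names, and every later line is zipped against them. `parse_with_autodetect` is a hypothetical helper, not the csv filter's real implementation, and the sample lines are taken from the EDIT-3 output in the question.

```python
import csv

def parse_with_autodetect(lines, separator=","):
    """Name fields after the first line seen; that line itself yields no document."""
    columns = None
    documents = []
    for line in lines:
        values = next(csv.reader([line.strip()], delimiter=separator))
        if columns is None:
            columns = values  # first line seen becomes the header
            continue
        documents.append(dict(zip(columns, values)))
    return documents

header = "UNDERLYING,SERVICE,CLIENT,VALUATIONDATE,COUNTRY"
row_a = "SLB-ORD-XNYS-USD,2,1815,6/25/2018,UNITED STATES"
row_b = "PPL-ORD-XNYS-USD,2,1815,6/25/2018,UNITED STATES"

# Header arrives first: fields get the real column names.
in_order = parse_with_autodetect([header, row_a, row_b])
print(in_order[0]["UNDERLYING"])  # SLB-ORD-XNYS-USD

# A data row arrives first: its values become the field names.
out_of_order = parse_with_autodetect([row_a, header, row_b])
print(out_of_order[0]["SLB-ORD-XNYS-USD"])  # UNDERLYING
```

The second case reproduces the shape of the EDIT-3 output, where fields are named after the values of a data row. That suggests the header was not the first line the filter processed. The csv filter documentation notes that pipeline workers must be set to 1 for autodetect_column_names to work, so that setting is worth checking.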

You also need to fix your index template like this (I only added the format setting to the VALUATIONDATE field):

{
  "order": 0,
  "template": "helloindex",
  "settings": {
    "index": {
      "codec": "best_compression",
      "refresh_interval": "60s",
      "number_of_shards": "10",
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "UNDERLYING": {
          "type": "keyword"
        },
        "SERVICE": {
          "type": "integer"
        },
        "CLIENT": {
          "type": "integer"
        },
        "VALUATIONDATE": {
          "type": "date",
          "format": "MM/dd/yyyy"
        },
        "COUNTRY": {
          "type": "keyword"
        }
      }
    }
  },
  "aliases": {}
}
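An explicit mapping only applies to fields whose names match it exactly; any other field falls back to dynamic mapping. The following Python sketch checks a CSV header against a template's properties before indexing. `unmapped_columns` is a hypothetical helper, and the two template dicts are trimmed to a few fields for illustration.

```python
def unmapped_columns(header, template, doc_type="_default_"):
    """Return the header names that have no entry in the template's properties."""
    properties = template["mappings"][doc_type]["properties"]
    return [name for name in header if name not in properties]

# Trimmed version of the template above.
answer_template = {"mappings": {"_default_": {"properties": {
    "UNDERLYING": {"type": "keyword"},
    "SERVICE": {"type": "integer"},
    "CLIENT": {"type": "integer"},
    "VALUATIONDATE": {"type": "date", "format": "MM/dd/yyyy"},
    "COUNTRY": {"type": "keyword"},
}}}}

# Trimmed version of the question's template, which uses different field names.
question_template = {"mappings": {"_default_": {"properties": {
    "SECURITY": {"type": "keyword"},
    "SERVICEID": {"type": "integer"},
    "MEMBERID": {"type": "integer"},
    "VALUEDATE": {"type": "date"},
    "COUNTRY": {"type": "keyword"},
}}}}

header = "UNDERLYING,SERVICE,CLIENT,VALUATIONDATE,COUNTRY".split(",")

print(unmapped_columns(header, answer_template))    # []
print(unmapped_columns(header, question_template))  # ['UNDERLYING', 'SERVICE', 'CLIENT', 'VALUATIONDATE']
```

Running a check like this against the real header and the full template file would show which columns are not covered by the intended types.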

Regarding "amazon-web-services - Elasticsearch shows only 1 in docs.count after migrating data with Logstash", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51932734/
