apache-spark - Spark cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible

Tags: apache-spark elasticsearch

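I am trying to read data from a local Elasticsearch instance and get the error "Cannot detect ES version ... 'es.nodes.wan.only'". However, with TRACE logging enabled, the log shows that the application does connect to Elasticsearch and receives its version.

I submit the application to a local Spark installation using elasticsearch-spark_2.11-2.4.5.jar to connect to Elasticsearch 6.2.4.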

```
20/05/07 10:15:47 TRACE HeaderElement: enter HeaderElement.getParameterByName(String)
20/05/07 10:15:47 TRACE CommonsHttpTransport: Rx @[192.168.50.34] [200-OK] [{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "39igfUt5S4S3JYomBTZmqw",
  "version" : {
    "number" : "6.2.4",
    "build_hash" : "ccec39f",
    "build_date" : "2018-04-12T20:37:28.497551Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
]
20/05/07 10:15:47 DEBUG HttpMethodBase: re-creating response stream from byte array
20/05/07 10:15:47 DEBUG HttpMethodBase: re-creating response stream from byte array
20/05/07 10:15:47 DEBUG DataSource: Discovered Elasticsearch version [6.2.4]
20/05/07 10:15:47 TRACE CommonsHttpTransport: Closing HTTP transport to 192.168.50.34:9200
20/05/07 10:15:47 TRACE HttpConnection: enter HttpConnection.close()
20/05/07 10:15:47 TRACE HttpConnection: enter HttpConnection.closeSockedAndStreams()
Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:196)
    at org.elasticsearch.spark.sql.SchemaUtils$.discoverMappingAsField(SchemaUtils.scala:76)
    at org.elasticsearch.spark.sql.SchemaUtils$.discoverMapping(SchemaUtils.scala:69)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema$lzycompute(DefaultSource.scala:112)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema(DefaultSource.scala:112)
    at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:116)
    at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:116)
    at scala.Option.getOrElse(Option.scala:121)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.schema(DefaultSource.scala:116)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:403)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
    at Elastic2Json$.main(Elastic2Json.scala:25)
```

This is the code that reads a DataFrame from the Elasticsearch index:

```scala
import org.apache.spark.sql.SparkSession

object Elastic2Json {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ElasticRead").getOrCreate()

    // Configure the Elasticsearch data source.
    val reader = spark.read.format("org.elasticsearch.spark.sql")
      .option("es.read.metadata", "false")
      .option("es.nodes.wan.only", "true")
      .option("es.port", "9200")
      .option("es.net.ssl", "false")
      .option("es.nodes", "localhost")
      .option("es.resource", "myindex/document")
      .option("es.http.retries", "3")

    println("...test 1")
    // load() triggers schema discovery, which is where the exception is thrown.
    val df = reader.load("myindex").limit(10)
    println("...test 2 Schema")
    df.printSchema()
    df.show()
  }
}
```

Thanks

Best answer

I solved this by changing the Elasticsearch library to the one for the specific Elasticsearch version I am using.
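
To illustrate the fix, here is a minimal build.sbt sketch that pins the connector to the cluster's version. The question pairs elasticsearch-spark_2.11-2.4.5.jar with an Elasticsearch 6.2.4 cluster, so the connector and the server are several major versions apart. The answer does not name the replacement artifact; the `elasticsearch-spark-20_2.11` coordinate below is an assumption based on the Scala 2.11 jar name and the use of `SparkSession` (Spark 2.x) in the question, so check it against your own Spark and Scala versions.

```scala
// build.sbt (sketch): pin the connector to the cluster's version.
// Assumption, not stated in the answer: the elasticsearch-spark-20 artifact
// (the connector flavor for Spark 2.x) built for Scala 2.11.
val esVersion = "6.2.4" // Elasticsearch version reported in the TRACE log

libraryDependencies += "org.elasticsearch" % "elasticsearch-spark-20_2.11" % esVersion
```

If you hand the jar to spark-submit directly instead of building with sbt, the same idea applies: use the connector jar whose version matches the cluster.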

See the corresponding question on Stack Overflow: https://stackoverflow.com/questions/61648925/
