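I have OpenStreetMap (OSM) data for a country, read from an .orc file and stored in the var nlorc, and I am trying to read the data for one specific city in that country. As far as I know, city entities are defined as "relations" in OSM. Calling nlorc.printSchema() on my data returns the following: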
root
|-- id: long (nullable = true)
|-- type: string (nullable = true)
|-- tags: map (nullable = true)
| |-- key: string
| |-- value: string (valueContainsNull = true)
|-- lat: decimal(9,7) (nullable = true)
|-- lon: decimal(10,7) (nullable = true)
|-- nds: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- ref: long (nullable = true)
|-- members: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- type: string (nullable = true)
| | |-- ref: long (nullable = true)
| | |-- role: string (nullable = true)
|-- changeset: long (nullable = true)
|-- timestamp: timestamp (nullable = true)
|-- uid: long (nullable = true)
|-- user: string (nullable = true)
|-- version: long (nullable = true)
|-- visible: boolean (nullable = true)
For example, https://www.openstreetmap.org/relation/47798#map=13/51.4373/4.8888 shows that the city name is part of the "tags". How can I access the keys of the tags and select a specific city?
Best answer
You can use getItem to access the elements of a map column:
val df = ...  // the DataFrame read from the .orc file (nlorc in the question)
df.filter(df("tags").getItem("name") === "Baarle-Nassau").show()
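Since the question is specifically about relations, the same getItem lookup can be combined with a filter on the type column. The following is only a minimal sketch: the SelectCity object, the selectCity helper, and the three sample rows are made up for illustration, and it assumes the type column holds the literal value "relation" for relation entities. It also uses map_keys (available from Spark 2.3) as one possible way to list the tag keys that occur in the data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, map_keys}

object SelectCity {
  // Hypothetical helper: keep only relations whose "name" tag equals cityName.
  // Assumes the type column uses the string "relation" for relation entities.
  def selectCity(osm: DataFrame, cityName: String): DataFrame =
    osm.filter(col("type") === "relation" && col("tags").getItem("name") === cityName)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("osm-select-city")
      .getOrCreate()
    import spark.implicits._

    // Made-up stand-in for the real ORC data; only id, type and tags are modelled.
    val osm = Seq(
      (47798L, "relation", Map("name" -> "Baarle-Nassau")),
      (1L, "node", Map("name" -> "Baarle-Nassau")),
      (2L, "relation", Map("name" -> "Tilburg"))
    ).toDF("id", "type", "tags")

    // Rows that are relations and carry the requested name tag.
    selectCity(osm, "Baarle-Nassau").select("id", "type").show()

    // Distinct tag keys present in the data.
    osm.select(explode(map_keys(col("tags"))).as("key")).distinct().show()

    spark.stop()
  }
}
```

With the real data, the call would be SelectCity.selectCity(nlorc, "Baarle-Nassau"). Rows that have no name tag yield null from getItem, so the filter drops them.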
Source: "scala - Spark: read Open Street Map data and select entries", a similar question found on Stack Overflow: https://stackoverflow.com/questions/69235949/