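I got fed up with Google's geocoding and decided to try an alternative. The Data Science Toolkit (http://www.datasciencetoolkit.org) lets you geocode an unlimited number of addresses. R has an excellent package that wraps its functions (CRAN: RDSTK). The package has a function called street2coordinates() that interfaces with the Data Science Toolkit's geocoding utility.
However, the RDSTK function street2coordinates() does not work if you try to geocode something simple such as "City, Country". In the following example I try to use the function to get the latitude and longitude of Phoenix: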
> require("RDSTK")
> street2coordinates("Phoenix+Arizona+United+States")
[1] full.address
<0 rows> (or 0-length row.names)
The utility on the Data Science Toolkit site itself works perfectly. Here is the URL request that gives the answer: http://www.datasciencetoolkit.org/maps/api/geocode/json?sensor=false&address=Phoenix+Arizona+United+States
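For illustration, a request URL of that shape can be assembled in base R. This is only a minimal sketch: `build_geocode_url` is a hypothetical helper name, and it merely collapses spaces and commas into `+` to reproduce the URL above; it does not percent-encode other special characters.

```r
# Hypothetical helper: build the Google-style geocoder URL for one address.
# Assumption: spaces and commas are the only characters that need replacing.
build_geocode_url <- function(address) {
  base <- "http://www.datasciencetoolkit.org/maps/api/geocode/json"
  # Collapse each run of spaces/commas into a single "+"
  query <- gsub("[ ,]+", "+", trimws(address))
  paste0(base, "?sensor=false&address=", query)
}

phoenix_url <- build_geocode_url("Phoenix, Arizona, United States")
print(phoenix_url)
```

For addresses containing other reserved characters, letting an HTTP client build the query string is likely the safer choice.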
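I am interested in geocoding multiple addresses (both full street addresses and city names). I know the Data Science Toolkit URL works well for this.
How can I interact with the URL and get multiple latitudes and longitudes into a data frame alongside the addresses?
Here is a sample dataset: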
dff <- data.frame(address=c(
"Birmingham, Alabama, United States",
"Mobile, Alabama, United States",
"Phoenix, Arizona, United States",
"Tucson, Arizona, United States",
"Little Rock, Arkansas, United States",
"Berkeley, California, United States",
"Duarte, California, United States",
"Encinitas, California, United States",
"La Jolla, California, United States",
"Los Angeles, California, United States",
"Orange, California, United States",
"Redwood City, California, United States",
"Sacramento, California, United States",
"San Francisco, California, United States",
"Stanford, California, United States",
"Hartford, Connecticut, United States",
"New Haven, Connecticut, United States"
))
Best answer
Like this:
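library(httr)
library(rjson)
# Build a JSON array of the addresses to use as the request body
data <- paste0("[",paste(paste0("\"",dff$address,"\""),collapse=","),"]")
url <- "http://www.datasciencetoolkit.org/street2coordinates"
# A single POST request geocodes every address at once
response <- POST(url,body=data)
json <- fromJSON(content(response,type="text"))
# lapply always returns a list, so do.call(rbind, ...) also works when every
# address is found; addresses with no match yield NULL and are dropped
geocode <- do.call(rbind,lapply(json,
function(x) c(long=x$longitude,lat=x$latitude)))
geocode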
# long lat
# San Francisco, California, United States -117.88536 35.18713
# Mobile, Alabama, United States -88.10318 30.70114
# La Jolla, California, United States -117.87645 33.85751
# Duarte, California, United States -118.29866 33.78659
# Little Rock, Arkansas, United States -91.20736 33.60892
# Tucson, Arizona, United States -110.97087 32.21798
# Redwood City, California, United States -117.88536 35.18713
# New Haven, Connecticut, United States -72.92751 41.36571
# Berkeley, California, United States -122.29673 37.86058
# Hartford, Connecticut, United States -72.76356 41.78516
# Sacramento, California, United States -121.55541 38.38046
# Encinitas, California, United States -116.84605 33.01693
# Birmingham, Alabama, United States -86.80190 33.45641
# Stanford, California, United States -122.16750 37.42509
# Orange, California, United States -117.85311 33.78780
# Los Angeles, California, United States -117.88536 35.18713
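The matrix above is keyed by address, comes back in a different order from the input, and silently omits addresses with no match. A minimal sketch of aligning such a response with the original address column, filling `NA` where nothing was returned, might look like the following. The `json` list here is a toy stand-in for the parsed response, and `get_field` is a hypothetical helper; whether the service returns `null` for an unmatched address or leaves it out entirely is an assumption, and the helper tolerates both.

```r
# Toy stand-in for the parsed POST response: a named list keyed by address.
# Assumption: an unmatched address is either NULL or absent from the list.
json <- list(
  "Tucson, Arizona, United States" = list(longitude = -110.97087, latitude = 32.21798),
  "Phoenix, Arizona, United States" = NULL,
  "Mobile, Alabama, United States" = list(longitude = -88.10318, latitude = 30.70114)
)
addresses <- c("Mobile, Alabama, United States",
               "Phoenix, Arizona, United States",
               "Tucson, Arizona, United States")

# Hypothetical helper: pull one numeric field, or NA if the entry is missing
get_field <- function(entry, field) {
  value <- entry[[field]]
  if (is.null(value)) NA_real_ else as.numeric(value)
}

# Look each address up by name so the rows follow the input order
geocoded <- data.frame(
  address = addresses,
  long = vapply(addresses, function(a) get_field(json[[a]], "longitude"),
                numeric(1), USE.NAMES = FALSE),
  lat = vapply(addresses, function(a) get_field(json[[a]], "latitude"),
               numeric(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(geocoded)
```

Keeping one row per input address makes it easy to see which addresses still need another lookup.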
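The batch request uses the POST interface of the street2coordinates API (documented here), which returns all results in one request instead of issuing multiple GET requests. The absence of Phoenix appears to be a bug in the street2coordinates API: if you go to the API demo page and try "Phoenix, Arizona, United States", you get an empty response.
However, as your example shows, their "Google-style geocoder" does return a result for Phoenix. So here is a solution that uses repeated GET requests. Note that it runs much more slowly.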
geo.dsk <- function(addr){ # single address geocode with data sciences toolkit
require(httr)
require(rjson)
url <- "http://www.datasciencetoolkit.org/maps/api/geocode/json"
response <- GET(url,query=list(sensor="FALSE",address=addr))
json <- fromJSON(content(response,type="text"))
loc <- json['results'][[1]][[1]]$geometry$location
return(c(address=addr,long=loc$lng, lat= loc$lat))
}
result <- do.call(rbind,lapply(as.character(dff$address),geo.dsk))
result <- data.frame(result)
result
# address long lat
# 1 Birmingham, Alabama, United States -86.801904 33.456412
# 2 Mobile, Alabama, United States -88.103184 30.701142
# 3 Phoenix, Arizona, United States -112.0733333 33.4483333
# 4 Tucson, Arizona, United States -110.970869 32.217975
# 5 Little Rock, Arkansas, United States -91.207356 33.608922
# 6 Berkeley, California, United States -122.29673 37.860576
# 7 Duarte, California, United States -118.298662 33.786594
# 8 Encinitas, California, United States -116.846046 33.016928
# 9 La Jolla, California, United States -117.876447 33.857515
# 10 Los Angeles, California, United States -117.885359 35.187133
# 11 Orange, California, United States -117.853112 33.787795
# 12 Redwood City, California, United States -117.885359 35.187133
# 13 Sacramento, California, United States -121.555406 38.380456
# 14 San Francisco, California, United States -117.885359 35.187133
# 15 Stanford, California, United States -122.1675 37.42509
# 16 Hartford, Connecticut, United States -72.763564 41.78516
# 17 New Haven, Connecticut, United States -72.927507 41.365709
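Two details of the GET approach are worth noting: combining the address and coordinates with `c()` coerces everything to character, and indexing the first result fails if the geocoder returns no results. A hedged sketch of a more defensive parsing step follows. `parse_location` is a hypothetical helper, the nested shape `results[[1]]$geometry$location` is taken from the code above, and an empty `results` list for an unmatched address is an assumption. The toy lists stand in for parsed responses so the sketch runs without network access.

```r
# Hypothetical helper: turn one parsed Google-style response into a one-row
# data frame with numeric coordinates, or NA when there are no results.
parse_location <- function(addr, json) {
  results <- json[["results"]]
  if (length(results) == 0) {
    return(data.frame(address = addr, long = NA_real_, lat = NA_real_,
                      stringsAsFactors = FALSE))
  }
  loc <- results[[1]]$geometry$location
  data.frame(address = addr,
             long = as.numeric(loc$lng),
             lat = as.numeric(loc$lat),
             stringsAsFactors = FALSE)
}

# Toy stand-ins for parsed responses (the empty shape is an assumption)
found <- list(results = list(list(geometry = list(
  location = list(lng = -112.0733333, lat = 33.4483333)))))
empty <- list(results = list())

parsed <- do.call(rbind, list(
  parse_location("Phoenix, Arizona, United States", found),
  parse_location("Nowhere in particular", empty)
))
str(parsed)
```

Inside `geo.dsk`, a helper like this could replace the last two lines of the function body, so that `do.call(rbind, ...)` yields a data frame with numeric `long` and `lat` columns directly.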
Source: the Stack Overflow question on geocoding simple addresses with the Data Science Toolkit in R: https://stackoverflow.com/questions/22887833/