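I imported a .json file with library(jsonlite), using stream_in(file(".json")).
However, one of the columns still appears to be in JSON format.
I am not sure how to extract the ID and email columns from that JSON column.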
My example:
date <- as.Date(as.character( c("2015-02-13",
"2015-02-14",
"2015-02-14")))
ID <- c(1,2,3)
name <- c("John","Michael","Thomas")
drinks <- c("Beer","Coffee","Tee")
consumed <- c(2,5,3)
john <- "{\"employeID\":\"1\",\"other_details\":{\"email\":\"john@gmx.com\"},\"computer\":\"yes\"}"
michael <- "{\"employeID\":\"2\",\"other_details\":{\"email\":\"michael@yahoo.com\"},\"computer\":\"yes\"}"
thomas <- "{\"employeID\":\"3\",\"other_details\":{\"email\":\"thomas@gmail.com\"},\"computer\":\"yes\"}"
json <- c(john,michael,thomas)
df <- data.frame(date,ID,name,drinks,consumed,json)
In the resulting data.frame, the json column still holds the raw JSON strings.
I would like to get the following format:
#         date ID    name drinks consumed             email computer
#1  2015-02-13  1    John   Beer        2      john@gmx.com      yes
#2  2015-02-14  2 Michael Coffee        5 michael@yahoo.com       no
#3  2015-02-14  3  Thomas    Tee        3  thomas@gmail.com      yes
I first tried using library(jsonlite) again in several variations, but it always results in:
fromJSON(df$json[1])
Error: Argument 'txt' must be a JSON string, URL or file.
How can I extract these fields correctly?
Best answer
df$json is a factor vector, whereas fromJSON only accepts a JSON string, URL, or file. You can try
fromJSON(as.character(df$json[1]))
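or add stringsAsFactors = FALSE when creating df. (Since R 4.0.0 this is the default for data.frame, so the error mainly appears on older versions.)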
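As a minimal sketch of the factor issue described above: the one-element factor `x` below is a made-up illustration, not the question's data. It only shows that converting to character first lets fromJSON parse the string.

```r
library(jsonlite)

# Hypothetical one-element factor holding a JSON string (illustration only)
x <- factor("{\"employeID\":\"1\",\"computer\":\"yes\"}")

# fromJSON() requires character input, so convert the factor first
is_fac <- is.factor(x)
parsed <- fromJSON(as.character(x))

# The parsed result is a named list; fields are accessed with $
parsed$computer
```

If the column is already character (for example because df was built with stringsAsFactors = FALSE), the as.character() call is harmless.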
To complete your task, you can try:
library(tidyverse)
df %>%
filter(json != "{}") %>% # Drop rows with json == "{}"
rowwise() %>%
do(data.frame(ID = .$ID, jsonlite::fromJSON(as.character(.$json)), stringsAsFactors=FALSE)) %>% # as.character() guards against a factor column
merge(df %>% select(-json), by="ID", all.y=TRUE)
Output:
  ID employeID             email computer       date    name drinks consumed
1  1         1      john@gmx.com      yes 2015-02-13    John   Beer        2
2  2         2 michael@yahoo.com      yes 2015-02-14 Michael Coffee        5
3  3         3  thomas@gmail.com      yes 2015-02-14  Thomas    Tee        3
It also handles rows where the json column contains "{}".
df2 <- df %>%
rbind(data.frame(date="2015-02-14", ID=4, name="Kitman",
drinks="Chocolate", consumed=1, json="{}"))
df2 %>%
filter(json != "{}") %>%
rowwise() %>%
do(data.frame(ID = .$ID, jsonlite::fromJSON(as.character(.$json)), stringsAsFactors=FALSE)) %>% # as.character() guards against a factor column
merge(df2 %>% select(-json), by="ID", all.y=TRUE)
Output:
  ID employeID             email computer       date    name    drinks consumed
1  1         1      john@gmx.com      yes 2015-02-13    John      Beer        2
2  2         2 michael@yahoo.com      yes 2015-02-14 Michael    Coffee        5
3  3         3  thomas@gmail.com      yes 2015-02-14  Thomas       Tee        3
4  4      <NA>              <NA>     <NA> 2015-02-14  Kitman Chocolate        1
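For comparison, the same extraction could be sketched with jsonlite and base R only. The helper name extract_fields, the vector json_col, and the address a@example.com are all hypothetical and used purely for illustration; this is one possible variant under those assumptions, not the answer's method.

```r
library(jsonlite)

# Hypothetical helper: parse one JSON string and return the two wanted fields,
# substituting NA when a field is absent (as it is for "{}")
extract_fields <- function(txt) {
  parsed <- fromJSON(as.character(txt))
  email <- parsed$other_details$email
  computer <- parsed$computer
  data.frame(
    email = if (is.null(email)) NA_character_ else email,
    computer = if (is.null(computer)) NA_character_ else computer,
    stringsAsFactors = FALSE
  )
}

# Made-up input: one full record and one empty JSON object
json_col <- c(
  "{\"employeID\":\"1\",\"other_details\":{\"email\":\"a@example.com\"},\"computer\":\"yes\"}",
  "{}"
)

# One row per input string, in the original order
fields <- do.call(rbind, lapply(json_col, extract_fields))
```

Because the rows come back in input order, something like cbind(df2[setdiff(names(df2), "json")], do.call(rbind, lapply(df2$json, extract_fields))) should attach the fields to the remaining columns without a merge.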
Outdated:
cbind(
df %>% select(-json),
as.character(df$json) %>% # fromJSON needs character input, not a factor
map(~as.data.frame(jsonlite::fromJSON(.))) %>%
do.call("rbind", .)
)
Output:
        date ID    name drinks consumed employeID             email computer
1 2015-02-13  1    John   Beer        2         1      john@gmx.com      yes
2 2015-02-14  2 Michael Coffee        5         2 michael@yahoo.com      yes
3 2015-02-14  3  Thomas    Tee        3         3  thomas@gmail.com      yes
Regarding "json - How to extract strings from rows in .json format?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/40376241/