json - How to extract strings from rows in .json format?

I imported a .json file with library(jsonlite), using stream_in(file(".json")).

However, one of the columns still appears to be in .json format. I am not sure how to extract the ID and email columns from that .json column.

  My example:

  date <- as.Date(as.character( c("2015-02-13",
                                    "2015-02-14",
                                    "2015-02-14")))
  ID <- c(1,2,3)
  name <- c("John","Michael","Thomas")
  drinks <- c("Beer","Coffee","Tee")
  consumed <- c(2,5,3)
  john <- "{\"employeID\":\"1\",\"other_details\":{\"email\":\"john@gmx.com\"},\"computer\":\"yes\"}"
  michael <- "{\"employeID\":\"2\",\"other_details\":{\"email\":\"michael@yahoo.com\"},\"computer\":\"yes\"}"
  thomas <- "{\"employeID\":\"3\",\"other_details\":{\"email\":\"thomas@gmail.com\"},\"computer\":\"yes\"}"
  json <- c(john,michael,thomas)
  df <- data.frame(date,ID,name,drinks,consumed,json)

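The resulting data.frame has the columns date, ID, name, drinks, consumed, and json, where json still holds the raw JSON strings.

I would like to get the following format: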

         date ID    name drinks consumed             email computer
#1 2015-02-13  1    John   Beer        2      john@gmx.com      yes
#2 2015-02-14  2 Michael Coffee        5 michael@yahoo.com       no
#3 2015-02-14  3  Thomas    Tee        3  thomas@gmail.com      yes

I first tried using library(jsonlite) again in several variants, but it always results in:

fromJSON(df$json[1])  

Error: Argument 'txt' must be a JSON string, URL or file.

How can I extract these fields correctly?

Best answer

df$json is a factor vector, whereas fromJSON only accepts a JSON string, URL, or file. You can try

fromJSON(as.character(df$json[1]))

or add stringsAsFactors = FALSE when creating df.
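As a minimal sketch of the second option, the snippet below builds a one-row stand-in for the question's data frame with stringsAsFactors = FALSE and parses its JSON cell. The name df_chr and the example.com address are placeholders, not part of the original data. Note that since R 4.0.0 data.frame() defaults to stringsAsFactors = FALSE, so the error should mainly appear on older R versions.

```r
library(jsonlite)

# One-row stand-in for the question's data frame (placeholder address).
json <- "{\"employeID\":\"1\",\"other_details\":{\"email\":\"john@example.com\"},\"computer\":\"yes\"}"
df_chr <- data.frame(ID = 1, json = json, stringsAsFactors = FALSE)

# The json column is plain character, so fromJSON accepts it directly.
is.character(df_chr$json)
parsed <- fromJSON(df_chr$json[1])

# Nested objects come back as nested lists.
parsed$other_details$email
```

Here fromJSON returns a list, and the email sits one level down under other_details.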

To accomplish your task, you could try:

library(tidyverse)

df %>% 
  filter(json != "{}") %>%   # Drop rows with json == "{}"
  rowwise() %>%
  do(data.frame(ID = .$ID, jsonlite::fromJSON(.$json), stringsAsFactors=FALSE)) %>% 
  merge(df %>% select(-json), by="ID", all.y=TRUE)

Output:

  ID employeID             email computer       date    name drinks consumed
1  1         1      john@gmx.com      yes 2015-02-13    John   Beer        2
2  2         2 michael@yahoo.com      yes 2015-02-14 Michael Coffee        5
3  3         3  thomas@gmail.com      yes 2015-02-14  Thomas    Tee        3

It can also handle rows where the json column contains "{}".

df2 <- df %>% 
  rbind(data.frame(date="2015-02-14", ID=4, name="Kitman", 
                   drinks="Chocolate", consumed=1, json="{}"))

df2 %>% 
  filter(json != "{}") %>% 
  rowwise() %>%
  do(data.frame(ID = .$ID, jsonlite::fromJSON(.$json), stringsAsFactors=FALSE)) %>% 
  merge(df2 %>% select(-json), by="ID", all.y=TRUE)

Output:

  ID employeID             email computer       date    name    drinks consumed
1  1         1      john@gmx.com      yes 2015-02-13    John      Beer        2
2  2         2 michael@yahoo.com      yes 2015-02-14 Michael    Coffee        5
3  3         3  thomas@gmail.com      yes 2015-02-14  Thomas       Tee        3
4  4      <NA>              <NA>     <NA> 2015-02-14  Kitman Chocolate        1
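If only a few known fields are needed, a base R variant without rowwise() and merge() might look like the sketch below. It is an illustration under assumptions rather than part of the original answer: df_demo is a small stand-in for the data, the addresses are placeholders, and get_field is a hypothetical helper that returns NA when a key is missing (as it is for "{}").

```r
library(jsonlite)

# Small stand-in for the data; the second row has an empty JSON object.
df_demo <- data.frame(
  ID   = c(1, 2),
  name = c("John", "Kitman"),
  json = c("{\"employeID\":\"1\",\"other_details\":{\"email\":\"john@example.com\"},\"computer\":\"yes\"}",
           "{}"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: walk a parsed JSON list along `keys`,
# returning NA instead of failing when a key is missing.
get_field <- function(parsed, keys) {
  value <- parsed
  for (key in keys) {
    value <- value[[key]]
    if (is.null(value)) return(NA_character_)
  }
  as.character(value)
}

# Parse each cell once, then pull out the fields of interest.
parsed <- lapply(df_demo$json, fromJSON)
df_demo$email    <- vapply(parsed, get_field, character(1), keys = c("other_details", "email"))
df_demo$computer <- vapply(parsed, get_field, character(1), keys = "computer")
df_demo$json     <- NULL
df_demo
```

This keeps the original row order and column layout, at the cost of naming each field explicitly.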

Outdated:

cbind(
  df %>% select(-json), 
  df$json %>% 
    map(~as.data.frame(jsonlite::fromJSON(.))) %>% 
    do.call("rbind", .)
)

Output:

        date ID    name drinks consumed employeID             email computer
1 2015-02-13  1    John   Beer        2         1      john@gmx.com      yes
2 2015-02-14  2 Michael Coffee        5         2 michael@yahoo.com      yes
3 2015-02-14  3  Thomas    Tee        3         3  thomas@gmail.com      yes

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/40376241/
