r - Error: Data source must be a dictionary (dplyr)

Tags: r error-handling dplyr

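I am very new to R and could not find a solution to my problem. I really hope you can help me.

My data frame looks like this, although the real one has more columns and observations: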

dt <- data.frame(hid = c(1, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4),
                     syear = c(2000, 2001, 2003, 2003, 2003, 2000, 2000, 2001, 2001, 2002, 2002),
                     employlvl = c("Full-time", "Part-time", "Part-time", "Unemployed", "Unemployed",
                                    "Full-time", "Full-time", "Full-time", "Unemployed", "Part-time", 
                                    "Full-time"),
                     relhead = c("Head", "Head", "Head", "Partner", "other", "Head", 
                                                  "Partner", "Head", "Partner", "Head", "Partner")) 
| hid | syear |  employlvl  |       relhead         |
|-----|-------|-------------|-----------------------|
|  1  | 2000  |  Full-time  |         Head          |
|  2  | 2001  |  Part-time  |         Head          |
|  2  | 2003  |  Part-time  |         Head          |
|  2  | 2003  |  Unemployed |        Partner        |
|  2  | 2003  |  Unemployed |         other         |
|  4  | 2000  |  Full-time  |         Head          |
|  4  | 2000  |  Full-time  |        Partner        |
|  4  | 2001  |  Full-time  |         Head          |
|  4  | 2001  |  Unemployed |        Partner        |
|  4  | 2002  |  Part-time  |         Head          |
|  4  | 2002  |  Full-time  |        Partner        |

I want to create another column that shows the partner's employment level, and I would like to get the following output:

| hid | syear |  employlvl  |         relhead       |      Partner      |
|-----|-------|-------------|-----------------------|-------------------|
|  1  | 2000  |  Full-time  |         Head          |        NA         |
|  2  | 2001  |  Part-time  |         Head          |        NA         |
|  2  | 2003  |  Part-time  |         Head          |    Unemployed     |
|  2  | 2003  |  Unemployed |       Partner         |        NA         |
|  2  | 2003  |  Unemployed |         other         |        NA         |
|  4  | 2000  |  Full-time  |         Head          |     Full-time     |
|  4  | 2000  |  Full-time  |        Partner        |        NA         |
|  4  | 2001  |  Full-time  |         Head          |    Unemployed     |
|  4  | 2001  |  Unemployed |        Partner        |        NA         |
|  4  | 2002  |  Part-time  |         Head          |     Full-time     |
|  4  | 2002  |  Full-time  |        Partner        |        NA         |

I am currently using the following code (thanks again to user ycw):

library(dplyr)
library(tidyr)

dt2 <- dt %>%
  group_by(hid, syear) %>%
  filter(n() > 1) %>%
  filter(`relhead` != "Child") %>%
  spread(relhead, employlvl) %>%
  mutate(Relation = "Head") %>%
  rename(`Employment Partner` = Partner) %>%
  select(-Head)

dt3 <- dt %>%
  left_join(dt2, by = c("hid", "syear", "relhead" = "Relation"))

This code works perfectly well on this small dataset. But when I run it on the full data, I get the following:

Error: Data source must be a dictionary

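Thank you very much for your help.

Best answer

I just ran into a similar problem with the same error message. After checking my dataset carefully, I found that two columns had the same name. Once I renamed one of them, everything worked fine.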
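The check described above can be sketched in base R as follows. The data frame `df` and its columns are hypothetical, built only to reproduce a duplicated name; on real data you would apply the same two steps to your own data frame before calling any dplyr verbs. Using `make.unique()` is one possible fix, assuming an automatic suffix is acceptable; renaming the column by hand works just as well.

```r
# Hypothetical data frame with a duplicated column name, for illustration only.
# check.names = FALSE stops data.frame() from repairing the names automatically.
df <- data.frame(hid = c(1, 2),
                 syear = c(2000, 2001),
                 syear = c(2000, 2001),
                 check.names = FALSE)

# Step 1: list every column name that occurs more than once.
dup_names <- unique(names(df)[duplicated(names(df))])
print(dup_names)

# Step 2: make the names unique by appending a numeric suffix to the repeats.
names(df) <- make.unique(names(df))
print(names(df))
```

If `dup_names` comes back empty, duplicated column names are not the cause and the problem lies elsewhere.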

Regarding r - Error: Data source must be a dictionary (dplyr), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45769987/
