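r - How to set a condition for filling specific gaps in panel data?

I have panel data in an R data.frame that records years of armed conflict by country from 1989 to 2008. However, it only contains observations for the country-years in which a country actually experienced armed conflict.

The dataset looks like this: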

df <- data.frame(c("1989","1993","1998",
     "1990","1995","1997"),
    c(rep(c(750, 135), c(3,3))), c(rep(1,6)))
names(df)<-c("year","countrycode","conflict")
print(df)

  year countrycode conflict
1 1989         750        1
2 1993         750        1
3 1998         750        1
4 1990         135        1
5 1995         135        1
6 1997         135        1

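I now want to fill the gaps in the panel, but only gaps of no more than three missing years. For example, I want to add rows between rows 1 and 2 and between rows 5 and 6 (3 and 1 missing years, respectively), but not between rows 2 and 3 or between rows 4 and 5 (4 missing years each). After this step, the data.frame above would look like this: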

> df2 <- data.frame(c("1989","1990","1991","1992","1993","1998",
+      "1990","1995","1996","1997"),
+     c(rep(c(750, 135), c(6,4))), c(1,0,0,0,1,1,1,1,0,1))
> names(df2) <- c("year","countrycode","conflict")
> print(df2)
   year countrycode conflict
1  1989         750        1
2  1990         750        0
3  1991         750        0
4  1992         750        0
5  1993         750        1
6  1998         750        1
7  1990         135        1
8  1995         135        1
9  1996         135        0
10 1997         135        1
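
The gap rule described above can be checked row by row before any rows are added. The following is a minimal base R sketch, assuming the rows are already sorted by year within each country; the column names `missing_after` and `fill` are made up for this illustration and are not part of the original data.

```r
# Rebuild the example data from the question
df <- data.frame(year = c("1989", "1993", "1998", "1990", "1995", "1997"),
                 countrycode = rep(c(750, 135), c(3, 3)),
                 conflict = rep(1, 6))
df$year <- as.numeric(as.character(df$year))

# Number of missing years between each observation and the next one
# in the same country (NA for the last observation of a country).
# Assumes rows are sorted by year within each country.
df$missing_after <- ave(df$year, df$countrycode,
                        FUN = function(y) c(diff(y) - 1, NA))

# A gap qualifies for filling only if it spans 1 to 3 missing years
df$fill <- !is.na(df$missing_after) &
  df$missing_after >= 1 & df$missing_after <= 3

print(df)
```

Rows 1 and 5 are the only ones flagged, which matches the two gaps (1989 to 1993 and 1995 to 1997) that are filled in the desired output.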

I have already looked at the plm package, but could not find an answer there. I am also fairly new to R, so I would be glad for any hints.

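Best answer

Here is a solution using data.table. The idea is to first build a data.table that contains only the missing entries (dt.rest) and then rbind it to the original data. I wrote it so that the output of each line (by copy/pasting and printing) should be fairly easy to follow. Let me know if anything is unclear.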

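library(data.table)
dt <- data.table(df)
# year is stored as character (or factor), so convert it to numeric
dt[, year := as.numeric(as.character(year))]
# sort by country and year so that neighbouring rows are consecutive observations
setkey(dt, countrycode, year)
# year2 = year of the next observation in the same country (NA for the last one)
dt[, year2 := c(tail(year, -1), NA), by = countrycode]
# build only the missing rows: gaps of 1 to 3 missing years
dt.rest <- dt[, { gap <- year2 - year - 1
                  tt <- which(gap >= 1 & gap <= 3)
                  yrs <- as.numeric(unlist(lapply(tt, function(x)
                              seq(year[x]+1, year2[x]-1, by=1))))
                  list(year = yrs, conflict = rep(0, length(yrs)))
                }, by=countrycode]
setcolorder(dt.rest, c("year", "countrycode", "conflict"))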

#    year countrycode conflict
# 1: 1996         135        0
# 2: 1990         750        0
# 3: 1991         750        0
# 4: 1992         750        0

Now we just have to bind the two tables together. This is done with the rbindlist function from data.table, which binds data.frames or data.tables more efficiently than rbind.

dt[, year2 := NULL]
dt <- rbindlist(list(dt, dt.rest))
setkey(dt, "countrycode", "year")

dt
#     year countrycode conflict
#  1: 1990         135        1
#  2: 1995         135        1
#  3: 1996         135        0
#  4: 1997         135        1
#  5: 1989         750        1
#  6: 1990         750        0
#  7: 1991         750        0
#  8: 1992         750        0
#  9: 1993         750        1
# 10: 1998         750        1
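
If the maximum gap length needs to vary, the steps above can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name `fill_gaps` and its `max_gap` argument are hypothetical, and the input is assumed to have the columns `year`, `countrycode` and `conflict` as in the question.

```r
library(data.table)

# fill_gaps() and max_gap are hypothetical names used for this sketch
fill_gaps <- function(df, max_gap = 3) {
  dt <- copy(as.data.table(df))            # work on a copy, leave the input untouched
  dt[, year := as.numeric(as.character(year))]
  setkey(dt, countrycode, year)            # sort by country, then year

  # For each country, generate the years missing between consecutive
  # observations, but only for gaps of 1 to max_gap missing years
  rest <- dt[, {
    nxt <- c(tail(year, -1), NA)           # year of the next observation
    gap <- nxt - year - 1                  # number of missing years
    tt  <- which(gap >= 1 & gap <= max_gap)
    yrs <- as.numeric(unlist(lapply(tt, function(x)
             seq(year[x] + 1, nxt[x] - 1, by = 1))))
    list(year = yrs, conflict = rep(0, length(yrs)))
  }, by = countrycode]

  # Bind by column name, because the grouped result puts countrycode first
  out <- rbindlist(list(dt, rest), use.names = TRUE)
  setkey(out, countrycode, year)
  out[]
}

# Example data from the question
df <- data.frame(year = c("1989", "1993", "1998", "1990", "1995", "1997"),
                 countrycode = rep(c(750, 135), c(3, 3)),
                 conflict = rep(1, 6))

res <- fill_gaps(df)               # same rule as in the question (up to 3 years)
print(res)
```

Filtering on `gap >= 1` skips observations in directly consecutive years, where there is nothing to fill.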

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/15716413/
