r - Conditionally inserting rows

Tags: r insert formatting rows genetics

I have an unusual dataset, a portion of which can be reproduced with the following:

data <- textConnection("SNP_Pres,Chr_N,BP_A1F,A1_Beta,A2_SE,ForSortSNP,SortOrder
rs122,13,100461219,C,T,rs122,6
1,16362,0.8701,-0.0048,0.0056,rs122,7
1,19509,0.546015137607046,-0.0033,0.0035,rs122,8
1,17218,0.1539,-0.004,0.013,rs122,9
rs142,13,61952115,G,T,rs142,6
1,16387,0.1295,0.0044,0.0057,rs142,7
1,17218,0.8454,0.006,0.013,rs142,9
rs160,13,100950452,C,T,rs160,6
1,16387,0.549,-0.0021,0.0035,rs160,7
1,19509,0.519102731537216,0.003,0.0027,rs160,8
rs298,13,66664221,C,G,rs298,6
1,19509,0.308290808358246,-0.0032,0.0033,rs298,8
1,17218,0.7227,0.022,0.01,rs298,9")
mydata <- read.csv(data, header = TRUE, sep = ",", stringsAsFactors = FALSE)

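It is formatted for a program that requires placeholders for missing data entries. Here, a missing entry shows up as a skipped number in the SortOrder column. If the column runs 6, 7, 8, 9, the entry is complete, and the next entry starts again at 6.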
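To make the gap rule concrete, here is a minimal sketch that lists which SortOrder values each SNP lacks. It uses a small hypothetical data frame (`toy`) holding only the two helper columns, and it assumes the complete run of values is 6 through 9, as described above.

```r
# Hypothetical miniature of the data: only the two helper columns matter here
toy <- data.frame(
  ForSortSNP = c("rs122", "rs122", "rs122", "rs122", "rs142", "rs142", "rs142"),
  SortOrder  = c(6, 7, 8, 9, 6, 7, 9),
  stringsAsFactors = FALSE
)

expected <- 6:9  # assumed complete run of SortOrder values per SNP

# For each SNP, list the expected SortOrder values that never occur
missing_by_snp <- lapply(
  split(toy$SortOrder, toy$ForSortSNP),
  function(present) setdiff(expected, present)
)
print(missing_by_snp)
```

An SNP with an empty result is complete; any value listed for an SNP marks a row that would need a zero-filled placeholder.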

I need a way to read the data file and insert a row of zeros for each missing entry, so that the file looks like this:

data <- textConnection("SNP_Pres,Chr_N,BP_A1F,A1_Beta,A2_SE,ForSortSNP,SortOrder
rs122,13,100461219,C,T,rs122,6
1,16362,0.8701,-0.0048,0.0056,rs122,7
1,19509,0.546015137607046,-0.0033,0.0035,rs122,8
1,17218,0.1539,-0.004,0.013,rs122,9
rs142,13,61952115,G,T,rs142,6
1,16387,0.1295,0.0044,0.0057,rs142,7
0,0,0,0,0,rs142,8
1,17218,0.8454,0.006,0.013,rs142,9
rs160,13,100950452,C,T,rs160,6
1,16387,0.549,-0.0021,0.0035,rs160,7
1,19509,0.519102731537216,0.003,0.0027,rs160,8
0,0,0,0,0,rs160,9
rs298,13,66664221,C,G,rs298,6
0,0,0,0,0,rs298,7
1,19509,0.308290808358246,-0.0032,0.0033,rs298,8
1,17218,0.7227,0.022,0.01,rs298,9")
mydata <- read.csv(data, header = TRUE, sep = ",", stringsAsFactors = FALSE)

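Eventually the last two columns, ForSortSNP and SortOrder, will be removed from the data file, but they are included for now for convenience. Any suggestions are greatly appreciated.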

Best answer

Here is a solution using the expand.grid and merge functions.

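# Build every (ForSortSNP, SortOrder) combination that should exist
grid <- with(mydata, expand.grid(ForSortSNP = unique(ForSortSNP), SortOrder = unique(SortOrder), stringsAsFactors = FALSE))
# Outer join on the shared columns: combinations absent from mydata come back as rows of NA
complete <- merge(mydata, grid, all = TRUE, sort = FALSE)
complete[is.na(complete)] <- 0 # replace NAs with 0s
complete <- complete[order(complete$ForSortSNP, complete$SortOrder), ] # re-sort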
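As a rough, self-contained sketch of how this behaves, the snippet below applies the same expand.grid and merge steps to a small made-up data frame. The `toy` data and its `Value` column are hypothetical, not the asker's data. It then restores the original column order (merge places the key columns first) and drops the two helper columns, which the question says will eventually be needed. It assumes every SortOrder value appears at least once somewhere in the data; otherwise the full range would have to be supplied by hand.

```r
# Hypothetical toy data: SNP "b" is missing SortOrder 7
toy <- data.frame(
  Value      = c("x1", "x2", "x3", "y1", "y3"),
  ForSortSNP = c("a", "a", "a", "b", "b"),
  SortOrder  = c(6, 7, 8, 6, 8),
  stringsAsFactors = FALSE
)

# Every (SNP, SortOrder) pair that should be present
grid <- expand.grid(
  ForSortSNP = unique(toy$ForSortSNP),
  SortOrder  = unique(toy$SortOrder),
  stringsAsFactors = FALSE
)

# Outer join on the shared columns; absent pairs become NA rows
filled <- merge(toy, grid, all = TRUE, sort = FALSE)
filled[is.na(filled)] <- 0

# merge() moves the key columns to the front, so restore the original order
filled <- filled[order(filled$ForSortSNP, filled$SortOrder), names(toy)]
rownames(filled) <- NULL

# Drop the helper columns once the placeholder rows are in place
final <- filled[, setdiff(names(filled), c("ForSortSNP", "SortOrder")), drop = FALSE]
print(final)
```

Note that a column mixing text and numbers is read as character, so the zero placeholder is stored there as the string "0". That should be harmless when the result is written back out as CSV.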

Regarding "r - Conditionally inserting rows", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/12303954/
