我将data.frame列表作为自定义函数的输出,因此我打算将每个data.frame按其最后一列(给出阈值)进行拆分。但是,我很好地操作了两个列表,并将它们组合在一起仅得到一张表。但是操作此新表时出现错误。我不知道问题是从哪里来的。如何解决此错误?谁能指出我可能解决此错误?如果可以解决此错误,我想实现包装器。如何轻松操作data.frame列表?还有更好的方法来调试错误吗?
迷你示例:
savedDF <- list(
bar = data.frame(.start=c(12,21,37), .stop=c(14,29,45), .score=c(5,9,4)),
cat = data.frame(.start=c(18,42,18,42,81), .stop=c(27,46,27,46,114), .score=c(10,5,10,5,34)),
foo = data.frame(.start=c(3,3,33,3,33,91), .stop=c(24,24,10,24,10,17), .score=c(22,22,6,22,6,7))
)
discardedDF <- list(
bar = data.frame(.start=c(16,29), .stop=c(20,37), .score=c(2,11)),
cat = data.frame(.start=c(21,31), .stop=c(23,43), .score=c(1,9)),
foo = data.frame(.start=c(54, 79), .stop=c(71,93), .score=c(3,8))
)
我可以这样操作:
both <- do.call("rbind", c(savedDF, discardedDF))
cn <- c("letter", "seq")
# FIXME :
DF <- cbind(
read.table(text = chartr("_", ".", rownames(both)), header=T, sep = ".", col.names = cn),
both)
DF <- transform(DF, isPassed = ifelse(.score > 8, "Pass", "Fail"))
by(DF, DF[c("letter", "isPassed")],
function(x) write.csv(x[-(1:length(savedDF))],
sprintf("%s_%s_%s.csv", x$letter[1], x$isPassed[1])))
但是我有一个错误
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 15 did not have 2 elements
为什么我有这个错误?谁能指出我该如何解决?
我想要的输出是CSV文件列表,如下所示:
bar.saved.Pass.csv
bar.saved.Fail.csv
bar.discarded.Pass.csv
bar.discarded.Fail.csv
cat.saved.Pass.csv
cat.saved.Fail.csv
cat.discarded.Pass.csv
cat.discarded.Fail.csv
foo.saved.Pass.csv
foo.saved.Fail.csv
foo.discarded.Pass.csv
foo.discarded.Fail.csv
但是我认为仍然不希望控制导出的CSV文件。如何改善此包装器的功能?我打算让它通过自定义选择输出目录,否则更动态将是不错的选择。任何想法 ?非常感谢
最佳答案
这是你想要的?
library(tidyverse)
library(magrittr)
both <- do.call("rbind", c(savedDF, discardedDF))
both %<>% rownames_to_column(var = "cn")
both %<>% separate(cn, c("letters", "seq"), sep = "\\.")
both %<>% mutate(isPassed = ifelse(.score > 8, "Passed", "Failed"),
isDiscard = ifelse(is.na(seq), "Saved", "Discarded"))
list_of_dfs <- both %>% split(list(.$letters, .$isPassed, .$isDiscard))
csv_names <- paste0("/Users/nathanday/Desktop/", names(list_of_dfs), ".csv") # change this path
mapply(write.csv, list_of_dfs, csv_names)
%<>%
运算符是简写形式,因此both %<>% rownames_to_columm(var = "cn")
与both <- rownames_to_column(both, var = "cn")
相同为了使其更加“动态”以允许输出路径输入,您可以将其包装在已经具有如下功能的函数结构中:
output_where <- function(output_path, list1, list2) {
if (!dir.exists(output_path)) {
dir.create(file.path(output_path))
}
both <- do.call(rbind, c(list1, list2))
both %<>% rownames_to_column(var = "cn")
both %<>% separate(cn, c("letters", "seq"), sep = "\\.")
both %<>% mutate(isPassed = ifelse(.score > 8, "Passed", "Failed"), isDiscard = ifelse(is.na(seq), "Saved", "Discarded"))
list_of_dfs <- both %>% split(list(.$letters, .$isPassed, .$isDiscard))
csv_names <- paste0(output_path, names(list_of_dfs), ".csv")
return(mapply(write.csv, list_of_dfs, csv_names))
}
output_where("~/Desktop/", savedDF, discardedDF)
以获得更多动态:
output_where <- function(output_path, list1, list2) {
if (!dir.exists(output_path)) {
dir.create(file.path(output_path))
}
names(list1) <- paste("list1", names(list1), sep = ".")
names(list2) <- paste("list2", names(list2), sep = ".")
both <- do.call(rbind, c(list1, list2))
both %<>% rownames_to_column(var = "cn")
both %<>% separate(cn, c("original_list", "letters", "seq"), sep = "\\.")
both %<>% mutate(isPassed = ifelse(.score > 8, "Passed", "Failed"))
list_of_dfs <- both %>% split(list(.$letters, .$isPassed, .$original_list))
csv_names <- paste0(output_path, names(list_of_dfs), ".csv")
return(mapply(write.csv, list_of_dfs, csv_names))
}
关于r - 处理data.frame列表时出现意外错误,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40721425/