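The sample data looks like this (for reference only; I have hundreds of files like it). The tricky part is the "NO RECORD" entries in the file. I have spent hours trying to read it into R without any success.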
BEGIN DATA
RIM
DATE AF QD QU
09/30/1920 NO RECORD 370.00 NO RECORD
10/01/1920 NO RECORD 391.00 391.00
10/02/1920 NO RECORD 496.00 MISSING
10/03/1920 NO RECORD 660.00 MISSING
10/04/1920 NO RECORD 881.00 MISSING
10/05/1920 NO RECORD 660.00 MISSING
10/06/1920 NO RECORD 515.00 -9999
10/07/1920 NO RECORD 443.00 NO RECORD
10/08/1920 NO RECORD 443.00 MISSING
10/09/1920 NO RECORD 443.00 443.00
10/10/1920 NO RECORD 443.00 MISSING
Here is my latest R code:
library(zoo)
# function to read data
obsRead <- function(path2file, filename, number_line_skip, header_or_not) {
  tmpName <- paste(path2file, filename, sep="")
  tmpData <- read.zoo(tmpName,
                      tz='', stringsAsFactors = FALSE, strip.white = TRUE,
                      header=header_or_not, skip=number_line_skip,
                      na.strings = c("NA", "N/A", "MISSING", "NO RECORD", "-9999")) # tell zoo what NA values look like
  qName <- c("AF", "QD", "QU")
  names(tmpData) <- qName
  index(tmpData) <- as.Date(index(tmpData)) # Convert index from POSIXct to Date
  str(tmpData)
  return(tmpData)
}
dataDir = "path/to/file/"
dataFile <- "sampleData.txt"
nLineSkip <- 3
header_or_not <- FALSE
Q_obs <- obsRead(dataDir, dataFile, nLineSkip, header_or_not)
The error I get from R:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 2 did not have 6 elements
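Any suggestions would be greatly appreciated! Thanks!
Best answer
read.zoo splits fields on whitespace by default, so "NO RECORD" is read as two fields and the rows end up with different field counts. Replace it with a single token before parsing. Try this: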
library(zoo)
L <- readLines("path/to/file/sampleData.txt")
L <- gsub("NO RECORD", "NO_RECORD", L)
z <- read.zoo(text = L, header = TRUE, skip = 2, format = "%m/%d/%Y",
na.strings = c("NA", "N/A", "MISSING", "NO_RECORD", "-9999"))
z
which gives:
> z
           AF  QD  QU
1920-09-30 NA 370  NA
1920-10-01 NA 391 391
1920-10-02 NA 496  NA
1920-10-03 NA 660  NA
1920-10-04 NA 881  NA
1920-10-05 NA 660  NA
1920-10-06 NA 515  NA
1920-10-07 NA 443  NA
1920-10-08 NA 443  NA
1920-10-09 NA 443 443
1920-10-10 NA 443  NA
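Since the question mentions hundreds of such files, the same steps could be wrapped in a small function and applied to each file. This is only a sketch: the helper name read_obs, the ".txt" extension, and the assumption that every file has the same two preamble lines followed by a header row are mine and are not stated in the question.

```r
library(zoo)

# Hypothetical helper (name is an assumption): read one file laid out like
# the sample, i.e. two preamble lines, then a header row, then the data.
read_obs <- function(path) {
  L <- readLines(path)
  # Turn the two-word marker into a single whitespace-free token so that
  # every row has the same number of fields.
  L <- gsub("NO RECORD", "NO_RECORD", L, fixed = TRUE)
  read.zoo(text = L, header = TRUE, skip = 2, format = "%m/%d/%Y",
           na.strings = c("NA", "N/A", "MISSING", "NO_RECORD", "-9999"))
}

# Assumed layout: all data files are .txt files in a single directory.
files <- list.files("path/to/file", pattern = "\\.txt$", full.names = TRUE)

# One zoo object per file, named after the file it came from.
obs <- lapply(files, read_obs)
names(obs) <- basename(files)
```

A single series can then be pulled out by file name, for example obs[["sampleData.txt"]]. If some files have a different number of preamble lines, the skip value would need to become an argument of the helper.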
For more on reading irregularly formatted text files with or without spaces and with mixed text and numbers, see the similar question on Stack Overflow: https://stackoverflow.com/questions/25314749/