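I need to read the text file at
https://raw.githubusercontent.com/fonnesbeck/Bios6301/master/datasets/addr.txt
and convert it into an R data frame with these columns: last name, first name, street number, street name, city, state, and zip ...
I tried to separate the fields using the sep argument, but that failed ...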
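Best answer
Expanding on my comment, here is another approach. If your full dataset has a wider range of patterns to account for, you may need to adjust some of the code.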
library(stringr) # For str_trim
# Read string data and split into data frame
dat = readLines("addr.txt")
dat = as.data.frame(do.call(rbind, strsplit(dat, split=" {2,10}")), stringsAsFactors=FALSE)
names(dat) = c("LastName", "FirstName", "address", "city", "state", "zip")
# Separate address into number and street (if streetno isn't always numeric,
# or if you don't want it to be numeric, then just remove the as.numeric wrapper).
dat$streetno = as.numeric(gsub("([0-9]{1,4}).*","\\1", dat$address))
dat$streetname = gsub("[0-9]{1,4} (.*)","\\1", dat$address)
# Clean up zip
dat$zip = gsub("O","0", dat$zip)
dat$zip = str_trim(dat$zip)
dat = dat[,c(1:2,7:8,4:6)]
dat
   LastName FirstName streetno          streetname     city state        zip
1     Bania Thomas M.      725   Commonwealth Ave.   Boston    MA      02215
2   Barnaby     David      373       W. Geneva St. Wms. Bay    WI      53191
3    Bausch      Judy      373       W. Geneva St. Wms. Bay    WI      53191
...
41   Wright      Greg      791 Holmdel-Keyport Rd.  Holmdel    NY 07733-1988
42  Zingale   Michael     5640       S. Ellis Ave.  Chicago    IL      60637
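To try the same idea without downloading the file, here is a minimal base-R sketch on a few made-up lines. The sample strings (including the letter "O" typed in place of a zero in one zip) are assumptions for illustration only and are not copied from addr.txt. It also assumes that fields are always separated by two or more spaces and that every address starts with a street number. It shows the three steps: split on runs of spaces, pull the leading digits out of the address, and clean the zip.

```r
# Hypothetical sample lines standing in for addr.txt (not the real file).
# Fields are separated by 2+ spaces; single spaces occur inside fields.
lines <- c(
  "Bania      Thomas M.   725 Commonwealth Ave.    Boston     MA  O2215  ",
  "Barnaby    David       373 W. Geneva St.        Wms. Bay   WI  53191",
  "Zingale    Michael     5640 S. Ellis Ave.       Chicago    IL  60637"
)

# Trim leading/trailing whitespace first so it does not create an extra
# empty field, then split on runs of two or more spaces.
fields <- strsplit(trimws(lines), " {2,}")

# Fail early if any line does not yield exactly six fields.
stopifnot(all(lengths(fields) == 6))

dat <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(dat) <- c("LastName", "FirstName", "address", "city", "state", "zip")

# Leading digits are taken as the street number, the remainder as the name.
dat$streetno   <- as.numeric(sub("^([0-9]+) .*$", "\\1", dat$address))
dat$streetname <- sub("^[0-9]+ +", "", dat$address)

# Replace a letter "O" typed in place of a zero in the zip code.
dat$zip <- gsub("O", "0", dat$zip, fixed = TRUE)

# Reorder by column name rather than by position.
dat <- dat[, c("LastName", "FirstName", "streetno", "streetname",
               "city", "state", "zip")]
print(dat)
```

Selecting columns by name instead of by index keeps the final reordering readable if columns are added later, and the `stopifnot` check flags lines that do not fit the two-or-more-spaces assumption instead of letting `rbind` silently recycle values.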
Regarding "r - How to read a text file and create a data frame in R", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33384095/