I have a text file, file.txt, in which the first word on each line is the class name and the rest of the line is the description, like this:
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
I want to read the file into a DataFrame with two columns: df['class'] holding the class name and df['description'] holding the rest of the line.
Best answer
You can do it like this:
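import pandas as pd
# No line contains two or more consecutive whitespace characters,
# so the separator never matches and each line lands whole in 'col'
df = pd.read_csv('file.txt', sep=r'\s{2,}', engine='python', names=['col'])
# The first whitespace-delimited token is the class name
df['class'] = df['col'].str.split().str[0]
# Split only on the first occurrence of whitespace; the remainder is the description
df['description'] = df['col'].str.split(n=1).str[1]
del df['col']
print(df)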
       class                                        description
0  n01440764                                 tench, Tinca tinca
1  n01443537                        goldfish, Carassius auratus
2  n01484850  great white shark, white shark, man-eater, man...
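If you would rather not rely on the separator regex never matching, a plain-Python pass over the lines is another option. The following is only a minimal sketch: the helper name read_class_file is hypothetical, and an in-memory StringIO stands in for file.txt so the example runs on its own. It assumes every non-empty line starts with the class name, and it gives a line with no description an empty string.

```python
import io

import pandas as pd

# Sample lines from the question; in practice pass open('file.txt') instead.
sample = io.StringIO(
    "n01440764 tench, Tinca tinca\n"
    "n01443537 goldfish, Carassius auratus\n"
    "n01484850 great white shark, white shark, man-eater, "
    "man-eating shark, Carcharodon carcharias\n"
)


def read_class_file(handle):
    """Hypothetical helper: split each non-empty line at its first whitespace run."""
    rows = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(maxsplit=1)
        # Assumption: a line holding only a class name gets an empty description.
        description = parts[1] if len(parts) > 1 else ""
        rows.append((parts[0], description))
    return pd.DataFrame(rows, columns=["class", "description"])


df = read_class_file(sample)
print(df)
```

Because the split happens before pandas sees the data, commas and repeated spaces inside the description cannot be mistaken for column separators.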
Regarding "python - reading a file into a DataFrame, splitting the text after the first word", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39305748/