python - Why is my column data off by one in Pandas?

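I'm using the Pandas library to process text because I find it much easier than the csv module. Here is the problem. I have a .csv file with several columns: subtitle, title, and description. This is how I access the row contents of each column.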

import pandas

colnames = ['subtitle', 'description', 'title']
# Raw string so the backslashes in the Windows path are not treated as escapes
data = pandas.read_csv(r'C:\Users\B\cwitems.csv', names=colnames)
subtit = list(data.subtitle)
desc = list(data.description)
title = list(data.title)

for line in zip(subtit, desc, title):
    print(line)

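The problem is that, for whatever reason, when I print line, the expected subtitle is not printed. When I print each desc, the title shows up. When I print subtit on its own, the description is printed. So every column seems to be off by one. Can anyone explain this behavior? Is it expected? How can I avoid it?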

Best answer

I think you are trying to load a file that has 4 columns while supplying only 3 column names. If you only need the first 3 columns, use

data = pandas.read_csv(r'C:\Users\B\cwitems.csv', names=colnames, usecols=[0, 1, 2])

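You don't have to delete the unused column from the file.

By default, read_csv loads all columns. In your case the file has one more column than you supplied names for (#cols = #colnames + 1), so the first column is used as the DataFrame index and all remaining columns are shifted by one.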
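To see the shift without the original file, here is a minimal sketch. It assumes a hypothetical four-column CSV held in memory, with an unnamed trailing "extra" field. The sample values and the names `raw`, `shifted`, and `fixed` are made up for illustration; only `read_csv`, `names`, and `usecols` come from the answer above.

```python
import io

import pandas as pd

# Hypothetical data: four fields per row, but only three names are supplied.
raw = (
    "sub A,desc A,title A,extra A\n"
    "sub B,desc B,title B,extra B\n"
)
colnames = ["subtitle", "description", "title"]

# Without usecols, the surplus leading column becomes the index,
# so each named column holds the value of its right-hand neighbour.
shifted = pd.read_csv(io.StringIO(raw), names=colnames)
print(shifted.index.tolist())
print(shifted["subtitle"].tolist())

# Restricting the read to the first three columns lines the names up again.
fixed = pd.read_csv(io.StringIO(raw), names=colnames, usecols=[0, 1, 2])
print(fixed["subtitle"].tolist())
```

A quick check on a real file is to look at `data.index`: if it holds values you expected in the first named column rather than 0, 1, 2, ..., the file has more columns than names.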

Regarding python - Why is my column data off by one in Pandas?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23436681/
