python - Pandas: data above the start of a table

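I am using pandas to parse an Excel file that contains a 20k-row data table. So far this works well, but I also want to use a small piece of metadata that sits above the start of the table (the date the table was generated).

At the moment, if I do not skip any rows: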

raw = pd.read_excel(datafile, sheetname=0, parse_cols="B, D:I")

the first few rows are just NaNs:

>>> raw.values[0]
array([nan, nan, nan, nan, nan, nan, nan], dtype=object)

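I could open the file with something more basic, such as xlrd, to get at that data, but that would mean loading the whole file into memory twice, which I would rather not do.

Can pandas get at the data above the start of the table without re-importing the file?

Best answer

Consider the following approach: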

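import pandas as pd

# Open the workbook once; both reads below reuse this handle.
xl = pd.ExcelFile(filepath)

# Read the metadata cell directly (row 0, column 0 here; adjust as needed).
# xl.book is the underlying xlrd workbook when the xlrd engine is in use.
meta_data = xl.book.sheet_by_index(0).cell_value(0, 0)

skiprows = 5  # set it accordingly...

# parse_cols is named usecols in newer pandas releases.
df = xl.parse(0, skiprows=skiprows, parse_cols="B, D:I") \
       .dropna(axis=1, how='all')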
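
A variation on the same idea, offered only as a sketch and not as part of the original answer: read the sheet once with no header row, then slice the metadata and the table out of that single frame. This avoids touching the underlying workbook object, so it should not depend on which Excel engine is installed. The helper name `split_metadata_and_table`, the `header_row` parameter, and the sample values are hypothetical; the small in-memory frame stands in for the result of `pd.read_excel(datafile, sheet_name=0, header=None)`.

```python
import pandas as pd


def split_metadata_and_table(raw, header_row):
    """Split a header-less sheet into (metadata rows, data table).

    raw        -- DataFrame read with header=None
    header_row -- 0-based position of the row holding the column names
    """
    # Everything above the header row, minus fully empty rows and columns.
    meta = raw.iloc[:header_row].dropna(how="all").dropna(axis=1, how="all")

    header = raw.iloc[header_row]
    body = raw.iloc[header_row + 1:]

    # Keep only the columns that contain at least one value in the table body.
    keep = body.notna().any(axis=0).to_numpy()
    table = body.loc[:, keep].reset_index(drop=True)
    table.columns = header[keep].tolist()

    # Columns arrive as object dtype because metadata and data were mixed;
    # let pandas re-infer numeric dtypes where it can.
    return meta, table.infer_objects()


# Hypothetical stand-in for:
#   raw = pd.read_excel(datafile, sheet_name=0, header=None)
raw = pd.DataFrame([
    ["Generated on", "2017-05-17", None],
    [None, None, None],
    ["id", "value", None],
    [1, 10.5, None],
    [2, 20.0, None],
])

meta, table = split_metadata_and_table(raw, header_row=2)
generated_on = meta.iloc[0, 1]  # the metadata cell next to its label
```

The trade-off is that the whole sheet passes through pandas as object columns before the dtypes are re-inferred, which may be slower than `skiprows` on a large table, but the file is still opened and parsed only once.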

This question and answer on "python - Pandas: data above the start of a table" come from a similar question on Stack Overflow: https://stackoverflow.com/questions/44022762/
