python - An alternative to concat for inserting records into a dataframe

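I have a for loop with 90,000 iterations. Each iteration produces one row, and at the end of the loop I want a dataframe containing all 90K rows.

Here is what I currently do: on each iteration, I store the row as a dataframe called "sum_df" and use concat to insert it into a dataframe called output_df, like this: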

output_df = pd.concat([output_df, sum_df], sort=False)

However, concat seems inefficient and slows execution down. What is a better approach?

Best answer

"I store the row as a dataframe and use concat to insert each row into the dataframe called output_df."

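That preprocessing step is the source of the inefficiency. Concatenating dataframes is expensive compared with appending to a list of lists, and calling concat inside the loop copies all the rows accumulated so far on every iteration. So don't store each row as a dataframe. Assuming you can convert a "row" into a single list: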

import pandas as pd

LoL = []
for item in some_iterable:
    lst = func(item)    # func is a function that returns a list built from item
    LoL.append(lst)     # append the row to the list of lists
df = pd.DataFrame(LoL)  # construct the dataframe once from the list of lists

Or, more concisely:

df = pd.DataFrame([func(item) for item in some_iterable])
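
As a minimal, self-contained sketch of the same idea: the helper make_row below is hypothetical and only stands in for whatever work each iteration does, and the column names "id" and "square" are made up for illustration. The first option collects plain Python rows (dicts here, so the columns get names) and builds the dataframe once. The second option assumes each iteration really must produce a one-row dataframe; in that case the frames can still be collected in a list and concatenated a single time after the loop, rather than on every iteration.

```python
import pandas as pd


def make_row(i):
    # Hypothetical stand-in for the per-iteration work; returns one row as a dict.
    return {"id": i, "square": i * i}


# Option 1: collect plain Python rows and build the dataframe once.
rows = [make_row(i) for i in range(90_000)]
output_df = pd.DataFrame(rows)

# Option 2: if each iteration must yield a one-row dataframe (like sum_df),
# collect the frames in a list and call concat a single time after the loop.
frames = [pd.DataFrame([make_row(i)]) for i in range(1_000)]
concat_df = pd.concat(frames, ignore_index=True, sort=False)

print(output_df.shape)
print(concat_df.shape)
```

Option 1 is usually the faster of the two because it never creates the intermediate one-row dataframes. Option 2 is mainly useful when the code that produces each row cannot easily be changed.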

Source: this question and its answer come from Stack Overflow: https://stackoverflow.com/questions/54292590/
