python - Modifying a dictionary of DataFrames in Pandas with a for loop

Tags: python pandas dictionary for-loop dataframe

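I am working through a Pandas tutorial. Partway through, I decided to try something I thought should be straightforward. I have condensed it into simple code so that others can reproduce it themselves and help me see whether the mistake is mine or a bug in Python.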

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': 1.,
                   'B': pd.Timestamp('20130102'),
                   'C': pd.Series(1, index = list(range(4)), dtype = 'float32'),
                   'D': np.array([3] * 4, dtype = 'int32'),
                   'E': pd.Categorical(["test", "train", "test", "train"]),
                   'F': 'foo'
                   })

# Made copy of df and modified it individually to show that it works.
df2 = df
df2.drop([1,3], inplace=True) # Dropping 2nd and 4th row.
print(df2)

# Now trying to do the same for multiple dataframes in a 
# dictionary keeps giving me an error.

dic = {'1900' : df, '1901' : df, '1902' : df} # Dic w/ 3 pairs.
names = ['1900', '1901', '1902']              # The dic keys in list.

# For loop to drop the 2nd and 4th row.
for ii in names:
    df_dic = dic[str(ii)]
    df_dic.drop([1,3], inplace=True)
    dic[str(ii)] = df_dic

The output I get is:

     A          B    C  D     E    F
0  1.0 2013-01-02  1.0  3  test  foo
2  1.0 2013-01-02  1.0  3  test  foo
--------------------------------------------------------------------------
ValueError                               Traceback (most recent call last)
<ipython-input-139-8236a9c3389e> in <module>()
     21 for ii in names:
     22     df_dic = dic[str(ii)]
---> 23     df_dic.drop([1,3], inplace=True)

C:\Anaconda3\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, level, inplace, errors)
   1905                 new_axis = axis.drop(labels, level=level, errors=errors)
   1906             else:
-> 1907                 new_axis = axis.drop(labels, errors=errors)
   1908             dropped = self.reindex(**{axis_name: new_axis})
   1909             try:

C:\Anaconda3\lib\site-packages\pandas\indexes\base.py in drop(self, labels, errors)
   3260             if errors != 'ignore':
   3261                 raise ValueError('labels %s not contained in axis' %
-> 3262                                  labels[mask])
   3263             indexer = indexer[~mask]
   3264         return self.delete(indexer)

ValueError: labels [1 3] not contained in axis

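Clearly, dropping the rows works when done on its own, since it gives me the output I want. Why does doing the same thing in a for loop make it misbehave?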

Thanks in advance.

Best answer

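Neither df2 = df nor {'1900' : df, '1901' : df, '1902' : df} creates a copy; every name refers to the same DataFrame object. The first in-place drop removes labels 1 and 3 from that one object, so any later drop([1,3]) on it raises "labels [1 3] not contained in axis". You need to copy the DataFrame (and likewise use df2 = df.copy() in the earlier step, so that df still has all four rows when the loop starts):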

for ii in names:
    df_dic = dic[str(ii)].copy()
    df_dic.drop([1,3], inplace=True)
    dic[str(ii)] = df_dic

print (dic)
{'1900':      A          B    C  D     E    F
0  1.0 2013-01-02  1.0  3  test  foo
2  1.0 2013-01-02  1.0  3  test  foo, '1902':      A          B    C  D     E    F
0  1.0 2013-01-02  1.0  3  test  foo
2  1.0 2013-01-02  1.0  3  test  foo, '1901':      A          B    C  D     E    F
0  1.0 2013-01-02  1.0  3  test  foo
2  1.0 2013-01-02  1.0  3  test  foo}
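
The aliasing behind the error can be checked directly. The following is a minimal sketch, not part of the original answer; the names alias and clone and the reduced two-column frame are made up for illustration.

```python
import pandas as pd

# Reduced stand-in for the question's DataFrame (columns chosen for illustration).
df = pd.DataFrame({'A': [1.0, 1.0, 1.0, 1.0], 'F': ['foo'] * 4})

alias = df          # plain assignment: a second name for the same object
clone = df.copy()   # an independent DataFrame with its own data

alias.drop([1, 3], inplace=True)   # mutates the object that df also names

same_object = alias is df          # the two names point at one object
df_rows = list(df.index)           # df lost rows 1 and 3 as well
clone_rows = list(clone.index)     # the copy still has all four rows

# A dictionary built from one object holds three references to it, not three frames.
dic = {'1900': df, '1901': df, '1902': df}
distinct_objects = len({id(v) for v in dic.values()})

print(same_object, df_rows, clone_rows, distinct_objects)
```

Because all three dictionary values are the same object, a second in-place drop of labels 1 and 3 through any key has nothing left to remove, which is what the traceback in the question reports.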

See the section on copying in the pandas docs.
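
As an alternative to copying and then dropping in place, drop can be called without inplace=True, in which case it returns a new DataFrame and leaves the original untouched. This is a sketch under that assumption rather than the approach given in the answer; it reuses a subset of the question's columns, and the helper variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Subset of the question's DataFrame; the scalar columns are broadcast to 4 rows.
df = pd.DataFrame({'A': 1.,
                   'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                   'D': np.array([3] * 4, dtype='int32'),
                   'F': 'foo'})

names = ['1900', '1901', '1902']

# drop() without inplace=True returns a new DataFrame each time,
# so every key gets its own independent result and df keeps all rows.
dic = {name: df.drop([1, 3]) for name in names}

original_rows = list(df.index)
result_rows = {name: list(frame.index) for name, frame in dic.items()}
independent = dic['1900'] is not dic['1901']

print(original_rows, result_rows, independent)
```

With this form the source frame never changes, so the order in which keys are processed no longer matters.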

Regarding python - Modifying a dictionary of DataFrames in Pandas with a for loop, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41300467/
