I have a data frame that looks like this, but with many rows:
column_1  column_2         column_3
1         {lk, 18m, NaN}   {kjaf, NaN}
I want to remove the NaN from each set, but iterating over the rows raises RuntimeError: Set changed size during iteration.
The code I have used so far is:
for index, row in df.iterrows():
    col2 = row['column_2']
    col3 = row['column_3']
    for x in col2:
        col2.discard('NaN')
    for y in col3:
        col3.discard('NaN')
Best answer
If the NaN entries are missing values, you can filter them out with an if clause in a set comprehension:
import numpy as np
import pandas as pd

df = pd.DataFrame({'column_1': [1, 1],
                   'column_2': [[np.nan, '18m'], ['lk', 'r']],
                   'column_3': [['kjaf'], ['ddd']]})
print(df)
   column_1    column_2 column_3
0         1  [nan, 18m]   [kjaf]
1         1     [lk, r]    [ddd]

cols = ['column_2', 'column_3']
# keep only the non-missing items of each cell
# (in pandas 2.1+ applymap is deprecated in favour of DataFrame.map)
df[cols] = df[cols].applymap(lambda x: {i for i in x if pd.notna(i)})
# older pandas versions
# df[cols] = df[cols].applymap(lambda x: {i for i in x if pd.notnull(i)})
print(df)
   column_1 column_2 column_3
0         1    {18m}   {kjaf}
1         1  {r, lk}    {ddd}
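One reason to filter rather than call discard for missing values: a float NaN never compares equal to another NaN, so discard only removes it when it is handed the very same object. The sketch below illustrates this with plain Python; drop_nan is a hypothetical helper name, not something from pandas.

```python
import math


def drop_nan(values):
    """Return a new set without float NaN members (hypothetical helper)."""
    return {v for v in values if not (isinstance(v, float) and math.isnan(v))}


s = {float("nan"), "18m"}

# This passes a *different* NaN object, so nothing is removed:
# NaN != NaN, and the set only matches on identity or equality.
s.discard(float("nan"))
still_has_nan = any(isinstance(v, float) and math.isnan(v) for v in s)

# Filtering by a NaN test does not depend on object identity.
cleaned = drop_nan(s)
```

Here still_has_nan stays True after the discard call, while cleaned contains only '18m'.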
If NaN is the string 'NaN':
import pandas as pd

df = pd.DataFrame({'column_1': [1, 1],
                   'column_2': [['NaN', '18m'], ['lk', 'r']],
                   'column_3': [['kjaf'], ['ddd']]})
print(df)
   column_1    column_2 column_3
0         1  [NaN, 18m]   [kjaf]
1         1     [lk, r]    [ddd]

cols = ['column_2', 'column_3']
# keep every item except the literal string 'NaN'
df[cols] = df[cols].applymap(lambda x: {i for i in x if i != 'NaN'})
print(df)
   column_1 column_2 column_3
0         1    {18m}   {kjaf}
1         1  {r, lk}    {ddd}
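As for the RuntimeError in the question, it most likely comes from the inner loops rather than from iterrows: each inner loop iterates a set while discard shrinks that same set. A minimal sketch, assuming the cells hold the string 'NaN' and using made-up values modelled on the question's row:

```python
# Hypothetical cell value modelled on the question's example row.
col2 = {"lk", "18m", "NaN"}

# Reproduce the failure: the set shrinks while the for loop iterates it.
try:
    for _ in col2:
        col2.discard("NaN")
    raised = False
except RuntimeError:
    raised = True

# In-place alternative: discard() needs no surrounding loop and does
# nothing when the value is absent.
cells = [{"lk", "18m", "NaN"}, {"kjaf", "NaN"}]
for cell in cells:  # iterate the list, mutate each set exactly once
    cell.discard("NaN")
```

Calling discard once per set avoids the error and modifies the sets in place, whereas the comprehension approach above builds new sets and also works when the cells are lists.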
Source: python - Removing 'NaN' values from sets in a data frame, a similar question on Stack Overflow: https://stackoverflow.com/questions/52369647/