I have the following code. I'm trying to create a new field in a pandas DataFrame using a simple condition:
if df_reader['email1_b']=='NaN':
    df_reader['email1_fin']=df_reader['email1_a']
else:
    df_reader['email1_fin']=df_reader['email1_b']
But I get this strange error:
ValueError Traceback (most recent call last)
<ipython-input-92-46d604271768> in <module>()
----> 1 if df_reader['email1_b']=='NaN':
2 df_reader['email1_fin']=df_reader['email1_a']
3 else:
4 df_reader['email1_fin']=df_reader['email1_b']
/home/user/GL-env_py-gcc4.8.5/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
953 raise ValueError("The truth value of a {0} is ambiguous. "
954 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 955 .format(self.__class__.__name__))
956
957 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Can anyone explain what I need to do about this?
Best answer
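df_reader['email1_b']=='NaN' is a vector of bool values (one per row), but if needs a single bool value to work. pandas cannot decide whether the whole Series should count as True or False, which is why it raises the ValueError. Use this instead: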
import numpy as np

df_reader['email1_fin'] = np.where(df_reader['email1_b']=='NaN',
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
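To make the difference concrete, here is a minimal sketch that reproduces the error and then applies np.where. The sample rows are made up for illustration; only the column names come from the question, and the example assumes email1_b holds the literal string 'NaN'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com", "a3@example.com"],
    "email1_b": ["NaN", "b2@example.com", "NaN"],  # the string 'NaN', not a missing value
})

# Element-wise comparison: one bool per row, not a single bool.
mask = df_reader["email1_b"] == "NaN"

# Using the whole mask in `if` raises the ValueError from the question.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# np.where takes email1_a where the mask is True and email1_b elsewhere.
df_reader["email1_fin"] = np.where(mask, df_reader["email1_a"], df_reader["email1_b"])
```

The row-wise choice happens inside np.where, so no Python-level if is needed at all.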
As a side note, are you sure the values are the string 'NaN' and not the missing value NaN? In the latter case, your expression should be:
df_reader['email1_fin'] = np.where(df_reader['email1_b'].isnull(),
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
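The sketch below illustrates why the distinction matters, again with made-up sample rows: when the column contains real missing values, comparing against the string 'NaN' matches nothing, while isnull() finds them. The fillna line is an alternative that is not part of the original answer; it is shown only as an assumption about what may read more naturally for this particular case.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with real missing values (np.nan) in email1_b.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com", "a3@example.com"],
    "email1_b": [np.nan, "b2@example.com", np.nan],
})

# Comparing against the string 'NaN' never matches a real missing value.
string_mask = df_reader["email1_b"] == "NaN"

# isnull() flags the rows that are actually missing.
null_mask = df_reader["email1_b"].isnull()

df_reader["email1_fin"] = np.where(null_mask, df_reader["email1_a"], df_reader["email1_b"])

# Alternative (not from the answer): fill missing email1_b values from email1_a.
alt = df_reader["email1_b"].fillna(df_reader["email1_a"])
```

Both approaches should give the same column here; fillna simply states the intent ("use email1_b, fall back to email1_a") more directly.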
Regarding "python - Truth value of a Series is ambiguous in dataframe", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45811610/