python - Pandas equivalent of np.where

Tags: python pandas numpy where-clause

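np.where has the semantics of a vectorized if/else (similar to Apache Spark's when/otherwise DataFrame methods). I know I can use np.where on a pandas.Series, but pandas often defines its own API to use in place of raw numpy functions, and that API is usually more convenient with pd.Series/pd.DataFrame.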

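Sure enough, I found pandas.DataFrame.where. At first glance, however, it has completely different semantics. I could not find a way to rewrite even the most basic np.where example using pandas where: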

# df is pd.DataFrame
# how to write this using df.where?
df['C'] = np.where((df['A']<0) | (df['B']>0), df['A']+df['B'], df['A']/df['B'])

Am I missing something obvious? Or is pandas' where intended for a completely different use case, despite sharing its name with np.where?

Best answer

Try:

(df['A'] + df['B']).where((df['A'] < 0) | (df['B'] > 0), df['A'] / df['B'])
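To check that this expression really reproduces the np.where call from the question, here is a minimal sketch. The sample values in df are made up for illustration (B contains no zeros, so the division stays finite); they are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only to exercise both branches.
df = pd.DataFrame({"A": [-2.0, 3.0, 4.0, -1.0],
                   "B": [1.0, -3.0, 2.0, -4.0]})

cond = (df["A"] < 0) | (df["B"] > 0)

# The question's original numpy formulation.
via_numpy = np.where(cond, df["A"] + df["B"], df["A"] / df["B"])

# The pandas formulation: start from the "if true" values,
# then swap in the "if false" values wherever cond is False.
via_pandas = (df["A"] + df["B"]).where(cond, df["A"] / df["B"])

df["C"] = via_pandas
print(df)
```

Only the second row fails the condition here, so it is the only one that takes the A / B branch.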

The difference between numpy's where and DataFrame's where is that the default values are supplied by the DataFrame on which the where method is called (docs).

That is,
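A small sketch of what "default values" means in practice: the calling object supplies the values that are kept where the condition holds, and if other is omitted the remaining positions become NaN. The Series s below is a made-up example.

```python
import pandas as pd

# Hypothetical example values.
s = pd.Series([1.0, -2.0, 3.0])

# No `other` given: positions where the condition is False become NaN.
kept = s.where(s > 0)

# `other` given: positions where the condition is False take that value.
replaced = s.where(s > 0, 0.0)

print(kept)
print(replaced)
```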

np.where(m, A, B)

is roughly equivalent to

A.where(m, B)
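One reason the equivalence is only rough: np.where hands back a plain ndarray, while the pandas method returns a Series that keeps the index. The sketch below uses the names m, A and B from above with made-up values and a made-up string index.

```python
import numpy as np
import pandas as pd

# Hypothetical data sharing one non-default index.
idx = ["x", "y", "z"]
A = pd.Series([1, 2, 3], index=idx)
B = pd.Series([10, 20, 30], index=idx)
m = pd.Series([True, False, True], index=idx)

arr = np.where(m, A, B)  # plain ndarray, index labels are dropped
ser = A.where(m, B)      # pandas Series, index labels are kept

print(type(arr).__name__, arr)
print(type(ser).__name__, list(ser.index), ser.tolist())
```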

If you want a similar call signature using pandas, you can take advantage of the way method calls work in Python:

# the operands are Series, so the unbound Series.where is used here
pd.Series.where(cond=(df['A'] < 0) | (df['B'] > 0), self=df['A'] + df['B'], other=df['A'] / df['B'])

Or without kwargs (note: the positional order of the arguments differs from the numpy where argument order):

pd.Series.where(df['A'] + df['B'], (df['A'] < 0) | (df['B'] > 0), df['A'] / df['B'])
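If the numpy argument order is what you are after, a thin wrapper is one option. The sketch below is only an illustration: where_np_order is a hypothetical helper name, not a pandas function, and the Series a and b are made-up data. It also checks that calling the method unbound (here via pd.Series.where, since the operands are Series) gives the same result as the ordinary bound call.

```python
import numpy as np
import pandas as pd


def where_np_order(cond, if_true, if_false):
    """Hypothetical helper: np.where argument order, pandas result."""
    return if_true.where(cond, if_false)


# Hypothetical sample data.
a = pd.Series([-1.0, 2.0, 3.0])
b = pd.Series([4.0, -5.0, 6.0])
cond = (a < 0) | (b > 0)

bound = (a + b).where(cond, a / b)             # ordinary method call
unbound = pd.Series.where(a + b, cond, a / b)  # same call, `self` passed explicitly
helper = where_np_order(cond, a + b, a / b)    # numpy-style argument order

print(helper)
```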

Source: "python - Pandas equivalent of np.where" on Stack Overflow: https://stackoverflow.com/questions/38579532/
