python - Count the rows that are strictly greater than each row in pandas

I have a DataFrame with 2 columns:

positions = pd.DataFrame({"pos" : [1, 2, 3, 4, 5], "mcap" : [1, 4, 3, 2, 5]}, index = ["a", "b", "c", "d", "e"])

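For each index value, I need to count the points that lie up and to the right of it in the 2D plane. In other words, for each row I need to count the rows that are strictly greater than the current row in both columns.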

So the answer for the example above is:

pd.Series([4, 1, 1, 1, 0], index = ["a", "b", "c", "d", "e"])

I know how to do this in a loop, but as the DataFrame grows that takes a lot of computation time, so I am looking for a more pythonic way to do it.

Edit: a simple loop solution.

# Start every count at zero, keyed by the DataFrame's own index.
answer = pd.Series(0, index = positions.index)
for asset in positions.index:
    # Rows that beat the current row on each column separately.
    better_by_signal = positions[positions["pos"] > positions["pos"].loc[asset]].index
    better_by_cap = positions[positions["mcap"] > positions["mcap"].loc[asset]].index
    # Rows that beat it on both columns.
    idx_intersection = better_by_signal.intersection(better_by_cap)
    answer[asset] = len(idx_intersection)

Best answer

You can use numpy broadcasting to find all pairs of rows whose difference is positive on both the x axis (pos) and the y axis (mcap):

import numpy as np
import pandas as pd

positions = pd.DataFrame({"pos" : [1, 2, 3, 4, 5], "mcap" : [1, 4, 3, 2, 5]}, index = ["a", "b", "c", "d", "e"])

arrx = np.asarray([positions.pos])
arry = np.asarray([positions.mcap])
positions["count"] = ((arrx - arrx.T > 0) & (arry - arry.T > 0)).sum(axis = 1)

print(positions)

Example output

   pos  mcap  count
a    1     1      4
b    2     4      1
c    3     3      1
d    4     2      1
e    5     5      0
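
The broadcasting approach above builds two full n-by-n comparison matrices, so its memory use grows quadratically with the number of rows. One possible way around that, sketched here as an assumption rather than as part of the original answer, is to compare the rows in chunks. The function name `count_strictly_greater` and the `chunk_size` parameter are hypothetical names chosen for this illustration.

```python
import numpy as np
import pandas as pd


def count_strictly_greater(df, x_col, y_col, chunk_size=1024):
    """Count, for each row, the rows that are strictly greater in both columns.

    Hypothetical helper: same pairwise comparison as the broadcasting
    answer, but only chunk_size rows are compared against all rows at once.
    """
    x = df[x_col].to_numpy()
    y = df[y_col].to_numpy()
    counts = np.empty(len(df), dtype=np.int64)
    for start in range(0, len(df), chunk_size):
        stop = start + chunk_size
        # Shape (1, n) against shape (chunk, 1) broadcasts to (chunk, n):
        # entry [i, j] is True when row j is strictly greater than
        # row start + i on that column.
        higher_x = x[None, :] > x[start:stop, None]
        higher_y = y[None, :] > y[start:stop, None]
        counts[start:stop] = (higher_x & higher_y).sum(axis=1)
    return pd.Series(counts, index=df.index)


positions = pd.DataFrame(
    {"pos": [1, 2, 3, 4, 5], "mcap": [1, 4, 3, 2, 5]},
    index=["a", "b", "c", "d", "e"],
)
result = count_strictly_greater(positions, "pos", "mcap", chunk_size=2)
print(result.tolist())  # [4, 1, 1, 1, 0]
```

The total work is still quadratic in the number of rows; chunking only caps the size of the temporary boolean matrices at chunk_size by n.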

Regarding python - Count the rows that are strictly greater than each row in pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52986530/
