python - Pandas apply, but access the previously calculated value

Suppose I have a DataFrame (or Series) like this:

     Value
0    0.5
1    0.8
2    -0.2
3    None
4    None
5    None

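I want to create a new Result column.

Each result is computed from the previous Value by an arbitrary function f.

If the previous Value is not available (None or NaN), I want to use the previous Result instead (and, of course, apply f to it).

Using the previous value is easy; I just need shift. Accessing the previous result, however, does not seem that simple.

For example, the following code computes the result, but it cannot fall back to the previous result when needed.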

df['Result'] = df['Value'].shift(1).apply(f)

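Please assume f is arbitrary, so solutions that rely on something like cumsum are not possible.

Obviously this can be done by iterating, but I would like to know whether a more Pandas-style solution exists.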

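df['Result'] = float('nan')
for i in range(1, len(df)):
    # .loc, not .iloc: 'Value' is a column label (assumes the default 0..n-1 index)
    value = df.loc[i - 1, 'Value']
    # pd.isna handles both None and NaN; math.isnan(None) would raise TypeError
    if pd.isna(value):
        value = df.loc[i - 1, 'Result']
    df.loc[i, 'Result'] = f(value)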

Example output, given f = lambda x: x+1:

Bad:

   Value    Result
0    0.5       NaN
1    0.8       1.5
2   -0.2       1.8
3    NaN       0.8
4    NaN       NaN
5    NaN       NaN

Good:

   Value    Result
0    0.5       NaN
1    0.8       1.5
2   -0.2       1.8
3    NaN       0.8
4    NaN       1.8   <-- previous Value not available, used f(previous result)
5    NaN       2.8   <-- same
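
As a cross-check of the "Good" table, the rule can be written as a small reference function over a plain list. This is only a sketch: `is_missing` and `apply_with_fallback` are hypothetical names introduced here, not part of pandas, and the function treats both None and NaN as "not available".

```python
import math


def is_missing(x):
    """Treat both None and float NaN as 'not available'."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def apply_with_fallback(values, f):
    """Hypothetical helper: result[i] is f(values[i-1]), or f(result[i-1])
    when values[i-1] is missing. The first result has no predecessor and stays NaN."""
    results = [math.nan] * len(values)
    for i in range(1, len(values)):
        prev = values[i - 1]
        if is_missing(prev):
            prev = results[i - 1]  # fall back to the previous result
        results[i] = f(prev)
    return results


values = [0.5, 0.8, -0.2, None, None, None]
results = apply_with_fallback(values, lambda x: x + 1)
print(results)
```

With the example data, rows 4 and 5 come from applying f to the previous result, which is exactly what the shift-based version cannot do.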

Best answer

It looks to me like this has to be a loop. And I hate loops... so when I do loop, I use numba.

Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.

https://numba.pydata.org/

import numpy as np
from numba import njit


@njit
def f(x):
    return x + 1

@njit
def g(a):
    r = [np.nan]
    for v in a[:-1]:
        if np.isnan(v):
            r.append(f(r[-1]))
        else:
            r.append(f(v))
    return r

df.assign(Result=g(df.Value.values))

   Value  Result
0    0.5     NaN
1    0.8     1.5
2   -0.2     1.8
3    NaN     0.8
4    NaN     1.8
5    NaN     2.8
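
If numba is not available, the same recurrence can also be expressed with `itertools.accumulate` from the standard library, which carries the previous result forward as its running state. This is a sketch under assumptions, not the answer's method: `step` is a hypothetical helper, the input is assumed to be a list of floats with NaN for missing entries (for example from `df['Value'].tolist()` on a float column), and the `initial` argument requires Python 3.8 or later. It is still a Python-level loop underneath, so it will likely be slower than the compiled version on large inputs.

```python
import math
from itertools import accumulate


def f(x):
    return x + 1


def step(prev_result, prev_value):
    # Use the previous Value when it is available, otherwise the previous Result.
    return f(prev_result) if math.isnan(prev_value) else f(prev_value)


values = [0.5, 0.8, -0.2, math.nan, math.nan, math.nan]

# The first row has no predecessor, so the running state starts at NaN;
# the last Value is never used as a "previous" value, hence values[:-1].
result = list(accumulate(values[:-1], step, initial=math.nan))
print(result)
```

The resulting list has the same length as the input, so it could be attached with `df.assign(Result=result)` in the same way as the output of `g`.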

Regarding python - Pandas apply, but access the previously calculated value, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46421928/
