python - How to fill data gaps only when both extremities have the same value, with a limit on the maximum number of filled values?

Tags: python pandas dataframe fillna

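I searched here for an answer that solves this problem but could not find one. The desired result is to fill a gap only when the values at its two extremities are equal, with the fill limited to 4 values per gap:

My dataset: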

0     NaN
1     NaN
2     NaN
3     5.0
4     5.0
5     NaN
6     NaN
7     5.0
8     6.0
9     NaN
10    NaN
11    NaN
12    NaN
13    NaN
14    NaN
15    5.0
16    5.0
17    NaN
18    NaN
19    6.0
20    6.0
21    NaN
22    NaN
23    NaN
24    NaN
25    5.0
26    NaN
27    NaN
28    NaN
29    NaN
30    NaN
31    NaN
32    NaN
33    5.0
34    NaN
35    NaN

Desired result (fill a gap only when the values at its extremities are equal, filling at most 4 values per gap):

0     NaN   # Not filled since the gap ends with 5 but this is the dataset beginning (don't know how it starts)
1     NaN   # Not filled since the gap ends with 5 but this is the dataset beginning (don't know how it starts)
2     NaN   # Not filled since the gap ends with 5 but this is the dataset beginning (don't know how it starts)
3     5.0  # Original dataset
4     5.0  # Original dataset
5     5.0    # Filled since the gap starts with 5 and ends with 5 (and is smaller than 4 values)
6     5.0    # Filled since the gap starts with 5 and ends with 5 (and is smaller than 4 values)
7     5.0  # Original dataset
8     6.0  # Original dataset
9     NaN    # Not filled since the gap starts with 6 and ends with 5
10    NaN         .
11    NaN         .
12    NaN         .
13    NaN         .
14    NaN    # Not filled since the gap starts with 6 and ends with 5
15    5.0  # Original dataset
16    5.0  # Original dataset
17    NaN    # Not filled since the gap starts with 5 and ends with 6
18    NaN    # Not filled since the gap starts with 5 and ends with 6
19    6.0  # Original dataset
20    6.0  # Original dataset
21    NaN    # Not filled since the gap starts with 6 and ends with 5
22    NaN         .
23    NaN         .
24    NaN    # Not filled since the gap starts with 6 and ends with 5
25    5.0  # Original dataset
26    5.0    # Filled since the gap starts with 5 and ends with 5
27    5.0    # Filled since the gap starts with 5 and ends with 5
28    5.0    # Filled since the gap starts with 5 and ends with 5
29    5.0    # Filled since the gap starts with 5 and ends with 5
30    NaN    # Not filled since maximum gap is 4
31    NaN    # Not filled since maximum gap is 4
32    NaN    # Not filled since maximum gap is 4
33    5.0  # Original dataset
34    NaN    # Not filled since the gap starts with 5 but this is the dataset end (don't know how it ends)
35    NaN    # Not filled since the gap starts with 5 but this is the dataset end (don't know how it ends)

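Best answer

Something like this should work:

import pandas as pd

def extremities(arr):
    """Fill gaps in place when the values on both sides are equal."""
    # pd.isna catches both None and NaN; `x == None` is False for NaN.
    nones = [i for i, x in enumerate(arr) if pd.isna(x)]
    not_nones = [i for i, x in enumerate(arr) if not pd.isna(x)]
    for i in nones:
        before = [x for x in not_nones if x < i]
        after = [x for x in not_nones if x > i]
        if not before or not after:
            continue  # leading or trailing gap: one extremity is unknown
        start, finish = before[-1], after[0]
        # Fill at most 4 positions after the last known value.
        if arr[start] == arr[finish] and i - start < 5:
            arr[i] = arr[start]
    return arr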

Edit:

Sorry, I forgot that the fill is limited to 4 values. I have edited the code.
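
Since the question is tagged pandas, the same rule can also be sketched without an explicit loop. The following is only one possible approach, not part of the original answer: the helper name `fill_equal_extremities` and its `limit` parameter are hypothetical, and it assumes the data is a numeric `pd.Series`. It compares the forward-filled and backward-filled series to find gaps whose extremities match, then applies a forward fill capped at `limit` values.

```python
import numpy as np
import pandas as pd


def fill_equal_extremities(s: pd.Series, limit: int = 4) -> pd.Series:
    """Forward-fill NaN gaps whose two neighbouring values are equal.

    Hypothetical helper (not from the answer above); at most `limit`
    values are filled per gap.
    """
    prev_known = s.ffill()  # last known value at or before each position
    next_known = s.bfill()  # next known value at or after each position
    # NaN never compares equal, so leading and trailing gaps drop out here.
    same_ends = prev_known.eq(next_known)
    capped = s.ffill(limit=limit)  # forward fill, at most `limit` per gap
    # Replace only missing positions that sit inside an equal-ended gap.
    return s.mask(s.isna() & same_ends, capped)


nan = np.nan
# The dataset from the question, indices 0 to 35.
data = pd.Series(
    [nan, nan, nan, 5, 5, nan, nan, 5, 6]
    + [nan] * 6
    + [5, 5, nan, nan, 6, 6]
    + [nan] * 4
    + [5]
    + [nan] * 7
    + [5, nan, nan]
)
result = fill_equal_extremities(data)
print(result.to_string())
```

Unlike the loop version, this returns a new Series and leaves the input unchanged. It relies on exact equality between the two extremity values, so float data that is only approximately equal would need a tolerance check instead.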

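Regarding "python - How to fill data gaps only when both extremities have the same value, with a limit on the maximum number of filled values?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67456952/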
