python - np.where with index greater than or equal to some value

I thought this would be fairly simple, but I am clearly missing something here.

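I would like to use np.where together with df.groupby('Name').apply() to create a new column in df (call it 'New') whose value is 1 if the row's index (the index of the original df) is greater than or equal to (>=) a particular value for that row's group, and 0 otherwise.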

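For background, I am grouping df by the 'Name' column, and I have a dict() that holds the value to use for each name produced by the groupby(). I hope this is clear; I can provide further clarification if necessary.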

Here is what I have so far, given this example df:

import numpy as np
import pandas as pd

df = pd.DataFrame([['William', 1, 0, 0, 0, 1],['James', 0, 1, 1, 1, 1],['James', 1, 0, 0, 0, 0],
                ['James', 1, 0, 1, 1, 0],['William', 0, 1, 1, 0, 1],['William', 0, 0, 0, 0, 0],
                ['William', 1, 0, 1, 1, 0],['James', 0, 1, 1, 0, 1],['James', 0, 0, 0, 0, 0]],
                columns=['Name','x1','x2','x3','x4','Interest'])

      Name  x1  x2  x3  x4  Interest
0  William   1   0   0   0         1
1    James   0   1   1   1         1
2    James   1   0   0   0         0
3    James   1   0   1   1         0
4  William   0   1   1   0         1
5  William   0   0   0   0         0
6  William   1   0   1   1         0
7    James   0   1   1   0         1
8    James   0   0   0   0         0

Then, for each group, I find the index of the last row in df where the 'Interest' column is 1, using:

mydict = df[df['Interest']==1].groupby('Name').apply(lambda x: x.index[-1]).to_dict()

{'James': 7, 'William': 4}
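As a side note, the same lookup can be sketched without apply by grouping the index labels directly. This is only an illustrative alternative, not the asker's method; the intermediate name `hits` is introduced here for the example, and the frame is reduced to the two columns that matter.

```python
import pandas as pd

# Reduced copy of the example data: only the columns this step needs.
df = pd.DataFrame({
    'Name': ['William', 'James', 'James', 'James', 'William',
             'William', 'William', 'James', 'James'],
    'Interest': [1, 1, 0, 0, 1, 0, 0, 1, 0],
})

# Keep only the rows of interest; their original index labels are preserved.
hits = df[df['Interest'] == 1]

# Turn the index labels into a Series, group them by Name,
# and take the last label in each group.
mydict = hits.index.to_series().groupby(hits['Name']).last().to_dict()
print(mydict)  # {'James': 7, 'William': 4}
```

Because the filter keeps the original index, the labels in `mydict` can be compared against `df.index` later on.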

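Note: this is a simplified example. In my actual application I pull the index of the third-to-last row (i.e. .apply(lambda x: x.index[-3]).to_dict()), but the next part is where the root of my question lies.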

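Now I want to create a new column 'New' whose value is 1 if the row index is >= the value in mydict for that row's group, and 0 otherwise. I have tried a few things: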

for key, val in mydict.items():
    df['New'] = np.where((df['Name']==key) & (df.index>=val), 1, 0)

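This obviously overwrites whatever was done for 'James' and returns a correct column only for 'William', because each pass through the loop reassigns the whole 'New' column. How can I do this efficiently?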
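For reference, the loop can be kept if each pass only switches on its own rows instead of reassigning the entire column. The following is a minimal sketch of that idea, assuming the df and mydict from the question; it is not the approach taken in the answer.

```python
import pandas as pd

df = pd.DataFrame(
    [['William', 1, 0, 0, 0, 1], ['James', 0, 1, 1, 1, 1], ['James', 1, 0, 0, 0, 0],
     ['James', 1, 0, 1, 1, 0], ['William', 0, 1, 1, 0, 1], ['William', 0, 0, 0, 0, 0],
     ['William', 1, 0, 1, 1, 0], ['James', 0, 1, 1, 0, 1], ['James', 0, 0, 0, 0, 0]],
    columns=['Name', 'x1', 'x2', 'x3', 'x4', 'Interest'])
mydict = {'James': 7, 'William': 4}

# Start from all zeros, then set only the rows matched in each pass,
# so one group's result is never overwritten by the next group.
df['New'] = 0
for key, val in mydict.items():
    mask = (df['Name'] == key) & (df.index >= val)
    df.loc[mask, 'New'] = 1

print(df['New'].tolist())  # [0, 0, 0, 0, 1, 1, 1, 1, 1]
```

This still makes one pass over the frame per name, so it is likely to be slower than a vectorized lookup when there are many groups.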

For completeness, here is my expected output:

      Name  x1  x2  x3  x4  Interest  New
0  William   1   0   0   0         1    0
1    James   0   1   1   1         1    0
2    James   1   0   0   0         0    0
3    James   1   0   1   1         0    0
4  William   0   1   1   0         1    1
5  William   0   0   0   0         0    1
6  William   1   0   1   1         0    1
7    James   0   1   1   0         1    1
8    James   0   0   0   0         0    1

Best answer

Use map to look up each row's threshold from mydict by name, then compare the index against it:

df.assign(New=(df.index >= df.Name.map(mydict)).astype(int))

      Name  x1  x2  x3  x4  Interest  New
0  William   1   0   0   0         1    0
1    James   0   1   1   1         1    0
2    James   1   0   0   0         0    0
3    James   1   0   1   1         0    0
4  William   0   1   1   0         1    1
5  William   0   0   0   0         0    1
6  William   1   0   1   1         0    1
7    James   0   1   1   0         1    1
8    James   0   0   0   0         0    1
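To make the one-liner easier to follow, here is a sketch that splits it into its intermediate steps. The names `threshold` and `at_or_after` are introduced only for illustration, and the frame is reduced to the 'Name' column since nothing else is used.

```python
import pandas as pd

# Reduced copy of the example data: only the Name column is needed here.
df = pd.DataFrame({'Name': ['William', 'James', 'James', 'James', 'William',
                            'William', 'William', 'James', 'James']})
mydict = {'James': 7, 'William': 4}

# Step 1: replace each name with its threshold from the dict.
threshold = df['Name'].map(mydict)
print(threshold.tolist())  # [4, 7, 7, 7, 4, 4, 4, 7, 7]

# Step 2: compare every row's index label with its own threshold.
at_or_after = df.index >= threshold

# Step 3: convert True/False to 1/0 and attach it as a new column.
result = df.assign(New=at_or_after.astype(int))
print(result['New'].tolist())  # [0, 0, 0, 0, 1, 1, 1, 1, 1]
```

Two things worth keeping in mind: df.assign returns a new DataFrame and leaves df unchanged, so the result has to be assigned back if it should persist. Also, a name that is missing from mydict maps to NaN, and a comparison against NaN evaluates to False, so such rows would end up with 0.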

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/51094437/
