python - Filter multiple DataFrame columns on a list

Tags: python pandas dataframe

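I have a DataFrame with 5 player columns containing random player names. I want to pass in a list of players and get back only the rows in which both players appear (across those 5 columns).

Here is code that generates the DataFrame and successfully selects the rows containing either of the people. How can I make sure a row contains both people?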

import random

import pandas as pd

random_events = ('SHOT', 'MISSED_SHOT', 'GOAL')
random_team = ('Preferred', 'Other')
events = list()

for i in range(6):
    event = dict()
    event['event_type'] = random.choice(random_events)
    event['team'] = random.choice(random_team)
    event['coords_x'] = round(random.uniform(-100, 100), 2)
    event['coords_y'] = round(random.uniform(-42.5, 42.5), 2)
    event['person_1'] = f'Person {random.randint(1, 2)}'
    event['person_2'] = f'Person {random.randint(3, 4)}'
    event['person_3'] = f'Person {random.randint(5, 6)}'
    event['person_4'] = f'Person {random.randint(7, 8)}'
    event['person_5'] = f'Person {random.randint(9, 10)}'
    events.append(event)

df = pd.DataFrame(events)
print(df)


filter_list = ['Person 1', 'Person 3']
filtered_df = df.loc[
    (df['person_1'].isin(filter_list)) |
    (df['person_2'].isin(filter_list)) |
    (df['person_3'].isin(filter_list)) |
    (df['person_4'].isin(filter_list)) |
    (df['person_5'].isin(filter_list))]

print(filtered_df)

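This is the result I get: it shows rows that contain only Person 1 or only Person 3, as well as rows that contain both Person 1 and Person 3. In the example below, I only want the row at index 2 returned.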

   coords_x  coords_y   event_type  person_1  person_2  person_3  person_4   person_5       team
0     38.82    -39.18  MISSED_SHOT  Person 1  Person 4  Person 6  Person 7   Person 9   Preferred
2     94.43     30.13         GOAL  Person 1  Person 3  Person 5  Person 8   Person 9       Other
3    -68.38    -24.42  MISSED_SHOT  Person 2  Person 3  Person 5  Person 7  Person 10   Preferred
4     99.48     22.79         SHOT  Person 1  Person 4  Person 5  Person 7   Person 9   Preferred

Thanks in advance.

Best answer

Here is a general approach. You may need to experiment with it to fit your specific case.

# Define a list of all of the person columns in the dataframe
person_cols = [f'person_{i}' for i in [1, 2, 3, 4, 5]]

# Which rows contain Person 2 in any column? (creates a series of True or False)
(df[person_cols] == "Person 2").any(axis='columns')

# Which rows contain both Person 2 and Person 3? 
# This time I'm saving the series to use as a selection mask
mask = (
        (df[person_cols] == "Person 2").any(axis='columns') 
      & (df[person_cols] == "Person 3").any(axis='columns') 
)

# show just the rows where the mask above is True
print(df[mask])
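
To see what the mask selects without random data, here is a small deterministic sketch. The three-row frame, its values, and the names df_demo and demo_cols are made up for illustration; only the masking expression follows the code above.

```python
import pandas as pd

# Hypothetical fixed data (not from the original post) so the result is reproducible
df_demo = pd.DataFrame({
    'person_1': ['Person 1', 'Person 2', 'Person 2'],
    'person_2': ['Person 3', 'Person 3', 'Person 4'],
    'person_3': ['Person 5', 'Person 6', 'Person 5'],
})
demo_cols = ['person_1', 'person_2', 'person_3']

# One boolean Series per required person: True where any column matches
has_p2 = (df_demo[demo_cols] == 'Person 2').any(axis='columns')
has_p3 = (df_demo[demo_cols] == 'Person 3').any(axis='columns')

# Element-wise AND keeps only the rows that contain both people
demo_mask = has_p2 & has_p3
print(df_demo[demo_mask])
```

Only the middle row (index 1) contains both 'Person 2' and 'Person 3', so it is the only row kept.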

Edit:

Build a mask for an arbitrary list of players who must all be present.

from operator import and_
from functools import reduce

# Names must match the column values exactly ('Person N', not 'Player N')
players = ['Person 1', 'Person 3', 'Person 5']
filters = [(df[person_cols] == p).any(axis='columns') for p in players]
mask = reduce(and_, filters, True)

print(df[mask])
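
If this filter is needed in more than one place, the same idea could be wrapped in a small helper. The sketch below is one possible way to do it, not part of the original answer: the function name rows_with_all and the demo frame are hypothetical, and a plain loop is used instead of reduce so that an empty list of names simply returns every row.

```python
import pandas as pd

def rows_with_all(df, cols, required):
    """Return the rows of df where every name in `required` appears in at least one of `cols`."""
    # Start from an all-True mask so an empty `required` list selects every row
    mask = pd.Series(True, index=df.index)
    for name in required:
        # True for rows where `name` appears in any of the given columns
        mask = mask & df[cols].eq(name).any(axis='columns')
    return df[mask]

# Hypothetical demo data, chosen so the result is easy to check by eye
demo = pd.DataFrame({
    'person_1': ['Person 1', 'Person 1', 'Person 2'],
    'person_2': ['Person 3', 'Person 4', 'Person 3'],
    'person_3': ['Person 5', 'Person 5', 'Person 6'],
})
demo_person_cols = ['person_1', 'person_2', 'person_3']

both = rows_with_all(demo, demo_person_cols, ['Person 1', 'Person 3'])
print(both)
```

Here only the first row contains both 'Person 1' and 'Person 3'.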

Regarding python - Filter multiple DataFrame columns on a list, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53135866/
