I'm using pandas to try to count the members who bought a particular contract type between two dates. The DataFrame I'm working with looks like this:
Member Nbr Contract-Type Date-Joined
20 1 Year Membership 2011-08-01
3128 3 Month Membership 2011-07-22
3535 4 Month Membership 2015-02-18
3760 4 Month Membership 2010-02-28
3762 3 Month Membership 2010-01-31
3882 1 Month Membership 2010-04-24
3892 3 Month Membership 2010-03-24
4116 3 Month Membership 2014-12-02
4700 1 Month Membership 2014-11-11
4802 4 Month Membership 2014-07-26
5004 1 Year Membership 2012-03-12
5020 1 Year Membership 2010-07-28
5022 3 Month Membership 2010-06-25
5130 1 Year Membership 2011-01-04
...
If I'm only interested in a single contract type, I can get the count with:
print(len(df[(df['Date-Joined'] > '2010-01-01')
             & (df['Date-Joined'] < '2012-02-01')
             & (df['Contract-Type'] == '1 Year Membership')]))
When I try something similar for either a 1 Year Membership or a 4 Month Membership, using the following code:
print(len(df[(df['Date-Joined'] > '2013-01-01')
             & (df['Date-Joined'] < '2013-02-01')
             & (df['Contract-Type'] == '1 Year Membership')
             or (df['Contract-Type'] == '4 Month Membership')]))
I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Replacing the or condition with an & condition returns 0.
Best answer
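Use | instead of or. Python's or needs a single truth value for each operand, and a Series does not have one, which is what the ValueError is reporting; | combines the two Series element by element. Also, & takes precedence over |, so your logic needs an extra set of parentheses around the two contract-type tests.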
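As a minimal sketch of both points, the snippet below uses small made-up boolean Series (the names a, b, x, y, z are illustrative only and not part of the question's data). It shows that or raises the same ValueError while | works element by element, and that an unparenthesized x & y | z is grouped as (x & y) | z.

```python
import pandas as pd

# Hypothetical boolean Series, standing in for the comparison results.
a = pd.Series([True, False, False])
b = pd.Series([False, False, True])

# `or` asks Python for one truth value for the whole Series; pandas refuses.
try:
    a or b
    raised = False
except ValueError:
    raised = True

# `|` is applied element by element and returns a new boolean Series.
elementwise = (a | b).tolist()

# `&` binds tighter than `|`, so x & y | z means (x & y) | z.
x = pd.Series([False, False])
y = pd.Series([True, True])
z = pd.Series([True, False])
implicit = (x & y | z).tolist()    # same as ((x & y) | z)
grouped = (x & (y | z)).tolist()   # what the question actually intends

print(raised, elementwise, implicit, grouped)
```

The two groupings give different results here, which is the same effect that lets out-of-range rows through when the parentheses are missing in a date filter.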
import io
import pandas as pd
data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')
df = pd.read_csv(data)
print(df[
(df['Date-Joined'] > '2010-01-01') &
(df['Date-Joined'] < '2012-02-01') &
(df['Contract-Type'] == '1 Year Membership')
])
# Member Nbr Contract-Type Date-Joined
# 0 20 1 Year Membership 2011-08-01
# 11 5020 1 Year Membership 2010-07-28
# 13 5130 1 Year Membership 2011-01-04
print(df[
(df['Date-Joined'] > '2010-01-01') &
(df['Date-Joined'] < '2012-02-01') &
(df['Contract-Type'] == '1 Year Membership') |
(df['Contract-Type'] == '4 Month Membership')
])
# Member Nbr Contract-Type Date-Joined
# 0 20 1 Year Membership 2011-08-01
# 2 3535 4 Month Membership 2015-02-18 <====== BEWARE!
# 3 3760 4 Month Membership 2010-02-28
# 9 4802 4 Month Membership 2014-07-26 <====== BEWARE!
# 11 5020 1 Year Membership 2010-07-28
# 13 5130 1 Year Membership 2011-01-04
print(df[
(df['Date-Joined'] > '2010-01-01') &
(df['Date-Joined'] < '2012-02-01') &
((df['Contract-Type'] == '1 Year Membership') |
(df['Contract-Type'] == '4 Month Membership'))
])
# Member Nbr Contract-Type Date-Joined
# 0 20 1 Year Membership 2011-08-01
# 3 3760 4 Month Membership 2010-02-28
# 11 5020 1 Year Membership 2010-07-28
# 13 5130 1 Year Membership 2011-01-04
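Since the question asks for a count rather than the rows, one possible variation is sketched below. It is not part of the answer above: it assumes the dates are parsed with parse_dates so the comparison is on real timestamps rather than strings, and it swaps the chained | for Series.isin, which avoids the precedence problem altogether. The names wanted, mask and count are illustrative.

```python
import io
import pandas as pd

data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')

# Parse the date column so comparisons are made on timestamps, not strings.
df = pd.read_csv(data, parse_dates=['Date-Joined'])

# Contract types of interest; isin() replaces the chain of == ... | == ...
wanted = ['1 Year Membership', '4 Month Membership']

mask = (
    (df['Date-Joined'] > pd.Timestamp('2010-01-01'))
    & (df['Date-Joined'] < pd.Timestamp('2012-02-01'))
    & df['Contract-Type'].isin(wanted)
)

# Summing a boolean mask counts the True entries, i.e. the matching members.
count = int(mask.sum())
members = df.loc[mask, 'Member Nbr'].tolist()
print(count)
print(members)
```

On the sample data this selects the same four rows as the fully parenthesized filter above.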
Regarding python - Pandas filtering by date and an OR condition, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38575789/