python - Pandas filtering by date and an OR condition

Tags: python pandas

I am using pandas to count the members who purchased a particular type of contract between two dates. The DataFrame I am working with looks like this:

Member Nbr       Contract-Type    Date-Joined 
20           1 Year Membership     2011-08-01   
3128        3 Month Membership     2011-07-22   
3535        4 Month Membership     2015-02-18  
3760        4 Month Membership     2010-02-28
3762        3 Month Membership     2010-01-31
3882        1 Month Membership     2010-04-24    
3892        3 Month Membership     2010-03-24     
4116        3 Month Membership     2014-12-02   
4700        1 Month Membership     2014-11-11   
4802        4 Month Membership     2014-07-26   
5004         1 Year Membership     2012-03-12
5020         1 Year Membership     2010-07-28    
5022        3 Month Membership     2010-06-25    
5130         1 Year Membership     2011-01-04
                      ...

If I am only interested in a single contract type, I can get the count with:

print(len(df[(df['Date-Joined'] > '2010-01-01') 
          & (df['Date-Joined'] < '2012-02-01')
          & (df['Contract-Type'] == '1 Year Membership')]))

When I try something similar for 1 Year Membership or 4 Month Membership, using the following code:

print(len(df[(df['Date-Joined'] > '2013-01-01') 
      & (df['Date-Joined'] < '2013-02-01')
      & (df['Contract-Type'] == '1 Year Membership')
      or (df['Contract-Type'] == '4 Month Membership')]))

I get the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Replacing the or condition with an & condition returns 0.

Best answer

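Use | instead of or. Also, & takes precedence over |, so your logic needs one more set of parentheses around the two contract-type comparisons.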
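As a minimal sketch of why or fails while | works, consider a tiny made-up Series (the values and variable names below are illustrative assumptions, not the question's data). Python's or asks its left operand for a single truth value, which a pandas Series refuses to give, whereas | combines the two masks element by element. The same sketch also suggests why swapping in & gave 0: no row can hold both contract types at once.

```python
import pandas as pd

# Illustrative data, not taken from the question.
s = pd.Series(['1 Year Membership', '4 Month Membership', '3 Month Membership'])

is_year = s == '1 Year Membership'
is_four = s == '4 Month Membership'

# `or` calls bool() on the left operand, which a Series cannot answer.
try:
    is_year or is_four
    raised = False
except ValueError:
    raised = True

# `|` is applied element-wise and yields a boolean mask.
either = is_year | is_four

# `&` requires both conditions on the same row, which never happens here.
both = is_year & is_four

print(raised, either.tolist(), int(both.sum()))
```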

import io
import pandas as pd

data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01   
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18  
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24 
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28 
5022,3 Month Membership,2010-06-25 
5130,1 Year Membership,2011-01-04
''')

df = pd.read_csv(data)

print(df[
   (df['Date-Joined'] > '2010-01-01') &
   (df['Date-Joined'] < '2012-02-01') &
   (df['Contract-Type'] == '1 Year Membership')
  ])

#     Member Nbr      Contract-Type    Date-Joined
# 0           20  1 Year Membership     2011-08-01   
# 11        5020  1 Year Membership     2010-07-28 
# 13        5130  1 Year Membership     2011-01-04

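# Without extra parentheses, & binds tighter than |, so this reads as
# (date range AND '1 Year Membership') OR '4 Month Membership'.
print(df[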
   (df['Date-Joined'] > '2010-01-01') &
   (df['Date-Joined'] < '2012-02-01') &
   (df['Contract-Type'] == '1 Year Membership') |
   (df['Contract-Type'] == '4 Month Membership')
  ])

#     Member Nbr       Contract-Type    Date-Joined
# 0           20   1 Year Membership     2011-08-01   
# 2         3535  4 Month Membership     2015-02-18  <====== BEWARE!
# 3         3760  4 Month Membership     2010-02-28
# 9         4802  4 Month Membership     2014-07-26  <====== BEWARE!
# 11        5020   1 Year Membership     2010-07-28 
# 13        5130   1 Year Membership     2011-01-04

# Parenthesizing the | applies the date range to both contract types.
print(df[
   (df['Date-Joined'] > '2010-01-01') &
   (df['Date-Joined'] < '2012-02-01') &
   ((df['Contract-Type'] == '1 Year Membership') |
   (df['Contract-Type'] == '4 Month Membership'))
  ])

#     Member Nbr       Contract-Type    Date-Joined
# 0           20   1 Year Membership     2011-08-01   
# 3         3760  4 Month Membership     2010-02-28
# 11        5020   1 Year Membership     2010-07-28 
# 13        5130   1 Year Membership     2011-01-04
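A possible alternative, sketched here under a few assumptions rather than taken from the answer: Series.isin can stand in for the chained == ... | == ... comparisons, which sidesteps the precedence problem entirely, and parse_dates makes the comparison run on real datetimes instead of strings. The reduced sample rows and the names wanted, mask and count are chosen for illustration only.

```python
import io
import pandas as pd

# A reduced sample of the question's data, for illustration.
data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3760,4 Month Membership,2010-02-28
3535,4 Month Membership,2015-02-18
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')

# Parse the date column so comparisons use datetimes, not strings.
df = pd.read_csv(data, parse_dates=['Date-Joined'])

wanted = ['1 Year Membership', '4 Month Membership']
mask = (
    (df['Date-Joined'] > pd.Timestamp('2010-01-01'))
    & (df['Date-Joined'] < pd.Timestamp('2012-02-01'))
    & df['Contract-Type'].isin(wanted)  # replaces the chained | comparisons
)

# Summing a boolean mask counts the matching rows.
count = int(mask.sum())
print(count)
```

Since the question asks for a count, summing the mask avoids building the filtered DataFrame just to take its length.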

A similar question about pandas filtering by date and an OR condition can be found on Stack Overflow: https://stackoverflow.com/questions/38575789/
