python - pandas: groupby on two columns with nunique

I have the following sample dataset.

        CustID     Condition      Month        Reading  Consumption 
0     108000601         True       June       20110606      28320.0
1     108007000         True       July       20110705      13760.0
2     108007000         True     August       20110804      16240.0
3     108008000         True  September       20110901      12560.0
4     108008000         True    October       20111004      12400.0
5     108000601        False   November       20111101       9440.0
6     108090000        False   December       20111205      12160.0
7     108008000        False    January       20120106      11360.0
8     108000601         True   February       20120206      10480.0
9     108000601         True      March       20120306       9840.0

The following groupby gives me part of what I am looking for.

dfm.groupby(['Condition'])['CustID'].nunique()

Condition
True      3
False     3

But how do I get the number of unique IDs that appear under both conditions? For example:

Condition
True      3
False     3
Both      2

Best answer

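I'm not sure this is the most "pandas" way to do it, but you can use a set to compare the customers in each partition (the Python set data structure is a hash table that automatically discards duplicates):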

custid_true = set(dfm[dfm['Condition']==True].CustID)
custid_false = set(dfm[dfm['Condition']==False].CustID)
custid_both = custid_true.intersection(custid_false)
n_custid_both = len(custid_both)
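
As an alternative to the set intersection above, the same count can be sketched with groupby alone: a customer belongs to "Both" when it appears with two distinct Condition values. The snippet below rebuilds only the CustID and Condition columns from the question's sample; the names conditions_per_cust, both_ids and summary are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (Month, Reading, Consumption omitted).
dfm = pd.DataFrame({
    "CustID": [108000601, 108007000, 108007000, 108008000, 108008000,
               108000601, 108090000, 108008000, 108000601, 108000601],
    "Condition": [True, True, True, True, True,
                  False, False, False, True, True],
})

# Unique customers per Condition value, as in the question.
counts = dfm.groupby("Condition")["CustID"].nunique().to_dict()

# Number of distinct Condition values seen for each customer.
conditions_per_cust = dfm.groupby("CustID")["Condition"].nunique()

# Customers seen with both True and False.
both_ids = conditions_per_cust[conditions_per_cust == 2].index

# Collect everything in one Series; string labels let "Both" sit next to True/False.
summary = pd.Series(
    {"True": counts[True], "False": counts[False], "Both": len(both_ids)},
    name="CustID",
)
print(summary)
```

On the sample data this gives 3 for True, 3 for False and 2 for Both, matching the output asked for in the question.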

This question, "python - pandas: groupby on two columns with nunique", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/34215467/
