python - How to extract unique values per week from a pandas DataFrame

I have a pandas DataFrame like the one below, and I need to extract all unique user IDs from it for each week:

    sender_user_id    created
0   2                 2016-12-19 03:34:30.013923
1   3                 2016-12-20 03:34:30.013923
2   6                 2016-12-21 03:34:30.013923
3   22                2016-12-22 03:34:30.013923
4   6                 2016-12-22 06:34:30.013923

I need an output dictionary or DataFrame that aggregates all unique user_ids per week, like this:

    created                         user_ids
0   2016-12-19 03:34:30.013923      2,5,24,15,6
1   2016-12-25 03:34:30.013923      8,9,14,21,5

One idea I had is to split the DataFrame by week and apply numpy.unique() to each piece, but is there a cleaner, more efficient way to do this?

Best answer

Consider this randomly generated df:

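import numpy as np
import pandas as pd

rng = np.arange(100)
# Five random user IDs for each of 10 timestamps spaced 3 business days apart.
# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
rows = [
    dict(sender_user_id=i, created=t)
    for t in pd.date_range('2016-03-31', periods=10, freq='3B')
    for i in np.random.permutation(rng)[:5]
]
df = pd.DataFrame(rows, columns=['sender_user_id', 'created'])
df.sender_user_id = df.sender_user_id.astype(int)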

Resample weekly, using the on parameter to pick the datetime column:

df.resample('W', on='created').sender_user_id.unique().reset_index(name='user_ids')
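Depending on the pandas version, the resampler object may not expose `.unique()`. In that case, a roughly equivalent sketch is to group on `pd.Grouper` with a weekly frequency and call `unique()` on the grouped column. The example below uses the five sample rows from the question. The names `weekly`, `weekly_str`, and `weekly_dict` are illustrative only, and `'W'` is assumed to mean weeks ending on Sunday (the pandas default anchor), so each group is labelled with the Sunday that closes the week.

```python
import pandas as pd

# The five sample rows from the question.
df = pd.DataFrame({
    'sender_user_id': [2, 3, 6, 22, 6],
    'created': pd.to_datetime([
        '2016-12-19 03:34:30.013923',
        '2016-12-20 03:34:30.013923',
        '2016-12-21 03:34:30.013923',
        '2016-12-22 03:34:30.013923',
        '2016-12-22 06:34:30.013923',
    ]),
})

# Bin rows into weeks on 'created' and collect the unique IDs in each bin.
weekly = (
    df.groupby(pd.Grouper(key='created', freq='W'))['sender_user_id']
      .unique()
      .reset_index(name='user_ids')
)

# Optional: render each week's IDs as a comma-separated string,
# matching the layout requested in the question.
weekly_str = weekly.assign(
    user_ids=weekly['user_ids'].apply(lambda ids: ','.join(map(str, ids)))
)

# Optional: a plain dictionary keyed by the week label.
weekly_dict = {
    week: list(ids)
    for week, ids in zip(weekly['created'], weekly['user_ids'])
}

print(weekly_str)
```

All five sample timestamps fall in the week ending Sunday 2016-12-25, so the result has a single row for that week. `unique()` keeps IDs in order of first appearance, so the repeated ID 6 is listed only once.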

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/41430529/
