python - Pandas - group by, sort, and keep the first row

Tags: python pandas dataframe group-by pandas-groupby

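I have a DataFrame that I need to group by id, sort by time within each group, and then keep only the first record for each id. I tried the different approaches suggested in other answers but could not get the correct result. Any help would be appreciated!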

import pandas as pd

test = pd.DataFrame({'id' : [1,1,1,
                           2,2,
                           3,3,3,3],
                   'ref'  : ["search","social","direct",
                          "social","direct",
                          "direct",'social','social','social'],
                   'media':['video', 'page', 'video',
                           'page', 'pic', 
                            'pic', 'video', 'page', 'video'],
                   'time': ['2019-04-10 19:00:00.569', '2019-04-10 00:10:00.569', '2019-04-10 00:10:20.569',
                          '2019-04-10 12:10:00.569','2019-04-10 11:10:00.569',
                          '2019-04-10 22:10:00.569','2019-04-10 14:10:00.569','2019-04-10 14:30:00.569','2019-04-10 15:10:00.569']})

Expected result:

    id  ref     media
0   1   social  page
1   2   direct  pic
2   3   social  video

Best answer

You can sort and then drop duplicates (drop_duplicates keeps the first occurrence of each id by default):

test.sort_values(by=['id', 'time']).drop_duplicates('id').drop(columns='time')

   id     ref  media
1   1  social   page
4   2  direct    pic
6   3  social  video
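
The answer above keeps the original row labels (1, 4, 6), whereas the expected result in the question is indexed 0, 1, 2. The following is a minimal sketch of one way to get that layout, not the answerer's own code: it uses groupby(...).head(1) in place of drop_duplicates and resets the index afterwards. Parsing time with pd.to_datetime is an added precaution for chronological sorting; the timestamps in the question share one fixed format, so they also sort correctly as strings.

```python
import pandas as pd

# Same sample data as in the question.
test = pd.DataFrame({
    'id': [1, 1, 1, 2, 2, 3, 3, 3, 3],
    'ref': ['search', 'social', 'direct',
            'social', 'direct',
            'direct', 'social', 'social', 'social'],
    'media': ['video', 'page', 'video',
              'page', 'pic',
              'pic', 'video', 'page', 'video'],
    'time': ['2019-04-10 19:00:00.569', '2019-04-10 00:10:00.569',
             '2019-04-10 00:10:20.569', '2019-04-10 12:10:00.569',
             '2019-04-10 11:10:00.569', '2019-04-10 22:10:00.569',
             '2019-04-10 14:10:00.569', '2019-04-10 14:30:00.569',
             '2019-04-10 15:10:00.569'],
})

# Assumption: parse the strings so sorting is chronological, not lexical.
test['time'] = pd.to_datetime(test['time'])

result = (
    test.sort_values(['id', 'time'])   # earliest row first within each id
        .groupby('id')
        .head(1)                       # keep the first row of each group
        .drop(columns='time')
        .reset_index(drop=True)        # relabel rows 0, 1, 2
)

print(result)
```

Unlike groupby(...).first(), which takes the first non-null value of each column separately, head(1) returns whole rows, so the ref and media values always come from the same record.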

For "python - Pandas - group by, sort, and keep the first row", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55580340/
