python - Create a pandas summary table (but not groupby)

I have the following table in pandas:

x	y
1	1
2	3
2	5
2	4
1	4
1	5

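I want to see the pattern in the variable x, that is, the consecutive runs of equal values.

In the table, x=1 occurs once, then x=2 three times, then x=1 again twice. The summary I want is:

x	# occurrence	first y	last y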
1	1	1	1
2	3	3	4
1	2	4	5

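I initially tried groupby, but groupby on x puts all rows with the same x into one group, regardless of where they occur, which is not what I want.

For convenience, here is the content of the dataframe.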

import pandas as pd

data = {'x': [1, 2, 2, 2, 1, 1],
        'y': [1, 3, 5, 4, 4, 5]}
df = pd.DataFrame(data)

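Thanks in advance.

Best answer

This is a gaps-and-islands problem. We first need to label each run of consecutive equal x values (an island) by taking the index and subtracting the cumcount within that x group. This assumes the default 0, 1, 2, ... integer index: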

group = df.index - df.groupby('x').cumcount()

Output:

0    0
1    1
2    1
3    1
4    3
5    3
dtype: int64
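
One caveat worth checking: the index-minus-cumcount label is constant within a run, but it is not guaranteed to be unique across different x values. With alternating data such as x = [1, 2, 1, 2], rows 1 and 2 both receive the label 1 even though they belong to different runs. A minimal sketch of a commonly used alternative, which starts a new label whenever x differs from the previous row (the names run_id, alt, alt_cumcount_label and alt_run_id are illustrative, not from the original answer):

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 2, 2, 1, 1],
                   'y': [1, 3, 5, 4, 4, 5]})

# True wherever x differs from the previous row (the first row is always True,
# because shift() puts NaN there); the running sum numbers the runs 1, 2, 3, ...
run_id = (df['x'] != df['x'].shift()).cumsum()

# Alternating values: index - cumcount gives rows 1 and 2 the same label,
# although they are separate runs of length one
alt = pd.Series([1, 2, 1, 2])
alt_cumcount_label = alt.index - alt.groupby(alt).cumcount()

# The change-based label keeps every run distinct
alt_run_id = (alt != alt.shift()).cumsum()
```

This variant also does not depend on the index being 0, 1, 2, ..., since it only compares neighbouring rows.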

We can now group by that variable and pass, for each column, a list of (name, function) pairs to get the desired output:

df.groupby(group).agg({
    'x': [('x', 'first'), ('# occurrence', 'size')],
    'y': [('first y', 'first'), ('last y', 'last')],
}).reset_index(drop=True)

Output:

   x                    y
   x # occurrence first y last y
0  1            1       1      1
1  2            3       3      4
2  1            2       4      5

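If your pandas version is 0.25 or later, you can also use named aggregation, passing the output names as an unpacked dict (needed here because names like '# occurrence' are not valid Python identifiers). This call skips reset_index, so the result keeps the group labels as its index: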

df.groupby(group).agg(**{
    'x': ('x', 'first'),
    '# occurrence': ('x', 'count'),
    'first y': ('y', 'first'),
    'last y': ('y', 'last'),
})

Output:

   x  # occurrence  first y  last y
0  1             1        1       1
1  2             3        3       4
3  1             2        4       5
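
Putting the steps together, here is a minimal end-to-end sketch. The helper name summarize_runs and its key/value parameters are hypothetical, and it labels runs by comparing each row with the previous one rather than by index minus cumcount, so treat it as one possible arrangement and not the answer's exact code:

```python
import pandas as pd


def summarize_runs(df, key='x', value='y'):
    """Summarize consecutive runs of equal `key` values (hypothetical helper)."""
    # New run label each time `key` changes from the previous row; the rename
    # keeps the grouping Series from sharing a name with the `key` column
    run_id = (df[key] != df[key].shift()).cumsum().rename('run_id')
    # Named aggregation (pandas >= 0.25); ** is needed because names such as
    # '# occurrence' are not valid Python keyword identifiers
    return (
        df.groupby(run_id)
          .agg(**{
              key: (key, 'first'),
              '# occurrence': (key, 'size'),
              'first ' + value: (value, 'first'),
              'last ' + value: (value, 'last'),
          })
          .reset_index(drop=True)
    )


df = pd.DataFrame({'x': [1, 2, 2, 2, 1, 1],
                   'y': [1, 3, 5, 4, 4, 5]})
summary = summarize_runs(df)
print(summary)
```

On the sample data this yields the same three rows as the table above, with the index renumbered 0, 1, 2 because of reset_index(drop=True).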

Regarding python - Create a pandas summary table (but not groupby), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/72780719/
