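python - Calculating percentiles for specific groups

I have 3 columns: Product Id, Price, and Group (values A, B, C, D).

I want to get the price percentile for each group, and I am running the following code.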

for group, price in df.groupby(['group']):
    df['percentile'] = np.percentile(df['price'],60)

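The percentile column contains a single value, 3.44, for every group. The expected values per group are 2.12, 3.43, 3.65, 4.76, 8.99.

What is going wrong here? Please let me know.

Best answer

I think the problem is that inside the loop you use the whole DataFrame df and its price column. You need the price column of the current group instead, which is the loop variable price: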

import pandas as pd
import numpy as np

np.random.seed(1)
df = pd.DataFrame(np.random.randint(10, size=(5,3)))
df.columns = ['Product Id','group','price']
print(df)
   Product Id  group  price
0           5      8      9
1           5      0      0
2           1      7      6
3           9      2      4
4           5      2      4

for group, price in df.groupby(['group']):
    # Bug: df['price'] is the whole column, so every group prints the same value
    print(np.percentile(df['price'], 60))
4.8
4.8
4.8
4.8

for group, price in df.groupby(['group']):
    # Fix: price is the sub-DataFrame holding only the rows of the current group
    print(np.percentile(price['price'], 60))
0.0
4.0
6.0
9.0
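
To make the difference concrete, here is a minimal sketch on hypothetical data shaped like the question (letter groups, float prices). The values and the names overall and per_group are assumptions for illustration, not data from the original post. It compares the percentile of the whole column with the percentile of each group's own rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: letter groups and float prices, as described in the question.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "C"],
    "price": [1.0, 2.0, 3.0, 10.0, 20.0, 5.0],
})

# Percentile of the whole column: a single number, whatever the group is.
overall = np.percentile(df["price"], 60)

# Percentile per group: use the sub-DataFrame that groupby yields on each iteration.
per_group = {}
for name, sub in df.groupby("group"):
    per_group[name] = float(np.percentile(sub["price"], 60))

print(overall)
print(per_group)
```

Assigning df['percentile'] = ... inside the loop, as in the question, overwrites the entire column on every iteration, which is why only one value survives there.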

Another solution with np.percentile, where the output is a Series:

print(df.groupby(['group'])['price'].apply(lambda x: np.percentile(x, 60)))
group
0    0.0
2    4.0
7    6.0
8    9.0
Name: price, dtype: float64

Solution with DataFrameGroupBy.quantile:

print(df.groupby(['group'])['price'].quantile(.6))
group
0    0.0
2    4.0
7    6.0
8    9.0
Name: price, dtype: float64
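
As a rough check that the two approaches agree, the sketch below runs both on hypothetical data (the values and variable names are assumptions, not from the original answer). It also passes a list of quantiles to quantile, which should give one row per group and one column per quantile after unstack.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "C"],
    "price": [1.0, 2.0, 3.0, 10.0, 20.0, 5.0],
})

grouped = df.groupby("group")["price"]

# np.percentile takes a percentage (0-100); quantile takes a fraction (0-1).
via_numpy = grouped.apply(lambda x: np.percentile(x, 60))
via_quantile = grouped.quantile(0.6)

# Several quantiles at once: the result is indexed by (group, quantile),
# and unstack() turns the quantile level into columns.
several = grouped.quantile([0.5, 0.6]).unstack()

print(via_numpy)
print(via_quantile)
print(several)
```

Both functions use linear interpolation by default, so the per-group results should match.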

Edit, in response to a comment:

If you need a new column, use transform (see the pandas docs):

>>> np.random.seed(1)
>>> df = pd.DataFrame(np.random.randint(10,size=(20,3)))
>>> df.columns = ['Product Id','group','price']
>>> df
    Product Id  group  price
0            5      8      9
1            5      0      0
2            1      7      6
3            9      2      4
4            5      2      4
5            2      4      7
6            7      9      1
7            7      0      6
8            9      9      7
9            6      9      1
10           0      1      8
11           8      3      9
12           8      7      3
13           6      5      1
14           9      3      4
15           8      1      4
16           0      3      9
17           2      0      4
18           9      2      7
19           7      9      8
>>> df['percentil'] = df.groupby(['group'])['price'].transform(lambda x: x.quantile(.6))

>>> df
    Product Id  group  price  percentil
0            5      8      9        9.0
1            5      0      0        4.4
2            1      7      6        4.8
3            9      2      4        4.6
4            5      2      4        4.6
5            2      4      7        7.0
6            7      9      1        5.8
7            7      0      6        4.4
8            9      9      7        5.8
9            6      9      1        5.8
10           0      1      8        6.4
11           8      3      9        9.0
12           8      7      3        4.8
13           6      5      1        1.0
14           9      3      4        9.0
15           8      1      4        6.4
16           0      3      9        9.0
17           2      0      4        4.4
18           9      2      7        4.6
19           7      9      8        5.8
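
A possible alternative to the lambda in transform, sketched here on hypothetical data: compute the per-group quantile once, then map it back onto the group column. The column names percentile_map and percentile_transform are made up for this comparison.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "C"],
    "price": [1.0, 2.0, 3.0, 10.0, 20.0, 5.0],
})

# One value per group, indexed by group label.
q60 = df.groupby("group")["price"].quantile(0.6)

# Broadcast back to the rows by looking up each row's group label.
df["percentile_map"] = df["group"].map(q60)

# The transform approach from the answer, for comparison.
df["percentile_transform"] = (
    df.groupby("group")["price"].transform(lambda x: x.quantile(0.6))
)

print(df)
```

Both columns should hold the same values; the map version keeps the small per-group Series around in case it is needed elsewhere.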

This question, python - Calculating percentiles for specific groups, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/36944884/
