I use the following code to first group my data so that I can get the total sales quantity of each Material for a given territory and a given month.
Material_Wise = data.groupby(['Material','Territory Name','Month'])['Gross Sales Qty'].sum()
print(Material_Wise)
Material  Territory Name  Month
A         Region 1        Apr 2017     40000.0
                          Aug 2017     12000.0
                          Dec 2017     12000.0
                          Feb 2018     50000.0
                          Jan 2017     50000.0
                                         ...
E         Region 2        Nov 2019      9000.0
                          Oct 2018      2000.0
                          Oct 2019     22900.0
                          Sept 2018    10000.0
                          Sept 2019    14200.0
Above is the output I get. Now I want to sort my data so that I get output like the following:
Material  Territory Name  Month
A         Region 1        Jan 2017     50000.0
                          Apr 2017     40000.0
                          Aug 2017     12000.0
                          Dec 2017     12000.0
                          Feb 2018     50000.0
                                         ...
E         Region 2        Sept 2018    10000.0
                          Oct 2018      2000.0
                          Sept 2019    14200.0
                          Oct 2019     22900.0
                          Nov 2019      9000.0
Best answer
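Because your Month column has a string data type, the default sort order is alphabetical. To sort it semantically (here, chronologically), you need to convert it to an ordered categorical type.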
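import pandas as pd

# Sort the unique month labels chronologically by parsing each one as a date
# (Apr 2017 -> 2017-04-01). "Sept" is normalized to "Sep" for parsing only, because
# strftime('%b %Y') would return "Sep 2018", which no longer matches the "Sept 2018" labels.
# The result is a list of the original month names in chronological order
month_names = sorted(
    data['Month'].unique(),
    key=lambda m: pd.to_datetime(m.replace('Sept', 'Sep'), format='%b %Y'),
)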
# Create ordered category of month names
MonthNameDType = pd.api.types.CategoricalDtype(month_names, ordered=True)
# This will appear the same after the conversion. To check, you can use `data.dtypes` before
# and after
data['Month'] = data['Month'].astype(MonthNameDType)
# And groupby as usual
Material_Wise = data.groupby(['Material','Territory Name','Month'], observed=True)['Gross Sales Qty'].sum()
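For a quick end-to-end check, here is a minimal sketch of the same ordered-categorical idea on made-up data. The sample values and the `month_key` helper are hypothetical and not part of the question. The helper sorts the original labels by their parsed date and treats "Sept" as "Sep" only while parsing, so labels such as "Sept 2018" stay unchanged in the categories. It assumes an English locale for `%b`.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the question, values are made up.
data = pd.DataFrame({
    'Material': ['A', 'A', 'A', 'A', 'E', 'E', 'E'],
    'Territory Name': ['Region 1'] * 4 + ['Region 2'] * 3,
    'Month': ['Apr 2017', 'Jan 2017', 'Feb 2018', 'Jan 2017',
              'Oct 2018', 'Sept 2018', 'Nov 2019'],
    'Gross Sales Qty': [40000.0, 20000.0, 50000.0, 30000.0,
                        2000.0, 10000.0, 9000.0],
})


def month_key(label):
    """Parse a label such as 'Apr 2017' into a Timestamp used only for sorting."""
    # "Sept" is not a standard %b abbreviation, so normalize it for parsing only.
    return pd.to_datetime(label.replace('Sept', 'Sep'), format='%b %Y')


# Original month labels, arranged in chronological order.
month_names = sorted(data['Month'].unique(), key=month_key)

# Ordered categorical dtype: sorting now follows the category order, not the alphabet.
month_dtype = pd.api.types.CategoricalDtype(month_names, ordered=True)
data['Month'] = data['Month'].astype(month_dtype)

# groupby sorts its keys by default, so the months come out chronologically.
material_wise = (
    data.groupby(['Material', 'Territory Name', 'Month'], observed=True)['Gross Sales Qty']
        .sum()
)
print(material_wise)
```

Passing `observed=True` keeps only the Material/Territory/Month combinations that actually occur in the data. Without it, some pandas versions would emit a row for every category of the categorical key, including months with no sales.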
Source: the Stack Overflow question on sorting data by month after a groupby in pandas: https://stackoverflow.com/questions/61590425/