python - Text from a Pandas DataFrame

Tags: python pandas loops filtering

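I have a Pandas DataFrame that contains individual sales events for gummibears, chocolate, and mints. They are summed up and sorted by calendar week. I now translate this into text, which is then sent by email, using the following method: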

df['text'] = 'In calendar week (' + df['weeknumber'].map(str) + '), customers have bought ' + df['gummibears'].map(str) + 'kg of gummibears, ' + df['chocolate'].map(str) + 'kg of chocolate, as well as ' + df['mint'].map(str) + 'kg of mints.'

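Ideally, the result is a nice text that summarizes the sales. However, it is possible that 0kg of a product were sold, and that shows up in the text as well, looking like this: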

>>> "In calendar week 25, customers have bought 0kg of gummibears, 25kg of chocolate, as well as 0kg of mints."
>>> "In calendar week 26, customers have bought 6kg of gummibears, 0kg of chocolate, as well as 2kg of mints."

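This works, but it is confusing for the reader. Is there an elegant way to filter out all instances of 0kg without nesting multiple loops? Preferably, the result above would look like this: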

>>> "In calendar week 25, customers have bought 25kg of chocolate."
>>> "In calendar week 26, customers have bought 6kg of gummibears, as well as 2kg of mints."

Best answer

You can use a custom function built on numpy.where with a boolean mask created by eq (equivalent to ==), but for a general solution the text has to be standardized:

import numpy as np
import pandas as pd

df = pd.DataFrame({
         'weeknumber':[1,2,3,4,5,6],
         'gummibears':[7,8,9,4,0,0],
         'chocolate': [0,3,5,0,1,0],
         'mint':      [5,3,0,9,2,0]
})


def kg_to_string(col):
    return np.where(df[col].eq(0), '', ' ' + df[col].astype(str) + 'kg of '+ col +',')

start = 'In calendar week (' + df['weeknumber'].astype(str) + '), customers have bought'

# boolean mask: True where all three product columns are 0
mask = df[['gummibears','chocolate','mint']].eq(0).all(axis=1)
df['text'] =  start +  np.where(mask, ' nothing', kg_to_string('gummibears') + 
                                                  kg_to_string('chocolate') + 
                                                  kg_to_string('mint'))
# remove the trailing comma
df['text'] = df['text'].str.rstrip(',')
print(df['text'].tolist())
['In calendar week (1), customers have bought 7kg of gummibears, 5kg of mint', 
 'In calendar week (2), customers have bought 8kg of gummibears, 3kg of chocolate, 3kg of mint', 
 'In calendar week (3), customers have bought 9kg of gummibears, 5kg of chocolate',
 'In calendar week (4), customers have bought 4kg of gummibears, 9kg of mint', 
 'In calendar week (5), customers have bought 1kg of chocolate, 2kg of mint', 
 'In calendar week (6), customers have bought nothing']
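
The output above lists the amounts separated by commas, while the question asked for sentences of the form "..., as well as ...". As a minimal sketch of one way to get closer to that phrasing (this is not part of the original answer), the sentence can be built row by row with DataFrame.apply. The names `PRODUCTS` and `describe_week` are hypothetical, and the product wording comes straight from the column names, so the text says "mint" rather than "mints".

```python
import pandas as pd

df = pd.DataFrame({
    'weeknumber': [1, 2, 3, 4, 5, 6],
    'gummibears': [7, 8, 9, 4, 0, 0],
    'chocolate':  [0, 3, 5, 0, 1, 0],
    'mint':       [5, 3, 0, 9, 2, 0],
})

# Hypothetical names, introduced only for this sketch.
PRODUCTS = ['gummibears', 'chocolate', 'mint']


def describe_week(row, products=PRODUCTS):
    """Build one sentence for a row, skipping products with 0kg sold."""
    parts = [f"{row[p]}kg of {p}" for p in products if row[p] != 0]
    if not parts:
        bought = 'nothing'
    elif len(parts) == 1:
        bought = parts[0]
    else:
        # Comma-separate all but the last item, then append the last one.
        bought = ', '.join(parts[:-1]) + ', as well as ' + parts[-1]
    return f"In calendar week {row['weeknumber']}, customers have bought {bought}."


df['text'] = df.apply(describe_week, axis=1)
print(df['text'].tolist())
```

Applying a function with axis=1 runs a Python-level loop over the rows, so for large frames it will likely be slower than the vectorized numpy.where approach. For a weekly summary with a handful of rows, the more readable sentence logic may be worth that cost.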

Regarding python - Text from a Pandas DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52586780/
