pandas - How to aggregate pandas Series values based on a predicate?

In R, it is easy to select values with a predicate on their names and apply a function to them (sum, in this example):

> example <- c(a1=1,a2=2,b1=3,b2=4)
> example # this is the vector (equivalent to Series)
a1 a2 b1 b2 
 1  2  3  4 
> grepl("^a",names(example)) #predicate statement
[1]  TRUE  TRUE FALSE FALSE
> sum(example[grep("^a",names(example))]) #combined into one statement
[1] 3

The only way I can think of to do this in pandas is with a list comprehension, rather than with any vectorized pandas function:

In [55]: example = pd.Series({'a1':1,'a2':2,'b1':3,'b2':4})

In [56]: example
Out[56]: 
a1    1
a2    2
b1    3
b2    4
dtype: int64

In [63]: sum([example[x] for x in example.index if re.search('^a',x)])
Out[63]: 3

Is there an equivalent vectorized way to do this in pandas?

Best answer

You can use groupby, which can apply a function to each index value (here, the function takes the first character of the label):

In [11]: example.groupby(lambda x: x[0]).sum()
Out[11]:
a    3
b    7
dtype: int64

In [12]: example.groupby(lambda x: x[0]).sum()['a']
Out[12]: 3
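
If only the matching group is needed, a boolean mask over the index is closer in spirit to R's grepl. The following is a minimal sketch rather than part of the original answer; it assumes a pandas version where the index exposes the .str accessor, and the variable names (mask, total_a, total_a_filter) are just illustrative:

```python
import pandas as pd

example = pd.Series({"a1": 1, "a2": 2, "b1": 3, "b2": 4})

# Boolean mask over the index labels, analogous to
# grepl("^a", names(example)) in R.
mask = example.index.str.contains("^a", regex=True)

# Keep only the entries whose label matches, then aggregate.
total_a = example[mask].sum()

# Series.filter selects entries by index label; regex= is matched
# against each label.
total_a_filter = example.filter(regex="^a").sum()

print(total_a, total_a_filter)  # both are 3
```

The groupby version computes the sum for every prefix at once, while the mask only touches the labels that satisfy the predicate, so which one fits better depends on whether the other groups are needed.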

This question, "pandas - How to aggregate pandas Series values based on a predicate?", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/18834823/
