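I need to group records from a specific collection by unordered distinct pairs of fields (sender and recipient) using the PyMongo driver.
For example, the pairs (sender_field_value, recipient_field_value) and (recipient_field_value, sender_field_value) are considered equal.
My aggregation pipeline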
groups = base.flow.records.aggregate([
    {'$match': {'$or': [
        {'sender': _id},
        {'recipient': _id}
    ]}},
    {'$group': {
        '_id': {
            'sender': '$sender',
            'recipient': '$recipient',
        },
        'data_id': {'$max': '$_id'}
    }},
    {'$limit': 20}
])
applied to the data
{ "_id" : ObjectId("533950ca9c3b6222569520c2"), "recipient" : ObjectId("533950ca9c3b6222569520c1"), "sender" : ObjectId("533950ca9c3b6222569520c0") }
{ "_id" : ObjectId("533950ca9c3b6222569520c4"), "recipient" : ObjectId("533950ca9c3b6222569520c0"), "sender" : ObjectId("533950ca9c3b6222569520c1") }
produces the following
{'ok': 1.0,
'result': [
{'_id': {'recipient': ObjectId('533950ca9c3b6222569520c0'), 'sender': ObjectId('533950ca9c3b6222569520c1')},
'data_id': ObjectId('533950ca9c3b6222569520c4')},
{'_id': {'recipient': ObjectId('533950ca9c3b6222569520c1'), 'sender': ObjectId('533950ca9c3b6222569520c0')},
'data_id': ObjectId('533950ca9c3b6222569520c2')}
]
}
but the desired result is only
{'ok': 1.0,
'result': [
{'_id': {'recipient': ObjectId('533950ca9c3b6222569520c0'), 'sender': ObjectId('533950ca9c3b6222569520c1')},
'data_id': ObjectId('533950ca9c3b6222569520c4')}
]
}
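What is the appropriate pipeline?
Best answer
The trick to grouping by distinct pairs is to pass the same "thing" to the $group _id in both cases. I'll do that with an ordinary comparison (you may come up with something that fits your case better; my solution won't work if your sender and recipient values can't be compared directly):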
{$project: {
    "_id": 1,
    "groupId": {"$cond": [
        {"$gt": ['$sender', '$recipient']},
        {big: "$sender", small: "$recipient"},
        {big: "$recipient", small: "$sender"}
    ]}
}},
{$group: {
    '_id': "$groupId",
    'data_id': {'$max': '$_id'}
}}
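To see why the $cond expression above collapses both orientations of a pair into a single group, here is a minimal pure-Python sketch of the same idea, run against the two sample documents from the question. It does not touch MongoDB: the hex strings stand in for the ObjectIds, and `pair_key` and `group_by_pair` are hypothetical helper names used only for illustration.

```python
def pair_key(sender, recipient):
    """Return the same key for (a, b) and (b, a), mirroring the $cond expression."""
    if sender > recipient:
        return (sender, recipient)  # (big, small)
    return (recipient, sender)      # (big, small)


def group_by_pair(records):
    """Keep the largest _id seen for each unordered (sender, recipient) pair."""
    groups = {}
    for rec in records:
        key = pair_key(rec["sender"], rec["recipient"])
        if key not in groups or rec["_id"] > groups[key]:
            groups[key] = rec["_id"]
    return groups


# Assumption: hex strings stand in for the ObjectIds of the sample documents.
records = [
    {"_id": "533950ca9c3b6222569520c2",
     "recipient": "533950ca9c3b6222569520c1",
     "sender": "533950ca9c3b6222569520c0"},
    {"_id": "533950ca9c3b6222569520c4",
     "recipient": "533950ca9c3b6222569520c0",
     "sender": "533950ca9c3b6222569520c1"},
]

groups = group_by_pair(records)
```

Both documents map to the same (big, small) key, so only one group remains, and it carries the larger of the two _id values.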
The complete aggregation pipeline looks like this:
{$match: {
    '$or': [{'sender': userId}, {'recipient': userId}]
}},
{$project: {
    "_id": 1,
    "groupId": {"$cond": [
        {"$gt": ['$sender', '$recipient']},
        {big: "$sender", small: "$recipient"},
        {big: "$recipient", small: "$sender"}
    ]}
}},
{$group: {
    '_id': "$groupId",
    'data_id': {'$max': '$_id'}
}},
{$limit: 20}
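The stages above are written in mongo shell syntax, while the question uses PyMongo. A rough Python equivalent might look like the sketch below, where every key is a quoted string. `build_pair_pipeline` is a hypothetical helper name, and the placeholder user id is an assumption; in practice it would be the ObjectId being matched.

```python
def build_pair_pipeline(user_id, limit=20):
    """Build the aggregation pipeline as plain Python dicts for PyMongo."""
    return [
        # Keep only records where the user is either sender or recipient.
        {"$match": {"$or": [{"sender": user_id}, {"recipient": user_id}]}},
        # Normalize the pair so both orientations produce the same groupId.
        {"$project": {
            "_id": 1,
            "groupId": {"$cond": [
                {"$gt": ["$sender", "$recipient"]},
                {"big": "$sender", "small": "$recipient"},
                {"big": "$recipient", "small": "$sender"},
            ]},
        }},
        # One group per unordered pair, keeping the largest record _id.
        {"$group": {"_id": "$groupId", "data_id": {"$max": "$_id"}}},
        {"$limit": limit},
    ]


# Placeholder id for illustration only; pass the real ObjectId in practice.
pipeline = build_pair_pipeline("example-user-id")

# With a live connection, the pipeline would be passed to the collection
# from the question:
# groups = base.flow.records.aggregate(pipeline)
```

Note that the resulting group _id holds big and small keys rather than sender and recipient, so the output shape differs slightly from the one shown in the question. Also, as far as I know, recent PyMongo versions return a cursor from aggregate() rather than a dict with 'ok' and 'result' keys, so the results would be read by iterating over it.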
Regarding python - MongoDB aggregation - group by distinct pairs, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22760165/