I have the following MongoDB documents:
{
_id: ObjectId('09de14821345dda65c471c99'),
items: [
_id: ObjectId('34de64871345dfa655471c99'),
_id: ObjectId('34de64871345dfa655471c91'),
_id: ObjectId('34de64871345dfa655471c99'),
]
},
{
_id: ObjectId('09de14821345dda65c471c98'),
items: [
_id: ObjectId('24de64871345dfa61271c10'),
_id: ObjectId('24de64871345dfa61271c11'),
_id: ObjectId('24de64871345dfa61271c11'),
]
},
{
_id: ObjectId('09de14821345dda65c471c07'),
items: [
_id: ObjectId('24de64871345dfa61271c05'),
_id: ObjectId('24de64871345dfa61271c06'),
_id: ObjectId('24de64871345dfa61271c07'),
]
}
I need to find all documents whose items array contains duplicate elements. From the documents above, I want to get the following result:
db.collection.documents.find({/** need query*/}).toArray(function (err, documents) {
console.dir(documents); // documents with id's 09de14821345dda65c471c99 and 09de14821345dda65c471c98
});
How can I do this?
Best answer
To group and match results, you need to use the Aggregation Framework or Map/Reduce rather than a simple find() query.
Sample data
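Your sample documents contain some errors: some of the ObjectIds are too short, and the array elements should be either embedded documents ({_id: ObjectId(...)}) or simple values.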
For testing, I used the following data:
db.mydocs.insert([
{
_id: ObjectId('09de14821345dda65c471c99'),
items: [
ObjectId('34de64871345dfa655471c99'),
ObjectId('34de64871345dfa655471c91'),
ObjectId('34de64871345dfa655471c99')
]
},
{
_id: ObjectId('09de14821345dda65c471c98'),
items: [
ObjectId('24de64871345ddfa61271c10'),
ObjectId('24de64871345ddfa61271c11'),
ObjectId('24de64871345ddfa61271c11')
]
},
{
_id: ObjectId('09de14821345dda65c471c07'),
items: [
ObjectId('24de64871345ddfa61271c05'),
ObjectId('24de64871345ddfa61271c06'),
ObjectId('24de64871345ddfa61271c07')
]
}])
Aggregation query
Here is an aggregation query using the mongo shell:
db.mydocs.aggregate(
// Unpack items array into stream of documents
{ $unwind: "$items" },
// Group by original document _id and item
{ $group: {
_id: { _id: "$_id", item: "$items" },
count: { $sum: 1 }
}},
// Limit to duplicated array items (count greater than 1 per document _id)
{ $match: {
count: { $gt: 1 }
}},
// (Optional) clean up the result formatting
{ $project: {
_id: "$_id._id",
item: "$_id.item",
count: "$count"
}}
)
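To make the role of each stage concrete, the same steps can be sketched in plain JavaScript over an in-memory array. This is only an illustration of the logic, not MongoDB code and not how the server executes the pipeline. The helper name findDuplicateItems is hypothetical, and short strings stand in for the ObjectId values.

```javascript
// Plain JavaScript sketch of the pipeline's logic (illustration only).
// findDuplicateItems is a hypothetical helper; strings stand in for ObjectIds.
function findDuplicateItems(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // $unwind: visit each array element on its own
    for (const item of doc.items) {
      // $group: count occurrences per (document _id, item) pair
      const key = JSON.stringify([doc._id, item]);
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  const results = [];
  for (const [key, count] of counts) {
    // $match: keep only pairs that occur more than once
    if (count > 1) {
      const [_id, item] = JSON.parse(key);
      // $project: flatten the compound key into top-level fields
      results.push({ _id, item, count });
    }
  }
  return results;
}

const docs = [
  { _id: "c99", items: ["a99", "a91", "a99"] },
  { _id: "c98", items: ["b10", "b11", "b11"] },
  { _id: "c07", items: ["b05", "b06", "b07"] },
];

console.log(findDuplicateItems(docs));
```

The third document has no repeated element, so it produces no (document, item) pair with a count above 1 and is dropped at the $match step.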
Sample result
{
"_id" : ObjectId("09de14821345dda65c471c98"),
"count" : 2,
"item" : ObjectId("24de64871345ddfa61271c11")
}
{
"_id" : ObjectId("09de14821345dda65c471c99"),
"count" : 2,
"item" : ObjectId("34de64871345dfa655471c99")
}
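Note that this output has one row per duplicated (document, item) pair, so a document with several different duplicated items would appear more than once. If one row per document is wanted, as the question asks, a further $group stage could be used in place of the optional $project. The following is a sketch rather than part of the original answer, and the output field name duplicates is an assumption.

```javascript
// Pipeline written as an array of stages.
// In the shell it could be run with: db.mydocs.aggregate(pipeline)
const pipeline = [
  // Unpack items array into stream of documents
  { $unwind: "$items" },
  // Count each item per original document
  { $group: { _id: { _id: "$_id", item: "$items" }, count: { $sum: 1 } } },
  // Keep only items that occur more than once
  { $match: { count: { $gt: 1 } } },
  // Extra stage (assumption): collapse to one result per original document,
  // collecting its duplicated items into an array
  { $group: { _id: "$_id._id", duplicates: { $push: "$_id.item" } } },
];

console.log(JSON.stringify(pipeline, null, 2));
```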
Source: "node.js - How to get documents with non-unique array elements?" on Stack Overflow: https://stackoverflow.com/questions/25556106/