mongodb - How to find and unset identical subfield values in MongoDB

Tags: mongodb mongodb-query

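I have 1 million documents in MongoDB. I want to find the documents whose fields are identical and unset (remove) the duplicates. Can you give me a method or an idea?

My documents look like this: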

{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "10370", "name" : "South America"},
        {"id" : "1426", "name" : "Suriname"}
    ]
}
{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "10370", "name" : "South America"},
        {"id" : "1426", "name" : "Suriname"}
    ]
}
{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "1734", "name" : "USA"},
        {"id" : "1136", "name" : "Pennsylvania"},
        {"id" : "16962", "name" : "Greater Philadelphia area"}
    ]
}
{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "1734", "name" : "USA"},
        {"id" : "1136", "name" : "Pennsylvania"},
        {"id" : "16962", "name" : "Greater Philadelphia area"}
    ]
}
{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "34964", "name" : "Oceania"},
        {"id" : "15", "name" : "Australia"},
        {"id" : "470", "name" : "Western Australia"},
        {"id" : "36282", "name" : "Perth"}
    ]
}

How can I change them into this:

{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "10370", "name" : "South America"},
        {"id" : "1426", "name" : "Suriname"}
    ]
}
{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "1734", "name" : "USA"},
        {"id" : "1136", "name" : "Pennsylvania"},
        {"id" : "16962", "name" : "Greater Philadelphia area"}
    ]
}
{
    "regions" : [
        {"id" : "1", "name" : "World"},
        {"id" : "34964", "name" : "Oceania"},
        {"id" : "15", "name" : "Australia"},
        {"id" : "470", "name" : "Western Australia"},
        {"id" : "36282", "name" : "Perth"}
    ]
}

Thanks in advance for your answers and interest.

Update: I am trying this code:

db.collection.aggregate(
  {"$group": {"_id": {"id": "$regions.id", "name": "$regions.name"}}},
  {"$group": {"_id": ObjectId(), "regions": {"$push": {"id": "$_id.id", "name": "$_id.name"}}}},
  {"$unwind": "$regions"},
  {"$out": "newcollection"}
)

It gives this error: "errmsg": "insert for $out failed: { connectionId: 111, err:\"E11000 duplicate key error collection: collection.tmp.agg_out.12 index: id dup key: { : ObjectId( '5767f378ff8f5e9302d95bc8') }\", code: 11000, n: 0, ok: 1.0 }",

How can I provide a unique key?

Best answer

With aggregation, if you group by the array elements, you can get rid of the duplicate regions. Something like this should help:

db.regs.aggregate([{$group:{"_id":{id:"$regions.id",name:"$regions.name"}}}]).pretty()
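
To also write the result out with unique keys, something along these lines might work. This is only a sketch under a few assumptions: the collection is called `collection` and the target `newcollection` (names taken from the question), and two documents count as duplicates only when their `regions` arrays match exactly, including element order and field order. The duplicate key error in the question most likely comes from `ObjectId()` being evaluated once in the shell, so the second `$group` collapses everything into one document and every document produced by `$unwind` then carries the same `_id`. Dropping `_id` in a `$project` stage should let `$out` assign a fresh one to each document.

```javascript
// Sketch only: collection names come from the question, adjust as needed.
var pipeline = [
  // Group on the whole "regions" array: identical arrays fall into one group.
  { $group: { _id: "$regions" } },
  // Put the array back under "regions" and drop _id so that a new,
  // unique _id is generated for each document written by $out.
  { $project: { _id: 0, regions: "$_id" } },
  // Write the de-duplicated documents to a new collection.
  { $out: "newcollection" }
];

// Run only inside the mongo shell, where `db` is defined.
// allowDiskUse lets $group spill to disk, which may matter for 1 million documents.
if (typeof db !== "undefined") {
  db.collection.aggregate(pipeline, { allowDiskUse: true });
}
```

After checking `newcollection`, the original collection can be dropped or replaced. Note that this writes a de-duplicated copy rather than unsetting anything in place.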

Source: a similar question on Stack Overflow, "mongodb - How to find and unset identical subfield values in MongoDB": https://stackoverflow.com/questions/37896243/
