javascript - Optimizing combinatorial MongoDB queries in Node.js

Tags: javascript node.js mongodb aggregation-framework

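I have 10 stations stored in the stations collection: Station A, Station B, Station C, Station D, Station E, Station F, Station G, Station H, Station I, Station J.

Now, to build a list of ride counts between every possible pair of stations, I do the following in my Node.js code (using Mongoose):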

const stationCombinations = []

// get all stations from the stations collection
const stationIds = await Station.find({}, '_id name').lean().exec()

// list of all possible from & to combinations with their names
stationIds.forEach(fromStation => {
  stationIds.forEach(toStation => {
    stationCombinations.push({ fromStation, toStation })
  })
})

const results = []

// loop through all station combinations
for (const stationCombination of stationCombinations) {
  // create aggregation query promise
  const data = Ride.aggregate([
    {
      $match: {
        test: false,
        state: 'completed',
        duration: { $gt: 2 },
        fromStation: mongoose.Types.ObjectId(stationCombination.fromStation._id),
        toStation: mongoose.Types.ObjectId(stationCombination.toStation._id)
      }
    },
    {
      $group: {
        _id: null,
        count: { $sum: 1 }
      }
    },
    {
      $addFields: {
        fromStation: stationCombination.fromStation.name,
        toStation: stationCombination.toStation.name
      }
    }
  ])

  // push promise to array
  results.push(data)
}

// run all aggregation queries
const stationData = await Promise.all(results)

// flatten nested/empty arrays and return
return stationData.flat()

Executing this function gives me results in the following format:

[
  {
    "fromStation": "Station A",
    "toStation": "Station A",
    "count": 1196
  },
  {
    "fromStation": "Station A",
    "toStation": "Station B",
    "count": 1
  },
  {
    "fromStation": "Station A",
    "toStation": "Station C",
    "count": 173
  },
]

And so on for all other combinations...

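These queries currently take a long time to execute, and I keep getting alerts from MongoDB Atlas about excessive load on the database server because of them. Surely there must be an optimized way to do something like this?

Best answer

You need to use native MongoDB operations: $group by fromStation and toStation, then use $lookup to join the two collections. That way a single aggregation replaces the 100 separate ones (10 × 10 station pairs).

Note: I assume you have MongoDB >= v3.6 and that Station._id is an ObjectId.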
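To make the idea concrete, here is a rough plain JavaScript sketch of what the $match and $group stages compute, run on an in-memory array instead of the database. The sample rides, the string ids and the helper name countRidesPerPair are made up for illustration; the real work should stay in the aggregation pipeline.

```javascript
// Hypothetical in-memory rides; ids are plain strings here instead of ObjectIds.
const rides = [
  { test: false, state: 'completed', duration: 10, fromStation: 'a', toStation: 'a' },
  { test: false, state: 'completed', duration: 5, fromStation: 'a', toStation: 'b' },
  { test: false, state: 'completed', duration: 7, fromStation: 'a', toStation: 'b' },
  { test: true, state: 'completed', duration: 9, fromStation: 'a', toStation: 'b' }, // dropped: test ride
  { test: false, state: 'cancelled', duration: 4, fromStation: 'b', toStation: 'a' }, // dropped: not completed
  { test: false, state: 'completed', duration: 1, fromStation: 'b', toStation: 'a' } // dropped: duration <= 2
]

// Hypothetical helper: one pass over all rides, one counter per (from, to) pair.
function countRidesPerPair (allRides) {
  const counts = new Map()
  for (const ride of allRides) {
    // same filter as the $match stage
    if (ride.test !== false || ride.state !== 'completed' || !(ride.duration > 2)) continue
    // same key as the _id of the $group stage
    const key = `${ride.fromStation}|${ride.toStation}`
    const entry = counts.get(key) ||
      { fromStation: ride.fromStation, toStation: ride.toStation, count: 0 }
    entry.count += 1
    counts.set(key, entry)
  }
  return [...counts.values()]
}

const pairCounts = countRidesPerPair(rides)
console.log(pairCounts)
// [ { fromStation: 'a', toStation: 'a', count: 1 },
//   { fromStation: 'a', toStation: 'b', count: 2 } ]
```

The point is that every ride is looked at once and counted under its own pair, rather than scanning the collection once per pair.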

db.ride.aggregate([
  {
    $match: {
      test: false,
      state: "completed",
      duration: {
        $gt: 2
      }
    }
  },
  {
    $group: {
      _id: {
        fromStation: "$fromStation",
        toStation: "$toStation"
      },
      count: {
        $sum: 1
      }
    }
  },
  {
    $lookup: {
      from: "station",
      let: {
        fromStation: "$_id.fromStation",
        toStation: "$_id.toStation"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $in: [
                "$_id",
                [
                  "$$fromStation",
                  "$$toStation"
                ]
              ]
            }
          }
        }
      ],
      as: "tmp"
    }
  },
  {
    $project: {
      _id: 0,
      fromStation: {
        $reduce: {
          input: "$tmp",
          initialValue: "",
          in: {
            $cond: [
              {
                $eq: [
                  "$_id.fromStation",
                  "$$this._id"
                ]
              },
              "$$this.name",
              "$$value"
            ]
          }
        }
      },
      toStation: {
        $reduce: {
          input: "$tmp",
          initialValue: "",
          in: {
            $cond: [
              {
                $eq: [
                  "$_id.toStation",
                  "$$this._id"
                ]
              },
              "$$this.name",
              "$$value"
            ]
          }
        }
      },
      count: 1
    }
  },
  {
    $sort: {
      fromStation: 1,
      toStation: 1
    }
  }
])


Mongoose version (not tested):

// await the aggregation so that data holds the result array
const data = await Ride.aggregate([
  {
    $match: {
      test: false,
      state: 'completed',
      duration: { $gt: 2 }
    }
  },
  {
    $group: {
      _id: {
        fromStation: "$fromStation",
        toStation: "$toStation"
      },
      count: { $sum: 1 }
    }
  },
  {
    $lookup: {
      from: "stations", // must be the real collection name; the question stores stations in "stations"
      let: {
        fromStation: "$_id.fromStation",
        toStation: "$_id.toStation"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $in: [
                "$_id",
                [
                  "$$fromStation",
                  "$$toStation"
                ]
              ]
            }
          }
        }
      ],
      as: "tmp"
    }
  },
  {
    $project: {
      _id: 0,
      fromStation: {
        $reduce: {
          input: "$tmp",
          initialValue: "",
          in: {
            $cond: [
              {
                $eq: [
                  "$_id.fromStation",
                  "$$this._id"
                ]
              },
              "$$this.name",
              "$$value"
            ]
          }
        }
      },
      toStation: {
        $reduce: {
          input: "$tmp",
          initialValue: "",
          in: {
            $cond: [
              {
                $eq: [
                  "$_id.toStation",
                  "$$this._id"
                ]
              },
              "$$this.name",
              "$$value"
            ]
          }
        }
      },
      count: 1
    }
  },
  {
    $sort: {
      fromStation: 1,
      toStation: 1
    }
  }
])
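The $lookup and $reduce stages above only translate station ids into names. As a sketch of that step, and a possible variation that is not part of the answer itself, the same mapping can be done in Node.js on the output of the $match and $group stages, since the question already loads all ten stations with Station.find. The helper name attachStationNames and the sample data are hypothetical.

```javascript
// Hypothetical helper: turn grouped rows keyed by station id into rows with names.
function attachStationNames (groupedRows, stations) {
  // String() also works for ObjectIds, which stringify to their hex value
  const nameById = new Map(stations.map(s => [String(s._id), s.name]))
  return groupedRows
    .map(row => ({
      // fall back to '' for unknown ids, like initialValue: "" in the $reduce
      fromStation: nameById.get(String(row._id.fromStation)) || '',
      toStation: nameById.get(String(row._id.toStation)) || '',
      count: row.count
    }))
    // roughly what the $sort stage does (string collation may differ slightly)
    .sort((a, b) =>
      a.fromStation.localeCompare(b.fromStation) ||
      a.toStation.localeCompare(b.toStation))
}

// Made-up sample data shaped like the output of the $group stage.
const stations = [
  { _id: '1', name: 'Station A' },
  { _id: '2', name: 'Station B' }
]
const grouped = [
  { _id: { fromStation: '2', toStation: '1' }, count: 3 },
  { _id: { fromStation: '1', toStation: '2' }, count: 1 },
  { _id: { fromStation: '1', toStation: '1' }, count: 1196 }
]

const named = attachStationNames(grouped, stations)
console.log(named)
// [ { fromStation: 'Station A', toStation: 'Station A', count: 1196 },
//   { fromStation: 'Station A', toStation: 'Station B', count: 1 },
//   { fromStation: 'Station B', toStation: 'Station A', count: 3 } ]
```

In practice groupedRows would come from a pipeline containing just the $match and $group stages, and stations from Station.find({}, '_id name').lean() as in the question. Whether this beats the $lookup depends on your data, so measure before switching.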

Source: a similar question on Stack Overflow, "javascript - Optimizing combinatorial MongoDB queries in Node.js": https://stackoverflow.com/questions/60237301/
