javascript - How many requests can Node-Express make at once?

Tags: javascript node.js express node-redis amazon-athena

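I have a script that pulls 25,000 records from AWS Athena, which is basically a PrestoDB relational SQL database. Say I generate one request for each of those records: that means I have to make 25,000 requests to Athena, and then, when the data comes back, 25,000 requests to my Redis cluster.

What is the ideal number of requests to make from Node to Athena at one time?

The reason I ask is that I tried creating an array of 25,000 promises and calling Promise.all(promiseArray) on it, but the application just hung forever.

So I decided to fire off one request at a time instead, using recursion to splice off the first element and then passing the remaining records back to the calling function once the promise resolved.

The problem is that this takes forever. I took a break for about an hour and came back to find 23,000 records still remaining.

I tried googling how many requests Node and Athena can handle at once, but came up with nothing. I'm hoping someone knows something about this and can share it with me.

Thanks.

Here is my code, for reference:

As a side note, what I would like to do differently is send 4, 5, 6, 7, or 8 requests at a time instead of one at a time, depending on how fast they execute.

Also, how would a Node cluster affect the performance of something like this?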

exports.storeDomainTrends = () => {
return new Promise((resolve, reject)=>{
    athenaClient.execute(`SELECT DISTINCT the_column from "the_db"."the_table"`,
    (err, data) =>  {
        var getAndStoreDomainData = (records) => {
            if(records.length){
                return new Promise((resolve, reject) => {
                    // take the first record off the list; the rest is handled by the recursive call
                    var record = records.splice(0, 1)[0]
                    athenaClient.execute(`
                    SELECT 
                    field,
                    field,
                    field,
                    SUM(field) as field
                    FROM "the_db"."the_table"
                    WHERE the_field IN ('Month') AND the_field = '`+ record.domain_name +`'
                    GROUP BY the_field, the_field, the_field
                    `, (err, domainTrend) => {

                        if(err) {
                            console.log(err)
                            return reject(err)
                        }

                        redisClient.set(('Some String' + domainTrend[0].domain_name), JSON.stringify(domainTrend))
                        resolve(domainTrend);
                    })
                })
                .then(res => {
                    getAndStoreDomainData(records);
                })
            }
        }

        getAndStoreDomainData(data);

    })
})

}

Best answer

Using lib, your code could look something like this:

const Fail = function(reason){this.reason=reason;};
const isFail = x=>(x&&x.constructor)===Fail;
const distinctDomains = () =>
  new Promise(
    (resolve,reject)=>
      athenaClient.execute(
        `SELECT DISTINCT domain_name from "endpoint_dm"."bd_mb3_global_endpoints"`,
        (err,data)=>
          (err)
            ? reject(err)
            : resolve(data)
      )
  );
const domainDetails = domain_name =>
  new Promise(
    (resolve,reject)=>
      athenaClient.execute(
        `SELECT 
        timeframe_end_date,
        agg_type,
        domain_name,
        SUM(endpoint_count) as endpoint_count
        FROM "endpoint_dm"."bd_mb3_global_endpoints"
        WHERE agg_type IN ('Month') AND domain_name = '${domain_name}'
        GROUP BY timeframe_end_date, agg_type, domain_name`,
        (err, domainTrend) =>
            (err)
              ? reject(err)
              : resolve(domainTrend)
        )
  );
const redisSet = keyValue =>
  new Promise(
    (resolve,reject)=>
      redisClient.set(
        keyValue,
        (err,res)=>
          (err)
            ? reject(err)
            : resolve(res)
      )
  );
const process = batchSize => limitFn => resolveValue => domains => 
  Promise.all(
    domains.slice(0,batchSize)
    .map(//map domains to promises
      domain=>
        //maximum 5 active connections
        limitFn(domainName=>domainDetails(domainName))(domain.domain_name)
        .then(
          domainTrend=>
            //the redis client documentation makes no sense whatsoever
            //https://redis.io/commands/set
            //no mention of a callback
            //https://github.com/NodeRedis/node_redis
            //mentions a callback, since we need the return value
            //and best to do it async we will use callback to promise
            redisSet([
              `Endpoint Profiles - Checkin Trend by Domain - Monthly - ${domainTrend[0].domain_name}`,
              JSON.stringify(domainTrend)
            ])
        )
        .then(
          redisReply=>{
            //here is where things get unpredictable, set is documented as 
            //  a synchronous function returning "OK" or a function that
            //  takes a callback but no mention of what that callback receives
            //  as response, you should try with one or two records to
            //  finish this by reverse engineering because documentation
            //  fails 100% here and can not be relied upon.
            console.log("bad documentation of redis client... reply is:",redisReply);
            //return is needed here, otherwise this step resolves to undefined
            return (redisReply==="OK")
              ? domain
              : Promise.reject(`Redis reply not OK:${redisReply}`)
          }
        )
        .catch(//catch failed, save error and domain of failed item
          e=>
            new Fail([e,domain])
        )
    )
  ).then(
    results=>{
      console.log(`got ${results.length} results`);
      const left = domains.slice(batchSize);
      if(left.length===0){//nothing left
        return resolveValue.concat(results);
      }
      //recursively call process until done
      return process(batchSize)(limitFn)(resolveValue.concat(results))(left)
    }
  );
const max5 = lib.throttle(5);//max 5 active connections to athena
distinctDomains()//you may want to limit the results to 50 for testing
//you may want to limit batch size to 10 for testing
.then(process(1000)(max5)([]))//we have 25000 domains here
.then(
  results=>{//have 25000 results
    const successes = results.filter(x=>!isFail(x));
    //array of failed items, a failed item has a .reason property
    //  that is an array of 2 items: [the error, domain]
    const failed = results.filter(isFail);
  }
)
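
The code above relies on lib.throttle(5) to keep at most five Athena queries in flight. The library itself is not shown here, so the following is only a minimal sketch of what such a limiter might look like, assuming the same call shape as above, limitFn(fn)(arg), returning a promise. The names throttle and fakeQuery are hypothetical, and fakeQuery merely simulates a query with a timer:

```javascript
// Hypothetical stand-in for lib.throttle: at most `max` calls in flight at once.
const throttle = max => {
  let active = 0;      // calls currently running
  const waiting = [];  // start functions for calls that still have to wait
  const next = () => {
    if (active < max && waiting.length > 0) {
      active++;
      waiting.shift()();
    }
  };
  // Same call shape as in the answer: limitFn(fn)(arg) returns a promise.
  return fn => arg =>
    new Promise((resolve, reject) => {
      waiting.push(() =>
        Promise.resolve()
          .then(() => fn(arg))
          .then(resolve, reject)
          .finally(() => {
            active--;  // free the slot, then start the next waiting call
            next();
          })
      );
      next();
    });
};

// Demo with a fake 10 ms "query" that records the highest concurrency seen.
let running = 0;
let peak = 0;
const fakeQuery = name =>
  new Promise(resolve => {
    running++;
    peak = Math.max(peak, running);
    setTimeout(() => {
      running--;
      resolve(name.toUpperCase());
    }, 10);
  });

const max5 = throttle(5);
const demo = Promise.all(
  Array.from({ length: 20 }, (_, i) => max5(fakeQuery)(`domain${i}`))
).then(results => {
  console.log(results.length, 'results, peak concurrency', peak);
  return { results, peak };
});
```

With a limiter like this, the batch size only controls how many results are collected per Promise.all call, while the limiter decides how many requests actually run at the same time.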

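You should figure out what the redis client actually does; I tried to work it out from the documentation, but I might as well have asked my goldfish. Once you have reverse engineered the client's behavior, it is best to try a small batch size first to see whether there are any errors. You need to import lib to use this; it can be found here.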
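
One way to run that small experiment is to wrap the callback-style set in a promise and log whatever comes back. The sketch below uses a stub client (fakeRedisClient, hypothetical) so that it runs without a Redis server. It assumes the real client follows the usual Node convention callback(err, reply) and replies "OK" on success; check that against your own client before relying on it.

```javascript
const { promisify } = require('util');

// Hypothetical stub that mimics a callback-style client: set(key, value, callback).
const fakeRedisClient = {
  store: new Map(),
  set(key, value, callback) {
    this.store.set(key, value);
    // Assumption: a successful set reports "OK" through callback(err, reply).
    setImmediate(() => callback(null, 'OK'));
  },
};

// Bind the promisified method, because set() uses `this`.
const setAsync = promisify(fakeRedisClient.set).bind(fakeRedisClient);

// Store one record and log the reply to see what the client actually returns.
const probe = setAsync(
  'probe:key',
  JSON.stringify([{ domain_name: 'example.com' }])
).then(reply => {
  console.log('reply from set:', reply);
  return reply;
});
```

Swapping fakeRedisClient for the real client and running this against one or two records should show what the reply looks like before the full run is started.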

Regarding "javascript - How many requests can Node-Express make at once?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48213335/
