javascript - How to summarize an array with group and rollup from d3-array?

Tags: javascript arrays d3.js alasql

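I am trying to use d3-array to generate two summaries of an array of objects:

  • Which actions did each teacher perform?
  • Which posts did each teacher edit?

This is my current approach: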

const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

const teacherSummary = [
  ...d3.rollup(
    data,
    (x) => x.length,
    (d) => d.teacher_username,
    (d) => d.action
  ),
]
  .map((x) => {
    return {
      teacher_username: x[0],
      num_edits: x[1].get("edit") || 0,
      num_replies: x[1].get("reply") || 0,
    };
  })
  .sort((a, b) => d3.descending(a.num_edits, b.num_edits));
// [
//   { "teacher_username": "Mary", "num_edits": 2, "num_replies": 1 },
//   { "teacher_username": "John", "num_edits": 1, "num_replies": 2 },
//   { "teacher_username": "Paul", "num_edits": 0, "num_replies": 1 }
// ]

const postIdsByTeacher = d3.rollups(
  data.filter((x) => x.action === "edit"),
  (v) => [...new Set(v.map((d) => d.post_id))].join(", "), // Set() is used to get rid of duplicate post_ids
  (d) => d.teacher_username
);
// [
//  ["John","47469"],
//  ["Mary","47465, 47467"]
// ]

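I am flexible about the output format. What I want to optimize is efficiency and clarity:

  • Can I get both summaries in a single rollup call? Perhaps by adding edited_post_ids to teacherSummary.
  • It seems there should be a more elegant way to replace the [...Map/Set] calls.

Edit: Out of curiosity, I also tried this with alasql. It almost works, except for the null values in edited_post_ids.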

const sql = alasql(`
select
  teacher_username,
  count(case when action = 'reply' then 1 end) num_replies,
  count(case when action = 'edit' then 1 end) num_edits,
  array(case when action = 'edit' then post_id end) as edited_post_ids
from ?
group by teacher_username
`, [data])
// [ 
//   { teacher_username: "John", num_replies: 2, num_edits: 1, edited_post_ids: [null, 47469, null], }, 
//   { teacher_username: "Mary", num_replies: 1, num_edits: 2, edited_post_ids: [null, 47465, 47467], }, 
//   { teacher_username: "Paul", num_replies: 1, num_edits: 0, edited_post_ids: [null], },
// ];
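
The null entries can be removed after the query with a plain JavaScript filter. The following is only a sketch: it assumes the result rows look exactly like the output shown above, and `cleaned` is a name introduced here for illustration. It does not change the SQL itself.

```javascript
// Rows copied from the alasql output shown above.
const rows = [
  { teacher_username: "John", num_replies: 2, num_edits: 1, edited_post_ids: [null, 47469, null] },
  { teacher_username: "Mary", num_replies: 1, num_edits: 2, edited_post_ids: [null, 47465, 47467] },
  { teacher_username: "Paul", num_replies: 1, num_edits: 0, edited_post_ids: [null] },
];

// Copy each row and keep only the non-null (and non-undefined) post ids.
const cleaned = rows.map((row) => ({
  ...row,
  edited_post_ids: row.edited_post_ids.filter((id) => id != null),
}));

console.log(cleaned);
```

The original rows are left untouched, since each row is copied with the spread operator before its array is replaced.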

Best answer

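The function signature of d3.rollup is:

d3.rollup(iterable, reduce, ...keys)

On the face of it, you can supply one operation in reduce, such as a count, a sum, or some other operation, but only one.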
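
To make the signature concrete, here is a rough plain-JavaScript sketch of what a rollup with a single key does conceptually: group the rows by key, then hand each group to reduce. `simpleRollup` is a hypothetical helper written for this illustration; it is not d3's implementation and ignores nested keys.

```javascript
// Hypothetical helper for illustration only; NOT d3's implementation.
// Groups `iterable` by one key function, then applies `reduce` to each group.
function simpleRollup(iterable, reduce, key) {
  const groups = new Map();
  for (const item of iterable) {
    const k = key(item);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(item);
  }
  // Replace each group's array of rows with the reducer's single result.
  const result = new Map();
  for (const [k, rows] of groups) result.set(k, reduce(rows));
  return result;
}

const sample = [
  { action: "reply", teacher_username: "John" },
  { action: "edit", teacher_username: "John" },
  { action: "reply", teacher_username: "Paul" },
];

// One reducer (a count) applied to every group.
const counts = simpleRollup(sample, (v) => v.length, (d) => d.teacher_username);
console.log([...counts]); // [["John", 2], ["Paul", 1]]
```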

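For your output, you are looking for two different operations:

  • counting the replies and edits, and
  • an array operation to collect the post_id values where action == "edit"

Once you choose (x) => x.length, you lose the chance to use a different reduce operation. Arguably, d3.rollup is not the function you need when you have multiple operations.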
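
One possible way around this, offered as a sketch rather than as part of the original answer: the reducer receives every row of a group, so it may return an object holding several summaries at once. `summarize` is a hypothetical name. The grouping below uses a plain Map so the snippet runs without d3; the assumption is that the same function could be passed as the reduce argument, for example d3.rollups(data, summarize, d => d.teacher_username).

```javascript
const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

// Hypothetical reducer: takes all rows of one teacher and builds
// every summary in a single pass over those rows.
function summarize(rows) {
  const summary = { num_edits: 0, num_replies: 0, edited_post_ids: [] };
  for (const row of rows) {
    if (row.action === "edit") {
      summary.num_edits += 1;
      summary.edited_post_ids.push(row.post_id);
    } else if (row.action === "reply") {
      summary.num_replies += 1;
    }
  }
  return summary;
}

// Plain-JS grouping by teacher (a stand-in for the grouping d3 would do).
const byTeacher = new Map();
for (const row of data) {
  if (!byTeacher.has(row.teacher_username)) byTeacher.set(row.teacher_username, []);
  byTeacher.get(row.teacher_username).push(row);
}

const teacherSummary = [...byTeacher].map(([teacher_username, rows]) => ({
  teacher_username,
  ...summarize(rows),
}));

console.log(teacherSummary);
```

Each row is visited once for grouping and once for summarizing, instead of being filtered repeatedly per teacher.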

You can still add edited_post_ids to teacherSummary by going back to the original data and applying filter and then map:

const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

const teacherSummary = [...d3.rollup(
  data,
  v => v.length,
  d => d.teacher_username,
  d => d.action
)].map(d => {
  return {
    teacher_username: d[0],
    num_edits: d[1].get("edit") || 0,
    num_replies: d[1].get("reply") || 0,
    edited_post_ids: data
      .filter(x => x.action === "edit" && x.teacher_username === d[0])
      .map(x => x.post_id)
  }
});
  
console.log(teacherSummary);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/6.0.0/d3.min.js"></script>

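Another approach is to skip d3.rollup/d3.rollups and use d3.groups instead. (Incidentally, in the source, rollup and group are both calls to nest.) You lose the counting that rollup does for you and have to implement it yourself. This example reads a bit like the SQL example: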

const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

// compare with
// select
//   teacher_username,
//   count(case when action = 'reply' then 1 end) num_replies,
//   count(case when action = 'edit' then 1 end) num_edits,
//   array(case when action = 'edit' then post_id end) as edited_post_ids
// from ?
// group by teacher_username

const teacherSummary = d3.groups(data, d => d.teacher_username)
  .map(k => {
    return {
      teacher_username: k[0],
      num_edits: k[1].filter(k2 => k2.action == "edit").length,
      num_replies: k[1].filter(k2 => k2.action == "reply").length,
      edited_post_ids: k[1].filter(k2 => k2.action == "edit").map(k3 => k3.post_id)
    }
  });
  
console.log(teacherSummary);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/6.0.0/d3.min.js"></script>

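As a side note, you can boil postIdsByTeacher down to the following and avoid the new Set(...) construction. Note that this returns an array rather than a joined string and does not remove duplicate post_ids: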

const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

const postIdsByTeacher = d3.rollups(
  data.filter(d => d.action === "edit"),
  v => v.map(k => k.post_id),
  d => d.teacher_username
);

console.log(postIdsByTeacher);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/6.0.0/d3.min.js"></script>
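
The reducer in this snippet maps rows to post_ids without deduplicating them, whereas the question used Set for exactly that purpose. If a teacher can edit the same post more than once, a small sketch like the following restores the deduplication; `editedIds` holds hypothetical ids with a repeat, chosen only to show the effect.

```javascript
// Hypothetical ids for one teacher, with 47469 appearing twice.
const editedIds = [47469, 47465, 47469, 47467];

// A Set keeps the first occurrence of each value in insertion order.
const uniqueIds = Array.from(new Set(editedIds));

console.log(uniqueIds.join(", ")); // "47469, 47465, 47467"
```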

But my gut feeling is that d3.rollup is most valuable when you want standard sums and counts.

Regarding javascript - How to summarize an array with group and rollup from d3-array?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65734133/
