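I am trying to use d3-array to generate two summaries of an array of objects:
- Which actions did each teacher perform?
- Which posts did each teacher edit?
This is my current approach: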
const data = [
{ post_id: 47469, action: "reply", teacher_username: "John" },
{ post_id: 47469, action: "edit", teacher_username: "John" },
{ post_id: 47468, action: "reply", teacher_username: "John" },
{ post_id: 47465, action: "reply", teacher_username: "Mary" },
{ post_id: 47465, action: "edit", teacher_username: "Mary" },
{ post_id: 47467, action: "edit", teacher_username: "Mary" },
{ post_id: 46638, action: "reply", teacher_username: "Paul" },
];
const teacherSummary = [
...d3.rollup(
data,
(x) => x.length,
(d) => d.teacher_username,
(d) => d.action
),
]
.map((x) => {
return {
teacher_username: x[0],
num_edits: x[1].get("edit") || 0,
num_replies: x[1].get("reply") || 0,
};
})
.sort((a, b) => d3.descending(a.num_edits, b.num_edits));
// [
// { "teacher_username": "Mary", "num_edits": 2, "num_replies": 1 },
// { "teacher_username": "John", "num_edits": 1, "num_replies": 2 },
// { "teacher_username": "Paul", "num_edits": 0, "num_replies": 1 }
// ]
const postIdsByTeacher = d3.rollups(
data.filter((x) => x.action === "edit"),
(v) => [...new Set(v.map((d) => d.post_id))].join(", "), // Set() is used to get rid of duplicate post_ids
(d) => d.teacher_username
);
// [
// ["John","47469"],
// ["Mary","47465, 47467"]
// ]
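I'm flexible about the output format. What I'd like to optimize is efficiency and clarity:
- Can I get both summaries from a single rollup call? Perhaps by adding edited_post_ids to teacherSummary.
- It seems there should be a more elegant way to replace the [...Map/Set] calls.
Edit: out of curiosity, I also tried this approach with alasql. It almost works, except for the null values in edited_post_ids.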
sql = alasql(`
select
teacher_username,
count(case when action = 'reply' then 1 end) num_replies,
count(case when action = 'edit' then 1 end) num_edits,
array(case when action = 'edit' then post_id end) as edited_post_ids
from ?
group by teacher_username
`, [data])
// [
// { teacher_username: "John", num_replies: 2, num_edits: 1, edited_post_ids: [null, 47469, null], },
// { teacher_username: "Mary", num_replies: 1, num_edits: 2, edited_post_ids: [null, 47465, 47467], },
// { teacher_username: "Paul", num_replies: 1, num_edits: 0, edited_post_ids: [null], },
// ];
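The nulls most likely come from the case expression, which has no else branch and therefore yields null for every non-edit row. One low-tech way around that, sketched here as an assumption rather than an alasql feature, is to strip the nulls in plain JavaScript after the query. The rows below are copied from the output above; sqlRows and cleaned are hypothetical names.

```javascript
// Rows copied from the alasql output shown above.
const sqlRows = [
  { teacher_username: "John", num_replies: 2, num_edits: 1, edited_post_ids: [null, 47469, null] },
  { teacher_username: "Mary", num_replies: 1, num_edits: 2, edited_post_ids: [null, 47465, 47467] },
  { teacher_username: "Paul", num_replies: 1, num_edits: 0, edited_post_ids: [null] },
];

// Post-process: drop null (and undefined) entries from each edited_post_ids array.
const cleaned = sqlRows.map((row) => ({
  ...row,
  edited_post_ids: row.edited_post_ids.filter((id) => id != null),
}));

console.log(cleaned);
```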
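Best answer
The function signature of d3.rollup is:
d3.rollup(iterable, reduce, ...keys)
On the face of it, you can supply one operation in reduce, such as a count, a sum, or some other operation, but only one.
For your output, you are looking for two different operations:
- counting replies and edits, and
- an array operation to collect the post_id values where action == "edit"
Once you opt for (x) => x.length, you have lost the opportunity to use a different reduce operation. Arguably, d3.rollup is not the function you need if you have multiple operations.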
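As an aside on the single-reducer point: the reduce function may return any value, including an object, so one possible workaround is a composite reducer that computes several aggregates per group. The sketch below is only an illustration under assumptions: summarizeTeacher and rollupsBy are hypothetical names, and rollupsBy is a plain-JavaScript stand-in for d3.rollups with a single key, used so that the snippet runs without d3 loaded.

```javascript
const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

// Hypothetical composite reducer: receives all rows of one group
// and returns several aggregates in a single object.
function summarizeTeacher(rows) {
  const edits = rows.filter((d) => d.action === "edit");
  return {
    num_edits: edits.length,
    num_replies: rows.filter((d) => d.action === "reply").length,
    edited_post_ids: [...new Set(edits.map((d) => d.post_id))], // de-duplicated
  };
}

// Hypothetical stand-in for d3.rollups with one key function:
// groups rows by key, then applies the reducer to each group.
function rollupsBy(rows, reduce, key) {
  const groups = new Map();
  for (const row of rows) {
    const k = key(row);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(row);
  }
  return [...groups].map(([k, v]) => [k, reduce(v)]);
}

const compositeSummary = rollupsBy(data, summarizeTeacher, (d) => d.teacher_username)
  .map(([teacher_username, summary]) => ({ teacher_username, ...summary }));

console.log(compositeSummary);
```

With d3 available, the same reducer could be passed directly as d3.rollups(data, summarizeTeacher, (d) => d.teacher_username), since rollups calls the reducer with the array of rows in each group.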
You can still add edited_post_ids to teacherSummary by going back to the original data and applying filter, then map:
const data = [
{ post_id: 47469, action: "reply", teacher_username: "John" },
{ post_id: 47469, action: "edit", teacher_username: "John" },
{ post_id: 47468, action: "reply", teacher_username: "John" },
{ post_id: 47465, action: "reply", teacher_username: "Mary" },
{ post_id: 47465, action: "edit", teacher_username: "Mary" },
{ post_id: 47467, action: "edit", teacher_username: "Mary" },
{ post_id: 46638, action: "reply", teacher_username: "Paul" },
];
const teacherSummary = [...d3.rollup(
data,
v => v.length,
d => d.teacher_username,
d => d.action
)].map(d => {
return {
teacher_username: d[0],
num_edits: d[1].get("edit") || 0,
num_replies: d[1].get("reply") || 0,
edited_post_ids: data
.filter(x => x.action === "edit" && x.teacher_username === d[0]) // logical &&, not bitwise &
.map(x => x.post_id)
}
});
console.log(teacherSummary);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/6.0.0/d3.min.js"></script>
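Another approach is to skip d3.rollup/d3.rollups and use d3.groups instead. (Incidentally, in the source, rollup and group are both calls to nest.) You lose the counting that rollup did for you and have to implement it yourself. This example reads a bit like the SQL example: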
const data = [
{ post_id: 47469, action: "reply", teacher_username: "John" },
{ post_id: 47469, action: "edit", teacher_username: "John" },
{ post_id: 47468, action: "reply", teacher_username: "John" },
{ post_id: 47465, action: "reply", teacher_username: "Mary" },
{ post_id: 47465, action: "edit", teacher_username: "Mary" },
{ post_id: 47467, action: "edit", teacher_username: "Mary" },
{ post_id: 46638, action: "reply", teacher_username: "Paul" },
];
// compare with
// select
// teacher_username,
// count(case when action = 'reply' then 1 end) num_replies,
// count(case when action = 'edit' then 1 end) num_edits,
// array(case when action = 'edit' then post_id end) as
// edited_post_ids
// from ?
// group by teacher_username
const teacherSummary = d3.groups(data, d => d.teacher_username)
.map(k => {
return {
teacher_username: k[0],
num_edits: k[1].filter(k2 => k2.action == "edit").length,
num_replies: k[1].filter(k2 => k2.action == "reply").length,
edited_post_ids: k[1].filter(k2 => k2.action == "edit").map(k3 => k3.post_id)
}
});
console.log(teacherSummary);
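Since the question also asks about efficiency: the groups version above filters each group three times. If that ever matters, a single pass over the rows is one alternative. This is a plain-JavaScript sketch that does not use d3 at all; byTeacher and singlePassSummary are hypothetical names, and unlike the Set-based version in the question it does not de-duplicate post_ids.

```javascript
const data = [
  { post_id: 47469, action: "reply", teacher_username: "John" },
  { post_id: 47469, action: "edit", teacher_username: "John" },
  { post_id: 47468, action: "reply", teacher_username: "John" },
  { post_id: 47465, action: "reply", teacher_username: "Mary" },
  { post_id: 47465, action: "edit", teacher_username: "Mary" },
  { post_id: 47467, action: "edit", teacher_username: "Mary" },
  { post_id: 46638, action: "reply", teacher_username: "Paul" },
];

// One pass over the rows, accumulating all three aggregates per teacher.
const byTeacher = new Map();
for (const { teacher_username, action, post_id } of data) {
  if (!byTeacher.has(teacher_username)) {
    byTeacher.set(teacher_username, {
      teacher_username,
      num_edits: 0,
      num_replies: 0,
      edited_post_ids: [],
    });
  }
  const entry = byTeacher.get(teacher_username);
  if (action === "edit") {
    entry.num_edits += 1;
    entry.edited_post_ids.push(post_id);
  } else if (action === "reply") {
    entry.num_replies += 1;
  }
}

// A Map preserves insertion order, so teachers appear in first-seen order.
const singlePassSummary = [...byTeacher.values()];
console.log(singlePassSummary);
```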
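As a side note, you can boil postIdsByTeacher down to the following and avoid the new Set(etc) type of thing. Note that, unlike the Set version, this returns an array of post_ids rather than a joined string and does not remove duplicates: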
const data = [
{ post_id: 47469, action: "reply", teacher_username: "John" },
{ post_id: 47469, action: "edit", teacher_username: "John" },
{ post_id: 47468, action: "reply", teacher_username: "John" },
{ post_id: 47465, action: "reply", teacher_username: "Mary" },
{ post_id: 47465, action: "edit", teacher_username: "Mary" },
{ post_id: 47467, action: "edit", teacher_username: "Mary" },
{ post_id: 46638, action: "reply", teacher_username: "Paul" },
];
const postIdsByTeacher = d3.rollups(
data.filter(d => d.action === "edit"),
v => v.map(k => k.post_id), // map already returns a new array, so [].concat is not needed
d => d.teacher_username
);
console.log(postIdsByTeacher);
But my gut feeling is that the value of d3.rollup lies in cases where you want standard sums and counts.
Regarding javascript - How to summarize an array using group and rollup in d3-array?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65734133/