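I have a dataset whose last field is a string containing a sentence. My goal is to split each sentence into words and create a new dataset in which each word is on its own row, as shown below.
Here is the format of the old dataset: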
0: Object { creator: "molly", number: 3, doc: "The cat in the hat ate the rat", … }
1: Object { creator: "may", number: 4, doc: "the crass rat", … }
2: Object { creator: "may", number: 4, doc: "The mouse in the pouch at the cat", … }
3: Object { creator: "may", number: 4, doc: "the fish hog", … }
4: Object { creator: "may", number: 4, doc: "the dog warm", … }
This is the format I want:
0: Object { creator: "molly", number: 3, doc: "The", … }
1: Object { creator: "molly", number: 3, doc: "cat", … }
2: Object { creator: "molly", number: 3, doc: "in", … }
3: Object { creator: "molly", number: 3, doc: "the", … }
4: Object { creator: "molly", number: 3, doc: "hat", … }
5: Object { creator: "molly", number: 3, doc: "ate", … }
6: Object { creator: "molly", number: 3, doc: "the", … }
7: Object { creator: "molly", number: 3, doc: "rat", … }
8: Object { creator: "may", number: 4, doc: "the", … }
9: Object { creator: "may", number: 4, doc: "crass", … }
10: Object { creator: "may", number: 4, doc: "rat", … }
I am using D3. The code below lets me generate a new dataset in which each word is on its own row:
doc.csv:
date,number,creator,,doc
6/16/2000,3,molly,3,The cat in the hat ate the rat
2/25/2002,4,may,2,The mouse in the pouch at the cat
12/5/2004,3,molly,4,the lovely fish
7/6/2006,1,milly,1,the pog dog
9/7/2003,4,may,4,the fish hog
12/10/2001,4,may,3,the crass rat
6/15/2005,2,maggie,3,the ass rat
6/9/2004,1,milly,4,the fish blue
10/5/2005,1,milly,3,the rat true
10/7/2003,4,may,1,the dog warm
1/19/2009,4,may,2,the cat norm
10/30/2007,1,milly,4,the fish wish
8/13/2009,4,may,2,cat bat ticks
9/30/2004,3,molly,1,dog nog mog
1/17/2006,4,may,3,rat tittily too
12/18/2009,3,molly,1,dog coppily poo
11/2/2007,2,maggie,3,rat pitpat poo
4/17/2007,1,milly,4,fish too!
html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Interactive scatterplot</title>
<link rel="stylesheet" type="text/css" href="style.css">
<script type="text/javascript" src="d3.v4.js"></script>
</head>
<body>
<script type="text/javascript" src="split.js"></script>
<div>
<textarea id="txtName" name="txt-Name" placeholder="Search for something.." rows="1"></textarea>
</div>
</body>
</html>
Code:
var parseDate = d3.timeParse("%m/%d/%Y");
var hoot = function(d) {return d.doc.split(" ").forEach(function (item) {
var data2 = {creator: d.creator, date: parseDate(d.date),item: item}
console.log(data2)
});}
d3.csv("doc.csv")
.row(function(d) {return {creator: d.creator,date: parseDate(d.date),number: Number(d.number),doc: d.doc, split: (hoot(d))};})
.get(function(error, data) {
});
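Encouragingly, when I console.log data2, I get something close to my final goal.
I have two problems:
1) The variable data2 is not available after the function runs. I tried making data2 a global variable by putting var data2 = []; at the start of the script, but that did not work.
2) The variable data2 does not take the form of a single array. I tried putting square brackets around the variable line (i.e. var data2 = [{creator: d.creator, date: parseDate(d.date),item: item}]), but that generates many arrays rather than a single one.
Thank you in advance for your time.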
Best answer
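Here data2 is a local variable inside the forEach callback, so a new one is created on every iteration. Even if you make it global, each iteration overwrites the previous value and you end up with only the value from the last iteration. Instead, you can make data2 an array and push a value into it on each iteration. It might look like this: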
var parseDate = d3.timeParse("%m/%d/%Y");
// Collects one object per word across all rows.
var data2 = [];
var hoot = function(d) {
  d.doc.split(" ").forEach(function (item) {
    data2.push({creator: d.creator, date: parseDate(d.date), item: item});
  });
};
d3.csv("doc.csv")
  .row(function(d) {
    hoot(d); // forEach returns undefined, so call hoot only for its side effect
    return {creator: d.creator, date: parseDate(d.date), number: Number(d.number), doc: d.doc};
  })
  .get(function(error, data) {
    if (error) throw error;
    // The file loads asynchronously, so data2 is only complete here.
    console.log(data2);
  });
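Log data2 to the console once the file has finished loading (for example, inside the .get callback), and you should get the expected result.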
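As an alternative to pushing into a shared array, the same reshaping can be sketched as a pure function that takes the parsed rows and returns a new array with one object per word. This is only a sketch under assumptions: the helper name splitDocsIntoWords and the sample rows are hypothetical, it splits on any run of whitespace rather than a single space, and it relies on Array.prototype.flatMap, which older browsers do not provide.

```javascript
// Hypothetical helper (not part of D3): returns one object per word
// found in each row's `doc` string.
function splitDocsIntoWords(rows) {
  return rows.flatMap(function (row) {
    return row.doc
      .split(/\s+/)                                        // split on runs of whitespace
      .filter(function (word) { return word.length > 0; }) // drop empty strings
      .map(function (word) {
        // Copy the row's other fields and replace `doc` with the single word.
        return Object.assign({}, row, { doc: word });
      });
  });
}

// Sample rows in the shape of the old dataset from the question.
var sampleRows = [
  { creator: "molly", number: 3, doc: "The cat in the hat ate the rat" },
  { creator: "may", number: 4, doc: "the crass rat" }
];

var wordRows = splitDocsIntoWords(sampleRows);
console.log(wordRows.length); // 11
console.log(wordRows[8]);     // the first word of the second row
```

With the D3 v4 loading code above, this could presumably be called as splitDocsIntoWords(data) inside the .get callback, which avoids relying on a global variable being filled in as a side effect of .row.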
Source: "javascript - Split a string in an object property and create a new dataset" on Stack Overflow: https://stackoverflow.com/questions/52238276/