Apologies if the title is a bit confusing; I wasn't sure how to express it in a few words.
I'm currently handling the case where a user uploads a .csv or Excel file, and the data has to be mapped correctly to prepare a bulk upload. It will make more sense once you read the code below!
Step one: the user uploads a .csv/Excel file, which is converted into an array of arrays. The first array is normally the header row.
The data will look like the following (header included), and will range from 100 items up to ~100,000 items:
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', 'lbeckhouse0@stackoverflow.com', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', 'mvassman1@cdbaby.com', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
]
After the upload, the user maps each field to the correct schema. That can be all of the fields, or just a selected few.
For example, say the user wants to exclude every part of the address except the zip code. We get back a "mapped fields" array, renamed to the correct schema names (i.e. First Name => firstName):
const MAPPED_FIELDS = [firstName, lastName, company, email, phone, <empty>, <empty>, <empty>, zipCode]
I've set things up so the indices of the mapped fields always line up with the headers; any unmapped header therefore holds an empty placeholder.
So in this case we know to upload only the data at indices [0, 1, 2, 3, 4, 8] (of each DUMMY_DATA row).
That brings us to the final part: for every row, build an object that pairs each schema name from MAPPED_FIELDS with the matching value from DUMMY_DATA...
const firstObjectToBeUploaded = {
firstName: 'Lambert',
lastName: 'Beckhouse',
company: 'StackOverflow',
email: 'lbeckhouse0@stackoverflow.com',
phone: '512-555-1738',
zipCode: '78721'
}
try {
await uploadData(firstObjectToBeUploaded)
} catch (err) {
console.log(err)
}
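As an aside, the included indices mentioned above ([0, 1, 2, 3, 4, 8]) can be derived from the mapped-fields array itself. A minimal sketch, assuming the `<empty>` slots come through as `null`:

```javascript
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];

// Collect the index of every mapped (non-null) field.
const includedIndices = MAPPED_FIELDS
  .map((key, idx) => (key !== null ? idx : -1))
  .filter(idx => idx !== -1);

console.log(includedIndices); // → [ 0, 1, 2, 3, 4, 8 ]
```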
All the data is sent to an AWS Lambda function written in Node.js that handles the upload/logic.
Since the dataset can get very large, I'm struggling with how to implement this efficiently.
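For the upload itself, one common pattern at ~100,000 rows (a sketch, not part of the accepted answer; the batch size is an assumption) is to chunk the mapped objects and send each chunk to the Lambda in its own request, rather than one object at a time:

```javascript
// Hypothetical batching helper: split `arr` into chunks of at most `size` items,
// so each chunk can be sent to the Lambda in a single request.
function chunk(arr, size) {
  const chunks = [];
  for (let i = 0; i < arr.length; i += size) {
    chunks.push(arr.slice(i, i + size));
  }
  return chunks;
}

// e.g. 100,000 mapped objects in batches of 500 → 200 requests
console.log(chunk([1, 2, 3, 4, 5], 2)); // → [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```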
Best answer
If you're looking for some performance gains at larger array sizes, you can apply the same logic as Nick's answer but implement it in a standard for loop:
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', 'lbeckhouse0@stackoverflow.com', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', 'mvassman1@cdbaby.com', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
const fieldLength = MAPPED_FIELDS.length;
const dataLength = DUMMY_DATA.length;
const objectsToUpload = [];
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = DUMMY_DATA[i][j];
}
}
objectsToUpload.push(obj);
}
console.log(objectsToUpload);
A for...of variant: this spreads the entries() of the MAPPED_FIELDS array into an array once, before the loop, to avoid generating a fresh entries iterator for every row, and it simply skips null keys rather than filtering them out later. The destructuring and iterator creation/spreading seem to make it slower than Nick's approach on small arrays, but faster on large ones (tested in Chrome-based browsers):
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', 'lbeckhouse0@stackoverflow.com', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', 'mvassman1@cdbaby.com', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
const MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
const objectsToUpload = [];
for (const datum of DUMMY_DATA.slice(1)) {
const obj = {};
for (const [idx, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[idx];
}
}
objectsToUpload.push(obj);
}
console.log(objectsToUpload);
Below is the rough benchmark; the results on my machine were as follows.
for 1,000: 0.400ms
for...of 1,000: 2.900ms
entries 1,000: 1.700ms
for 10,000: 4.100ms
for...of 10,000: 11.700ms
entries 10,000: 13.900ms
for 100,000: 30.200ms
for...of 100,000: 56.500ms
entries 100,000: 100.200ms
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', 'lbeckhouse0@stackoverflow.com', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', 'mvassman1@cdbaby.com', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
function makeBigData(size) {
const [header, ...data] = DUMMY_DATA;
const r = [header];
for (let l = 0; l < size; l += 1) {
r.push([...data[Math.round(Math.random())]]);
}
return r;
}
let data = makeBigData(1000);
console.time('for 1,000');
let objectsToUpload = [];
let fieldLength = MAPPED_FIELDS.length, dataLength = data.length;
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = data[i][j];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for 1,000');
data = makeBigData(1000);
console.time('for...of 1,000');
objectsToUpload = [];
let MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
for (const datum of data.slice(1)) {
const obj = {};
for (const [i, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[i];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for...of 1,000');
data = makeBigData(1000);
console.time('entries 1,000');
objectsToUpload = data.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0])
)
)
console.timeEnd('entries 1,000');
console.log();
data = makeBigData(10000);
console.time('for 10,000');
objectsToUpload = [];
fieldLength = MAPPED_FIELDS.length, dataLength = data.length;
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = data[i][j];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for 10,000');
data = makeBigData(10000);
console.time('for...of 10,000');
objectsToUpload = [];
MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
for (const datum of data.slice(1)) {
const obj = {};
for (const [i, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[i];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for...of 10,000');
data = makeBigData(10000);
console.time('entries 10,000');
objectsToUpload = data.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0])
)
)
console.timeEnd('entries 10,000');
console.log();
data = makeBigData(100000);
console.time('for 100,000');
objectsToUpload = [];
fieldLength = MAPPED_FIELDS.length, dataLength = data.length;
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = data[i][j];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for 100,000');
data = makeBigData(100000);
console.time('for...of 100,000');
objectsToUpload = [];
MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
for (const datum of data.slice(1)) {
const obj = {};
for (const [i, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[i];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for...of 100,000');
data = makeBigData(100000);
console.time('entries 100,000');
objectsToUpload = data.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0])
)
)
console.timeEnd('entries 100,000');
Original question: javascript - Most efficient way of mapping two arrays of objects to prepare for 'upload', found on Stack Overflow: https://stackoverflow.com/questions/74175664/