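Good day. After a lot of effort trying to get to grips with this technology (Dataflow), I have managed to get my pipeline running 100%.
It loads a bunch of CSV files into the pipeline (from Google Cloud Storage), converts them into "Domain" objects, and then saves them to files in JSON format.
What I want to do is take the JSON objects and push them straight into a database (Google Cloud Firestore).
The final transform I apply to the data at this stage is: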
.apply(DatastoreIO.v1().write().withProjectId("____"));
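As far as I understand, this call needs the preceding transform to return an Entity object, but I have not been able to create that object: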
public Entity toEntity() {
    Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
    Key taskKey = datastore.newKeyFactory().setKind("Task").newKey("Test");
    Entity e = Entity.newBuilder(taskKey).set("Domain", domain)
            .set("LocationOnsite", locOnSite)
            .set("Company", company).build();
    return e;
}
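This returns a com.google.cloud.datastore.Entity rather than the required com.google.datastore.v1.Entity.
I think it is worth noting that the "Domain" object also contains ArrayLists of other objects, such as "Emails", which need to be included in the database.
Below is a sample of the JSON output I currently have: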
{
"Vertical": "Business And Industrial",
"Zip": "35229",
"Company": "Alabama Association of Nonprofits",
"QuantCast": "229219",
"Twitter": "",
"Vimeo": "",
"LocationOnSite": "",
"LastIndexed": "2018-02-01",
"Pinterest": "",
"Youtube": "",
"TechSpend": "$250+",
"Emails": [
{
"Email": "[email protected]"
},
{
"Email": "[email protected]"
},
{
"Email": "[email protected]"
},
{
"Email": "[email protected]"
},
{
"Email": "[email protected]"
},
{
"Email": "Unknown"
}
],
"Facebook": "",
"Google+": "",
"Alexa": "",
"Github": "",
"FirstIndexed": "2011-01-03",
"People": [
{
"Email": "Unknown",
"Name": "Joshua Cirulnick"
},
{
"Email": "Unknown",
"Position": "Other",
"Name": " Elaine Lin"
},
{
"Email": "Unknown",
"Position": "Other",
"Name": " Terry Burkle"
},
{
"Email": "Unknown",
"Position": "Director",
"Name": " Ashley Gilbert"
},
{
"Email": "Unknown",
"Position": "President",
"Name": " Carol Weisman"
},
{
"Email": "Unknown",
"Position": "Csuite",
"Name": " Shannon Ammons"
},
{
"Email": "Unknown",
"Position": "Founder",
"Name": " Kelly McDonald"
}
],
"City": "Birmingham",
"Telephone#s": [
{
"Telephone#": "+1-205-879-4712"
},
{
"Telephone#": "+1-205-871-7740"
}
],
"FirstDetected": "N/A",
"LinkedIn": "",
"VK": "",
"State": "AL",
"Instagram": "",
"Country": "US",
"Domain": "alabamanonprofits.org",
"LastFound": "N/A"
}
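One way to think about the nested lists in the output above, offered only as a sketch: Cloud Firestore documents can hold nested maps and arrays, so a Domain could be turned into a Map<String, Object> whose list fields become lists of small maps. The class name DomainDocumentSketch and its methods are hypothetical and not part of the original pipeline. Only a few of the fields are shown, and the code uses the Java standard library alone, so no Firestore call is made here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not from the original post): builds a nested map that mirrors the JSON shape. */
final class DomainDocumentSketch {

    /** Wraps each string in a single-key map, e.g. "Unknown" becomes {"Email": "Unknown"}. */
    static List<Map<String, Object>> wrapEach(String key, List<String> values) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (String value : values) {
            Map<String, Object> item = new LinkedHashMap<>();
            item.put(key, value);
            out.add(item);
        }
        return out;
    }

    /** Builds the document map; only a subset of the Domain fields is included for brevity. */
    static Map<String, Object> toDocument(String domain, String company,
                                          List<String> emails, List<String> phones) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("Domain", domain);
        doc.put("Company", company);
        doc.put("Emails", wrapEach("Email", emails));
        doc.put("Telephone#s", wrapEach("Telephone#", phones));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = toDocument(
                "alabamanonprofits.org",
                "Alabama Association of Nonprofits",
                Arrays.asList("Unknown"),
                Arrays.asList("+1-205-879-4712", "+1-205-871-7740"));
        // Prints the nested structure in insertion order.
        System.out.println(doc);
    }
}
```

A map shaped like this is the kind of value the Firestore client libraries accept as document data, though how it gets written (directly or through a triggered function) is a separate choice.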
If anyone could point me in the right direction on how to efficiently get objects like this into a Google Cloud Firestore database, I would be very grateful!
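Best answer
You can write the data to Cloud Pub/Sub, which can trigger a function that writes the data to Cloud Firestore. There is a good example from Google I/O 2017 that does the same thing, but with the Realtime Database.
You can watch it here: Data Pipelines with Firebase and Google Cloud (Google I/O '17)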
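As a rough illustration of the hand-off described above: when a Pub/Sub message is delivered in its JSON form, for example to a triggered function, the data field carries the payload as base64-encoded bytes. The sketch below assumes the pipeline publishes each Domain as a UTF-8 JSON string. PubSubPayloadSketch is a hypothetical name, and neither the publishing step nor the Firestore write is shown.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper (not from the original answer): round-trips a JSON payload through base64. */
final class PubSubPayloadSketch {

    /** Encodes a JSON string as base64 of its UTF-8 bytes, the form used for message data in JSON envelopes. */
    static String encode(String json) {
        return Base64.getEncoder().encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes the base64 data field back into the JSON string, as a receiving function would. */
    static String decode(String base64Data) {
        return new String(Base64.getDecoder().decode(base64Data), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = "{\"Domain\": \"alabamanonprofits.org\", \"State\": \"AL\"}";
        String data = encode(json);
        // The decoded text is the same JSON that was published.
        System.out.println(decode(data).equals(json));
    }
}
```

On the receiving side, the decoded JSON would then be parsed and written to Cloud Firestore by the function.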
Regarding java - output from Google Dataflow to Google Cloud Firestore, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49097284/