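I am trying out the Crate example that uses Common Crawl: https://github.com/crate/crate-commoncrawl
I have set up Crate and created the table schema by following the instructions in the example.
Since I am working on my own machine, I access Crate at this URL: http://localhost:4200/_plugin/crate-admin
The only problem I am facing is with the COPY statement. Here is the line: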
COPY commoncrawl FROM 'ccrawl://cr8.is/1WSiodP';
It raises an exception I do not recognize. Here are the error and the error trace:
COPY ERROR (0.000 sec)
Error!
SQLActionException[MalformedURLException: unknown protocol: ccrawl]
Error trace:
SQLActionException: INTERNAL_SERVER_ERROR 5000 MalformedURLException: unknown protocol: ccrawl
at java.net.URL.<init>(URL.java:600)
at java.net.URL.<init>(URL.java:490)
at java.net.URL.<init>(URL.java:439)
at java.net.URI.toURL(URI.java:1089)
at io.crate.operation.collect.files.URLFileInput.getStream(URLFileInput.java:52)
at io.crate.operation.collect.files.FileReadingCollector.readLines(FileReadingCollector.java:228)
at io.crate.operation.collect.files.FileReadingCollector.doCollect(FileReadingCollector.java:205)
at io.crate.operation.collect.MapSideDataCollectOperation$1$1.run(MapSideDataCollectOperation.java:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Best answer
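It looks like the crate-commoncrawl plugin is not installed correctly. The "unknown protocol: ccrawl" message means the JVM found no handler for the ccrawl:// scheme, which is what you would expect if the plugin that provides that scheme was not loaded. See https://github.com/crate/crate-commoncrawl#build--install.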
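To see where the message comes from, here is a minimal sketch in plain Java, without Crate. It mirrors the URI.toURL() call at the top of the stack trace. The class name UnknownProtocolDemo and the method describe are hypothetical names made up for this illustration. That Crate resolves the COPY source through the JVM's URL handlers is an inference from the stack trace, not something confirmed from Crate's source.

```java
import java.net.MalformedURLException;
import java.net.URI;

// Hypothetical demo class, not part of Crate or the crate-commoncrawl plugin.
public class UnknownProtocolDemo {

    // Returns "ok" if the JVM can build a URL for the given string,
    // otherwise the message of the MalformedURLException that was thrown.
    // No network connection is opened; only the scheme handler is looked up.
    public static String describe(String spec) {
        try {
            URI.create(spec).toURL();
            return "ok";
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // http has a built-in handler in every JVM.
        System.out.println(describe("http://localhost:4200/"));
        // ccrawl has no handler unless something registers one.
        System.out.println(describe("ccrawl://cr8.is/1WSiodP"));
    }
}
```

On a stock JVM the first line prints ok and the second prints unknown protocol: ccrawl, the same text as in the error above. The string parses fine as a URI; it only fails when it is converted to a URL, because nothing has registered a handler for that scheme.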
For more on "java - Crate Common Crawl example not working", see the similar question on Stack Overflow: https://stackoverflow.com/questions/40926030/