titan - Why can't I connect to Gremlin-Server?

Tags: titan gremlin-server gremlinpython

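Abstract

I am trying to set up a Titan/Cassandra/Gremlin-Server stack in Docker (v1.13.0). The problem I'm facing is that applications attempting to connect to Gremlin-Server on the default port 8182 report errors (details below).

First, here is some relevant version information: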

  • Cassandra v2.2.8
  • Titan v1.0.0 (Hadoop 1)
  • Gremlin 3.2.3

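    Setup

    The setup is done in a Dockerfile so that it is reproducible. It assumes that a Cassandra container already exists, running with a cassandra.yaml in which start_rpc is set to true.
    The Dockerfile is as follows: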
    FROM openjdk:alpine
    
    ENV TITAN 'titan-1.0.0-hadoop1'
    
    RUN apk update && apk add bash unzip && rm -rf /var/cache/apk/* \
        && adduser -S -s /bin/bash -D srg \
        && wget -O /tmp/$TITAN.zip http://s3.thinkaurelius.com/downloads/titan/$TITAN.zip \
        && unzip /tmp/$TITAN.zip -d /opt && ln -s /opt/$TITAN /opt/titan \
        && rm /tmp/*.zip \
        && chown -R srg /opt/$TITAN/ \
        && /opt/titan/bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3
    
    COPY conf/gremlin-server/* /opt/$TITAN/conf/gremlin-server/
    
    USER srg
    WORKDIR /opt/titan
    EXPOSE 8182
    
    CMD ["bin/gremlin-server.sh", "conf/gremlin-server/srg.yaml"]
    

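    Astute readers will notice that I am copying custom configuration files into the container, namely the Gremlin-Server configuration file (srg.yaml) and the Titan graph properties file (srg.properties).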
    srg.yaml
    host: localhost
    port: 8182
    threadPoolWorker: 1
    gremlinPool: 8
    scriptEvaluationTimeout: 30000
    serializedResponseTimeout: 30000
    channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
    graphs: {
      graph: conf/gremlin-server/srg.properties
      }
    plugins:
      - aurelius.titan
    scriptEngines: {
      gremlin-groovy: {
        imports: [java.lang.Math],
        staticImports: [java.lang.Math.PI],
        scripts: [scripts/empty-sample.groovy]},
      gremlin-jython: {},
      gremlin-python: {},
      nashorn: {
          imports: [java.lang.Math],
          staticImports: [java.lang.Math.PI]}}
    serializers:
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
    processors:
      - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
    metrics: {
      consoleReporter: {enabled: true, interval: 180000},
      csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
      jmxReporter: {enabled: true},
      slf4jReporter: {enabled: true, interval: 180000},
      gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
      graphiteReporter: {enabled: false, interval: 180000}}
    threadPoolBoss: 1
    maxInitialLineLength: 4096
    maxHeaderSize: 8192
    maxChunkSize: 8192
    maxContentLength: 65536
    maxAccumulationBufferComponents: 1024
    resultIterationBatchSize: 64
    writeBufferLowWaterMark: 32768
    writeBufferHighWaterMark: 65536
    ssl: {
      enabled: false}
    
    srg.properties
    gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
    storage.backend=cassandrathrift
    # refers to the linked container
    storage.hostname=cassandra
    cache.db-cache = true
    cache.db-cache-clean-wait = 20
    cache.db-cache-time = 180000
    cache.db-cache-size = 0.25
    
    # Start elasticsearch inside the Titan JVM
    index.search.backend=elasticsearch
    index.search.directory=db/es
    index.search.elasticsearch.client-only=false
    index.search.elasticsearch.local-mode=true
    

    Execution

    The container is run with the following command: docker run -ti --rm=true --link test.cassandra:cassandra -p 8182:8182 titan

    Here is the log output from Gremlin-Server:
    0    [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - 
             \,,,/
             (o o)
    -----oOOo-(3)-oOOo-----
    
    297  [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Configuring Gremlin Server from conf/gremlin-server/srg.yaml
    439  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics ConsoleReporter configured with report interval=180000ms
    448  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics CsvReporter configured with report interval=180000ms to fileName=/tmp/gremlin-server-metrics.csv
    557  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics JmxReporter configured with domain= and agentId=
    561  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
    1750 [main] INFO  com.thinkaurelius.titan.core.util.ReflectiveConfigOptionLoader  - Loaded and initialized config classes: 12 OK out of 12 attempts in PT0.148S
    1972 [main] INFO  com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager  - Closed Thrift connection pooler.
    1990 [main] INFO  com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration  - Generated unique-instance-id=ac1100031-ad2d5ffa52e81
    2026 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Configuring index [search]
    2386 [main] INFO  org.elasticsearch.node  - [Lunatik] version[1.5.1], pid[1], build[5e38401/2015-04-09T13:41:35Z]
    2387 [main] INFO  org.elasticsearch.node  - [Lunatik] initializing ...
    2399 [main] INFO  org.elasticsearch.plugins  - [Lunatik] loaded [], sites []
    6471 [main] INFO  org.elasticsearch.node  - [Lunatik] initialized
    6472 [main] INFO  org.elasticsearch.node  - [Lunatik] starting ...
    6477 [main] INFO  org.elasticsearch.transport  - [Lunatik] bound_address {local[1]}, publish_address {local[1]}
    6507 [main] INFO  org.elasticsearch.discovery  - [Lunatik] elasticsearch/u2StmRW1RsyEHw561yoNFw
    6519 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.cluster.service  - [Lunatik] master {new [Lunatik][u2StmRW1RsyEHw561yoNFw][ad2d5ffa52e8][local[1]]{local=true}}, removed {[Lunatik][kKyL9UE-R123LLZTTrsVCw][ad2d5ffa52e8][local[1]]{local=true},}, reason: local-disco-initial_connect(master)
    6908 [main] INFO  org.elasticsearch.http  - [Lunatik] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.3:9200]}
    6909 [main] INFO  org.elasticsearch.node  - [Lunatik] started
    6923 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.gateway  - [Lunatik] recovered [0] indices into cluster_state
    7486 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.cluster.metadata  - [Lunatik] [titan] creating index, cause [api], templates [], shards [5]/[1], mappings []
    8075 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Initiated backend operations thread pool of size 4
    8241 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Configuring total store cache size: 94787290
    8641 [main] INFO  com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog  - Loaded unidentified ReadMarker start time 2017-01-21T16:31:28.750Z into com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller@3520958b
    8642 [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] was successfully configured via [conf/gremlin-server/srg.properties].
    8643 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Initialized Gremlin thread pool.  Threads in pool named with pattern gremlin-*
    14187 [main] INFO  com.jcabi.manifests.Manifests  - 108 attributes loaded from 264 stream(s) in 185ms, 108 saved, 3371 ignored: ["Agent-Class", "Ant-Version", "Archiver-Version", "Bnd-LastModified", "Boot-Class-Path", "Build-Date", "Build-Host", "Build-Id", "Build-Java-Version", "Build-Jdk", "Build-Job", "Build-Number", "Build-Time", "Build-Timestamp", "Build-Version", "Built-At", "Built-By", "Built-OS", "Built-On", "Built-Status", "Bundle-ActivationPolicy", "Bundle-Activator", "Bundle-BuddyPolicy", "Bundle-Category", "Bundle-ClassPath", "Bundle-Classpath", "Bundle-Copyright", "Bundle-Description", "Bundle-DocURL", "Bundle-License", "Bundle-Localization", "Bundle-ManifestVersion", "Bundle-Name", "Bundle-NativeCode", "Bundle-RequiredExecutionEnvironment", "Bundle-SymbolicName", "Bundle-Vendor", "Bundle-Version", "Can-Redefine-Classes", "Change", "Class-Path", "Created-By", "DynamicImport-Package", "Eclipse-AutoStart", "Eclipse-BuddyPolicy", "Eclipse-SourceReferences", "Embed-Dependency", "Embedded-Artifacts", "Export-Package", "Extension-Name", "Extension-name", "Fragment-Host", "Git-Commit-Branch", "Git-Commit-Date", "Git-Commit-Hash", "Git-Committer-Email", "Git-Committer-Name", "Gradle-Version", "Gremlin-Lib-Paths", "Gremlin-Plugin-Dependencies", "Gremlin-Plugin-Paths", "Ignore-Package", "Implementation-Build", "Implementation-Build-Date", "Implementation-Title", "Implementation-URL", "Implementation-Vendor", "Implementation-Vendor-Id", "Implementation-Version", "Import-Package", "Include-Resource", "JCabi-Build", "JCabi-Date", "JCabi-Version", "Java-Vendor", "Java-Version", "Main-Class", "Main-class", "Manifest-Version", "Maven-Version", "Module-Email", "Module-Origin", "Module-Owner", "Module-Source", "Originally-Created-By", "Os-Arch", "Os-Name", "Os-Version", "Package", "Premain-Class", "Private-Package", "Require-Bundle", "Require-Capability", "Scm-Connection", "Scm-Revision", "Scm-Url", "Specification-Title", "Specification-Vendor", 
"Specification-Version", "Tool", "X-Compile-Source-JDK", "X-Compile-Target-JDK", "hash", "implementation-version", "mode", "package", "url", "version"]
    14842 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-jython ScriptEngine
    15540 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded nashorn ScriptEngine
    16076 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-python ScriptEngine
    16553 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-groovy ScriptEngine
    17410 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor  - Initialized gremlin-groovy ScriptEngine with scripts/empty-sample.groovy
    17410 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Initialized GremlinExecutor and configured ScriptEngines.
    17419 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - A GraphTraversalSource is now bound to [g] with graphtraversalsource[standardtitangraph[cassandrathrift:[cassandra]], standard]
    17565 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
    17566 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
    17808 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
    17811 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
    17958 [gremlin-server-boss-1] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1.
    17959 [gremlin-server-boss-1] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Channel started at port 8182.
    1/21/17 4:34:20 PM =============================================================
    
    -- Meters ----------------------------------------------------------------------
    org.apache.tinkerpop.gremlin.server.GremlinServer.errors
                 count = 0
             mean rate = 0.00 events/second
         1-minute rate = 0.00 events/second
         5-minute rate = 0.00 events/second
        15-minute rate = 0.00 events/second
    
    
    180564 [metrics-logger-reporter-thread-1] INFO  org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics  - type=METER, name=org.apache.tinkerpop.gremlin.server.GremlinServer.errors, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second
    

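    Symptoms

    Up to this point, everything appears to work as expected. The log indicates that srg.properties was loaded and that the graph was bound to a variable named graph.

    The problem arises when I try to connect to the Gremlin-Server instance through the published port 8182, for example using gremlin-python: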
    # executed via python 3.6.0 on the host machine, i.e. not inside of Docker
    from gremlin_python import statics
    from gremlin_python.structure.graph import Graph
    from gremlin_python.process.graph_traversal import __
    from gremlin_python.process.strategies import *
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    
    # Graph() is only a local handle; the traversal itself runs on the server
    graph = Graph()
    # second argument: name of the traversal source bound on the server ('g' in the log above)
    g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
    

    This produces the following exception:
    ---------------------------------------------------------------------------
    HTTPError                                 Traceback (most recent call last)
    <ipython-input-10-59ad504f29b4> in <module>()
    ----> 1 g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/','g'))
    
    /Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gremlin_python/driver/driver_remote_connection.py in __init__(self, url, traversal_source, username, password, loop, graphson_reader, graphson_writer)
         41         self._password = password
         42         if loop is None: self._loop = ioloop.IOLoop.current()
    ---> 43         self._websocket = self._loop.run_sync(lambda: websocket.websocket_connect(self.url))
         44         self._graphson_reader = graphson_reader or GraphSONReader()
         45         self._graphson_writer = graphson_writer or GraphSONWriter()
    
    /Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/ioloop.py in run_sync(self, func, timeout)
        455         if not future_cell[0].done():
        456             raise TimeoutError('Operation timed out after %s seconds' % timeout)
    --> 457         return future_cell[0].result()
        458 
        459     def time(self):
    
    /Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/concurrent.py in result(self, timeout)
        235             return self._result
        236         if self._exc_info is not None:
    --> 237             raise_exc_info(self._exc_info)
        238         self._check_done()
        239         return self._result
    
    /Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/util.py in raise_exc_info(exc_info)
    
    HTTPError: HTTP 599: Stream closed
    

    Suspecting a problem specific to this library, I tried two further tests:

    1) Try connecting to the websocket port using nc
    $ nc -z -v localhost 8182
    found 0 associations
    found 1 connections:
         1: flags=82<CONNECTED,PREFERRED>
        outif lo0
        src ::1 port 58627
        dst ::1 port 8182
        rank info not available
        TCP aux info available
    
    Connection to localhost port 8182 [tcp/*] succeeded!
    

    2) Try connecting to Gremlin-Server using a different client library, namely go-gremlin

    Test case:
    package main
    
    import (
        "fmt"
        "log"
    
        "github.com/go-gremlin/gremlin"
    )
    
    func main() {
        if err := gremlin.NewCluster("ws://localhost:8182/gremlin"); err != nil {
            log.Fatal(err)
        }
    
        data, err := gremlin.Query(`graph.V()`).Exec()
        if err != nil {
            log.Fatalf("Query error: %s", err)
        }
    
        fmt.Println(string(data))
    }
    

    Output:
    $ go run cmd/test/main.go 
    2017/01/21 14:47:42 Query error: unexpected EOF
    exit status 1
    

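    Current conclusions and questions

    From the tests above, I conclude that this is an application-level problem (i.e. a problem at the websocket or ws protocol level, not in the host or container network stack). Indeed, nc reports that the socket connection succeeds, but both the Python and Go client libraries appear to be complaining about an inappropriate (empty) response from the server.

    I tried removing the /gremlin path from the websocket URL in both gremlin-python and go-gremlin, to no avail.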
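
One way to probe this conclusion further is to send a bare WebSocket upgrade request and look at what comes back. The following is a minimal sketch using only the Python standard library; the function name `probe_websocket` and its defaults are hypothetical, and it is a diagnostic aid rather than part of gremlin-python or go-gremlin. An empty result means the peer accepted the TCP connection but closed it without answering, which would be consistent with the "Stream closed" and "unexpected EOF" errors above. A status line starting with HTTP/1.1 101 would mean the handshake itself succeeded.

```python
import base64
import os
import socket


def probe_websocket(host: str, port: int, path: str = "/gremlin",
                    timeout: float = 5.0) -> str:
    """Send a WebSocket upgrade request and return the first response line.

    Returns "" if the peer accepts the TCP connection but closes or resets
    it without sending any bytes back.
    """
    # Random 16-byte nonce, base64-encoded, as the handshake requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    ).encode("ascii")

    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            sock.sendall(request)
            data = sock.recv(4096)
        except ConnectionError:
            # Reset or broken pipe: treat it like an empty response.
            data = b""

    # Only the status line is of interest for this diagnosis.
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

From the host machine this could be called as `print(repr(probe_websocket("localhost", 8182)))`. A refused connection or a timeout is deliberately left to raise, since those point at the network stack rather than at the protocol level.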

    My question is: where do I go from here? Any suggestions or diagnostic paths would be greatly appreciated!

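    Best answer

    The main problem is that host in the Gremlin Server configuration is set to the default value, localhost. This only allows connections from the server itself (here, from inside the container). You need to change the value to the server's external IP or to 0.0.0.0.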
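
A minimal sketch of that change, assuming the configuration is the srg.yaml shown in the question and that host is a top-level key on its own line. The helper name `set_server_host` is hypothetical and not part of Gremlin Server or Titan; editing the file by hand works just as well. Binding to 0.0.0.0 makes the server listen on all of the container's interfaces, including the one that Docker's published port forwards to.

```python
import re

# Matches a top-level "host: <value>" line; [ \t] avoids swallowing newlines.
_HOST_LINE = re.compile(r"^host:[ \t]*\S+[ \t]*$", flags=re.MULTILINE)


def set_server_host(yaml_text: str, host: str = "0.0.0.0") -> str:
    """Return yaml_text with its top-level `host:` entry set to `host`."""
    updated, count = _HOST_LINE.subn(f"host: {host}", yaml_text, count=1)
    if count == 0:
        raise ValueError("no top-level 'host:' entry found")
    return updated


# Illustrative input: the first two lines of the srg.yaml from the question.
sample = "host: localhost\nport: 8182\n"
print(set_server_host(sample))
```

The image then has to be rebuilt (or the edited file copied in again) so that the container starts with the new value.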

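    The other problem is that the gremlin-python server plugin was first made available with Apache TinkerPop 3.2.2, whereas Titan 1.0.0 uses TinkerPop 3.0.1. I don't think the gremlin-python 3.2.3 plugin will work with Titan 1.0.0.

    Update: Consider using JanusGraph 0.1.1, which uses TinkerPop 3.2.3. JanusGraph was forked from Titan, so the code is basically the same with updated dependencies.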

    Regarding "titan - Why can't I connect to Gremlin-Server?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41783800/
