How to instantiate a Hadoop graph with the JanusGraph Gremlin Server

Problem description

We are trying to set up a JanusGraph (0.5.2) graph on a Hadoop cluster.

If I start JanusGraph with the built-in Gremlin Server, I can instantiate a Hadoop graph.

conf/gremlin/gremlin-server-cql-es.yaml

host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.janusgraph.channelizers.JanusGraphWebSocketChannelizer
graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
    graph: conf/gremlin/janusgraph-cql-es-server.properties,
    ConfigurationManagementGraph: conf/gremlin-server/janusgraph-cql-es-configuration-server.properties
}
scriptEngines: {
    gremlin-groovy: {
        plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
                   org.apache.tinkerpop.gremlin.spark.jsr223.SparkGremlinPlugin: {},
                   org.apache.tinkerpop.gremlin.hadoop.jsr223.HadoopGremlinPlugin: {},
                   org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
                   org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
                   org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
                   org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}}}}
serializers:
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry, org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry, org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry, org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/vnd.graphbinary-v1.0
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }} # application/vnd.graphbinary-v1.0-stringd
    # Older serialization versions for backwards compatibility:
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry, org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry, org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry, org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
    - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
    - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor,config: { cacheExpirationTime: 600000,cacheMaxSize: 1000 }}
metrics: {
    consoleReporter: {enabled: true, interval: 180000},
    csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
    jmxReporter: {enabled: true},
    slf4jReporter: {enabled: true},
    gangliaReporter: {enabled: false, addressingMode: MULTICAST},
    graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536

conf/gremlin/janusgraph-cql-es-server.properties

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.hostname=<redacted>
storage.cql.keyspace=testgraph1
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
index.search.backend=elasticsearch
index.search.hostname=<redacted>
index.search.elasticsearch.client-only=true
graph.replace-instance-if-exists=true

Instantiating the graph from the Gremlin console, without the server

CLASSPATH=/etc/hadoop/conf /opt/janusgraph/bin/gremlin.sh conf/gremlin/gremlin-server-cql-es.yaml
:plugin use tinkerpop.tinkergraph
:plugin use tinkerpop.server
:plugin use tinkerpop.hadoop
:plugin use tinkerpop.spark
hdfs
==>storage[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1730931275_1,ugi=<redacted> (auth:SIMPLE)]]]
graph = GraphFactory.open('conf/hadoop-graph/read-cql.properties')
==>hadoopgraph[cqlinputformat->nulloutputformat]
g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cqlinputformat->nulloutputformat],sparkgraphcomputer]

This works as expected.

However, as soon as I change the following line in conf/gremlin/gremlin-server-cql-es.yaml

graph: conf/gremlin/janusgraph-cql-es-server.properties

to

graph: conf/hadoop-graph/read-cql.properties

conf/hadoop-graph/read-cql.properties

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.hostname=<redacted>
janusgraphmr.ioformat.conf.storage.port=9042
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=<redacted>
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

With this change, the JanusGraph Gremlin Server dies at startup, producing:

janusgraph    | 581  [main] ERROR org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Could not invoke constructor on class org.janusgraph.graphdb.management.JanusGraphManager (defined by the 'graphManager' setting) with one argument of class Settings
janusgraph    | Exception in thread "main" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
janusgraph    |         at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:91)
janusgraph    |         at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:122)
janusgraph    |         at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:86)
janusgraph    |         at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:345)
janusgraph    | Caused by: java.lang.reflect.InvocationTargetException
janusgraph    |         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
janusgraph    |         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
janusgraph    |         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
janusgraph    |         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
janusgraph    |         at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:80)
janusgraph    |         ... 3 more
janusgraph    | Caused by: java.lang.IllegalStateException: Need to set configuration value: root.storage.backend
janusgraph    |         at com.google.common.base.Preconditions.checkState(Preconditions.java:197)
janusgraph    |         at org.janusgraph.diskstorage.configuration.ConfigOption.get(ConfigOption.java:229)
janusgraph    |         at org.janusgraph.diskstorage.configuration.BasicConfiguration.get(BasicConfiguration.java:69)
janusgraph    |         at org.janusgraph.diskstorage.configuration.Configuration.get(Configuration.java:35)
janusgraph    |         at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:411)
janusgraph    |         at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:50)
janusgraph    |         at org.janusgraph.core.JanusGraphFactory.lambda$open$0(JanusGraphFactory.java:150)
janusgraph    |         at org.janusgraph.graphdb.management.JanusGraphManager.openGraph(JanusGraphManager.java:243)
janusgraph    |         at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:150)
janusgraph    |         at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:100)
janusgraph    |         at org.janusgraph.graphdb.management.JanusGraphManager.lambda$new$0(JanusGraphManager.java:75)
janusgraph    |         at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684)
janusgraph    |         at org.janusgraph.graphdb.management.JanusGraphManager.<init>(JanusGraphManager.java:74)
janusgraph    |         ... 8 more

I am wondering which root.storage.backend I need to specify. The Hadoop configuration conf/hadoop-graph/read-cql.properties already specifies the backends through

janusgraphmr.ioformat.conf.storage.backend=cql
...
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch

So I assumed that JanusGraph would not need an explicit root.storage.backend.
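
To make the mismatch concrete: judging only from the stack trace, JanusGraphManager hands each entry under graphs to JanusGraphFactory.open, which then looks for a backend setting (reported as root.storage.backend). The sketch below is a minimal illustration in Python, not part of JanusGraph; it assumes that root.storage.backend corresponds to a top-level storage.backend key in the properties file, and parse_properties and has_top_level_backend are hypothetical helper names. It only shows that the Hadoop properties file carries the backend under the janusgraphmr.ioformat.conf. prefix and has no top-level storage.backend key.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def has_top_level_backend(text):
    """True if the file defines an unprefixed storage.backend key."""
    return "storage.backend" in parse_properties(text)


# Excerpt of conf/gremlin/janusgraph-cql-es-server.properties from the question.
SERVER_PROPS = """
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.cql.keyspace=testgraph1
"""

# Excerpt of conf/hadoop-graph/read-cql.properties from the question.
HADOOP_PROPS = """
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph
"""

if __name__ == "__main__":
    print("server properties:", has_top_level_backend(SERVER_PROPS))
    print("hadoop properties:", has_top_level_backend(HADOOP_PROPS))
```

If that reading of the trace is right, the prefixed keys are never consulted on this startup path, which would explain why the same file opens fine through GraphFactory.open in the console but fails inside the server.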

Can you tell me which backend I need to specify in order to run Hadoop graph traversals on the server side?

Thanks in advance for any hints.
