pfilion/janusgraph-schema-manager

Titan/JanusGraph Schema Manager

This is a simple schema manager for maintaining and applying a JanusGraph (formerly Titan) database schema.

The tool is written in Java and has not been carefully tested yet. Still, given the alternatives, it should be better than trying to run Java-based management calls via Gremlin or manually.

Prerequisites

Windows Users

Building

By default the tool is built without the sandbox components (see below).

Run "mvn clean package" to build the schema manager or "mvn -Psandbox clean package" if you want to build the schema manager with sandbox components. The distribution package will be available as target/titan-schema-manager--SNAPSHOT-dist[-sandbox].zip

Running the schema manager

To create a test graph using a local DynamoDB backend, use the following command:

bin/schema_manager.sh  -g graph.properties -w schema.json

To generate the documentation from the schema (without loading the schema into the graph) run this command:

bin/schema_manager.sh  -g graph.properties -d generated-docs schema.json

To load some sample data into the graph, run this command:

bin/schema_manager.sh  -g graph.properties -l data-graphml.xml schema.json
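If you script these invocations, a small wrapper can keep the flag order consistent. The following is only a sketch: the mode labels "write", "docs" and "load" are hypothetical names chosen for this example, and the flags and their argument order are copied from the three commands above rather than from the tool's usage information.

```python
SCRIPT = "bin/schema_manager.sh"


def build_command(graph_properties, schema, mode, mode_arg=None):
    """Return the argv list for one of the three invocations shown above.

    mode is "write", "docs" or "load" (labels invented for this sketch;
    the tool itself only sees the -w, -d and -l flags).
    """
    cmd = [SCRIPT, "-g", graph_properties]
    if mode == "write":
        # bin/schema_manager.sh -g graph.properties -w schema.json
        return cmd + ["-w", schema]
    if mode in ("docs", "load"):
        if mode_arg is None:
            raise ValueError("mode %r needs an output directory or data file" % mode)
        flag = "-d" if mode == "docs" else "-l"
        # bin/schema_manager.sh -g graph.properties -d|-l <arg> schema.json
        return cmd + [flag, mode_arg, schema]
    raise ValueError("unknown mode: %r" % (mode,))
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the directory where the distribution package was unpacked.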

Schema format

Please refer to src/main/resources/schema/graph-schema-def-1.0.json for details. A working example of the well-known Graph of the Gods is available in examples/sandbox/graph-of-gods/graph-of-the-gods-v1.0.json.

Schema tagging

The "doctags" property may be used on properties, edges, vertices and indexes, as well as on nested property and relationship descriptions. It contains a comma-separated list of tags to apply to the element. Tags can be cascaded automatically to nested elements in different ways (e.g. from a vertex to its declared properties and relationships). The cascading is controlled by the "doctag_cascading" property of the schema. Note that this setting, and the cascading itself, applies only within the current schema file; it does not extend to other files when the include mechanism is used.

Tagging can be used to partition the graph schema into logical fragments that may be analyzed separately.

Filtering the generated documentation using tags

The "-t" option can be used to limit the documentation to specific tag(s) or to exclude elements with particular tag(s) from it. Refer to the usage information for the value format.

Note that the documentation generated with the filter may be inconsistent - it may be missing some elements that do not pass the filter.
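To illustrate the kind of inconsistency meant here, the sketch below filters a toy schema by a single tag. The data model (a dict of vertices and a list of edge dicts) and the "keep only elements carrying the tag" semantics are assumptions made for this example; they are not the tool's actual schema format or filter implementation.

```python
def filter_by_tag(vertices, edges, tag):
    """Keep only elements carrying `tag` and report edges left dangling.

    vertices: dict mapping vertex name -> list of tags (hypothetical model)
    edges: list of dicts with "label", "from", "to" and "tags" keys
    """
    kept_vertices = {name for name, tags in vertices.items() if tag in tags}
    kept_edges = [e for e in edges if tag in e["tags"]]
    # An edge is dangling when it passed the filter but an endpoint did not.
    dangling = [
        e["label"]
        for e in kept_edges
        if e["from"] not in kept_vertices or e["to"] not in kept_vertices
    ]
    return kept_vertices, kept_edges, dangling


vertices = {"god": ["myth"], "location": ["geo"]}
edges = [{"label": "lives", "from": "god", "to": "location", "tags": ["myth"]}]
kept_vertices, kept_edges, dangling = filter_by_tag(vertices, edges, "myth")
```

In this toy case the "lives" edge survives the filter while its "location" endpoint does not, so filtered documentation would show an edge pointing at an element that is not documented.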

The tag filter does not affect schema generation or validation.

Schema coloring using tags

Doctags can optionally have a ":color" suffix. If one is specified, the edges and vertices with the corresponding tags are drawn in the DOT file using that color. For the supported color names, refer to http://www.graphviz.org/content/color-names.

The color matching logic is as follows: the tags of each vertex or edge are evaluated from first to last, and the first tag that has a color specified in the filter determines the color of that element. To take advantage of this, place more specific tags first.
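The first-match rule can be sketched as follows. This is an illustration only, not the tool's code: it assumes the filter has already been parsed into a mapping from tag name to color (the `filter_colors` dict is hypothetical), and it takes the element's "doctags" value as the comma-separated string described earlier.

```python
def parse_doctags(doctags):
    """Split a comma-separated "doctags" value into an ordered list of tags."""
    return [t.strip() for t in doctags.split(",") if t.strip()]


def pick_color(doctags, filter_colors):
    """Return the color of the first tag that has one in the filter.

    filter_colors: dict mapping tag name -> color name (assumed to be the
    result of parsing the filter; the real value format may differ).
    Returns None when no tag of the element has a color.
    """
    for tag in parse_doctags(doctags):
        color = filter_colors.get(tag)
        if color:
            return color
    return None


colors = {"deity": "blue", "titan": "red"}
```

With these colors, an element tagged "titan, deity" is drawn in red and one tagged "deity, titan" in blue, which is why the more specific tag should come first.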

Sandbox environment

The schema manager comes with a feature allowing the creation of a local graph environment. The environment includes:

  • local DynamoDB instance
  • local Elasticsearch instance
  • local Gremlin server
  • Gremlin console

The schema manager can be used against this environment just like against the production one.

Note that the schema manager does not run over a Gremlin connection. While that is technically possible, I am not sure it is the right approach. For now the tool runs with an embedded JanusGraph instance that connects to both the storage backend (DynamoDB) and the ES instance, applies the changes, and shuts down.

Starting the test environment

The test environment stores the graph data (DynamoDB) and the ES data in the directories you pass as arguments to the startup scripts. Erasing these directories and restarting everything therefore returns you to an empty database.
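Resetting the sandbox can be scripted. The helper below is a minimal sketch: the function name is made up for this example, it assumes the local services have already been stopped, and it recreates the directories empty on the assumption that this is harmless for the startup scripts.

```python
import os
import shutil


def reset_data_dirs(*dirs):
    """Delete each data directory and recreate it empty.

    Stop the local DynamoDB and ES instances before calling this;
    removing files underneath a running service is not safe.
    """
    for d in dirs:
        shutil.rmtree(d, ignore_errors=True)  # no error if it does not exist yet
        os.makedirs(d)
```

For the directories used in the steps below this would be called as `reset_data_dirs("/tmp/db", "/tmp/es-db")` before restarting the services.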

Build & get packages:

mvn -Psandbox clean package

Unpack target/titan-schema-manager--SNAPSHOT-dist-sandbox.zip somewhere.

Start local DynamoDB with the data stored in /tmp/db by running:

bin/start_local_dynamodb.sh /tmp/db

Start local ES instance with the data stored in /tmp/es-db by running:

bin/start_local_es.sh /tmp/es-db

Start a local Gremlin server instance using the default configuration from examples/sandbox/gremlin/gremlin-server.yaml by running:

bin/start_local_gremlin_server.sh

Finally, you can start the Gremlin console using:

bin/start_local_gremlin_console.sh

All of the above applications use the configuration files from the "examples/sandbox" directory.

Note that the console does not connect anywhere by default. You need to connect to the local Gremlin server first using the following command:

:remote connect tinkerpop.server ./titan-schema-manager-1.0-SNAPSHOT/examples/sandbox/gremlin/gremlin-localhost.yaml

To test that everything is working, run this command in the Gremlin console:

gremlin> :> g
gremlin:graphtraversalsource[standardtitangraph[com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager:[127.0.0.1]], standard]

This means that the graph called "g" is ready to be queried.

Notes on the generated documentation

The documentation is generated as XML files that are rendered with XSL and CSS stylesheets. Because of browser security restrictions, many browsers refuse to apply such stylesheets to files opened directly from the local disk. A simple workaround is to serve the documentation over HTTP, for example by running the following from the directory that contains the generated documentation (Python 3):

python3 -m http.server

On Python 2 the equivalent command is:

python -m SimpleHTTPServer
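If you prefer to serve the documentation from a script without changing the working directory, the Python 3 standard library can do the same thing. This is a sketch assuming Python 3.7 or later; the function name and the default port are choices made for this example.

```python
import functools
import http.server
import threading


def start_docs_server(directory, port=8000):
    """Serve `directory` over HTTP on localhost in a background thread.

    Returns the server object; call shutdown() on it to stop serving.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

For example, `start_docs_server("generated-docs")` makes the generated files available under http://127.0.0.1:8000/ for as long as the calling process keeps running.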

Graph visualization

A basic DOT file (graph.dot) is generated along with the documentation. You can visualize it using Graphviz or another tool. On macOS, for example, you can install Graphviz with Homebrew and display the resulting file with the following command:

dot -Tpdf generated-docs/graph.dot  | open -f -a /Applications/Preview.app
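The command above relies on the macOS "open" utility. On other systems the same DOT file can be rendered to a file instead; the sketch below wraps the standard "dot -T<format> <input> -o <output>" invocation. The function names are made up for this example, and Graphviz must be installed separately.

```python
import shutil
import subprocess


def dot_command(dot_file, output_file, fmt="pdf"):
    """Build the Graphviz command line that renders dot_file to output_file."""
    return ["dot", "-T" + fmt, dot_file, "-o", output_file]


def render(dot_file, output_file, fmt="pdf"):
    """Run Graphviz; raises if the 'dot' executable is missing or fails."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' executable not found on PATH")
    subprocess.run(dot_command(dot_file, output_file, fmt), check=True)
    return output_file
```

For example, `render("generated-docs/graph.dot", "graph.pdf")` writes a PDF that can be opened with any viewer.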

Plans for the future

  • better validation of the schema before applying it and/or reviewing how transactions can be used to avoid partial failures

References

http://docs.janusgraph.org/latest/
