We would like to use StellarGraph for supervised node classification on data stored in an RDF triple store. To evaluate StellarGraph's capabilities, we used the OpenFlights database (because of posting limitations, the link to the OpenFlights data is in the follow-up post). We randomly picked 200 airport nodes and added a label to each one that matches the types of planes flying into that airport.
We extracted these 200 labelled nodes and their subgraphs to a depth of k=4 as an N-Triples file, which is available at the following link: https://drive.google.com/file/d/0B5jIFNn8nV8oU0hiVGhDSFVTSVZHOFVFLVRkTTE0VWdjXzV3/view?usp=drivesdk
As StellarGraph only accepts the NetworkX format, we used RDFLib to convert the N-Triples data to a NetworkX graph with the following code:
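from rdflib import Graph
from rdflib.extras.external_graph_libs import rdflib_to_networkx_multidigraph

# Load the N-Triples export ("example_airport.nt" is a placeholder path for the downloaded file)
graph = Graph()
graph.parse("example_airport.nt", format="nt")

# Convert to a NetworkX MultiDiGraph; each predicate becomes an edge key
nx_graph = rdflib_to_networkx_multidigraph(graph)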
The output of this conversion is in the file example_airport.graphml: https://drive.google.com/file/d/0B5jIFNn8nV8oS0g4SGlWV0NMQWNwUnN0c2F0NVVzWmdMSUY0/view?usp=drivesdk
Next, we would like to extract another random airport node together with its subgraph at the same depth (k=4), and then predict the labels that apply to this node based on what was learned from the 200 labelled nodes.
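For the subgraph extraction step on the NetworkX side, this is a rough sketch of what we have in mind, using networkx's ego_graph. It runs on a toy chain graph with made-up node names, and it assumes edge direction is ignored when counting hops, which may or may not match how the original k=4 export was produced. The helper name k_hop_subgraph is ours, not part of any library.

```python
import networkx as nx

# Toy directed multigraph standing in for the converted RDF graph:
# a chain n0 -> n1 -> ... -> n6 with hypothetical node and edge names.
g = nx.MultiDiGraph()
for i in range(6):
    g.add_edge(f"n{i}", f"n{i + 1}", key="linkedTo")


def k_hop_subgraph(graph, node, k=4):
    """Return the subgraph of all nodes within k hops of `node`, ignoring edge direction."""
    return nx.ego_graph(graph, node, radius=k, undirected=True)


sub = k_hop_subgraph(g, "n0", k=4)
```

The returned subgraph keeps the original edge directions and keys; only the hop count is computed on the undirected view.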
Can someone please suggest how we can accomplish this with the StellarGraph library? Our requirement is that it works on a heterogeneous graph, with multiple node and edge types.