StellarGraph

Node classification on RDF Graphs

We want to find out whether StellarGraph can help with node classification, using some form of supervised learning, on data stored in an RDF triple store. To evaluate StellarGraph's capabilities, we used the OpenFlights database (link given below), from which we randomly picked 200 airport nodes and added a label to each one describing the types of planes flying into that airport.

We extracted these 200 labelled nodes and their subgraphs, to a depth of k=4, as an N-Triples file, which is available at the following link: https://drive.google.com/file/d/0B5jIFNn8nV8oU0hiVGhDSFVTSVZHOFVFLVRkTTE0VWdjXzV3/view?usp=drivesdk
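For readers who want to reproduce a depth-limited extraction like the one described above, here is a minimal sketch in plain Python. It assumes the triples are already in memory as (subject, predicate, object) tuples and that edges are followed in both directions. The function name `k_hop_subgraph` and the sample identifiers are hypothetical and are not taken from the original pipeline.

```python
from collections import defaultdict, deque


def k_hop_subgraph(triples, seed, k):
    """Return the triples whose two endpoints both lie within k hops of seed."""
    # Build an undirected adjacency map from the triples.
    neighbours = defaultdict(set)
    for s, _p, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)

    # Breadth-first search that stops expanding once depth k is reached.
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue
        for nxt in neighbours[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    # Keep only the triples that connect two reachable nodes.
    return [(s, p, o) for s, p, o in triples if s in depth and o in depth]


# Hypothetical toy data: airport A -> route 1 -> airport B -> route 2 -> airport C.
sample_triples = [
    ("airport:A", "hasRoute", "route:1"),
    ("route:1", "destination", "airport:B"),
    ("airport:B", "hasRoute", "route:2"),
    ("route:2", "destination", "airport:C"),
]
subgraph = k_hop_subgraph(sample_triples, "airport:A", 2)
```

With k=2 the toy call keeps only the two triples that reach airport B. Whether literal-valued objects should count as hops is a modelling choice this sketch does not address.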

As StellarGraph only accepts the NetworkX format, we used RDFLib to convert the N-Triples data to a NetworkX graph with the following code:

from rdflib import Graph
from rdflib.extras.external_graph_libs import rdflib_to_networkx_multidigraph
import networkx as nx

# Parse the N-Triples file into an RDFLib graph
graph = Graph()
graph.parse("Airports-with-lables.nt", format="nt")

# Convert to a NetworkX MultiDiGraph and save it as GraphML
nx_graph = rdflib_to_networkx_multidigraph(graph)
nx.write_graphml(nx_graph, "example_airport.graphml")

The output is available in the following file: example_airport.graphml : https://drive.google.com/file/d/0B5jIFNn8nV8oS0g4SGlWV0NMQWNwUnN0c2F0NVVzWmdMSUY0/view?usp=drivesdk

We would now like to extract another random airport node, together with its subgraph at the same depth (k=4), and predict which labels apply to that node based on what was learned from the labelled nodes.

Can someone please suggest how we can accomplish this with the StellarGraph library? We need this to work on a heterogeneous graph.

OpenFlights database : http://icgc.link/repository/others/openflights.nt.zip

Hi susmik,

I don’t have permission to access the files on Google Drive.

However, if I understand you correctly, you wish to train a model using one graph and then use the trained model to make predictions on a different graph.

If so, you need to use an inductive algorithm such as HinSAGE or GraphSAGE, preferably the former since yours is a heterogeneous network.
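To illustrate why this family of algorithms carries over to unseen nodes, here is a toy sketch of a GraphSAGE-style mean aggregation step in plain Python. It is not StellarGraph's implementation: it uses a single layer, no neighbour sampling, and no non-linearity, and the names `mean_aggregate` and `score` and the example values are hypothetical. The point is that the learned weights act on a node's features and its neighbours' features, not on node identities, so the same weights can be applied to a node from a different graph.

```python
def mean_aggregate(features, neighbours, node):
    """Concatenate a node's own features with the mean of its neighbours' features."""
    own = features[node]
    nbrs = neighbours.get(node, [])
    if nbrs:
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(len(own))]
    else:
        # A node without neighbours gets a zero vector as its neighbourhood summary.
        mean = [0.0] * len(own)
    return own + mean


def score(weights, bias, vector):
    """Linear score; in a trained model the weights would come from training."""
    return sum(w * x for w, x in zip(weights, vector)) + bias


# Hypothetical weights, standing in for parameters learned on a training graph.
weights, bias = [0.5, -0.5, 1.0, 0.25], 0.0

# A node from a different, unseen graph: only its features and neighbours are needed.
new_features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 3.0]}
new_neighbours = {"a": ["b", "c"]}
vector = mean_aggregate(new_features, new_neighbours, "a")
prediction_score = score(weights, bias, vector)
```

HinSAGE follows the same idea but aggregates separately per node and edge type, which this sketch leaves out.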

You can find an example of using GraphSAGE for inductive node classification at https://github.com/stellargraph/stellargraph/blob/develop/demos/node-classification/graphsage/graphsage-pubmed-inductive-node-classification-example.ipynb

The HinSAGE implementation in StellarGraph is very similar to GraphSAGE, so you should be able to modify the above example to tackle your problem.

An example of how to use HinSAGE is available at https://github.com/stellargraph/stellargraph/tree/develop/demos/node-classification/hinsage

We have additional documentation on how to use our library at https://stellargraph.readthedocs.io/en/stable/

Regards,

P.

Hi elinas, I have given open permissions to the two documents. Could you please check again? Thanks!

Hi elinas, can you guide us on how to import RDF N-Triples (.nt) files into StellarGraph? The code in my initial post is not compatible with StellarGraph.

Thanks!

Hi susmik,

Generally speaking, StellarGraph does not support .nt files. Since a StellarGraph object is also a NetworkX object, if you can create a NetworkX object that represents your data as a graph suitable for your task, then you should be able to use it to create a StellarGraph object. Maybe the NetworkX community can help you with loading your data using NetworkX.
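As a rough illustration of what "a graph suitable for your task" could mean for RDF data, the sketch below splits triples into three parts: node types (from rdf:type statements), node attributes (from literal-valued statements such as a class label), and typed edges between resources. It is plain Python under stated assumptions: triples arrive as (subject, predicate, object, object_is_literal) tuples, and the function name `split_triples` and the `ex:` identifiers are hypothetical. The resulting structures could then be loaded into NetworkX with `add_node` and `add_edge`, storing the type as a node or edge attribute.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def split_triples(triples):
    """Separate RDF triples into node types, node attributes, and typed edges."""
    node_types = {}   # node -> type name
    node_attrs = {}   # node -> {predicate: literal value}
    edges = []        # (source, target, edge type)
    for s, p, o, o_is_literal in triples:
        node_types.setdefault(s, "unknown")
        if p == RDF_TYPE:
            # rdf:type becomes the node's type, not an edge.
            node_types[s] = o
        elif o_is_literal:
            # Literal values become attributes of the subject node.
            node_attrs.setdefault(s, {})[p] = o
        else:
            # Resource-to-resource statements become typed edges.
            node_types.setdefault(o, "unknown")
            edges.append((s, o, p))
    return node_types, node_attrs, edges


# Hypothetical toy data in the spirit of the airport example.
sample = [
    ("ex:SYD", RDF_TYPE, "ex:Airport", False),
    ("ex:SYD", "ex:label", "wide-body", True),
    ("ex:R1", RDF_TYPE, "ex:Route", False),
    ("ex:R1", "ex:source", "ex:SYD", False),
    ("ex:R1", "ex:destination", "ex:MEL", False),
]
node_types, node_attrs, edges = split_triples(sample)
```

Keeping literals as attributes instead of nodes avoids turning every label or name into a separate graph node, which a direct triple-to-edge conversion would do. Whether that is the right choice depends on the task.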

If you have already created a NetworkX object suitable for your task and you get an error when creating the StellarGraph object from it, then please post the error message here so we can help you.

Furthermore, if you have a Python script or a Jupyter notebook with your work that you can share with us (maybe on GitHub?), then we can take a closer look and try to help you as much as possible.

Regards,

P.