StellarGraph

HinSAGE Yelp tutorial - Performance

Dear StellarGraph Community,

I have run the HinSAGE Yelp tutorial (https://github.com/stellargraph/stellargraph/tree/master/demos/node-classification/hinsage), with default values, and gotten the following result:

accuracy = 0.911, precision = 0.0, recall = 0.0, f1 = 0.0, ROC AUC = 0.5

I have tried to optimize the hyperparameters but without success. Do you have optimized parameters for this example?

Regards,
Logi

Hi Logi,

Thanks for trying the StellarGraph library! It looks like the example is not working as intended. With the default parameters it should obtain an F1 score of around 0.8. Using the latest (develop) scripts and library, I ran the pre-processing script on the round 13 Yelp dataset (the latest round):

python yelp-preprocessing.py -r 13 -l ~/Data/yelp/ -o .

Then running the yelp-example.py script gives the following results:

Test Set Metrics (on 29087 nodes)
Confusion matrix:
[[26211   291]
 [  600  1985]]
accuracy = 0.969, precision = 0.872, recall = 0.768, f1 = 0.817
ROC AUC = 0.878
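
For anyone comparing the two runs, the summary metrics can be re-derived from the confusion matrix above. The following is a minimal sketch, assuming the matrix rows are true classes and the columns are predicted classes (the scikit-learn convention). The function name binary_metrics is hypothetical and not part of the demo script. The sketch also evaluates a classifier that always predicts the negative class on a test set with the same class split, which is an assumption about the original run rather than something confirmed here.

```python
def binary_metrics(tn, fp, fn, tp):
    """Derive summary metrics from binary confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 instead of dividing by zero (e.g. no positive predictions).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)        # true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    return {
        "accuracy": ratio(tn + tp, tn + fp + fn + tp),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        # ROC AUC of hard 0/1 predictions reduces to (TPR + TNR) / 2.
        "auc_hard": (recall + specificity) / 2,
    }


# Counts read from the confusion matrix above (rows assumed to be true classes).
reported = binary_metrics(tn=26211, fp=291, fn=600, tp=1985)

# Assumed baseline: every node predicted negative, same class split.
baseline = binary_metrics(tn=26211 + 291, fp=0, fn=600 + 1985, tp=0)

for name, metrics in (("reported", reported), ("always-negative", baseline)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Under these assumptions the first line reproduces the metrics listed above, and the always-negative baseline gives accuracy 0.911 with precision, recall and F1 of 0 and an AUC of 0.5. Those are the numbers from the original question, which would suggest that the model there predicted the negative class for every node.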

To diagnose the problem, could you supply the version of StellarGraph, the version of the demo scripts that you are running and the round number of the Yelp dataset?

Best regards,
Andrew

Dear Andrew,

I am using StellarGraph version 0.7.1, and I pre-processed the round 13 Yelp dataset with:

python yelp-preprocessing.py -l . -o . -r 13

I downloaded the .py scripts from:

https://github.com/stellargraph/stellargraph/tree/develop/demos/node-classification/hinsage

However, I get this error when running yelp-example.py:

Traceback (most recent call last):
  File "yelp-example.py", line 264, in <module>
    args.dropout,
  File "yelp-example.py", line 133, in train
    binary_predictions = predictions[:, 1] > 0.5
IndexError: index 1 is out of bounds for axis 1 with size 1

The predictions array has an extra axis of size 1, so it is nested one level too deep:

predictions
array([[[0.9090045 , 0.09099546]],

       [[0.9034515 , 0.09654851]],

       [[0.9134177 , 0.08658227]],

       ...,

Therefore I added the following line after line 132:

predictions = predictions.squeeze()

giving

predictions
array([[0.9090045 , 0.09099546],
       [0.9034515 , 0.09654851],
       [0.9134177 , 0.08658227],
       ...,

and then the rest of the script runs, resulting in the previously mentioned metrics.
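
For reference, the workaround can be reproduced in isolation. This is a minimal sketch, assuming the model output has shape (N, 1, 2), which is what the error message implies (axis 1 has size 1). The three rows reuse the values printed above, and the array is a hypothetical stand-in for the real model output.

```python
import numpy as np

# Hypothetical stand-in for the model output, with assumed shape (N, 1, 2).
predictions = np.array([
    [[0.9090045, 0.09099546]],
    [[0.9034515, 0.09654851]],
    [[0.9134177, 0.08658227]],
])

# Indexing column 1 fails because axis 1 only has a single entry.
try:
    predictions[:, 1]
    raised = False
except IndexError as error:
    raised = True
    print("IndexError:", error)

# Dropping only axis 1 leaves an (N, 2) array of class probabilities.
squeezed = np.squeeze(predictions, axis=1)
print(squeezed.shape)

# Same thresholding step as line 133 of the script.
binary_predictions = squeezed[:, 1] > 0.5
print(binary_predictions)
```

Passing axis=1 is a slightly more defensive choice than a bare squeeze(): it removes only the extra axis, so a batch containing a single node would keep its (1, 2) shape instead of collapsing to (2,).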

Best regards,
Logi

Hi Logi,

I’m glad that it’s working for you now! The bug you are experiencing has been fixed in the latest develop version of the stellargraph library. You can install the develop version with the following command:

pip install git+https://github.com/stellargraph/stellargraph.git

This will work with the current develop version of the demos.
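
After installing, it may be worth confirming which version Python actually picks up. This is a minimal sketch using only the standard library (Python 3.8 or later). The helper name installed_version is hypothetical, and the version string reported for a develop install is not something confirmed here.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("stellargraph")
if version is None:
    print("stellargraph is not installed in this environment")
else:
    print("stellargraph version:", version)
```

If this still reports 0.7.1, the old release is probably still the one being imported, for example because the install went into a different environment.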

Best regards,
Andrew