StellarGraph GraphSAGE parameters

I am trying to understand the relationship between the parameters provided in the GraphSAGE paper and the input parameters in the implementation.

  • number_of_walks: number of random walks to take
  • length: corresponds to neighborhood depth K in the paper
  • num_samples: a list that defines the number of layers/iterations in the GraphSAGE encoder. E.g. [10,5] is a 2-layer GraphSAGE encoder, where 10 and 5 are the sizes of the 1-hop and 2-hop neighbor samples.

If we have number_of_walks = 1 and length = 5, how can we have num_samples = [10,5]? Can you explain the relationship between the parameters?

Hi @user2020

There are actually two different sets of parameters causing the confusion here:

  • number_of_walks and length are parameters for the random walks we run to generate pairs of nodes as training examples for unsupervised learning. See Section 3.2 in the reference paper, which defines the loss function in terms of pairs of nodes that come from running fixed-length random walks. These parameters are used with the UnsupervisedSampler object in stellargraph, which takes care of generating the positive/negative context pairs for training. They have no relationship with the layers in the GraphSAGE encoder.

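  • num_samples is the list that defines the number of layers/iterations in the GraphSAGE encoder, as you’ve already pointed out. The length of this list, not the random-walk length, is what corresponds to the depth K in the paper.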
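
To make the independence of the two parameter sets concrete, here is a minimal pure-Python sketch. It is not StellarGraph's implementation: the toy graph and the helper names random_walk_pairs and sample_neighbourhood are made up for illustration, the walk length is assumed to be counted in nodes, and negative sampling is left out.

```python
import random

# Hypothetical toy graph as an adjacency list (every node has a neighbour).
GRAPH = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [1, 2, 4],
    4: [3],
}


def random_walk_pairs(graph, number_of_walks, length, rng):
    """Build positive (target, context) training pairs from random walks.

    Assumption: `length` counts the nodes in a walk, start node included.
    """
    pairs = []
    for start in graph:
        for _ in range(number_of_walks):
            walk = [start]
            while len(walk) < length:
                walk.append(rng.choice(graph[walk[-1]]))
            # Pair the start node with every other node seen on the walk.
            pairs.extend((start, context) for context in walk[1:])
    return pairs


def sample_neighbourhood(graph, node, num_samples, rng):
    """Sample the encoder input for one node, one hop per list entry.

    num_samples[k] neighbours are drawn (with replacement) for every node
    in the previous hop, so len(num_samples) is the encoder depth.
    """
    frontier = [node]
    hops = []
    for size in num_samples:
        frontier = [rng.choice(graph[n]) for n in frontier for _ in range(size)]
        hops.append(frontier)
    return hops


rng = random.Random(0)

# Training pairs: controlled only by number_of_walks and length.
pairs = random_walk_pairs(GRAPH, number_of_walks=1, length=5, rng=rng)

# Encoder input for a single node: controlled only by num_samples.
hops = sample_neighbourhood(GRAPH, node=0, num_samples=[10, 5], rng=rng)

print(len(pairs))              # 5 start nodes * 1 walk * 4 context nodes = 20
print([len(h) for h in hops])  # [10, 50]: 10 one-hop, 10 * 5 two-hop samples
```

In this sketch the walks decide which pairs of nodes are trained on, while num_samples decides how much of each node's neighbourhood is fed to the encoder, so number_of_walks = 1 with length = 5 puts no constraint on num_samples = [10,5].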

Hope that helps!

