Graph Neural Networks: A Review of Methods and Applications

Posted on May 10, 2022 ∼ 2 minute read


GNNs capture the dependencies in graphs via message passing between graph nodes. Variants such as GCN, GAT, and GRN have been proposed and achieve good performance. The paper discusses the variants of each pipeline component, systematically categorizes the applications, and proposes four open problems for future work.

Introduction

Graph analysis focuses on node classification, link prediction, and clustering.

Many methods can be applied to learn a vector to represent graph nodes, edges, or subgraphs.

  • hand-engineered features
  • DeepWalk
  • node2vec, LINE, TADW

But all of these methods share two limitations:

  • No parameters are shared between nodes in the encoder, which leads to computational inefficiency.
  • Direct embedding methods lack the ability to generalize: they cannot deal with dynamic graphs or generalize to new graphs.
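
Concretely, a direct embedding method amounts to a per-node lookup table: parameters grow with the number of nodes and are never shared, and an unseen node has no row to look up. A minimal sketch (the names and sizes are made up):

```python
import numpy as np

num_nodes, dim = 1000, 64

# One independent parameter vector per node: nothing is shared across
# nodes, and a node added after training has no embedding at all.
embeddings = np.random.randn(num_nodes, dim)

def embed(node_id):
    # Raises IndexError for any node id >= num_nodes (an unseen node).
    return embeddings[node_id]
```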

Based on CNNs and graph embedding, variants of graph neural networks (GNNs) have been proposed to collectively aggregate information from the graph structure.
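
As a rough sketch of what such aggregation looks like, one message-passing round can average neighbor features and push them through a weight matrix shared by all nodes, which addresses both limitations above. The mean aggregator and ReLU here are illustrative simplifications, not the survey's definition:

```python
import numpy as np

def message_passing_step(A, H, W):
    """One round of neighbor aggregation.

    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) weight matrix shared by ALL nodes.
    """
    deg = A.sum(axis=1, keepdims=True).clip(min=1)  # guard isolated nodes
    msgs = (A @ H) / deg                            # mean over neighbors
    return np.maximum(msgs @ W, 0)                  # shared linear map + ReLU
```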

This survey makes the following contributions:

  1. It reviews existing graph neural network models.
  2. It presents several major applications and categorizes them into structural and non-structural scenarios.
  3. It proposes four open research problems.

Existing Models

The design pipeline of a GNN model for a specific task on a specific graph type:

  1. Find graph structure

    • Structural scenarios: the graph structure is explicit in the application.
    • Non-structural scenarios: graphs are implicit, so we first have to build the graph from the task.
  2. Specify graph type and scale

    • Directed/Undirected Graphs.
    • Homogeneous/Heterogeneous Graphs: in homogeneous graphs all nodes and edges have the same type, while heterogeneous graphs mix multiple node or edge types.
    • Static/Dynamic Graphs: a dynamic graph's input features or topology vary with time.
  3. Design the loss function based on the task type and the training setting.

    • Node-level tasks: node classification, node regression (predicting a continuous value for each node), and node clustering.
    • Edge-level tasks: edge classification and link prediction (predicting whether an edge exists between two nodes).
    • Graph-level tasks: graph classification, graph regression, and graph matching.

    In training, we can use a supervised, semi-supervised, or unsupervised setting.

    In the semi-supervised setting, the trained model can either predict labels for the given unlabeled nodes (transductive) or be applied to new nodes drawn from the same distribution (inductive); see the masked-loss sketch after this list.

  4. Build a model using computational modules, e.g.:

    • Propagation Module:

      • Propagate information between nodes
      • The convolution operator and the recurrent operator are usually used to aggregate information from neighbors, while the skip connection operation gathers information from historical node representations and mitigates the over-smoothing problem.
    • Sampling Module:

      When graphs are large, sampling modules are usually needed to conduct propagation on them; a neighbor-sampling sketch appears after this list.

    • Pooling Module:

      When we need representations of high-level subgraphs or whole graphs, pooling modules are needed to extract information from nodes; a simple mean-readout sketch also appears below.
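
For the semi-supervised node-level case in step 3, a common choice is cross-entropy computed only on the labeled nodes. A minimal numpy sketch, assuming the model already outputs per-node class probabilities (the mask-based setup is illustrative):

```python
import numpy as np

def masked_cross_entropy(probs, labels, labeled_mask):
    """Cross-entropy over the labeled subset of nodes only.

    probs: (n, c) predicted class probabilities per node,
    labels: (n,) integer class ids,
    labeled_mask: (n,) bool array marking the labeled nodes.
    """
    picked = probs[labeled_mask, labels[labeled_mask]]  # p(true class) per labeled node
    return -np.log(picked + 1e-12).mean()
```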
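
For the sampling and pooling modules in step 4, the two helpers below sketch the ideas: cap the number of neighbors used per node on large graphs, and apply a simple mean readout (one of many possible pooling choices) to turn node embeddings into a graph embedding. Both are illustrative sketches, not any specific paper's method:

```python
import numpy as np

def sample_neighbors(adj_list, node, k, rng=None):
    """Sampling module: keep at most k random neighbors of `node`."""
    if rng is None:
        rng = np.random.default_rng()
    nbrs = adj_list[node]
    if len(nbrs) <= k:
        return list(nbrs)
    return list(rng.choice(nbrs, size=k, replace=False))

def mean_pool(H):
    """Pooling module: mean readout, (n, d) node embeddings -> (d,) graph embedding."""
    return H.mean(axis=0)
```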


Graph basics

Degree matrix: a diagonal matrix whose diagonal entries give the number of edges attached to each vertex, i.e. $D_{ii} = \sum_j A_{ij}$ for an adjacency matrix $A$.

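A quick numpy check of the definition on a small made-up graph:

```python
import numpy as np

# Adjacency matrix of a small made-up undirected graph (4 vertices).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

D = np.diag(A.sum(axis=1))  # D_ii = number of edges attached to vertex i
print(D)                    # diagonal entries: 2, 2, 3, 1
```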

Computational Modules


GCN (Graph Convolutional Network)

The layer-wise propagation rule:

$$
H^{(l+1)} = \sigma\left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right),
\qquad \tilde{A} = A + I_N, \quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}
$$

where $H^{(l)}$ is the matrix of node representations at layer $l$ and $W^{(l)}$ is a layer-specific trainable weight matrix.

Take a two-layer GCN as an example:

$$
Z = \operatorname{softmax}\left( \hat{A}\, \operatorname{ReLU}\left( \hat{A} X W^{(0)} \right) W^{(1)} \right),
\qquad \hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}
$$

with input feature matrix $X$ and per-node class predictions $Z$.
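
A compact numpy sketch of that forward pass (the renormalization follows the formula above; the function name and weight shapes are illustrative):

```python
import numpy as np

def gcn_two_layer(A, X, W0, W1):
    """Z = softmax(A_hat @ ReLU(A_hat @ X @ W0) @ W1)."""
    A_tilde = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    A_hat = d_inv_sqrt @ A_tilde @ d_inv_sqrt   # symmetric renormalization
    H = np.maximum(A_hat @ X @ W0, 0)           # layer 1 + ReLU
    logits = A_hat @ H @ W1                     # layer 2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)     # row-wise softmax
```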

Other spectral variants include AGCN and DGCN.

Types and scale

Variants of GNNs have been designed for the graph types above (directed, heterogeneous, dynamic) and for large-scale graphs, where sampling modules keep propagation tractable.

Applications

The survey categorizes applications into structural scenarios, where the data has an explicit graph structure (e.g., physical systems, molecular chemistry and biology, knowledge graphs, traffic networks, recommender systems), and non-structural scenarios where the graph is implicit (e.g., text and images).




