Machine Learning On Graphs
Description
Deep learning has revolutionized many machine learning tasks in recent years. Successful applications range from computer vision and natural language processing to
speech recognition. The success is partially due to the availability of large
amounts of data and fast-growing computing resources (e.g., GPUs and TPUs), and
partially due to the recent advances in deep learning technology. Neural networks,
in particular, have been successfully used to process regular data such as images and
videos. However, for many applications with graph-structured data, the irregular structure of graphs means that many powerful operations in deep learning cannot be
readily applied. In recent years, there has been growing interest in extending deep learning
to graphs.
We first propose graph convolutional networks (GCNs) for the task of classification or regression on time-varying graph signals, where the signal at each vertex is
given as a time series. A key element of the GCN design is the filter. We
consider filtering signals in either the vertex (spatial) domain or the frequency (spectral) domain. Two basic architectures are proposed. In the spatial GCN architecture,
the GCN uses a graph shift operator as the basic building block to incorporate the
underlying graph structure into the convolution layer. The spatial filter directly utilizes the graph connectivity information: it is defined as a polynomial in the
graph shift operator, and the resulting convolved features aggregate neighborhood
information of each node. In the spectral GCN architecture, a frequency filter is used
instead. A graph Fourier transform operator or a graph wavelet transform operator
first transforms the raw graph signal to the spectral domain, then the spectral GCN
uses the coefficients from the graph Fourier transform or graph wavelet transform
to compute the convolved features. The spectral filter is defined using the graph’s
spectral parameters.
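
As a concrete sketch of the spatial filter (a minimal illustration assuming numpy; the function name and the choice of the adjacency matrix as the shift operator are ours, not fixed by the models above), the convolved features can be computed as a degree-K polynomial in the graph shift operator S applied to a graph signal x:

    import numpy as np

    def spatial_graph_filter(S, x, h):
        """Apply a polynomial graph filter y = sum_k h[k] * S^k x.
        S: (N, N) graph shift operator (e.g., adjacency matrix).
        x: (N,) graph signal, one value per vertex.
        h: list of K+1 filter coefficients.
        """
        y = np.zeros_like(x, dtype=float)
        Skx = x.astype(float)          # S^0 x
        for hk in h:
            y += hk * Skx              # accumulate h_k * S^k x
            Skx = S @ Skx              # one more graph shift
        return y

    # Toy example: 4-node path graph, 2-tap filter mixing each vertex with its neighbors.
    S = np.array([[0., 1., 0., 0.],
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.]])
    x = np.array([1., 0., 0., 0.])
    print(spatial_graph_filter(S, x, h=[0.5, 0.5]))

Each power of S reaches one hop further into the graph, which is how the polynomial filter aggregates multi-hop connectivity information around each node.
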
There are additional challenges in processing time-varying graph signals, since the
signal value at each vertex changes over time. The GCNs are designed to recognize
different spatiotemporal patterns in high-dimensional data defined on a graph. The
proposed models have been tested on simulation data and real data for graph signal
classification and regression. For the classification problem, we consider the power
line outage identification problem using simulation data. The experimental results
show that the proposed models can successfully classify abnormal signal patterns
and identify the outage location. For the regression problem, we use the New York
City bike-sharing demand dataset to predict the station-level hourly demand. The
prediction accuracy is superior to that of other models.
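
One simple way to see how a spatial filter extends to time-varying signals (a hedged sketch under our own assumptions, not the exact architectures evaluated above) is to treat the T time samples at each vertex as a feature vector, filter along the vertex dimension, and then mix along the time dimension:

    import numpy as np

    def spatiotemporal_layer(S, X, h, W):
        """One spatial-then-temporal step for a time-varying graph signal.
        S: (N, N) graph shift operator.
        X: (N, T) signal, one length-T time series per vertex.
        h: list of spatial filter coefficients.
        W: (T, F) temporal/feature mixing weights.
        Returns an (N, F) array of spatiotemporal features.
        """
        Y = np.zeros_like(X, dtype=float)
        SkX = X.astype(float)
        for hk in h:                   # polynomial spatial filtering, applied to every time step
            Y += hk * SkX
            SkX = S @ SkX
        return np.tanh(Y @ W)          # mix across time and apply a nonlinearity

Stacking such layers lets a model respond to patterns that are localized both on the graph and in time, which is what the outage-identification and demand-prediction tasks require.
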
We next study graph neural network (GNN) models, which have been widely
used for learning graph-structured data. Due to the permutation-invariance requirement of graph learning tasks, basic elements in graph neural networks are the invariant
and equivariant linear layers. Previous work by Maron et al. (2019) provided a maximal collection of invariant and equivariant linear layers and a simple deep neural
network model, called k-IGN, for graph data defined on k-tuples of nodes. It is
shown that the expressive power of k-IGN is equivalent to the k-Weisfeiler-Lehman (WL)
algorithm in graph isomorphism tests. However, the dimensions of the invariant layer
and the equivariant layer are the k-th and 2k-th Bell numbers, respectively. Such high
complexity makes k-IGNs computationally infeasible for k > 3. We show
that a much smaller dimension for the linear layers is sufficient to achieve the same
expressive power. We provide two sets of orthogonal bases for the linear layers, each
with only 3(2k − 1) − k basis elements. Based on these linear layers, we develop
neural network models GNN-a and GNN-b, and show that for graph data defined
on k-tuples of nodes, GNN-a and GNN-b achieve the expressive power of the k-WL
algorithm and the (k + 1)-WL algorithm in graph isomorphism tests, respectively.
In molecular prediction tasks on benchmark datasets, we demonstrate that low-order neural network models consisting of the proposed linear layers achieve better performance than other neural network models. In particular, order-2 GNN-b and order-3
GNN-a both have 3-WL expressive power, but use a much smaller basis and hence
much less computation time than known neural network models.
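
To make the invariance/equivariance requirement concrete (a self-contained toy check, not the GNN-a/GNN-b layers themselves; the five basis operations shown are only a small subset of the full basis), the sketch below builds an order-2 equivariant linear map from identity, transpose, and row/column/total sums, and verifies numerically that it commutes with a node permutation:

    import numpy as np

    def equivariant_layer(A, w):
        """A small permutation-equivariant linear map on an order-2 tensor A (N x N),
        built from five standard basis operations with coefficients w."""
        n = A.shape[0]
        ones = np.ones((n, n))
        return (w[0] * A
                + w[1] * A.T
                + w[2] * (A.sum(axis=1, keepdims=True) @ ones[:1]) / n    # broadcast row sums
                + w[3] * (ones[:, :1] @ A.sum(axis=0, keepdims=True)) / n  # broadcast column sums
                + w[4] * A.sum() * ones / n ** 2)                          # broadcast total sum

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    w = rng.standard_normal(5)
    P = np.eye(5)[rng.permutation(5)]                # random permutation matrix
    lhs = equivariant_layer(P @ A @ P.T, w)          # permute nodes, then apply the layer
    rhs = P @ equivariant_layer(A, w) @ P.T          # apply the layer, then permute nodes
    print(np.allclose(lhs, rhs))                     # True: the layer is equivariant

An invariant layer can be built analogously from permutation-invariant summaries such as the total sum and the trace; the contribution summarized above is a compact orthogonal basis for such layers at higher orders.
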
Finally, we study generative neural network models for graphs. Generative
models are often used in semi-supervised or unsupervised learning. We address two types of generative tasks. In the first task, we generate a component
of a large graph, such as predicting whether a link exists between a pair of selected nodes,
or predicting the label of a selected node/edge. The encoder embeds the input graph
to a latent vector space via vertex embedding, and the decoder uses the vertex embedding to compute the probability of a link or node label. In the second task, we try
to generate an entire graph. The encoder embeds each input graph to a point in the
latent space. This is called graph embedding. The generative model then generates
a graph from a sampled point in the latent space. Different from previous work,
we use the proposed equivariant and invariant layers in the inference model for all
tasks. The inference model is used to learn vertex/graph embeddings and the generative model is used to learn the generative distributions. Experiments on benchmark
datasets have been performed for a range of tasks, including link prediction, node
classification, and molecule generation. Experimental results show that the high expressive power of the inference model directly improves the latent space embeddings, and
hence the quality of the generated samples.
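
As an illustration of the first generative task (a minimal sketch with made-up embedding values; the inner-product decoder is a standard graph auto-encoder choice and not necessarily the exact decoder used above), a decoder can turn vertex embeddings into link probabilities as follows:

    import numpy as np

    def link_probability(z_u, z_v):
        """Decode the probability of an edge between vertices u and v
        from their latent embeddings via a sigmoid of the inner product."""
        return 1.0 / (1.0 + np.exp(-np.dot(z_u, z_v)))

    # Toy vertex embeddings, as if produced by some encoder (values are made up).
    Z = np.array([[ 1.2,  0.3],
                  [ 1.0,  0.5],
                  [-0.9,  0.8]])
    print(link_probability(Z[0], Z[1]))   # similar embeddings -> high link probability
    print(link_probability(Z[0], Z[2]))   # dissimilar embeddings -> low link probability

The better the inference model separates vertices in the latent space, the more reliable such decoded probabilities become, which is the sense in which higher expressive power of the inference model improves the generated samples.
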