Semi-supervised learning is the problem of finding clusters in a graph or a point-cloud dataset when only a few initial input labels are given. Label Spreading (LS) is a standard technique for this problem, which can be interpreted as a diffusion process of the labels on the graph. While there are many variants of LS, nearly all of them are linear models which, for every node, only account for the information incoming from its direct neighbors.
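To make the linear, neighbor-only character of classical LS concrete, here is a minimal sketch of the standard diffusion iteration (in the style of Zhou et al.'s local-and-global-consistency scheme); the function name and parameter defaults are illustrative, not taken from the talk:

```python
import numpy as np

def label_spreading(W, Y, alpha=0.9, iters=100):
    """Classical (linear) Label Spreading sketch.

    W : (n, n) symmetric adjacency matrix of the graph
    Y : (n, c) one-hot rows for the few labeled nodes, zero rows elsewhere
    alpha : diffusion weight; (1 - alpha) re-injects the input labels
    """
    d = W.sum(axis=1)
    d[d == 0] = 1.0                      # guard against isolated nodes
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ W @ D_inv_sqrt      # symmetrically normalized adjacency
    F = Y.astype(float).copy()
    for _ in range(iters):
        # each node mixes information from its direct neighbors only
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F.argmax(axis=1)              # predicted class per node
```

On a four-node path with the two endpoints labeled with different classes, the diffusion assigns each interior node the class of its nearer labeled endpoint.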

Recent work in network science has shown that many graph algorithms gain a great advantage when they account directly for higher-order features. Such features may be built from the point-cloud data or the adjacency matrix, for example by considering a triangle involving nodes i, j and k. In other contexts, higher-order information appears explicitly: in a coauthorship network, for instance, a document with three authors forms a natural triangle.

After reviewing the original LS algorithm, I will present our proposed variant, which takes direct advantage of higher-order information. A key point of the proposed method is that we replace the standard Laplacian matrix with a nonlinear Laplacian-inspired map defined in terms of an order-three tensor.
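To give a feel for how a triangle-based nonlinear map differs from the linear neighbor sum, here is a purely illustrative diffusion step over a list of triangles. It is not the exact operator from the talk: the geometric-mean mixing, the row normalization, and the function name are all assumptions made for intuition only.

```python
import numpy as np

def triangle_diffusion_step(T, F, Y, alpha=0.9):
    """One hypothetical nonlinear diffusion step over triangles.

    T : list of (i, j, k) index triples (the order-three tensor's support)
    F : (n, c) current label scores
    Y : (n, c) input labels (zero rows for unlabeled nodes)
    """
    G = np.zeros_like(F)
    for i, j, k in T:
        # each node of a triangle receives a nonlinear (geometric-mean)
        # combination of its two co-members' scores, so the update is no
        # longer a linear function of F
        G[i] += np.sqrt(F[j] * F[k])
        G[j] += np.sqrt(F[i] * F[k])
        G[k] += np.sqrt(F[i] * F[j])
    norm = G.sum(axis=1, keepdims=True)
    norm[norm == 0] = 1.0
    G = G / norm                         # crude stand-in for Laplacian scaling
    return alpha * G + (1 - alpha) * Y   # diffuse + re-inject input labels
```

The contrast with the linear case is the point: the map F ↦ G is nonlinear in F, so the fixed point cannot be written as the solution of a single linear system.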

Just as for standard LS, we can show convergence of the new nonlinear diffusion process to the global minimum of a constrained semi-supervised loss function that enforces local and global consistency with the input labels. We demonstrate the efficiency and efficacy of our approach on a variety of point-cloud and network datasets, where the proposed model outperforms classical label spreading, hypergraph clustering methods, and graph neural networks. I will also point out some open problems related to both the modeling and the computational aspects.