Categories
research

Exploring Frame Averaging for Invariant and Equivariant Network Design

written by Nursena Koprucu and Ruyu Yan

Introduction

In the first week of the project SE(3) Invariant and Equivariant Neural Network for Geometry Processing, we studied frame averaging, a research approach that tackles the challenge of representing intrinsic shape properties with neural networks. Here we introduce the paper Frame Averaging for Invariant and Equivariant Network Design [1], which lays the theoretical groundwork for applying frame averaging to different data types and symmetry groups.

Motivation

Deep neural networks are useful for approximating unknown functions with combinations of linear and non-linear units. Their applications, especially in geometry processing, often involve learning functions that are invariant or equivariant to certain symmetries of the input data (a small illustration follows the definitions below). Formally, for a function \(f\) to be

  • invariant to transformation \(A\), we have \(f(Ax)=f(x)\);
  • equivariant to transformation \(A\), we have \(f(Ax)=Af(x)\).
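As a small illustration (our own example using PyTorch, not from the paper): for a point cloud acted on by a rotation, the mean distance to the centroid is invariant, while the centroid itself is equivariant.

import torch

torch.manual_seed(0)
X = torch.randn(100, 3)                      # a random point cloud
R = torch.linalg.qr(torch.randn(3, 3)).Q     # a random orthogonal matrix

centroid = lambda P: P.mean(0)                            # equivariant to rotations
spread = lambda P: (P - centroid(P)).norm(dim=1).mean()   # invariant to rotations

print(torch.allclose(spread(X @ R.T), spread(X), atol=1e-6))             # True
print(torch.allclose(centroid(X @ R.T), centroid(X) @ R.T, atol=1e-6))   # True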

This paper introduced Frame Averaging (FA), a systematic framework for incorporating invariance or equivariance into existing architectures. It builds on the following observation from group theory:

Arbitrary functions \(\phi:V\rightarrow\mathbb{R}, \Phi:V\rightarrow W\), where \(V, W\) are some vector spaces, can be made invariant or equivariant by symmetrization, that is averaging over the group.

Instead of averaging over the entire group \(G\), whose potentially large (or infinite) cardinality makes exact averaging intractable, the paper proposes averaging over a carefully selected, input-dependent subset \(\mathcal{F}(X)\subset G\) whose cardinality \(|\mathcal{F}(X)|\) is small. Averaging over \(\mathcal{F}(X)\) is efficient while retaining both the expressive power of the backbone and exact invariance/equivariance.

Frame Averaging

Let \(\phi\) and \(\Phi\) be arbitrary functions, e.g. neural networks.

Let \(G\) be a group with representations \(\rho_i\) that preserve the group structure, i.e. \(\rho_i(gh)= \rho_i(g)\rho_i(h)\) for all \(g, h\) in \(G\).

We want to make

  • \(\phi\) into an invariant function, i.e. \(\phi(\rho_1(g)X)= \phi(X)\); and
  • \(\Phi\) into an equivariant function, i.e. \(\Phi(\rho_1(g)X)= \rho_2(g)\Phi(X)\).

Instead of averaging over the entire group, we take the average over a carefully chosen subset of group elements, called a frame.

A frame is defined as a set-valued function \(\mathcal{F}:V\rightarrow 2^G \setminus \{\emptyset\}\).

A frame is \(G\)-equivariant if \(\mathcal{F}(\rho_1(g)X) = g\mathcal{F}(X)\) for every \(X \in V\) and \(g \in G\), where \(g\mathcal{F}(X)= \{gh \mid h \in \mathcal{F}(X)\}\) and the equality means equality of sets.

Why are equivariant frames useful?

Figure 1: Frame equivariance (sphere shape represents the group \(G\); square represents \(V\)).

If the equivariant frame is easy to compute and its cardinality is not too large, then averaging over the frame provides the required function symmetrization (for both invariance and equivariance). For example, for the group of translations acting on point clouds, the single-element set containing the translation by the centroid, \(\mathcal{F}(X)=\{\frac{1}{n}X^T\mathbf{1}\}\), is an equivariant frame: translating the input translates its centroid by the same amount.

Frame Averaging

Let \(\mathcal{F}\) be a \(G\)-equivariant frame, and let \(\phi:V\rightarrow\mathbb{R}\), \(\Phi:V\rightarrow W\) be some functions.

Then \(\langle \phi \rangle _\mathcal{F}\) is \(G\)-invariant, while \(\langle \Phi \rangle _\mathcal{F}\) is \(G\)-equivariant.
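Concretely, frame averaging is defined in [1] by averaging over the frame elements (and, in the equivariant case, mapping each backbone output back with \(\rho_2\)):

$$\langle \phi \rangle _\mathcal{F}(X)=\frac{1}{|\mathcal{F}(X)|}\sum_{g\in\mathcal{F}(X)}\phi\left(\rho_1(g)^{-1}X\right) $$

$$\langle \Phi \rangle _\mathcal{F}(X)=\frac{1}{|\mathcal{F}(X)|}\sum_{g\in\mathcal{F}(X)}\rho_2(g)\Phi\left(\rho_1(g)^{-1}X\right) $$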

Incorporating G as a second symmetry

Here, we consider \(\phi, \Phi\) that are already invariant/equivariant with respect to some symmetry group \(H\), and we want to make them invariant/equivariant to \(H \times G\). A typical case is a point cloud backbone such as PointNet, which is already invariant/equivariant to permutations of the points (\(H\)), to which FA adds Euclidean symmetry (\(G\)).

Frame Average second symmetry


Assume the frame \(\mathcal{F}\) is \(H\)-invariant and \(G\)-equivariant, and let \(\tau_1, \tau_2\) denote the representations of \(H\) (with \(\rho_1, \rho_2\) those of \(G\)). Then,

  • If \(\phi:V\rightarrow\mathbb{R}\) is \(H\)-invariant, and \(\rho_1,\tau_1\) commute, then \(\langle \phi \rangle _\mathcal{F}\) is \(G\times H\)-invariant.
  • If \(\Phi:V\rightarrow W\) is \(H\)-equivariant, and \(\rho_i, \tau_i\), \(i \in \{1,2\}\), commute, then \(\langle \Phi \rangle _\mathcal{F}\) is \(G\times H\)-equivariant.

Efficient calculation of invariant frame averaging

We can derive a more efficient form of FA in the invariant case.

The stabilizer of an element \(X\in V\) is the subgroup of \(G\) defined by \(G_X =\{g\in G \mid \rho_1(g)X=X\}\). \(G_X\) naturally induces an equivalence relation \(\sim\) on \(\mathcal{F}(X)\), with \(g \sim h \iff hg^{-1} \in G_X\). The equivalence classes (orbits) are \([g] = \{h \in \mathcal{F}(X) \mid g \sim h\} = G_X g \subset \mathcal{F}(X)\), for \(g\in\mathcal{F}(X)\), and the quotient set is denoted \(\mathcal{F}(X)/G_X\).

An equivariant frame \(\mathcal{F}(X)\) is a disjoint union of equal-size orbits \([g] \in \mathcal{F}(X)/G_X\). Since \(\phi(\rho_1(g)^{-1}X)\) takes the same value for every \(g\) in an orbit, it suffices to average over one representative per orbit, as sketched below.
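Under the notation above, the invariant frame average therefore simplifies to an average over orbit representatives (our rewriting of the statement in [1]):

$$\langle \phi \rangle _\mathcal{F}(X)=\frac{1}{|\mathcal{F}(X)/G_X|}\sum_{[g]\in\mathcal{F}(X)/G_X}\phi\left(\rho_1(g)^{-1}X\right) $$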

Approximation of invariant frame averaging

When \(\mathcal{F}(X)/G_X\) is hard to enumerate, we can instead draw random elements from \(\mathcal{F}(X)\) with uniform probability.

Let \(\mathcal{F}(X)\) be an equivariant frame, and let \(g \in \mathcal{F}(X)\) be a uniform random sample. Then \([g] \in \mathcal{F}(X)/G_X\) is also uniformly distributed.

Hence, we get an efficient approximation strategy: average over uniform samples from the frame, as in the sketch below.
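In symbols, with \(g_1, …, g_N\) drawn independently and uniformly from \(\mathcal{F}(X)\) (our notation, not from [1]), the invariant frame average is approximated by

$$\langle \phi \rangle _\mathcal{F}(X)\approx\frac{1}{N}\sum_{i=1}^{N}\phi\left(\rho_1(g_i)^{-1}X\right) $$

which is an unbiased estimator of \(\langle \phi \rangle _\mathcal{F}(X)\), since a uniform sample from \(\mathcal{F}(X)\) induces a uniform sample of orbits.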

Model Instances

An instance of the FA framework contains:

  1. The symmetry group \(G\), representations \(\rho_1,\rho_2\), and the underlying frame \(\mathcal{F}\).
  2. The backbone architecture for \(\phi\) (invariant) or \(\Phi\) (equivariant).

Example: Normal Estimation with Point Cloud

Here we use normal estimation on point cloud data as an example. The goal is to incorporate \(E(d)\) equivariance into an existing backbone architecture (e.g. PointNet or DGCNN).

Mathematical Formulation

The paper proposed the frame \(\mathcal{F}(X)\) based on Principal Component Analysis (PCA). Formally, for a point cloud \(X\in\mathbb{R}^{n\times d}\), the centroid of \(X\) is

$$t=\frac{1}{n}X^T\mathbf{1}\in\mathbb{R}^d $$

and the covariance matrix, computed after centering the data, is

$$C=(X-\mathbf{1}t^T)^T(X-\mathbf{1}t^T)\in\mathbb{R}^{d\times d} $$

Then, we define the frame by

$$\mathcal{F}(X)=\{([\alpha_1v_1, …, \alpha_dv_d], t)|\alpha_i\in\{-1, 1\}\}\subset E(d) $$

where \(v_1, …, v_d\) are the eigenvectors of \(C\). For \(d=3\), this gives \(2^3=8\) frame elements per point cloud, one for each choice of the signs \(\alpha_i\). From the frame elements, we can form the representations as transformation matrices.

$$\rho(g)=\begin{pmatrix}R & t \\\mathbf{0}^T & 1\end{pmatrix}, R=\begin{bmatrix}\alpha_1v_1 &…&\alpha_dv_d\end{bmatrix} $$
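To connect this with the implementation below: if \(g=(R,t)\) acts on the point cloud row-wise as \(\rho_1(g)X=XR^T+\mathbf{1}t^T\) (the convention we assume here), then applying its inverse amounts to centering the points and rotating each of them by \(R^T\),

$$\rho_1(g)^{-1}X=(X-\mathbf{1}t^T)R $$

which is exactly the preprocessing performed in the code below.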

Implementation

We show part of the implementation of normal estimation for point clouds in PyTorch. For more details, please refer to the open-source code provided by the authors of [1].

We first perform PCA on the input point cloud; here pnts is assumed to be a batch of point clouds of shape (batch, n_points, 3).

# Compute the centroid: (batch, 1, 3)
center = pnts.mean(1, True)
# Center the points
pnts_centered = pnts - center
# Compute the covariance matrix: (batch, 3, 3)
R = torch.bmm(pnts_centered.transpose(1, 2), pnts_centered)
# Eigendecomposition of the symmetric covariance matrix
# (torch.symeig is deprecated; in recent PyTorch use torch.linalg.eigh instead)
lambdas, V_ = torch.symeig(R.detach().cpu(), True)
# Repeat the eigenvector basis for every point: (batch, n_points, 3, 3)
F = V_.to(R).unsqueeze(1).repeat(1, pnts.shape[1], 1, 1)

Then, we construct the eight frame operations by flipping the signs of the eigenvectors:

# All 2^3 sign combinations of alpha
ops = torch.tensor([
    [1, 1, 1],
    [1, 1, -1],
    [1, -1, 1],
    [1, -1, -1],
    [-1, 1, 1],
    [-1, 1, -1],
    [-1, -1, 1],
    [-1, -1, -1]
]).unsqueeze(1).to(pnts)

# Flip the signs of the eigenvectors: (batch, 8, n_points, 3, 3)
F_ops = ops.unsqueeze(0).unsqueeze(2) * F.unsqueeze(1)

Finally, we center the point cloud and transform it into each frame, feed the results through the backbone, map the outputs back with the frame operations, and average.

pnts_centered = pnts - center

# Apply the inverse of each frame operation (transpose = inverse for orthogonal matrices)
framed_input = torch.einsum('bopij,bpj->bopi', F_ops.transpose(3, 4), pnts_centered)

# Feed forward to the backbone neural network (e.g. PointNet)
outs = self.nn_forward(framed_input)

# Apply the frame operations to the backbone outputs
outs = torch.einsum('bopij,bopj->bopi', F_ops, outs)

# Average over the 8 frame operations
outs = outs.mean(1)
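As a quick sanity check, the same pipeline wrapped around an ordinary point-wise MLP should already produce rotation-equivariant outputs. The snippet below is our own self-contained sketch (not the authors' code): the helper name frame_average and the MLP backbone are ours, and it assumes a recent PyTorch with torch.linalg.eigh.

import torch

def frame_average(pnts, backbone):
    # pnts: (batch, n_points, 3); backbone: any map applied to the last dimension
    B, N, _ = pnts.shape
    center = pnts.mean(1, True)
    pnts_centered = pnts - center
    C = torch.bmm(pnts_centered.transpose(1, 2), pnts_centered)
    _, V = torch.linalg.eigh(C)                                  # eigenvectors in columns, (batch, 3, 3)
    F = V.unsqueeze(1).repeat(1, N, 1, 1)                        # (batch, n_points, 3, 3)
    signs = torch.tensor([[i, j, k] for i in (1., -1.) for j in (1., -1.) for k in (1., -1.)])
    F_ops = signs.view(1, 8, 1, 1, 3) * F.unsqueeze(1)           # (batch, 8, n_points, 3, 3)
    framed = torch.einsum('bopij,bpj->bopi', F_ops.transpose(3, 4), pnts_centered)
    outs = backbone(framed)                                      # run the backbone in every frame
    outs = torch.einsum('bopij,bopj->bopi', F_ops, outs)         # map back with each frame
    return outs.mean(1)                                          # average over the 8 frames

torch.manual_seed(0)
mlp = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3))
pnts = torch.randn(2, 128, 3)
R = torch.linalg.qr(torch.randn(3, 3)).Q                         # a random orthogonal matrix
diff = (frame_average(pnts @ R.T, mlp) - frame_average(pnts, mlp) @ R.T).abs().max()
print(diff)                                                      # should be near zero (numerical noise only)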

For a more detailed example application of frame averaging, please see the next blog post by our group: Frame Averaging for Invariant Point Cloud Classification.

[1] Omri Puny, Matan Atzmon, Heli Ben-Hamu, Ishan Misra, Aditya Grover, Edward J. Smith, and Yaron Lipman. Frame averaging for invariant and equivariant network design. In ICLR, 2022.