Categories
Research

Topology Control: Pathfinding for Genus Preservation (2/2)

SGI Fellows:
Stephanie Atherton, Marina Levay, Ualibyek Nurgulan, Shree Singhi, Erendiro Pedro

Recall (from post 1/2) that while DeepSDFs are a powerful shape representation, they are not inherently topology-preserving. By “preserving topology,” we mean maintaining a specific topological invariant throughout shape deformation. In a low-dimensional latent space (e.g., \(z_{\text{dim}} = 2\)), linear interpolation between two topologically similar shapes can easily stray into regions with a different genus, breaking that invariant.

Visualizing genera count in 2D latent space

Higher-dimensional latent spaces may offer more flexibility—potentially containing topology-preserving paths—but these are rarely straight lines. This points to a broader goal: designing interpolation strategies that remain within the manifold of valid shapes. Instead of naive linear paths, we focus on discovering homotopic trajectories—routes that stay on the learned shape manifold while preserving both geometry and topology. In this view, interpolation becomes a trajectory-planning problem: navigating from one latent point to another without crossing into regions of invalid or unstable topology.

Pathfinding in Latent Space

Suppose you are a hiker hoping to reach a point \(A\) from a point \(B\) amidst a chaotic mountain range. Now your goal is to plan your journey so that there will be minimal height change in your trajectory, i.e., you are a hiker that hates going up and down too much! Fortunately, an oracle gives us a magical device, let us call it \(f\), that can give us the exact height of any point we choose. In other words, we can query \(f(x)\) for any position \(x\), and this device is differentiable: \(f'(x)\) exists!

Metaphors aside, the problem of planning a path from a particular latent vector \(A\) to another \(B\) in the learned latent space would greatly benefit from another auxiliary network that learns the mapping from the latent vectors to the desired topological or geometric features. We will introduce a few ways to use this magical device – a simple neural network.

Gradient as the Compass

Now the best case scenario would be to stay at the same height for the whole of the journey, no? This greedy approach puts a hard constraint on the problem, but it also greatly reduces the possible space of paths to take. In our case, we would love to move in a direction that does not change our height, and this set of directions precisely forms the nullspace of the gradient.

No height change in mathematical terms means that we look for directions where the derivatives equal zero, as in

$$D_vf(x) = \nabla f(x)\cdot v = 0,$$


where the first equality is a fact of calculus and the second shows that any desirable direction \(v\) lies in \(\text{null}(\nabla f(x))\).

This does not give any definitive directions for a position \(x\) but a set of possible good directions. Then a greedy approach is to take the direction that aligns most with our relative position against the goal – the point \(B\).

Almost done, but the problem is that if we always take steps with the assumption that we are on the correct isopath, we would end up accumulating errors. So, we actually want to nudge ourselves a tiny bit along the gradient to land back on the isopath. Toward this, let \(x\) be the current position and \(\Delta x=\alpha \nabla f(x) + n\), where \(n\) is the projection of \(B-x\) to \(\text{null}(\nabla f(x))\). Then we want \(f(x+\Delta x) = f(B)\). Approximate

$$f(B) = f(x+\Delta x) \approx f(x) + \nabla f(x)\cdot \Delta x,$$

and substituting for \(\Delta x\), see that

$$f(B) \approx f(x) + \nabla f(x) \cdot (\alpha\nabla f(x) + n) = f(x) + \alpha|\nabla f(x)|^2
\quad\Longleftrightarrow\quad
\alpha \approx \frac{f(B) - f(x)}{|\nabla f(x)|^2},$$

where we used \(\nabla f(x)\cdot n = 0\), since \(n\) lies in the nullspace of the gradient.

Now we just take \(\Delta x = \frac{f(B) - f(x)}{|\nabla f(x)|^2}\nabla f(x) + \text{proj}_{\text{null}(\nabla f(x))} (B - x)\) for each position \(x\) on the path.
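
A minimal PyTorch sketch of this update rule, assuming a hypothetical differentiable auxiliary regressor `f` from latent codes to the feature of interest (genus proxy, volume, etc.); the tangential move toward \(B\) is additionally scaled by a small step size for stability, which is a slight deviation from the bare formula above.

```python
import torch

def isopath_step(x, B, f, step_size=0.05):
    # One step of the isopath walk: a correction back onto the level set f(B)
    # plus a small move toward B inside the nullspace of the gradient.
    # `f` is a hypothetical differentiable latent -> scalar regressor.
    x = x.detach().requires_grad_(True)
    fx = f(x)
    grad = torch.autograd.grad(fx, x)[0]              # \nabla f(x)
    with torch.no_grad():
        g2 = grad.dot(grad) + 1e-12
        alpha = (f(B) - fx) / g2                      # correction coefficient from above
        d = B - x
        n = d - (d.dot(grad) / g2) * grad             # projection of (B - x) onto null(grad f(x))
        n = step_size * n / (n.norm() + 1e-12)        # scaled tangential step (for stability)
        return (x + alpha * grad + n).detach()
```

Iterating this step from \(A\) until the iterate gets close to \(B\) traces an approximately level path.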

Results. Latent to expected volume.

Optimizing Vertical Laziness

Sometimes it is impossible to hope for the best case scenario. Even when \(f(A)=f(B)\), it might happen that the latent space is structured such that \(A\) and \(B\) are in different components of the level set of the function \(f\). Then there is no hope for a smooth hike without ups and downs! But the situation is not completely hopeless if we are fine with taking detours as long as such undesirables are minimized. We frame the problem as

$$\text{argmin}_{x_i \in L, \forall i} \sum_{i=1}^n |f(x_i) - f(B)|^2 + \lambda\sum_{i=1}^{n-2} |(x_{i+2}-x_{i+1}) - (x_{i+1} - x_i)|^2,
$$

where \(L\) is the latent space and \(\{x_i\}_{i=1}^n\) is the sequence defining the path such that \(x_1 = A\) and \(x_n=B\). The first term encourages consistency in the function values, while the second term discourages sudden changes in curvature, thereby ensuring smoothness. Once defined, various gradient-based optimization algorithms can be applied to this problem, which now imposes only a relatively soft constraint.
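
Here is a sketch of this soft-constraint formulation optimized with Adam, again assuming a hypothetical differentiable regressor `f`; the interior points are initialized on the straight line between \(A\) and \(B\) and the endpoints stay pinned.

```python
import torch

def optimize_path(A, B, f, n=32, n_iters=500, lam=1.0, lr=1e-2):
    # A sketch of the objective above: a value-consistency term plus a
    # curvature (second-difference) penalty, with x_1 = A and x_n = B fixed.
    ts = torch.linspace(0, 1, n).unsqueeze(1)
    pts = (1 - ts) * A + ts * B                          # straight-line initialization
    free = pts[1:-1].clone().requires_grad_(True)        # only interior points are optimized
    opt = torch.optim.Adam([free], lr=lr)
    target = f(B).detach()

    for _ in range(n_iters):
        opt.zero_grad()
        path = torch.cat([A.unsqueeze(0), free, B.unsqueeze(0)], dim=0)
        level = ((torch.stack([f(p) for p in path]) - target) ** 2).sum()   # |f(x_i) - f(B)|^2
        second_diff = path[2:] - 2 * path[1:-1] + path[:-2]                 # (x_{i+2}-x_{i+1}) - (x_{i+1}-x_i)
        (level + lam * (second_diff ** 2).sum()).backward()
        opt.step()

    return torch.cat([A.unsqueeze(0), free.detach(), B.unsqueeze(0)], dim=0)
```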

Results of pathfinding via optimization in different contexts.

Latent to expected volume.
Latent to expected genus.

A Geodesic Path

One alternative approach of note used the concept of geodesics: the shortest, smoothest paths between two points on a Riemannian manifold. In the latent space, one is really working over a latent data manifold, so, thinking geometrically, we experimented with a geodesic path algorithm. In this algorithm, we fix the two latent endpoints and optimize the points in between by minimizing the energy of the path on the output manifold. This alternative method worked similarly well to the optimized pathfinding algorithm posed above!

Conclusion and Future Work

Our journey into DeepSDFs, gradient methods, and geodesic paths has shown both the promise and the pitfalls of blending multiple specialized components. The decoupled design gave us flexibility, but also revealed a fragility—every part must share the same initialized latent space, and retraining one element (like the DeepSDF) forces a retraining of its companions, the regressor and classifier. With each component carrying its own error, the final solution sometimes felt like navigating with a slightly crooked compass.

Yet, these challenges are also opportunities. Regularization strategies such as Lipschitz constraints, eikonal terms, and Laplacian smoothing may help “tame” topological features, while alternative frameworks like diffeomorphic flows and persistent homology could provide more robust topological guarantees. The full integration of these models remains a work in progress, but we hope that this exploration sparks new ideas and gives you, the reader, a sharper sense of how topology can be preserved—and perhaps even mastered—in complex geometric learning systems.

Genus is a commonly known and widely applied shape feature in 3D computer graphics, but there is also a whole mathematical menu of topological invariants to explore along with their preservation techniques! Throughout our research, we also considered:

  • Would topological accuracy benefit from a non-differentiable algorithm (e.g. RRT) that can directly encode genus as a discrete property? How would we go about certifying such an algorithm?
  • How can homotopy continuation be used for latent space pathfinding? How will this method complement continuous deformations of homotopic and even non-homotopic shapes?
  • What are some use cases to preserve topology, and what choice of topological invariant should we pair with those cases?

We invite all readers and researchers to take into account our struggles and successes in considering the questions posed.

Acknowledgements. We thank Professor Kry for his mentorship, Daria Nogina and Yuanyuan Tao for many helpful huddles, Professor Solomon for his organization of great mentors and projects, and the sponsors of SGI for making this summer possible.

References

DeepSDF

SDFConnect

Categories
Tutorial week

Ironing Out Wrinkles with Mesh Fairing

Picture your triangle mesh as that poor, crumpled T-shirt you just pulled out of the dryer. It’s full of sharp folds, uneven patches, and you wouldn’t be caught dead wearing it in public—or running a physics simulation! In geometry processing, these “wrinkles” manifest as skinny triangles, janky angles, and noisy curvature that spoil rendering, break solvers, or just look… ugly. Sometimes, we might want to fix these problems using something geometry nerds call surface fairing.

Figure 1. (Left) Clean mesh, (Right) Mesh damaged by adding Gaussian noise to each vertex (Middle) Result after applying surface fairing to the damaged mesh.

In essence, surface fairing is an algorithm used to smooth a mesh. Surface fairing can be solved using energy minimization: we define an “energy” that measures how wrinkly our mesh is, then use optimization algorithms to slightly nudge each vertex so that the total energy drops.

  1. Spring Energy
    Think of each mesh edge \((i,j)\) as a spring with rest length 0. If a spring is too long, it costs energy. Formally:
    \(E_{\text{mem}} = \frac{1}{2} \sum_{(i,j)\in E} w_{ij}\,\|x_i - x_j\|^2\), with weights \(w_{ij}\) uniform or derived from entries of the discrete Laplacian matrix (using cotangents of opposite angles). We’ll call this term the Membrane Term.
  2. Bending Energy
    Sometimes, we might want to additionally smooth the mesh even if it looks less wrinkly. In this case, we penalize the curvature of the mesh, i.e., the rate at which normals change at the vertices:
    \(E_{\text{bend}} = \frac{1}{2} \sum_i A_i\,\|\Delta x_i\|^2\), where \(\Delta x_i\) is the discrete Laplacian at vertex \(i\) and \(A_i\) is its “vertex area”. For the uninitiated, the idea of a vertex having an area might seem a little weird. But it represents how much area a vertex controls and is often calculated as one-third of the summed areas of triangles incident to vertex \(i\). We’ll call this energy term the Laplacian Term.
  3. Angle Energy
    As Nicholas has warned the fellows, sometimes having long, skinny triangles can cause numerical instability. So we might like our triangles to be more equilateral. We can additionally punish deviations from 60° for each triangle by adding another energy term, summed over all triangles: \(E_{\text{angle}} = \sum_{k=1}^{3} (\theta_k - 60°)^2\). We’ll call this the Angle Term.
Figure 2. (Left) Damaged mesh. (In order) Surface fairing using only 1) Spring Energy 2) Bending Energy 3) Angle Energy. Notice how the last two energy minimizations fail to smooth the mesh properly alone.

Note that in most cases using the angle energy and bending energy terms is optional, but using the spring energy term is important! (However, you may encounter a few special cases where you can make do without the spring energy; I’m not too sure, so don’t take my word for it.) Once we have computed these energies, we weight them appropriately with scaling factors and sum them up. The next step is to minimize them. I love Machine Learning and hate numerical solvers; and so, it brings me immense joy to inform you that since the problem is non-convex, we can use gradient descent! At each vertex, compute the gradient \(\nabla_{x} E\) and take a tiny step toward the minimum: \(x_{i} \leftarrow x_i - \lambda \cdot (\nabla_{x} E)_{i}\)
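
To make the gradient step concrete, here is a minimal numpy sketch of the membrane (spring) term with uniform weights; the bending and angle terms would contribute to the same gradient in an analogous way. This is an illustrative sketch, not the code from the repository linked below.

```python
import numpy as np

def fairing_step(V, E, step=1e-2):
    # One gradient-descent step on the uniform-weight spring energy
    # E = 1/2 * sum_{(i,j) in edges} ||x_i - x_j||^2.
    grad = np.zeros_like(V)
    d = V[E[:, 0]] - V[E[:, 1]]          # per-edge "stretch" x_i - x_j
    np.add.at(grad, E[:, 0],  d)         # dE/dx_i gains +(x_i - x_j) from each incident edge
    np.add.at(grad, E[:, 1], -d)         # dE/dx_j gains -(x_i - x_j)
    return V - step * grad

# Hypothetical usage: V is an (n, 3) vertex array, E an (m, 2) integer array of edges.
# for _ in range(200):
#     V = fairing_step(V, E)
```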

Now, have a look at our final surface fairing algorithm in its fullest glory. Isn’t it beautiful? Maybe you and I are nerds after all 🙂

Figure 3. (Left) Clean mesh, (Right) Damaged mesh (Middle) Resulting mesh after each step of gradient descent on the three energy terms combined. Note how the triangles in this case are more equilateral than the red mesh in Figure 2, because we’ve combined all energy terms.

So, the next time your mesh looks like you’ve tossed a paper airplane into a blender, remember: with a bit of math and a few iterations, you can make it runway-ready; and your mesh-processing algorithm might just thank you for sparing its life. You can find the code for the blog at github.com/ShreeSinghi/sgi_mesh_fairing_blog.

Happy geometry processing!

References

  • Desbrun, M., Meyer, M., Schröder, P., & Barr, A. H. (1999). Implicit fairing of irregular meshes using diffusion and curvature flow.
  • Pinkall, U., & Polthier, K. (1993). Computing discrete minimal surfaces and their conjugates.

Categories
Research

Topology Control: Training a DeepSDF (1/2)

SGI Fellows:
Stephanie Atherton, Marina Levay, Ualibyek Nurgulan, Shree Singhi, Erendiro Pedro

In the Topology Control project mentored by Professor Paul Kry and project assistants Daria Nogina and Yuanyuan Tao, we sought to explore preserving topological invariants of meshes within the framework of DeepSDFs. Deep Signed Distance Functions are a neural implicit representation used for shapes in geometry processing, but they don’t come with the promise of respecting topology. After finishing our ML pipeline, we explored various topology-preserving techniques through our simple, initial case of deforming a “donut” (torus) into a mug.

DeepSDFs

Signed Distance Field (SDF) representation of a 3D bunny. The network predicts the signed distance from each spatial point to the surface. Source: (Park et al., 2019).

Signed Distance Functions (SDFs) return the shortest distance from any point in 3D space to the surface of an object. Their sign indicates spatial relation: negative if the point lies inside, positive if outside. The surface itself is defined implicitly as the zero-level set: the locus where \(\text{SDF}(x) = 0 \).

In 2019, Park et al. introduced DeepSDF, the first method to learn a continuous SDF directly using a deep neural network (Park et al., 2019). Given a shape-specific latent code \( z \in \mathbb{R}^d \) and a 3D point \( x \in \mathbb{R}^3 \), the network learns a continuous mapping:

$$
f_\theta(z_i, x) \approx \text{SDF}^i(x),
$$

where \( f_\theta \) takes a latent code \( z_i \) and a 3D query point \( x \) and returns an approximate signed distance.

The training set is defined as:

$$
X := \{(x, s) : \text{SDF}(x) = s\}.
$$

Training minimizes the clamped L1 loss between predicted and true distances:

$$
\mathcal{L}\bigl(f_\theta(x), s\bigr)
= \bigl|\text{clamp}\bigl(f_\theta(x), \delta\bigr) - \text{clamp}(s, \delta)\bigr|
$$

with the clamping function:

$$
\text{clamp}(x, \delta) = \min\bigl(\delta, \max(-\delta, x)\bigr).
$$

Clamping focuses the loss near the surface, where accuracy matters most. The parameter \( \delta \) sets the active range.

This is trained on a dataset of 3D point samples and corresponding signed distances. Each shape in the training set is assigned a unique latent vector \( z_i \), allowing the model to generalize across multiple shapes.

Once trained, the network defines an implicit surface through its decision boundary, precisely where \( f_\theta(z, x) = 0 \). This continuous representation allows smooth shape interpolation, high-resolution reconstruction, and editing directly in latent space.
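
To make the setup concrete, here is a minimal PyTorch sketch of an auto-decoder in this spirit: a table of per-shape latent codes, a small MLP over \([z, x]\), and the clamped L1 loss from above. Layer sizes and training details are illustrative, not our exact configuration.

```python
import torch
import torch.nn as nn

class DeepSDFDecoder(nn.Module):
    def __init__(self, n_shapes=2, latent_dim=16, hidden=128):
        super().__init__()
        # One learnable latent code z_i per training shape (the auto-decoder setup).
        self.latents = nn.Embedding(n_shapes, latent_dim)
        nn.init.normal_(self.latents.weight, mean=0.0, std=0.01)
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, shape_idx, x):
        z = self.latents(shape_idx)                   # (B, latent_dim)
        return self.mlp(torch.cat([z, x], dim=-1))    # (B, 1) predicted signed distance

def clamped_l1(pred, target, delta=1.0):
    # Clamped L1 loss: only distances within +/- delta of the surface matter.
    clamp = lambda t: torch.clamp(t, -delta, delta)
    return (clamp(pred) - clamp(target)).abs().mean()
```

Both the network weights and the latent codes are optimized jointly against this loss, with separate learning rates as in the parameter table below.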

Training Field Notes

We sampled training data from two meshes, torus.obj and mug.obj, using a mix of blue-noise points near the surface and uniform samples within a unit cube. All shapes were volume-normalized to ensure consistent interpolation.

DeepSDF is designed to intentionally overfit. Validation is typically skipped. Effective training depends on a few factors: point sample density, network size, shape complexity, and sufficient epochs.

After training, the implicit surface can be extracted using Marching Cubes or Marching Tetrahedra to obtain a polygonal mesh from the zero-level set.

Training Parameters

  • SDF Delta: 1.0
  • Latent Mean: 0.0
  • Latent SD: 0.01
  • Loss Function: Clamped L1
  • Optimizer: Adam
  • Network Learning Rate: 0.001
  • Latent Learning Rate: 0.01
  • Batch Size: 2
  • Epochs: 5000
  • Max Points per Shape: 3000

Network Architecture

  • Latent Dimension: 16
  • Hidden Layer Size: 124
  • Number of Layers: 8
  • Input Coordinate Dim: 3
  • Dropout: 0.0

Point Cloud Sampling

  • Radius: 0.02
  • Sigma: 0.02
  • Mu: 0.0
  • Number of Gaussians: 10
  • Uniform Samples: 5000

For higher shape complexity, increasing the latent dimension or training duration improves reconstruction fidelity.

Latent Space Interpolation

One compelling application is interpolation in latent space. By linearly blending between two shape codes \( z_a \) and \( z_b \), we generate new shapes along the path

$$
z(t) = (1 - t) \cdot z_a + t \cdot z_b,\quad t \in [0,1].
$$
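
A sketch of generating in-between shapes along this path, assuming a decoder `model(z, x)` in the spirit of the network above and using scikit-image's Marching Cubes for the extraction step:

```python
import torch
from skimage.measure import marching_cubes

def interpolate_and_extract(model, z_a, z_b, t, res=64, bound=1.0):
    # Blend two latent codes, evaluate the decoder on a res^3 grid inside
    # [-bound, bound]^3, and extract the zero level set as a triangle mesh.
    z = (1 - t) * z_a + t * z_b
    lin = torch.linspace(-bound, bound, res)
    grid = torch.stack(torch.meshgrid(lin, lin, lin, indexing="ij"), dim=-1).reshape(-1, 3)
    with torch.no_grad():
        sdf = model(z.expand(grid.shape[0], -1), grid).reshape(res, res, res).numpy()
    verts, faces, _, _ = marching_cubes(sdf, level=0.0)
    verts = verts / (res - 1) * 2 * bound - bound     # voxel indices back to world coordinates
    return verts, faces
```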

Latent space interpolation between mug and torus.

While DeepSDF enables smooth morphing between shapes, it exposes a core limitation: a lack of topological consistency. Even when the source and target shapes share the same genus, interpolated shapes can exhibit unintended holes, handles, or disconnected components. These are not mere artifacts; they reveal that the model has no built-in notion of topology.

However, this limitation also opens the door for deeper exploration. If neural fields like DeepSDF lack an inherent understanding of topology, how can we guide them toward preserving it? In the next post, we explore a fundamental topological property—genus—and how maintaining it during shape transitions could lead us toward more structurally meaningful interpolations.

References

  • Park, J. J., Florence, P., Straub, J., Newcombe, R., & Lovegrove, S. (2019). DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. CVPR.

Categories
Research

Quasi-harmonic Geodesic Distance

I kicked off this project with my advisor, Yu Wang, over Zoom the weekend before our official start, checked in mid-week to ask questions and provide a progress report, and wrapped up the week with a final Zoom review—each meeting improving my understanding of the fundamental problem and how to improve my approach in Python. I’m excited to share these quasi-harmonic ideas of how a blend of PDE insight and mesh-based propagation can yield both speed and exactness in geodesic computation.

Finding the shortest path on a curved surface—called the geodesic distance—is deceptively challenging. Exact methods track how a wavefront would travel across every triangle of the mesh, which is accurate but can be painfully slow on complex shapes. The Heat Method offers a clever shortcut: imagine dropping a bit of heat at your source point, let it diffuse for a moment, then “read” the resulting temperature field to infer distances. By solving two linear systems—one for the heat spread and one to recover a distance‐like potential—you get a fast, global approximation. It runs in near-linear time and parallelizes beautifully, but it can smooth over sharp creases and slightly misestimate in highly curved regions.
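
The basic Heat Method is compact enough to sketch. The version below is written against the libigl Python bindings and scipy, and is only an illustration of the standard algorithm (Crane et al.), not the hybrid barrier-edge scheme described next; sign conventions may need adjusting for other operator libraries.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import igl   # libigl Python bindings

def heat_geodesic(V, F, src):
    n = V.shape[0]
    L = igl.cotmatrix(V, F)                                    # cotan Laplacian (negative semi-definite)
    M = igl.massmatrix(V, F, igl.MASSMATRIX_TYPE_VORONOI)      # lumped vertex areas
    G = igl.grad(V, F)                                         # 3|F| x |V| gradient operator
    dblA = igl.doublearea(V, F)
    t = np.mean(np.linalg.norm(V[F[:, [1, 2, 0]]] - V[F], axis=2)) ** 2   # (mean edge length)^2

    # 1) Diffuse a spike of heat from the source for a short time t.
    u0 = np.zeros(n); u0[src] = 1.0
    u = spla.spsolve((M - t * L).tocsc(), u0)

    # 2) Normalize the gradient per face: X = -grad(u) / |grad(u)|.
    gu = (G @ u).reshape(F.shape[0], 3, order="F")
    X = -gu / (np.linalg.norm(gu, axis=1, keepdims=True) + 1e-12)

    # 3) Recover a distance-like potential by solving L * phi = div(X)
    #    (a tiny diagonal shift regularizes the singular Neumann system).
    Div = -0.25 * G.T @ sp.diags(np.tile(dblA, 3))
    phi = spla.spsolve((L - 1e-9 * sp.eye(n)).tocsc(), Div @ X.reshape(-1, order="F"))
    return phi - phi.min()
```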

To sharpen accuracy where it matters, I adopted a hybrid strategy. First, detect “barrier” edges—those sharp creases where two faces meet at a steep dihedral angle—and temporarily slice the mesh along them. Then apply the Heat Method independently on each nearly‐flat patch, pinning the values along the cut edges to their true geodesic distances. Finally, stitch everything back together by running a precise propagation step only along those barrier edges. The result is a distance field that retains the Heat Method’s speed and scalability on smooth regions, yet achieves exactness along critical creases. It’s remarkable how something as seemingly simple as measuring distance on a surface can lead into rich territory—mixing partial-differential equations, sparse linear algebra, and discrete geometry in pursuit of both efficiency and precision.

Categories
Research

From Calculus to \(k\)-forms

Differential geometry makes heavy use of calculus in order to study geometric objects such as smooth manifolds. Just as differential geometry builds upon calculus and other fundamentals, the notion of differential forms extends the approaches of differential geometry and calculus, even allowing us to study the larger class of surfaces that are not necessarily smooth. We will motivate from calculus a brief brush with differential forms and their geometric potential.

Review of Vector Fields

Recall that a vector field is a special case of a multivariable vector-valued function: it takes in a point (a scalar or a vector) and outputs a multidimensional vector. Often vector fields are defined in \(\mathbb{R}^2\) or \(\mathbb{R}^3\), but they can also be defined more generally on smooth manifolds.

In three dimensional space, a vector field is a function \(V\) that assigns to each point \((x, y, z)\) a three dimensional vector given by \(V(x, y, z)\). Consider our identity function

$$ V(x, y, z) = \begin{bmatrix}x\\y\\z \end{bmatrix}, $$

where for each point \((x, y, z)\) we associate a vector with those same coordinates. Generalizing, for any vector function \(V(x, y, z)\), we visualize its vector field by placing the tail of the corresponding vector \(V(x_0, y_0, z_0)\) at its point \((x_0, y_0, z_0)\) in the domain of \(V(x, y, z)\).

Here we consider fluid particles (displayed as blue dots) as points, each with its own magnitude, direction, and velocity that correspond to overall fluid flow. Then to each point in our sample of particles we can assign a vector with speed of velocity as its length. Each arrow not only represents the velocity of the individual particle it is attached to, but also takes into account how the neighborhood of particles around it move.

Scalar fields can be transformed into vector fields via \(\nabla\), a vector differential operator known as the gradient operator.

Our gradient points in the direction of steepest ascent of a scalar function \(f(x_1, x_2, x_3, . . . x_n)\), indicating the direction of greatest change of \(f\). A vector field \(V\) over an open set \(S\) is a gradient field if there exists a differentiable real-valued function (the scalar field) \(f\) on \(S\). Then

$$V = \nabla f = ( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}, . . . \frac{\partial f}{\partial x_n})$$

is a special type of vector field \(V\) defined as the gradient of a scalar function, where each \( v \in V\) is the gradient of \(f\) at that point.

The Divergence and Curl Operators

Now that we have reviewed vector fields, recall that we can also perform operations like divergence and curl over them. When fluid flows over time, some regions of fluid will become less dense with particles as they flow away (negative divergence). Meanwhile, particles tending towards each other will cause the fluid in that region to be more dense (positive divergence).

Luckily, we have our handy divergence operator to take in the vector-valued function of our vector field and output a scalar-valued function measuring the change in density of the fluid around each point. Given a vector field \(V\), we can think of divergence

$$\text{div}\space V = \nabla \cdot V = \frac{\partial V_1}{\partial x} + \frac{\partial V_2}{\partial y}+ \frac{\partial V_3}{\partial z} + \cdots $$

as a variation of the derivative that generalizes to vector fields \(V\) of any dimension. Another “trick” to thinking about the divergence formula is as the trace of the Jacobian matrix of the vector field.

In differential geometry, one can also define divergence on a Riemannian manifold \((M, g)\), a smooth manifold \(M\) endowed with the Riemannian metric \(g\) as a choice of inner product for each tangent space of \(M\).

The Riemannian metric allows us to take the inner product of the two vectors tangent to the sphere.
Every smooth manifold admits a Riemannian metric.

Whereas divergence describes changes in density, curl measures the “rotation”, or angular momentum, of the vector field \(V\). Likewise, the curl operator takes in a vector-valued function and outputs a function that describes the flow of rotation given by the vector field at each point.

Here, all the points go in circles in the counterclockwise direction, which we define by \(V(x, y) = \begin{bmatrix} -y \\ x \end{bmatrix} = -y\hat{i} + x\hat{j}\).

For a vector field function

$$V(x, y) = v_1(x, y)\hat{i} + v_2(x, y)\hat{j},$$

we can look at the change of a point \((x_0, y_0)\) by looking at the change in \(v_1\) as \(y\) changes and \(v_2\) as \(x\) changes. We calculate

$$\text{2d curl} \space V = \nabla \times V = \frac{\partial v_2}{\partial x}(x_0, y_0) - \frac{\partial v_1}{\partial y}(x_0, y_0),$$

with a positive evaluation indicating counterclockwise rotation around \((x_0, y_0)\) and a negative evaluation indicating clockwise.

Approach by Differential Forms

Vector calculus can also be viewed more generally and geometrically via the theory of differential forms.

While curl is typically only defined in two and three dimensions, it can be generalized in the context of differential forms. The calculus of differential \(k\)-forms offers a unified approach to define integrands over curves, surfaces, and higher-dimensional manifolds. Differential 1-forms of expression form

$$\omega = f(x)dx$$

are defined on an open interval \( U \subseteq \mathbb{R}\) with \(f: U \to \mathbb{R}\) a smooth function. We can have 1-forms

$$\omega = f(x, y)dx + g(x, y)dy$$

defined on an open subset \( U \subseteq \mathbb{R^2}\) ,

$$\omega = f(x, y, z)dx + g(x, y, z)dy + h(x, y, z)dz $$

defined on an open subset \( U \subseteq \mathbb{R^3}\), and so on for any positive integer \(n\).

Correspondence Theorem: Given a 1-form \(\omega = f_1dx_1 + f_2dx_2 + . . . f_ndx_n\), we can define an associated vector field with component functions \((f_1, f_2, . . . f_n)\). Conversely, given a smooth vector field \(V = (f_1, f_2, . . . f_n)\), we can define an associated 1-form \(\omega = f_1dx_1 + f_2dx_2 + . . . f_ndx_n\).

Via the exterior derivative and an extension of the correspondence between 1-forms and vector fields to arbitrary \(k\)-forms, we can generalize the well-known operators of vector calculus.

The exterior derivative \(d\omega\) of a \(k\)-form is simply a \(k + 1\)-form. We provide an example of its use in defining curl:

Let \(\omega = f(x, y, z)dx + g(x, y, z)dy + h(x, y, z)dz\) be a 1-form on \( U \subseteq \mathbb{R}^3\) with its associated vector field \(V = (f, g, h)\). Its exterior derivative \(d\omega\) is the 2-form

$$d\omega = (\frac{\partial h}{\partial y} - \frac{\partial g}{\partial z}) dy \wedge dz + (\frac{\partial f}{\partial z} - \frac{\partial h}{\partial x}) dz \wedge dx + (\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}) dx \wedge dy,$$

where \(\wedge\) is an associative operation known as the exterior product. We define the curl of \(V\)

$$ \nabla \times V = \left(\frac{\partial h}{\partial y} - \frac{\partial g}{\partial z},\; \frac{\partial f}{\partial z} - \frac{\partial h}{\partial x},\; \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right)$$

as the vector field associated to the 2-form \(d\omega\). Check that this definition of curl agrees with the traditional version; unlike the traditional version, it can also be applied in higher dimensions.
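
As a quick sanity check, the three coefficients of \(d\omega\) can be computed symbolically for the rotating field from earlier, extended to 3D as \(V = (-y, x, 0)\); this small sympy snippet is illustrative and not part of the original discussion.

```python
import sympy

x, y, z = sympy.symbols("x y z")
f, g, h = -y, x, sympy.Integer(0)   # omega = f dx + g dy + h dz for V = (-y, x, 0)

# Coefficients of d(omega), which are exactly the components of curl V.
curl = (sympy.diff(h, y) - sympy.diff(g, z),
        sympy.diff(f, z) - sympy.diff(h, x),
        sympy.diff(g, x) - sympy.diff(f, y))
print(curl)   # (0, 0, 2): constant counterclockwise rotation about the z-axis
```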

The natural pairing between vector fields and 1-forms allows for the modern approach of \(k\)-forms for multivariable calculus and has also heavily influenced the area of geometric measure theory (GMT), where we define currents as functionals on the space of \(k\)-forms. Via currents, we have the impressive result that a surface can be uniquely defined by the result of integrating differential forms over it. Further, the differential forms fundamental to GMT open up our toolbox to studying geometry that is not necessarily smooth!

Acknowledgements. We thank Professor Chien, Erick Jiminez Berumen, and Matheus Araujo for exposing us to new math, questions, and insights, as well as for their mentorship and dedication throughout the “Developing a Medial Axis for Measures” project. We also acknowledge Professor Solomon for suggesting edits incorporated into this post, as well as for his organization of the wonderful summer program in the first place!

References

Categories
Research

A Derivation of the Force Density Method

Thanks to Ioannis Mirtsopoulos, Elshadai Tegegn, Diana Avila, Jorge Rico Esteban, and Tinsae Tadesse for invaluable assistance with this article.

A truss is a collection of rigid beams connected together with joints which can rotate, but not slip by each other. Trusses are used throughout engineering, architecture, and other structural design; you’ll see them in railway bridges, supporting ceilings, and forming the foundation for many older roller coasters. When designing trusses, it is often necessary to design for static equilibrium; that is, such that the forces at each joint in the truss sum to zero, and the truss does not move. One of the most effective ways to accomplish this is by parameterizing our graphs through the ratios between the axial force along each edge and the length of that edge. Our goal for this article is to show that these force-densities parametrize graphs in static equilibrium, subject to certain constraints.

The first constraint we will place upon our system is topological. Recall that a graph is a collection of vertices, together with a collection of edges linking pairs of those vertices. We will fix, once and for all, a graph \(G\) to represent the topology of our truss. We will also designate some of the vertices of our graph as fixed vertices and the rest as free vertices. The fixed vertices represent vertices which are attached to some solid medium external to the truss, and so experience normal forces which cancel out any forces which are present from the loading of the truss. In addition to these fixed vertices, to each free vertex we assign a load, which represents the force placed on that point.

First we introduce the incidence matrix of a (directed) graph. Recall that a directed graph is a graph where the edges have a distinguished ‘first’ vertex; that is, the edges are represented by ordered pairs rather than sets. Given such a graph, the incidence matrix of the graph is the matrix having in the \(i,j\)-th position a 1 if the \(i\)-th edge begins with the \(j\)-th vertex, a \(-1\) if the \(i\)-th edge ends with the \(j\)-th vertex, and a 0 otherwise. We split the entire incidence matrix \(A\) of our graph \(G\) into two matrices, \(C\) and \(C_f\), where \(C\) consists of the columns of \(A\) which correspond to free vertices, and \(C_f\) consists of those columns corresponding to fixed vertices. We will also work with the vectors \(p_x\) and \(p_y\) of forces, and the vectors \(x_f\) and \(y_f\) of equilibrium points. We wish to construct a map \[F:\mathscr Q \to \mathscr C\]
where \(\mathscr Q\) is the space of vectors \(q = (q_1, …, q_n)\) of force-length ratios, and \(\mathscr C\) is the space of all possible lists of coordinate pairs \((x, y) = (x_1, …, x_n, y_1, …, y_n)\). It turns out that the map we want is \[F(q) = \left(D^{-1}(p_x - D_fx_f), D^{-1}(p_y - D_fy_f)\right)\]
where we set \(D = C^TQC, D_f = C^TQC_f\), and \(Q\) is the diagonal matrix with the \(q_i\) along the diagonal. Clearly, this map is a rational map in the \(q_i\), and is regular away from where \(\det(D) = 0\) (when viewed as a function of \(q\)).
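
To make the map \(F\) concrete, here is a minimal numpy sketch following the definitions above; the tiny example at the end (a single free vertex hung between two fixed supports under a unit downward load) is hypothetical.

```python
import numpy as np

def fdm(C, Cf, q, px, py, xf, yf):
    # F(q): equilibrium coordinates of the free vertices for force densities q.
    Q = np.diag(q)
    D, Df = C.T @ Q @ C, C.T @ Q @ Cf
    x = np.linalg.solve(D, px - Df @ xf)
    y = np.linalg.solve(D, py - Df @ yf)
    return x, y

# One free vertex connected to two fixed supports at (-1, 0) and (1, 0).
C  = np.array([[1.0], [1.0]])              # incidence columns for the free vertex
Cf = np.array([[-1.0, 0.0], [0.0, -1.0]])  # incidence columns for the fixed vertices
q  = np.array([2.0, 2.0])                  # force densities on the two edges
xf, yf = np.array([-1.0, 1.0]), np.array([0.0, 0.0])
px, py = np.array([0.0]), np.array([-1.0]) # unit downward load on the free vertex
print(fdm(C, Cf, q, px, py, xf, yf))       # (array([0.]), array([-0.25]))
```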

Derivation

We suppose we have a system with free coordinate vectors \((x, y) = (x_1, …, x_n, y_1, …, y_n)\) and fixed coordinate vectors \((x_f, y_f) = (x_{1f}, …, y_{1f}, …)\) which is in static equilibrium. First note that we can easily obtain the vectors of coordinate differences \(u\) and \(v\) (in the \(x\) and \(y\) directions respectively) using the incidence matrices \(C\) and \(C_f\).
\[u = Cx + C_fx_f\]
\[v = Cy + C_fy_f\]
We form the diagonal matrices \(U\) and \(V\) corresponding to \(u\) and \(v\), letting \(L\) be the matrix with the lengths of the edges down the diagonal and \(s\) the vector of axial forces along each edge. We note that if we divide the axial force along each edge by that edge’s length, and then multiply by the length in the \(x\) or \(y\) direction, we have then calculated the force that edge contributes to each vertex at its ends. Multiplying on the left by the transpose of the incidence matrix \(C\) then gives the total force on the vertex from all the bars coincident with it, which must then be equal to the external load applied. This yields the equations
\[C^TUL^{-1}s = p_x,\ \ \ C^TVL^{-1}s = p_y,\]
which hold if and only if the system is in equilibrium. We define \(q=L^{-1}s\) to be the vector of force densities, simplifying our equations to
\[C^TUq = p_x,\ \ \ C^TVq = p_y.\]
We note that \(Uq = Qu\) and \(Vq=Qv\), by the definition of matrix multiplication (each matrix is a diagonal matrix!). Thus we obtain by the definitions of \(u\) and \(v\)
\[p_x = C^TUq = C^TQ(Cx+C_fx_f) = C^TQCx+C^TQC_fx_f, \]
\[p_y = C^TVq = C^TQ(Cy+C_fy_f) = C^TQCy+C^TQC_fy_f.\]
We set, as above, \(D = C^TQC\) and \(D_f = C^TQC_f\), yielding
\[p_x - D_fx_f= Dx , \]
\[p_y - D_fy_f= Dy ,\]
or as promised
\[D^{-1}(p_x - D_fx_f)=x ,\]
\[D^{-1}(p_y - D_fy_f)=y .\]
Clearly these equations hold if and only if the system is in equilibrium. Since every system (in equilibrium or not) has force densities, this yields an important corollary: Every system which is in equilibrium is of the form \(F(q)\) for some choice of force-density ratios \(q\), and no system which is not in equilibrium is of such a form.

It turns out to be incredibly useful to have this method to parametrize the set of all graphs in static equilibrium. This method is used to calculate the deformation of nets and cloth and to design load-bearing trusses; it is also used to generate trusses based on particular design parameters. In a later article we will also describe how one can use FDM to find an explicit representation for the set of possible point configurations which are in static equilibrium.

Categories
Tutorial week

Tutorial Week Day 4: Texture Optimization

Intro

Now that you’ve gotten the grasp of meshes, and created your very first 3D object on Polyscope, what can we do next? On Day 4 of the SGI Tutorial Week, we had speakers Richard Liu and Dale Decatur from the Threedle Lab at UChicago explaining about Texture Optimization.

In animated movies and video games, we are used to seeing objects with realistic textures. Intuitively, we might imagine this texture as a flat “sticker” map applied over the object. However, we know from some well-known examples in cartography that it is difficult to create a 2D map of a 3D shape like the Earth without introducing distortions or cuts.

Besides, ideally, we would want to be able to transition from the 2D to the 3D map interchangeably. Framing this 2D map as a function on \(\mathbb{R^2}\), we would like it to be bijective.

The process of creating a bijective map from a mesh S in \(\mathbb{R^3}\) to a given domain \(\Omega\), or \(F: \Omega \leftrightarrow S\), is known as mesh parametrization.

UV Unwrapping

Consider a mesh defined by the object as \(M = \{V, E, F\}\), with Vertices, Edges and Faces respectively, where:

\begin{equation*} V \subset \mathbb{R}^3 \end{equation*} \begin{align*} E = \{(i, j) \quad &| \text{\quad vertex i is connected to vertex j}\} \\ F = \{(i, j, k) \quad &| \quad \text{vertex i, j, k share a face}\} \end{align*}

For a mesh M, we define the parametrization as a function mapping vertices \(V\) to some domain \(\Omega\)

\begin{equation*} U: V \subset \mathbb{R}^3 \rightarrow \Omega \subset \mathbb{R}^2 \end{equation*}

For every vertex \(v_i\) we assign a 2D coordinate \(U(v_i) = (u_i, v_i)\) (this is why the method is called UV unwrapping).

This parametrization has a constraint, however. Only disk-topology surfaces (those with 1 boundary loop) admit a homeomorphism (a continuous and bijective map) to the bounded 2D plane.

Source: Richard Liu

As a workaround, we can always introduce a cut.

Source: Richard Liu

However, these seams create discontinuities, and in turn, distortions we want to minimize. We can distinguish different kinds of distortions using the Jacobian, the matrix of all first-order partial derivatives of a vector-valued function. Specifically, we look at the singular values of the Jacobian (which in the UV function is defined as a 2×3 matrix per triangle).

\begin{align*} J_f = U\Sigma V^T = U \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{pmatrix} V^T \\ f \text{ is equiareal or area preserving} \Leftrightarrow \sigma_1 \sigma_2 &= 1 \\ f \text{ is conformal or angle preserving} \Leftrightarrow \sigma_1 &= \sigma_2 \\ f \text{ is isometric or length preserving} \Leftrightarrow \sigma_1 &= \sigma_2 = 1 \end{align*}

We can find the Jacobians of the UV map assignment \(U\) using the mesh gradient operator \(G \in \mathbb{R}^{3|F|\times |V|}\)

\begin{align*} GU = J \in \mathbb{R}^{3|F|\times 2} \end{align*}

The classical approach for optimizing a mesh parametrization is viewing the mesh surface as the graph of a mass-spring system (least squares optimization problem). The mesh vertices are the nodes (with equal mass) with each edge acting as a spring with some stiffness constant.

Source: Richard Liu

Two key design choices in setting up this problem are:

  • Where to put the boundary?
  • How to set the spring weights (stiffness)?

To compute the parametrization, we minimize the sum of spring potential energies.

\begin{align*} \min_U \sum_{\{i, j\} \in E} w_{ij} \| u_i - u_j\|^2 \end{align*}

Where \(u_i = (u_i, v_i)\) is the UV coordinate of vertex \(i\). We then set the derivative of the energy with respect to each \(u_i\) to zero:

\begin{align*} \frac{\partial E}{\partial u_i }= \sum_{j \in N_i} 2 \cdot w_{ij} (u_i - u_j) = 0 \end{align*}

Where \(N_i\) indexes all neighbors of vertex \(i\). We can now fix the boundary vertices and solve each equation for the remaining \(u_i\).

Tutte (1963) and Floater (1997) showed that if the boundary is fixed to a convex polygon and the weights are positive, the solution is guaranteed to be injective. While an injective map has its limitations in terms of transferring the texture back onto the 3D shape, it allows us to easily visualize the 2D map of this shape.

Here, we fix (map) the boundary to a circle and set all weights to 1, as sketched below.
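
A minimal sketch of that construction (boundary mapped to a circle, all weights equal to 1), written against the libigl Python bindings and scipy; this is illustrative code, not code from the lecture.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import igl

def tutte_embedding(V, F):
    b = igl.boundary_loop(F)                                  # ordered boundary vertex indices
    theta = np.linspace(0, 2 * np.pi, len(b), endpoint=False)
    UV_b = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # pin the boundary to the unit circle

    n = V.shape[0]
    A = igl.adjacency_matrix(F).astype(float)                 # uniform weights w_ij = 1
    Lap = (sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A).tocsr()

    interior = np.setdiff1d(np.arange(n), b)
    L_ii = Lap[interior][:, interior]
    L_ib = Lap[interior][:, b]
    rhs = -np.asarray(L_ib @ UV_b)                            # move the pinned boundary to the RHS

    UV = np.zeros((n, 2))
    UV[b] = UV_b
    solve = spla.factorized(L_ii.tocsc())                     # solve the zero-gradient equations
    UV[interior, 0] = solve(rhs[:, 0])
    UV[interior, 1] = solve(rhs[:, 1])
    return UV
```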

An alternative to the classical approach is trivial parametrization, where we split the mesh up into its individual triangles (a triangle soup) and then flatten each triangle isometrically (no distortion within a single triangle). We would then translate and scale all triangles to be packed into the unit square.

Source: Richard Liu

Translation and uniform scaling do not affect the angles, so we have no angle distortion. This tells us that this parametrization approach is conformal.

Still, when packing triangles into the texture image, we might need to rescale them so they fit, meaning the areas are not guaranteed to be the same. We can verify this based on the singular values:

\begin{align*} \sigma_1\sigma_2 \neq 1 \end{align*}

This won’t be too relevant in the end, though, because all triangles get scaled by the same factor, so the area of each triangle is preserved up to some global scale factor. As a result, we would still get an even sampling of the surface, which is what we care about, at least intuitively, for texture mapping.

However, applying a classical algorithm to trivial parametrization is typically a bad solution because it introduces maximal cuts (discontinuities). How can we overcome this issue?

Parametrization with Neural Networks

Assume we have some parametrized function and some target value. With an adaptive learning algorithm, we determine the direction in which to change our parameters so that the function moves closer to our target.

We compute some loss or measure of closeness between our current function evaluation and our target value. We then compute the gradient of this loss with respect to our parameters and take a small step in that direction. We repeat this process until, ideally, we reach the global minimum.

Source (left) and Source (right)

Similarly, we can use an approach to optimize the mesh texture.

  • Parametric function = rendering function
  • Parameters = texture map pixel values (texel = texture pixel)
  • Target = existing images of the object with the desired colors
  • Loss = pixel differences between our rendered images and target images.

What happens if we do this? We do achieve a texture map, but each pixel is mostly optimized independently. As a result, the texture ends up more grainy.

Source: Dale Decatur

Improving Smoothness

What can we do to improve this? Well, we would want to make the function more smooth. For this, we can optimize a new function \(\phi(u, v) : \mathbb{R}^2 \rightarrow \mathbb{R}^2\) that maps the texels to RGB color values. By smoothness in \(\phi\), we mean that \(\phi\) is continuous.

To get the color at any point in the 2D texture map, we just need to query \(\phi\) at that UV coordinate!

Source: Dale Decatur

To encourage smoothness in \(\phi\), we can use a 3D coordinate network.


Tancik et al., CVPR 2021

Consider a mapping from a 3D surface to a 2D texture. We can learn a coordinate network \(\phi: \mathbb{R}^3 \rightarrow \mathbb{R}^3\) that maps 3D surface points to RGB colors: \(\phi(x, y, z) = (R, G, B)\).

We can then use the inverse map to go from 2D texture coordinates to 3D surface points, then query those points with the coordinate network to get colors, then map those colors back to the 2D texture for easy rendering.

Source: Dale Decatur

How to compute an inverse map?

We aim to have a mapping from texels to surface points, but we already have a mapping from the vertices to the texel plane. The way we can map an arbitrary coordinate to the surface is through barycentric interpolation, a coordinate system relative to a given triangle. In this system, any point has 3 barycentric coordinates.
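
A small sketch of that last step, with a hypothetical UV-space triangle and its corresponding surface triangle:

```python
import numpy as np

def barycentric_coords(p, a, b, c):
    # Barycentric coordinates of 2D point p in triangle (a, b, c),
    # via a 2x2 linear solve (degenerate triangles not handled).
    T = np.column_stack([b - a, c - a])
    w1, w2 = np.linalg.solve(T, p - a)
    return np.array([1.0 - w1 - w2, w1, w2])

# Map a texel location back to the 3D surface using the same weights.
uv_tri  = np.array([[0.1, 0.1], [0.9, 0.1], [0.1, 0.9]])        # triangle in the UV plane
xyz_tri = np.array([[0., 0., 0.], [1., 0., 0.], [0., 0., 1.]])  # same triangle on the surface
w = barycentric_coords(np.array([0.3, 0.2]), *uv_tri)
surface_point = w @ xyz_tri
```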

Categories
Tutorial week

Tutorial Week Day 5: Debugging and Robustness

On July 11th, Nicholas Sharp, the creator of the amazing Polyscope library, gave the SGI Community an insightful talk on debugging and robustness in geometry processing—insights that would later save several fellows hours of head-scratching and mental gymnastics. The talk was broadly organized into five parts, each corresponding to a paradigm where bugs commonly arise:

  1. Representing Numbers
  2. Representing Meshes
  3. Optimization
  4. Simulation and PDEs
  5. Geometric Machine Learning

Representing Numbers

The algorithms developed for dealing with geometry are primarily built to work with clean and pretty real numbers. However, computers deal with floats, and sometimes they behave in ugly ways. Floats are computers’ approximations of real numbers; it may require an infinite amount of storage to exactly represent a single number of arbitrary precision (unless you can get away with \(π=3.14\)). Each floating-point number is represented using either 32 bits (single precision) or 64 bits (double precision).

Figure 1. An example of the float32 representation

In the floating realm, we have quirks like:

  1. \((x+y)+z \neq x+(y+z)\)
  2. \( a>0, b>0\) but \(a+b=a\)

It is important to note that floating-point arithmetic is not random; it is simply misaligned with real arithmetic. That is, the same operation will consistently yield the same result, even if it deviates from the mathematically expected one. Possible alternatives to floats are integers and binary fraction representations; however, they come with their own obvious limitations.
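
Both quirks from the list above are easy to reproduce in a couple of lines:

```python
x, y, z = 0.1, 0.2, 0.3
print((x + y) + z == x + (y + z))        # False: float addition is not associative

a, b = 1e16, 1.0
print(a > 0 and b > 0 and a + b == a)    # True: b is absorbed by rounding
```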

Who’s your best friend? The NaN! The NaN is a special floating-point value computers spit out when they’re asked to perform invalid operations like

  1. \(\frac{0}{0}\rightarrow \) NaN
  2. \(\sqrt{-1} \rightarrow\) NaN (not \(i\) haha)

Every operation against a NaN results in… NaN. Hence, one small slip-up in the code can result in the entire algorithm being thrown off its course. Well then… why should I love NaNs? Because a screaming alarm is better than a silent error. If your algorithm is dealing with positive numbers and it computes \(\sqrt{-42}\) somewhere, you would probably want to know if you went wrong.

Well, what are some good practices to minimize numerical error in your code?

  1. Don’t use equality for comparison, use thresholds like: \(\left| x – x^* \right| < \epsilon\) or \(\frac{\left| x – x^* \right|}{\left| x^* \right|} < \epsilon\)
  2. Avoid transcendental functions wherever possible. A really cool example of this is to avoid \(\theta = \arccos \left( \frac{ \mathbf{u} \cdot \mathbf{v} }{ \left| \mathbf{u} \right| \left| \mathbf{v} \right| } \right)\) and use \(\cot \theta = \frac{ \cos \theta }{ \sin \theta } = \frac{ \mathbf{u} \cdot \mathbf{v} }{ \left| \mathbf{u} \times \mathbf{v} \right| }\) instead.
  3. Clamp inputs to safe bounds, e.g. \(\sqrt{x} \rightarrow \sqrt{max(x, 0)}\). (Note that while this keeps your code running smoothly, it might convert NaNs into silent errors!)
  4. Perturb inputs to singular functions to ensure numerical stability, e.g. \(\frac{1}{\left| x \right|} \to \frac{1}{\left| x \right| + \epsilon}\)

Some easy solutions include:

  1. Using a rational arithmetic system, as mentioned before. Caveats include: 1) no transcendental operations 2) some numbers might require very large integers to represent them, leading to performance issues, in terms of memory and/or speed.
  2. Use robust predicates for specific functional implementations, the design of which is aware of the common floating-point problems

Representing Surface Meshes

Sometimes, the issue lies not with the algorithm but with the mesh itself. However, we still need to ensure that our algorithm works seamlessly on all meshes. Common problems (and solutions) include:

  1. Unreferenced vertices and repeated vertices: throw them out
  2. A mixture of quad and triangular meshes: subdivide, retriangulate, or delete
  3. Degenerate faces and spurious topology: either repair these corner cases or adjust the algorithm to handle these situations
  4. Non-manifold and non-orientable meshes: split the mesh into multiple manifolds or orientable patches
  5. Foldover faces, poorly tessellated meshes, and disconnected components: use remeshing algorithms like Instant Meshes or use Generalized Winding Numbers instead of meshes

Optimization

Several geometry processing algorithms involve optimization of an objective function. Generally speaking, linear and sparse linear solvers are well-behaved, whereas more advanced methods like gradient descent or non-linear solvers may fail. A few good practices include:

  1. Performing sanity checks at each stage of your code, e.g. before applying an algorithm that expects SPD matrices, check if the matrix you’re passing is actually SPD
  2. When working with gradient-descent-like solvers, check if the gradient magnitudes are too large; that may cause instability for convergence

Simulations and PDEs

Generally, input meshes and algorithms are co-designed for engineering and scientific computing applications, so there aren’t too many problems with their simulations. However, visual computing algorithms need to be robust to arbitrary inputs, as they are typically just one component of a larger pipeline. Algorithms often fail when presented with “bad” meshes, even if the mesh is perfect in terms of connectivity (manifold, consistently oriented, etc.). Well then, what qualifies as a bad mesh? Meshes with skinny or obtuse triangles are particularly problematic. The solution is to remesh them using more equilateral triangles.

Figure 2. (Left) An example of a “bad” mesh with skinny triangles. (Middle) A high-fidelity re-mesh that might be super-expensive to process. (Right) A low-fidelity re-mesh that trades fidelity for efficiency.

Geometric Machine Learning

Most geometric machine learning stands very directly atop geometry processing; hence, it’s important to get it right. The most common problems encountered in geometric machine learning are not so different from those encountered in standard machine learning. These problems include:

  1. Array shape errors: You can use the Python aargh library to monitor and validate tensor shapes
  2. NaNs and infs: Maybe your learning rate is too big? Maybe you’re passing bad inputs into singular functions? Use torch.autograd.set_detect_anomaly(mode, check_nan=True) to track these problematic numbers at inception.
  3. “My trained model works on shape A, but not shape B.”: Is your normalization, orientation, and resolution consistent?

It is a good idea to overfit your ML model on a single shape and ensure that it works on simple objects like cubes and spheres before moving on to more complex examples.

And above all, the golden rule when your algorithm fails:
Visualize everything.

Categories
Talks

Week 2: Guest Lecture

On Wednesday, July 9th, SGI fellows were treated to a one hour presentation by guest lecturer Aaron Hertzmann, Principal Scientist at Adobe Research in San Francisco, CA. Aaron was introduced by SGI fellow Amber Bajaj, who among other accomplishments, noted that Aaron was recently recognized by the Association for Computing Machinery (ACM)’s SIGGRAPH professional society “for outstanding achievement in computer graphics and interactive techniques” and correspondingly awarded the Computer Graphics Achievement Award. The title of the talk was “Toward a Theory of Perspective: Perception in Pictures” and began on a personal note, with Aaron conveying how he was often critical of his own art differing from the corresponding photo he would take of the scene he was illustrating.

From that anecdotal example, Aaron expanded his talk to cover topics of human perception, vision, theory of perspective, and much more, weaving it all together to paint a compelling picture of what factors contribute to what we, as humans, perceive as more accurate representations of our three dimensional reality on a two dimensional medium. He made a compelling point that, while single point perspective is typically how cameras capture scenes and single point linear perspective is a common tenet of formal art classes, multi-point perspective more faithfully represents how we remember our experiences. In a world of electronics, digital imagery, and automation, it was striking how the lecturer made it clear that artists are still able to convey an image more faithful to the experience than digital cameras and rendered three dimensional imagery can capture.

Key points from Aaron’s talk:

  • Only 3% of classical paintings strictly follow linear perspective
  • A multi-perspective theory more faithfully captures our experience
  • MaDCoW (Zhang et al., CVPR 2025) is a warping algorithm that works for a variety of input scenes
  • 99.9% of vision is peripheral, which leads to inattentional blindness (missing objects that lie outside the focus of our fovea)
  • We don’t store a consistent single 3-D representation in our head… it is fragmentary across fixations
  • There are systematic differences between drawings, photographs, and experiments

Finally, the lecture came full circle with Aaron returning to the art piece he presented at the start and noting seven trends he’s identified from his own work that merit further research: good agreement with linear perspective in many cases; distant object size; nearby object size; fit to canvas shape; reduced / eliminated foreshortening; removed objects and simplified shapes; and multiperspective for complex composition. Overall the lecture was thought provoking and motivational for the fellows currently engrossed in the 2025 Summer Geometry Initiative.

Categories
Research

Hidden Quivers: Supporting the Manifold Hypothesis

Quivers are a tool known to help us simplify problems in math. In particular, representations of quivers contribute to geometric perspectives in representation theory: the theory of reducing complex algebraic structures to simpler ones. Less well known is that neural networks can also be represented using quiver representation theory.

Fundamentally, a quiver is just a directed graph.

Intrinsic definitions to consider include:

  • A source vertex of a quiver has no edges directed towards it
  • A sink vertex has no edges directed away from it
  • A loop in a quiver is an oriented edge such that the start vertex is the same as the end vertex

A fancy type of quiver known as an Auslander-Reiten quiver, courtesy of the author. But remember, a quiver is simply a directed graph!

Just like an MLP, a network quiver \(Q\) is arranged by input, output, and hidden layers in between. Likewise, they also have input vertices (a subset of source vertices), bias vertices (the source vertices that are not input vertices), and output vertices (sinks of \(Q\)). All remaining vertices are hidden vertices. The hidden quiver \(\tilde{Q}\) consists of all hidden vertices \(\tilde{V}\) of \(Q\) and all oriented edges \(\tilde{E}\) between \(\tilde{V}\) of \(Q\) that are not loops.

Def: A network quiver \(Q\) is a quiver arranged by layers such that:

  1. There are no loops on source (input and bias) nor sink vertices.
  2. There exists exactly one loop on each hidden vertex.

For any quiver \(Q\), we can also define its representation \(\mathcal{Q}\), in which we assign a vector space to each vertex of \(Q\) and regard our directed edges of \(Q\) as \(k\)-linear maps. In a thin representation, each \(k\)-linear map is simply a \(1\times1\) matrix.

A quiver with 4 vertices, courtesy of the author.
A representation of the quiver directly above, courtesy of the author.

Defining a neural network \((W, f)\) over a network quiver \(Q\), where \(W\) is a specific thin representation and \(f = (f_v)_{v \in V}\) are activation functions, allows much of the language and ideas of quiver representation theory to carry over to neural networks.

A neural network over a network quiver.

When a neural network like an MLP does its forward pass, it gives rise to a pointwise activation function \(f\), defined here as a one variable non-linear function \(f: \mathbb{C} \to \mathbb{C}\) differentiable except in a set of measure zero. We assign these activation functions to loops of \(Q\).

Further, for a neural network \((W, f)\) over \(Q\), we have a network function

$$ \Psi(W, f): \mathbb{C}^d \to \mathbb{C}^k $$

where the coordinates of \(\Psi(W, f)(x)\) are the score of the neural net as the activation outputs of the output vertices of \((W, f)\) with respect to an input data vector \(x \in \mathbb{C}^d\).
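
As a toy illustration (not taken from the original paper), consider a thin network quiver with two input vertices, three hidden vertices, and one output vertex: each edge carries a scalar weight (a \(1\times1\) matrix), and the activation sits on the loop at each hidden vertex.

```python
import numpy as np

W_in  = np.array([[ 0.5, -1.0],
                  [ 1.0,  0.3],
                  [-0.7,  0.8]])          # edge weights: input vertices -> hidden vertices
W_out = np.array([[ 1.0, -0.5,  0.2]])   # edge weights: hidden vertices -> output vertex
f = np.tanh                               # activation attached to each hidden-vertex loop

def network_function(x):
    # Psi(W, f)(x): activation outputs at the hidden vertices, then the output score.
    hidden = f(W_in @ x)
    return W_out @ hidden

print(network_function(np.array([1.0, 2.0])))
```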

The manifold hypothesis critical to deep learning proposes that high-dimensional data actually lies in a low-dimensional, latent manifold within the input space. We can map the input space to the geometric moduli space of neural networks \(_d\mathcal{M}_k(\tilde{Q})\) so that our latent manifold is also translated to the moduli space. While \(_d\mathcal{M}_k(\tilde{Q})\) depends on the combinatorial structure of the neural network, activation and weight architectures of the neural network determine how data is distributed inside the moduli space.

A three-dimensional data manifold.

We will approach the manifold hypothesis via framed quiver representations. A choice of a thin representation \(\tilde{\mathcal{Q}}\) of the hidden quiver \(\tilde{Q}\) and a map \(h\) from the hidden representation \(\tilde{\mathcal{Q}}\) to hidden vertices determine a pair \((\tilde{\mathcal{Q}}, h)\), where \(h = \{h_v\}_{v \in \tilde{V}}\). The pair \((\tilde{\mathcal{Q}}, h)\) is used to denote our framed quiver representation.

Def: A double-framed thin quiver representation is a triple \((l, \tilde{\mathcal{Q}}, h)\) where:

  • \(\tilde{\mathcal{Q}}\) is a thin representation of the hidden quiver \(\tilde{Q}\)
  • \((\tilde{\mathcal{Q}}, h)\) is a framed representation of \(\tilde{Q}\)
  • \((\tilde{\mathcal{Q}}, l)\) is a co-framed representation of \(\tilde{Q}\) (the dual of a framed representation)

Denote by \(_d\mathcal{R}_k(\tilde{\mathcal{Q}})\) the space of all double-framed thin quiver representations. We will use stable double-framed thin quiver representations in our construction of moduli space.

Def: A double-framed thin quiver representation \(\texttt{W}_k^f = (l, \tilde{\mathcal{Q}}, h)\) is stable if:

  1. The only sub-representation of \(\tilde{\mathcal{Q}}\) contained in the kernel of \(h\) is the zero sub-representation
  2. The only sub-representation of \(\tilde{\mathcal{Q}}\) contained in the image of \(l\) is \(\tilde{\mathcal{Q}}\)

Def: We present the moduli space of double-framed thin quiver representations as

$$ _d\mathcal{M}_k(\tilde{Q}):=\{[V]: V \in {}_d\mathcal{R}_k(\tilde{\mathcal{Q}}) \space \text{is stable} \}. $$

The moduli space depends on the hidden quiver as well as the chosen vector spaces. Returning to neural networks \((W, f)\), and given an input data vector \(x \in \mathbb{C}^d\), we can define a map

$$ \varphi(W, f): \mathbb{C}^d \to _d\mathcal{R}_k(\tilde{\mathcal{Q}})\\x \mapsto \texttt{W}_k^f. $$

This map takes values in the moduli space, the points of which parametrize isomorphism classes of stable double-framed thin quiver representations. Thus we have

$$ \varphi(W, f): \mathbb{C}^d \to _d\mathcal{M}_k(\tilde{Q}).
$$

As promised, we have mapped our input space containing our latent manifold to the moduli space \(_d\mathcal{M}_k(\tilde{Q})\) of neural networks, mathematically validating the manifold hypothesis.

Independent of the architecture, activation function, data, or task, any decision of any neural network passes through the moduli (as well as representation) space. With our latent manifold translated into the moduli space, we have an algebro-geometric way to continue to study the dynamics of neural network training.

Looking through the unsuspecting lens of quiver representation theory has the potential to provide new insights in deep learning, where network quivers appear as a combinatorial tool for understanding neural networks and their moduli spaces. More concretely:

  • Continuity and differentiability of the network function \(\Psi(W, f)\) and map \(\varphi(W, f)\) should allow us to apply further algebro-geometric tools to the study of neural networks, including to our constructed moduli space \(_d\mathcal{M}_k(\tilde{Q})\).
  • Hidden quivers can aid us in comprehending optimization hyperparameters in deep learning. We may be able to transfer gradient descent optimization to the setting of the moduli space.
  • Studying training within moduli spaces can lead to the development of new convergence theorems to guide deep learning.
  • The dimension of \(_d\mathcal{M}_k(\tilde{Q})\) could be used to quantify the capacity of neural networks.

The manifold hypothesis has played a ubiquitous role throughout deep learning since originally posed, and formalizing its existence via the moduli of quiver representations can help us to understand and potentially improve upon the effectiveness of neural networks and their latent spaces.

Notes and Acknowledgements. Content for this post was largely borrowed from and inspired by The Representation Theory of Neural Networks, smoothing over many details more rigorously presented in the original paper. We thank the 2025 SGI organizers and sponsors for supporting the author’s first deep learning-related research experience via the “Topology Control” project as well as mentors and other research fellows involved for their diverse expertise and patience.
