Categories
Tutorial week

Tutorial Week Day 5: Debugging and Robustness

On July 11th, Nicholas Sharp, the creator of the amazing Polyscope library, gave the SGI Community an insightful talk on debugging and robustness in geometry processing—insights that would later save several fellows hours of head-scratching and mental gymnastics. The talk was broadly organized into five parts, each corresponding to a paradigm where bugs commonly arise:

  1. Representing Numbers
  2. Representing Meshes
  3. Optimization
  4. Simulation and PDEs
  5. Geometric Machine Learning

Representing Numbers

Geometry algorithms are usually designed with clean and pretty real numbers in mind. However, computers deal with floats, and floats sometimes behave in ugly ways. Floats are computers’ approximations of real numbers: representing a single real number exactly can require an infinite amount of storage (unless you can get away with \(\pi=3.14\)), so a float keeps only a fixed number of bits. The two most common formats use 32 bits (single precision) or 64 bits (double precision).

Figure 1. An example of the float32 representation
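
To make the layout concrete, here is a small Python sketch that splits a float32 into its bit fields. It assumes the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits); the helper name `float32_fields` is ours and does not come from the talk.

```python
import struct

def float32_fields(x):
    """Round x to float32 and return its (sign, exponent, mantissa) bit fields."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF        # 23 explicit fraction bits
    return sign, exponent, mantissa

sign, exponent, mantissa = float32_fields(0.15625)  # 0.15625 = 1.25 * 2**-3
print(f"{sign:01b} {exponent:08b} {mantissa:023b}")
```

The stored exponent for 0.15625 is 124, that is, -3 plus the bias of 127, and the mantissa holds only the fractional part .25 of 1.25.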

In the floating realm, we have quirks like:

  1. \((x+y)+z \neq x+(y+z)\)
  2. \( a>0, b>0\) but \(a+b=a\)

It is important to note that floating-point arithmetic is not random; it is simply misaligned with real arithmetic. That is, the same operation will consistently yield the same result, even if it deviates from the mathematically expected one. Possible alternatives to floats are integers and binary fraction representations; however, they come with their own obvious limitations.
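
Both quirks, and the fact that they are deterministic, are easy to reproduce. The snippet below is a minimal illustration using Python's built-in 64-bit floats; the specific constants are just convenient examples.

```python
# Quirk 1: addition is not associative.
x, y, z = 0.1, 0.2, 0.3
left = (x + y) + z
right = x + (y + z)
print(left == right)   # False
print(left, right)     # 0.6000000000000001 0.6

# Quirk 2: a small positive number can be absorbed entirely.
a, b = 1.0, 1e-17
print(b > 0 and a + b == a)   # True

# Not random: repeating the same operation always gives the same answer.
print(all((x + y) + z == left for _ in range(1000)))   # True
```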

Who’s your best friend? The NaN! The NaN is a special floating-point value that computers spit out when they’re asked to perform invalid operations like

  1. \(\frac{0}{0}\rightarrow \) NaN
  2. \(\sqrt{-1} \rightarrow\) NaN (not \(i\) haha)

Almost every arithmetic operation involving a NaN results in… NaN. Hence, one small slip-up in the code can throw the entire algorithm off course. Well then… why should I love NaNs? Because a screaming alarm is better than a silent error. If your algorithm is supposed to deal with positive numbers and it computes \(\sqrt{-42}\) somewhere, you would probably want to know that something went wrong.
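
Here is a short sketch of NaN behavior in plain Python. Note that plain Python raises exceptions for `0.0 / 0.0` and `math.sqrt(-1.0)` instead of returning NaN, so the example produces a NaN with `inf - inf`. The helper `assert_all_finite` is a hypothetical name for the kind of alarm you might place between stages of an algorithm.

```python
import math

nan = float("inf") - float("inf")   # an invalid operation whose result is NaN
print(math.isnan(nan))              # True

# NaN propagates through later arithmetic.
total = sum([1.0, 2.0, nan, 4.0])
print(math.isnan(total))            # True

# NaN compares unequal to everything, including itself, so test with math.isnan.
print(nan == nan)                   # False

def assert_all_finite(values, name="values"):
    """Fail loudly as soon as a NaN or inf shows up (hypothetical helper)."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            raise FloatingPointError(f"{name}[{i}] is {v!r}")

assert_all_finite([1.0, 2.0, 3.0], name="edge_lengths")   # passes silently
```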

Well, what are some good practices to minimize numerical error in your code?

  1. Don’t use equality for comparison, use thresholds like: \(\left| x - x^* \right| < \epsilon\) or \(\frac{\left| x - x^* \right|}{\left| x^* \right|} < \epsilon\)
  2. Avoid transcendental functions wherever possible. A really cool example of this is to avoid \(\theta = \arccos \left( \frac{ \mathbf{u} \cdot \mathbf{v} }{ \left| \mathbf{u} \right| \left| \mathbf{v} \right| } \right)\) and use \(\cot \theta = \frac{ \cos \theta }{ \sin \theta } = \frac{ \mathbf{u} \cdot \mathbf{v} }{ \left| \mathbf{u} \times \mathbf{v} \right| }\) instead.
  3. Clamp inputs to safe bounds, e.g. \(\sqrt{x} \rightarrow \sqrt{\max(x, 0)}\). (Note that while this keeps your code running smoothly, it might convert NaNs into silent errors!)
  4. Perturb inputs to singular functions to ensure numerical stability, e.g. \(\frac{1}{\left| x \right|} \to \frac{1}{\left| x \right| + \epsilon}\)
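
The four practices above can be sketched in a few lines of Python. All function names and default tolerances here are our own illustrative choices, not values recommended in the talk; suitable tolerances depend on the scale of your data.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def approx_equal(x, x_ref, eps=1e-9):
    """Practice 1: threshold test (relative for large values, absolute near zero)."""
    return abs(x - x_ref) <= eps * max(1.0, abs(x_ref))

def cotan(u, v, eps=1e-12):
    """Practices 2 and 4: cot(theta) from dot and cross products, with no arccos,
    and a small eps in the denominator so parallel vectors do not divide by zero."""
    cross_norm = math.sqrt(dot(cross(u, v), cross(u, v)))
    return dot(u, v) / (cross_norm + eps)

def clamped_sqrt(x):
    """Practice 3: clamp to the valid domain (this can hide a genuinely negative input)."""
    return math.sqrt(max(x, 0.0))

print(approx_equal(0.1 + 0.2, 0.3))              # True, although 0.1 + 0.2 != 0.3
print(cotan((1.0, 0.0, 0.0), (1.0, 1.0, 0.0)))   # close to 1, since theta = 45 degrees
print(clamped_sqrt(-1e-18))                      # 0.0
```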

Some easy solutions include:

  1. Using a rational arithmetic system, as mentioned before. Caveats include: 1) no transcendental operations 2) some numbers might require very large integers to represent them, leading to performance issues, in terms of memory and/or speed.
  2. Using robust predicates: implementations of specific geometric tests that are designed with the common floating-point problems in mind
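
As a rough illustration of the rational-arithmetic idea, the sketch below evaluates a 2D orientation test exactly using Python's `fractions.Fraction`. This is a simple stand-in, not how production robust predicates are written: those typically use adaptive-precision floating point for speed. The function name `orient2d_exact` is ours.

```python
from fractions import Fraction

def orient2d_exact(a, b, c):
    """Return +1 if a, b, c turn counter-clockwise, -1 if clockwise, 0 if collinear.

    Every float is converted to an exact Fraction, so the determinant is
    computed without rounding (at the cost of speed and memory).
    """
    ax, ay = map(Fraction, a)
    bx, by = map(Fraction, b)
    cx, cy = map(Fraction, c)
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (det > 0) - (det < 0)

print(orient2d_exact((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))   # 1
print(orient2d_exact((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))   # 0
```

The result is exact with respect to the floats as stored; it cannot recover information that was already lost when the coordinates were rounded to floats.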

Representing Surface Meshes

Sometimes, the issue lies not with the algorithm but with the mesh itself. However, we still need to ensure that our algorithm works seamlessly on all meshes. Common problems (and solutions) include:

  1. Unreferenced vertices and repeated vertices: throw them out
  2. A mixture of quad and triangular meshes: subdivide, retriangulate, or delete
  3. Degenerate faces and spurious topology: either repair these corner cases or adjust the algorithm to handle these situations
  4. Non-manifold and non-orientable meshes: split the mesh into multiple manifolds or orientable patches
  5. Foldover faces, poorly tessellated meshes, and disconnected components: use remeshing algorithms like Instant Meshes or use Generalized Winding Numbers instead of meshes
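
A few of these checks are simple enough to sketch directly. The code below assumes an indexed triangle mesh stored as a list of vertex positions and a list of index triples; the function names are hypothetical, and only combinatorial problems (repeated indices, unused vertices, edges shared by more than two faces) are handled.

```python
from collections import Counter

def remove_degenerate_faces(faces):
    """Drop faces that reference the same vertex index more than once."""
    return [f for f in faces if len(set(f)) == len(f)]

def remove_unreferenced_vertices(vertices, faces):
    """Throw out vertices no face uses, and re-index the faces accordingly."""
    used = sorted({i for f in faces for i in f})
    remap = {old: new for new, old in enumerate(used)}
    new_vertices = [vertices[i] for i in used]
    new_faces = [tuple(remap[i] for i in f) for f in faces]
    return new_vertices, new_faces

def nonmanifold_edges(faces):
    """Return undirected edges shared by more than two faces."""
    counts = Counter()
    for f in faces:
        for k in range(len(f)):
            i, j = f[k], f[(k + 1) % len(f)]
            counts[(min(i, j), max(i, j))] += 1
    return sorted(e for e, n in counts.items() if n > 2)

vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (5, 5, 5)]  # last one is unused
faces = [(0, 1, 2), (1, 3, 2), (2, 2, 3)]                           # last one is degenerate
faces = remove_degenerate_faces(faces)
vertices, faces = remove_unreferenced_vertices(vertices, faces)
print(len(vertices), faces)   # 4 [(0, 1, 2), (1, 3, 2)]
```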

Optimization

Several geometry processing algorithms involve optimization of an objective function. Generally speaking, linear and sparse linear solvers are well-behaved, whereas more advanced methods like gradient descent or non-linear solvers may fail. A few good practices include:

  1. Performing sanity checks at each stage of your code, e.g. before applying an algorithm that expects SPD matrices, check if the matrix you’re passing is actually SPD
  2. When working with gradient-descent-like solvers, check whether the gradient magnitudes are too large; overly large gradients can destabilize convergence
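
Both sanity checks can be sketched in pure Python. The SPD test below checks symmetry and then attempts a Cholesky factorization, which succeeds only for positive definite matrices; in practice you would likely call your linear algebra library's Cholesky routine instead. The function names, the tolerance, and the `max_norm` threshold are illustrative assumptions.

```python
import math

def is_spd(A, tol=1e-12):
    """Return True if the dense matrix A (list of rows) is symmetric positive definite."""
    n = len(A)
    for i in range(n):
        for j in range(i):
            scale = max(1.0, abs(A[i][j]), abs(A[j][i]))
            if abs(A[i][j] - A[j][i]) > tol * scale:
                return False              # not symmetric
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False          # a non-positive pivot: not positive definite
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def gradient_is_reasonable(grad, max_norm=1e3):
    """Raise on NaN/inf gradients; otherwise report whether the norm is below max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if not math.isfinite(norm):
        raise FloatingPointError("gradient contains NaN or inf")
    return norm <= max_norm

print(is_spd([[2.0, -1.0], [-1.0, 2.0]]))    # True
print(is_spd([[1.0, 2.0], [2.0, 1.0]]))      # False (it has a negative eigenvalue)
print(gradient_is_reasonable([3.0, 4.0]))    # True
```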

Simulations and PDEs

Generally, input meshes and algorithms are co-designed for engineering and scientific computing applications, so there aren’t too many problems with their simulations. However, visual computing algorithms need to be robust to arbitrary inputs, as they are typically just one component of a larger pipeline. Algorithms often fail when presented with “bad” meshes, even if those meshes are perfect in terms of connectivity (manifold, oriented, etc.). Well then, what qualifies as a bad mesh? Meshes with skinny or obtuse triangles are particularly problematic. The solution is to remesh them using triangles that are closer to equilateral.

Figure 2. (Left) An example of a “bad” mesh with skinny triangles. (Middle) A high-fidelity re-mesh that might be super-expensive to process. (Right) A low-fidelity re-mesh that trades fidelity for efficiency.
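
One simple way to spot such triangles before running a solver is to measure their interior angles. The sketch below does this with `atan2` on the cross and dot products; the thresholds `min_deg` and `max_deg` are arbitrary illustrative values, and the function names are ours.

```python
import math

def triangle_angles_deg(a, b, c):
    """Interior angles (in degrees) at the corners a, b, c of a 3D triangle."""
    def sub(p, q):
        return tuple(pi - qi for pi, qi in zip(p, q))

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def cross_norm(u, v):
        cx = u[1] * v[2] - u[2] * v[1]
        cy = u[2] * v[0] - u[0] * v[2]
        cz = u[0] * v[1] - u[1] * v[0]
        return math.sqrt(cx * cx + cy * cy + cz * cz)

    def angle_at(p, q, r):
        u, v = sub(q, p), sub(r, p)
        # atan2(|u x v|, u . v) stays well-behaved where arccos would not.
        return math.degrees(math.atan2(cross_norm(u, v), dot(u, v)))

    return angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)

def is_bad_triangle(a, b, c, min_deg=10.0, max_deg=90.0):
    """Flag skinny (tiny smallest angle) or obtuse (largest angle above max_deg) triangles."""
    angles = triangle_angles_deg(a, b, c)
    return min(angles) < min_deg or max(angles) > max_deg

equilateral = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3.0) / 2.0, 0.0))
skinny = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.01, 0.0))
print(is_bad_triangle(*equilateral), is_bad_triangle(*skinny))   # False True
```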

Geometric Machine Learning

Most geometric machine learning stands very directly atop geometry processing; hence, it’s important to get it right. The most common problems encountered in geometric machine learning are not so different from those encountered in standard machine learning. These problems include:

  1. Array shape errors: You can use the Python arrgh library to monitor and validate tensor shapes
  2. NaNs and infs: Maybe your learning rate is too big? Maybe you’re passing bad inputs into singular functions? Use torch.autograd.set_detect_anomaly(mode, check_nan=True) to track these problematic numbers at inception.
  3. “My trained model works on shape A, but not shape B.”: Are your normalization, orientation, and resolution consistent?

It is a good idea to overfit your ML model on a single shape and ensure that it works on simple objects like cubes and spheres before moving on to more complex examples.
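
As one example of making normalization consistent, the sketch below centers a point set at its centroid and rescales it to fit in the unit ball, so that two copies of the same shape at different positions and scales end up with matching coordinates. This is only one of several reasonable conventions, and `normalize_shape` is a hypothetical helper name.

```python
import math

def normalize_shape(points):
    """Center points at their centroid and scale so the farthest point has norm 1."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - centroid[k] for k in range(dim)] for p in points]
    radius = max(math.sqrt(sum(c * c for c in p)) for p in centered)
    if radius == 0.0:
        raise ValueError("all points coincide; the scale cannot be normalized")
    return [[c / radius for c in p] for p in centered]

# The same square, once at the origin and once translated and scaled.
square_a = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
square_b = [(4.0 * x + 10.0, 4.0 * y - 5.0) for x, y in square_a]
norm_a = normalize_shape(square_a)
norm_b = normalize_shape(square_b)
print(all(abs(u - v) < 1e-12 for p, q in zip(norm_a, norm_b) for u, v in zip(p, q)))  # True
```

Orientation and resolution need their own checks; this sketch only addresses position and scale.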

And above all, the golden rule when your algorithm fails:
Visualize everything.

Categories
Talks

Week 2: Guest Lecture

On Wednesday, July 9th, SGI fellows were treated to a one-hour presentation by guest lecturer Aaron Hertzmann, Principal Scientist at Adobe Research in San Francisco, CA. Aaron was introduced by SGI fellow Amber Bajaj, who noted that, among other accomplishments, Aaron was recently recognized by the Association for Computing Machinery (ACM)’s SIGGRAPH professional society “for outstanding achievement in computer graphics and interactive techniques” and correspondingly awarded the Computer Graphics Achievement Award. The talk, titled “Toward a Theory of Perspective: Perception in Pictures,” began on a personal note, with Aaron conveying how he was often critical of how his own art differed from the corresponding photo he would take of the scene he was illustrating.

From that anecdotal example, Aaron expanded his talk to cover topics of human perception, vision, theory of perspective, and much more, weaving it all together to paint a compelling picture of what factors contribute to what we, as humans, perceive as more accurate representations of our three-dimensional reality on a two-dimensional medium. He made a compelling point that, while single-point perspective is typically how cameras capture scenes and single-point linear perspective is a common tenet of formal art classes, multi-point perspective more faithfully represents how we remember our experiences. In a world of electronics, digital imagery, and automation, it was striking how clearly the lecturer showed that artists can still convey an image more faithful to the experience than digital cameras and rendered three-dimensional imagery can capture.

Key points from Aaron’s talk:

  • Only 3% of classical paintings strictly follow linear perspective
  • A multi-perspective theory more faithfully captures our experience
  • MaDCoW (Zhang et al., CVPR 2025) is a warping algorithm that works for a variety of input scenes
  • 99.9% of vision is peripheral, which leads to inattentional blindness (missing objects that lie outside the focus of our fovea)
  • We don’t store a consistent single 3-D representation in our head… it is fragmentary across fixations
  • There are systematic differences between drawings, photographs, and experiments

Finally, the lecture came full circle with Aaron returning to the art piece he presented at the start and noting seven trends he’s identified from his own work that merit further research: good agreement with linear perspective in many cases; distant object size; nearby object size; fit to canvas shape; reduced / eliminated foreshortening; removed objects and simplified shapes; and multiperspective for complex composition. Overall, the lecture was thought-provoking and motivational for the fellows currently engrossed in the 2025 Summer Geometry Initiative.

Categories
Research

Hidden Quivers: Supporting the Manifold Hypothesis

Quivers are a tool known to help us simplify problems in math. In particular, representations of quivers contribute to geometric perspectives in representation theory: the theory of reducing complex algebraic structures to simpler ones. Less well known is that neural networks can also be described using quiver representation theory.

Diagram of a Deep Neural Network

Fundamentally, a quiver is just a directed graph.

A fancy type of quiver known as an Auslander-Reiten quiver, courtesy of the author. But remember: a quiver is simply a directed graph.

Intrinsic definitions to consider include:

  • A source vertex of a quiver has no edges directed towards it
  • A sink vertex has no edges directed away from it
  • A loop in a quiver is an oriented edge such that the start vertex is the same as the end vertex
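
Since a quiver is just a directed graph, these three definitions translate directly into code. The sketch below stores a quiver as a vertex list and a list of `(start, end)` edges; the representation and the example vertex names are our own choices for illustration.

```python
def sources(vertices, edges):
    """Vertices with no edge directed towards them."""
    targets = {t for _, t in edges}
    return {v for v in vertices if v not in targets}

def sinks(vertices, edges):
    """Vertices with no edge directed away from them."""
    starts = {s for s, _ in edges}
    return {v for v in vertices if v not in starts}

def loops(edges):
    """Edges whose start vertex equals their end vertex."""
    return [(s, t) for s, t in edges if s == t]

# A tiny quiver: two sources feeding a vertex h that carries a loop, then a sink.
vertices = ["a", "b", "h", "o"]
edges = [("a", "h"), ("b", "h"), ("h", "h"), ("h", "o")]
print(sorted(sources(vertices, edges)), sorted(sinks(vertices, edges)), loops(edges))
# ['a', 'b'] ['o'] [('h', 'h')]
```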

Just like an MLP, a network quiver \(Q\) is arranged by input, output, and hidden layers in between. Likewise, they also have input vertices (a subset of source vertices), bias vertices (the source vertices that are not input vertices), and output vertices (sinks of \(Q\)). All remaining vertices are hidden vertices. The hidden quiver \(\tilde{Q}\) consists of all hidden vertices \(\tilde{V}\) of \(Q\) and all oriented edges \(\tilde{E}\) between \(\tilde{V}\) of \(Q\) that are not loops.

Def: A network quiver \(Q\) is a quiver arranged by layers such that:

  1. There are no loops on source (input and bias) or sink vertices.
  2. There exists exactly one loop on each hidden vertex
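
The loop conditions and the hidden quiver can be sketched as follows. This is a simplified check under our own assumptions: a quiver is a vertex list plus `(start, end)` edges, a hidden vertex is any vertex that is neither a source nor a sink, and the "arranged by layers" requirement is not verified. Because a loop counts as both an incoming and an outgoing edge, a vertex carrying a loop is never a source or a sink under these definitions, so the code only needs to count loops on hidden vertices.

```python
from collections import Counter

def hidden_vertices(vertices, edges):
    """Vertices that are neither sources nor sinks."""
    targets = {t for _, t in edges}
    starts = {s for s, _ in edges}
    return [v for v in vertices if v in targets and v in starts]

def hidden_quiver(vertices, edges):
    """Hidden vertices together with the non-loop edges between them."""
    hidden = hidden_vertices(vertices, edges)
    hidden_set = set(hidden)
    hidden_edges = [(s, t) for s, t in edges
                    if s in hidden_set and t in hidden_set and s != t]
    return hidden, hidden_edges

def satisfies_loop_conditions(vertices, edges):
    """Check that every hidden vertex carries exactly one loop."""
    loop_count = Counter(s for s, t in edges if s == t)
    return all(loop_count[v] == 1 for v in hidden_vertices(vertices, edges))

vertices = ["a", "b", "h1", "h2", "o"]
edges = [("a", "h1"), ("b", "h1"), ("h1", "h1"),
         ("h1", "h2"), ("h2", "h2"), ("h2", "o")]
print(hidden_quiver(vertices, edges))              # (['h1', 'h2'], [('h1', 'h2')])
print(satisfies_loop_conditions(vertices, edges))  # True
```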

For any quiver \(Q\), we can also define its representation \(\mathcal{Q}\), in which we assign a vector space to each vertex of \(Q\) and regard our directed edges of \(Q\) as \(k\)-linear maps. In a thin representation, each \(k\)-linear map is simply a \(1\times1\) matrix.

A representation of the quiver directly above, courtesy of the author.

Defining a neural network \((W, f)\) over a network quiver \(Q\), where \(W\) is a specific thin representation and \(f = (f_v)_{v \in V}\) are activation functions, allows much of the language and ideas of quiver representation theory to carry over to neural networks.

A neural network over a network quiver

When a neural network like an MLP does its forward pass, it applies a pointwise activation function \(f\), defined here as a one-variable non-linear function \(f: \mathbb{C} \to \mathbb{C}\) that is differentiable except on a set of measure zero. We assign these activation functions to the loops of \(Q\).

Further, for a neural network \((W, f)\) over \(Q\), we have a network function

$$ \Psi(W, f): \mathbb{C}^d \to \mathbb{C}^k $$

where the coordinates of \(\Psi(W, f)(x)\) are the score of the neural net as the activation outputs of the output vertices of \((W, f)\) with respect to an input data vector \(x \in \mathbb{C}^d\).
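
To see how the network function is evaluated, here is a minimal forward-pass sketch over a small network quiver. It makes several simplifying assumptions that are ours rather than the post's: values are real instead of complex, each bias vertex carries the constant 1, output vertices apply no activation, and the caller supplies the hidden and output vertices in an order compatible with the layers. A thin representation then amounts to one scalar weight per non-loop edge.

```python
def network_function(inputs, biases, order, outputs, weights, activations, x):
    """Evaluate the network function at the input vector x.

    inputs, biases, outputs: lists of vertex names
    order: hidden and output vertices, listed layer by layer
    weights: dict mapping a non-loop edge (start, end) to its scalar weight
    activations: dict mapping each hidden vertex to its activation function
    """
    if len(x) != len(inputs):
        raise ValueError("x must have one coordinate per input vertex")
    value = dict(zip(inputs, x))
    value.update({b: 1.0 for b in biases})      # assumption: bias vertices carry 1
    for v in order:
        pre = sum(w * value[s] for (s, t), w in weights.items() if t == v)
        f = activations.get(v)                  # only hidden vertices have a loop
        value[v] = f(pre) if f is not None else pre
    return [value[o] for o in outputs]

def relu(t):
    return max(0.0, t)

weights = {("a", "h"): 2.0, ("b", "h"): -1.0, ("h", "o"): 3.0}
score = network_function(inputs=["a"], biases=["b"], order=["h", "o"],
                         outputs=["o"], weights=weights,
                         activations={"h": relu}, x=[1.5])
print(score)   # [6.0]
```

In this toy example \(d = 1\) and \(k = 1\): the hidden vertex computes \(\mathrm{ReLU}(2 \cdot 1.5 - 1) = 2\), and the output vertex returns \(3 \cdot 2 = 6\).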

The Möbius strip is a well-known geometric manifold.

The manifold hypothesis, critical to deep learning, proposes that high-dimensional data actually lies on a low-dimensional latent manifold within the input space. We can map the input space to the geometric moduli space of neural networks \(_d\mathcal{M}_k(\tilde{Q})\) so that our latent manifold is also translated to the moduli space. While \(_d\mathcal{M}_k(\tilde{Q})\) depends on the combinatorial structure of the neural network, the activation and weight architectures of the neural network determine how data is distributed inside the moduli space.

A three-dimensional data manifold

We will approach the manifold hypothesis via framed quiver representations. A choice of a thin representation \(\tilde{\mathcal{Q}}\) of the hidden quiver \(\tilde{Q}\) and a map \(h\) from the hidden representation \(\tilde{\mathcal{Q}}\) to hidden vertices determine a pair \((\tilde{\mathcal{Q}}, h)\), where \(h = \{h_v\}_{v \in \tilde{V}}\). The pair \((\tilde{\mathcal{Q}}, h)\) is used to denote our framed quiver representation.

Def: A double-framed thin quiver representation is a triple \((l, \tilde{\mathcal{Q}}, h)\) where:

  • \(\tilde{\mathcal{Q}}\) is a thin representation of the hidden quiver \(\tilde{Q}\)
  • \((\tilde{\mathcal{Q}}, h)\) is a framed representation of \(\tilde{Q}\)
  • \((\tilde{\mathcal{Q}}, l)\) is a co-framed representation of \(\tilde{Q}\) (the dual of a framed representation)

Denote by \(_d\mathcal{R}_k(\tilde{\mathcal{Q}})\) the space of all double-framed thin quiver representations. We will use stable double-framed thin quiver representations in our construction of moduli space.

Def: A double-framed thin quiver representation \(\texttt{W}_k^f = (l, \tilde{\mathcal{Q}}, h)\) is stable if:

  1. The only sub-representation of \(\tilde{\mathcal{Q}}\) contained in the kernel of \(h\) is the zero sub-representation
  2. The only sub-representation of \(\tilde{\mathcal{Q}}\) contained in the image of \(l\) is \(\tilde{\mathcal{Q}}\)

Def: We present the moduli space of double-framed thin quiver representations as

$$ _d\mathcal{M}_k(\tilde{Q}):=\{[V]: V \in {}_d\mathcal{R}_k(\tilde{\mathcal{Q}}) \space \text{is stable} \}. $$

The moduli space depends on the hidden quiver as well as the chosen vector spaces. Returning to neural networks \((W, f)\), and given an input data vector \(x \in \mathbb{C}^d\), we can define a map

$$ \varphi(W, f): \mathbb{C}^d \to _d\mathcal{R}_k(\tilde{\mathcal{Q}})\\x \mapsto \texttt{W}_k^f. $$

This map takes values in the moduli space, the points of which parametrize isomorphism classes of stable double-framed thin quiver representations. Thus we have

$$ \varphi(W, f): \mathbb{C}^d \to _d\mathcal{M}_k(\tilde{Q}).
$$

As promised, we have mapped our input space containing our latent manifold to the moduli space \(_d\mathcal{M}_k(\tilde{Q})\) of neural networks, mathematically validating the manifold hypothesis.

Independent of the architecture, activation function, data, or task, any decision of any neural network passes through the moduli (as well as representation) space. With our latent manifold translated into the moduli space, we have an algebro-geometric way to continue to study the dynamics of neural network training.

Looking through the unsuspecting lens of quiver representation theory has the potential to provide new insights in deep learning, where network quivers appear as a combinatorial tool for understanding neural networks and their moduli spaces. More concretely:

  • Continuity and differentiability of the network function \(\Psi(W, f)\) and map \(\varphi(W, f)\) should allow us to apply further algebro-geometric tools to the study of neural networks, including to our constructed moduli space \(_d\mathcal{M}_k(\tilde{Q})\).
  • Hidden quivers can aid us in comprehending optimization hyperparameters in deep learning. We may be able to transfer gradient descent optimization to the setting of the moduli space.
  • Studying training within moduli spaces can lead to the development of new convergence theorems to guide deep learning.
  • The dimension of \(_d\mathcal{M}_k(\tilde{Q})\) could be used to quantify the capacity of neural networks.

The manifold hypothesis has played a ubiquitous role throughout deep learning since originally posed, and formalizing its existence via the moduli of quiver representations can help us to understand and potentially improve upon the effectiveness of neural networks and their latent spaces.

Notes and Acknowledgements. Content for this post was largely borrowed from and inspired by The Representation Theory of Neural Networks, smoothing over many details more rigorously presented in the original paper. We thank the 2025 SGI organizers and sponsors for supporting the author’s first deep learning-related research experience via the “Topology Control” project as well as mentors and other research fellows involved for their diverse expertise and patience.

Categories
Logistics

Welcome to SGI 2025!

Welcome to the official blog of the Summer Geometry Initiative (SGI) 2025, taking place July 7-August 15! I’m Justin Solomon, director of SGI 2025 and PI of the MIT Geometric Data Processing Group.

First launched in 2021, SGI is a completely online program engaging a paid cohort of undergraduate and early master’s students in six weeks of training and research experiences related to applied geometry and geometry processing. SGI Fellows come from all over the globe and represent a wide variety of educational institutions, life/career paths, and fields of interest.

SGI aims to accomplish the following objectives:

  • spark collaboration among students and researchers in geometry processing,
  • launch inter-university research projects in geometry processing involving team members across broad levels of seniority (undergraduate, graduate, faculty, industrial researcher),
  • introduce students to geometry processing research and development, and
  • diversify the “pipeline” of students entering geometry processing research, in terms of gender, race, socioeconomic background, and home institution.

SGI aims to address a number of challenges and inequities in geometry processing. Not all universities host faculty whose work touches on this emerging field, reducing the cohort of students exposed to this discipline during their undergraduate careers. Moreover, as with many engineering and mathematical fields, geometry processing suffers from serious gender, racial, and socioeconomic imbalance; by giving a broad set of students access to geometry processing research experiences, over the longer term we hope to affect the composition of the geometry processing community.

SGI is supported by a worldwide network of volunteers, including faculty, graduate students, and research scientists in geometry and related disciplines. This team supports the SGI Fellows through mentorship, instruction, panel discussions, and several other means.

SGI 2025 is due to start in a few days! Each SGI Fellow has been mailed a box of swag from our many sponsors, a certificate, and a custom-made coffee mug designed by SGI 2025 Fellow Marina Levay and others.

We’ll kick off next week with tutorials in geometry processing led by Oded Stein (USC), Silvia Sellán (Columbia University), Nicole Feng (Carnegie Mellon University), Dale Decatur/Richard Liu (University of Chicago), and Nick Sharp (NVIDIA). Then, in the remaining 5 weeks, our Fellows will have the opportunity to participate in multiple short-term (1-2 week) research projects, intended to kick off collaborations that last over the longer term. Check out last year’s SGI blog for examples of the kinds of projects they’ll be working on.

Revisit this blog as the summer progresses for updates on SGI 2025 and to read about the exciting ideas our Fellows are developing in geometry!