Tag: RXMesh

Implementing Spectral Conformal Parameterization using RXMesh

Post author By sachinkishan02
Post date August 26, 2024

Textured bunny with black and white checker texture

Fellows: Sachin Kishan and Diany Pressato

Teaching Assistant: Supriya Gadi Patil

Mentor: Ahmed Mahmoud

Introduction

RXMesh is a framework for processing triangle mesh on the GPU. It allows developers to easily use the GPU’s massive parallelism to perform common geometry processing tasks on triangle mesh. For a preliminary introduction to RXMesh, check out this tutorial blog here.

Our task was to use RXMesh to implement some classic algorithms in geometry processing. This blog details one of the algorithms we implemented: Spectral Conformal Parameterization (SCP).

This blog will discuss the algorithm in terms of its basic concepts, the steps of the algorithm, the implementation using RXMesh, and our results.

Our entire implementation of SCP application in RXMesh can be found here.

Spectral Conformal Parameterization

Parameterization refers to the mapping of a triangle surface mesh in \(R^{3}\) space to a planar \(R^{2}\) space. This mapping can then be subsequently used in different application such as texture mapping, where we map the values from some other \(R^{2}\) space (e.g., image) to the surface of the mesh. In this case, we map color values to the mesh surface.

Often when we perform mesh parameterization, there is a possibility to change some properties of the geometric elements of the mesh, such as the size, shape, or angle between edges. It is desired to preserve these properties since we usually want to create a direct mapping of our intended values (e.g., texture color) to be in the “correct” positions on the mesh. What is correct is usually up to artistic opinion, and thus a mapping that preserves these properties is desirable. A mapping that conserves angles (and thus shape) is called a conformal mapping.

Mullen et al. (2008) provides a method to perform conformal parameterization of an open surface mesh by modelling the problem as a generalized Eigenvalue problem. Using an initial guess, we can perform multiple iterations of a general eigen vector solving method (i.e., the power method [2] ) to determine the largest eigen vector. This eigen vector, in our case, is the parameterized coordinates of the given mesh, giving us our desired output.

The algorithm for the power method is as follows:

Given \((A-\lambda B)x=0 \)

We want to find \(x\).

Given an initial guess of eigen-vector \(x_{0}\)

For some number of iterations, on iteration k

\(B y_{k+1} = A x_{k}\)
\(x_{k+1}= \frac{y_{k+1}}{\left| y_{k+1} \right|} \)

where \(x\) is the desired eigenvector solution.

Implementation

We applied the power method to our application in the following way, based on the formulation given by Mullen et al. (2008) in equation 7:

\([B-\frac{1}{V_{b}}e_{b}e_{b}^{t}]u=\mu L_{c}u \)

To find the eigenvector \(u\), we can the power method since the problem is in the form \(B y_{k+1} = A x_{k}\). This can be done in the following way:

\(T2=dot(eb,u) \)
\(T1=Bu – T2 * eb \)
\(L_{c}u=T1 \)
\(u=u/normalise (u) \)

Where \(u\) is the parameterized coordinates. \(T1\) and \(T2\) are temporary variables to store certain scalar or vector values. Other variables are explained by Mullen et al. (2008). This loop can be implemented like this using RXMesh:

for (int i = 0; i < iterations; i++) 
{
    cuComplex T2 = eb.dot(uv_mat);
    rx.for_each_vertex(rxmesh::DEVICE,
                       [eb, T2, T1, uv_mat, B] __device__(
                       const rxmesh::VertexHandle vh) mutable
                       {
                           T1(vh, 0) =
                       cuCsubf(cuCmulf(B(vh, vh), uv_mat(vh, 0)),
                       cuCmulf(eb(vh, 0), T2));
                       });

    Lc.solve(T1, uv_mat);
    float norm = uv_mat.norm2();
    uv_mat.multiply(1.0f / norm);
    if (std::abs(prv_norm - norm) < 0.0001)//convergence check
    {
        break; 
    }
    prv_norm = norm;
}

\(T1\) is a vector whose values are per-vertex, which is why we perform a per vertex computation. \(T2\) is a scalar in which we store the value of the dot product dot(eb,uv).

Since our system matrix \(L_{c}\) has dimensions 2V x 2V, we use complex representations of value entries in the vectors and matrices we use. So, for example, our \(L_{c}\) matrix now has dimensions V x V where each entry is a complex number (with a real and complex part).

This means subsequent computations must use CUDA functions that operate on complex numbers (e.g., cuCsubf which subtracts complex numbers).

All initial pre-computations only allocate to the real part of the entry, except for the calculation of the area term \(A\) where we calculate \(L_{c}=L_{D}-A\) . For the area term, we assign values to the complex part of the area term. \(L_{D}\) represents the standard Laplace-Beltrami operator, which we calculate using the cotangent edge weights and assign to the real part.

After we obtain the eigenvector \(u\) in complex form, we can convert it to a standard V x 2 matrix form and use that for further processing (such as rendering).

Results

Using the SCP implementation, we can now apply texturing to our open meshes as shown in the examples below.

*The beetle mesh with a checker texture.*

References

[1] Mullen, P., Tong, Y., Alliez, P., & Desbrun, M. (2008). Spectral Conformal Parameterization. Computer Graphics Forum, 27(5), 1487–1494. https://doi.org/10.1111/j.1467-8659.2008.01289.x

[2] Kozłowski, W. (1992). The power method for the generalized eigenvalue problem. Roczniki Polskiego Towarzystwa Matematycznego. Seria 3, Matematyka Stosowana/Matematyka Stosowana/Mathematica Applicanda, 21(35). https://doi.org/10.14708/ma.v21i35.1790

This blog was written by Sachin Kishan and Diany Pressato during the SGI 2024 Fellowship as one of the outcomes of a two-week project under the mentorship of Ahmed Mahmoud and the support of Supriya Gadi Patil as a teaching assistant.

Post author By sachinkishan02
Post date August 26, 2024
Tags GPU, Paramterization, RXMesh, sgi2024

Research

Implementing ARAP Deformations using RXMesh

Post author By sachinkishan02
Post date August 26, 2024

Fellows: Sachin Kishan

Teaching Assistant: Supriya Gadi Patil

Mentor: Ahmed Mahmoud

Introduction

RXMesh is a framework for processing triangle mesh on the GPU. It allows developers to easily use the GPU’s massive parallelism to perform common geometry processing tasks. For a preliminary introduction to RXMesh, check out this tutorial blog here.

Our task was to use RXMesh to implement some classic algorithms in geometry processing. This blog details one of the algorithms we implemented: As-Rigid-As-Possible(ARAP) Surface Modeling [1].

This blog will discuss the algorithm in terms of its basic concepts, the steps of the algorithm, the implementation using RXMesh, and the final results.

Our entire implementation of ARAP deformation in RXMesh can be found here.

As-Rigid-As-Possible Surface Modeling

Sorkine and Alexa (2007) formulated a method where we can retain the shape of an object as much as possible despite deforming one (or a set of) its vertices. This retention of shape is achieved by ensuring that subsequent transformations to the mesh after deforming its vertices are rigid. This property of rigidity means that a mesh can only undergo rigid transformations such as translation or rotation, but not stretching or shearing (since these are not rigid transformations).

The algorithm’s focus is to minimize the energy which measures the extent of deformation (dubbed rigidity energy). This energy is minimized when given a specific set of positions or a specific set of rotations. Notice that we can not minimize both at once—since we need the minimized rotational transformation for minimizing the energy from deformed positions. Thus, we instead solve both alternatively, i.e., solve for deformed positions assuming rotation is fixed, then solve for rotation assuming the position is fixed.

Using a Laplacian, we can form a sparse system of linear equations which we can then solve to obtain a new set of positions for the mesh. The algorithm works as follows for a single iteration

Identify the rotation matrix for the given deformed positions which minimizes deformation energy.
Solve for the new positions to minimize the energy, given a set of linear equations.

In this case, the set of linear equations to solve is in the form :

\(Lx=b\)

Where \(L\) is the Laplace-Beltrami operator, which can be formed by finding the cotangent edge weights of the mesh, \(x\) represents the set of new vertex positions we solve for, and \(b\) represents the right-hand side of Equation 8 from Sorkine and Alexa (2007). The derivation is detailed further in the original paper.

\(b = \sum_{j\in N(i)}^{} \frac{w_{ij}}{2} (R_{i}+R_{j})(p_{i}-p_{j})\)

To further minimize the energy, we can perform multiple iterations of the same steps, using updated vertex positions and consequent rotations per iteration.

Implementation

The following code snippets show how the algorithm is implemented in RXMesh.

We first deform the input mesh and set constraints on each vertex depending on whether it is fixed, deformed, or free to move. This will be taken into account when calculating the system matrix \(L\).

 // set constraints
    const vec3<float> sphere_center(0.1818329, -0.99023, 0.325066);
    rx.for_each_vertex(DEVICE, [=] __device__(const VertexHandle& vh) {
        const vec3<float> p(deformed_vertex_pos(vh, 0),
                            deformed_vertex_pos(vh, 1),
                            deformed_vertex_pos(vh, 2));

        // fix the bottom
        if (p[2] < -0.63) {
            constraints(vh) = 2;
        }

        // move the jaw
        if (glm::distance(p, sphere_center) < 0.1) {
            constraints(vh) = 1;
        }
    });

In the code above, we set the mesh to deform the dragon’s jaw and fix its bottom vertices (legs). This can vary depending on the type of input we want to have. Another possibility is to add a feature for users to click and select vertices to deform or fix them in space interactively.

Before we perform the iteration step, we calculate the Laplacian. It can be calculated as the cotangent edge weight matrix. So, we create a kernel using RXMesh.

    auto calc_mat = [&](VertexHandle v_id, VertexIterator& vv) {
        if (constraints(v_id, 0) == 0) {
            for (int i = 0; i < vv.size(); i++) {
          laplace_mat(v_id, v_id) += weight_matrix(v_id, vv[i]);
          laplace_mat(v_id, vv[i]) -= weight_matrix(v_id, vv[i]);
            }
        } else {
            for (int i = 0; i < vv.size(); i++) {
                laplace_mat(v_id, vv[i]) = 0;
            }
            laplace_mat(v_id, v_id) = 1;
        }
    };

The weight matrix is calculated using a special query called “EVDiamond”, where we obtain the two end vertices and the two opposite vertices of an edge. This gives us exactly the data we need to find the cotangent edge weight.

    auto calc_weights = [&](EdgeHandle edge_id, VertexIterator& vv) {
        // the edge goes from p-r while the q and s are the opposite vertices
        const rxmesh::VertexHandle p_id = vv[0];
        const rxmesh::VertexHandle r_id = vv[2];
        const rxmesh::VertexHandle q_id = vv[1];
        const rxmesh::VertexHandle s_id = vv[3];

        if (!p_id.is_valid() || !r_id.is_valid() || !q_id.is_valid() ||
            !s_id.is_valid()) {
            return;
        }

        const vec3<T> p(coords(p_id, 0), coords(p_id, 1), coords(p_id, 2));
        const vec3<T> r(coords(r_id, 0), coords(r_id, 1), coords(r_id, 2));
        const vec3<T> q(coords(q_id, 0), coords(q_id, 1), coords(q_id, 2));
        const vec3<T> s(coords(s_id, 0), coords(s_id, 1), coords(s_id, 2));


        T weight = 0;
   weight += dot((p - q), (r - q)) / length(cross(p - q, r - q));
   weight += dot((p - s), (r - s)) / length(cross(p - s, r - s));

        weight /= 2;
        weight = std::max(0.f, weight);

        edge_weights(p_id, r_id) = weight;
        edge_weights(r_id, p_id) = weight;
    };

We now have all the building blocks necessary to perform an iteration step. This is what an iteration loop looks like:

    // process step
    for (int i = 0; i &lt; iterations; i++) {
        // solver for rotation
        calculate_rotation_matrix&lt;float, CUDABlockSize&gt;
            &lt;&lt;&lt;lb_rot.blocks, lb_rot.num_threads, lb_rot.smem_bytes_dyn&gt;&gt;&gt;(
                rx.get_context(),
                ref_vertex_pos,
                deformed_vertex_pos,
                rotations,
                weight_matrix);

        // solve for position
        calculate_b&lt;float, CUDABlockSize&gt;
            &lt;&lt;&lt;lb_b_mat.blocks,
               lb_b_mat.num_threads,
               lb_b_mat.smem_bytes_dyn&gt;&gt;&gt;(rx.get_context(),
                                          ref_vertex_pos,
                                          deformed_vertex_pos,
                                          rotations,
                                          weight_matrix,
                                          b_mat,
                                          constraints);

        laplace_mat.solve(b_mat, *deformed_vertex_pos_mat);
    }

We perform multiple iterations of the algorithm to effectively minimize the energy. We can see in the code above that we alternatively solve for rotation and then for positions. We first calculate the rotation matrix given a set of deformed positions, then calculate the \(b\) matrix (as defined earlier) and then solve the system of equations to identify the new vertex positions.

The following kernel calculates the rotation matrix per vertex to minimize the rigidity energy given their positions.

auto cal_rot = [&](VertexHandle v_id, VertexIterator& vv) {
        Eigen::Matrix3f S = Eigen::Matrix3f::Zero();

        for (int j = 0; j < vv.size(); j++) {

            float w = weight_matrix(v_id, vv[j]);

            Eigen::Vector<float, 3> pi_vector = {
              ref_vertex_pos(v_id, 0) - ref_vertex_pos(vv[j], 0),
              ref_vertex_pos(v_id, 1) - ref_vertex_pos(vv[j], 1),
              ref_vertex_pos(v_id, 2) - ref_vertex_pos(vv[j], 2)
};


            Eigen::Vector<float, 3> pi_dash_vector = {
                deformed_vertex_pos(v_id, 0) - deformed_vertex_pos(vv[j], 0),
                deformed_vertex_pos(v_id, 1) - deformed_vertex_pos(vv[j], 1),
                deformed_vertex_pos(v_id, 2) - deformed_vertex_pos(vv[j], 2)};

            S = S + w * pi_vector * pi_dash_vector.transpose();
        }

        Eigen::Matrix3f U;         // left singular vectors
        Eigen::Matrix3f V;         // right singular vectors
        Eigen::Vector3f sing_val;  // singular values

        svd(S, U, sing_val, V);

        const float smallest_singular_value = sing_val.minCoeff();


        Eigen::Matrix3f R = V * U.transpose();

        if (R.determinant() < 0) {
            U.col(smallest_singular_value) =
                U.col(smallest_singular_value) * -1;
            R = V * U.transpose();
        }

        // Matrix R to vector attribute R

        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                rotations(v_id, i * 3 + j) = R(i, j);
    };

We first find the Single Value Decomposition (SVD) of the sum (\(S_{i} \)) where

\(S_{i}=\sum_{j\in N(i)} w_{ij}e_{ij}e^{‘T}_{ij} \) (Equation 5 in Sorkine and Alexa(2007))

We then use that to calculate rotation and assign that as the vertice’s rotation. After we calculate rotation, we calculate the right-hand side of Equation 8 as stated earlier.

auto init_lambda = [&](VertexHandle v_id, VertexIterator& vv) {
        // variable to store ith entry of b_mat
        Eigen::Vector3f bi(0.0f, 0.0f, 0.0f);

        // get rotation matrix for ith vertex
        Eigen::Matrix3f Ri = Eigen::Matrix3f::Zero(3, 3);

        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++)
                Ri(i, j) = rotations(v_id, i * 3 + j);
        }

        for (int nei_index = 0; nei_index < vv.size(); nei_index++) {
            // get rotation matrix for neightbor j
            Eigen::Matrix3f Rj = Eigen::Matrix3f::Zero(3, 3);
            for (int i = 0; i < 3; i++)
                for (int j = 0; j < 3; j++)
                    Rj(i, j) = rotations(vv[nei_index], i * 3 + j);

            // find rotation addition
            Eigen::Matrix3f rot_add = Ri + Rj;
            // find coord difference
            Eigen::Vector3f vert_diff = {
                ref_vertex_pos(v_id, 0) - ref_vertex_pos(vv[nei_index], 0),
                ref_vertex_pos(v_id, 1) - ref_vertex_pos(vv[nei_index], 1),
                ref_vertex_pos(v_id, 2) - ref_vertex_pos(vv[nei_index], 2)};

            // update bi
            bi = bi +
                 0.5 * weight_mat(v_id, vv[nei_index]) * rot_add * vert_diff;
        }

        if (constraints(v_id, 0) == 0) {
            b_mat(v_id, 0) = bi[0];
            b_mat(v_id, 1) = bi[1];
            b_mat(v_id, 2) = bi[2];
        } else {
            b_mat(v_id, 0) = deformed_vertex_pos(v_id, 0);
            b_mat(v_id, 1) = deformed_vertex_pos(v_id, 1);
            b_mat(v_id, 2) = deformed_vertex_pos(v_id, 2);
        }
    };

After calling both these kernels, we use the solve function to solve the equations. We then update the mesh with this new set of vertex positions.

Results

Using the implementation above, we can create a per-frame ARAP implementation to animate movement for a mesh. In the example below, we fix the upper half vertices of Spot the Cow and allow the bottom half to freely deform. We then move each of the bottom “feet” vertices in every frame to animate a swimming or walking-like movement in Spot.

The bottom feet vertices are user-deformed , the upper vertices are fixed, and the bottom half (except the feet) are free to move—and are determined by the ARAP algorithm.

Here’s another example with the dragon mesh.

Deform a vertex of the dragon’s jaw while keeping its feet fixed.

Further work

Polyscope shows us that the framerate of these demos is very poor (below 30 fps). We would expect it to do better, especially when using the GPU. A performance analysis as well as optimizing of the solver will be required to make the application reach desired levels of real-time processing.

References

[1] Olga Sorkine and Marc Alexa. 2007. As-rigid-as-possible surface modeling. In Proceedings of the fifth Eurographics symposium on Geometry processing (SGP ’07). Eurographics Association, Goslar, DEU, 109–116. https://doi.org/10.2312/SGP/SGP07/109-116

This blog was written by Sachin Kishan during the SGI 2024 Fellowship as one of the outcomes of a two-week project under the mentorship of Ahmed Mahmoud and the support of Supriya Gadi Patil as a teaching assistant.

Post author By sachinkishan02
Post date August 26, 2024
Tags ARAP, As-rigid-as-possible, GPU, RXMesh, SCP, sgi2024, Spectral Conformal Parameterization

Research Tutorials

An Introduction to RXMesh

Post author By sachinkishan02
Post date August 2, 2024

This post serves as a tutorial for using RXMesh.

RXMesh is a framework for processing triangle mesh on the GPU. It allows developers to easily use the GPU’s massive parallelism to perform common geometry processing tasks.

This tutorial will cover the following topics:

Setting up RXMesh
RXMesh basic workflow
Writing kernels for queries
Example: Visualizing face normals

Setting up RXMesh

For the sake of this tutorial, you can set up a fork of RXMesh and then create your own small section to implement your own tutorial examples in the fork.

Go to https://github.com/owensgroup/RXMesh and fork the repository. You can create your own directory with a simple CUDA script to start using RXMesh. To do so, go to https://github.com/owensgroup/RXMesh/blob/main/apps/CMakeLists.txt in your fork and add a line:

add_subdirectory(introduction)

In the apps directory on your local fork, create a folder called “introduction”. There should be two files in this folder.

A .cu file to run the code. You can call this introduction.cu. To start, it can have the following contents:

#include "rxmesh/query.cuh"
#include "rxmesh/rxmesh_static.h"

#include "rxmesh/matrix/sparse_matrix.cuh"

using namespace rxmesh;

int main(int argc, char** argv)
{
    Log::init();

    const uint32_t device_id = 0;
    cuda_query(device_id);

    RXMeshStatic rx(STRINGIFY(INPUT_DIR) "sphere3.obj");

#if USE_POLYSCOPE
    polyscope::show();
#endif
}

2. A CMake file called CMakeLists.txt. It must have the following contents:

add_executable(introduction)

set(SOURCE_LIST
    introduction.cu	
)

target_sources(introduction
    PRIVATE
    ${SOURCE_LIST}
)

set_target_properties(introduction PROPERTIES FOLDER "apps")

set_property(TARGET ARAP PROPERTY CUDA_SEPARABLE_COMPILATION ON)

source_group(TREE ${CMAKE_CURRENT_LIST_DIR} PREFIX "introduction" FILES ${SOURCE_LIST})

target_link_libraries(introduction
    PRIVATE RXMesh
)

#gtest_discover_tests( introduction )

Once you have done the above steps, building the project with CMake is simple. Refer to the README in https://github.com/owensgroup/RXMesh to know the requirements and steps to build the project for your device.

Once the project has been set up, we can begin to test and run our introduction.cu file. If you were to run it, you should get an output of a 3D sphere.

This means the project set up has been successful, we can now begin learning RXMesh.

RXMesh basic workflow

RXMesh provides custom data structures and functions to handle various low-level computational tasks that would otherwise require low-level implementation. Let’s explore the fundamentals of how this workflow operates.

To start, look at this piece of code.

int main(int argc, char** argv)
{
    Log::init();
    const uint32_t device_id = 0;
    rxmesh::cuda_query(device_id);

    rxmesh::RXMeshStatic rx(STRINGIFY(INPUT_DIR) "dragon.obj");
    auto polyscope_mesh = rx.get_polyscope_mesh();

    auto vertex_pos   = *rx.get_input_vertex_coordinates();
    auto vertex_color = *rx.add_vertex_attribute<float>("vColor", 3);

    rx.for_each_vertex(rxmesh::DEVICE,
    [vertex_color, vertex_pos] __device__(const rxmesh::VertexHandle vh)
    {
            vertex_color(vh, 0) = 0.9;
            vertex_color(vh, 1) = vertex_pos(vh, 1);
            vertex_color(vh, 2) = 0.9;
    });

    vertex_color.move(rxmesh::DEVICE, rxmesh::HOST);

    polyscope_mesh->addVertexColorQuantity("vColor", vertex_color);
    polyscope::show();

    return 0;
}

The above is a slightly simplified version of what a for_each computation program would look like using RXMesh. for_each involves accessing every mesh element of a specific type (vertex, edge, or face) and performing some process on/with it.

In the code example above, the computation adjusts the green component of the vertex’s color based on the vertex’s Y coordinate in 3D space.

But how do we comprehend this code? We will look at it line by line:

    rxmesh::RXMeshStatic rx(STRINGIFY(INPUT_DIR) "dragon.obj");

The above line declares the RXMeshStatic object. Static here stands for a static mesh (one whose connectivity remains constant for the duration of the program’s runtime). Using the RXMesh’s Input directory, we can use some meshes that come with it. In this case, we pass in the dragon.obj file, which holds the mesh data for a dragon.

    auto polyscope_mesh = rx.get_polyscope_mesh();

This line returns the data needed for Polyscope to display the mesh along with any attribute content we add to it for visualization.

    auto vertex_pos = *rx.get_input_vertex_coordinates();

This line is a special function that directly gives us the coordinates of the input mesh.

auto vertex_color = *rx.add_vertex_attribute<float>("vColor", 3);

This line gives vertex attributes called “vColor”. Attributes are simply data that lives on top of vertices, edges, or faces. To learn more about how to handle attributes in RXMesh, check out this. In this case, we associate three float-point numbers to each vertex in the mesh.

rx.for_each_vertex(rxmesh::DEVICE,
    [vertex_color, vertex_pos] __device__(const rxmesh::VertexHandle vh)
    {
            vertex_color(vh, 0) = 0.9;
            vertex_color(vh, 1) = vertex_pos(vh, 1);
            vertex_color(vh, 2) = 0.9;
    });

The above lines represent a lambda function that is utilized by the for_each computation. In this case, for_each_vertex accesses each of the vertices of the mesh associated with our dragon. We pass in the arguments:

rxmesh::DEVICE – to let RXMesh know this is happening on the device i.e., the GPU.
[vertex_color, vertex_pos] – represents the data we are passing to the function.
(const rxmesh::VertexHandle vh) – is the handle that is used in any RXMesh computation. A handle allows us to access data associated to individual geometric elements. In this case, we have a handle that allows us to access each vertex in the mesh.

What do these lines mean?

            vertex_color(vh, 0) = 0.9;
            vertex_color(vh, 1) = vertex_pos(vh, 1);
            vertex_color(vh, 2) = 0.9;

These 3 lines bring together a lot of the declarations made earlier. Through the kernel, we are accessing the attribute data we defined before. Since we need to know which vertex we are accessing its color, we pass in vh as the first argument. Since a vertex has three components (standing for RGB), we also need to pass which index in the attribute’s vector (you can think of it as a 1D array too) we are accessing. Hence vertex_color(vh, 0) = 0.9; which stands for “in the vertex color associated with the handle vh (which for the kernel represents a specific vertex on the mesh), the value of the first component is 0.9”. Note that this “first component” represents red for us.

What about vertex_color(vh, 1) = vertex_pos(vh, 1)? This line, similar to the previous one, is accessing the second component associated with the color, in the vertex the handle is associated with.

But what is on the right-hand side? We are accessing vertex_pos (our coordinates of each vertex in the mesh) and we are accessing it the same way we access our color. In this case, the line is telling us that we are accessing the 2nd positional coordinate (y coordinate) associated with our vertex (that our handle gives to the kernel).

    vertex_color.move(rxmesh::DEVICE, rxmesh::HOST);

This line moves the attribute data from the GPU to the CPU.

polyscope_mesh->addVertexColorQuantity("vColor", vertex_color);

This line uses Polyscope to add a “quantity” to Polyscope’s visualization of the mesh. In this case, when we add the VertexColorQuantity and pass vertex_color, Polyscope will now visualize the per-vertex color information we calculated in the lambda function.

    polyscope::show();

We finally render the entire mesh with our added quantities using the show() function from Polyscope.

Throughout this basic workflow, you may have noticed it works similarly to any other program where we receive some input data, perform some processing, and then render/output it in some form.

More specifically, we use RXMesh to read that data, set up our attributes, and then pass that into a kernel to perform some processing. Once we move that data from the device to the host, we can either perform further processing or render it using Polyscope.

It is important to look at the kernel as computing per geometric element. This means we only need to think of computation on a single mesh element of a certain type (i.e., vertex, edge, or face) since RXMesh then takes this computation and runs it in parallel on all mesh elements of that type.

Writing kernels for queries

While our basic workflow covers how to perform for_each operation using the GPU, we may often require geometric connectivity information for different geometry processing tasks. To do that, RXMesh implements various query operations. To understand the different types of queries, check out this part of the README.

We can say there are two parts to run a query.

The first part consists of creating the launchbox for the query, which defines the threads and (shared) memory allocated for the process along with calling the function from the host to run on the device.

The second part consists of actually defining what the kernel looks like.

We will look at these one by one.

Here’s an example of how a launchbox is created and used for a process where we want to find all the vertex normals:

// Vertex Normal
auto vertex_normals = rx.add_vertex_attribute("vNormals", 3);
constexpr uint32_t CUDABlockSize = 256;

rxmesh::LaunchBox<CUDABlockSize> launch_box;

rx.prepare_launch_box(
  {rxmesh::Op::FV},
  launch_box,
  (void*)compute_vertex_normal<float,CUDABlockSize>);

compute_vertex_normal<float, CUDABlockSize>
<<<launch_box.blocks, 
   launch_box.num_threads, 
   launch_box.smem_bytes_dyn>>>
(rx.get_context(), vertex_pos, *vertex_normals);

vertex_normals->move(rxmesh::DEVICE, rxmesh::HOST);

Notice how, in replacing the for_each part of our basic workflow, we instead declare the launchbox and call our function compute_vertex_normal (which we will look at next) from our main function.

We must also define the kernel which will run on the device.

template <typename T, uint32_t blockThreads>
__global__ static void compute_vertex_normal(const rxmesh::Context      context,
rxmesh::VertexAttribute<T> coords,
rxmesh::VertexAttribute<T> normals)
{


    auto vn_lambda = [&](FaceHandle face_id, VertexIterator& fv){
        // get the face's three vertices coordinates
        glm::fvec3 c0(coords(fv[0], 0), coords(fv[0], 1), coords(fv[0], 2));
        glm::fvec3 c1(coords(fv[1], 0), coords(fv[1], 1), coords(fv[1], 2));
        glm::fvec3 c2(coords(fv[2], 0), coords(fv[2], 1), coords(fv[2], 2));

        // compute the face normal
        glm::fvec3 n = cross(c1 - c0, c2 - c0);

        // the three edges length
        glm::fvec3 l(glm::distance2(c0, c1),
                     glm::distance2(c1, c2),
                     glm::distance2(c2, c0));

        // add the face's normal to its vertices
        for (uint32_t v = 0; v < 3; ++v) {      
// for every vertex in this face
            for (uint32_t i = 0; i < 3; ++i) {  
// for the vertex 3 coordinates
                atomicAdd(&normals(fv[v], i), n[i] / (l[v] + l[(v + 2) % 3]));
            }
        }
    };

    auto block = cooperative_groups::this_thread_block();

    Query<blockThreads> query(context);
    ShmemAllocator      shrd_alloc;
    query.dispatch<Op::FV>(block, shrd_alloc, vn_lambda);
}

A few things to note about our kernel

We pass in a few arguments to our kernel.
- We pass the context which allows RXMesh to access the data structures on the device.
- We pass in the coordinates of each vertex as a VertexAttribute
- We pass in the normals of each vertex as an attribute. Here, we accumulate the face’s normal on its three vertices.
The function that performs the processing is a lambda function. It takes in the handles and iterators as arguments. The type of the argument will depend on what type of query is used. In this case, since it is an FV query, we have access to the current face(face_id) the thread is acting on as well as the vertices of that face using fv.
Notice how within our lambda function, we do the same as before with our for_each operation, i.e., accessing the data we need using the attribute handle and processing it for some output for our attribute.
Outside our lambda function, notice how we need to set some things up with regards to memory and the type of query. After that, we call the lambda function to make it run using query.dispatch

Visualizing face normals

Now that we’ve learnt all the pieces required to do some interesting calculations using RXMesh, let’s try a new one out.

Try to visualize the face normals of a given mesh. This would mean obtaining an output like this for the dragon mesh given in the repository:

Here’s how to calculate the face normals of a mesh:

Take a vertex. Calculate two vectors that are formed by subtracting the selected vertex from the other two vertices.
Take the cross product of the two vectors.
The vector obtained from the cross product is the normal of the face.

Now try to perform the calculation above. We can do this for each face using the vertices it is connected to.

All the information above should give you the ability to implement this yourself. If required, the solution is given below.

The kernel that runs on the device:

template <typename T, uint32_t blockThreads>
global static void compute_face_normal(
const rxmesh::Context context,
rxmesh::VertexAttribute coords, // input
rxmesh::FaceAttribute normals) // output
{
auto vn_lambda = [&](FaceHandle face_id, VertexIterator& fv) {
    // get the face's three vertices coordinates

    glm::fvec3 c0(coords(fv[0], 0), coords(fv[0], 1), coords(fv[0], 2));
    glm::fvec3 c1(coords(fv[1], 0), coords(fv[1], 1), coords(fv[1], 2));
    glm::fvec3 c2(coords(fv[2], 0), coords(fv[2], 1), coords(fv[2], 2));

    // compute the face normal
    glm::fvec3 n = cross(c1 - c0, c2 - c0);
    normals(face_id, 0) = n[0];
    normals(face_id, 1) = n[1];
    normals(face_id, 2) = n[2];
};

auto block = cooperative_groups::this_thread_block();
Query<blockThreads> query(context);
ShmemAllocator shrd_alloc;
query.dispatch<Op::FV>(block, shrd_alloc, vn_lambda);

}

The function call from main on HOST:

int main(int argc, char** argv)
{
    Log::init();

    const uint32_t device_id = 0;
    cuda_query(device_id);

    rxmesh::RXMeshStatic rx(STRINGIFY(INPUT_DIR) "sphere3.obj");

    auto vertex_pos = *rx.get_input_vertex_coordinates();
    auto face_normals = rx.add_face_attribute<float>("fNorm", 3);

    constexpr uint32_t CUDABlockSize = 256;
    LaunchBox<CUDABlockSize> launch_box;
  
    compute_face_normal<float, CUDABlockSize>
    <<<launch_box.blocks,
    launch_box.num_threads,
    launch_box.smem_bytes_dyn>>>
    (rx.get_context(), vertex_pos, *face_normals);
    
    face_normals->move(DEVICE, HOST);

    rx.get_polyscope_mesh()->addFaceVectorQuantity("fNorm", *face_normals);

#if USE_POLYSCOPE
    polyscope::show();
#endif
}

As an extra example, try the following:

Color a vertex depending on how many vertices it is connected to.
Test the above on different meshes and see the results. You can access different kinds of mesh files from the “Input” folder in the RXMesh repo and add your own meshes if you’d like.

By the end of this, you should be able to do the following:

Know how to create new subdirectories in the RXMesh Fork
Know how to create your own for_each computations on the mesh
Know how to create your own queries
Be able to perform some basic geometric analyses’ using RXMesh queries

References

More information can be found in RXMesh GitHub Repository.

This blog was written by Sachin Kishan during the SGI 2024 Fellowship as one of the outcomes of a two week project under the mentorship of Ahmed Mahmoud and support of Supriya Gadi Patil as teaching assistant.

Post author By sachinkishan02
Post date August 2, 2024
Tags GPU, RXMesh, SGI, sgi2024