Whither GPU? - Demystifying Matrices

On Christmas last year, I came across a video by Tsoding. Now when I say that he has a gift for distilling complex and sometimes even scary topics in a way that completely makes sense, I’m not exaggerating. In this particular video, he quite literally demystifies the illusion of 3d graphics created on a flat 2d screen.

I spent some time building an extension to his project here, an attempt to demystify rendering on the GPU, by way of matrices.

MNKY | Repository

In this post, we’ll build a Canvas2d renderer to display a rotating cube, and then build intuition towards using matrices for linear transformations (like rotation).

Those who have already watched the video can skim all the way to Matrices.

Drawing a Cube

For simplicity, I will start with rendering using Canvas 2D APIs, and slowly move to the GPU. It will allow for seamlessly switching between the two rendering pipelines and seeing the difference.

There’s a bit of boilerplate in the repo, which you don’t necessarily need; however, using TypeScript and a bunch of tools (like Biome) makes the experience much more pleasant.

Instantiating Canvas2D

This requires creating an HTML canvas element and using JavaScript to grab a handle to it. If you haven’t watched Tsoding’s video, he shows some neat tricks like accessing elements by id directly on the global object; however, we’re going to do it the more verbose way.

// src/canvas.ts
function setupCanvas2d(element: HTMLCanvasElement) {
  // this wrapper will make sense later when we draw per frame
  const init = () => {
    element.width = CANVAS_SIZE;
    element.height = CANVAS_SIZE;

    // 1. handle to the canvas 2d context
    const context2d = element.getContext("2d");
    if (!context2d) {
      throw new Error("Failed to get drawing context");
    }

    // 2. set fill color for shapes
    context2d.fillStyle = CLEAR_COLOR;
    // 3. sets the entire canvas color to CLEAR_COLOR
    context2d.fillRect(0, 0, CANVAS_SIZE, CANVAS_SIZE);

    context2d.fillStyle = COLOR;
    // 4. rect takes the top-left corner (x, y), then a width and a height
    context2d.rect(0, 0, RECT_SIZE, RECT_SIZE);
    context2d.fill();
  };

  init();
}

(-1.0,  1.0)               ( 1.0,  1.0)
     ┌──────────────────────────┐
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │       CLEAR_COLOR        │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     └──────────────────────────┘
(-1.0, -1.0)               ( 1.0, -1.0)

1. The handle provides access to the actual drawing APIs.
2. Canvas2D APIs are imperative, which means that you set some state, and all subsequent API calls use the updated values. In this case, the following fillRect call fills the rect with CLEAR_COLOR.
3. This fills the entire canvas with a dark gray color, which gives me a decent enough starting point to continue drawing shapes on.
4. Finally, draw a rectangle. Note that rect takes the top-left corner followed by a width and a height (x2 - x1 and y2 - y1), not two corners. (0, 0) is the top left of the canvas, and positive values move towards the right and down. We’re going to use this rectangle as the background.

const root = document.querySelector<HTMLDivElement>("#app");

if (!root) {
  throw new Error("No root to bind");
}

// 1. attach the canvas to root
root.innerHTML = `
  <div>
    <div class="card">
      <canvas id="canvas-2d"></canvas>
    </div>
  </div>
`;

const canvas2d: HTMLCanvasElement | null = document.querySelector("#canvas-2d");

if (!canvas2d) {
  throw new Error("No canvas to bind");
}

// 2. call setupCanvas2d from the previous section
setupCanvas2d(canvas2d);

1. Attach a canvas element to the root, so we can draw things to it.
2. Call the setup function, which in turn draws the rectangle with CLEAR_COLOR.

Making a point

Now that drawing a background is out of the way, let’s try drawing some actual points. As we’re going to move to the WebGPU rendering pipeline pretty soon, we’ll create a mapping between the coordinate spaces used by Canvas2D and WebGPU for easy translation. This allows us to store points in a standardized form, used by portable formats like Obj. We’ll start with drawing a single point, move on to drawing quads, and finally draw a cube.

// src/canvas.ts

type Dimensions = {
  width: number;
  height: number;
};

interface Point2D {
  x: number;
  y: number;
}

// 1. NDC to screen space coordinates
export const screen = (
  { x, y }: Point2D,
  screenDimensions: Dimensions,
): Point2D => {
  return {
    x: ((x + 1) / 2) * screenDimensions.width,
    // 2. y-flip
    y: (1 - (y + 1) / 2) * screenDimensions.height,
  };
};

1. WebGPU uses [-1, -1] to [1, 1] as its coordinate space. We’ll use the same normalized coordinates when issuing draw commands to the Canvas2D API to keep things consistent. The types Dimensions and Point2D are just for readability and for encoding intent into the code.
2. The y-axis is flipped because WebGPU’s normalized coordinates increase from bottom to top, whereas Canvas2D’s y coordinates increase from top to bottom.
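To make the mapping concrete, here’s a small worked example. It’s only a sketch: ndcToScreen is a hypothetical copy of the screen helper above (renamed so it doesn’t clash with the browser’s global screen object), and the 400x400 canvas is an assumed size, not a value from the repo.

```typescript
// Hypothetical names: ndcToScreen mirrors the post's `screen` helper,
// and the 400x400 canvas size is an assumption for illustration.
type Dimensions = { width: number; height: number };
interface Point2D {
  x: number;
  y: number;
}

const ndcToScreen = ({ x, y }: Point2D, dims: Dimensions): Point2D => ({
  // [-1, 1] -> [0, 1] -> [0, width]
  x: ((x + 1) / 2) * dims.width,
  // same remap, then flipped so that +y ends up at the top of the canvas
  y: (1 - (y + 1) / 2) * dims.height,
});

const dims: Dimensions = { width: 400, height: 400 };

const topLeft = ndcToScreen({ x: -1, y: 1 }, dims); // { x: 0, y: 0 }
const center = ndcToScreen({ x: 0, y: 0 }, dims); // { x: 200, y: 200 }
const bottomRight = ndcToScreen({ x: 1, y: -1 }, dims); // { x: 400, y: 400 }
```

So y = 1 lands on the top row of pixels and y = -1 on the bottom row, which is exactly what the flip is for.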

// src/canvas.ts
function setupCanvas2d(element: HTMLCanvasElement) {
  const init = () => {
    ...
    context2d.fillStyle = COLOR;

    // 1. first point's coordinates converted to screen space
+   const point1 = screen({ x: -0.5, y: 0.5 }, SCREEN_DIMENSIONS);

    // 2. draw a square at the position
+   context2d.rect(point1.x, point1.y, POINT_SIZE, POINT_SIZE);
    ...
  };

  init();
}

(-1.0,  1.0)               ( 1.0,  1.0)
     ┌──────────────────────────┐
     │                          │
     │     •                    │
     │(-0.5,  0.5)              │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     └──────────────────────────┘
(-1.0, -1.0)               ( 1.0, -1.0)

1. [-0.5, 0.5] is the upper left corner of a square of side 1.0 centered at [0, 0], which gets converted to screen space coordinates.
2. There’s no reason to draw the point as a square, other than that it’s convenient.

Drawing a quad

Now let’s see if you can deduce the coordinates for the other three points.

(-1.0,  1.0)               ( 1.0,  1.0)
     ┌──────────────────────────┐
     │                          │
     │     •             •      │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │                          │
     │     •             •      │
     │                          │
     └──────────────────────────┘
(-1.0, -1.0)               ( 1.0, -1.0)

Drawing these points is quite straightforward now: we just have to repeat the previous step with different coordinates.

function setupCanvas2d(element: HTMLCanvasElement) {
  const init = () => {
    ...
    const point1 = screen({ x: -0.5, y: 0.5 }, SCREEN_DIMENSIONS);
    // 1. coordinates for the other three points
+   const point2 = screen({ x: 0.5, y: 0.5 }, SCREEN_DIMENSIONS);
+   const point3 = screen({ x: 0.5, y: -0.5 }, SCREEN_DIMENSIONS);
+   const point4 = screen({ x: -0.5, y: -0.5 }, SCREEN_DIMENSIONS);

    context2d.rect(point1.x, point1.y, POINT_SIZE, POINT_SIZE);
    // 2. draw the new points
+   context2d.rect(point2.x, point2.y, POINT_SIZE, POINT_SIZE);
+   context2d.rect(point3.x, point3.y, POINT_SIZE, POINT_SIZE);
+   context2d.rect(point4.x, point4.y, POINT_SIZE, POINT_SIZE);
    ...
  };

  init();
}

Just projecting

Let’s draw another set of points, but a little farther into the screen. So far, we don’t have a concept of depth, so let’s add another dimension to our points, along with a way to project 3d points onto a 2d surface. This is how the illusion of depth is created on a flat 2d screen: make some things smaller in comparison to others, and our brain magically thinks they are far away.

// src/canvas.ts
interface Point3D extends Point2D {
  z: number;
}

// 1. x and y scaled down in proportion to depth
const project = ({ x, y, z }: Point3D): Point2D => {
  return { x: x / z, y: y / z };
};

function setupCanvas2d(element: HTMLCanvasElement) {
  const init = () => {
    ...
-   const point1 = screen({ x: -0.5, y:  0.5 }, SCREEN_DIMENSIONS);
-   const point2 = screen({ x:  0.5, y:  0.5 }, SCREEN_DIMENSIONS);
-   const point3 = screen({ x:  0.5, y: -0.5 }, SCREEN_DIMENSIONS);
-   const point4 = screen({ x: -0.5, y: -0.5 }, SCREEN_DIMENSIONS);
    // project existing coordinates
+   const point1 = screen(project({ x: -0.5, y:  0.5, z: 1.0 }), SCREEN_DIMENSIONS);
+   const point2 = screen(project({ x:  0.5, y:  0.5, z: 1.0 }), SCREEN_DIMENSIONS);
+   const point3 = screen(project({ x:  0.5, y: -0.5, z: 1.0 }), SCREEN_DIMENSIONS);
+   const point4 = screen(project({ x: -0.5, y: -0.5, z: 1.0 }), SCREEN_DIMENSIONS);
    // add the second set of quad points
+   const point5 = screen(project({ x: -0.5, y:  0.5, z: 2.0 }), SCREEN_DIMENSIONS);
+   const point6 = screen(project({ x:  0.5, y:  0.5, z: 2.0 }), SCREEN_DIMENSIONS);
+   const point7 = screen(project({ x:  0.5, y: -0.5, z: 2.0 }), SCREEN_DIMENSIONS);
+   const point8 = screen(project({ x: -0.5, y: -0.5, z: 2.0 }), SCREEN_DIMENSIONS);

    ...
    // draw the second set of points after the first set
+   context2d.rect(point5.x, point5.y, POINT_SIZE, POINT_SIZE);
+   context2d.rect(point6.x, point6.y, POINT_SIZE, POINT_SIZE);
+   context2d.rect(point7.x, point7.y, POINT_SIZE, POINT_SIZE);
+   context2d.rect(point8.x, point8.y, POINT_SIZE, POINT_SIZE);

    ...
  };

  init();
}

This is a simplified projection: it makes objects farther from the observer smaller than nearer ones, in proportion to their depth.
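Here’s the same divide-by-depth idea with actual numbers, as a quick sketch. The project function is the one from above; distanceFromCenter is a hypothetical helper added only for this illustration.

```typescript
interface Point2D {
  x: number;
  y: number;
}
interface Point3D extends Point2D {
  z: number;
}

// Same simplified projection as in the post: divide x and y by depth
const project = ({ x, y, z }: Point3D): Point2D => ({ x: x / z, y: y / z });

// The same corner on the front face (z = 1) and on the back face (z = 2)
const near = project({ x: 0.5, y: 0.5, z: 1.0 }); // { x: 0.5, y: 0.5 }
const far = project({ x: 0.5, y: 0.5, z: 2.0 }); // { x: 0.25, y: 0.25 }

// Hypothetical helper: how far a projected point sits from the screen center
const distanceFromCenter = (p: Point2D): number =>
  Math.sqrt(p.x * p.x + p.y * p.y);

// Twice the depth puts the point about half as far from the center
const ratio = distanceFromCenter(far) / distanceFromCenter(near);
```

One thing to keep in mind with this formula: z = 0 divides by zero and a negative z mirrors the point, so it only makes sense for points in front of the observer.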

(-1.0,  1.0)               ( 1.0,  1.0)
     ┌──────────────────────────┐
     │                          │
     │   0 •            • 1     │
     │                          │
     │      4 •       • 5       │
     │                          │
     │                          │
     │                          │
     │      7 •       • 6       │
     │                          │
     │   3 •            • 2     │
     │                          │
     └──────────────────────────┘
(-1.0, -1.0)               ( 1.0, -1.0)

2d to 3d

To turn eight points into a cube we have to draw some lines. First, let’s clean up the code so it’s easier for us to draw lines between pairs of points.

// src/canvas.ts
const POINTS: Point3D[] = [
  // front face
  { x: -0.5, y: 0.5, z: 1.0 },
  { x: 0.5, y: 0.5, z: 1.0 },
  { x: 0.5, y: -0.5, z: 1.0 },
  { x: -0.5, y: -0.5, z: 1.0 },
  // back face
  { x: -0.5, y: 0.5, z: 2.0 },
  { x: 0.5, y: 0.5, z: 2.0 },
  { x: 0.5, y: -0.5, z: 2.0 },
  { x: -0.5, y: -0.5, z: 2.0 },
];

// 1. point pairs to draw lines between
const EDGES = [
  // front face points
  [0, 1],
  [1, 2],
  [2, 3],
  [3, 0],
  // back face points
  [4, 5],
  [5, 6],
  [6, 7],
  [7, 4],
  // connected corners
  [0, 4],
  [1, 5],
  [2, 6],
  [3, 7],
];

Now let’s draw some lines between these points, so it’s easier to see the cube.

function setupCanvas2d(element: HTMLCanvasElement) {
  // this wrapper will make sense later when we draw per frame
  const init = () => {
    ...

    context2d.fillRect(0, 0, CANVAS_SIZE, CANVAS_SIZE);
    // 1. set stroke style to the same color as the points
+   context2d.strokeStyle = COLOR;
+   context2d.lineWidth = 2;

    // 2. start the path
+   context2d.beginPath();

    // 3. for each edge; draw line between the first and second point
    for (const edge of EDGES) {

      const from = screen(project(POINTS[edge[0]]), SCREEN_DIMENSIONS);
      const to = screen(project(POINTS[edge[1]]), SCREEN_DIMENSIONS);

      // this moves the point without drawing a line
      context2d.moveTo(from.x, from.y);
      // actually draw a line
      context2d.lineTo(to.x, to.y);
    }
    // 4. submit the path to create a stroke
    context2d.stroke();
  };

  init();
}

(-1.0,  1.0)               ( 1.0,  1.0)
     ┌──────────────────────────┐
     │                          │
     │    0┌─────────────┐1     │
     │     │ \         / │      │
     │     │ 4┌───────┐5 │      │
     │     │  │       │  │      │
     │     │  │       │  │      │
     │     │  │       │  │      │
     │     │ 7└───────┘6 │      │
     │     │ /         \ │      │
     │    3└─────────────┘2     │
     │                          │
     └──────────────────────────┘
(-1.0, -1.0)               ( 1.0, -1.0)
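The drawing loop mixes geometry with Canvas2D calls. As a sketch of just the geometry, here’s one way to turn POINTS and EDGES into projected 2d line segments and sanity check the edge list. Segment and toSegments are hypothetical names for this illustration, not something from the repo.

```typescript
interface Point2D {
  x: number;
  y: number;
}
interface Point3D extends Point2D {
  z: number;
}
// Hypothetical type: one projected line, ready for moveTo/lineTo
type Segment = { from: Point2D; to: Point2D };

const POINTS: Point3D[] = [
  { x: -0.5, y: 0.5, z: 1.0 },
  { x: 0.5, y: 0.5, z: 1.0 },
  { x: 0.5, y: -0.5, z: 1.0 },
  { x: -0.5, y: -0.5, z: 1.0 },
  { x: -0.5, y: 0.5, z: 2.0 },
  { x: 0.5, y: 0.5, z: 2.0 },
  { x: 0.5, y: -0.5, z: 2.0 },
  { x: -0.5, y: -0.5, z: 2.0 },
];

const EDGES: [number, number][] = [
  [0, 1], [1, 2], [2, 3], [3, 0], // front face
  [4, 5], [5, 6], [6, 7], [7, 4], // back face
  [0, 4], [1, 5], [2, 6], [3, 7], // connected corners
];

const project = ({ x, y, z }: Point3D): Point2D => ({ x: x / z, y: y / z });

// Look up both endpoints of every edge and project them to 2d
const toSegments = (points: Point3D[], edges: [number, number][]): Segment[] =>
  edges.map(([a, b]) => ({ from: project(points[a]), to: project(points[b]) }));

const segments = toSegments(POINTS, EDGES); // 12 segments, one per cube edge

// Sanity check: every corner of a cube touches exactly three edges
const edgesPerPoint = POINTS.map(() => 0);
EDGES.forEach(([a, b]) => {
  edgesPerPoint[a] += 1;
  edgesPerPoint[b] += 1;
});
```

Each segment can then be passed through screen and drawn with moveTo and lineTo, exactly as in the loop above.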

Even though we can kinda see a cube, the next step will drive its three-dimensionality home.

Rotating the cube

Can you visualize a fully rotating cube in your mind? Mine isn’t very consistent, it jumps around, and the color changes from frame to frame.

First, let’s organize the code, so it’s easier to animate the rotating cube.

// src/canvas.ts
function setupCanvas2d(element: HTMLCanvasElement) {
  let context2d: CanvasRenderingContext2D | null = null;

  // 1.1
  const init = () => {
    ...

-   const context2d = element.getContext('2d');
+   context2d = element.getContext('2d');
  };

  // 1.2
+ const clear = () => {
+   // context2d is still null until init has run
+   if (!context2d) return;
+   context2d.fillStyle = CLEAR_COLOR;
+   context2d.fillRect(0, 0, CANVAS_SIZE, CANVAS_SIZE);
+ };

  // 1.3
+ const draw = () => {
    if (!context2d) {
      throw new Error('Failed to get drawing context');
    }
+   clear();
+   context2d.beginPath();

    context2d.stroke();

    // 2. call draw 60 times a second
+   setTimeout(draw, 1000/60);
  };

  init();
  // 3. start the draw loop
+ draw();
}

Now, let’s actually rotate the cube. For now, we’re just going to rotate the cube around the y-axis, which is another way to say that we’re rotating in the xz-plane, so the y coordinates of our cube remain unchanged. If you’d like to explore how the rotation formulae work, check out this excellent explanation: Rotation matrix derivation. The video goes over rotation around the z-axis; however, rotation around the y-axis is quite similar.

const rotate_xz = ({ x, y, z }: Point3D, angle: number) => {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return {
    x: x * c - z * s,
    y,
    z: x * s + z * c,
  };
};
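A quick way to convince yourself the formula does what it says is to feed it a quarter turn. This is just a sketch using the rotate_xz function from above; the sample point and the lengthXZ helper are made up for the illustration.

```typescript
interface Point3D {
  x: number;
  y: number;
  z: number;
}

const rotate_xz = ({ x, y, z }: Point3D, angle: number): Point3D => {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return {
    x: x * c - z * s,
    y,
    z: x * s + z * c,
  };
};

// A point on the x-axis, rotated by a quarter turn, ends up on the z-axis
const start: Point3D = { x: 1, y: 0.5, z: 0 };
const quarterTurn = rotate_xz(start, Math.PI / 2);
// x is approximately 0 (floating point leaves a tiny residue), y stays 0.5, z is 1

// Hypothetical helper: distance from the y-axis, which a rotation must preserve
const lengthXZ = (p: Point3D): number => Math.sqrt(p.x * p.x + p.z * p.z);
const before = lengthXZ(start);
const after = lengthXZ(rotate_xz(start, 0.7)); // any angle gives the same length
```

Note that y passes through untouched, and the distance from the y-axis doesn’t change, which is what makes this a rotation rather than some other distortion.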

// src/canvas.ts
function setupCanvas2d(element: HTMLCanvasElement) {
  let context2d: CanvasRenderingContext2D | null = null;

  // 1.
+ let angle = 0;
+ const dt = 1/FPS;

  // 2.
+ const draw = () => {
+   angle += (Math.PI * dt) / 2;
    context2d.beginPath();

    for (const edge of EDGES) {
      // 3. rotate the points each frame as angle accumulates
-     const from = screen(project(POINTS[edge[0]]), SCREEN_DIMENSIONS);
-     const to = screen(project(POINTS[edge[1]]), SCREEN_DIMENSIONS);
+     const from = screen(project(rotate_xz(POINTS[edge[0]], angle)), SCREEN_DIMENSIONS);
+     const to = screen(project(rotate_xz(POINTS[edge[1]], angle)), SCREEN_DIMENSIONS);

    ...
  };

  ...
}
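It’s worth spelling out what the angle increment means. The sketch below assumes FPS is 60, matching the 1000/60 timeout used for the draw loop; the actual constant in the repo may differ.

```typescript
// Assumption: FPS is 60, matching the setTimeout(draw, 1000/60) call
const FPS = 60;
const dt = 1 / FPS;

// Simulate one second's worth of frames
let angle = 0;
for (let frame = 0; frame < FPS; frame++) {
  angle += (Math.PI * dt) / 2;
}
// angle is now approximately Math.PI / 2, i.e. a quarter turn per second

// A full turn is 2 * PI radians, so one revolution takes about 4 seconds
const radiansPerSecond = Math.PI / 2;
const secondsPerRevolution = (2 * Math.PI) / radiansPerSecond; // 4
```

Since setTimeout only guarantees a minimum delay, the real rotation speed can drift a little from this ideal figure.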

Matrices

For this next trick, we’re going to look at why people won’t shut up about matrices when it comes to graphics programming. Matrices are the hot shit, but what are they? They are, among other things, a data structure that makes it trivial to represent linear transformations like rotation. Let’s decompose our original rotation function into something that can be represented by a matrix. The Rotation matrix derivation video provides a great explanation of how to get to a matrix from the trigonometry for 2 dimensions; however, we need a solution for 3 dimensions.

┌    ┐   ┌         ┐┌   ┐   ┌                 ┐   ┌           ┐
│ x' │   │ c  0 -s ││ x │   │ c*x + 0*y - s*z │   │ c*x - s*z │
│ y' │ = │ 0  1  0 ││ y │ = │ 0*x + 1*y + 0*z │ = │       1*y │
│ z' │   │ s  0  c ││ z │   │ s*x + 0*y + c*z │   │ s*x + c*z │
└    ┘   └         ┘└   ┘   └                 ┘   └           ┘
         │          │       │        
         │          input   output
         xz-rotation

Matrix multiplication

If you’re unfamiliar with matrix multiplication it can seem intimidating. However, it’s a monster that seems less scary every time you come back and look at it again.

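We take the first element of the first row of the rotation matrix, multiply it by the first element of the input column vector, and place the product in the first position of the output vector. We then do the same for the second and third elements of the row and the column, adding each product to that same position.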

                  col      output           
                   ↓         ↓              
     ┌         ┐ ┌   ┐   ┌                 ┐
row →│ c  •  • │ │ x │   │ c*x +   • +   • │
     │ •  •  • │ │ • │ = │   • +   • +   • │
     │ •  •  • │ │ • │   │   • +   • +   • │
     └         ┘ └   ┘   └                 ┘
                   ↓               ↓        
     ┌         ┐ ┌   ┐   ┌                 ┐
    →│ •  0  • │ │ • │   │   • + 0*y +   • │
     │ •  •  • │ │ y │ = │   • +   • +   • │
     │ •  •  • │ │ • │   │   • +   • +   • │
     └         ┘ └   ┘   └                 ┘
                   ↓                     ↓  
     ┌         ┐ ┌   ┐   ┌                 ┐
    →│ •  • -s │ │ • │   │   • +   • - s*z │
     │ •  •  • │ │ • │ = │   • +   • +   • │
     │ •  •  • │ │ z │   │   • +   • +   • │
     └         ┘ └   ┘   └                 ┘
                                            
// completed row                            
     ┌         ┐ ┌   ┐   ┌                 ┐ 
     │ c  0 -s │ │ x │   │ c*x + 0*y - s*z │←
     │ •  •  • │ │ y │ = │   • +   • +   • │ 
     │ •  •  • │ │ z │   │   • +   • +   • │ 
     └         ┘ └   ┘   └                 ┘

Then the second row.

                  col                       
                   ↓                        
     ┌         ┐ ┌   ┐   ┌                 ┐ 
     │ •  •  • │ │ x │   │   • +   • +   • │ 
row →│ 0  1  0 │ │ y │ = │ 0*x + 1*y + 0*z │←
     │ •  •  • │ │ z │   │   • +   • +   • │ 
     └         ┘ └   ┘   └                 ┘

And finally the last one.

                  col                       
                   ↓                        
     ┌         ┐ ┌   ┐   ┌                 ┐ 
     │ •  •  • │ │ x │   │   • +   • +   • │ 
     │ •  •  • │ │ y │ = │   • +   • +   • │ 
row →│ s  0  c │ │ z │   │ s*x + 0*y + c*z │← 
     └         ┘ └   ┘   └                 ┘

Which brings us to the output.

┌    ┐   ┌         ┐┌   ┐   ┌                 ┐   ┌           ┐
│ x' │   │ c  0 -s ││ x │   │ c*x + 0*y - s*z │   │ c*x - s*z │
│ y' │ = │ 0  1  0 ││ y │ = │ 0*x + 1*y + 0*z │ = │       1*y │
│ z' │   │ s  0  c ││ z │   │ s*x + 0*y + c*z │   │ s*x + c*z │
└    ┘   └         ┘└   ┘   └                 ┘   └           ┘
         │          │       │        
         │          input   output
         xz-rotation
i.e.
x' = c*x - s*z
y' = y
z' = s*x + c*z

If you look closely at the final output you’ll see that values of x', y', z' look the same as the return value of our rotate_xz function.

// src/canvas.ts
const rotate_xz = ({ x, y, z }: Point3D, angle: number) => {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return {
    x: x * c - z * s,
    y,
    z: x * s + z * c,
  };
};
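As a sanity check of that equivalence, here’s a sketch that does the row-by-column multiplication from the diagrams and compares it with rotate_xz. Row, Mat3, dot, multiply and rotationXZ are hypothetical names for this illustration, and the sample point and angle are arbitrary.

```typescript
interface Point3D {
  x: number;
  y: number;
  z: number;
}
type Row = [number, number, number];
type Mat3 = [Row, Row, Row];

// One output component: a row of the matrix times the input column
const dot = (row: Row, p: Point3D): number =>
  row[0] * p.x + row[1] * p.y + row[2] * p.z;

// Repeat for each of the three rows
const multiply = (m: Mat3, p: Point3D): Point3D => ({
  x: dot(m[0], p),
  y: dot(m[1], p),
  z: dot(m[2], p),
});

// The xz-rotation matrix from the diagrams
const rotationXZ = (angle: number): Mat3 => {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return [
    [c, 0, -s],
    [0, 1, 0],
    [s, 0, c],
  ];
};

const rotate_xz = ({ x, y, z }: Point3D, angle: number): Point3D => {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return { x: x * c - z * s, y, z: x * s + z * c };
};

const p: Point3D = { x: -0.5, y: 0.5, z: 1.0 };
const viaMatrix = multiply(rotationXZ(Math.PI / 6), p);
const viaFunction = rotate_xz(p, Math.PI / 6);
// Both produce the same point, up to floating point rounding
```

The matrix version looks like more work for a single rotation, but the same multiply function works for any 3x3 transformation, which is where the payoff is.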

So how do we implement this in code? Let’s first create a matrix class that we can use to abstract away all this math. I’ll keep this section concise; the full diff is here.

// src/mat4x4.ts

class Mat4x4 {
  multiplyPoint(point: Point3D): Point3D {
    const { x, y, z } = point;
    const m = this.mat;

    return {
      x: m[0] * x + m[1] * y + m[2] * z + m[3],
      y: m[4] * x + m[5] * y + m[6] * z + m[7],
      z: m[8] * x + m[9] * y + m[10] * z + m[11],
    };
  }
}

Now we can update the caller to use the matrix instead.

// src/canvas.ts
  const draw = (shape: Shape) => {
    ...
    angle += (Math.PI * dt) / 2;
+   const rotationMatrix = Mat4x4.fromRotationXZ(angle);

    for (const face of shape.faces) {
      ...
-     const from = screen(project(rotate_xz(POINTS[edge[0]], angle)), SCREEN_DIMENSIONS);
-     const to = screen(project(rotate_xz(POINTS[edge[1]], angle)), SCREEN_DIMENSIONS);
+     const from = screen(project(rotationMatrix.multiplyPoint(POINTS[edge[0]])), SCREEN_DIMENSIONS);
+     const to = screen(project(rotationMatrix.multiplyPoint(POINTS[edge[1]])), SCREEN_DIMENSIONS);
    }
  }
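The snippets above don’t show how the matrix stores its values or what fromRotationXZ builds; the full diff has the real thing. Here’s a minimal sketch of what it could look like, assuming a row-major array of 16 numbers, which is what the indices in multiplyPoint suggest. SketchMat4x4 and fromTranslation are hypothetical names, not the repo’s API.

```typescript
interface Point3D {
  x: number;
  y: number;
  z: number;
}

// Hypothetical stand-in for the post's Mat4x4; the storage layout is an assumption
class SketchMat4x4 {
  // 16 numbers in row-major order: mat[row * 4 + col]
  private readonly mat: number[];

  private constructor(mat: number[]) {
    this.mat = mat;
  }

  // The 3x3 xz-rotation from earlier, padded out to 4x4
  static fromRotationXZ(angle: number): SketchMat4x4 {
    const c = Math.cos(angle);
    const s = Math.sin(angle);
    return new SketchMat4x4([
      c, 0, -s, 0,
      0, 1, 0, 0,
      s, 0, c, 0,
      0, 0, 0, 1,
    ]);
  }

  // Hypothetical factory: the fourth column holds a translation
  static fromTranslation(tx: number, ty: number, tz: number): SketchMat4x4 {
    return new SketchMat4x4([
      1, 0, 0, tx,
      0, 1, 0, ty,
      0, 0, 1, tz,
      0, 0, 0, 1,
    ]);
  }

  // Same arithmetic as the post's multiplyPoint: the point is treated as (x, y, z, 1)
  multiplyPoint({ x, y, z }: Point3D): Point3D {
    const m = this.mat;
    return {
      x: m[0] * x + m[1] * y + m[2] * z + m[3],
      y: m[4] * x + m[5] * y + m[6] * z + m[7],
      z: m[8] * x + m[9] * y + m[10] * z + m[11],
    };
  }
}

const rotated = SketchMat4x4.fromRotationXZ(Math.PI / 2).multiplyPoint({ x: 1, y: 0, z: 0 });
// approximately { x: 0, y: 0, z: 1 }
const moved = SketchMat4x4.fromTranslation(0, 0, 1).multiplyPoint({ x: 0.5, y: 0.5, z: 0 });
// { x: 0.5, y: 0.5, z: 1 }
```

The extra column is why multiplyPoint adds m[3], m[7] and m[11]: with a 4x4 matrix, moving a point can be expressed in the same form as rotating it.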

This concludes the first part of the Whither GPU series. In the next part, we’ll write a wireframe renderer using WebGPU.
