Voxel Performance

Charles is a software engineer and college professor interested in technology, medicine, economics, and nutrition.

Displaying in the geometry shader, triangle meshes, and Unity Terrain.

Voxel Problem

This project began when I wanted to make creatures, items, and landscapes out of voxels. I liked the idea of voxels because they can be used to build up and break down objects in an intuitive way. The problems were speed, memory use, and lack of standardization.

I knew I'd have to write my own voxel package, so I did a brief survey of some free and paid Assets for Unity. Most of them dodged any responsibility for performance, and were suited to using a few tens of thousands of voxels in a game scene the way you might use particle effects. I wanted to use voxels as a core game mechanic.

Even with bloated classes used for individual voxels, modern PCs have so much RAM that it's almost not a problem. The biggest problem was speed.


Profiling showed that converting a voxel chunk into a mesh of triangles took up over 90% of the runtime. Creating large voxel maps quickly was a simple matter of reusing calls to the Perlin noise function, and once converted to a mesh, a voxel volume performs similarly to any other object.
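As a rough CPU-side illustration of reusing noise calls, the sketch below samples the noise once per (x, z) column and reuses that value for every voxel in the column. The `value_noise` function is a simple stand-in written for this example, not Unity's Perlin noise, and the names `generate_chunk`, `size`, and `scale` are assumptions rather than anything from the asset.

```python
import math

def value_noise(x, z, seed=0):
    """Smooth 2D value noise in [0, 1); a simple stand-in for Perlin noise."""
    def lattice(ix, iz):
        # Deterministic pseudo-random value in [0, 1) for an integer lattice point.
        h = (ix * 374761393 + iz * 668265263 + seed * 144665) & 0xFFFFFFFF
        h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
        return (h ^ (h >> 16)) / 2 ** 32

    x0, z0 = math.floor(x), math.floor(z)
    fx, fz = x - x0, z - z0
    # Smoothstep weights give a continuous blend between lattice values.
    u = fx * fx * (3 - 2 * fx)
    v = fz * fz * (3 - 2 * fz)
    near = lattice(x0, z0) * (1 - u) + lattice(x0 + 1, z0) * u
    far = lattice(x0, z0 + 1) * (1 - u) + lattice(x0 + 1, z0 + 1) * u
    return near * (1 - v) + far * v

def generate_chunk(size=16, scale=0.1):
    """Return a flat bytearray of size**3 voxels (1 = solid, 0 = empty)."""
    voxels = bytearray(size ** 3)
    for z in range(size):
        for x in range(size):
            # One noise call per column, reused for every y in that column.
            height = int(value_noise(x * scale, z * scale) * size)
            for y in range(height):
                voxels[x + size * (y + size * z)] = 1
    return voxels

chunk = generate_chunk(16)
print(sum(chunk), "solid voxels out of", len(chunk))
```

A 256-sided chunk filled this way needs 65,536 noise calls instead of 16.7 million, which is why map generation itself was never the bottleneck.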

Converting chunks can easily be done in parallel, but with eight threads, performance was still less than satisfactory. Using a geometry shader to display voxels more directly was no help and began to lag seriously with volumes of only a few million voxels. Sending only surface voxels to the shader handed the problem back to CPU threads. CPU performance is adequate for updates but not for the initial massive volumes required for random map generation.


Using a series of Compute Shader kernels to produce maps and compress them into an array of visible voxels turned out to be a satisfactory solution. A Geometry Shader can accept a GPU buffer that contains the array to avoid any need to copy to and from main system RAM before the initial display.
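The compression step can be pictured with a serial CPU sketch. It assumes a voxel counts as visible when it is solid and at least one of its six face neighbors is empty or lies outside the chunk; the asset's actual kernels are not shown in this article, so the test and the name `visible_voxels` are assumptions made for illustration.

```python
def visible_voxels(voxels, size):
    """Return (x, y, z) for each solid voxel with an empty or out-of-chunk face neighbor."""
    def solid(x, y, z):
        if not (0 <= x < size and 0 <= y < size and 0 <= z < size):
            return False  # outside the chunk counts as empty
        return voxels[x + size * (y + size * z)] != 0

    neighbors = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    visible = []
    for z in range(size):
        for y in range(size):
            for x in range(size):
                if solid(x, y, z) and any(
                    not solid(x + dx, y + dy, z + dz) for dx, dy, dz in neighbors
                ):
                    visible.append((x, y, z))
    return visible

# A completely solid 4 x 4 x 4 cube: only the inner 2 x 2 x 2 block is hidden.
cube = bytearray([1]) * 64
print(len(visible_voxels(cube, 4)))  # 56
```

On the GPU the same per-voxel test runs in parallel, and the survivors are packed into one contiguous buffer that the Geometry Shader can read directly.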

In the Unity Asset I wrote to implement and test this process, I settled upon cube voxel chunks 256 on each side that use one byte per voxel. Sending a chunk to or from the GPU would take less than a second on a newer PCIe bus but would still be far too slow. With map generation and initial display wholly on the graphics card, the CPU is freed for other scene setup, and raw voxel data can be transferred as needed (if at all).


The speed of the Compute and Geometry Shaders will depend on hardware, but an onboard GPU takes about a third of a second per chunk. Converting to a mesh takes a little longer and uses more memory. Converting to Unity TerrainData is very fast, but only stores a height map.
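To see why a height map is fast to build but loses information, here is a minimal sketch that reduces a chunk to one height per column. The flat array layout and the name `height_map` are assumptions for this example; Unity's TerrainData stores normalized heights, so a real conversion would also divide each value by the chunk height.

```python
def height_map(voxels, size):
    """Return a size x size grid of column heights (highest solid y + 1, or 0 if empty)."""
    heights = [[0] * size for _ in range(size)]
    for z in range(size):
        for x in range(size):
            # Scan each column from the top down and stop at the first solid voxel.
            for y in range(size - 1, -1, -1):
                if voxels[x + size * (y + size * z)]:
                    heights[z][x] = y + 1
                    break
    return heights

size = 4
voxels = bytearray(size ** 3)
voxels[0 + size * (0 + size * 0)] = 1  # ground voxel at (0, 0, 0)
voxels[0 + size * (3 + size * 0)] = 1  # floating voxel at (0, 3, 0)
print(height_map(voxels, size)[0][0])  # 4
```

The gap between the two voxels in that column disappears in the result, which is the sense in which the terrain only stores a height map: caves and overhangs cannot be represented.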

Sixteen million voxels (256 x 256 x 256) in a chunk take up 16 MB at one byte per voxel. The surface voxels for each chunk, ready for display, typically take up 512 KB (128K x 4 bytes).
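The arithmetic behind those figures is worked through below. The 128K surface entries and 4 bytes per entry come from the typical case quoted above; the variable names are only for illustration.

```python
CHUNK_SIDE = 256
BYTES_PER_VOXEL = 1

# Raw chunk: one byte for every cell in the cube.
chunk_bytes = CHUNK_SIDE ** 3 * BYTES_PER_VOXEL

# Typical display buffer: 128K surface entries at 4 bytes each.
surface_entries = 128 * 1024
BYTES_PER_ENTRY = 4
surface_bytes = surface_entries * BYTES_PER_ENTRY

print(chunk_bytes // 2 ** 20, "MB raw")            # 16 MB raw
print(surface_bytes // 2 ** 10, "KB for display")  # 512 KB for display
print(chunk_bytes // surface_bytes, "x smaller")   # 32 x smaller
```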

Level five Menger sponge displaying 3.2 million voxels.

Stress Testing

The Menger sponge is a fractal that is sometimes used for stress testing voxel libraries because it has no hidden voxels. That is, every voxel has at least one visible face from some angle. A level five sponge has a bounding volume of 14.3 million voxels (243 x 243 x 243), of which 3.2 million are solid.
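Those counts can be checked with the standard digit test for the Menger sponge: a cell is removed if, at any level, at least two of its base-3 coordinate digits equal 1. The sketch below is a generic construction, not the asset's generator, and it brute-forces only a level three sponge to stay quick.

```python
def in_menger(x, y, z, level):
    """True if voxel (x, y, z) is solid in a Menger sponge of side 3**level."""
    for _ in range(level):
        # The cell is carved out when two or more base-3 digits are 1.
        if (x % 3 == 1) + (y % 3 == 1) + (z % 3 == 1) >= 2:
            return False
        x, y, z = x // 3, y // 3, z // 3
    return True

def count_solid(level):
    """Brute-force count of solid voxels; practical only for small levels."""
    side = 3 ** level
    return sum(
        in_menger(x, y, z, level)
        for z in range(side)
        for y in range(side)
        for x in range(side)
    )

print(count_solid(3))  # 8000, which is 20**3
print(20 ** 5)         # 3200000 solid voxels at level five
print((3 ** 5) ** 3)   # 14348907 cells in the 243-sided bounding cube
```

Each level keeps 20 of the 27 sub-cubes, so the solid count grows as 20 to the power of the level while the bounding volume grows as 27 to the same power.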

Using an onboard GPU, the frame rate varies from 5 to 9 FPS as the camera moves around and through the object.


The terrain demonstration video shows a grid of 16 chunks with all data generated at runtime, and near the end you can see the entire 1024x1024 map. Even one 256x256 voxel area could be a game scene, but with this asset much larger scenes can be generated and kept active with minimal impact on the CPU and main RAM.

Implementation: Geometry Shader

A Geometry Shader has some overhead. One reason is that it can produce a variable amount of output that will need to be moved into a contiguous memory block. My solution is to produce the same amount of output for each geometry function call, and to minimize branching and copying as much as possible.

Each cube face requires four vertices, and the camera can see at most three faces depending on its position relative to the voxel. This makes for three branches that draw very similar faces, except for a shift along one of the axes. The shift variable is assigned to minimize the effect of the branch and used to adjust the face position when the vertices are created.
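The shift trick is easier to follow outside the shader. This sketch mirrors, on the CPU, how the x-facing side is chosen in the shader excerpt in this section: one set of corner offsets is reused, and the shift flips x (and z, to preserve the winding order) when the camera is on the other side. The function name and its parameters are assumptions for illustration.

```python
def x_face_corners(center, size, camera_x, chunk_x=0.0):
    """Corners of the x-facing side nearest the camera, in triangle-strip order."""
    half = size * 0.5
    cx, cy, cz = center
    # Same choice as the shader's shift variable: keep the offsets as they are
    # when the camera is on the -x side, otherwise mirror them in x and z.
    sx, sy, sz = (1, 1, 1) if camera_x < cx + chunk_x else (-1, 1, -1)
    offsets = (
        (-half, -half, half),
        (-half, half, half),
        (-half, -half, -half),
        (-half, half, -half),
    )
    return [(cx + sx * ox, cy + sy * oy, cz + sz * oz) for ox, oy, oz in offsets]

print(x_face_corners((0.0, 0.0, 0.0), 1.0, camera_x=-10.0))
print(x_face_corners((0.0, 0.0, 0.0), 1.0, camera_x=10.0))
```

Because only the shift differs between the two cases, the vertex-building code after the comparison is identical for both, which keeps the cost of the branch small.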

See the full source code at the link below.

Geometry Shader

// For each voxel that is visible from some angle, paint the
// three sides that the given camera might see.
// Excerpt: only the x-facing side is shown here. The y- and z-facing
// sides follow the same pattern (see the full source).
[maxvertexcount(12)]  // 3 sides x 4 vertices each
void geom( point inputGS p[1], inout TriangleStream<input> triStream )
{
  float4 pos = p[0].pos * float4( _Size, _Size, _Size, 1 );
  float4 shift;
  float4 voxelPosition = pos + _chunkPosition;
  float halfS = _Size * 0.5;  // x, y, z is the center of the voxel,
                              // paint sides offset by half of Size
  input pIn1, pIn2, pIn3, pIn4;

  // The four corners share the voxel color and differ only in UV.
  pIn1._color = p[0]._color;
  pIn1.uv = float2( 0.0f, 0.0f );

  pIn2._color = p[0]._color;
  pIn2.uv = float2( 0.0f, 1.0f );

  pIn3._color = p[0]._color;
  pIn3.uv = float2( 1.0f, 0.0f );

  pIn4._color = p[0]._color;
  pIn4.uv = float2( 1.0f, 1.0f );

  // Pick the x-facing side nearest the camera. Flipping z together
  // with x keeps the winding order facing the camera.
  shift = (_cameraPosition.x < voxelPosition.x)
        ? float4( 1, 1, 1, 1 ) : float4( -1, 1, -1, 1 );

  pIn1.pos = mul( UNITY_MATRIX_VP, mul( _worldMatrixTransform,
                     pos + shift*float4( -halfS, -halfS, halfS, 0 ) ));
  triStream.Append( pIn1 );

  pIn2.pos = mul( UNITY_MATRIX_VP, mul( _worldMatrixTransform,
                     pos + shift*float4( -halfS, halfS, halfS, 0 ) ));
  triStream.Append( pIn2 );

  pIn3.pos = mul( UNITY_MATRIX_VP, mul( _worldMatrixTransform,
                     pos + shift*float4( -halfS, -halfS, -halfS, 0 ) ));
  triStream.Append( pIn3 );

  pIn4.pos = mul( UNITY_MATRIX_VP, mul( _worldMatrixTransform,
                     pos + shift*float4( -halfS, halfS, -halfS, 0 ) ));
  triStream.Append( pIn4 );

  triStream.RestartStrip();  // end this side's strip before the next side
}



Future Directions

I created this asset for my own use, packaging it into what I hope is a simple, performant, and reusable form. I have ideas for later versions, but I would also love to hear yours if you're willing to share them, or let me know how you're using this asset to conquer the world of voxels.

One idea I've been toying with lately is changing the pre-display optimization step to return several GPU buffers instead of just one. This would allow more specialized shaders to handle different voxel information, such as support for clouds, liquids, or a graphics change for different voxel faces.

Let me know what you think!

The video tutorial shown below is by Charles Humphrey, a scholar and a gentleman, who shows how to pass a GPU buffer from a Compute Shader to a Geometry Shader. The technique only recently became available in Unity, so it remains something of a black art. The tutorial also shows how to achieve some weather effects in Unity using billboard mode, so that snowflakes always face the camera. It's a bunch of neat shader tricks that I think he'd like to pursue further and put into one of his assets on the Unity store, but he's being pulled in ten different directions at once.