Geometry Clipmaps

Image courtesy of Losasso & Hoppe.

I am currently reading the paper “Geometry Clipmaps: Terrain Rendering Using Nested Regular Grids” by Frank Losasso and Hugues Hoppe for my next project in IRG, which is about terrain rendering.

What is a Geometry Clipmap?

A geometry clipmap caches the terrain in a set of nested regular grids centered about the viewer. The grids are filtered versions of the terrain at power-of-two resolutions, and they are incrementally updated with new data as the viewer moves.

What is the difference between Mipmaps / Texture Clipmaps / Geometry Clipmaps?

A mipmap is a pyramid of images of the same scene. The set of images goes from fine to coarse, where the highest mipmap level consists of just one pixel. The mipmap level rendered at a pixel is a function of the screen-space parametric derivatives, which depend on the view parameters and not on the content of the image.

A texture clipmap caches a view dependent subset of a mipmap pyramid.  The visible subset of the lower mip levels is limited by the resolution of the display. So it’s not necessary to keep the entire texture in memory. So you “clip” the mipmap to only the region needed to render the scene. Texture clipmaps compute LOD (level of detail) per-pixel based on screen space projected geometry.

With terrains, geometry for screen space does not exist until the level of detail is selected. But texture clipmaps compute LOD per-pixel based on existing geometry. See the problem?

So geometry clipmaps select the LOD in world space based on viewer distance. They do this using a set of nested rectangular regions about the viewpoint, with transition regions to blend between LOD levels.
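To make the world-space selection concrete, here is a minimal sketch. It assumes levels are numbered 1 (coarsest) to m (finest), that level l has grid spacing `finest_spacing * 2**(m - l)`, and that each level covers a square of side n times its spacing centered on the viewer. The function name and parameters are my own, not from the paper.

```python
def finest_level_at(px, py, viewer, n, finest_spacing, m):
    """Return the finest level (1 = coarsest, m = finest) whose square
    region around the viewer contains the world-space point (px, py).

    Assumption: level l covers a square of side n * spacing(l), where
    spacing(l) = finest_spacing * 2 ** (m - l).
    """
    vx, vy = viewer
    # Regions are squares, so use the larger of the two axis distances.
    d = max(abs(px - vx), abs(py - vy))
    for level in range(m, 0, -1):  # try the finest level first
        spacing = finest_spacing * 2 ** (m - level)
        if d <= 0.5 * n * spacing:
            return level
    return None  # the point lies outside the coarsest level
```

Note that nothing here looks at the terrain heights or the screen: the level depends only on the distance to the viewer.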

Refinement Hierarchy

The geometry clipmap’s refinement hierarchy is based on viewer-centric grids and ignores local surface geometry.

Overview of Geometry Clipmap

A geometry clipmap consists of m levels of the terrain pyramid. Each level contains an n x n array of vertices, stored as a vertex buffer in video memory. Each vertex contains (x,y,z,z_c) coordinates, where z_c is the height value at (x,y) in the next coarser level (used for transition morphing).


Each clipmap level contains associated texture image(s), which are stored as an 8-bit-per-channel normal map for surface shading (more efficient than storing per-vertex normals). The normal map is computed from the geometry whenever the clipmap is updated.

Per-frame Algorithm

– determine the desired active regions (the extent we wish to render)

– update the geometry clipmap

– crop the active regions to the clip regions (the world extent of the n x n grid of data stored at that level), and render.

Computing Desired Active Regions

The approximate screen-space triangle size s in pixels is given by

s = (1.25) W / (n tan(phi/2))

where W is the window size in pixels, phi is the field of view, and n is the clipmap size.

If W = 640 pixels, phi = 90 degrees, we obtain good results with clipmap size n=255.

Normal maps are stored at twice the resolution, which gives 1.5 pixels per texture sample.
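As a quick sanity check of these numbers, here is a small sketch that evaluates the estimate s = 1.25 W / (n tan(phi/2)) as I read it from the paper. The function name is my own.

```python
import math


def triangle_size_pixels(window_px, fov_deg, n):
    """Approximate screen-space triangle size: 1.25 * W / (n * tan(phi / 2))."""
    half_fov = math.radians(fov_deg) / 2.0
    return 1.25 * window_px / (n * math.tan(half_fov))


s = triangle_size_pixels(640, 90.0, 255)
print(f"triangle size:     {s:.2f} px")      # prints 3.14
print(f"normal-map sample: {s / 2:.2f} px")  # prints 1.57
```

So triangles come out roughly 3 pixels wide, and a normal map at twice the resolution lands at about 1.5 pixels per sample.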

Geometry Clipmap Update

Instead of copying the old data when a level shifts, we fill only the newly exposed L-shaped region. The rest of the data can stay where it is because the texture lookup wraps around using mod operations on x and y. The new data comes from either decompressing the terrain (for coarser levels) or synthesizing it (for finer levels). The finer-level texture is synthesized from the coarser one using an interpolatory subdivision scheme.
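Here is a minimal CPU-side sketch of the wrap-around idea, assuming a hypothetical `fetch(x, y)` callback that stands in for decompression or synthesis. The class and method names are mine; on the GPU the same effect comes from wrapped texture addressing.

```python
class ToroidalLevel:
    """n x n height cache addressed with wrap-around (mod n) indices."""

    def __init__(self, n, origin_x, origin_y, fetch):
        self.n = n
        self.ox, self.oy = origin_x, origin_y  # min corner of the window
        self.fetch = fetch                     # hypothetical data source
        self.data = [[0.0] * n for _ in range(n)]
        for y in range(origin_y, origin_y + n):
            for x in range(origin_x, origin_x + n):
                self.data[y % n][x % n] = fetch(x, y)

    def get(self, x, y):
        """Read the sample at grid coordinates (x, y) inside the window."""
        assert self.ox <= x < self.ox + self.n
        assert self.oy <= y < self.oy + self.n
        return self.data[y % self.n][x % self.n]

    def shift(self, new_ox, new_oy):
        """Move the window and rewrite only the newly exposed samples.

        Returns the number of samples written. For clarity this loops over
        the whole window; a real implementation would visit only the
        L-shaped region.
        """
        n = self.n
        written = 0
        for y in range(new_oy, new_oy + n):
            for x in range(new_ox, new_ox + n):
                in_old = (self.ox <= x < self.ox + n
                          and self.oy <= y < self.oy + n)
                if not in_old:
                    self.data[y % n][x % n] = self.fetch(x, y)
                    written += 1
        self.ox, self.oy = new_ox, new_oy
        return written
```

For an 8 x 8 level shifted by (2, -1), only 64 - 6 * 7 = 22 samples are rewritten; the other 42 never move in memory.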

Constraints on the clipmap regions

– clip regions are nested for coarse-to-fine geometry prediction. Prediction requires maintaining one grid unit on all sides.

– rendered data (active region) is subset of data present in clipmap (clip region).

– perimeter of active region must lie on even vertices for watertight boundary with coarser level.

– render region (active region) must be at least two grid units wide to allow a continuous transition between levels.
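Two of these constraints are easy to sketch in code. The snippet below assumes a region is an integer tuple (xmin, ymin, xmax, ymax) in the level's own grid indices, where even indices coincide with vertices of the next coarser level. Both helper names are my own.

```python
def snap_to_even(rect):
    """Shrink (xmin, ymin, xmax, ymax) so every edge lies on an even index."""
    xmin, ymin, xmax, ymax = rect
    # Python's % is non-negative for a positive modulus, so this also
    # works for negative indices.
    xmin += xmin % 2  # round the min edges up to even
    ymin += ymin % 2
    xmax -= xmax % 2  # round the max edges down to even
    ymax -= ymax % 2
    return (xmin, ymin, xmax, ymax)


def is_renderable(rect, min_width=2):
    """True if the region is at least min_width grid units wide on both axes."""
    xmin, ymin, xmax, ymax = rect
    return (xmax - xmin) >= min_width and (ymax - ymin) >= min_width
```

Snapping inward rather than outward keeps the active region inside the clip region, so the subset constraint is preserved.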

Rendering the Geometry Clipmap

// Crop the active regions
for each level l = 1:m in coarse-to-fine order:
    crop active_region(l) to clip_region(l)
    crop active_region(l) to active_region(l-1)
// Render all levels
for each level l = 1:m in fine-to-coarse order:
    render_region(l) = active_region(l) - active_region(l+1)
    render render_region(l)
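The pseudocode above can be sketched with plain rectangles. This version assumes regions are (xmin, ymin, xmax, ymax) tuples in a shared world coordinate system, ordered coarse to fine, and it returns each hollow render region as up to four rectangles instead of drawing anything. The function names are mine.

```python
def intersect(a, b):
    """Intersection of two rects, or None if it is empty."""
    r = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return r if r[0] < r[2] and r[1] < r[3] else None


def ring(outer, inner):
    """outer minus inner as up to four rectangles (inner lies inside outer)."""
    if inner is None:
        return [outer]
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    parts = [
        (ox0, oy0, ox1, iy0),  # bottom strip
        (ox0, iy1, ox1, oy1),  # top strip
        (ox0, iy0, ix0, iy1),  # left strip
        (ix1, iy0, ox1, iy1),  # right strip
    ]
    return [p for p in parts if p[0] < p[2] and p[1] < p[3]]


def render_regions(active, clip):
    """Return, per level, the list of rectangles to render."""
    m = len(active)
    cropped = []
    for l in range(m):  # crop in coarse-to-fine order
        r = intersect(active[l], clip[l])
        if r is not None and l > 0:
            parent = cropped[l - 1]
            r = intersect(r, parent) if parent is not None else None
        cropped.append(r)
    regions = [[] for _ in range(m)]
    for l in reversed(range(m)):  # visit in fine-to-coarse order
        if cropped[l] is not None:
            finer = cropped[l + 1] if l + 1 < m else None
            regions[l] = ring(cropped[l], finer)
    return regions
```

With two levels, the finest one comes back as a single full rectangle and the coarser one as a frame of four strips around it.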

Transition Regions for Visual Continuity

In order to render regions at different levels of detail without visible seams, the geometry near the outer boundary of each render region is morphed so that it transitions to the geometry of the next coarser level.

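Morphed elevation (z’):

z’ = (1 - alpha) z + alpha z_c

where blend parameter alpha is computed as alpha = max(alpha_x, alpha_y), with

alpha_x = min(max((|x - v’_x| - ((x_max - x_min)/2 - w - 1)) / w, 0), 1)

and similarly for alpha_y.

v’_x denotes continuous coordinates of the viewpoint in the grid of clip region.

x_min and x_max are the integer extents of the active_region(l).

w is the width of the transition region in grid units.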
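Here is a small sketch of the morph, using the blend as I read it from the paper: z’ = (1 - alpha) z + alpha z_c, with alpha_x = min(max((|x - v’_x| - ((x_max - x_min)/2 - w - 1)) / w, 0), 1), where w is the transition width in grid units. The function names and the tuple layout are my own.

```python
def blend_axis(x, v, lo, hi, w):
    """Blend weight along one axis: 0 in the interior, 1 at the outer edge."""
    a = (abs(x - v) - ((hi - lo) / 2.0 - w - 1.0)) / w
    return min(max(a, 0.0), 1.0)


def morphed_height(x, y, z, z_c, viewer, region, w):
    """Blend the fine height z toward the coarser height z_c near the boundary.

    viewer is (v_x, v_y) in the level's grid coordinates and region is
    (x_min, y_min, x_max, y_max), the extents of the active region.
    """
    vx, vy = viewer
    x_min, y_min, x_max, y_max = region
    alpha = max(blend_axis(x, vx, x_min, x_max, w),
                blend_axis(y, vy, y_min, y_max, w))
    return (1.0 - alpha) * z + alpha * z_c
```

For a 100-unit region with w = 10 and the viewer at the center, a vertex at the center keeps its own height, a vertex on the boundary takes the coarser height, and a vertex 44 units out sits halfway between the two.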

Texture Mapping

Each clipmap level stores texture images for use in rasterization (i.e. a normal map). Mipmapping is disabled but LOD is performed on the texture using the same spatial transition regions applied to the geometry. So texture LOD is based on viewer distance rather than on screen-space derivatives (which is how hardware mipmapping works).

View-Frustum Culling

For each level of the clipmap, maintain z_min, z_max bounds for the local terrain. Each render region is partitioned into four rectangular regions (see Figure 6). Each rectangular region is extruded by z_min and z_max (the terrain bounds) to form an axis-aligned bounding box. The bounding boxes are intersected with the four-sided viewing frustum and the resulting convex set is projected onto the XY plane. The axis-aligned rectangle bounding this set is used to crop the given rectangular region.
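The full clip-and-project step is longer than I want to write out here, so the sketch below swaps in a simpler, more conservative test: it extrudes a rectangle into a box and only checks whether the whole box lies outside one of the frustum planes. It rejects invisible rectangles but does not shrink partially visible ones the way the paper's method does. The plane representation and function names are assumptions of mine.

```python
def extrude(rect, z_min, z_max):
    """Turn (xmin, ymin, xmax, ymax) into an axis-aligned box (lo, hi)."""
    xmin, ymin, xmax, ymax = rect
    return (xmin, ymin, z_min), (xmax, ymax, z_max)


def box_outside_frustum(box, planes):
    """True if the box lies entirely outside at least one frustum plane.

    planes is a list of (nx, ny, nz, d); a point is on the inside of a
    plane when nx*x + ny*y + nz*z + d >= 0.
    """
    (x0, y0, z0), (x1, y1, z1) = box
    for nx, ny, nz, d in planes:
        # Pick the box corner farthest along the plane normal.
        px = x1 if nx >= 0 else x0
        py = y1 if ny >= 0 else y0
        pz = z1 if nz >= 0 else z0
        if nx * px + ny * py + nz * pz + d < 0:
            return True  # even the best corner is outside
    return False
```

A four-sided frustum looking down +x with a 90 degree field of view can be written as the planes x - y >= 0, x + y >= 0, x - z >= 0 and x + z >= 0.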

Terrain Compression

Create a terrain pyramid by downsampling from the fine terrain to the coarse terrain using a linear filter. Then reconstruct the levels in coarse-to-fine order: each level is predicted from the next coarser level by interpolatory subdivision, and a stored residual is added to the prediction.
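Here is a 1D sketch of the predict-plus-residual idea. It assumes a [1/4, 1/2, 1/4] downsampling filter and the common four-point subdivision weights (-1/16, 9/16, 9/16, -1/16), both with clamped borders, and it leaves the residuals unquantized, so it shows the structure of the scheme rather than an actual codec. All names are mine.

```python
def downsample(fine):
    """Coarse sample i is a [1/4, 1/2, 1/4] average around fine sample 2i."""
    last = len(fine) - 1
    at = lambda j: fine[min(max(j, 0), last)]  # clamp at the borders
    return [0.25 * at(2 * i - 1) + 0.5 * at(2 * i) + 0.25 * at(2 * i + 1)
            for i in range(len(fine) // 2 + 1)]


def predict(coarse):
    """Upsample with four-point interpolatory subdivision."""
    last = len(coarse) - 1
    at = lambda j: coarse[min(max(j, 0), last)]  # clamp at the borders
    fine = []
    for i in range(last):
        fine.append(coarse[i])  # even samples are copied unchanged
        fine.append((-at(i - 1) + 9 * at(i) + 9 * at(i + 1) - at(i + 2)) / 16.0)
    fine.append(coarse[last])
    return fine


def compress(fine, levels):
    """Return (coarsest, residuals), residuals ordered coarse to fine.

    len(fine) must be k * 2**levels + 1 for some integer k >= 1.
    """
    residuals = []
    current = fine
    for _ in range(levels):
        coarse = downsample(current)
        pred = predict(coarse)
        residuals.append([f - p for f, p in zip(current, pred)])
        current = coarse
    residuals.reverse()
    return current, residuals


def reconstruct(coarsest, residuals):
    """Rebuild the finest level in coarse-to-fine order."""
    current = coarsest
    for res in residuals:
        current = [p + r for p, r in zip(predict(current), res)]
    return current
```

The residuals are what would actually be quantized and stored; on smooth terrain the prediction is good, so they stay small.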

Terrain Synthesis

Fractal noise displacement is done by adding uncorrelated Gaussian noise to the upsampled coarser terrain. The Gaussian noise values are precomputed and stored in a look up table for efficiency.
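A 1D sketch of the synthesis step follows. To keep it short it upsamples with plain linear interpolation instead of interpolatory subdivision, indexes the noise table by sample position modulo the table size, and uses a single `amplitude` parameter; all three choices are simplifying assumptions of mine.

```python
import random


def make_noise_table(size, seed=0):
    """Precompute Gaussian noise values once so synthesis is a table lookup."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def synthesize_finer(coarse, table, amplitude):
    """Upsample a 1D height row and displace it with table noise."""
    fine = []
    for i in range(len(coarse) - 1):
        fine.append(coarse[i])
        fine.append(0.5 * (coarse[i] + coarse[i + 1]))  # linear midpoint
    fine.append(coarse[-1])
    # Indexing by position makes the result repeatable when a region is
    # revisited, since the same vertex always gets the same noise value.
    return [z + amplitude * table[x % len(table)] for x, z in enumerate(fine)]
```

In practice the amplitude would shrink at each finer level so that the added detail looks fractal rather than like uniform static.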

Here are some good references:

A good follow up paper:

Awesome website on terrain rendering: