Fast kd-tree Construction with an Adaptive Error-Bounded Heuristic

Warren Hunt, William R. Mark, Gordon Stoll

2006 IEEE Symposium on Interactive Ray Tracing, September 2006.


Construction of effective acceleration structures for ray tracing is a well-studied problem. The highest quality acceleration structures are generally agreed to be those built using greedy cost optimization based on a surface area heuristic (SAH). This technique is most often applied to the construction of kd-trees, as in this work, but is equally applicable to the construction of other hierarchical acceleration structures. Unfortunately, SAH-optimized data structure construction has previously been too slow to allow per-frame rebuilding for interactive ray tracing of dynamic scenes, leading to the use of lower-quality acceleration structures for this application. The goal of this paper is to demonstrate that high-quality SAH-based acceleration structures can be constructed quickly enough to make them a viable option for interactive ray tracing of dynamic scenes.

We present a scanning-based algorithm for choosing kd-tree split planes that are close to optimal with respect to the SAH criterion. Our approach approximates the SAH cost function across the spatial domain with a piecewise quadratic function with bounded error and picks minima from this approximation. This algorithm takes full advantage of SIMD operations (e.g., SSE) and has favorable memory access patterns. In practice this algorithm is faster than sorting-based SAH build algorithms with the same asymptotic time complexity, and is competitive with non-SAH build algorithms that produce lower-quality trees. The resulting trees are almost as good as those produced by a sorting-based SAH builder, as measured by ray tracing time. For a test scene with 180k polygons, our system builds a high-quality kd-tree in 0.26 seconds that degrades ray tracing time by only 3.6% compared to a full-quality tree.
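
The idea of picking a split from a piecewise quadratic approximation of the SAH cost can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code from the paper: the function names (`surface_area`, `sample_counts`, `best_split`), the uniform sample spacing, and the simplified cost `SA_left * N_left + SA_right * N_right` are all hypothetical choices made here. It assumes the primitive counts are interpolated linearly between sample planes; because the surface area of each child is also linear in the split position, their product is quadratic on every segment. The adaptive resampling, the error bound, and the SIMD implementation described above are omitted.

```python
from math import inf
from typing import List, Optional, Tuple

Extent = Tuple[float, float]  # (lo, hi) of one primitive along the split axis


def surface_area(dx: float, dy: float, dz: float) -> float:
    """Surface area of an axis-aligned box with the given edge lengths."""
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sample_counts(boxes: List[Extent], planes: List[float]) -> Tuple[List[int], List[int]]:
    """Scan the primitives once per plane and count how many overlap each side."""
    left = [sum(1 for lo, _ in boxes if lo < p) for p in planes]
    right = [sum(1 for _, hi in boxes if hi > p) for p in planes]
    return left, right


def best_split(boxes: List[Extent], x0: float, x1: float, dy: float, dz: float,
               num_samples: int = 8) -> Tuple[Optional[float], float]:
    """Return (position, cost) minimizing a piecewise quadratic SAH approximation.

    The node spans [x0, x1] on the split axis and has extents dy, dz on the
    other two axes. Constant traversal and normalization terms are left out.
    """
    planes = [x0 + (x1 - x0) * i / (num_samples - 1) for i in range(num_samples)]
    n_l, n_r = sample_counts(boxes, planes)
    sa_l = [surface_area(p - x0, dy, dz) for p in planes]
    sa_r = [surface_area(x1 - p, dy, dz) for p in planes]

    best_x, best_cost = None, inf
    for i in range(num_samples - 1):
        # Per-segment deltas of the linearly interpolated quantities.
        d_nl, d_nr = n_l[i + 1] - n_l[i], n_r[i + 1] - n_r[i]
        d_sl, d_sr = sa_l[i + 1] - sa_l[i], sa_r[i + 1] - sa_r[i]
        # cost(t) = c0 + c1*t + c2*t^2 for t in [0, 1] on this segment.
        c0 = sa_l[i] * n_l[i] + sa_r[i] * n_r[i]
        c1 = sa_l[i] * d_nl + d_sl * n_l[i] + sa_r[i] * d_nr + d_sr * n_r[i]
        c2 = d_sl * d_nl + d_sr * d_nr
        candidates = [0.0, 1.0]
        if c2 > 0.0:
            t_vertex = -c1 / (2.0 * c2)  # stationary point of the parabola
            if 0.0 < t_vertex < 1.0:
                candidates.append(t_vertex)
        for t in candidates:
            cost = c0 + c1 * t + c2 * t * t
            if cost < best_cost:
                best_cost = cost
                best_x = planes[i] + (planes[i + 1] - planes[i]) * t
    return best_x, best_cost
```

For example, calling `best_split` on two clusters of primitives separated by a gap should place the split inside the gap. The per-plane counting loop here costs one pass over the primitives per sample; it stands in for the scanning step only conceptually and makes no attempt to reproduce the memory access pattern or SSE usage of the original implementation.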

Paper -- final version (PDF, 1.78 MB)

Code and paper (.tar.gz, 1.4 MB)
Note on code: Performance is very sensitive to choice of compiler and to compiler optimization settings. We use 'icc'. In particular, make sure that expand() (from ssebasic.h) compiles to a single SSE shuffle instruction, not to a run-time switch statement.

BibTeX Citation

 author = {Warren Hunt and William R. Mark and Gordon Stoll},
 title = {Fast kd-tree Construction with an Adaptive Error-Bounded Heuristic},
 booktitle = {2006 IEEE Symposium on Interactive Ray Tracing},
 month = {Sept.},
 year = {2006},
 publisher = {IEEE}