Kd Tree¶
Problem¶
Given a set of k-dimensional points, build a data structure that supports efficient nearest-neighbor queries.
Approach¶
Recursively partition points by cycling through dimensions, splitting on the median. For nearest-neighbor queries, traverse the tree pruning branches whose bounding hyperplane is farther than the current best distance.
When to Use¶
Nearest neighbor in multi-dimensional space — "closest point", "k-nearest neighbors", range search in 2D/3D. Aviation: closest waypoint/airport lookup, collision avoidance in 3D airspace. See also: geohash_grid for grid-based spatial indexing.
Complexity¶
| Time | `` |
| Space | O(n) |
Implementation¶
kd_tree ¶
K-dimensional tree for nearest neighbor search.
Problem
Given a set of k-dimensional points, build a data structure that supports efficient nearest-neighbor queries.
Approach
Recursively partition points by cycling through dimensions, splitting on the median. For nearest-neighbor queries, traverse the tree pruning branches whose bounding hyperplane is farther than the current best distance.
When to use
Nearest neighbor in multi-dimensional space — "closest point", "k-nearest neighbors", range search in 2D/3D. Aviation: closest waypoint/airport lookup, collision avoidance in 3D airspace. See also: geohash_grid for grid-based spatial indexing.
Complexity
Build: O(n log n) Query: O(log n) average, O(n) worst case Space: O(n)
KDNode ¶
A node in a KD-tree.
Source code in src/algo/graphs/kd_tree.py
KDTree ¶
K-dimensional tree for nearest neighbor search.
tree = KDTree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]) tree.nearest((5, 5)) (5, 4)
Source code in src/algo/graphs/kd_tree.py
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 | |
"""Tests for KD-tree nearest neighbor search."""
import math
from algo.graphs.kd_tree import KDTree
class TestKDTree:
def test_nearest_basic(self) -> None:
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = KDTree(points)
assert tree.nearest((5, 5)) == (5, 4)
def test_exact_match(self) -> None:
points = [(1, 2), (3, 4), (5, 6)]
tree = KDTree(points)
assert tree.nearest((3, 4)) == (3, 4)
def test_single_point(self) -> None:
tree = KDTree([(10, 20)])
assert tree.nearest((0, 0)) == (10, 20)
def test_empty_tree(self) -> None:
tree = KDTree([])
assert tree.nearest((1, 2)) is None
def test_three_dimensions(self) -> None:
points = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
tree = KDTree(points)
assert tree.nearest((0.9, 0.1, 0.1)) == (1, 0, 0)
def test_range_search(self) -> None:
points = [(0, 0), (1, 0), (2, 0), (10, 10)]
tree = KDTree(points)
results = tree.range_search((0, 0), 1.5)
result_set = set(results)
assert (0, 0) in result_set
assert (1, 0) in result_set
assert (10, 10) not in result_set
def test_range_search_empty(self) -> None:
points = [(10, 10), (20, 20)]
tree = KDTree(points)
assert tree.range_search((0, 0), 1.0) == []
def test_nearest_brute_force_agreement(self) -> None:
"""Nearest result matches brute-force on a larger set."""
points = [(float(i), float(i % 7)) for i in range(50)]
tree = KDTree(points)
target = (12.3, 4.5)
result = tree.nearest(target)
assert result is not None
brute = min(points, key=lambda p: math.dist(p, target))
assert result == brute
Implement it yourself
Run: just challenge graphs kd_tree
Then implement the functions to make all tests pass.
Use just study graphs for watch mode.
Reveal Solution
kd_tree ¶
K-dimensional tree for nearest neighbor search.
Problem
Given a set of k-dimensional points, build a data structure that supports efficient nearest-neighbor queries.
Approach
Recursively partition points by cycling through dimensions, splitting on the median. For nearest-neighbor queries, traverse the tree pruning branches whose bounding hyperplane is farther than the current best distance.
When to use
Nearest neighbor in multi-dimensional space — "closest point", "k-nearest neighbors", range search in 2D/3D. Aviation: closest waypoint/airport lookup, collision avoidance in 3D airspace. See also: geohash_grid for grid-based spatial indexing.
Complexity
Build: O(n log n) Query: O(log n) average, O(n) worst case Space: O(n)
KDNode ¶
A node in a KD-tree.
Source code in src/algo/graphs/kd_tree.py
KDTree ¶
K-dimensional tree for nearest neighbor search.
tree = KDTree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]) tree.nearest((5, 5)) (5, 4)
Source code in src/algo/graphs/kd_tree.py
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 | |