k-d Tree Data Structure

A k-d Tree Data Structure is a tree-based data structure for organizing points in a k-dimensional space.

AKA: k-dimensional Tree, Multidimensional Binary Search Tree.
Context:
- It can be defined as a binary tree where every node is a k-dimensional point.
- It can (typically) be a Space Partitioning Data Structure
Example(s):
- scipy.spatial.KDTree
- an Adaptive k-d Tree.
- ...
- …
Counter-Example(s):
- Octree,
- Quadtree,
- Regression Tree.
See: Tree Data Structure, Euclidean Space, Nearest Neighbor Search, Binary Space Partitioning, KD-Tree Clustering Algorithm.

References

2018

(Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/K-d_tree#Informal_description Retrieved:2018-8-5.
- The k-d tree is a binary tree in which every node is a k-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting hyperplane that divides the space into two parts, known as half-spaces. Points to the left of this hyperplane are represented by the left subtree of that node and points to the right of the hyperplane are represented by the right subtree. The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the k-dimensions, with the hyperplane perpendicular to that dimension's axis. So, for example, if for a particular split the "x" axis is chosen, all points in the subtree with a smaller "x" value than the node will appear in the left subtree and all points with larger "x" value will be in the right subtree. In such a case, the hyperplane would be set by the x-value of the point, and its normal would be the unit x-axis.

2014

(Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/k-d_tree Retrieved:2014-3-20.
- In computer science, a k-d tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). k-d trees are a special case of binary space partitioning trees.

1975

(Bentley, 1975) ⇒ Jon Louis Bentley (1975). "Multidimensional binary search trees used for associative searching". Communications of the ACM, 18(9), 509-517. DOI:10.1145/361002.361007.
- QUOTE: If a file is represented as a k-d tree, then each record in the file is stored as a node in the tree. In addition to the k keys which comprise the record, each node contains two pointers, which are either null or point to another node in the k-d tree. (Note that each pointer can be considered as specifying a subtree.) Associated with each node, though not necessarily stored as a field, is a discriminator, which is an integer between 0 and [math]\displaystyle{ k -1 }[/math], inclusive. Let us define a notation for these items: the k keys of node P will be called [math]\displaystyle{ K_o(P),\;\cdots, K_{k-I}(P) }[/math], the pointers will be LOSON(P) and HISON(P), and the discriminator will be DISC(P). The defining order imposed by a k-d tree is this: For any node P in a k-d tree, let j be DISC(P), then for any node Q in LOSON(P), it is true that [math]\displaystyle{ K_j(Q) \lt K_j(P) }[/math]; likewise, for any node R in HISON(P), it is true that [math]\displaystyle{ K_j(R) \gt K_j(P) }[/math]. (This statement does not take into account the possibility of equality of keys, which will be discussed shortly.)
  All nodes on any given level of the tree have the same discriminator. The root node has discriminator 0, its two sons have discriminator 1, and so on to the k-th level on which the discriminator is k - 1 ; the (k + 1)-th level has discriminator 0, and the cycle repeats. In general, NEXTDISC is a function defined as
  [math]\displaystyle{ NEXTDISC(i) = (i + 1) \mod k }[/math],
  and NEXTDISC(DISC(P)) = DISC(LOSON(P)), and likewise for HISON(P) (if they are non-null). Figure 1 gives an example of records in 2-space stored as nodes in a 2-d tree.


Fig.1: Records in 2-space stored as nodes in a 2-d tree (boxes represent range of subtree).	Planar graph representation of the same 2-d tree (LOSON's are expressed by left branches, HISON's by right branches, and null sons by boxes.

(...) First, k-d trees are intended primarily for use in a 1-level store, but using secondary storage (such as disk or drum memory) for overflow might be acceptable if there were few additions and deletions from the file. This factor is becoming less of a restriction as memories rapidly decrease in cost and increase in size. Second, it is necessary for k-d trees to have some minimum number of nodes before they become useful. For example, if the deepest node in a 10-d tree is on the 9th level, one of the keys will never have been used as a discriminator as the tree was being built. A good rule of thumb might be to use k-d trees only if [math]\displaystyle{ n \gt 2^{2k} }[/math].

k-d Tree Data Structure

References

2018

2014

1975

Navigation menu

Search