An Improved Succinct Representation for Dynamic k-ary Trees

Diego Arroyuelo

$k$-ary trees are a fundamental data structure in many text-processing algorithms (e.g., text searching). The traditional pointer-based representation of trees is space consuming, and hence only relatively small trees can be kept in main memory. Nowadays, however, many applications need to store a huge amount of information. In this paper we present a \emph{succinct} representation for dynamic $k$-ary trees of $n$ nodes, requiring space close to the \emph{information-theoretic lower bound}. Our data structure requires $2n + n\log{k} + o(n\log{k})$ bits of space, and unlike alternative representations where the operations on the tree can be usually computed in $O(\log{n})$ time, our data structure is able to take advantage of asymptotically smaller values of $k$, since it solves the basic operations $\mathsf{parent}$ and $\mathsf{child}$ in $O(\log{k}+\log\log{n})$ time, which is $o(\log{n})$ time whenever $\log{k} = o(\log{n})$ holds. Insertions and deletions of leaves in the tree are supported in $O((\log{k}+\log\log{n})(1+\frac{\log{k}}{\log{(\log{k} + \log\log{n})}}))$ amortized time. Our representation also allows us to compute more specialized operations (like $\mathsf{subtreesize}$, $\mathsf{degree}$, $\mathsf{depth}$, etc.) in $O(\log{k}+\log\log{n})$ time, and provides a new trade-off when $k=O(1)$ (e.g., binary trees) allowing basic operations in $O(\log\log{n})$ time (versus $O(1)$ time of [Raman et al., ICALP'03]) and updates in $O(\log\log{n})$ amortized time (versus $O((\log\log{n})^{1+\epsilon})$ amortized time of Raman et al., for $\epsilon > 0$).