Tree Path Majority Data Structures

Travis Gagie, Meng He, and Gonzalo Navarro

We present the first solution to t-majorities on tree paths. Given a tree of n nodes, each with a label from [1..s], and a fixed threshold 0 <t<1, such a query gives two nodes u and v and asks for all the labels that appear more than t * |P_uv| times in the path P_uv from u to v, where |P_uv| denotes the number of nodes in P_uv. Note that the answer to any query is of size up to 1/t. On a w-bit RAM, we obtain a linear-space data structure with O((1/t) log* n log log_w s) query time. For any k > 1, we can also build a structure that uses O(n log[r] n) space, where log[r] n denotes the function that applies logarithm r times to n, and answers queries in time O((1/t) log log_w s). The construction time of both structures is O(n log n). We also describe two succinct-space solutions with the same query time of the linear-space structure. One uses 2nH + 4n + o(n)(H+1) bits, where H <= lg s is the entropy of the label distribution, and can be built in O(n log n) time. The other uses nH + O(n) + o(nH) bits and is built in O(n log n) randomized time.