Still, I have huge respect for Phil Bagwell and I make heavy use of his hash array mapped trie. But this issue with RRB trees makes it impossible for me to use them, especially for an on-disk data structure whose long lifespan makes it way more likely that the problem will eventually happen.
If you have a relaxed root, you know the size of every subtree. If the the subtree is dense you can easily calculate the shift, meaning you get dense radix tree subnodes with fast lookup.
This gets you the benefits of a relaxed b tree, but no issues with accidentally having too deep shifts since they are relative to where the dense subtree starts.
This could be done to a rrb tree as well, which would mitigate the issues you are seeing, but would still lead to an arbitrarily deep tree if the concatenation isn't done properly.
The only tradeoff would be slower indexing on relaxed trees because more nodes would end up unbalanced, but at the same time concatenations, insertions and removals would probably be faster.
I will try to implement this.
Edit: no, seriously. This is a splendid idea. The b tree concat is faster than the rrb tree "full" merge. And simpler. And storing the tree subtree height in a byte somewhere means we can go from relaxed to dense and then do a proper radix tree search. (I have one of those rrb tree implementations that probably fails, btw. It has a proper concat, but the improper one is so much faster. I know. I am ashamed.)
I have already started. Not only is the concat faster (which means the derived operations in the rrb tree will be faster), but the derived operations can be done directly since the relaxation is going to be between 16 and 32 elements. A remove will just be a copy 16 out of 17 times.
This will mean that indexing might be slower in the b tree (it will be higher since the fill ratio is lower) part of the tree, and there might be slightly more of the relaxed nodes, but sweet jeebus this is a simplification.
A simplification that makes correctness easier and speeds up all other operations.setting the minimal amount of elements to 24 will still make concat faster, and will make memory waste almost the same (I can use c# inline arrays without feeling too bad).
I have both a fast b+tree and am Rrb tree. Why didn't I think of this before?!
https://rikspucko.koketteriet.se/bjoli/PersistentMap/src/bra...