Fosstodon @fosstodon

Recent searches

Search options

Only available when logged in.

Alaric Snell-Pym @kitten_tech@fosstodon.org

Ok, a question for #database #dbms nerds. Many years ago I worked on a join order optimizer that used a heuristic formula to estimate the number of rows produced by an equijoin, based on the number of rows in the two tables being joined and the number of distinct values in the join keys. This formula came from an academic paper, but I can't remember what the formula was or find the paper. Does this ring a bell for anyone? Boosts appreciated...

Feb 18, 2025, 05:59 PM·

18boosts·5favorites

**Alaric Snell-Pym** @kitten_tech · Feb 18

Feb 18

Alaric Snell-Pym @kitten_tech

I think the formula involved (some large number) to the power of 1-1/(another large number). Whoever originally implemented it was not familiar with floating point quirks, and computed 1-1÷(large number) directly, producing a value that was invariably 1 so our join plans were always terrible. I rewrote it using C's log1p function from math.h, and fixed another even more terrible bug, and suddenly we produced sensible join plans!

**Alaric Snell-Pym** @kitten_tech · Feb 18

Feb 18

Alaric Snell-Pym @kitten_tech

...the other even more terrible bug was that they seemingly didn't know math.h had a pow() function so tried to do an exponentiation through repeated multiplication. But that took too long so they truncated it after 1000 iterations of the loop.

**Ashok Argent–Katwala** @ashok@mstdn.ca · Feb 18

Feb 18

Ashok Argent–Katwala @ashok@mstdn.ca

@kitten_tech Only skimmed the PDF, but possibly “A statistical model for estimating the number of records in a relational database” by
To-Yat Cheung? (1981/82)

Drag & drop to upload

Recent searches

Search options

Administered by:

Server stats:

Recent searches

Search options

Administered by:

Server stats:

Back