Another fun #debugging adventure, this time with a happy ending.
A unit test in libXmu failed on x86-32 (https://gitlab.freedesktop.org/xorg/lib/libxmu/-/issues/2). I looked for the typical things first like bad casts but didn't see anything wrong.
I noticed that the unit test program runs two subtests, and the log indicated that it completed the first one successfully before crashing. But when I ran it under gdb, the backtrace showed it inside the first subtest when the segfault occurred. Very strange.
Single-stepping in #gdb, I ultimately came across a longjmp() call (apparently libXt handles errors this way?), and it was this call that triggered the failure.
Turns out each of the subtests checks some exceptional case and expects a function call to fail by longjmp()'ing, but only the first subtest actually prepared for the jump with a call to setjmp().
As a result, when the second subtest triggered its own longjmp(), it jumped back into the first subtest's function, which had already returned! That explains why the backtrace pointed at the first subtest.
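For anyone who wants to see the shape of the fix, here is a minimal sketch of the pattern each subtest needs: arm the jump target with setjmp() in its own frame before making the call that is expected to fail. The names `failing_call()` and `subtest()` are hypothetical stand-ins, not the real libXmu test code or the libXt error API.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>

static jmp_buf jmpenv;

/* Hypothetical stand-in for a library call that reports failure
 * by jumping instead of returning. */
static void failing_call(void)
{
    longjmp(jmpenv, 1);   /* never returns to its caller */
}

/* Returns 1 if the expected failure was caught, 0 otherwise. */
static int subtest(const char *name)
{
    assert(name != NULL);

    /* Arm the jump target in *this* frame, every time.  Reusing a
     * jmp_buf filled in by a function that has already returned is
     * undefined behavior. */
    if (setjmp(jmpenv) == 0) {
        failing_call();
        printf("%s: call unexpectedly returned\n", name);
        return 0;
    }

    printf("%s: caught expected failure\n", name);
    return 1;
}
```

Calling `subtest("first")` and then `subtest("second")` lands each jump in the frame that is currently live, because each call refreshes `jmpenv` before anything can longjmp() to it.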
@mattst88 I wish I knew an official way to clear the jmpenv before exiting the function that makes it invalid, so that this would have been a crash at the longjmp() instead of a corrupted stack, but I've not found one documented yet.
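Lacking a documented way to do that, one workaround is to track validity by hand. This is only a sketch under my own assumptions: the `jmpenv_valid` flag and both function names are hypothetical, nothing here comes from <setjmp.h> beyond setjmp() and longjmp(), and it only helps for code that agrees to check the flag.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf jmpenv;
static bool jmpenv_valid = false;   /* hypothetical flag, maintained by hand */

/* Hypothetical error path: refuse to jump into a frame that is gone. */
static void fail_with_jump(void)
{
    /* Aborts right here if nobody armed jmpenv.  Note that assert()
     * compiles away when NDEBUG is defined. */
    assert(jmpenv_valid && "longjmp() target is no longer live");
    longjmp(jmpenv, 1);
}

/* Returns 1 if the expected failure was caught, 0 otherwise. */
static int guarded_subtest(void)
{
    int caught = 0;

    jmpenv_valid = true;            /* armed while this frame is live */
    if (setjmp(jmpenv) == 0)
        fail_with_jump();
    else
        caught = 1;

    jmpenv_valid = false;           /* disarm before the frame goes away */
    return caught;
}
```

With this in place, a stray `fail_with_jump()` after `guarded_subtest()` has returned stops at the assertion instead of jumping into a dead stack frame.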
@mattst88 @keithp For those following this, we took the discussion to the libc-coord mailing list: https://www.openwall.com/lists/libc-coord/2024/04/