fosstodon.org is one of the many independent Mastodon servers you can use to participate in the fediverse.
Fosstodon is an invite only Mastodon instance that is open to those who are interested in technology; particularly free & open source software. If you wish to join, contact us for an invite.

Administered by:

Server stats:

10K
active users

Alex Bradbury

I wrote a little blog post on storing data in pointers - initially motivated by documenting the impact of >48-bit virtual addresses on such tricks and the various available hardware pointer masking schemes muxup.com/2023q4/storing-data- Any corrections or additional notes, do let me know!

MuxupStoring data in pointersSome notes on storing data in pointers and the impact of >48-bit virtual addresses

@asb Regarding "top byte ignore" and such things, on x86 you can also use SIB addressing to do something similar portably, e.g. [8*rax] to load from an 8-byte aligned address which effectively ignores the top 3 bits of the value in rax which can therefore be used for pointer tagging.

@asb Probably mostly a curiosity but if we also assume that a high address bit of 1 is reserved for kernel space addresses and can be used without clashing with tagged addresses then you can recruit all of the modes [rax], [2*rax], [4*rax] and [8*rax] with a variable length tagging scheme. If the bit pattern is 0b0... it encodes a 1-byte aligned address. Similarly, 0b10... for 2-byte aligned, 0b110... for 4-byte aligned and 0b1110... for 8-byte aligned.

@asb That kind of tagging where you used multiple alignments would presumably only be useful if you have some kind of type tagging scheme where those size alignments naturally correspond to different type tags. Again, probably mostly a curiosity.

@pervognsen I've tried to work this idea into the text - thanks!

@asb i would note that for low-bits tagging you often don’t need to mask off the bits: if you know what they are you just subtract them, which is free for immediate offsets. allows e.g. [rax+8] for untagged to turn into [rax+7] for tagged

@wingo thanks! I've added a note to mention this

@asb The jump from 24 to 32-bit virtual addresses produced quite a stir in the old Motorola-based Macs and early ARM-based Acorn systems. Reading through the top of this article filled me with the same old dread, and yet I still find myself uttering the traditional words of hubris:

"128 terabytes per process should be enough for everyone."

@mcmartin @asb I was gonna point out the same thing! On the original 68K Mac the use of the upper 8 bits (of 32) as flags, because we “only need 24 bits for addressing memory”, is a famous mea culpa from that era.

macgui.com/news/article.php?t=

@asb The Linux kernel has a hashtable for its directory entry cache where the buckets are linked lists and the lower bit of each bucket's head pointer is a spin lock, see the comment at the top of git.kernel.org/pub/scm/linux/k

git.kernel.orglist_bl.h « linux « include - kernel/git/torvalds/linux.git - Linux kernel source tree

@vegard Thanks! I've added a note on that to the article

@asb Nice explanation and overview of examples! Thank you for sharing.

Since you asked for more examples: Øyvind Kolås has experimented with using the least significant bit trick for string interning of ultra-short strings in the pointer to the string itself. He describes his findings here: squoze.org/

squoze.orgsquoze - unicode encodings for pointers and hashes

@vanderZwan I probably can't mention all the cases using LSB for a space efficient tagged union, but but this is a nice one and I didn't do a great job of highlighting Objective-C's similar optimisation - so I've added this one in. Cheers!

@asb oh wow the XOR linked list WP article you linked has another link to a XOR binary tree, and now I'm worried this will turn into a rabbit hole

@asb Can you talk about how this interacts with pointer provenance and whether it interferes with compiler optimizations?

@esoterra a truly excellent question that I wish I could more confidently answer! I believe these kind of manipulations are supported in the proposed pointer provenance models (see page 23 here open-std.org/jtc1/sc22/wg14/ww). LLVM also gained an llvm.ptrmask intrinsic in the past few years which means pointers can be masked without a ptrtoint and inttoptr round-trip, making making it less likely to trip up analyses.

@asb I have found that these sort of tricks are usually avoidable. However, these tricks can be kind of necessary when working with lock-free algorithms. You need bit hacking to fit in ABA tags as many architecture just don't offer wide enough atomic operations.

You can also use indexes/side tables but it gets messy.

struct node {
_Atomic uint8_t next;
};

struct info;
struct info infos[];

struct info {
uint64_t aba;
struct node *node;
};

Then preallocate the nodes at startup.

@asb The big problem with side tables is memory management gets icky. It really helps if you already have a garbage collector or are able to preallocate things.

@asb also bit fiddling pointers is difficult to formally verify so I had to do a side table approach with preallocation to get it to sort of work in Ada Spark.

@asb @federicomena Glad to see Mike Ash's article on Obj-C's tagged pointers mentioned. I remember first reading that article a long time ago and feeling a mix of horror and awe at what I thought were clear boundaries between metadata and program data shatter in front of my face. I think that was the main thing that led me to understand the length programmers go through to extract every last bit of performance from computers.