After a long period of quiet, I have released an update to the `unicode-age` #Python package

pypi.org/project/unicode-age/

The package now supports #Unicode 16.0

When I wrote `unicode-age` I just sorta felt like writing it in Cython as a fun exercise, but upon reflection (and naturally, immediately after updating it), I'm wondering if it can be converted into a pure Python module.

The main waste of parsing ages into `list[int | None]` is that it ignores the span-oriented nature of DerivedAge.txt

A quick sketch suggests that the in-memory representation of the span information as `list[tuple[int, int, int, int]]` is ~300 KiB worth. That's ~10x the Cython approach (mostly because CPython's integers are >=24 bytes worth), but still pretty small.
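For illustration, here is a minimal sketch of what a span-oriented pure Python version might look like. It is not the package's actual code: `parse_spans` and `age` are hypothetical names, and the sample lines only imitate the `DerivedAge.txt` layout (`START..END ; MAJOR.MINOR # comment`) rather than quoting the file verbatim.

```python
from bisect import bisect_right

# Illustrative lines in the DerivedAge.txt layout (not a verbatim excerpt).
SAMPLE = """\
# comment line
0000..01F5    ; 1.1 # [502] ...
01F6..01F9    ; 3.0 # [4] ...
01FA..0217    ; 1.1 # [30] ...
20AC          ; 2.1 # EURO SIGN
"""


def parse_spans(text):
    """Return sorted (first, last, major, minor) tuples, one per data line."""
    spans = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comments
        if not data:
            continue
        cp_range, version = (part.strip() for part in data.split(";"))
        start, _, end = cp_range.partition("..")
        major, minor = (int(n) for n in version.split("."))
        first = int(start, 16)
        last = int(end, 16) if end else first  # single code point lines
        spans.append((first, last, major, minor))
    spans.sort()
    return spans


def age(spans, starts, codepoint):
    """Binary-search the spans; return (major, minor) or None if unassigned."""
    i = bisect_right(starts, codepoint) - 1
    if i >= 0 and codepoint <= spans[i][1]:
        return spans[i][2], spans[i][3]
    return None


spans = parse_spans(SAMPLE)
starts = [span[0] for span in spans]  # precomputed keys for bisect
print(age(spans, starts, ord("A")))
```

Keeping a separate `starts` list lets `bisect_right` work on plain integers, so a lookup costs O(log n) over the spans instead of indexing a per-code-point list.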

We'll see.

I'll file an issue about it for the next update and forget about it until fall (or whenever Unicode 16.1 would land, if there is one).

I may actually write up the pure Python implementation (which will be my first serious use of `struct.Struct`!) and open a PR
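One way `struct.Struct` could fit in, sketched under assumptions: the record layout `<IIBB` (first code point, last code point, major, minor) and the names `SPAN`, `pack_spans`, and `lookup` are hypothetical, not taken from the package. Packing spans into a single `bytes` object avoids per-integer object overhead.

```python
import struct

# Hypothetical record layout: first code point, last code point, major, minor.
# "<" means little-endian with no padding, so each record is 4 + 4 + 1 + 1 bytes.
SPAN = struct.Struct("<IIBB")


def pack_spans(spans):
    """Pack sorted (first, last, major, minor) tuples into one bytes blob."""
    return b"".join(SPAN.pack(*span) for span in spans)


def lookup(blob, codepoint):
    """Binary-search the packed records; return (major, minor) or None."""
    lo, hi = 0, len(blob) // SPAN.size
    while lo < hi:
        mid = (lo + hi) // 2
        first, last, major, minor = SPAN.unpack_from(blob, mid * SPAN.size)
        if codepoint < first:
            hi = mid
        elif codepoint > last:
            lo = mid + 1
        else:
            return major, minor
    return None


# Made-up spans for demonstration only.
demo = [(0x0000, 0x01F5, 1, 1), (0x01F6, 0x01F9, 3, 0), (0x20AC, 0x20AC, 2, 1)]
blob = pack_spans(demo)
print(SPAN.size, len(blob), lookup(blob, 0x20AC))
```

The trade-off is that every probe pays for an `unpack_from` call, so this favors memory footprint over raw lookup speed.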

@graingert maybe, but then I'd still have Cython in my life which isn't really worth it for this project

@graingert might be slightly less of a bother but I'm trying to move *away* from an extension module here, not tweak how I get it