#datacompression

🚀 C-Blosc2 2.19.0 is out!
We’ve added b2nd_expand_dims(), making it easy to add new dimensions to your b2nd arrays—perfect for evolving your data structures on the fly.
Big thanks to @lshaw8317 for the contribution! 🙏

Check out the release notes: github.com/Blosc/c-blosc2/blob

-Blosc2

GitHub: c-blosc2/RELEASE_NOTES.md at main · Blosc/c-blosc2. A fast, compressed, persistent binary data store library for C.
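
As a rough illustration of what "adding a new dimension" means for an array's shape, here is a minimal pure-Python sketch. It does not call C-Blosc2 and does not reproduce the signature of b2nd_expand_dims(); the helper name `expand_shape` is hypothetical, and the sketch assumes the usual convention that a new axis has length 1, so the number of elements stays the same.

```python
def expand_shape(shape, axis):
    """Return `shape` with a new length-1 axis inserted at position `axis`.

    Hypothetical helper for illustration only; not part of C-Blosc2.
    Negative `axis` counts from the end of the *expanded* shape.
    """
    ndim = len(shape)
    if not -(ndim + 1) <= axis <= ndim:
        raise ValueError(f"axis {axis} out of range for a {ndim}-d shape")
    if axis < 0:
        axis += ndim + 1
    return tuple(shape[:axis]) + (1,) + tuple(shape[axis:])


print(expand_shape((3, 4), 0))   # new leading axis
print(expand_shape((3, 4), -1))  # new trailing axis
```

Because the inserted axis has length 1, only the shape metadata changes in this model; no data has to be moved.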

🚀 C-Blosc2 2.18.0 is out now!

✨ What's new:

* Introducing b2nd_concatenate() - now you can easily join b2nd arrays together!

* Fixed mmap files to flush modified pages only in write mode (thanks Jan Sellner!)

Get the full details: github.com/Blosc/c-blosc2/blob
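
For readers unfamiliar with array concatenation, the shape rule can be sketched in a few lines of Python. This is only the general rule as it is commonly defined for n-dimensional arrays, not the b2nd_concatenate() signature or implementation; `concat_shape` is a hypothetical helper name.

```python
def concat_shape(a, b, axis):
    """Shape of joining two arrays with shapes `a` and `b` along `axis`.

    Hypothetical helper for illustration only; not part of C-Blosc2.
    All dimensions except `axis` must match.
    """
    if len(a) != len(b):
        raise ValueError("arrays must have the same number of dimensions")
    if not 0 <= axis < len(a):
        raise ValueError(f"axis {axis} out of range")
    for i, (x, y) in enumerate(zip(a, b)):
        if i != axis and x != y:
            raise ValueError(f"dimension {i} differs: {x} != {y}")
    return tuple(a[:axis]) + (a[axis] + b[axis],) + tuple(a[axis + 1:])


print(concat_shape((10, 20), (5, 20), 0))
print(concat_shape((10, 20), (10, 7), 1))
```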

SIMD blog series: @folkertdev shows examples of using SIMD in the zlib-rs project.

Part 2 explains what to do when the compiler is not capable of using the SIMD capabilities of modern CPUs effectively. We end up with a basic, but very effective, example of a custom SIMD implementation beating the compiler.

tweedegolf.nl/en/blog/155/simd

@trifectatech

tweedegolf.nl: SIMD in zlib-rs (part 2): compare256 - Blog - Tweede golf. In part 1 of the "SIMD in zlib-rs" series, we've seen that, with a bit of nudging, autovectorization can produce optimal code for some problems. But that does not always work: with SIMD clever pr ...
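
To give a feel for the problem the blog post tackles, here is a small Python sketch of what a compare256-style function computes: the number of leading bytes (up to 256) on which two buffers agree. The first version checks one byte at a time. The second processes 16 bytes per step and locates the first mismatch with an XOR and a lowest-set-bit lookup, which is only an analogue of the "compare a whole vector, then find the first differing lane" idea. It is not the zlib-rs code, it uses no real SIMD instructions, and the function names are made up for this example.

```python
def compare256_scalar(a: bytes, b: bytes) -> int:
    """Count matching leading bytes, one byte per step (at most 256)."""
    for i in range(256):
        if a[i] != b[i]:
            return i
    return 256


def compare256_chunked(a: bytes, b: bytes, width: int = 16) -> int:
    """Same result, but comparing `width` bytes per step.

    Each chunk is read as a little-endian integer, so byte 0 of the chunk
    occupies the lowest 8 bits. The lowest set bit of the XOR therefore
    falls inside the first byte that differs.
    """
    for start in range(0, 256, width):
        x = int.from_bytes(a[start:start + width], "little")
        y = int.from_bytes(b[start:start + width], "little")
        diff = x ^ y
        if diff:
            lowest_bit = (diff & -diff).bit_length() - 1
            return start + lowest_bit // 8
    return 256


left = bytes(range(256))
right = bytearray(left)
right[100] ^= 0x80  # introduce the first mismatch at index 100
print(compare256_scalar(left, bytes(right)), compare256_chunked(left, bytes(right)))
```

Both functions assume each input holds at least 256 bytes. In Python the chunked version is not meant as a speed-up; it only shows the shape of the computation.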

So @rl_dane introduced #bzip3 to me to use instead of #bzip2. Let's turn some bz2 files into bz3 to see the difference.

First example: 90k opus files

The Hey Snips wake word dataset has ~90k opus files, which make a 3.1GB tar file. bzip2 produces the same 3.1GB, which is expected since opus audio is already compressed. bzip3 produced 3.0GB but used a lot of CPU time. Not worth the 100MB.
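
The "already compressed data barely shrinks" effect is easy to reproduce with Python's standard bz2 module (bzip2 only; bzip3 is not in the standard library, so it is not shown). This sketch uses random bytes from os.urandom as a stand-in for high-entropy data such as opus audio, which is an assumption for illustration, and a repetitive XML-like string as the compressible counterpart.

```python
import bz2
import os


def bz2_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(bz2.compress(data, level)) / len(data)


# Stand-in for already-compressed input: random bytes have no redundancy left.
high_entropy = os.urandom(1_000_000)
# Stand-in for markup-heavy text: the same short line repeated many times.
repetitive = b"<item lang='fa'>example</item>\n" * 40_000

print(f"high-entropy ratio: {bz2_ratio(high_entropy):.3f}")
print(f"repetitive ratio:   {bz2_ratio(repetitive):.5f}")
```

The first ratio should come out at roughly 1, meaning no gain, while the second should be a tiny fraction.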

Second example: Windows 7 VirtualBox VM image

Windows7.vdi is a Windows 7 VM image for the "special" days. I think I have to get rid of it, but while it is still there, let's see how each performs. It is 16GB uncompressed. bzip2 -9 gives 7.0GB. bzip3 gives 6.3GB, but at the expense of roughly 3x the CPU time. Deleting all of them anyway. Down with Windows.

Third example: Pure XML text file

A pure XML file with Persian and English characters. Uncompressed it is 1.7GB. bzip2 -9 gives 276MB, while bzip3 gives 260MB.

Final example: Creating a simple bomb

So I did this:

dd if=/dev/zero of=./justzero bs=2G count=6

So now I have a 16GB file containing only zero bytes. bzip2 -9 gives 672KB. bzip3 gives 46KB.
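
A scaled-down version of the zero-file experiment can be run without touching the disk by feeding zero bytes into Python's incremental bz2.BZ2Compressor. This is a sketch under stated assumptions: it covers bzip2 only, it uses 32 MiB instead of the size above to keep the run short, and the helper name `compressed_size_of_zeros` is made up for this example.

```python
import bz2


def compressed_size_of_zeros(total_bytes: int, chunk_size: int = 1 << 20,
                             level: int = 9) -> int:
    """Stream `total_bytes` zero bytes through bzip2 and return the output size."""
    compressor = bz2.BZ2Compressor(level)
    chunk = bytes(chunk_size)  # a reusable buffer of zero bytes
    produced = 0
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk_size, remaining)
        produced += len(compressor.compress(chunk[:n]))
        remaining -= n
    produced += len(compressor.flush())  # emit whatever is still buffered
    return produced


size = 32 * 1024 * 1024
packed = compressed_size_of_zeros(size)
print(f"{size} zero bytes -> {packed} bytes after bzip2")
```

Streaming in chunks keeps memory use flat, so the same function also works for much larger sizes if you are willing to wait.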

Conclusion

Thank you @rl_dane

Real nice thing!

📣Python-Blosc2 3.3.3 is out! 🚀

This release brings bug fixes & optimizations, including improved string lazy expression chaining and a C-Blosc2 update fixing Windows mmap issues (thanks @JanSellner!).

More info: github.com/Blosc/python-blosc2

Get the latest: pip install blosc2 --upgrade

GitHub: Releases · Blosc/python-blosc2. A high-performance library for compressed ndarrays, with a flexible computational engine.