Cam

I've been wanting to try out Rust for a while now, and with some of the tinkering I've done at work, I finally had an excellent use case for it.
It's an opinionated implementation of document splitting, along with some post processors. For cleaning and splitting, I've clocked it at 40 to 75x faster than the Python implementation, and on my machine it can clean and split 25,000 documents in a second.
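
For a rough idea of what "clean and split" means here, this is a minimal pure-Python sketch of that kind of work. It is not rs_document's API: the `Document` class, `clean`, and `split` below are hypothetical stand-ins, and the whitespace cleaning and greedy word packing are assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a LangChain-style document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def clean(doc: Document) -> Document:
    """Collapse every run of whitespace into a single space and trim the ends."""
    return Document(" ".join(doc.page_content.split()), dict(doc.metadata))


def split(doc: Document, chunk_size: int) -> list[Document]:
    """Greedily pack whole words into chunks of up to chunk_size characters.

    A single word longer than chunk_size becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for word in doc.page_content.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            # The next word does not fit, so close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    # Each chunk carries its own copy of the parent's metadata.
    return [Document(text, dict(doc.metadata)) for text in chunks]
```

Calling `split(clean(doc), 19)` on a document whose text is `"  Rust   is fast.\n\nPython is  friendly. "` yields two chunks, `"Rust is fast."` and `"Python is friendly."`. Doing this per document in an interpreted loop is the sort of cost a compiled implementation can cut down.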

Check it out at github.com/cam-barts/rs_docume

GitHub - cam-barts/rs_document: An opinionated Rust implementation of various common functions of LangChain's Document model as well as Unstructured.io's post processors.