“We just launched a 16TB archive of every dataset that has been available on data.gov since November. This will be updated day by day as new datasets appear. It can be freely copied, and we're sharing the code behind it to help others make their own archives of data they depend on.” Harvard Library Innovation Lab (via Bluesky)
https://lil.law.harvard.edu/blog/2025/02/06/announcing-data-gov-archive/
https://bsky.app/profile/harvardlil.bsky.social/post/3lhjzh7f54226
@molly0xfff make it a torrent so we can all share!!
@notanonymous26 @molly0xfff As someone who thinks BitTorrent is underrated and should be used more... I don't think it's a good fit for a giant dataset updated daily.
@nicolas17 @notanonymous26 @molly0xfff The people who invented IPFS ~10 years ago were aiming at exactly that problem: https://en.wikipedia.org/wiki/InterPlanetary_File_System
@nicolas17 @notanonymous26 @molly0xfff I wish there was something like BitTorrent that could handle this, like Resilio Sync but open.