For anyone interested, what I was doing was creating a ring using #io_uring that I shared between, say, 20 threads. Then in each of those threads I opened a file, wrote to it, and closed the file. They were all doing this to the same file. About 1/3 of the time one of the write operations would just get lost and never complete. As far as I know I'm doing io_uring right, so unless there is a bug in the Rust library I'm using that I haven't found, Linux has a bug.
@zethra isn’t io_uring designed not to be shared between threads? Do you have synchronization around adding the SQEs?
@zethra very strange! A 1/3 failure rate seems pretty bad; I suppose most use-cases (that I've seen, at least) use a ring per thread, so that sort of concurrent use probably hasn't had as much attention paid to it...
@jamesnvc oh, something I didn't realize I was doing until after I posted that was that all of them were doing this to the same file. If they're operating on different files this doesn't happen.
@zethra oh yeah...ugh, every time I've tried to deal with concurrent file access it's been a nightmare and I ultimately try to serialize writing. I don't know nearly enough about filesystem & kernel internals to guess exactly what's going wrong, but there are certainly quite a few things that could be 😬
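For what "serialize writing" could look like, here is a minimal sketch. It deliberately uses plain blocking `std` writes rather than io_uring, and the function name `serialized_writes` and the message text are made up for illustration. It says nothing about the lost-completion issue itself; it only shows many threads funnelling writes to one file through a single lock.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawns `n_threads` writers that all write one line
/// to the same file, with a Mutex ensuring only one write runs at a time.
/// Returns the number of lines found in the file afterwards.
fn serialized_writes(path: &Path, n_threads: usize) -> io::Result<usize> {
    // Start from an empty file so the final line count is meaningful.
    let file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(path)?;
    let shared = Arc::new(Mutex::new(file));

    let handles: Vec<_> = (0..n_threads)
        .map(|id| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || -> io::Result<()> {
                // Holding the guard for the whole write keeps writes from overlapping.
                let mut f = shared.lock().unwrap();
                writeln!(f, "hello from thread {}", id)
            })
        })
        .collect();

    // Wait for every writer and surface the first I/O error, if any.
    for h in handles {
        h.join().expect("writer thread panicked")?;
    }

    // Read the file back and count the lines that actually landed.
    let contents = fs::read_to_string(path)?;
    Ok(contents.lines().count())
}
```

Called with 20 threads, every line should show up in the file, in whatever order the threads happened to take the lock.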