I'm probably never going to write the actual article I'd originally intended these charts for. But if you want to see the difference in latency between #OpenZFS and #btrfs on an eight-drive system that's creating and replicating automated snapshots regularly, here ya go.
We're looking at fio random access, rate-limited to a simultaneous 8MiB/sec read and 23.0MiB/sec write. The system has eight 12TB Ironwolf rust drives, configured as four ZFS mirrors vs one eight-wide btrfs-raid1.
In each case, the system is creating and replicating snapshots regularly.
Most of the latency deltas you're seeing come from the snapshot/replication tasks. Without those, you do still see a ZFS advantage, but nothing this catastrophically severe.
Note that #btrfs is displaying latencies two and sometimes THREE orders of magnitude higher than #ZFS across a disturbing amount of the range of results. This is not just an issue at the absolute fastest or slowest ends of the scale, this is... normality.
You might very reasonably ask "what about btrfs-raid10?", especially considering that it's the closest-to-sane multi-drive #btrfs topology.
Well, it's just the tiniest bit slower than btrfs-raid1 (in latency terms) in my testing, but the shape of the graph is unchanged.
This is the fio control file I used for both ZFS and btrfs. (Hint: if you want the raw text, check the ALT tag; you can copy and paste from there.)
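If the alt text doesn't come through for you, a jobfile matching that description would look roughly like the sketch below. To be clear, this is a reconstruction, not the original control file: only the random-access pattern, the 8MiB/sec and 23.0MiB/sec rate caps, and the roughly 45-minute runtime come from this thread; the ioengine, block size, queue depth, file size, and paths are all my assumptions.

```
# Reconstruction only -- NOT the original control file. Everything not stated
# in the thread (ioengine, bs, iodepth, size, directory) is an assumption.
[global]
ioengine=libaio
bs=4k
iodepth=1
size=256g
directory=/test
time_based
runtime=45m
# log per-I/O completion latencies, which is the raw material for charts like these
write_lat_log=latency

[rate-limited-randrw]
rw=randrw
# cap reads at 8MiB/sec and writes at 23MiB/sec, running simultaneously
rate=8m,23m
```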
Snapshots are taken every five seconds; replication happens every twenty.
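For the curious, a bare-bones sketch of that cadence is below. This is not the tooling actually used for these tests, and the pool/dataset names are made up; it's just to show the shape of the work the filesystems were carrying alongside fio. The ZFS side is shown; the btrfs side used btrfs subvolume snapshot plus btrfs send/receive instead.

```
#!/bin/bash
# Hypothetical cadence sketch only: snapshot every 5 seconds, replicate every
# 20 seconds. Pool/dataset names are invented; this is not the actual test tooling.
SRC=tank/test
DST=backup/test
i=0
prev=""
while true; do
    snap="auto-$i"
    zfs snapshot "$SRC@$snap"
    if (( i % 4 == 0 )); then
        # every fourth snapshot (i.e. every 20 seconds), replicate to the target
        if [[ -n "$prev" ]]; then
            zfs send -i "@$prev" "$SRC@$snap" | zfs receive -F "$DST"
        else
            zfs send "$SRC@$snap" | zfs receive -F "$DST"
        fi
        prev="$snap"
    fi
    i=$(( i + 1 ))
    sleep 5
done
```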
Adding insult to injury: in the roughly 45-minute runtime of each test, ZFS replicated 589 of the 593 snapshots it took... btrfs replicated only 186 of the 374 it took.
These tests were performed in Jan 2021, using the then-current HWE kernel for Ubuntu 20.04 (kernel v5.8).
The tests and charts demonstrate problems I first noticed *in production* seven years earlier, in Jan 2014.
@jimsalter given the way every kernel release seems to include some btrfs work, it would be cool to see that graph move over time.
@keyboardg personal gripe, as somebody who's followed the filesystem for more than a decade: it would be nice to see a LOT of btrfs-related things actually moving over time. But none of the things I really care about ever seem to.
@jimsalter Excuse my ignorance, but what does the x-axis stand for? Percentage of how full the filesystem is?
@ojs no pardon necessary, thank you for asking!
This is a range of fio latency results on a long-running test. What you're looking at is a line of individual data points running from best result (lowest latency) on the left, at x=0%, to worst result (highest latency) on the right, at x=100%.
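Mechanically, a curve like that is nothing fancy: take every individual per-I/O completion latency from the run, sort from fastest to slowest, and plot each one at x = its rank as a percentage of the total. A throwaway sketch of that step is below; the latencies.txt input is hypothetical (one latency per line), so fio's own latency logs would need a little massaging into that shape first.

```
# Hypothetical: latencies.txt holds one completion latency per line, one line per I/O.
total=$(wc -l < latencies.txt)
sort -n latencies.txt | awk -v total="$total" '
    # x = rank as a percentage of all I/Os, y = the latency itself
    { printf "%.2f%%\t%s\n", 100 * NR / total, $1 }
' > latency-curve.tsv
```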
@ojs so, let's say a graph like this showed two overlapping lines, but from 0%<x<5%, one line was higher. That would mean the two systems perform equivalently 95% of the time, but one block out of 20 is faster for the lower line... which probably doesn't much matter, since those are the fastest results anyway.
If you see the same at 95%<x<100%, that means one system is slower than the other on the worst-case 5%. This is more likely to be significant since the difference is where the pain lives.
@ojs what we're seeing here is much worse than either of those cases. These are log-scale graphs, meaning each major line on the Y axis represents an increase of 10x.
Take the read latency chart: for roughly 40% of the ENTIRE range, btrfs is 10x OR MORE slower to return each block than ZFS is. This is not a minor issue; it's a massive degradation that you will experience constantly under a similar workload.
@ojs moving on: because this is a rate-limited workload, we're not seeing how each system performs under the worst possible conditions with an unreasonably heavy load. We're seeing how it operates under a REASONABLE workload that the hardware is more than capable of sustaining.
@jimsalter Fantastic, thanks for the clarification