On Sun, Jun 28, 2020 at 9:04 AM <alexandrebfarias@xxxxxxxxx> wrote:

> I'm willing to perform further testing. There shouldn't be anything very
> special about my workload. I was working mostly with NodeJS 12 and React
> Native. VS Code (I should mention I make use of TabNine, which can be a
> huge drain on system resources). So, in a typical work session I'd have
> the Android emulator open, PostgreSQL, some Chrome tabs, VS Code,
> probably Emacs, plus the React Native metro server and an Express.js
> backend.

Databases and VM images are things Btrfs is bad at out of the box. Most of this has to do with the fsync-heavy behavior these applications inherit from other file systems. Btrfs is equipped to deal with an fsync-heavy world out of the box, using a tree log that's enabled by default, but it can still be slow for some workloads.

What I think is going on in your case: Btrfs is copy-on-write for everything by default. If the workload involves heavy writes on a small volume [1], the SSD gets no hinting about deallocated blocks. XFS will overwrite in place, and that is the hint the SSD firmware needs to erase those blocks and prepare them for fast writes. While you can turn copy-on-write off for data, it's always copy-on-write for metadata, and the workload you're describing is metadata heavy.

I don't think your SSD is bad. I think it's just (a) small for the workload and (b) not getting any hints about what's been freed up, so it can't prepare for future writes. The SSD ends up trying to erase blocks right at the moment of allocation, which is very slow for any SSD.

Also, the workload implies a lot of fsyncs. Other file systems need them; Btrfs really doesn't. But it's an fsync-dominated world, so it has to fit in, and while it has some optimizations for this, it can still be slower than XFS.

How to address this? Stick with what's working: use XFS. This is also consistent with Facebook still keeping these workloads on XFS. But if you really want to give Btrfs a shot at your workload, there are three possible optimizations:
1. Mount options space_cache=v2 (this will be the default soon) and discard=async. This might fix most of the problem. If I'm correct that the SSD is just inundated, this will give it the hints it needs to prepare blocks for fast writes, but not so aggressively that the hints themselves slow things down. (It's a fine line between getting no hints and a fire hose of them; discard=async is in between.)

2. Mount option flushoncommit (you'll get benign, but annoying, WARN_ONs in dmesg), and fsync = off in postgresql.conf (really, everywhere you can). Note: if you get a crash you'll lose the last ~30s of commits, but the database and the file system are expected to be consistent. The commit interval is configurable and defaults to 30s; I suggest leaving it there for testing. Changing it is mainly a risk vs. performance trade-off.

3. For VM images there are two schools of thought, depending on your risk tolerance.

   A. nodatacow (chattr +C). Use with cache=writeback. flushoncommit isn't necessary.

   B. datacow. Use with compression (mount option or chattr +c). Use with cache=unsafe. flushoncommit highly recommended.

Note 1: chattr +C/+c needs to be set at the time of the file's creation; it won't work after the fact. Set it on the containing directory before copying an image over.

Note 2: Invariably you will prefer the performance of B. But obviously it's not going to be a default configuration, because cache=unsafe basically drops fsyncs, and flushoncommit spams WARN_ONs.

And yeah, how would anyone know all of this? Is it an opportunity for docs (probably) or for desktop integration? Detect this workload, or ask the user? I'm not sure.

[1] From your email, the kickstart shows

> part btrfs.475 --fstype="btrfs" --ondisk=sda --size=93368

93G is likely making things worse for your SSD. It's small for this workload. Chances are that if it were bigger, it would cope better by effectively being over-provisioned, and it would more easily get erase blocks ready. But discard=async will mitigate this.
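To make option 1 concrete, here's a sketch of what the fstab entry might look like. The UUID and mount point are placeholders, not from the original report:

```
# /etc/fstab -- UUID and mount point are examples; adjust for your system
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /home  btrfs  space_cache=v2,discard=async  0 0
```

The same options can be tested non-persistently with mount -o remount before committing them to fstab.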
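For option 2, the PostgreSQL side is a one-line change. This is only sensible for testing on a flushoncommit-mounted Btrfs; the setting is real, but treat the combination as an experiment:

```
# postgresql.conf -- testing only, alongside the flushoncommit mount option.
# A crash can lose the last ~30s of commits, but fs and db should stay consistent.
fsync = off
```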
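And since Note 1 trips people up, here's a sketch of the nodatacow setup from option 3A. The directory path and image name are examples only, assuming a libvirt-style layout:

```shell
# Create the directory first, then set No_COW on it; files created or
# copied into it afterward inherit the attribute.
mkdir /var/lib/libvirt/images/nocow
chattr +C /var/lib/libvirt/images/nocow

# Copying the image in creates a new file, which inherits +C.
cp /var/lib/libvirt/images/original.qcow2 /var/lib/libvirt/images/nocow/

# Verify: 'C' should appear in the attribute flags.
lsattr /var/lib/libvirt/images/nocow/original.qcow2
```

Running chattr +C on the existing image file instead would appear to succeed but not actually disable copy-on-write for data already written, which is why the directory has to get the flag first.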
--
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx