On Mon, Sep 20, 2021 at 12:42:44PM +0100, Matthew Wilcox wrote:
> On Mon, Sep 20, 2021 at 09:54:31AM +0100, Mel Gorman wrote:
> > This has been lightly tested only and the testing was useless as the
> > relevant code was not executed. The workload configurations I had
> > that used to trigger these corner cases no longer work (yey?) and
> > I'll need to implement a new synthetic workload. If someone is aware
> > of a realistic workload that forces reclaim activity to the point
> > where reclaim stalls then kindly share the details.
> 
> The stereotypical "stalling on I/O" problem is to plug in one of the
> crap USB drives you were given at a trade show and simply
> 
> 	dd if=/dev/zero of=/dev/sdb
> 	sync
> 

The test machines are 1500km away so plugging in a USB stick is not
really an option but, worst comes to the worst, I could test it on a
laptop. I considered using the IO controller but I'm not sure that
would throttle background writeback. I dismissed doing this for a few
reasons though -- the dirtying would be rate-limited based on the speed
of the BDI, so it would not necessarily trigger the condition. It also
misses the other interesting cases -- throttling due to excessive
isolation and throttling due to failing to make progress.

I've prototyped a synthetic case that uses 4..(NR_CPUS*4) workers.
1 worker measures mmap/munmap latency. 1 worker under fio is randomly
reading files. The remaining workers are split between fio doing random
write IO on separate files and anonymous memory hogs reading large
mappings every 5 seconds. The aggregate WSS is approximately
totalmem*2, split between 60% anon and 40% file-backed (40% chosen to
be 2*dirty_ratio). After a warmup period based on the writeback speed,
it runs for 5 minutes per number of workers. The primary metric of
"goodness" will be the mmap latency because it's the smallest worker
that should be able to make quick progress, and I want to see how much
it is interfered with during reclaim. I'll be graphing the throttling
times to see which processes get throttled and for how long. A rough
sketch of the worker mix is below my signature.

I was hoping though that there was a canonical realistic case that the
FS people use to stress the paths where the allocator fails to return
memory. While my synthetic workload *might* work to trigger the cases,
I would prefer to have something that can compare this basic approach
with anything that is more clever. Similarly, it would be nice to have
a reasonable test case that phase-changes what memory is hot while
there is heavy IO in the background, to detect whether the hot WSS is
being properly protected. I used to use memcached and a heavy writer to
simulate this but it's weak because there is no phase change, so it's
poor at evaluating vmscan.

> You can also set up qemu to have extremely slow I/O performance:
> https://serverfault.com/questions/675704/extremely-slow-qemu-storage-performance-with-qcow2-images
> 

Similar problem to the slow USB case -- it only catches one part of the
picture, except now I have to worry about differences that are related
to the VM configuration (e.g. pinning virtual CPUs to physical CPUs and
replicating topology). Fine for a functional test, not so fine for
measuring whether the patch is any good performance-wise. If a VM does
turn out to be necessary, throttling the drive directly would at least
make the slowness reproducible; a sketch of that is also below.

-- 
Mel Gorman
SUSE Labs
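
A rough sketch of the synthetic worker mix, using fio for the
file-backed portions and stress-ng as a stand-in for the anonymous
hogs. All sizes, job counts and runtimes are illustrative; the real
harness scales them from totalmem and NR_CPUS and includes a dedicated
mmap/munmap latency worker that is not shown here.

  #!/bin/bash
  # One fio job randomly reading a preallocated file.
  fio --name=randread --directory=/mnt/test --size=2g --bs=4k \
      --rw=randread --time_based --runtime=300 &

  # Random writers dirtying separate files (one per job by default) to
  # push against dirty_ratio and force writeback throttling.
  fio --name=randwrite --directory=/mnt/test --size=2g --bs=4k \
      --rw=randwrite --numjobs=4 --time_based --runtime=300 &

  # Anonymous memory hogs touching large mappings. --vm-keep reuses
  # the same mapping instead of remapping on every iteration.
  stress-ng --vm 4 --vm-bytes 2g --vm-keep --timeout 300 &

  wait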
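On the qemu option, the I/O slowness can be made reproducible with
qemu's built-in drive throttling rather than relying on slow qcow2
handling. A minimal sketch, assuming a reasonably recent qemu; the
limits and image names are arbitrary:

  # Cap the data disk at ~4MB/sec and 64 IOPS so that dirtying
  # outpaces writeback without needing a physically slow device.
  qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 \
      -drive file=root.img,if=virtio \
      -drive file=data.img,if=virtio,throttling.bps-total=4194304,throttling.iops-total=64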