On Fri, Aug 23, 2019 at 08:39:59AM +1000, Dave Chinner wrote:
> On Thu, Aug 22, 2019 at 09:40:17AM -0400, Brian Foster wrote:
> > On Thu, Aug 22, 2019 at 07:14:52AM +1000, Dave Chinner wrote:
> > > On Wed, Aug 21, 2019 at 09:35:33AM -0400, Brian Foster wrote:
> > > > On Wed, Aug 21, 2019 at 06:38:19PM +1000, Dave Chinner wrote:
> > > > kmem_alloc_io() interface) to skip further kmem_alloc() calls from this
> > > > path once we see one unaligned allocation. That assumes this behavior is
> > > > tied to functionality that isn't dynamically configured at runtime, of
> > > > course.
> > >
> > > vmalloc() has a _lot_ more overhead than kmalloc (especially when
> > > vmalloc has to do multiple allocations itself to set up page table
> > > entries), so there is still an overall gain to be had by trying
> > > kmalloc even if 50% of allocations are unaligned.
> > >
> > I had the impression that this unaligned allocation behavior is tied to
> > enablement of debug options that otherwise aren't enabled/disabled
> > dynamically. Hence, the unaligned allocation behavior is persistent for
> > a particular mount and repeated attempts are pointless once we see at
> > least one such result. Is that not the case?
>
> The alignment for a given slab is likely to be persistent, but it
> will be different for different sized slabs. e.g. I saw 128 byte offsets
> for 512 byte slabs, and 1024 byte offsets for 4k slabs. The 1024 byte
> offsets worked just fine (because they're a multiple of 512 bytes!) but
> the 128 byte ones didn't.
>

Interesting. I wasn't aware offsets could be large enough to maintain
alignment requirements.

> Hence it's not a black and white "everything is unaligned and
> unsupportable" situation, nor is the alignment necessarily an issue
> for the underlying driver. e.g. most scsi and nvme handle 8
> byte alignment of buffers, and if the buffer is not aligned they
> bounce it (detected via the same blk_rq_alignment() check I added)
> and can still do the IO anyway. So a large amount of the IO stack
> just doesn't care about user buffers being unaligned....
>
> > Again, I don't think performance is a huge deal so long as testing shows
> > that an fs is still usable with XFS running this kind of allocation
> > pattern.
>
> It's added 10 minutes to the runtime of a full auto run with KASAN
> enabled on pmem. To put that in context, the entire run took:
>
> real    461m36.831s
> user    44m31.779s
> sys     708m37.467s
>
> More than 7.5 hours to complete, so ten minutes here or there is
> noise.
>

Agreed. Thanks for the data.

> > In thinking further about it, aren't we essentially bypassing
> > these tools for associated allocations if they don't offer similar
> > functionality for vmalloc allocations?
>
> kasan uses guard pages around vmalloc allocations to detect out of
> bounds accesses. It still tracks the page allocations, etc., so we
> still get use-after-free tracking and the like. i.e. AFAICT we don't
> actually lose any debugging functionality by using vmalloc here.
>

Ok.

> > It might be worth 1.) noting that as a consequence of this change in
> > the commit log and 2.) having a one-shot warning somewhere when we
> > initially hit this problem so somebody actually using one of these
> > tools realizes that enabling it actually changes allocation behavior.
> > For example:
> >
> >   XFS ...: WARNING: Unaligned I/O memory allocation. VM debug enabled?
> >   Disabling slab allocations for I/O.
> >
> > ... or alternatively just add a WARN_ON_ONCE() or something with a
> > similar comment in the code.
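To be concrete, the kind of thing I had in mind is below. This is just
an illustrative sketch on my part (the function name, the hardcoded
512 byte alignment and the exact warning text are all made up here,
not taken from the actual patch):

	/*
	 * Try the slab allocator first; if the returned buffer isn't
	 * aligned for sector-sized I/O (e.g. because a VM debug option
	 * is padding objects), warn once and fall back to vmalloc(),
	 * which always returns page-aligned memory.
	 */
	#include <linux/kernel.h>
	#include <linux/slab.h>
	#include <linux/vmalloc.h>

	static void *
	kmem_alloc_io_sketch(size_t size, gfp_t flags)
	{
		void	*ptr;

		ptr = kmalloc(size, flags);
		if (ptr) {
			if (IS_ALIGNED((unsigned long)ptr, 512))
				return ptr;
			WARN_ONCE(1,
	"XFS: unaligned IO allocation, VM debug enabled? Falling back to vmalloc.\n");
			kfree(ptr);
		}
		return vmalloc(size);
	}

The point being that the warning fires only once, so anybody running
with a debug kernel at least gets a breadcrumb explaining the changed
allocation pattern without spamming the log.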
>
> Well, the WARN_ON_ONCE is in xfs_bio_add_page() when it detects an
> invalid alignment. So we'll get this warning on production systems
> as well as debug/test systems. I think that's the important case to
> catch, because misalignment will result in silent data corruption...
>

That's in the subsequent patch that I thought was being dropped (or
more accurately, deferred until Christoph has a chance to try and get
it fixed in the block layer)...? A rough sketch of my understanding of
that check is at the bottom of this mail, for reference.

Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
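P.S. For anyone following the thread, the bio-level alignment check
being referred to above is (as I understand it) roughly the following
shape. Again, this is only a sketch with made up naming and placement
on my part, not the actual deferred patch:

	#include <linux/bio.h>
	#include <linux/blkdev.h>
	#include <linux/mm.h>

	/*
	 * Verify a buffer satisfies the queue's DMA alignment before
	 * adding it to the bio, and warn if it doesn't, because a
	 * driver that can't bounce the buffer may otherwise corrupt
	 * data silently.
	 */
	static int
	xfs_bio_add_page_sketch(struct bio *bio, struct page *page,
			unsigned int len, unsigned int off)
	{
		struct request_queue	*q = bio->bi_disk->queue;
		unsigned long		addr;

		addr = (unsigned long)page_address(page) + off;

		/* same test the block layer applies to kernel buffers */
		WARN_ON_ONCE(!blk_rq_aligned(q, addr, len));

		return bio_add_page(bio, page, len, off);
	}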