On Fri, May 31, 2019 at 01:11:36PM -0400, Brian Foster wrote:
> On Sun, May 26, 2019 at 08:43:17AM +1000, Dave Chinner wrote:
> > Most of the cases I've seen have had the same symptom - "skip to
> > next AG, allocate at same high-up-in AGBNO target as the previous AG
> > wanted, then allocate backwards in the same AG until freespace
> > extent is exhausted. It then skips to some other freespace extent,
> > and depending on whether it's a forwards or backwards skip the
> > problem either goes away or continues. This is not a new behaviour,
> > I first saw it some 15 years ago, but I've never been able to
> > provoke it reliably enough with test code to get to the root
> > cause...
> >
>
> I guess the biggest question to me is whether we're more looking for a
> backwards searching pattern or a pattern where we split up a larger free
> extent into smaller chunks (in reverse), or both. I can definitely see
> some isolated areas where a backwards search could lead to this
> behavior. E.g., my previous experiment to replace near mode allocs with
> size mode allocs always allocates in reverse when free space is
> sufficiently fragmented. To see this in practice would require repeated
> size mode allocations, however, which I think is unlikely because once
> we jump AGs and do a size mode alloc, the subsequent allocs should be
> near mode within the new AG (unless we jump again and again, which I
> don't think is consistent with what you're describing).
>
> Hmm, the next opportunity for this kind of behavior in the near mode
> allocator is probably the bnobt left/right span. This would require the
> right circumstances to hit. We'd have to bypass the first (cntbt)
> algorithm then find a closer extent in the left mode search vs. the
> right mode search, and then probably repeat that across however many
> allocations it takes to work out of this state.
>
> If instead we're badly breaking up an extent in the wrong order, it
> looks like we do have the capability to allocate the right portion of an
> extent (in xfs_alloc_compute_diff()) but that is only enabled for non
> data allocations. xfs_alloc_compute_aligned() can cause a similar effect
> if alignment is set, but I'm not sure that would break up an extent into
> more than one usable chunk.

This pretty much matches what I've been able to infer about the
cause, but lacking a way to actually trigger it and monitor the
behaviour in real time is where I've got stuck on this. I see the
result in aged, fragmented filesystems and can infer how it may have
occurred, but can't cause it to occur on demand...

> In any event, maybe I'll hack some temporary code in the xfs_db locality
> stuff to quickly flag whether I happen to get lucky enough to reproduce
> any instances of this during the associated test workloads (and if so,
> try and collect more data).

*nod*

Best we can do, I think, and hope we stumble across an easily
reproducible trigger...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx