On Mon, 9 Oct 2023, Dave Chinner wrote:
> On Thu, Oct 05, 2023 at 10:35:33PM -0700, Hugh Dickins wrote:
> > On Thu, 5 Oct 2023, Dave Chinner wrote:
> > >
> > > Hmmmm. IIUC, this only works for addition that approaches the limit
> > > from below?
> >
> > That's certainly how I was thinking about it, and what I need for tmpfs.
> > Precisely what its limitations (haha) are, I'll have to take care to
> > spell out.
> >
> > (IIRC - it's a while since I wrote it - it can be used for subtraction,
> > but goes the very slow way when it could go the fast way - uncompared
> > percpu_counter_sub() much better for that. You might be proposing that
> > a tweak could adjust it to going the fast way when coming down from the
> > "limit", but going the slow way as it approaches 0 - that would be neat,
> > but I've not yet looked into whether it's feasibly done.)

Easily done once I'd looked at it from the right angle.

> >
> > >
> > > So if we are approaching the limit from above (i.e. add of a
> > > negative amount, limit is zero) then this code doesn't work the same
> > > as the open-coded compare+add operation would?
> >
> > To it and to me, a limit of 0 means nothing positive can be added
> > (and it immediately returns false for that case); and adding anything
> > negative would be an error since the positive would not have been allowed.
> >
> > Would a negative limit have any use?

There was no reason to exclude it, once I was thinking clearly about
the comparisons.

>
> I don't have any use for it, but the XFS case is decrementing free
> space to determine if ENOSPC has been hit. It's the opposite
> implementation to shmem, which increments used space to determine if
> ENOSPC is hit.

Right.

>
> > It's definitely not allowing all the possibilities that you could arrange
> > with a separate compare and add; whether it's ruling out some useful
> > possibilities to which it can easily be generalized, I'm not sure.
> >
> > Well worth a look - but it'll be easier for me to break it than get
> > it right, so I might just stick to adding some comments.
> >
> > I might find that actually I prefer your way round: getting slower
> > as approaching 0, without any need for specifying a limit?? That the
> > tmpfs case pushed it in this direction, when it's better reversed? Or
> > that might be an embarrassing delusion which I'll regret having mentioned.
>
> I think there's cases for both approaching an upper limit from
> below and a lower limit from above. Both are the same "compare and
> add" algorithm, just with minor logic differences...

Good, thanks, you've saved me: I was getting a bit fundamentalist there,
thinking to offer one simplest primitive from which anything could be
built. But when it came down to it, I had no enthusiasm for rewriting
tmpfs's used_blocks as free_blocks, just to avoid that limit argument.

>
> > > Hence I think this looks like a "add if result is less than"
> > > operation, which is distinct from the "add if result is greater
> > > than" operation that we use this same pattern for in XFS and ext4.
> > > Perhaps a better name is in order?
> >
> > The name still seems good to me, but a comment above it on its
> > assumptions/limitations well worth adding.
> >
> > I didn't find a percpu_counter_compare() in ext4, and haven't got
>
> Go search for EXT4_FREECLUSTERS_WATERMARK....

Ah, not a percpu_counter_compare() user, but doing its own thing.

>
> > far yet with understanding the XFS ones: tomorrow...
>
> XFS detects being near ENOSPC to change the batch update size so
> that when near ENOSPC the percpu counter always aggregates to the
> global sum on every modification. i.e. it becomes more accurate (but
> slower) near the ENOSPC threshold. Then if the result of the
> subtraction ends up being less than zero, it takes a lock (i.e.
> goes even slower!), undoes the subtraction that took it below zero,
> and determines if it can dip into the reserve pool or ENOSPC should
> be reported.
>
> Some of that could be optimised, but we need that external "lock and
> undo" mechanism to manage the reserve pool space atomically at
> ENOSPC...

Thanks for going above and beyond with the description; but I'll be
honest and admit that I only looked quickly, and did not reach any
conclusion as to whether such usage could or should be converted
to percpu_counter_limited_add() - which would never take any XFS
locks, of course, so might just end up doubling the slow work.

But absolutely I agree with you, and thank you for pointing out,
how stupidly useless percpu_counter_limited_add() was for decrementing -
it was nothing more than a slow way of doing percpu_counter_sub().

I'm about to send in a 9/8, extending it to be more useful: thanks.

Hugh
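
For illustration only: the two directions of "compare and add" discussed
in this thread (counting up towards an upper limit, as tmpfs does with
used blocks, and counting down towards a lower limit, as XFS does with
free space) can be modelled in a few lines of userspace C. This is a
single-threaded sketch under stated assumptions, not the kernel code and
not the 9/8 patch: model_counter, model_sum(), model_limited_add(),
model_count_successes(), NR_CPUS and BATCH are all hypothetical names and
values, and locking and interrupt handling are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS 4   /* hypothetical CPU count for this model */
#define BATCH   8   /* hypothetical per-CPU batch size */

struct model_counter {
    int64_t count;           /* shared, approximately-right total */
    int32_t local[NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/* Exact value: shared total plus every per-CPU delta (the slow way). */
static int64_t model_sum(const struct model_counter *c)
{
    int64_t sum = c->count;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}

/*
 * Add amount on behalf of cpu only if the exact result stays on the
 * permitted side of limit: <= limit when amount > 0, >= limit when
 * amount < 0.  Assumes every |local[cpu]| <= BATCH on entry, and
 * preserves that on exit.
 */
static bool model_limited_add(struct model_counter *c, int cpu,
                              int64_t limit, int64_t amount)
{
    int64_t unknown = (int64_t)BATCH * NR_CPUS; /* max hidden error */
    int64_t local;

    assert(cpu >= 0 && cpu < NR_CPUS);
    local = c->local[cpu];

    if (amount == 0)
        return true;

    /* Fast way: far from the limit, and the local delta stays small. */
    if (llabs(local + amount) <= BATCH &&
        ((amount > 0 && c->count + unknown <= limit) ||
         (amount < 0 && c->count - unknown >= limit))) {
        c->local[cpu] += (int32_t)amount;
        return true;
    }

    /* Slow way: compare the exact sum against the limit. */
    int64_t exact = model_sum(c) + amount;
    if ((amount > 0 && exact > limit) || (amount < 0 && exact < limit))
        return false;

    /* Fold this CPU's delta and the amount into the shared total. */
    c->count += local + amount;
    c->local[cpu] = 0;
    return true;
}

/* Apply step repeatedly, rotating over CPUs, until the limit refuses it. */
static int model_count_successes(struct model_counter *c,
                                 int64_t limit, int64_t step)
{
    int n = 0;

    assert(step != 0);
    while (model_limited_add(c, n % NR_CPUS, limit, step))
        n++;
    return n;
}
```

In this model, starting from 0 and adding +1 against a limit of 100
should succeed exactly 100 times, and starting from 100 and adding -1
against a limit of 0 should likewise succeed exactly 100 times; in both
cases the fast way is taken while the shared total is at least
BATCH * NR_CPUS away from the limit, and the slow way once it gets closer.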