On Thu, 2023-09-14 at 08:38 +0200, Vlastimil Babka wrote:
> On 9/14/23 07:40, Jay Patel wrote:
> > On Thu, 2023-09-07 at 15:42 +0200, Vlastimil Babka wrote:
> > > On 8/24/23 12:52, Jay Patel wrote:
> > > How can increased fraction_size ever result in a lower order? I
> > > think it can only result in increased order (or same order). And
> > > the simulations with my hack patch don't seem to give a
> > > counterexample to that. Note previously I did expect the order to
> > > be lower (or same) and was surprised by my results, but now I
> > > realized I misunderstood the v4 patch.
> > 
> > Hi, sorry for the late reply as I was on vacation :)
> > 
> > You're absolutely right. Increasing the fraction size won't reduce
> > the order, and I apologize for any confusion in my previous
> > response.
> 
> No problem, glad that it's cleared :)
> 
> > > > 2) Have also seen a reduction in overall slab cache numbers
> > > > because of increasing page order
> > > 
> > > I think your results might be just due to randomness and could
> > > turn out different with repeating the test, or converge to be the
> > > same if you average multiple runs. You posted them for "160 CPUs
> > > with 64K Page size" and if I add that combination to my hack
> > > print, I see the same result before and after your patch:
> > > 
> > > Calculated slab orders for page_shift 16 nr_cpus 160:
> > >          8  0
> > >       1824  1
> > >       3648  2
> > >       7288  3
> > >     174768  2
> > >     196608  3
> > >     524296  4
> > > 
> > > Still, I might have a bug there. Can you confirm there are actual
> > > differences in /proc/slabinfo before/after your patch? If there
> > > are none, any differences observed have to be due to randomness,
> > > not differences in order.
> > 
> > Indeed, to eliminate randomness, I've consistently gathered data
> > from /proc/slabinfo, and I can confirm a decrease in the total
> > number of slab caches.
> > 
> > Values on a 160 CPU system with 64K page size:
> > Without patch: 24892 slab caches
> > With patch:    23891 slab caches
> 
> I would like to see why exactly they decreased; given what the patch
> does, it has to be due to getting higher-order slab pages. So the
> values of the "<objperslab> <pagesperslab>" columns should increase
> for some caches - which ones, and what is their <objsize>?

Yes, correct: an increase in page order for a slab cache will result
in increased "<objperslab> <pagesperslab>" values. I only checked the
total number of slab caches, so let me check these values in detail
and get back with the objsize :)

> > > Going back to the idea behind your patch, I don't think it makes
> > > sense to try to increase the fraction only for higher orders.
> > > Yes, with a 1/16 fraction, the waste with a 64kB page can be 4kB,
> > > while with 1/32 it will be just 2kB, and with 4kB this is only
> > > 256 vs 128 bytes. However, the object sizes and counts don't
> > > differ with page size, so with 4kB pages we'll have more slabs to
> > > host the same number of objects, and the waste will accumulate
> > > accordingly - i.e. the fraction metric should be independent of
> > > page size wrt resulting total kilobytes of waste.
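
Right, and just to spell the arithmetic out as an illustration
(assuming the per-slab leftover is capped at slab_size / fraction):
with fraction 16 a 64kB slab can leave up to 4kB unused while a 4kB
slab leaves at most 256 bytes, but hosting the same number of objects
takes 16 times as many 4kB slabs, so both page sizes end up bounded by
roughly the same total kilobytes of waste. It is the fraction, not the
page size, that sets the bound.
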
> > > So maybe the only thing we need to do is to try setting it to 32
> > > as the initial value instead of 16, regardless of page size. That
> > > should hopefully again show a good tradeoff for 4kB as one of the
> > > earlier versions, while on 64kB it shouldn't cause much
> > > difference (again, none at all with 160 cpus, some difference
> > > with less than 128 cpus, if my simulations were correct).
> > 
> > Yes, we can modify the default fraction size to 32 for all page
> > sizes. I've noticed that on a 160 CPU system with a 64K page size,
> > the total memory allocated for slabs decreases noticeably.
> > 
> > Alright, I'll make the necessary changes to the patch, setting the
> > fraction size default to 32, and I'll post v5 along with some
> > performance metrics.
> 
> Could you please also check my cleanup series at
> https://lore.kernel.org/all/20230908145302.30320-6-vbabka@xxxxxxx/
> (I did Cc you there). If it makes sense, I'd like to apply the
> further optimization on top of those cleanups, not the other way
> around.
> 
> Thanks!

I've just gone through that patch series, and yes, we can adjust the
fraction size related change within that series :)

> > > > > Anyway my point here is that this evaluation approach might
> > > > > be useful, even if it's a non-upstreamable hack, and some
> > > > > postprocessing of the output is needed for easier comparison
> > > > > of before/after, so feel free to try that out.
> > > > 
> > > > Thank you for this detailed test :)
> > > > 
> > > > > BTW I'll be away for 2 weeks from now, so further feedback
> > > > > will have to come from others in that time...
> > > > 
> > > > Do we have any additional feedback from others on the same
> > > > matter?
> > > > 
> > > > Thanks,
> > > > 
> > > > Jay Patel
> > > 
> > > Thanks!
> > > 
> > > --
> > > Hyeonggon
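
For what it's worth, here is a rough userspace sketch of the
acceptance test being discussed - not the actual mm/slub.c code (the
real order calculation also takes min_objects into account and retries
with smaller fractions when nothing fits); pick_order(), the example
sizes and the page_size/max_order values are made up for illustration.
It only shows the core point: an order is accepted once the leftover
space is at most slab_size / fraction, so raising the fraction from 16
to 32 can keep the chosen order the same or push it higher, never
lower it.

#include <stdio.h>

/*
 * Pick the smallest order (up to max_order) whose leftover space is
 * at most slab_size / fraction; fall back to max_order if no smaller
 * order qualifies.  Simplified on purpose: no min_objects handling,
 * no fraction fallback.
 */
static unsigned int pick_order(unsigned int size, unsigned int page_size,
			       unsigned int max_order, unsigned int fraction)
{
	unsigned int order;

	for (order = 0; order < max_order; order++) {
		unsigned int slab_size = page_size << order;

		if (slab_size < size)
			continue;	/* object does not fit yet */
		if (slab_size % size <= slab_size / fraction)
			return order;	/* leftover is acceptable */
	}
	return max_order;
}

int main(void)
{
	unsigned int sizes[] = { 96, 1824, 3648, 7288 };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("size %6u: 4K page order %u/%u, 64K page order %u/%u (fraction 16/32)\n",
		       sizes[i],
		       pick_order(sizes[i], 4096, 3, 16),
		       pick_order(sizes[i], 4096, 3, 32),
		       pick_order(sizes[i], 65536, 3, 16),
		       pick_order(sizes[i], 65536, 3, 32));
	return 0;
}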