On 9/14/23 07:40, Jay Patel wrote:
> On Thu, 2023-09-07 at 15:42 +0200, Vlastimil Babka wrote:
>> On 8/24/23 12:52, Jay Patel wrote:
>>
>> How can increased fraction_size ever result in a lower order? I think it
>> can only result in an increased order (or the same order). And the
>> simulations with my hack patch don't seem to provide a counterexample to
>> that. Note that previously I did expect the order to be lower (or the
>> same) and was surprised by my results, but now I realize I misunderstood
>> the v4 patch.
>
> Hi, sorry for the late reply as I was on vacation :)
>
> You're absolutely right. Increasing the fraction size won't reduce the
> order, and I apologize for any confusion in my previous response.

No problem, glad that it's cleared up :)

>>
>> > 2) Have also seen a reduction in overall slab cache numbers because of
>> > the increased page order
>>
>> I think your results might be just due to randomness and could turn out
>> different when repeating the test, or converge to be the same if you
>> average multiple runs. You posted them for "160 CPUs with 64K Page size"
>> and if I add that combination to my hack print, I see the same result
>> before and after your patch:
>>
>> Calculated slab orders for page_shift 16 nr_cpus 160:
>>          8  0
>>       1824  1
>>       3648  2
>>       7288  3
>>     174768  2
>>     196608  3
>>     524296  4
>>
>> Still, I might have a bug there. Can you confirm there are actual
>> differences in /proc/slabinfo before/after your patch? If there are
>> none, any differences observed have to be due to randomness, not
>> differences in order.
>
> Indeed, to eliminate randomness, I've consistently gathered data from
> /proc/slabinfo, and I can confirm a decrease in the total number of
> slab caches.
>
> Values as on a 160 cpu system with 64k page size:
> Without patch: 24892 slab caches
> With patch:    23891 slab caches

I would like to see why exactly they decreased; given what the patch does,
it has to be due to getting higher order slab pages. So the values of the
"<objperslab> <pagesperslab>" columns should increase for some caches -
which ones, and what is their <objsize>?

>>
>> Going back to the idea behind your patch, I don't think it makes sense
>> to try to increase the fraction only for higher orders. Yes, with a 1/16
>> fraction, the waste with a 64kB page can be 4kB, while with 1/32 it will
>> be just 2kB, and with 4kB pages this is only 256 vs 128 bytes. However,
>> the object sizes and counts don't differ with page size, so with 4kB
>> pages we'll have more slabs to host the same number of objects, and the
>> waste will accumulate accordingly - i.e. the fraction metric should be
>> independent of page size wrt the resulting total kilobytes of waste.
>>
>> So maybe the only thing we need to do is to try setting the initial
>> value to 32 instead of 16, regardless of page size. That should
>> hopefully again show a good tradeoff for 4kB as one of the earlier
>> versions did, while on 64kB it shouldn't cause much difference (again,
>> none at all with 160 cpus, some difference with fewer than 128 cpus, if
>> my simulations were correct).
>>
> Yes, we can modify the default fraction size to 32 for all page sizes.
> I've noticed that on a 160 CPU system with a 64K page size, there's a
> noticeable decrease in the total memory allocated for slabs.
>
> Alright, I'll make the necessary changes to the patch, setting the
> fraction size default to 32, and I'll post v5 along with some
> performance metrics.
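For reference, below is a rough userspace sketch of the waste-fraction check
we have been discussing. It is not the actual mm/slub.c calculate_order()
logic (the min_objects scaling with nr_cpus, the slub_max_order cap and the
fallback paths are all left out, and find_order() / MAX_SLAB_ORDER are just
illustrative names); it only shows why a larger fraction, i.e. a stricter
waste limit, can keep or raise the chosen order, but never lower it:

	/*
	 * Userspace sketch: a slab of 2^order pages is accepted once the
	 * leftover bytes fit within slab_size / fraction. Not the real
	 * kernel code, just the core idea of the fraction check.
	 */
	#include <stdio.h>

	#define PAGE_SHIFT	16	/* 64K pages, as on the reported system */
	#define MAX_SLAB_ORDER	4

	/* smallest order whose leftover fits within slab_size / fraction */
	static unsigned int find_order(unsigned int size, unsigned int fraction)
	{
		unsigned int order;

		for (order = 0; order < MAX_SLAB_ORDER; order++) {
			unsigned int slab_size = 1u << (PAGE_SHIFT + order);

			if (slab_size % size <= slab_size / fraction)
				break;
		}
		/* MAX_SLAB_ORDER means no tried order satisfied the limit;
		 * the real code has further fallbacks for that case. */
		return order;
	}

	int main(void)
	{
		unsigned int sizes[] = { 1824, 3000, 7288 };
		unsigned int i;

		for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
			printf("size %5u: order %u with 1/16, order %u with 1/32\n",
			       sizes[i], find_order(sizes[i], 16),
			       find_order(sizes[i], 32));
		return 0;
	}

For example, with 64K pages an object size of 3000 bytes leaves 2536 bytes
of waste in an order-0 slab, which passes the 1/16 limit (4096) but not the
1/32 limit (2048), so the stricter fraction pushes it to order 1.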
Could you please also check my cleanup series at
https://lore.kernel.org/all/20230908145302.30320-6-vbabka@xxxxxxx/
(I did Cc you there). If it makes sense, I'd like to apply the further
optimization on top of those cleanups, not the other way around. Thanks!

>>
>> > > Anyway my point here is that this evaluation approach might be
>> > > useful, even if it's a non-upstreamable hack, and some
>> > > postprocessing of the output is needed for easier comparison of
>> > > before/after, so feel free to try that out.
>> >
>> > Thank you for this detailed test :)
>> >
>> > > BTW I'll be away for 2 weeks from now, so further feedback will
>> > > have to come from others in that time...
>> > >
>> > Do we have any additional feedback from others on the same matter?
>> >
>> > Thanks
>> >
>> > Jay Patel
>> > > > Thanks!
>> > > > --
>> > > > Hyeonggon