On 07.08.20 09:08, Pekka Enberg wrote:
> Hi Cho and David,
>
> On Mon, Aug 3, 2020 at 10:57 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>> On 03.08.20 08:10, pullip.cho@xxxxxxxxxxx wrote:
>>> From: Cho KyongHo <pullip.cho@xxxxxxxxxxx>
>>>
>>> LPDDR5 introduces a rank switch delay. If three successive DRAM
>>> accesses happen and the first two access one rank while the last
>>> one accesses the other rank, the latency of the last access will be
>>> longer than that of the second.
>>> To address this penalty, we can sort the freelist so that a specific
>>> rank is allocated prior to another rank. We expect the page allocator
>>> to allocate pages from the same rank successively with this change.
>>> It will hopefully improve the proportion of consecutive memory
>>> accesses to the same rank.
>>
>> This certainly needs performance numbers to justify it ... and I am
>> sorry, "hopefully improves" is not a valid justification :)
>>
>> I can imagine that this works well initially, when there hasn't been a
>> lot of memory fragmentation going on. But quickly after your system is
>> under stress, I doubt this will be very useful. Prove me wrong. ;)
>>
>> ... I dislike this manual setting of "dram_rank_granule". Yet another
>> mm feature that can only be enabled by a magic command line parameter
>> where users have to guess the right values.
>>
>> (side note, there have been similar research approaches to improve
>> energy consumption by switching off ranks when not needed).
>
> I was thinking of the exact same thing. PALLOC [1] comes to mind, but
> perhaps there are more recent ones?

A more recent one is "Footprint-Based DIMM Hotplug"
(https://dl.acm.org/doi/abs/10.1109/TC.2019.2945562), which triggers
memory onlining/offlining from the kernel to disable banks where
possible (I don't think the approach is upstream material in that form).

Also, I stumbled over "Towards Practical Page Placement for a Green
Memory Manager" (https://ieeexplore.ieee.org/document/7397629),
proposing an adaptive buddy allocator that tries to keep complete banks
free in the buddy where possible. That approach sounded quite
interesting while skimming over the paper.

>
> I also dislike the manual knob, but is there a way for the OS to
> detect this by itself? My (perhaps outdated) understanding was that
> the DRAM address mapping scheme, for example, is not exposed to the
> OS.

I guess one universal approach is measuring access times ... not what
we might be looking for :)

>
> I think having more knowledge of DRAM controller details in the OS
> would be potentially beneficial for better page allocation policy, so
> maybe try to come up with something more generic, even if the fallback
> for providing this information is a kernel command line option.
>
> [1] http://cs-people.bu.edu/rmancuso/files/papers/palloc-rtas2014.pdf
>
> - Pekka

-- 
Thanks,
David / dhildenb
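
For illustration, the freelist trick described in the patch boils down to
something like the following userspace model. This is only a sketch, not
Cho's actual code: the 64 KiB granule, 4 KiB page size, and a single-bit
two-rank interleave are my assumptions.

#include <stdio.h>
#include <stdlib.h>

#define RANK_GRANULE   (1UL << 16)  /* assumed interleave granule: 64 KiB */
#define PREFERRED_RANK 0

struct page {
	unsigned long pfn;
	struct page *next, *prev;
};

struct free_list {
	struct page head;               /* circular list with a dummy head */
};

static int page_rank(unsigned long pfn)
{
	unsigned long paddr = pfn << 12;        /* assumed 4 KiB pages */
	return (paddr / RANK_GRANULE) & 1;      /* assumed two-rank interleave */
}

static void list_add(struct page *p, struct page *at)
{
	p->next = at->next;
	p->prev = at;
	at->next->prev = p;
	at->next = p;
}

/* Free a page: preferred rank goes to the head, the other to the tail. */
static void free_page_sorted(struct free_list *fl, struct page *p)
{
	if (page_rank(p->pfn) == PREFERRED_RANK)
		list_add(p, &fl->head);         /* head: handed out first */
	else
		list_add(p, fl->head.prev);     /* tail: handed out last */
}

int main(void)
{
	struct free_list fl = {
		.head = { .next = &fl.head, .prev = &fl.head }
	};
	struct page pages[8];

	for (int i = 0; i < 8; i++) {
		pages[i].pfn = i * (RANK_GRANULE >> 12); /* one page per granule */
		free_page_sorted(&fl, &pages[i]);
	}

	/* Head-first walk: all rank-0 pages come out before rank-1 pages. */
	for (struct page *p = fl.head.next; p != &fl.head; p = p->next)
		printf("pfn %lu rank %d\n", p->pfn, page_rank(p->pfn));
	return 0;
}

Since the allocator pops from the list head, the preferred rank is drained
first, which is the whole effect the patch is after; it also shows why the
benefit should erode once fragmentation mixes the ranks within higher-order
free blocks.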
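And for the "measuring access times" remark, a rough x86-only sketch of
what such probing could look like: time back-to-back uncached accesses to
address pairs at increasing strides and look for a consistent latency step
at the interleave boundary. All sizes and iteration counts here are made
up, and it ignores the virtual-to-physical mapping, which a real probe
would have to pin down (e.g. via hugepages or /proc/self/pagemap).

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>

#define BUF_SIZE (64UL << 20)   /* 64 MiB probe buffer (assumption) */
#define ROUNDS   10000

static uint64_t probe_pair(const char *a, const char *b)
{
	unsigned int aux;
	uint64_t start, end;

	_mm_clflush(a);                         /* force both lines to DRAM */
	_mm_clflush(b);
	_mm_mfence();

	start = __rdtscp(&aux);
	(void)*(volatile const char *)a;        /* first access: one rank */
	(void)*(volatile const char *)b;        /* second: may pay the
						   rank-switch delay */
	_mm_mfence();
	end = __rdtscp(&aux);
	return end - start;
}

int main(void)
{
	char *buf = aligned_alloc(4096, BUF_SIZE);
	if (!buf)
		return 1;

	/* touch the buffer so the pages are actually backed by DRAM */
	for (size_t i = 0; i < BUF_SIZE; i += 4096)
		buf[i] = 1;

	/* sweep power-of-two strides and print the average pair latency */
	for (size_t stride = 4096; stride <= BUF_SIZE / 2; stride <<= 1) {
		uint64_t total = 0;

		for (int r = 0; r < ROUNDS; r++)
			total += probe_pair(buf, buf + stride);
		printf("stride %8zu KiB: avg %llu cycles\n", stride >> 10,
		       (unsigned long long)(total / ROUNDS));
	}
	free(buf);
	return 0;
}

Noisy, slow, and architecture-specific, which is exactly why it's probably
not what we are looking for as an in-kernel detection mechanism.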