> Understood, but why is that?

Well, because customers requested it :)

What we are trying to do here is to have a parameter which says: when less
than x megabytes of memory are left, fail the allocation. This is basically
to prevent buggy applications, which keep allocating memory until they
receive an -ENOMEM, from running into the OOM killer.

> That's true, but with VRAM, TTM overcommits swap space which may lead
> to ugly memory allocation failures at hibernate time.

Yeah, that is exactly the reason why I said that Roger should disable the
limit during the suspend swap-out :)

Regards,
Christian.

On 07.02.2018 at 14:17, Thomas Hellstrom wrote:
> Hi, Roger.
>
> On 02/07/2018 09:25 AM, He, Roger wrote:
>>     Why should TTM be different in that aspect? It would be good to
>> know your reasoning WRT this?
>>
>> Now, in TTM, struct ttm_bo_device already has a member no_retry to
>> indicate your preference. If you prefer that TTM never triggers the
>> OOM killer, set it to true. The default is false to keep the original
>> behavior. AMD prefers a no-memory return value rather than OOM for now.
>
> Understood, but why is that? I mean, just because TTM doesn't invoke
> the OOM killer, that doesn't mean that the process won't, the next
> millisecond, page in a number of pages and invoke it anyway? So this
> mechanism would be pretty susceptible to races?
>
>>     One thing I looked at, at one point, was having TTM do the
>> swapping itself instead of handing it off to the shmem system. That
>> way we could pre-allocate swap entries for all swappable (BO) memory,
>> making sure that we wouldn't run out of swap space when,
>>
>> I prefer the current swap-out mechanism. Initially the swapped pages
>> stay in system memory, held by shmem, until the OS comes under high
>> memory pressure; that has an obvious advantage. For example, if the BO
>> is swapped out into shmem but not yet flushed to the swap disk,
>> validating and swapping it back in at that point has a small overhead
>> compared to swapping it in from disk.
>
> But that is true for a page handed off to the swap cache as well. It
> won't be immediately flushed to disk, only when the swap cache is shrunk.
>
>> In addition, no swap space reservation is needed for TTM pages at
>> allocation time, since the swap disk is shared and not exclusive to TTM.
>
> That's true, but with VRAM, TTM overcommits swap space which may lead
> to ugly memory allocation failures at hibernate time.
>
>> So again, we provide a flag no_retry in struct ttm_bo_device that the
>> driver can set according to its needs.
>
> Thanks,
> Thomas
>
>>
>> Thanks
>> Roger (Hongbo.He)
>>
>> -----Original Message-----
>> From: Thomas Hellstrom [mailto:thomas at shipmail.org]
>> Sent: Wednesday, February 07, 2018 2:43 PM
>> To: He, Roger <Hongbo.He at amd.com>; amd-gfx at lists.freedesktop.org;
>> dri-devel at lists.freedesktop.org
>> Cc: Koenig, Christian <Christian.Koenig at amd.com>
>> Subject: Re: [PATCH 0/5] prevent OOM triggered by TTM
>>
>> Hi, Roger,
>>
>> On 02/06/2018 10:04 AM, Roger He wrote:
>>> Currently the TTM code has no allocation limit, so it allows
>>> unlimited page allocation until OOM: once swap space is full of
>>> swapped pages, system memory gets filled up with TTM pages, and
>>> then any further memory allocation request will trigger the OOM killer.
>>>
>> I'm a bit curious, isn't this the way things are supposed to work on
>> a Linux system?
>> If all memory resources are used up, the OOM killer will kill the
>> most memory-hungry (perhaps rogue) process, rather than processes
>> being nice and trying to find out themselves whether their allocations
>> will succeed?
>> Why should TTM be different in that aspect? It would be good to know
>> your reasoning WRT this?
>>
>> Admittedly, OOM memory accounting for graphics processes doesn't work
>> very well, due to not all BOs being CPU-mapped, but it looks like
>> there is recent work towards fixing this?
>>
>> One thing I looked at, at one point, was having TTM do the swapping
>> itself instead of handing it off to the shmem system. That way we
>> could pre-allocate swap entries for all swappable (BO) memory, making
>> sure that we wouldn't run out of swap space when, for example,
>> hibernating, and that would also limit the pinned non-swappable memory
>> (from TTM driver kernel allocations, for example) to half the system
>> memory resources.
>>
>> Thanks,
>> Thomas
>>
>
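[Editor's note] For readers following the thread, a minimal, driver-agnostic
sketch of the two mechanisms under discussion may help: the no_retry style of
allocation, which gives up and lets the caller return -ENOMEM instead of
letting the page allocator invoke the OOM killer, and the "fail when fewer
than x megabytes are left" parameter Christian describes. This is an
illustrative sketch only, not the code from the patch series: the min_free_mb
parameter and the example_* helpers are made up, while si_mem_available(),
__GFP_RETRY_MAYFAIL and the no_retry member of struct ttm_bo_device are real
kernel facilities.

    #include <linux/mm.h>
    #include <linux/gfp.h>
    #include <linux/module.h>

    /* Hypothetical tunable: the free-memory floor in MiB. */
    static unsigned int min_free_mb = 64;
    module_param(min_free_mb, uint, 0644);
    MODULE_PARM_DESC(min_free_mb,
                     "Fail BO page allocations when fewer MiB than this are estimated to be available");

    /* Return true if estimated available memory is below the floor. */
    static bool example_below_mem_floor(void)
    {
            unsigned long avail_mb = si_mem_available() >> (20 - PAGE_SHIFT);

            return avail_mb < min_free_mb;
    }

    /*
     * Allocation helper in the spirit of the no_retry flag: with no_retry
     * set, honour the memory floor and use __GFP_RETRY_MAYFAIL so the
     * allocator gives up instead of invoking the OOM killer.  Without it,
     * the historical behaviour (retry until OOM) is kept.
     */
    static struct page *example_alloc_bo_page(bool no_retry)
    {
            gfp_t gfp = GFP_HIGHUSER;

            if (no_retry) {
                    if (example_below_mem_floor())
                            return NULL;  /* caller translates this to -ENOMEM */
                    gfp |= __GFP_RETRY_MAYFAIL | __GFP_NOWARN;
            }

            return alloc_page(gfp);
    }

A driver that prefers the original behaviour simply leaves no_retry false, so
allocations keep the default retry semantics and the kernel's OOM killer
remains the last resort; as discussed above, any such limit would also need
to be lifted during the suspend/hibernate swap-out path.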