On Fri 05-08-22 20:34:32, Baoquan He wrote: > On 08/04/22 at 02:01pm, Michal Hocko wrote: > ...snip... > > > > > > > > Thinking about it. We should get a warning when the actual allocation > > > > > > > > from the pool fails no? That would be more useful information than the > > > > > > > > pre-allocation failure when it is not really clear whether anybody is > > > > > > > > ever going to consume it. > > > > > > > > > > > > > > Hi Michal, > > > > > > > > > > > > > > You haven't told on which ARCH you met this issue, is it x86_64? > > > > > > > > > > > > yes x86_64, so a small 16MB DMA zone. > > > > > > > > > > Yeah, the 16M DMA zone is redicilous and exists only for hardly seen > > > > > ISA-style devices support. Haven't prepared the log well. > > > > > > > > Agreed on that! I would essentially suggest to completely ignore pool > > > > pre-allocation failures for the small DMA zone. There is barely anything > > > > to be ever consuming it. > > > > > > I would personally suggest to keep it. W/o that, we even don't know the > > > issue we are talking about now. I see below commit as a workaround, and > > > have been trying to fix it finally with a better solution. > > > > > > commit c4dc63f0032c ("mm/page_alloc.c: do not warn allocation failure on zone DMA if no managed pages") > > > > This will not help in any case but an empty DMA zone. As you can see > > this is not the case in my example. > > Yeah, it's different case. > > > > > > After attempts, I realize it's time to let one zone DMA or DMA32 cover > > > the whole low 4G memory on x86_64. That's the real fix. The tiny 16M DMA > > > on 64bit system is root cause. > > > > Yes, I would agree with that. This means DMA zone is gone completely. > > > > [...] > > > > This also mostly papers over this particular problem by allocating > > > > allocating two pools for the same range. > > > > > > No, it doesn't paper over anything, and isn't a hack. Zone dma now covers > > > low 4G, just zone DMA32 is empty. Any allocation request with GFP_DMA will > > > be satisfied, while request with GFP_DMA32 will fall back to zone DMA. > > > > This is breaking expectation of the DMA zone on x86. This is not > > acceptable. You really want to do it other way around. Only have DMA32 > > zone and cover the whole 32b phys address range. DMA zone would be empty > > which is something we can have already. This would make the crap HW fail > > on allocations. > > OK, I will consider it in the direction that only DMA32 zone cover the > low 4G as you suggested. Do you have pointer about them? Really > appreciate it. Not sure I understand what you are asking. [...] > > the expected behavior should be to fail the allocation. > > Could you tell why we should fail the allocation? If we cannot guarantee DMA (low 16MB) then the allocation should fail otherwise any HW that really requires that address range could fail in unpredictable way. Sure you can try harder and check whether DMA32 zone has any pages from that range but I am not really sure this is worth it. > With my understanding, whether it is allocation request with GFP_DMA > or GFP_DMA32, it's requesting memory for DMA buffer, or requesting > memory under 32bit. If we only have zone DMA32 covering low 4G, and > people requests meomry with GFP_DMA, we should try to satisfy it with > memory from zone DMA32. Because people may not know the internal detail > about memory from zone DMA or DMA32, and they just want to get a buffer > for DMA transfering. GFP_DMA is a misnomer. It doesn't really say that the allocation should be done for actual DMA. GFP_DMA really says allocate from ZONE_DMA. It is my understanding that all actual DMA users should use a dedicated dma allocation APIs which should do the right thing wrt. address constrains. > With your saying, if there's only DMA32 existing and cover the whole low > 4G, any allocation request with GFP_DMA need be failed, we need search > and change all those places where GFP_DMA is set. Yes a mass review of GFP_DMA usage is certainly due and it would have to preceed any other changes. This is likely the reason why no unification has happened yet. > > > See gfp_zone(), it will return ZONE_NORMAL to user even though > > > user expects to get memory for DMA handling? > > > > It's been quite some time since I've had gfp_zone cached in my brain. I > > suspect it would require some special casing for GFP_DMA to always fail. > > I would have to study the code again. We have discussed this at some > > LSFMM few years back. Maybe you can find something at lwn. > > > > In any case. This is a larger change. Do you agree that the warning is > > pointless and __GFP_NOWARN is a very simple way to deal with it until we > > sort out situation of the DMA zone on x86 which is a long standing > > problem? > > I understand the warning will be annoying and customer will complain > about it. We can mute it for the time being, and may open it later > when we get rid of the 16M DMA zone. > > So, yes, I agree that there's nothing we can do currently except of > muting it. I can add my tag when you update and post. Thanks. I will wait for Christoph and his comments and post a full patch next week. -- Michal Hocko SUSE Labs