On Tue 28-11-23 10:07:08, Pingfan Liu wrote:
> On Sun, Nov 26, 2023 at 5:24 AM Jiri Bohac <jbohac@xxxxxxx> wrote:
> >
> > Hi Tao,
> >
> > On Sat, Nov 25, 2023 at 09:51:54AM +0800, Tao Liu wrote:
> > > Thanks for the idea of using CMA as part of memory for the 2nd kernel.
> > > However I have a question:
> > >
> > > What if there is on-going DMA/RDMA access on the CMA range when 1st
> > > kernel crash? There might be data corruption when 2nd kernel and
> > > DMA/RDMA write to the same place, how to address such an issue?
> >
> > The crash kernel CMA area(s) registered via
> > cma_declare_contiguous() are distinct from the
> > dma_contiguous_default_area or device-specific CMA areas that
> > dma_alloc_contiguous() would use to reserve memory for DMA.
> >
> > Kernel pages will not be allocated from the crash kernel CMA
> > area(s), because they are not GFP_MOVABLE. The CMA area will only
> > be used for user pages.
> >
> > User pages for RDMA, should be pinned with FOLL_LONGTERM and that
> > would migrate them away from the CMA area.
> >
> > But you're right that DMA to user pages pinned without
> > FOLL_LONGTERM would still be possible. Would this be a problem in
> > practice? Do you see any way around it?
> >
>
> I have not a real case in mind. But this problem has kept us from
> using the CMA area in kdump for years. Most importantly, this method
> will introduce an uneasy tracking bug.

Long term pinning is something that has changed the picture IMHO. The
API had been brewing for a long time, but it is now established and its
usage is spreading. Is it possible that some driver could be doing remote
DMA without the long term pinning? Quite possible, but this means such a
driver should be fixed rather than preventing CMA use for this use case
TBH.

> For a way around, maybe you can introduce a specific zone, and for any
> GUP, migrate the pages away. I have doubts about whether this approach
> is worthwhile, considering the trade-off between benefits and
> complexity.

No, a zone is definitely not an answer to that, because userspace would
need to be able to use that memory, and userspace might pin memory for
direct IO and other purposes. So in the end long term pinning would need
to be used anyway.
-- 
Michal Hocko
SUSE Labs

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec