On Thu, Dec 21, 2023 at 1:53 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Wed, 20 Dec 2023 13:28:39 +0800 "jiajun.xie" <jiajun.xie.sh@xxxxxxxxx> wrote: > > > From: Jiajun Xie <jiajun.xie.sh@xxxxxxxxx> > > > > The bug happens when highest bit of holebegin is 1, suppose > > holebign is 0x8000000111111000, after shift, hba would be > > 0xfff8000000111111, then vma_interval_tree_foreach would look > > it up fail or leads to the wrong result. > > > > error call seq e.g.: > > - mmap(..., offset=0x8000000111111000) > > |- syscall(mmap, ... unsigned long, off): > > |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT); > > > > here pgoff is correctly shifted to 0x8000000111111, > > but pass 0x8000000111111000 as holebegin to unmap > > would then cause terrible result, as shown below: > > > > - unmap_mapping_range(..., loff_t const holebegin) > > |- pgoff_t hba = holebegin >> PAGE_SHIFT; > > /* hba = 0xfff8000000111111 unexpectedly */ > > > > turn holebegin to be unsigned first would fix the bug. > > > > Thanks. Are you able to describe the runtime effects of this > (obviously bad, but it's good to spell it out) and under what > circumstances it occurs? Thanks for the quick reply. The issue happens in Heterogeneous computing, where the device(e.g. gpu) and host share the same virtual address space. A simple workflow pattern which hit the issue is: /* host */ 1. userspace first mmap a file backed VA range with specified offset. e.g. (offset=0x800..., mmap return: va_a) 2. write some data to the corresponding sys page e.g. (va_a = 0xAABB) /* device */ 3. gpu workload touches VA, triggers gpu fault and notify the host. /* host */ 4. reviced gpu fault notification, then it will: 4.1 unmap host pages and also takes care of cpu tlb (use unmap_mapping_range with offset=0x800...) 4.2 migrate sys page to device 4.3 setup device page table and resolve device fault. /* device */ 5. gpu workload continued, it accessed va_a and got 0xAABB. 6. gpu workload continued, it wrote 0xBBCC to va_a. /* host */ 7. userspace access va_a, as expected, it will: 7.1 trigger cpu vm fault. 7.2 driver handling fault to migrate gpu local page to host. 8. userspace then could correctly get 0xBBCC from va_a 9. done But in step 4.1, if we hitted the bug this patch mentioned, then user space would never trigger cpu fault, and still get the old value: 0xAABB.