On 1/29/24 19:58, Fenghua Yu wrote: > wmb() is used to ensure status in the completion record is written > after the rest of the completion record, making it visible to the user. > However, on SMP systems, this may not guarantee visibility across > different CPUs. > > Considering this scenario that event log handler is running on CPU1 while > user app is polling completion record (cr) status on CPU2: > > CPU1 CPU2 > event log handler user app > > 1. cr = 0 (status = 0) > 2. copy X to user cr except "status" > 3. wmb() > 4. copy Y to user cr "status" > 5. poll status value Y > 6. read rest cr which is still 0. > cr handling fails > 7. cr value X visible now > > Although wmb() ensure value Y is written and visible after X is written > on CPU1, the order is not guaranteed on CPU2. So user app may see status > value Y while cr value X is still not visible yet on CPU2. This will > cause reading 0 from the rest of cr and cr handling fails. > > Changing wmb() to smp_wmb() ensures Y is written after X on both CPU1 > and CPU2. This guarantees that user app can consume cr in right order. > > Fixes: b022f59725f0 ("dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling") > Suggested-by: Nikhil Rao <nikhil.rao@xxxxxxxxx> > Tested-by: Tony Zhu <tony.zhu@xxxxxxxxx> > Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx> Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx> > --- > drivers/dma/idxd/cdev.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c > index 77f8885cf407..9b7388a23cbe 100644 > --- a/drivers/dma/idxd/cdev.c > +++ b/drivers/dma/idxd/cdev.c > @@ -681,9 +681,10 @@ int idxd_copy_cr(struct idxd_wq *wq, ioasid_t pasid, unsigned long addr, > * Ensure that the completion record's status field is written > * after the rest of the completion record has been written. > * This ensures that the user receives the correct completion > - * record information once polling for a non-zero status. > + * record information on any CPU once polling for a non-zero > + * status. > */ > - wmb(); > + smp_wmb(); > status = *(u8 *)cr; > if (put_user(status, (u8 __user *)addr)) > left += status_size;