On 27.01.2021 12:09, Paul Durrant wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@xxxxxxxx> >> Sent: 27 January 2021 10:57 >> To: Paul Durrant <paul@xxxxxxx> >> Cc: Paul Durrant <pdurrant@xxxxxxxxxx>; Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>; Roger Pau >> Monné <roger.pau@xxxxxxxxxx>; Jens Axboe <axboe@xxxxxxxxx>; Dongli Zhang <dongli.zhang@xxxxxxxxxx>; >> linux-kernel@xxxxxxxxxxxxxxx; linux-block@xxxxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx >> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings >> >> On 27.01.2021 11:30, Paul Durrant wrote: >>> From: Paul Durrant <pdurrant@xxxxxxxxxx> >>> >>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid >>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the >>> behaviour of xen-blkback when connecting to a frontend was: >>> >>> - read 'ring-page-order' >>> - if not present then expect a single page ring specified by 'ring-ref' >>> - else expect a ring specified by 'ring-refX' where X is between 0 and >>> 1 << ring-page-order >>> >>> This was correct behaviour, but was broken by the afforementioned commit to >>> become: >>> >>> - read 'ring-page-order' >>> - if not present then expect a single page ring >>> - expect a ring specified by 'ring-refX' where X is between 0 and >>> 1 << ring-page-order >>> - if that didn't work then see if there's a single page ring specified by >>> 'ring-ref' >>> >>> This incorrect behaviour works most of the time but fails when a frontend >>> that sets 'ring-page-order' is unloaded and replaced by one that does not >>> because, instead of reading 'ring-ref', xen-blkback will read the stale >>> 'ring-ref0' left around by the previous frontend will try to map the wrong >>> grant reference. >>> >>> This patch restores the original behaviour. >> >> Isn't this only the 2nd of a pair of fixes that's needed, the >> first being the drivers, upon being unloaded, to fully clean up >> after itself? Any stale key left may lead to confusion upon >> re-use of the containing directory. > > In a backend we shouldn't be relying on, nor really expect IMO, a frontend to clean up after itself. Any backend should know *exactly* what xenstore nodes it’s looking for from a frontend. But the backend can't know whether a node exists because the present frontend has written it, or because an earlier instance forgot to delete it. It can only honor what's there. (In fact the other day I was wondering whether some of the writes of boolean "false" nodes wouldn't better be xenbus_rm() instead.) Jan