On Tue, Jul 06, 2021 at 08:15:47PM +0100, Robin Murphy wrote:
> On 2021-07-06 18:50, Catalin Marinas wrote:
> > On Mon, Jun 28, 2021 at 05:22:30PM +0100, Robin Murphy wrote:
> > > @@ -62,6 +64,12 @@ EXPORT_SYMBOL(__arch_copy_to_user)
> > >          .section .fixup,"ax"
> > >          .align  2
> > > +9997:   cmp     dst, dstin
> > > +        b.ne    9998f
> > > +        // Before being absolutely sure we couldn't copy anything, try harder
> > > +        ldrb    tmp1w, [srcin]
> > > +USER(9998f, sttrb tmp1w, [dstin])
> > > +        add     dst, dstin, #1
> > >  9998:   sub     x0, end, dst                    // bytes not copied
> > >          ret
> > >          .previous
> >
> > I think it's worth doing the copy_to_user() fallback in a loop until it
> > faults or hits the end of the buffer. This would solve the problem we
> > currently have with writing more bytes than actually reported. The
> > copy_from_user() is not necessary, a byte would suffice.
>
> The thing is, we don't really have that problem since the set_fs cleanup
> removed IMP-DEF STP behaviour from the picture - even with the current mess
> we could perfectly well know which of the two STTRs faulted if we just put a
> little more effort in.

I think there are some corner cases: STTR across a page boundary,
faulting on the second page. The architecture allows some data to be
written (or not) in the first page, so we'd under-report if we use the
destination update. If we use the fault address it's even worse as we
may over-report in case the instruction did not write anything.

> But yuck... If you think the potential under-reporting is worth fixing right
> now, rather than just letting it disappear in a future rewrite, then I'd
> still rather do it by passing the actual fault address to the current
> copy_to_user fixup.

After some more digging in the ARM ARM, I don't think that's fixable by
using the actual fault address. B2.2.1 and D1.13.5 in version G.a
(thanks to Will for digging them out) mean that for an interrupted store
(exception, interrupt), any bytes stored by the instruction become
UNKNOWN. In practice, this means left unchanged or written. So I think a
byte-wise write loop is the only chance we have at a more precise
reporting, unless we change the loops to align the writes.

> A retry loop could still technically under-report if the
> page disappears (or tag changes) between faulting on the second word of a
> pair and retrying from the first, so we'd want to pin the initial fault down
> to a single access anyway. All the loop would achieve after that is
> potentially fill in an extra 1-7 bytes right up to the offending page/tag
> boundary for the sake of being nice, which I remain unconvinced is worth the
> bother :)

There is indeed the risk of a race but we can blame the user for
concurrently changing the permissions or tag. The kernel wouldn't
normally do this.

-- 
Catalin
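
To illustrate the byte-wise retry idea discussed above, here is a rough
user-space model in C. It is only a sketch under stated assumptions, not
the arm64 fixup code: fake_user_buf, put_byte(), put_word(),
copy_with_bytewise_fixup() and demo() are all hypothetical names, the
"fault" is simulated by a plain offset check, and the "partial" flag
stands in for the UNKNOWN outcome of an interrupted multi-byte store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for user memory: any store at or beyond
 * offset fault_at is treated as faulting. */
struct fake_user_buf {
	unsigned char *mem;
	size_t fault_at;
};

/* Models a single-byte unprivileged store: 0 on success, -1 on fault. */
static int put_byte(struct fake_user_buf *u, size_t off, unsigned char v)
{
	if (off >= u->fault_at)
		return -1;
	u->mem[off] = v;
	return 0;
}

/* Models an 8-byte store. If it crosses fault_at it faults, and the
 * bytes below the boundary are either written (partial != 0) or left
 * unchanged (partial == 0), mimicking the UNKNOWN result. */
static int put_word(struct fake_user_buf *u, size_t off,
		    const unsigned char *src, int partial)
{
	if (off + 8 <= u->fault_at) {
		memcpy(u->mem + off, src, 8);
		return 0;
	}
	if (partial && off < u->fault_at)
		memcpy(u->mem + off, src, u->fault_at - off);
	return -1;
}

/* Returns the number of bytes NOT copied, like copy_to_user(). */
static size_t copy_with_bytewise_fixup(struct fake_user_buf *u,
				       const unsigned char *src,
				       size_t len, int partial)
{
	size_t done = 0;

	assert(u != NULL && src != NULL);

	/* Fast path: 8 bytes at a time until a store faults. */
	while (len - done >= 8) {
		if (put_word(u, done, src + done, partial))
			break;
		done += 8;
	}

	/* Tail and fixup path: retry from the start of the failed store
	 * one byte at a time, until a fault or the end of the buffer. */
	while (done < len) {
		if (put_byte(u, done, src[done]))
			break;
		done++;
	}
	return len - done;
}

/* Small driver: copy len bytes into a buffer that faults at fault_at. */
static size_t demo(size_t fault_at, size_t len, int partial)
{
	unsigned char src[64], dst[64] = { 0 };
	struct fake_user_buf u = { dst, fault_at };

	assert(len <= sizeof(src));
	for (size_t i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i + 1);
	return copy_with_bytewise_fixup(&u, src, len, partial);
}
```

In this model demo(13, 32, 0) and demo(13, 32, 1) return the same
count, since the byte loop rewrites everything below the boundary
regardless of what the faulting 8-byte store left behind. The race
mentioned above (permissions or tags changing between the fault and the
retry) is deliberately not modelled.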