On Tue, 8 Sep 2020 14:40:10 +0200
Christophe Leroy <christophe.leroy@xxxxxxxxxx> wrote:

> On 08/09/2020 at 14:09, Christian Borntraeger wrote:
> >
> >
> > On 08.09.20 07:06, Christophe Leroy wrote:
> >>
> >>
> >> On 07/09/2020 at 20:00, Gerald Schaefer wrote:
> >>> From: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
> >>>
> >>> Commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast
> >>> code") introduced a subtle but severe bug on s390 with gup_fast, due to
> >>> dynamic page table folding.
> >>>
> >>> The question "What would it require for the generic code to work for s390"
> >>> has already been discussed here
> >>> https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
> >>> and ended with a promising approach here
> >>> https://lkml.kernel.org/r/20190419153307.4f2911b5@mschwideX1
> >>> which in the end unfortunately didn't quite work completely.
> >>>
> >>> We tried to mimic static level folding by changing pgd_offset to always
> >>> calculate top level page table offset, and do nothing in folded pXd_offset.
> >>> What has been overlooked is that PxD_SIZE/MASK and thus pXd_addr_end do
> >>> not reflect this dynamic behaviour, and still act like static 5-level
> >>> page tables.
> >>>
> >>
> >> [...]
> >>
> >>>
> >>> Fix this by introducing new pXd_addr_end_folded helpers, which take an
> >>> additional pXd entry value parameter, that can be used on s390
> >>> to determine the correct page table level and return corresponding
> >>> end / boundary. With that, the pointer iteration will always
> >>> happen in gup_pgd_range for s390. No change for other architectures
> >>> introduced.
> >>
> >> Not sure pXd_addr_end_folded() is the most understandable name, although I don't have any alternative suggestion at the moment.
> >> Maybe it could be something like pXd_addr_end_fixup() as it will disappear in the next patch, or pXd_addr_end_gup() ?
> >>
> >> Also, if it happens to be acceptable to get patch 2 in stable, I think you should switch patch 1 and patch 2 to avoid the step through pXd_addr_end_folded()
> >
> > given that this fixes a data corruption issue, wouldn't it be best to go forward
> > with this patch ASAP and then handle the other patches on top with all the time that
> > we need?
>
> I have no strong opinion on this, but I find it rather tricky to have to
> change the generic part of GUP to use a new function and then revert that change
> in the following patch, just because you want the first patch in stable
> and not the second one.
>
> Regardless, I was wondering, why do we need a reference to the pXd at
> all when calling pXd_addr_end() ?
>
> Couldn't S390 retrieve the pXd by using the pXd_offset() dance with the
> passed addr ?

Apart from the performance impact of redoing what the caller has already
done, I think we would also break the READ_ONCE semantics. After all,
pXd_offset() would also require some pXd pointer as input, which we
don't have. So we would need to start over again from mm->pgd.

Also, it seems to be more in line with other primitives that take
a pXd value or pointer.
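
To illustrate the point about passing the entry value, here is a rough userspace model, not the actual patch. All model_* names, the type encoding in the low bits and the shift values are assumptions made up for this sketch. It contrasts a static boundary helper, which always assumes the top level spans the full pgd size, with a folded variant that derives the boundary from the entry value the caller has already read. Because the value is passed in, the helper never has to re-read the page table.

```c
#include <stdint.h>

/* Assumed, illustrative shifts: bytes of address space covered per entry. */
#define MODEL_PUD_SHIFT 31
#define MODEL_P4D_SHIFT 42
#define MODEL_PGD_SHIFT 53

/* Hypothetical table type, encoded in the low two bits of an entry. */
enum model_type { MODEL_TYPE_PUD = 1, MODEL_TYPE_P4D = 2, MODEL_TYPE_PGD = 3 };

typedef struct { uint64_t val; } model_pgd_t;

/* Next (1 << shift)-aligned boundary after addr, clamped to end. */
static uint64_t model_boundary(uint64_t addr, uint64_t end, unsigned shift)
{
	uint64_t size = UINT64_C(1) << shift;
	uint64_t next = (addr + size) & ~(size - 1);

	/* The "- 1" keeps the comparison valid if next wraps to 0. */
	return (next - 1 < end - 1) ? next : end;
}

/* Static variant: always acts as if the top level were a full pgd. */
static uint64_t model_pgd_addr_end(uint64_t addr, uint64_t end)
{
	return model_boundary(addr, end, MODEL_PGD_SHIFT);
}

/* Folded variant: the boundary depends on the entry value passed in. */
static uint64_t model_pgd_addr_end_folded(model_pgd_t pgd, uint64_t addr,
					  uint64_t end)
{
	switch (pgd.val & 3) {
	case MODEL_TYPE_PUD:
		return model_boundary(addr, end, MODEL_PUD_SHIFT);
	case MODEL_TYPE_P4D:
		return model_boundary(addr, end, MODEL_P4D_SHIFT);
	default:
		return model_boundary(addr, end, MODEL_PGD_SHIFT);
	}
}

/*
 * Count top-level loop iterations needed to cover [addr, end).
 * Simplification: the same entry value is reused for every step, whereas
 * a real walker would read a fresh entry on each iteration.
 */
static unsigned model_count_top_level_steps(model_pgd_t pgd, uint64_t addr,
					    uint64_t end, int folded)
{
	unsigned steps = 0;
	uint64_t next;

	do {
		next = folded ? model_pgd_addr_end_folded(pgd, addr, end)
			      : model_pgd_addr_end(addr, end);
		steps++;
	} while (addr = next, addr != end);

	return steps;
}
```

In this model, a 4 GiB range under a top-level table of the pud type is covered in a single top-level step by the static helper, so any further iteration would have to happen in the folded lower levels. The folded helper splits the same range into two steps at the top level, which is the behaviour the patch description asks for.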