On Tue, Aug 18, 2020 at 07:16:46PM +0200, Greg KH wrote:
> On Tue, Aug 18, 2020 at 11:49:29AM -0400, Mike Marshall wrote:
> > upstream commit id: ec95f1dedc9c64ac5a8b0bdb7c276936c70fdedd
> >
> > I verified that ec95f1de "orangefs: get rid of knob code..."
> > will apply to 5.4 and I compiled and ran a patched 5.4 kernel
> > against my normal xfstests... I wish that ec95f1de could be
> > in the 5.4 long term stable kernel.
> >
> > ec95f1de went upstream in 5.7. When I sent up the patch it was
> > just a theoretical race condition to me: I accepted what Christoph
> > said about it. We now have experienced in-the-real-world how
> > important the patch is...
> >
> > Someone was trying to read a whole large (more than 100 meg)
> > file from orangefs into some kind of cloud bucket. The
> > resulting read failed with a "Bad address" error. I
> > immediately thought of this patch. I reproduced the
> > "Bad address" error with dd in kernel versions that
> > lack ec95f1de. The "Bad address" error does not occur
> > in kernels that include ec95f1de:
> >
> > 5.7.11-100.fc31.x86_64:
> >
> > $ ./wr.sh 10000000 > /pvfsmnt/wr.10000000
> > $ dd if=/pvfsmnt/wr.10000000 of=/tmp/wr.10000000 count=10 bs=419430400
> > $ ls -l /pvfsmnt/wr.10000000 /tmp/wr.10000000
> > -rw-rw-r--. 1 hubcap hubcap 498888897 Aug 14 15:41 /pvfsmnt/wr.10000000
> > -rw-rw-r--. 1 hubcap hubcap 498888897 Aug 14 16:51 /tmp/wr.10000000
> > $ md5sum /pvfsmnt/wr.10000000 /tmp/wr.10000000
> > 669daa04f91f561f5fb2851fb30e4ffe /pvfsmnt/wr.10000000
> > 669daa04f91f561f5fb2851fb30e4ffe /tmp/wr.10000000
> >
> > 5.6.0hubcap:
> >
> > $ ./wr.sh 10000000 > /pvfsmnt/wr.10000000
> > $ dd if=/pvfsmnt/wr.10000000 of=/tmp/wr.10000000 count=10 bs=419430400
> > dd: error reading '/pvfsmnt/wr.10000000': Bad address
> > 0+0 records in
> > 0+0 records out
> > 0 bytes copied, 10.3365 s, 0.0 kB/s
>
> Sounds reasonable, I'll queue this up after this next round of releases
> in the next few days, thanks!

Now queued up, thanks.

greg k-h