OK, I'll change the mount options and report back in a few days on that
topic. This is fairly reproducible, so I would expect to see the effect
if it works.

On Fri, Apr 10, 2020 at 6:32 PM Tony Lill <ajlill@xxxxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 4/10/20 3:13 AM, Jesper Krogh wrote:
> > Hi. What is the suggested change? - is it Ceph that has an rsize,wsize
> > of 64MB ?
> >
>
> Sorry, set rsize and wsize in the mount options for your cephfs to
> something smaller.
>
> My problem with this is that I use autofs to mount my filesystem.
> Starting with 4.14.82, after a few mount/unmount cycles, the mount would
> fail with order 4 allocation error, and I'd have to reboot.
>
> I traced it to a change that doubled CEPH_MSG_MAX_DATA_LEN from 16M to
> 32M. Later in the 5 series kernels, this was doubled again, and that
> caused an order 5 allocation failure. This define is used to set the max
> and default rsize and wsize.
>
> Reducing the rsize and wsize in the mount option fixed the problem for
> me. This may do nothing for you, but, if it clears your allocation issue...
>
>
> > On Fri, Apr 10, 2020 at 12:47 AM Tony Lill <ajlill@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 4/9/20 12:30 PM, Jeff Layton wrote:
> >>> On Thu, 2020-04-09 at 18:00 +0200, Jesper Krogh wrote:
> >>>> Thanks Jeff - I'll try that.
> >>>>
> >>>> I would just add to the case that this is a problem we have had on a
> >>>> physical machine - but too many "other" workloads at the same time -
> >>>> so we isolated it off to a VM - assuming that it was the mixed
> >>>> workload situation that did cause us issues. I cannot be sure that it
> >>>> is "exactly" the same problem we're seeing but symptoms are
> >>>> identical.
> >>>>
> >>>
> >>> Do you see the "page allocation failure" warnings on bare metal hosts
> >>> too? If so, then maybe we're dealing with a problem that isn't
> >>> virtio_net specific. In any case, let's get some folks more familiar
> >>> with that area involved first and take it from there.
> >>>
> >>> Feel free to cc me on the bug report too.
> >>>
> >>> Thanks,
> >>>
> >>
> >> In 5.4.20, the default rsize and wsize is 64M. This has caused me page
> >> allocation failures in a different context. Try setting it to something
> >> sensible.
> >> --
> >> Tony Lill, OCT,                     ajlill@xxxxxxxxxxxxxxxxxxx
> >> President, A. J. Lill Consultants               (519) 650 0660
> >> 539 Grand Valley Dr., Cambridge, Ont. N3H 2S2   (519) 241 2461
> >> -------------- http://www.ajlc.waterloo.on.ca/ ---------------
> >>
> >>
> >>
>
> --
> Tony Lill, OCT,                     ajlill@xxxxxxxxxxxxxxxxxxx
> President, A. J. Lill Consultants               (519) 650 0660
> 539 Grand Valley Dr., Cambridge, Ont. N3H 2S2   (519) 241 2461
> -------------- http://www.ajlc.waterloo.on.ca/ ---------------
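
For reference, a rough sketch of how the 16M/32M/64M sizes mentioned in
the thread could line up with the order 4 and order 5 allocation
failures. It rests on assumptions that are not stated anywhere in the
thread: 4 KiB pages, 8-byte pointers, and a single contiguous array
holding one page pointer per page of a maximum-size message. The
function name page_vector_order is hypothetical, and this is only an
illustration of the arithmetic, not a description of what the kernel
client actually does.

```python
PAGE_SIZE = 4096     # assumption: 4 KiB pages
POINTER_SIZE = 8     # assumption: 64-bit kernel, 8-byte page pointers
MiB = 1024 * 1024


def page_vector_order(max_data_len: int) -> int:
    """Allocation order for an array with one pointer per data page.

    An order-n allocation is 2**n physically contiguous pages.
    """
    # Number of pages needed to hold max_data_len bytes (rounded up).
    n_pages = -(-max_data_len // PAGE_SIZE)
    # Size of the pointer array itself, in bytes and then in pages.
    vector_bytes = n_pages * POINTER_SIZE
    vector_pages = -(-vector_bytes // PAGE_SIZE)
    # Smallest order whose 2**order pages cover the array.
    return max(0, (vector_pages - 1).bit_length())


if __name__ == "__main__":
    for size in (16 * MiB, 32 * MiB, 64 * MiB):
        print(f"{size // MiB} MiB -> order {page_vector_order(size)}")
```

Under those assumptions, 32M works out to order 4 and 64M to order 5,
which matches the failures described above, and each halving of the
size drops the order by one.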