On Thu, Apr 02, 2015 at 02:40:19AM +0300, Ilya Dryomov wrote:
> On Thu, Apr 2, 2015 at 2:03 AM, Mel Gorman <mgorman@xxxxxxx> wrote:
> > On Wed, Apr 01, 2015 at 08:19:20PM +0300, Ilya Dryomov wrote:
> >> Following nbd and iscsi, commit 89baaa570ab0 ("libceph: use memalloc
> >> flags for net IO") set SOCK_MEMALLOC and PF_MEMALLOC flags for rbd and
> >> cephfs.  However, it turned out not to play nicely with the loopback
> >> scenario, leading to lockups with a full socket send-q and an empty
> >> recv-q.
> >>
> >> While we always advised against colocating the kernel client and ceph
> >> servers on the same box, a few people are doing it and it's also useful
> >> for light development testing, so rather than reverting, make sure not
> >> to set those flags in the loopback case.
> >>
> >
> > This does not clarify why the non-loopback case needs access to
> > pfmemalloc reserves.  Granted, I've spent zero time on this, but it's
> > really unclear what problem was originally being solved and why dirty
> > page limiting was insufficient.  Swap over NFS was always a very
> > special case, not least because it's immune to dirty page throttling.
>
> I don't think there was any particular problem being solved,

Then please go back and look at why dirty page limiting is insufficient
for ceph.

> certainly not one we hit and fixed with 89baaa570ab0.  Mike is out this
> week, but I'm pretty sure he said he copied this for iscsi from nbd
> because you nudged him to (and you yourself did this for nbd as part of
> the swap-over-NFS series).

In http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/23708 I
stated that if ceph insisted on using nbd as justification for ceph
using __GFP_MEMALLOC, then it was preferred that nbd be broken instead.

In commit 7f338fe4540b1d0600b02314c7d885fd358e9eca, the use case in mind
was the swap-over-nbd case, and I regret I didn't have userspace
explicitly tell the kernel that NBD was being used as a swap device.
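
For reference, below is a minimal sketch of the "don't set the memalloc
flags for loopback connections" idea discussed above, written against the
libceph messenger.  The helpers con_peer_is_loopback() and
con_maybe_set_memalloc() are hypothetical and invented here for
illustration; the struct ceph_connection / peer_addr field names follow
libceph conventions, but this is not the actual patch from this thread.

#include <linux/in.h>			/* ipv4_is_loopback() */
#include <net/ipv6.h>			/* ipv6_addr_loopback() */
#include <net/sock.h>			/* sk_set_memalloc() */
#include <linux/ceph/messenger.h>	/* struct ceph_connection */

/*
 * Hypothetical helper: is the peer of this connection a loopback
 * address?  peer_addr.in_addr is a struct sockaddr_storage in libceph.
 */
static bool con_peer_is_loopback(struct ceph_connection *con)
{
	struct sockaddr_storage *ss = &con->peer_addr.in_addr;

	switch (ss->ss_family) {
	case AF_INET:
		return ipv4_is_loopback(
			((struct sockaddr_in *)ss)->sin_addr.s_addr);
	case AF_INET6:
		return ipv6_addr_loopback(
			&((struct sockaddr_in6 *)ss)->sin6_addr);
	default:
		return false;
	}
}

/*
 * Hypothetical helper, called once the socket is connected (i.e. from
 * something like ceph_tcp_connect()).  SOCK_MEMALLOC lets the socket
 * dip into the pfmemalloc reserves under memory pressure; skipping it
 * for loopback avoids a colocated ceph server draining the same
 * reserves and deadlocking against the client.
 */
static void con_maybe_set_memalloc(struct ceph_connection *con,
				   struct socket *sock)
{
	if (!con_peer_is_loopback(con))
		sk_set_memalloc(sock->sk);
}

The PF_MEMALLOC half of 89baaa570ab0 (set around the send/recv paths)
would presumably need the same loopback gating for this approach to be
complete.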