On Thu, Apr 02, 2015 at 02:40:19AM +0300, Ilya Dryomov wrote:
> On Thu, Apr 2, 2015 at 2:03 AM, Mel Gorman <mgorman@xxxxxxx> wrote:
> > On Wed, Apr 01, 2015 at 08:19:20PM +0300, Ilya Dryomov wrote:
> >> Following nbd and iscsi, commit 89baaa570ab0 ("libceph: use memalloc
> >> flags for net IO") set SOCK_MEMALLOC and PF_MEMALLOC flags for rbd and
> >> cephfs.  However, it turned out not to play nicely with the loopback
> >> scenario, leading to lockups with a full socket send-q and an empty
> >> recv-q.
> >>
> >> While we always advised against colocating the kernel client and ceph
> >> servers on the same box, a few people are doing it and it's also useful
> >> for light development testing, so rather than reverting, make sure not
> >> to set those flags in the loopback case.
> >>
> >
> > This does not clarify why the non-loopback case needs access to
> > pfmemalloc reserves.  Granted, I've spent zero time on this, but it's
> > really unclear what problem was originally being solved and why dirty
> > page limiting was insufficient.  Swap over NFS was always a very
> > special case, not least because it's immune to dirty page throttling.
>
> I don't think there was any particular problem being solved,

Then please go back and look at why dirty page limiting is insufficient
for ceph.

> certainly not one we hit and fixed with 89baaa570ab0.  Mike is out this
> week, but I'm pretty sure he said he copied this for iscsi from nbd
> because you nudged him to (and you yourself did this for nbd as part of
> the swap-over-NFS series).

In http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/23708 I
stated that if ceph insisted on using nbd as justification for ceph
using __GFP_MEMALLOC, then it was preferred that nbd be broken instead.

In commit 7f338fe4540b1d0600b02314c7d885fd358e9eca, the use case in mind
was the swap-over-nbd case, and I regret I didn't have userspace
explicitly tell the kernel that NBD was being used as a swap device.
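
For reference, below is a minimal sketch of the "don't set the memalloc
flags for loopback connections" idea discussed above, written against the
libceph messenger.  The helpers con_peer_is_loopback() and
con_maybe_set_memalloc() are hypothetical and invented here for
illustration; the struct ceph_connection / peer_addr field names follow
libceph conventions, but this is not the actual patch from this thread.

#include <linux/in.h>			/* ipv4_is_loopback() */
#include <net/ipv6.h>			/* ipv6_addr_loopback() */
#include <net/sock.h>			/* sk_set_memalloc() */
#include <linux/ceph/messenger.h>	/* struct ceph_connection */

/*
 * Hypothetical helper: is the peer of this connection a loopback
 * address?  peer_addr.in_addr is a struct sockaddr_storage in libceph.
 */
static bool con_peer_is_loopback(struct ceph_connection *con)
{
	struct sockaddr_storage *ss = &con->peer_addr.in_addr;

	switch (ss->ss_family) {
	case AF_INET:
		return ipv4_is_loopback(
			((struct sockaddr_in *)ss)->sin_addr.s_addr);
	case AF_INET6:
		return ipv6_addr_loopback(
			&((struct sockaddr_in6 *)ss)->sin6_addr);
	default:
		return false;
	}
}

/*
 * Hypothetical helper, called once the socket is connected (i.e. from
 * something like ceph_tcp_connect()).  SOCK_MEMALLOC lets the socket
 * dip into the pfmemalloc reserves under memory pressure; skipping it
 * for loopback avoids a colocated ceph server draining the same
 * reserves and deadlocking against the client.
 */
static void con_maybe_set_memalloc(struct ceph_connection *con,
				   struct socket *sock)
{
	if (!con_peer_is_loopback(con))
		sk_set_memalloc(sock->sk);
}

The PF_MEMALLOC half of 89baaa570ab0 (set around the send/recv paths)
would presumably need the same loopback gating for this approach to be
complete.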