Re: XFS: possible memory allocation deadlock in kmem_alloc on glusterfs setup

> On Dec 4, 2016, at 5:22 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
> On Sun, Dec 04, 2016 at 05:14:51PM -0800, Cyril Peponnet wrote:
>> 
>>> On Dec 4, 2016, at 3:50 PM, Dave Chinner <david@xxxxxxxxxxxxx>
>>> wrote:
>>> 
>>> On Sun, Dec 04, 2016 at 03:24:50PM -0800, Cyril Peponnet wrote:
>>>>> On Dec 4, 2016, at 2:46 PM, Dave Chinner <david@xxxxxxxxxxxxx>
>>>>> Which used LVM snapshots to take snapshots of the entire
>>>>> brick.  I don't see any LVM in your config, so I'm not sure
>>>>> what snapshot implementation you are using here. What are you
>>>>> using to take the snapshots of your VM image files? Are you
>>>>> actually using the qemu qcow2 snapshot functionality rather
>>>>> than anything native to gluster?
>>>>> 
>>>> 
>>>> Yes, sorry, it was not clear enough: qemu-img snapshots, not
>>>> native snapshots.
>>> 
>>> Ok, so that's a fragmentation problem in its own right: both
>>> internal qcow2 fragmentation and file fragmentation.
>>> 
>>>>> Also, can you attach the 'xfs_bmap -vp' output of some of
>>>>> these image files and their snapshots?
>>>> 
>>>> A snapshot:
>>>> https://gist.github.com/CyrilPeponnet/8108c74b9e8fd1d9edbf239b2872378d
>>>> (let me know if you need more; basically there are around 600
>>>> live snapshots sitting here).
>>> 
>>> 1200 extents, mostly small, almost entirely adjacent. Typical
>>> qcow2 file fragmentation pattern. That's not going to cause your
>>> memory allocation problems - can you find one that has hundreds
>>> of thousands of extents?
>> 
>> I found one with 10799109 extents :/ It is 576GB in size (I need
>> to find out why this one is so big; this is not normal…). Could it
>> lead to the issue?
> 
> The memory allocation issue, yes. 10 million extents is
> unusually high even for VM image files…

I will dig into that one. Its size is also unusual…
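
To put numbers on that file, here is a rough sketch for counting extents from saved `xfs_bmap -v` output and estimating what 10799109 extents on a 576GB file imply. The helper names (`count_extents`, `estimate`) and the sample text are made up for illustration. The 16 bytes per extent record is an assumption based on the size of an XFS bmbt extent record; real in-memory overhead depends on the kernel version. The file size is also assumed to be 576 GiB.

```python
import re

# Matches an extent row of `xfs_bmap -v` output, for example:
#    0: [0..7]:          96..103           0 (96..103)            8 00000
# The filename line and the column header line do not match.
EXTENT_ROW = re.compile(r"^\s*\d+:\s+\[(\d+)\.\.(\d+)\]:\s+(\S+)")


def count_extents(bmap_text):
    """Return (extents, holes) counted from `xfs_bmap -v` text."""
    extents = holes = 0
    for line in bmap_text.splitlines():
        match = EXTENT_ROW.match(line)
        if match is None:
            continue  # filename line, column header, or blank line
        if match.group(3) == "hole":
            holes += 1
        else:
            extents += 1
    return extents, holes


def estimate(extents, file_bytes, bytes_per_record=16):
    """Rough figures only; bytes_per_record=16 is an assumption."""
    return {
        "record_bytes": extents * bytes_per_record,
        "avg_extent_bytes": file_bytes / extents,
    }


# Hypothetical sample, shaped like `xfs_bmap -v` output.
SAMPLE = """\
image.qcow2:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          96..103           0 (96..103)            8 00000
   1: [8..15]:         hole                                     8
   2: [16..23]:        200..207          0 (200..207)           8 00000
"""

if __name__ == "__main__":
    print(count_extents(SAMPLE))
    figures = estimate(10799109, 576 * 2**30)
    print("extent records: %.0f MiB" % (figures["record_bytes"] / 2**20))
    print("average extent: %.1f KiB" % (figures["avg_extent_bytes"] / 2**10))
```

Under those assumptions the extent records alone come to roughly 165 MiB for this one file, and the average extent is only a few tens of KiB.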

> 
>> I mean could one file cause the deadlock of the entire
>> FS?
> 
> What deadlock is that? XFS is reporting memory allocation issues,
> not that there is a filesystem deadlock. Your comments that dropping
> caches makes the problem go away indicate that there isn't any
> deadlock, just blocking on memory allocation that is taking a long
> time to resolve…

You are right. By deadlock I meant that the mount point hangs, for instance for ls commands (run from the server itself), and the glusterfs mount on the hypervisors also hangs for all VMs (which makes the VMs remount read-only).

On this server I have 1200 live snapshots, usually without much write activity except for some of them…

Is there a way to optimize the memory allocation? Or do we likely need to add more RAM?

Thanks for your time, Dave, I appreciate it.

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx