On Thu, Jan 05, 2017 at 10:12:25PM +0100, Zdenek Kabelac wrote:
> On 5.1.2017 at 20:29, Eric Sandeen wrote:
> >On 1/5/17 1:13 PM, Zdenek Kabelac wrote:
> >>>Anyway, at this point I'm not convinced that anything but the filesystem
> >>>should be making decisions based on storage error conditions.
> >>
> >>So far I'm not convinced doing nothing is better then trying at least unmount.
> >>
> >>Since doing nothing is known to cause SEVERE filesystem damages,
> >>while I've haven't heard about them when 'unmount' is in the field.
> >
> >I'm pretty sure that's exactly what started this thread. ;)
> >
> >Failing IOs should never cause "severe filesystem damage" - that is what
> >a journaling filesystem is /for/. Can you explain further?
>
> well all I know are user reports - which we capable to use 'XFS'
> with exhausted thin-pool while having 'snapshots' of their volumes.
>
> Since there was no 'umount' and XFS upon write error just retried
> endlessly to write block over and over - system appeared

Which has already been fixed upstream.

And my 2c worth on the "lvm unmounting filesystems on error" - stop
it, now. It's the wrong thing to do, and it makes it impossible for
filesystems to handle the error and recover gracefully when
possible.

> to the users nice & usable for quite long time (especially when
> boxes had 32G of RAM or more...)
>
> Maybe writes passed to 'uniquely' owned blocs....
>
> Then after some day,two,free OOM finally killed.
> Users realized thin-pool was out-of-space - added room to VG and pool
> and tried xfs_repair - but whole FS was largely lost.

That sounds very much like a block device snapshot corruption
problem, not a filesystem problem. As always, the filesystem gets
blamed for data loss, regardless of where the problem really lies.

> Use LV and make some thin snapshots.
>
> Then change various parts of origin - at various moment before pool
> is out-of-space
>
> So you will get lots of different scenarios of missing data.
>
> You will mostly not get into those mentioned trouble if you
> have just single thinLV and you exhaust thin-pool while using it.
>
> Games with snapshot are needed.

This really sounds like a problem with snapshot ENOSPC error
handling, not a filesystem issue - the filesystem is simply the
messenger here...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel