On Sat, Aug 04 2018 at 3:37pm -0400,
Theodore Y. Ts'o <tytso@xxxxxxx> wrote:

> On Sat, Aug 04, 2018 at 02:18:47PM -0400, Mike Snitzer wrote:
> > > Fair enough. I don't think I would consider that makes dm-snapshot a
> > > "steaming pile". For me, protection against data loss is Job One.
> >
> > What's your point Ted? Do you have _any_ intention of actually using
> > anything DM or is this just a way for you to continue to snipe at it?
>
> My point is that putting down dm-snapshot by calling it a "steaming
> pile" because it can't perform well on workloads that weren't a
> requirement when it was first designed is neither accurate nor fair.

As a person who has written a fair amount of dm-snapshot code I'm free
to have my opinion. It is slow. Period. If it works for you, great.
But it isn't adequate for most modern use cases I'm aware of.

> And steering users away from it by badmouthing to a technology which
> ever so often, requires enterprise support to recover, is something
> that *I* at least would classify as "marginal".

dm-snapshot is slow, as such I will badmouth it because dm-thinp is a
much more capable replacement. I have to maintain both, so I'm free to
steer people according to my experience.

> Maybe it's just that file system developers have higher standards. I
> know that Dave Chinner at LSF/MM commented that using some of the
> things he has been developing for XFS subvolume support might be
> interesting precisely because it could provide some of the facilities
> currently provided by thin provisioning (perhaps not all of them; I'm
> not sure how well his virtual block remapping layer would handle
> hundreds of snapshots) but with file system tools which have a lot
> more seasoning and where people have spent a lot of effort on data
> recovery tools.

Even new XFS features will have bugs. Even though XFS's fsck is
historically robust, oversights and bugs still happen when new features
are added.
And AFAIK future XFS would be looking to leverage DM thinp via its
producer/consumer model. But this is going off on a tangent now.

> In any case, I do use DM quite a lot. I use LVM2 and dm-snapshot (and
> it's been working just *fine* for my use cases). I've wanted to use
> dm-thin, but have been put off by it being labeled as experimental and
> by some of the comments about how robust its recovery tools are.

The Documentation was stale. I personally don't reference it so the
need to update it got overlooked.

> If there was documentation about how an advanced user/developer could
> use low level tools to do manual repair of a thin pool when the
> automated procedures didn't work, without having to pay $$$ to some
> company for "enterprise support", I'd be a lot more willing to give it
> a try.

We could certainly improve our documentation for the use of thin_check
and thin_repair. I know lvm2 has seen improvements to allow the
metadata volume to be activated in standalone mode (activate the
metadata volume without the thin-pool or thin devices on top) so that
the thin_check and thin_repair tools can be used on it.

I'd imagine you aren't aware of the lvm2 package's lvmthin manpage?
See:
man lvmthin

It'd likely be one of the documentation locations to see improvements.

I'll talk with others about where we can improve docs; the manpages for
thin_check and thin_repair are _very_ sparse.

Anyway, room for improvement for sure. But demonizing "enterprise
support" like you don't provide that to your stakeholders is bizarre.

I again was candid and forthcoming about what drives/catches the need
for thin_check and thin_repair fixes and improvements: it just so
happens that "enterprise" deployments make use of DM-thinp and have
exposed the need for support more than community users. I'm not saying
I, or other DM thinp oriented developers, wouldn't provide the same
type of support if a community user like yourself hit a problem.
It is just that enterprise users are the frontlines of advanced usage
and scale. Deploying hundreds of Gluster servers with every brick
layered on top of DM thinp historically exposed issues. Those issues get
fixed and benefit everyone. This discussion, and my need to explain how
"enterprise support" drives innovation, is so.. weird.

> Sorry, I just care a *lot* about data robustness.

You aren't alone.

> > Maybe read your email from earlier today before repeating yourself:
> > https://lkml.org/lkml/2018/8/4/366
>
> Apologies. I'm currently staying at an Assisted Living facility
> keeping an eye on my Dad this week, and the internet at the senior
> living center has been.... marginal. As a result I've been reading my
> e-mail in batches, and so I hadn't seen the e-mail you had posted
> earlier before I had sent my reply.

Best wishes. I've been dealing with stresses in my personal life
myself. Might explain why we've had the awkwardness in this thread.

Mike