On Mon, Jul 29 2013 at 2:48pm -0400,
Eric Sandeen <sandeen at redhat.com> wrote:

> On 7/27/13 11:56 AM, Lennart Poettering wrote:
> > On Fri, 26.07.13 22:13, Miloslav Trmač (mitr at volny.cz) wrote:
> >
> >> Hello all,
> >> with thin provisioning available, the total and free space values
> >> reported by a filesystem do not necessarily mean that that much space
> >> is _actually_ available (the actual backing storage may be smaller, or
> >> shared with other filesystems).
> >>
> >> If your package reports disk space usage to users, and bases this on
> >> filesystem free space, please consider whether it might need to take
> >> LVM thin provisioning into account.
> >>
> >> The same applies if your package automatically allocates a certain
> >> proportion of the total or available space.
> >>
> >> A quick way to check whether your package is likely to be affected, is
> >> to look for statfs() or statvfs() calls in C, or the equivalent in
> >> your higher-level library / programming language.
> >
> > Well, I am pretty sure the burden must be on the file systems to report
> > a useful estimated free blocks value in statfs()/statvfs(). Exporting that
> > problem to userspace and expecting userspace to work around this is just
> > wrong. In fact, this would be quite an API breakage if applications
> > cannot rely that the value returned is at least a rough estimate on how
> > much data can be stored on disk.
> >
> > journald will scale how much disk usage it will use of /var/log/journal
> > based on the file system size and free level. It will also modulate the
> > per-service rate limit levels based on the amount of free disk space. If
> > you break the API of statfs()/statvfs(), then you will end up breaking this
> > and all programs like it.
>
> Any program needs to be prepared for ENOSPC; as Ric mentioned elsewhere,
> until you successfully write to it, it's not yours! :) (Ok, thinp
> running out of space won't generate ENOSPC today, either, but you see
> my general point...)
>
> And how much space are we really talking about here? If you're running
> thin-provisioning on thin margins, especially w/o some way to automatically
> hot-add storage, you're probably doing it wrong.
>
> (And if journald sees "100T free" and decides it can use 50T of that,
> it's doing it wrong, too) ;)
>
> The truth is somewhere in the middle, but quibbling over whether this
> app or that can claim a bit of space behind a thin-provisioned volume
> probably isn't useful.

Right, so picking up on what we've discussed: we could add the ability
to have fallocate propagate to the underlying storage via a new
REQ_RESERVE bio (if the storage opts in, which dm-thinp could). This
bio would be the reciprocal of discard -- thus enabling the caller to
efficiently reserve space in the underlying storage (e.g.
dm-thin-pool). So volumes or apps (e.g. journald) that _expect_ to have
fully-provisioned space from thinp could get it.

This would also allow for a hybrid setup where the thin-pool is
configured to use a smaller block size, which benefits taking many
snapshots -- but then allows select apps and/or volumes to reserve
contiguous space from the thin-pool.

It obviously also offers the other traditional fallocate benefits too
(reserving large contiguous space for performance, etc).

I'll draft an RFC patch or 2 for LKML... may take some time for me to
get to it but I can make it a higher priority if others have serious
interest.

> The admin definitely needs tools to see the state of thinly provisioned
> storage, but that's the admin's job to worry about, not the app's, IMHO.

Yeah, in a data center the admin really should be all over these thinp
concerns, making them a non-issue. But on the desktop the Fedora
developers need to provide sane policy/defaults.

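To make the userspace side of this concrete, here is a minimal Python
sketch of the two habits discussed above: capping whatever budget is
derived from statvfs() (so "100T free" never turns into "use 50T"), and
reserving space up front with fallocate instead of trusting the reported
free count. This is not journald's actual logic; the names space_budget
and reserve, and the fraction and hard_cap values, are made up for
illustration. It also does not use REQ_RESERVE, which is only a proposal
at this point, so the reservation is made at the filesystem level and
does not reach a thin pool underneath.

```python
import errno
import os
import tempfile


def space_budget(path, fraction=0.10, hard_cap=4 * 1024**3):
    """Return a byte budget: a fraction of reported free space, capped.

    fraction and hard_cap are illustrative defaults (assumptions), not
    values taken from journald or any other real program.
    """
    st = os.statvfs(path)
    # f_bavail is the number of blocks available to unprivileged users,
    # expressed in units of f_frsize bytes.
    reported_free = st.f_bavail * st.f_frsize
    return min(int(reported_free * fraction), hard_cap)


def reserve(path, size):
    """Try to preallocate `size` bytes for `path`; return True on success.

    posix_fallocate() asks the filesystem to allocate the blocks now, so
    a shortage is reported here rather than during a later write.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        return True
    except OSError as exc:
        # Out of space, over quota, or too large for this filesystem.
        if exc.errno in (errno.ENOSPC, errno.EDQUOT, errno.EFBIG):
            return False
        raise
    finally:
        os.close(fd)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    budget = space_budget(workdir)
    print("budget in bytes:", budget)
    print("reserved 1 MiB:", reserve(os.path.join(workdir, "demo.bin"), 1 << 20))
```

The cap is what keeps a wildly optimistic free-space report from
mattering much, and the explicit reservation is the "until you
successfully write to it, it's not yours" rule applied early. Callers
still need to handle ENOSPC on ordinary writes.
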
Mike
-- 
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct