On Mon, Jul 29 2013 at 2:49pm -0400,
Daniel P. Berrange <berrange at redhat.com> wrote:

> On Mon, Jul 29, 2013 at 02:38:23PM -0400, Ric Wheeler wrote:
> > On 07/29/2013 10:18 AM, Daniel P. Berrange wrote:
> > >On Mon, Jul 29, 2013 at 08:01:23AM -0600, Chris Murphy wrote:
> > >>On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" <berrange at redhat.com> wrote:
> > >>
> > >>>Yep, we need to be able to report free space on filesystems, so that
> > >>>apps provisioning virtual machines can get an idea of how much storage
> > >>>they can provide to VMs without risk of over comitting.
> > >>>
> > >>>I agree that we really want the kernel, or at least a reusable shared
> > >>>library, to provide some kind of interface to determine this, rather
> > >>>than requiring every userspace app which cares to re-invent the wheel.
> > >>What does it mean for an app to use stat to get free space, and then
> > >>proceeds to create too big a VM image in a directory that has a quota
> > >>set? I still think apps are asking an inappropriate/unqualified question
> > >>by asking for volume free space, instead of what's available to them for
> > >>a specified path.
> > > From an API POV, libvirt doesn't need/care about the free space on the
> > >volume underlying the filesystem. We actually only care about the free
> > >space in a given directory that we're using for disk images. It just
> > >happens that we implement this using statvfs() currently. So when I
> > >ask for an API above, don't take this to mean I want a statvfs() that
> > >knows about sparse volumes. An API or syscall that provides free space
> > >for individual directories is fine with me.
> > >
> >
> > Just another note, it is never safe to assume that storage under any
> > file system is yours for the taking.
> >
> > If application A does a stat or statvfs() call, sees 1GB of space
> > left and then does a write, we could easily lose that race to any
> > other application.
>
> This race doesn't matter from libvirt's POV. It is just providing a
> mechanism via its API. It is upto the management application using
> libvirt to make use of the mechanism to provide a usage policy.
> Their usage scenario may well enable them to make certain assumptions
> about the storage that you could not otherwise do in a race free
> manner.
>
> In addition, even in more general purpose usage scenarios, it does
> not neccessarily matter if there is a race, because there can be a
> second line of defence. For example, KVM can be set to pause the VM
> upon ENOSPC errors, giving management application or administrator
> the chance to expand capacity the underlying storage and then unpause
> the guest. In that case checking the free space is mostly just a
> sanity check which serves to avoid hitting the pause-on-ENOSPC scenario
> too frequently.

Running out of free space _should_ be extremely rare. A properly
configured dm-thin pool will have adequate free space, with an
appropriate low water mark, that would give admins ample time to extend
(even if a human were to do it). On top of that, lvm2 has support to
autoextend the thin-pool with free space in the parent volume group.

But I'm just talking about the not-really-chicken solution of leaning on
a properly configured system (either by admins in a data center or by
Fedora developers with sane defaults).

As an aside, this extra free space checking that KVM is doing is really
broken by design (polling sucks -- especially if this polling is
happening in the host for each guest). It would be much better to
leverage something like lvm2 with a custom dmeventd plugin that fires
when it receives the low water mark and/or -ENOSPC event.

Thinly provisioned volumes offer the prospect of doing away with this
polling -- as such, proper dm-thin integration has been on the virt
roadmap for a while. It just never seems to happen.
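
For reference, the kind of per-path free space sanity check discussed in
the quoted text could be sketched roughly as below. This is only an
illustration, not what libvirt or KVM actually do: it assumes Python's
os.statvfs() on the image directory, and the function names and the 10%
threshold are made up for the example. It knows nothing about quotas or
thin-pool overcommit, and it is still subject to the race Ric describes.

```python
import os


def free_bytes(path):
    """Bytes available to an unprivileged writer on the filesystem holding path."""
    st = os.statvfs(path)
    # f_bavail: blocks available to non-root users; f_frsize: size of one block.
    return st.f_bavail * st.f_frsize


def total_bytes(path):
    """Total size in bytes of the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_blocks * st.f_frsize


def below_low_water_mark(free, total, min_free_fraction=0.10):
    """Hypothetical policy: True once free space drops under the given fraction.

    The 10% default is an arbitrary value chosen for this example.
    """
    if total <= 0:
        # Nothing usable was reported, so treat it as out of space.
        return True
    return free / total < min_free_fraction


if __name__ == "__main__":
    image_dir = "."  # assumption: the directory that holds the disk images
    free = free_bytes(image_dir)
    total = total_bytes(image_dir)
    print(free, total, below_low_water_mark(free, total))
```

Note that this is exactly the polling approach criticized above; an
event from dmeventd would replace the periodic call rather than improve
it.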

Mike