Re: [PATCH 0/2] shmem: Notify user space when file system is full

Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> · Tue, 11 Jan 2022 22:19:23 -0500

Amir Goldstein <amir73il@xxxxxxxxx> writes:

> On Tue, Jan 11, 2022 at 3:57 AM Gabriel Krisman Bertazi
> <krisman@xxxxxxxxxxxxx> wrote:
>>
>> Amir Goldstein <amir73il@xxxxxxxxx> writes:
>>
>> > Two things bother me about this proposal.
>> > One is that it makes more sense IMO to report ENOSPC events
>> > from vfs code.
>>
>> Hi Amir,
>>
>> I reimplemented this with FS_WB_ERROR in the branch below. It reports
>> writeback errors on mapping_set_error, as suggested.
>>
>>   https://gitlab.collabora.com/krisman/linux/-/tree/wb-error
>>
>> It is a WIP, and I'm not proposing it yet, cause I'm thinking about the
>> ENOSPC case a bit more...
>>
>> > Why should the requirement to monitor ENOSPC conditions be specific to tmpfs?
>> > Especially, as I mentioned, there are already wrappers in place to report
>> > writeback errors on an inode (mapping_set_error), where the fsnotify hook
>> > can fit nicely.
>>
>> mapping_set_error would trigger the ENOSPC event only when it happens on
>> an actual writeback error (i.e. BLK_STS_NOSPC), which is not the main
>> case I'm solving here.  In fact, most of the time, -ENOSPC will happen
>> before any IO is submitted, for instance, if an inode could not be
>> allocated during .create() or a block can't be allocated in
>> .write_begin(). In this case, it isn't really a writeback error
>> (semantically), and it is not registered as such by any file system.
>>
>
> I see.
> But the question remains, what is so special about shmem that
> your use case requires fsnotify events to handle ENOSPC?
>
> Many systems are deployed on thin provisioned storage these days
> and monitoring the state of the storage to alert administrator before
> storage gets full (be it filesystem inodes or blocks or thinp space)
> is crucial to many systems.
>
> Since the ENOSPC event that you are proposing is asynchronous
> anyway, what is the problem with polling statfs() and meminfo?

Amir,

I spoke a bit with Khazhy (in CC) about the problems with polling the
existing APIs, like statfs.  He has been using a previous version of
this code in production to monitor machines for a while now.  Khazhy,
feel free to pitch in with more details.

Firstly, I don't want to treat shmem as a special case.  The original
patch implemented support only for tmpfs, because it was an fs-specific
solution, but I think this would be useful for any other (non-pseudo)
file system in the kernel.

The use case is similar to the use case I brought up for FAN_FS_ERROR.
A sysadmin monitoring a fleet of machines wants to be notified when a
service failed because of lack of space, without having to trust the
failed application to properly report the error.

Polling statfs is prone to missing the ENOSPC occurrence if the error is
ephemeral from a monitoring tool's point of view. Say the application is
writing a large file, hits ENOSPC and, as a recovery mechanism, removes
the partial file.  If that happens, a daemon might miss the chance to
observe the lack of space in statfs.  Doing it through fsnotify, on the
other hand, always catches the condition and allows a monitoring
tool/sysadmin to take corrective action.
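
To make the race concrete, here is a minimal sketch in Python.  It is
not part of the patch set: the timeline values, the poll interval and
the helper names (free_fraction, full_samples) are all made up for
illustration.  A real monitor would sample with something like
free_fraction(); the simulated timeline only stands in for those samples
so the example is deterministic.

```python
import os
from typing import Iterable, List


def free_fraction(path: str) -> float:
    """Fraction of blocks available to unprivileged users on the
    file system holding `path`, as a polling monitor would sample it."""
    st = os.statvfs(path)
    return st.f_bavail / st.f_blocks if st.f_blocks else 1.0


def full_samples(samples: Iterable[float], threshold: float = 0.0) -> List[int]:
    """Return the indices of the samples where free space was at or
    below `threshold`, i.e. where the monitor would have noticed."""
    return [i for i, free in enumerate(samples) if free <= threshold]


# Hypothetical free-space fractions, one entry per time unit.  The
# application fills the file system at t=3 (its write fails with
# ENOSPC) and unlinks the partial file at t=4, so space is back.
timeline = [0.40, 0.25, 0.10, 0.00, 0.40, 0.40, 0.40, 0.40]

# A monitor that polls every 5 time units only samples t=0 and t=5.
POLL_INTERVAL = 5  # assumed value, for illustration only
polled = timeline[::POLL_INTERVAL]

missed_by_poller = full_samples(polled)      # the poller's view
seen_every_tick = full_samples(timeline)     # what actually happened

print("poller saw full at samples:", missed_by_poller)
print("file system was full at t =", seen_every_tick)
```

With these numbers the poller never observes the full condition, even
though the application did fail.  Shrinking the poll interval narrows
the window but does not close it, which is why an event delivered at the
point of failure is attractive.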

> I guess one difference is that it is harder to predict page allocation failure
> that causes ENOSPC in shmem, but IIUC, your patch does not report
> an fsevent in that case only in inode/block accounting error.
> Or maybe I did not understand it correctly?

Correct.  But we cannot predict the ENOSPC unless we know the
application.  I'm looking for a way for a sysadmin to not have to rely
on the application caring about the file system size.

-- 
Gabriel Krisman Bertazi