Re: thoughts about fanotify and HSM

Hi Amir!

On Sun 11-09-22 21:12:06, Amir Goldstein wrote:
> I wanted to consult with you about preliminary design thoughts
> for implementing a hierarchical storage manager (HSM)
> with fanotify.
> 
> I have been in contact with some developers in the past
> who were interested in using fanotify to implement HSM
> (to replace old DMAPI implementation).

Ah, DMAPI. Shiver. Bad memories of carrying that hacky code in SUSE kernels
;)

So how serious are these guys about HSM and about investing in it? The
kernel is going to be only a small part of what's needed for it to be
useful. We dropped DMAPI from SUSE kernels because the code was painful to
carry (and to forward-port, plus it was not of great quality) and the
demand for it was not really big... So I'd prefer to avoid a major API
extension unless there are serious users out there. Perhaps we will even
need to develop the kernel API in cooperation with the userspace part to
verify the result is actually usable and useful. But for now we can take it
as an interesting mental exercise ;)

> Basically, FAN_OPEN_PERM + FAN_MARK_FILESYSTEM
> should be enough to implement a basic HSM, but it is not
> sufficient for implementing more advanced HSM features.
> 
> Some of the HSM features that I would like are:
> - blocking hook before access to file range and fill that range
> - blocking hook before lookup of child and optionally create child
> 
> My thoughts on the UAPI were:
> - Allow new combination of FAN_CLASS_PRE_CONTENT
>   and FAN_REPORT_FID/DFID_NAME
> - This combination does not allow any of the existing events
>   in mask
> - It allows only new events such as FAN_PRE_ACCESS,
>   FAN_PRE_MODIFY and FAN_PRE_LOOKUP
> - FAN_PRE_ACCESS and FAN_PRE_MODIFY can have
>   optional file range info
> - All the FAN_PRE_ events are called outside vfs locks and
>   specifically before sb_writers lock as in my fsnotify_pre_modify [1]
>   POC
> 
> That last part is important because the HSM daemon will
> need to make modifications to the accessed file/directory
> before allowing the operation to proceed.

My main worry here is that with FAN_MARK_FILESYSTEM marks, there will be
far too many events (especially for the lookup & access cases) to process
reasonably. And since the events are blocking, the impact on performance
will be large.

I think that a reasonably efficient HSM will have to stay in the kernel
(without generating work for userspace) for the "nothing to do" case. Only
when something needs to be migrated should an event be generated and
userspace get involved. But it isn't obvious to me how to do this with
fanotify (I could imagine it with, say, overlayfs, which is kind of an HSM
solution already ;)).

> Naturally that opens the possibility for new userspace
> deadlocks. Nothing that is not already possible with permission
> events, but maybe deadlocks that are more inviting to trip over.
> 
> I am not sure if we need to do anything about this, but we
> could make it easier to ignore events from the HSM daemon
> itself if we want to, to make the userspace implementation easier.

So if the events happen only in the "migration needed" case, I don't think
deadlocks would be too problematic - it just requires a bit of care from
userspace so that the event processing & migration processes do not access
HSM managed stuff.
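
One userspace-only way to apply that care, sketched here as an assumption
rather than as an agreed design, is for the daemon to compare the pid field
of each event against the pids of its own event-processing and migration
processes and allow those events immediately. The helper name and the
own_pids list are hypothetical; the pid field itself is part of the existing
struct fanotify_event_metadata.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/fanotify.h>
#include <sys/types.h>

/* Return 1 if the event was triggered by one of the daemon's own processes,
 * 0 otherwise. meta->pid is a process ID unless the group was created with
 * FAN_REPORT_TID, in which case it is a thread ID, so own_pids must hold
 * the matching kind of ID. */
int hsm_is_own_event(const struct fanotify_event_metadata *meta,
                     const pid_t *own_pids, size_t n_pids)
{
    assert(meta != NULL);
    for (size_t i = 0; i < n_pids; i++) {
        if (meta->pid == own_pids[i])
            return 1;
    }
    return 0;
}
```

A daemon would call this before doing any recall work and answer FAN_ALLOW
right away when it returns 1. It only covers processes the daemon knows
about, so helpers spawned later would have to be added to the list before
they touch HSM managed files.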

> Another thing that might be good to do is provide an administrative
> interface to iterate and abort pending fanotify permission/pre-content
> events.

You can always kill the listener. Or are you worried about cases where it
sleeps in uninterruptible (D) state?

> You must have noticed the overlap between my old persistent
> change tracking journal and this design. The referenced branch
> is from that old POC.
> 
> I do believe that the use cases somewhat overlap and that the
> same building blocks could be used to implement a persistent
> change journal in userspace as you suggested back then.
> 
> Thoughts?

Yes, there is some overlap. But OTOH HSM seems to require more detailed and
generally more frequent events which seems like a challenge.

> [1] https://github.com/amir73il/linux/commits/fsnotify_pre_modify

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR


