On Thu, Nov 7, 2024 at 9:48 PM Song Liu <song@xxxxxxxxxx> wrote: > > On Thu, Nov 7, 2024 at 12:34 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > On Thu, Nov 7, 2024 at 8:53 PM Song Liu <songliubraving@xxxxxxxx> wrote: > > > > > > > > > > > > > On Nov 7, 2024, at 3:10 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > > > On Wed, Oct 30, 2024 at 12:13 AM Song Liu <song@xxxxxxxxxx> wrote: > > > >> > > > >> This test shows a simplified logic that monitors a subtree. This is > > > >> simplified as it doesn't handle all the scenarios, such as: > > > >> > > > >> 1) moving a subsubtree into/outof the being monitoring subtree; > > > > > > > > There is a solution for that (see below) > > > > > > > >> 2) mount point inside the being monitored subtree > > > > > > > > For that we will need to add the MOUNT/UNMOUNT/MOVE_MOUNT events, > > > > but those have been requested by userspace anyway. > > > > > > > >> > > > >> Therefore, this is not to show a way to reliably monitor a subtree. > > > >> Instead, this is to test the functionalities of bpf based fastpath. > > > >> To really monitor a subtree reliably, we will need more complex logic. > > > > > > > > Actually, this example is the foundation of my vision for efficient and race > > > > free subtree filtering: > > > > > > > > 1. The inode map is to be treated as a cache for the is_subdir() query > > > > > > Using is_subdir() as the truth and managing the cache in inode map seems > > > promising to me. > > > > > > > 2. Cache entries can also have a "distance from root" (i.e. depth) value > > > > 3. Each unknown queried path can call is_subdir() and populate the cache > > > > entries for all ancestors > > > > 4. The cache/map size should be limited and when limit is reached, > > > > evicting entries by depth priority makes sense > > > > 5. A rename event for a directory whose inode is in the map and whose > > > > new parent is not in the map or has a different value than old parent > > > > needs to invalidate the entire map > > > > 6. fast_path also needs a hook from inode evict to clear cache entries > > > > > > The inode map is physically attached to the inode itself. So the evict > > > event is automatically handled. IOW, an inode's entry in the inode map > > > is automatically removed when the inode is freed. For the same reason, > > > we don't need to set a limit in map size and add evicting logic. Of > > > course, this works based on the assumption that we don't use too much > > > memory for each inode. I think this assumption is true. > > > > Oh no, it is definitely wrong each inode is around 1K and this was the > > main incentive to implement FAN_MARK_EVICTABLE and > > FAN_MARK_FILESYSTEK in the first place, because recursive > > pinning of all inodes in a large tree is not scalable to large trees. > > The inode map does not pin the inode in memory. The inode will > still be evicted as normal. The entry associated with the being > evicted inode in the map will be freed together as the inoide is freed. > The overhead I mentioned was the extra few bytes (for the flag, etc.) > per inode. When an evicted inode is loaded again, we will use > is_subdir() to recreate the entry in the inode map. > > Does this make sense and address the concern? > Yes, that sounds good. Thanks, Amir.