On Mon, Oct 23, 2017 at 8:14 PM, Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote: > On Mon, Oct 23, 2017 at 12:06:20PM +0300, Amir Goldstein wrote: >> On Mon, Oct 23, 2017 at 11:40 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: >> > On Mon, Oct 23, 2017 at 09:48:24AM +0300, Amir Goldstein wrote: >> >> On Mon, Oct 23, 2017 at 8:32 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: >> >> > On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote: >> >> >> If there is no such reason I would like to propose to store parent >> >> >> pointers of directory inodes with an xattr name that is diroffset agnostic, >> >> >> namely: >> >> >> >> >> >> struct xfs_dir_parent_name_rec { >> >> >> __be64 p_ino; >> >> >> __be32 p_gen; >> >> >> } __attribute__((packed)); >> >> >> >> >> >> It makes sense for metadata consistency anyway, but the reason I am >> >> >> proposing this is so xfs_fs_get_name() could be implemented at O(1). >> > >> > [...] >> > >> >> But you missed my question completely. I understand why diroffset >> >> is needed for non-dir. But directories are not allowed to have 2 differnt >> >> parents nor 2 entries in the parent dir, so I was asking if we can >> >> special case parent pointer stored in *directory inodes*. >> > >> > I did answer it - not directly, but the answers are there. >> > >> >> Am I missing something? >> > >> > Because you are just looking at it from a "reverse lookup" >> > perspective, I suspect you're not seeing the "detect and >> > reconstruct" broken directory structure side of the picture. >> > >> > The directory offset is redundant information which needed to fully >> > cross-check and, if necessary, reconstruct broken direcotry >> > structures. E.g. if a directory is missing a block due to a bad >> > sector, we can reconstruct that block exactly from the parent >> > pointer information because we know exactly what inodes had dirents >> > in the block that was lost, and we know exactly what order they >> > appeared in the directory block... >> > >> > If we don't have that info for all child inodes (directory or >> > regular file), we can't tell the difference between "lost/stale >> > child that we should ignore" and "child that was referenced by the >> > block we lost". So even directories need to have the diroffset in >> > their parent pointer.... >> > >> >> I see. Thanks for explaining that point. >> In that case, I would like to re-phrase my proposal to store parent >> diroffset in the value rather than in the key of xattr, i.e.: >> name={parent inode #, parent inode generation} >> value={dirent filename, dirent offset} >> OR value={dirent offset, dirent filename} > > Not possible, because the attribute name must be unique: Yes, I'm well aware of that, only the uniqueness is guaranties for directories even without diroffset. > > $ cd /root > $ echo hi > a > $ ln a b > > Produces in a: > > (rootino, rootinogen) -> (0, 'a') > (rootino, rootinogen) -> (1, 'b') > > We must store both pptrs, but we're not allowed to have duplicate attr > names. > > In reference to where this thread went over the weekend -- I don't look > favorably on the idea of having two different parent pointer disk > formats -- there has to be a very good justification for forcing > everyone to remember which format applies in which cases. > Yes. I can relate to that. Anyway, I have a workload that may be sensitive to dentry reconnect performance on large dir trees. After parent pointers settle down, I'll run a benchmark to see how much get_name() with xattr iteration improves performance. Thanks, Amir. -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html