On Wed, Apr 28, 2021 at 5:05 AM <hase.jin@xxxxxxxxxxx> wrote: > > Hi, > > Let me share current idea about source code modification. > Regarding readdir(), errno is set to indicate the error, so we can check readdir() error in Ceph codes. > > [readdir() specification] > ------------------------------------------------------------ > https://man7.org/linux/man-pages/man3/readdir.3.html > > DESCRIPTION > ... It returns NULL on reaching the end of the directory stream or if an error occurred. > > RETURN VALUE > ... If the end of the directory stream is reached, NULL is returned > and errno is not changed. If an error occurs, NULL is returned > and errno is set to indicate the error. To distinguish end of > stream from an error, set errno to zero before calling readdir() > and then check the value of errno if NULL is returned. > ------------------------------------------------------------ > > > How about modify the codes where readdir () is being executed in filestore as follows: > > ------------------------------------------------------------ > diff --git a/src/os/filestore/BtrfsFileStoreBackend.cc b/src/os/filestore/BtrfsFileStoreBackend.cc > index df1d2452a1f..ad0c4d1bb1e 100644 > --- a/src/os/filestore/BtrfsFileStoreBackend.cc > +++ b/src/os/filestore/BtrfsFileStoreBackend.cc > @@ -326,7 +326,13 @@ int BtrfsFileStoreBackend::list_checkpoints(list<string>& ls) > list<string> snaps; > char path[PATH_MAX]; > struct dirent *de; > - while ((de = ::readdir(dir))) { > + while (true) { > + errno = 0; > + de = ::readdir(dir); > + if (de == nullptr) { > + ceph_assert(errno == 0); > + break; > + } Yes! Checking the readdir result is something we should have been doing; ignoring it is a bug. I would adjust this code to also print the error code to the log, though. sage > snprintf(path, sizeof(path), "%s/%s", get_basedir_path().c_str(), de->d_name); > > struct stat st; > diff --git a/src/os/filestore/FileStore.cc b/src/os/filestore/FileStore.cc > index f059331b090..1aeec855e9a 100644 > --- a/src/os/filestore/FileStore.cc > +++ b/src/os/filestore/FileStore.cc > @@ -4987,7 +4987,13 @@ int FileStore::list_collections(vector<coll_t>& ls, bool include_temp) > } > > struct dirent *de = nullptr; > - while ((de = ::readdir(dir))) { > + while (true) { > + errno = 0; > + de = ::readdir(dir); > + if (de == nullptr) { > + ceph_assert(errno == 0); > + break; > + } > if (de->d_type == DT_UNKNOWN) { > // d_type not supported (non-ext[234], btrfs), must stat > struct stat sb; > diff --git a/src/os/filestore/LFNIndex.cc b/src/os/filestore/LFNIndex.cc > index bbf65bdc66a..530d7501ddc 100644 > --- a/src/os/filestore/LFNIndex.cc > +++ b/src/os/filestore/LFNIndex.cc > @@ -429,7 +429,13 @@ int LFNIndex::list_objects(const vector<string> &to_list, int max_objs, > int r = 0; > int listed = 0; > bool end = true; > - while ((de = ::readdir(dir))) { > + while (true) { > + errno = 0; > + de = ::readdir(dir); > + if (de == nullptr) { > + ceph_assert(errno == 0); > + break; > + } > end = false; > if (max_objs > 0 && listed >= max_objs) { > break; > @@ -477,7 +483,13 @@ int LFNIndex::list_subdirs(const vector<string> &to_list, > return -errno; > > struct dirent *de = nullptr; > - while ((de = ::readdir(dir))) { > + while (true) { > + errno = 0; > + de = ::readdir(dir); > + if (de == nullptr) { > + ceph_assert(errno == 0); > + break; > + } > string short_name(de->d_name); > string demangled_name; > if (lfn_is_subdir(short_name, &demangled_name)) { > ------------------------------------------------------------ > > I haven't verified this modification yet, but is this idea (check readdir() errors) acceptable? > If Ceph can't detect the error, it's hard to deal with. > But regarding readdir(), it can detect the error, so I would like to fix it. > > Jin > _______________________________________________ > Dev mailing list -- dev@xxxxxxx > To unsubscribe send an email to dev-leave@xxxxxxx _______________________________________________ Dev mailing list -- dev@xxxxxxx To unsubscribe send an email to dev-leave@xxxxxxx