On Mon, 2013-01-14 at 16:54 +0100, Sven Eckelmann wrote:
> The filesystem can end up in a state where the filesystem is full and the
> returned ss_nongc_ctime is smaller than sui_lastmod of all reclaimable
> segments. The garbage collector will not clean anything and therefore no new
> room for new files will be available and ss_nongc_ctime/sui_lastmod will not be
> updated without using special tools. This makes the filesystem unusable without
> manual recovery.
>
> Signed-off-by: Sven Eckelmann <sven@xxxxxxxxxxxxx>
> --
> This problem appeared on a current 3.2 stable kernel (Debian Wheezy build). I
> am not an FS developer and therefore do not have much background knowledge about
> the NILFS codebase. Nevertheless, this problem hit me quite hard after creating
> some files on a nilfs partition until it was full and deleting them again.
>
> $ for i in `seq 0 150`; do dd if=/dev/zero of=foo$i count=22528; done
> $ rm foo*
>
> Looking at the debugging output using
>
> $ watch -n .5 'df -h;tail /var/log/syslog;'
>
> clearly showed that it was not finding any segments to delete. The only problem
> I could find was the threshold. After "removing" this threshold, I was able to
> get some clear segments again. I personally cannot explain why the check is
> there at all. Maybe there is a good reason but the comment above it didn't help
> much.
>
> So, here for completeness the threshold: 1358164666 (aka: Mon Jan 14 12:57:46
> CET 2013)
>
> And here is the output of lssu and lscp:
>
> $ lssu --all
>   SEGNUM        DATE     TIME STAT     NBLOCKS
>        0  2013-01-14 12:58:23  -d-        2047
>        1  2013-01-14 12:58:23  -d-        2048

[snip]

>
> Signed-off-by: Sven Eckelmann <sven@xxxxxxxxxxxxx>
> ---
>  sbin/cleanerd/cleanerd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sbin/cleanerd/cleanerd.c b/sbin/cleanerd/cleanerd.c
> index bfcd893..12ed975 100644
> --- a/sbin/cleanerd/cleanerd.c
> +++ b/sbin/cleanerd/cleanerd.c
> @@ -592,7 +592,7 @@ nilfs_cleanerd_select_segments(struct nilfs_cleanerd *cleanerd,
>  	 * selected. */
>  	thr = (config->cf_selection_policy.p_threshold != 0) ?
>  		config->cf_selection_policy.p_threshold :
> -		sustat->ss_nongc_ctime;
> +		~0ULL;
>

As I understand the nilfs_cleanerd code, it is correct without your change.
ss_nongc_ctime is the creation time of the last segment that was not created
for GC. When thr is set, it is compared with sui_lastmod, which is the
timestamp of a segment's last modification. So nilfs_cleanerd itself works
correctly.

I think that this is a bug on the kernel side. My current guess is that in
some environments ns_nongc_ctime is not updated correctly, so you end up with
a threshold that prevents segments from being cleaned.

Thank you for the issue report.

With the best regards,
Vyacheslav Dubeyko.

>  	for (segnum = 0; segnum < sustat->ss_nsegs; segnum += n) {
>  		count = (sustat->ss_nsegs - segnum < NILFS_CLEANERD_NSUINFO) ?
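
To make the comparison discussed in this thread concrete, here is a minimal,
self-contained sketch. It is not the actual cleanerd code: struct seg_info,
pick_threshold() and count_candidates() are hypothetical names made up for
illustration, and only p_threshold, ss_nongc_ctime and sui_lastmod come from
the quoted hunk. The sketch assumes that a reclaimable segment is selected
only when its sui_lastmod is strictly below the threshold, which matches the
behaviour described in the report but has not been checked against the
cleanerd source here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one segment usage entry. */
struct seg_info {
	uint64_t lastmod;	/* plays the role of sui_lastmod */
	int reclaimable;	/* non-zero if the segment may be cleaned */
};

/*
 * Choose the threshold the same way the quoted hunk does: a configured
 * p_threshold wins, otherwise fall back to ss_nongc_ctime.
 */
static uint64_t pick_threshold(uint64_t p_threshold, uint64_t nongc_ctime)
{
	return (p_threshold != 0) ? p_threshold : nongc_ctime;
}

/*
 * Count the segments that would qualify for cleaning.
 * Assumption: a segment qualifies only if lastmod < thr.
 */
static size_t count_candidates(const struct seg_info *si, size_t n,
			       uint64_t thr)
{
	size_t i, count = 0;

	assert(si != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (si[i].reclaimable && si[i].lastmod < thr)
			count++;
	}
	return count;
}
```

With the numbers from the report (threshold 1358164666, both listed segments
last modified 37 seconds later at 1358164703), count_candidates() finds no
candidate, while a threshold of ~0ULL accepts both. This shows why the patch
unblocks the cleaner, and also why a stale ss_nongc_ctime coming from the
kernel would produce the same symptom.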