Hi Dexen,

I did have old cleanerd lock files from the crashed 2.0.23 in /dev/shm. The upgrade to 2.1 was hung on the lock files, as you surmised; once I removed them, 2.1 cleaned all the checkpoints. My box is using kernel 3.0.4, but I do have quite a few boxes still running kernel 2.6.18 from CentOS 5.5 with the nilfs2 kernel module back-ported to it (i.e. built in), so I will need to check whether those boxes recover from the wedge caused by rewound dates.

Thanks a lot.
Zahid

-----Original Message-----
From: dexen deVries [mailto:dexen.devries@xxxxxxxxx]
Sent: Saturday, December 03, 2011 4:34 AM
To: linux-nilfs@xxxxxxxxxxxxxxx
Cc: Zahid Chowdhury
Subject: Re: nilfs_cleanerd from nilfs-utils shutdown on version 2.0 and 2.1 does not fail but says nothing and does not clean the old checkpoints nor newer (actually older) ones.

Hi Zahid,

On Saturday 03 December 2011 01:33:09 you wrote:
> (...)
> I cannot ever start up the daemon. If I move to a 2.1 daemon, then it logs
> no errors, but it cleans no old or newer (really older) checkpoints - it
> just sits in a do-nothing mode (strace(1) shows he is hung on a
> mq_timedreceive syscall).
> (...)

nilfs_cleanerd creates a sort of lock file in /dev/shm, named `sem.nilfs-cleanerd-$PID'. nilfs_cleanerd version 2.1 refuses to process a filesystem if it has an associated /dev/shm/sem.nilfs-cleanerd-$PID file, to protect against the corruption that occurs when multiple cleanerds access the same filesystem. In strace this looks like being stuck at the mq_timedreceive syscall.

All files in /dev/shm/ disappear after a reboot (it's a temporary filesystem), so you don't usually see this behavior. However, when you start a new nilfs_cleanerd (v2.1) process without a reboot, you need to remove the relevant file by hand.

Do ensure the old cleanerd process is dead before deleting the file. Otherwise corruption will happen when multiple cleanerds access the same filesystem.
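
The manual check described above could be sketched roughly as follows. This is only an illustration, not part of nilfs-utils: it assumes the lock files are named sem.nilfs-cleanerd-$PID as stated in this message, and the helper names (find_stale_locks, pid_is_alive) are made up for the example. It only reports candidates and deletes nothing. A live PID does not prove the process is nilfs_cleanerd (PIDs get reused), so confirm with ps before removing anything.

```python
import os
import re
from pathlib import Path

# File name pattern as described in this thread (assumption: numeric suffix is a PID).
LOCK_RE = re.compile(r"^sem\.nilfs-cleanerd-(\d+)$")


def pid_is_alive(pid: int) -> bool:
    """Return True if some process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_stale_locks(shm_dir="/dev/shm", is_alive=pid_is_alive):
    """List lock files whose PID suffix no longer matches a running process."""
    stale = []
    for entry in sorted(Path(shm_dir).iterdir()):
        match = LOCK_RE.match(entry.name)
        if match and not is_alive(int(match.group(1))):
            stale.append(entry)
    return stale


if __name__ == "__main__":
    # Report only; removal is left to the administrator.
    for path in find_stale_locks():
        print(f"stale lock candidate: {path}")
```

Run it as root on the affected box; any file it prints is a candidate for manual removal once you have verified that no cleanerd is still working on that filesystem.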

On Saturday 03 December 2011 01:33:09 you wrote:
> If I move the system date forward, have some checkpoints created and then
> move the date backward a 2.0 cleanerd daemon fails on this error:
> Nov 30 14:39:37 nilfs_cleanerd[5789]: start
> Nov 30 14:39:38 kernel: nilfs_ioctl_move_inode_block: conflicting data
> buffer: ino=4, cno=0, offset=0, blocknr=665655, vblocknr=566462
> Nov 30 14:39:38 kernel: NILFS: GC failed during preparation: cannot
> read source blocks: err=-17
> Nov 30 14:39:38 nilfs_cleanerd[5789]: cannot clean segments: File
> exists
> Nov 30 14:39:38 nilfs_cleanerd[5789]: shutdown
> (...)

I got a similar (or the same) error with an older kernel. Removing all checkpoints with rmcp helped, but that doesn't seem like a 100% reliable solution to me.

Right now I'm using kernels v3.1 and 3.2-rc3; they seem rock-solid.

Regards,
--
dexen deVries

> Gresham's Law for Computing:
> The Fast drives out the Slow even if the Fast is Wrong.
William Kahan in http://www.cs.berkeley.edu/~wkahan/Stnfrd50.pdf