I've read through what I believe are all the threads related to the possible race condition between iput() and __sync_one(). It appears from Rahul's last posting to Marcello that it was decided it was impossible to have iput() and __sync_one() both trying to process the same inode. If this is a wrong assumption on my part, and the problem was already found and fixed, let me know and just ignore the rest of this.

Configuration: 8-way IA-64 / 2.4.20 (but the problem exists in 2.4.25)

I can state with confidence that the race does indeed occur, and I have multiple crash dumps that prove it. The most common occurrence seems to be with writing to a /proc file (running irqbalance does this). If kupdated runs while the file's inode is on the s_dirty list, it will call __sync_one(). If, at the same time, iput() is running against the same inode, the race can occur.

On 2.4.20 kernels, the window is between the clearing of I_LOCK and the testing of I_FREEING in __sync_one(). In 2.4.25, the window is between the clearing of I_LOCK in __sync_one() and the test of I_FREEING in __refile_inode().

Here's how the window is opened: if an interrupt comes in between the clearing of I_LOCK and the testing of I_FREEING, there is an opportunity for iput() to call clear_inode() (which clears all of the state bits except I_CLEAR), and it can even go as far as calling destroy_inode(). Under severe memory pressure we have seen the system go as far as returning the inode-cache page to the allocator, and that page was then handed to another process. This is where bad things happen.

When the interrupt returns, the inode can get moved to the unused list. If the inode had already been returned to the inode cache by destroy_inode(), then the next time that inode is allocated it will be added to the in-use list. At this point the two lists are linked together, since get_new_inode() does not do a list_del() on the inode before doing a list_add(). (There's a small userspace mock-up of this at the end of the message.) Also, when the interrupt returns, it is possible to interleave list operations between __sync_one() and dispose_list(), which doesn't hold the spinlock. This can cause all sorts of strange connections, including loops, depending on the architecture.

One more bad thing: if the low-latency patch is installed and the in-use and unused lists are linked, then it is possible for the unused list head to be moved onto the in-use list, and we've even seen the in-use list head on a dispose list.

The easiest way I know to reproduce the problem is the following:

1) You need more than 2 processors (we're running 8). There's a report that it may have occurred on a 2-way system.

2) The system needs to be busy (but not overly busy). We have been keeping up the demand on inodes and interrupts by creating lots of files on multiple volumes and deleting them.

3) /proc seems to be the best trigger. Just run an infinite loop writing to a /proc file. I was using /proc/sys/kernel/kdb and just kept echoing a 0 into it (a trivial loop is at the end of this message).

4) You can either wait for something bad to happen, or do what we did: in __sync_one(), just before the wake_up() at the end, check whether the I_CLEAR bit is set. We also held the window open by adding an mdelay(1) after clearing the I_LOCK bit; if the code were safe, this would not add any risk. (A sketch of this instrumentation is at the end of the message.)

/proc is interesting, because it has a delete_inode() function that just sets the I_CLEAR bit. That's another problem....

--
Charlie Brett <cfb@xxxxxx>
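
Here is a small userspace mock-up of the list corruption described above. This is my own illustration using a simplified list_head, not kernel code; the point is just what list_add() does when the entry is still linked on another list and nobody did a list_del() first (the get_new_inode() situation):

#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

int main(void)
{
	struct list_head unused, in_use, inode;
	struct list_head *p;
	int hops = 0;

	INIT_LIST_HEAD(&unused);
	INIT_LIST_HEAD(&in_use);

	/* __sync_one() refiles the already-freed inode onto the unused list */
	list_add(&inode, &unused);

	/* later, get_new_inode() hands out the same inode and list_add()s it
	 * onto the in-use list WITHOUT a list_del() from the unused list */
	list_add(&inode, &in_use);

	/* the unused list's head still points at the inode, but the inode's
	 * links now point into the in-use list: the two lists are spliced,
	 * and walking the unused list never gets back to its own head */
	for (p = unused.next; p != &unused && hops < 10; p = p->next, hops++)
		printf("hop %d: %s\n", hops,
		       p == &inode  ? "the inode" :
		       p == &in_use ? "the in-use list head" : "???");
	if (p != &unused)
		printf("gave up after %d hops without finding the unused head\n",
		       hops);
	return 0;
}

Running it shows the walk bouncing between the inode and the in-use head forever, which is the kind of "strange connection, including loops" I mentioned above.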
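
For step 3, the trigger was just a shell loop echoing 0 into /proc/sys/kernel/kdb. A C equivalent (open/write/close forever; any writable /proc file should do, the path here is just the one I happened to use) is:

#include <stdio.h>

int main(void)
{
	for (;;) {
		FILE *f = fopen("/proc/sys/kernel/kdb", "w");

		if (!f)
			return 1;
		fputs("0\n", f);
		fclose(f);
	}
}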
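
For step 4, the instrumentation in __sync_one() amounted to the following. The surrounding lines are paraphrased from memory rather than copied from the 2.4.20 fs/inode.c, so treat this as a sketch of where the checks go, not as a patch; mdelay() needs linux/delay.h, and what you do when the check fires (printk, breakpoint, BUG()) is up to you -- a printk() is shown here:

	spin_lock(&inode_lock);
	inode->i_state &= ~I_LOCK;
	mdelay(1);	/* debug only: hold the window open; if the code were
			 * safe this would add no risk */
	if (!(inode->i_state & I_FREEING)) {
		/* ... existing code that refiles the inode onto s_dirty,
		 * the in-use list or the unused list ... */
	}
	/* debug only: if iput()/clear_inode() got in during the window above,
	 * we have just refiled an inode that is already cleared (or freed) */
	if (inode->i_state & I_CLEAR)
		printk(KERN_ERR "__sync_one: inode %p is already I_CLEAR\n",
		       inode);
	wake_up(&inode->i_wait);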