Thanks a lot for explaining this. I will look into the jbd2 code with a
view to implementing something similar in ext4 as well. I will keep you
posted on any patches we try out, and ask for your opinion on them.

Best regards
Naga

On Thu, Jun 27, 2013 at 11:06 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
> On Thu, Jun 27, 2013 at 06:28:21PM +0530, Nagachandra P wrote:
>> Hi Theodore,
>>
>> Could you point me to the code where ext4_std_error() is not triggered
>> because of LMK? As I see it, if a memory allocation returns an error,
>> in some of the cases ext4_std_error() would invariably be called.
>> Please consider the following call stack
>
> Yes, that's one example where a memory allocation failure can lead to
> ext4_std_error() getting called, and I've already acknowledged that's
> one that we need to fix (although as I said, fixing it may be tricky,
> short of calling congestion_wait() and then retrying the allocation,
> and hoping that in the meantime the OOM killer has freed up some
> memory).
>
> If you could give me a list of other memory allocations where
> ext4_std_error() could get called, please let me know. Note that in
> the jbd2 layer, though, we handle a memory allocation failure by
> retrying the allocation, to avoid the file system getting marked
> read/only. Examples of this include jbd2_journal_write_metadata_buffer(),
> and jbd2_journal_add_journal_head() when it calls
> journal_alloc_journal_head(). (Although the way we're doing the retry
> in the latter case is a bit ugly and we're not sleeping with a call to
> congestion_wait(), so it's something we should clean up.)
>
> To give you an example of the intended use of ext4_std_error(), if the
> journal commit code runs into a disk I/O error while writing to the
> journal, the jbd2 code has to mark the journal as aborted. This could
> happen because the disk has gone off-line, or the HDD has run out of
> spare disk sectors in its bad block replacement pool, so it has to
> return a write error to the OS.
> Once the journal has been marked as aborted, the next time the ext4
> code tries to access the journal, by starting a new journal handle or
> marking a metadata block dirty, the jbd2 function will return an
> error, and this will cause ext4_std_error() to be called so the file
> system can be marked as requiring a file system check.
>
> Regards,
>
> - Ted
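
For reference, the retry-on-allocation-failure pattern described above for the jbd2 layer can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not the jbd2 code: try_alloc(), backoff() and alloc_retry() are hypothetical names, malloc() stands in for the kernel allocator, the failure injector only simulates memory pressure, and backoff() stands in for a sleep such as congestion_wait().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical failure injector: the next `fail_count` calls to
 * try_alloc() fail, simulating transient memory pressure. */
int fail_count;

/* Counts how often we backed off, so the behaviour can be checked. */
int backoff_calls;

/* Stand-in for the kernel allocator (assumption: malloc() in user space). */
void *try_alloc(size_t size)
{
    if (fail_count > 0) {
        fail_count--;
        return NULL;            /* simulated allocation failure */
    }
    return malloc(size);
}

/* Stand-in for sleeping, e.g. via congestion_wait(), to give the OOM
 * killer or reclaim a chance to free memory before the next attempt.
 * Here it only records that a back-off happened. */
void backoff(void)
{
    backoff_calls++;
}

/* Retry the allocation until it succeeds instead of returning an error
 * to a caller that would otherwise treat it as a file system error. */
void *alloc_retry(size_t size)
{
    void *p;

    assert(size > 0);
    while ((p = try_alloc(size)) == NULL)
        backoff();
    return p;
}
```

The point of the design is that a transient allocation failure never propagates up to a caller that would react by marking the file system read-only; the cost is that the caller may block for as long as memory stays unavailable.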