On Mon, Mar 2, 2015 at 7:46 AM, Mateusz Guzik <mguzik@xxxxxxxxxx> wrote: > On Mon, Mar 02, 2015 at 05:38:29AM +0100, Mateusz Guzik wrote: >> On Sun, Mar 01, 2015 at 08:31:26AM +1100, Dave Chinner wrote: >> > On Sat, Feb 28, 2015 at 05:25:57PM +0300, Alexey Dobriyan wrote: >> > > Freezing and thawing are separate system calls, task which is supposed >> > > to thaw filesystem/superblock can disappear due to crash or not thaw >> > > due to a bug. At least record task name (we can't take task_struct >> > > reference) to make support engineer's life easier. >> > > >> > > Hopefully 16 bytes per superblock isn't much. >> > > >> > > TASK_COMM_LEN definition (which is userspace ABI, see prctl(PR_SET_NAME)) is >> > > moved to userspace exported header to not drag sched.h into every fs.h inclusion. >> > > >> > > Signed-off-by: Alexey Dobriyan <adobriyan@xxxxxxxxx> >> > >> > Freeze/thaw can be nested at the block level. That means the >> > sb->s_writers.freeze_comm can point at the wrong process. i.e. >> > >> > Task A Task B >> > freeze_bdev >> > freeze_super >> > freeze_comm = A >> > freeze_bdev >> > ..... >> > thaw_bdev >> > <device still frozen> >> > <crash> >> > >> > At this point, the block device will never be unthawed, but >> > the debug field is now pointing to the wrong task. i.e. The debug >> > helper has not recorded the process that is actually causing the >> > problem, and leads us all off on a wild goose chase down the wrong >> > path. >> > >> > IMO, debug code is only useful if it's reliable..... >> > >> >> It can be trivially modified to be very useful to support people. >> >> Actually this patch clears saved task name on unfreeze, so in this >> particular scenario we would end up with no data. >> >> Freezer and unfreezer names don't even have to match, so there is not >> much we can do here (e.g. recording all names in a linked list or >> something is a non-starter because of this). >> >> I propose the following: >> - on freezing: >> 1. if 0->1 save the name >> 2. if 1->2 have a flag to note there is an additional freezer >> - on unfreezing >> 1. if 1->0 clear the flag >> 2. DO NOT clear the name in any case >> > > Now that I sent this e-mail I realized we could actually keep a linked > list of freezer names. Unfreezing would delete all elements when going > 1->0, but would not touch it otherwise. If you do linked list two processes constantly keeping superblock frozen will allocate all the memory: F F T F T ... After all valid objections and comments I think the way to proceed is a) record freeze_comm, never clear it (maybe record thaw_comm) b) record "i am not the only one" flag c) introduce debug_freeze which will "record" everything in dmesg for situations when crashdump is not used but serial console is. Alexey -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html