On Thu, Mar 30, 2017 at 03:13:27PM -0400, Dave Anderson wrote:
> > If we hit a NULL next pointer on a struct list_head list it means the
> > list is corrupted.
>
> Yeah, that is true -- although it's always been this way and there's never
> been a bug report.  I'm curious as to what happened in your case where
> you discovered this?

I received a dump where the kernel had crashed while iterating through a
linked list.  We read out the list in crash and it was certainly
corrupted:

 crash-arm> list -xH 0x7f047394
 be44f544
 8ff69004
 8ff69f04
 8ff693c4
 8ffe0e44
 8ffe0244
 ...
 8ffb53c4
 be448b44
 8fc2c3c4
 8fc2c0c4
 8fc2c604
 8fc2cf04
 ffff
 list: invalid kernel virtual address: ffff  type: "list entry"
 crash-arm>

Further investigation led us to suspect that this was not a simple case
of a freed element still being on the list, but some other, larger memory
corruption.

We wanted to find out if there were more corrupted entries on this list,
so we dumped the list in reverse using the .prev pointers:

 crash-arm> list -rxH 0x7f047394
 b957fcc4
 b957f0c4
 b4d863c4
 bad41904
 bad416c4
 bad41c04
 bad41784
 bad41544
 be5f4b44
 ...
 8ff7de44
 8f7a4c04
 8f7a4f04
 8fc2c9c4
 8fc2c904
 8fc2c784
 8fc2ce44
 crash-arm>

This, surprisingly, terminated successfully.  However, a closer look at
the addresses showed that the last elements of the reverse iteration are
not the first elements of the forward iteration.  So crash had silently
stopped iterating partway through the list.  This was because the
8fc2ce44 element had a NULL prev pointer:

 crash-arm> struct list_head 8fc2ce44
 struct list_head {
   next = 0xffffffff,
   prev = 0x0
 }

Since crash knows that the list is corrupted, it would seem appropriate
for it to alert the user to this fact instead of silently and
successfully terminating the iteration.
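
To make the suggestion concrete, here is a minimal sketch in plain C of
the behaviour I have in mind.  It is not crash's actual list code:
walk_list(), demo_broken_prev() and the WALK_* codes are hypothetical
names made up for illustration, and the sketch follows ordinary in-process
pointers instead of reading entries out of a dump.  The only point it
shows is that a NULL link is reported as corruption, while a walk that
wraps back around to the head is the only case treated as success.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical status codes, for illustration only. */
enum walk_status { WALK_OK = 0, WALK_CORRUPT = -1 };

/*
 * Walk a circular list starting at 'head', following .next when
 * 'reverse' is zero and .prev otherwise.  The number of entries
 * visited is stored in *count.
 *
 * WALK_OK is returned only when the walk arrives back at 'head'.
 * A NULL link is reported on stderr and returns WALK_CORRUPT, so the
 * caller cannot mistake a truncated walk for a complete one.
 *
 * Assumption: every non-NULL link is dereferenceable.  Garbage
 * pointers such as 0xffff and cycles that bypass 'head' are not
 * handled by this sketch.
 */
static enum walk_status walk_list(const struct list_head *head,
                                  int reverse, size_t *count)
{
    const struct list_head *pos;

    assert(head != NULL && count != NULL);

    *count = 0;
    pos = reverse ? head->prev : head->next;

    while (pos != head) {
        if (pos == NULL) {
            fprintf(stderr,
                    "list: NULL %s pointer after %zu entries: "
                    "list is corrupted\n",
                    reverse ? "prev" : "next", *count);
            return WALK_CORRUPT;
        }
        (*count)++;
        pos = reverse ? pos->prev : pos->next;
    }

    return WALK_OK;
}

/*
 * Build a three-entry list on the stack, clear one .prev link to
 * mimic the state of the 8fc2ce44 entry, and walk it in reverse.
 */
static enum walk_status demo_broken_prev(size_t *visited)
{
    struct list_head head, a, b, c;

    head.next = &a;  a.next = &b;  b.next = &c;  c.next = &head;
    head.prev = &c;  c.prev = &b;  b.prev = &a;  a.prev = &head;

    b.prev = NULL;   /* simulated corruption */

    return walk_list(&head, 1, visited);
}
```

Calling demo_broken_prev() from a small main() visits c and b, then hits
the NULL prev pointer, prints the warning and returns WALK_CORRUPT
rather than WALK_OK.  The same check applies to the forward direction
with reverse set to zero.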