Re: Debugging a Stall or a Freeze

I posted a question earlier, and I have since confirmed that the code is running in an infinite loop. However, I discovered that the infinite loop is happening inside kernel code, specifically inside the kmalloc function. I know this is highly improbable, but I believe that this is the case.

The line of code that causes the infinite loop is the one below that starts with buf =

If I comment this line out, it does not hang. If I uncomment it, it does. Furthermore, none of the print statements after that line are printed, even though I have the line surrounded by print statements.

KMALLOC is a macro defined as
# define KMALLOC(a,b)    kmalloc((a),(b))

The last line printed is:
b0b0b0    4096
where 4096 is sizeof(struct buffer), the size of the buffer being allocated.


The get_buffer function is called quite a few times before the final call, which is the one that goes into an infinite loop. I am thinking there could be a memory leak. Could this also happen if memory is low?

Any advice on how to tackle this issue would be greatly appreciated.

Thanks.



static inline struct buffer *get_buffer(void)
{
    /* XXX:  __get_free_page should be used.  KMALLOC is for small stuff < PAGE_SIZE */
    struct buffer *buf;
    /* sizeof yields a size_t, so print it with %zu rather than %d */
    printk(KERN_EMERG " b0b0b0   %zu\n", sizeof(struct buffer));
    buf = KMALLOC(sizeof(struct buffer), GFP_KERNEL);
    print_entry_location();
    printk(KERN_EMERG " b1b1b1\n");
    //if (buf)  //i commented these out
    //buf->ptr = buf->data + INIT_LOC;  //i commented these out
    printk(KERN_EMERG " b1b1b1\n");
    print_exit_location();
    return NULL;  //i changed it to return null so the next function just exits
}



Additional info:
#define DATA_SIZE (PAGE_SIZE - sizeof(int))
struct buffer {
    char *ptr;
    char data[DATA_SIZE];
};
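
As a sanity check on the 4096 printed above, the size arithmetic can be reproduced in user space. This is only a sketch under stated assumptions: DEMO_PAGE_SIZE is a hypothetical stand-in for the kernel's PAGE_SIZE (assumed to be 4 KiB), int is assumed to be 4 bytes, and expected_buffer_size() and report_buffer_size() are helper names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 4 KiB pages. The kernel supplies the real PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096UL
#define DATA_SIZE (DEMO_PAGE_SIZE - sizeof(int))

struct buffer {
    char *ptr;
    char data[DATA_SIZE];
};

/* Hypothetical helper: expected struct size for a given pointer width,
 * assuming the struct is padded up to pointer alignment. */
static size_t expected_buffer_size(size_t ptr_size)
{
    size_t raw = ptr_size + DATA_SIZE;

    return (raw + ptr_size - 1) / ptr_size * ptr_size;
}

/* Print the actual size on whatever machine compiles this. */
static void report_buffer_size(void)
{
    assert(sizeof(struct buffer) == expected_buffer_size(sizeof(char *)));
    printf("sizeof(struct buffer) = %zu, page size = %lu\n",
           sizeof(struct buffer), DEMO_PAGE_SIZE);
}
```

With 4-byte pointers the struct comes to exactly 4096 bytes, which matches the printed value. With 8-byte pointers the same definition would come to 4104 bytes, slightly more than one page, so the "fits in a page" property depends on the pointer width.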


On Thu, Jul 25, 2013 at 2:23 PM, <Valdis.Kletnieks@xxxxxx> wrote:
On Thu, 25 Jul 2013 13:56:47 -0400, Salam Farhat said:

> When the guest OS freezes I get the following messages seen below. I would
> like to know what is a good approach for debugging this issue. I am not
> sure what a process stall is. Is that a deadlock?
>
>
> [  780.357876] BUG: soft lockup - CPU#0 stuck for 22s! [nautilus:1382]
> [  780.361658] Process nautilus (pid: 1382, ti=dca12000 task=dc837230 task.ti=d)
> [  780.361658]
> Stack:
> [  780.361658] Call Trace:
> [  780.361658] Code: 90 b8 43 64 03 c1 b9 40 64 03 c1 e9 49 ff ff ff 90 55 ba 0
> [  808.356372] BUG: soft lockup - CPU#0 stuck for 22s!

That's probably not a deadlock.  That's code stuck in an infinite loop,
probably while running in a non-interruptible state.

Too bad we didn't get a stack dump out of it, that would tell us what
code is hung in a loop.

For debugging deadlocks, turning on CONFIG_PROVE_LOCKING=y in the .config
is the best bet - that will fire an alert not only when the kernel *does*
lock up, but also if there's even a *possible* deadlock (for instance, if one
section takes 2 locks in the order A B, it will trigger if it ever spots
another chunk of code taking B and then A - even if that doesn't actually
trigger a deadlock because neither lock is held at the time).
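
The A-B versus B-A check described above can be illustrated with a toy user-space model. This is a minimal sketch and not how the kernel implements it: all toy_* names are hypothetical, it models a single thread of execution, and it only catches direct two-lock inversions, whereas the real validator works on lock classes and also follows longer dependency chains.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

enum { LOCK_A = 0, LOCK_B = 1 };

/* seen_before[a][b] is true once lock b has been taken while a was held. */
static bool seen_before[MAX_LOCKS][MAX_LOCKS];
static int held[MAX_LOCKS];
static int held_count;

/* Forget all recorded orderings and held locks. */
static void toy_reset(void)
{
    memset(seen_before, 0, sizeof(seen_before));
    held_count = 0;
}

/* Record an acquisition. Returns false if it inverts an order seen earlier,
 * even though nothing is actually deadlocked at this moment. */
static bool toy_acquire(int lock)
{
    bool ok = true;

    assert(lock >= 0 && lock < MAX_LOCKS && held_count < MAX_LOCKS);
    for (int i = 0; i < held_count; i++) {
        if (seen_before[lock][held[i]])
            ok = false;   /* an earlier path took held[i] while holding lock */
        seen_before[held[i]][lock] = true;
    }
    held[held_count++] = lock;
    return ok;
}

/* Drop a lock from the held set; the recorded orderings are kept. */
static void toy_release(int lock)
{
    for (int i = 0; i < held_count; i++) {
        if (held[i] == lock) {
            held[i] = held[--held_count];
            return;
        }
    }
}
```

Taking LOCK_A then LOCK_B, releasing both, and later taking LOCK_B then LOCK_A makes the last toy_acquire() return false: the inversion is reported from the recorded order alone, without the two paths ever having to run at the same time.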



_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
