Re: BUG: Bad page map in process udevd (anon_vma: (null)) in 2.6.38-rc4

Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> writes:

> On Fri, Feb 18, 2011 at 05:01:28PM -0200, Arnaldo Carvalho de Melo wrote:
>> On Fri, Feb 18, 2011 at 10:48:18AM -0800, Linus Torvalds wrote:
>> > This seems to be a fairly straightforward bug.
>> > 
>> > In net/ipv4/inet_timewait_sock.c we have this:
>> > 
>> >   /* These are always called from BH context.  See callers in
>> >    * tcp_input.c to verify this.
>> >    */
>> > 
>> >   /* This is for handling early-kills of TIME_WAIT sockets. */
>> >   void inet_twsk_deschedule(struct inet_timewait_sock *tw,
>> >                             struct inet_timewait_death_row *twdr)
>> >   {
>> >           spin_lock(&twdr->death_lock);
>> >           ..
>> > 
>> > and the intention is clearly that that spin_lock is BH-safe because
>> > it's called from BH context.
>> > 
>> > Except that clearly isn't true. It's called from a worker thread:
>> > 
>> > > stack backtrace:
>> > > Pid: 10833, comm: kworker/u:1 Not tainted 2.6.38-rc4-359399.2010AroraKernelBeta.fc14.x86_64 #1
>> > > Call Trace:
>> > >  [<ffffffff81460e69>] ? inet_twsk_deschedule+0x29/0xa0
>> > >  [<ffffffff81460fd6>] ? inet_twsk_purge+0xf6/0x180
>> > >  [<ffffffff81460f10>] ? inet_twsk_purge+0x30/0x180
>> > >  [<ffffffff814760fc>] ? tcp_sk_exit_batch+0x1c/0x20
>> > >  [<ffffffff8141c1d3>] ? ops_exit_list.clone.0+0x53/0x60
>> > >  [<ffffffff8141c520>] ? cleanup_net+0x100/0x1b0
>> > >  [<ffffffff81068c47>] ? process_one_work+0x187/0x4b0
>> > >  [<ffffffff81068be1>] ? process_one_work+0x121/0x4b0
>> > >  [<ffffffff8141c420>] ? cleanup_net+0x0/0x1b0
>> > >  [<ffffffff8106a65c>] ? worker_thread+0x15c/0x330
>> > 
>> > so it can deadlock with a BH happening at the same time, afaik.
>> > 
>> > The code (and comment) is all from 2005, it looks like the BH->worker
>> > thread has broken the code. But somebody who knows that code better
>> > should take a deeper look at it.
>> > 
>> > Added acme to the cc, since the code is attributed to him back in 2005
>> > ;). Although I don't know how active he's been in networking lately
>> > (seems to be all perf-related). Whatever, it can't hurt.
>> 
>> Original code is ANK's, I just made it possible to use with DCCP, and
>> yeah, the smiley is appropriate, something 6 years old and the world
>> around it changing continually... well, thanks for the git blame ;-)
>
> But yeah, your analysis seems correct, with the bug being introduced by
> one of these "world around it changing continually" issues: networking
> namespaces broke the rules of the game in its cleanup_net() routine.
> Adding Pavel to the CC list since it doesn't hurt ;-)

Which probably gets the bug back around to me.

I guess this must be one of those ipv4 cases where the cleanup simply
did not exist in the rmmod sense, so we had to invent it.

I think it was Daniel who did the timewait sockets.  I do remember
they were a real pain.

Would a bh_disable be sufficient?  I guess I should stop remembering and
look at the code now.
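
To make the question concrete, here is a rough userspace model of the
rule being discussed. It is not kernel code and not a proposed patch.
The stubs only borrow the names spin_lock(), local_bh_disable() and
local_bh_enable() (reading "bh_disable" as the latter pair), and
twsk_deschedule_model() and twsk_purge_model() are hypothetical
stand-ins for inet_twsk_deschedule() and inet_twsk_purge(). The model
assumes the caller is in process context, as in the kworker backtrace
above, and simply counts how often the lock is taken while a BH could
still run on the same CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel state; none of this is real kernel API. */
static int bh_disable_depth;      /* models the BH-disable nesting count  */
static bool death_lock_held;      /* models twdr->death_lock              */
static int unsafe_acquisitions;   /* lock taken while BH was still enabled */

static void local_bh_disable(void) { bh_disable_depth++; }
static void local_bh_enable(void)  { bh_disable_depth--; }

static void spin_lock(bool *lock)
{
        /*
         * In process context with BH enabled, a softirq could run on
         * this CPU right after the lock is taken and then spin on the
         * same lock forever. Record that window instead of deadlocking.
         */
        if (bh_disable_depth == 0)
                unsafe_acquisitions++;
        *lock = true;
}

static void spin_unlock(bool *lock) { *lock = false; }

/* Hypothetical stand-in for inet_twsk_deschedule(): plain spin_lock(). */
static void twsk_deschedule_model(void)
{
        spin_lock(&death_lock_held);
        /* ... unlink the timewait socket here ... */
        spin_unlock(&death_lock_held);
}

/* Hypothetical stand-in for the purge path run from a worker thread. */
static void twsk_purge_model(bool disable_bh)
{
        if (disable_bh)
                local_bh_disable();
        twsk_deschedule_model();
        if (disable_bh)
                local_bh_enable();
}
```

Calling twsk_purge_model(false) records one unsafe acquisition, while
twsk_purge_model(true) records none. That is only the model agreeing
with the theory; whether wrapping the call is really sufficient still
depends on what else that path does.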

Eric
