Re: DEBUG output in init causes fifo to be ignored...

On Mon, 16 Jun 2003, James Olin Oden wrote:

> On Mon, 16 Jun 2003, Bill Nottingham wrote:
> 
> > James Olin Oden (joden@xxxxxxxxxxxxxxxxxxxxx) said: 
> > > On Mon, 16 Jun 2003, Bill Nottingham wrote:
> > > 
> > > > James Olin Oden (joden@xxxxxxxxxxxxxxxxxxxxx) said: 
> > > > > and looked at things.  The last syscall I see  init in after 
> > > > > running the init 6, is:
> > > > > 
> > > > > 	futex(0x4212f1f4, FUTEX_WAIT, -1, NULL
> > > > 
> > > > What glibc are you running?
> > > >
> > > I am running:
> > > 
> > > 	glibc-2.3.2-27.9
> > > 
> > > I think this is the latest errata...I just downloaded all the errata (well
> > > what I did not have) today, and it was the most recent one.  BTW, I was 
> > > trying to recompile this version of glibc without stripping its symbols,
> > > and I get the following error:
> > 
> > Are you running the errata kernel as well?
> >
> I am now running with 2.4.20-18.9bigmem, and the problem is still 
> occurring.
>
Got it!  Here is what is happening when you run init 6 with the debug
output turned on in init:

	1) init reads the fifo when it gets around to it.
	2) It sees there is a request for a runlevel change (6),
	   and begins killing the appropriate processes.
	3) One of those processes will inevitably be a getty.
	   The getty goes away, and inevitably some children are
	   left behind.  They are given to init by the kernel, and the
	   kernel sends SIGCHLD to init.
	4) Meanwhile, back in init, it has been going through its
	   init_main loop again, and is printing debug output to
	   this effect and sending it to syslog.  When it sends the
	   message via the syslog call, glibc takes a lock (the futex)
	   so that nobody else can do this until it is done.
	5) While it is in the glibc code, init receives the SIGCHLD,
	   and in the child handler it calls log() again, set
	   to send output to syslog and the console.
	6) When it tries to send the child handler log message
	   to syslog, it enters the glibc code that blocks waiting
	   on the futex...and there it sits, since the holder of the
	   lock is the very code the handler interrupted.
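
To make the sequence above concrete, here is a minimal model of it.  This
is only a sketch and not init's or glibc's actual code: log_lock_held,
chld_handler, model_log and demo_reentrancy are all made-up names, and a
plain flag stands in for the real lock so that the program can report the
problem instead of hanging on it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock glibc holds inside syslog(). */
static volatile sig_atomic_t log_lock_held = 0;
/* Set by the handler if it ran while the lock was held. */
static volatile sig_atomic_t would_deadlock = 0;

/* Plays the role of init's child handler, which logs from signal context. */
static void chld_handler(int sig)
{
    (void)sig;
    if (log_lock_held)
        would_deadlock = 1;   /* the real code would block here forever */
}

/* Plays the role of the main loop logging a debug message. */
static void model_log(void)
{
    log_lock_held = 1;        /* "enter syslog": take the lock */
    raise(SIGCHLD);           /* the signal lands while the lock is held */
    log_lock_held = 0;        /* "leave syslog": release the lock */
}

/* Returns 1 if the handler ran inside the locked region, else 0. */
int demo_reentrancy(void)
{
    struct sigaction sa;

    sa.sa_handler = chld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGCHLD, &sa, NULL);

    would_deadlock = 0;
    model_log();
    return would_deadlock;
}
```

With a real, non-recursive lock in place of the flag, the handler would
wait for a release that can only happen after the handler itself returns.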

I patched init to block all signals while talking to syslog,
and this seems to have fixed it.  I will submit a patch via bugzilla
in the morning.  This probably only seems to happen on our dual-processor
machines because there the SIGCHLD can truly be delivered asynchronously
to init.  That is my theory anyway.
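
For illustration, blocking signals around a syslog call could look roughly
like the sketch below.  This is not the actual patch; the wrapper name
log_with_signals_blocked is hypothetical, and only the standard
sigfillset(), sigprocmask() and syslog() calls are used.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <syslog.h>

/* Hypothetical wrapper: block every signal, log, then restore the old mask. */
int log_with_signals_blocked(int priority, const char *msg)
{
    sigset_t all, old;

    sigfillset(&all);   /* SIGKILL and SIGSTOP cannot be blocked; they are ignored here */
    if (sigprocmask(SIG_BLOCK, &all, &old) != 0)
        return -1;

    syslog(priority, "%s", msg);   /* no handler can interrupt this call */

    /* Signals that arrived meanwhile (e.g. SIGCHLD) stay pending and are
       delivered once the previous mask is back in place. */
    return sigprocmask(SIG_SETMASK, &old, NULL);
}
```

The previous mask is restored rather than simply unblocking everything, so
a caller that already had some signals blocked keeps them blocked.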

This was problem number two, though, so I will be back on problem number
one soon (the internal buffer overflow).  I am pretty sure this is what is
happening in that scenario:

	1) init goes to print "Entering runlevel 4", only
	   the runlevel data is munged, causing a segfault in
	   syslog.
	2) The segv handler is kicked off and tries to log its message.
	   It can't, because the lock on the syslog code has not
	   been released.
	3) init hangs waiting on the futex.

This corruption, though, is much more infrequent (sometimes requiring
hundreds of reboots), but with the patch I did, I expect to see it happen
again, only this time we should get a core.
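
As a side note, a segv handler can avoid the syslog lock altogether by
sticking to async-signal-safe calls and then dying by the same signal.
The sketch below is one way that might look; it is not what init does,
the names segv_handler and demo_child_killed_by_segv are hypothetical, and
raise() in a forked child stands in for the real fault.  Whether a core
file is actually written depends on the core size limit in effect.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler: report with write(2) only, then die by the signal. */
static void segv_handler(int sig)
{
    static const char msg[] = "init: caught SIGSEGV\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)n;                /* nothing useful to do if the write fails */
    signal(sig, SIG_DFL);   /* restore the default action */
    raise(sig);             /* takes effect once the handler returns */
}

/* Returns 1 if a child that runs the handler is killed by SIGSEGV. */
int demo_child_killed_by_segv(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;

        sa.sa_handler = segv_handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGSEGV, &sa, NULL);
        raise(SIGSEGV);     /* stand-in for the real fault */
        _exit(0);           /* not reached: the re-raised signal kills us */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```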

Cheers...james
 
> Cheers...james 
> > Bill
> > 
> > 
> > _______________________________________________
> > Redhat-devel-list mailing list
> > Redhat-devel-list@xxxxxxxxxx
> > https://www.redhat.com/mailman/listinfo/redhat-devel-list
> > 
