On Wed, 2004-09-01 at 12:11, Michael Stenner wrote:
> On Wed, Sep 01, 2004 at 11:22:29AM -0400, David L. Parsley wrote:
> > > unless your repos are REAAAAAAAAAAAAAAALLY slow, I've found that a hang
> > > in yum is most often caused by an rpmdb lockup.
> > > -sv
> > >
> > This issue was traced to a problem with ftp repos. The hung yum process
> > has an open ftp control connection, with no packets queued for sending -
> > it just sits there keeping yum alive. There's no time out in the ftp
> > code, and no timeout in yum waiting for the ftp. My solution is to
> > release a new default yumconf in the next few weeks with all http repos,
> > because, I'm told, the http code gets 'more luvin'.
> > *shrug*
>
> Hehe. While I doubt very much that's a real quote, I'm pretty sure
> I'm the one who said it :) That's true. Frankly, I don't think
> yum/urlgrabber is the place to fix such quirky ftp behavior, but
> rather in ftplib (which is part of the standard python distro).
>
> However, this is a significant problem. If folks can provide me with
> a way to recreate such problems, I'd be happy to seriously look into
> them. Also, each of these messages leads me to think we should bump
> timeouts up on the priority list a bit.
>
> This may not be something timeouts can help though. How long will it
> hang? A few minutes? Hours? If the latter, it's probably broken ftp
> handling (possibly server, more likely client, conceivably both). In
> that case, timeouts won't help.

It hangs for _days_; from ps:

root 10267 10266 0 Aug19 ? 00:00:00 /usr/bin/python /usr/bin/yum -R 10 -e 0 -d 0 -y update yum

from netstat:

tcp 0 0 galadriel.alfred.:33657 some.ftp.server:40009 ESTABLISHED 10267/python
tcp 0 0 galadriel.alfred.:33656 some.ftp.server:ftp ESTABLISHED 10267/python

strace'ing it:

Process 10267 attached - interrupt to quit
read(8,

Tracing fd 8 via /proc, it corresponds to port 40009 - so that's the
data port, not the control port, as I thought.

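A read like the one strace shows will block indefinitely unless the socket has a timeout set or keepalive probes enabled. Below is a rough sketch of both at the plain-socket level. It is not the ftplib or urlgrabber code path, and the helper names (open_guarded_connection, read_or_give_up) are made up for illustration; the local listener just stands in for a data port that accepts a connection and then never sends anything.

```python
import socket


def open_guarded_connection(host, port, timeout=30.0):
    """Hypothetical helper: connect with a read timeout and TCP keepalive."""
    # create_connection applies `timeout` to the connect and leaves it set
    # on the returned socket, so later recv() calls are bounded as well.
    sock = socket.create_connection((host, port), timeout=timeout)
    # Ask the kernel to probe an idle peer; probe timing stays at OS defaults.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


def read_or_give_up(sock, nbytes=4096):
    """Return received bytes, or None if the peer stays silent past the timeout."""
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    # Stand-in for a stalled data port: accepts the connection, sends nothing.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()

    client = open_guarded_connection(host, port, timeout=0.5)
    peer, _ = server.accept()
    # No data ever arrives, so the read gives up instead of hanging.
    print(read_or_give_up(client))

    client.close()
    peer.close()
    server.close()
```

With a timeout in place, a stalled transfer surfaces as an error that the caller can report or retry, rather than a process that lingers for days. Keepalive on its own only helps once the kernel's idle interval has passed, which by default on Linux is two hours.
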
I've seen this happen with my own machines with two different ftp
servers now. It seems like the client should, at some point, send some
kind of 'hey, you gonna send my data?' packet (keepalive or something?),
whereupon the server should RST - 'who the hell are you?'. In the case
of the other ftp server, it reboots every morning. So, after being hung
for a day, the server has no clue yum is sitting there waiting for data.

Unfortunately, it's not something that seems to happen all the time, so
not easily/readily reproducible.

regards,
David