On 18 Aug 2003, Peter Peltonen wrote:

> Here's the results:
>
> [root@cuba os]# ls -la /tmp/XFree86-AGX-3.3.6-29.i386.rpm
> -rw-rw-r-- 1 ayo ayo 851914 Jun 23 2001
> /tmp/XFree86-AGX-3.3.6-29.i386.rpm
> [root@cuba os]# ls -la i386/XFree86-AGX-3.3.6-29.i386.rpm
> -rw-rw-r-- 1 ayo ayo 851914 Jun 23 2001
> i386/XFree86-AGX-3.3.6-29.i386.rpm
> [root@cuba os]# diff /tmp/XFree86-AGX-3.3.6-29.i386.rpm
> i386/XFree86-AGX-3.3.6-29.i386.rpm
> [root@cuba os]#
>
> So, no differences at the files at all. But one strange thing happened:
> Before (on Friday) this change --stats produced info saying that
> number of RH62 updates transferred was 106. On saturday the number
> was 1 and the following does 0. But still the file listing shows
> packages not being uptodate and still --stats is saying that over 400
> RH72 and RH73 updates are being transferred. But can it be really true,
> as the cron job takes only 7 minutes to execute? And the connection is
> only 1M SDSL.

7 minutes is approximately 42 MB at 1 Mbps (about 100 KB/sec). Even
compressed, one would expect 400 packages to occupy a lot more than
this. However, 7 minutes is also odd.

Looking over your rsync outputs, I'm inclined to believe the "total
bytes transferred" at the bottom of each transfer -- where it says that
you're transferring (for example) 2711783 bytes at around 31 KB/sec,
which is a reasonable fraction of 100 KB/sec allowing for all the file
comparisons and the fact that you get AT MOST 100 KB/sec on your line,
not a guarantee of QoS at 100 KB/sec.

This leads me to believe that you're being misled by something odd in
the way --stats works, but Duke's VPN is misbehaving and it is therefore
a bit difficult for me to test this. I think that its distinction
between "uptodate" and not "uptodate" has little to do with whether or
not the file is actually being transferred in full (as opposed to e.g.
just headers).
I say this because of its "matched data" entry, which appears to say
that all the data it "transferred" was "matched", so that in sum it
transferred basically nothing. This is consistent with its speedup of
"190", which of course is absurdly far beyond what any compression
algorithm could deliver.

You could try doing the transfer with just rsync -av (since the archive
says not to use -z). From my experience, the output in this format is
just the actual files transferred (and deleted, if you use --delete),
and you can "watch it work" and decide if you believe that files are
indeed being transferred at each step.

> > This too can be tested, both by the ritual above and by simply
> > skipping the nightly cron altogether and running the update by hand
> > the next day. If the hypothesis is approximately correct, the FIRST
> > time you run the update script after their nightly cron you'll get
> > an unnecessary update, but of course the second and so forth you
> > won't.
>
> I have now (on Monday) moved the mirroring script away from
> /etc/cron.daily and will run the script tomorrow by hand and see if
> anything changes.
>
> I also mv'ed XFree86-twm-4.1.0-49.i386.rpm from the RH72 updates dir
> to see if it produced same kind of new strange behaviour as happened
> with RH62 rsyncing.
>
> (Wednesday) The results are: Running the cron job by hand next day
> produces the same kind of behaviour as by cron normally does -- Packages
> end up being marked not uptodate. You can see the output or rsync at
>
> http://www.iki.fi/peter.peltonen/linux/wed.txt
>
> And now the RH62 updates are being transferred again!

So this suggests that my hypothesis is somehow correct.
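For what it's worth, the back-of-the-envelope numbers used earlier in this message can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the helper names are made up for illustration, "1 Mbps" is treated as roughly 100,000 bytes/sec as above, and the speedup figure is taken to be (as I understand rsync's summary line) the total file size divided by the bytes actually sent plus received. Pairing the 190 speedup with the 2711783-byte example assumes both came from the same run, which may not be the case.

```python
# Rough sanity checks for the figures quoted in this thread.
# All helper names are hypothetical; nothing here talks to rsync itself.

LINK_RATE = 100_000  # bytes/sec, assumed usable ceiling of a 1 Mbps line


def max_bytes(seconds, rate_bytes_per_sec):
    """Upper bound on the data that fits through the link in `seconds`."""
    return seconds * rate_bytes_per_sec


def transfer_seconds(nbytes, rate_bytes_per_sec):
    """Time needed to move `nbytes` at a steady rate."""
    return nbytes / rate_bytes_per_sec


def speedup(total_size, bytes_sent, bytes_received):
    """Total size of the files divided by bytes that crossed the wire."""
    return total_size / (bytes_sent + bytes_received)


# A 7-minute cron job can move at most about 42 MB on this line.
cap = max_bytes(7 * 60, LINK_RATE)

# The example transfer: 2711783 bytes at about 31 KB/sec (31,000 assumed).
secs = transfer_seconds(2_711_783, 31_000)

# ASSUMPTION: if speedup 190 belonged to that same transfer, the files
# "transferred" would total about this many bytes.
implied_total = 190 * 2_711_783

print(f"ceiling for 7 minutes: {cap / 1e6:.0f} MB")
print(f"example transfer time: {secs:.0f} s")
print(f"implied total size:    {implied_total / 1e6:.0f} MB")
```

If the assumption holds, the implied total is far more than the line could carry in 7 minutes, which fits the idea that almost all of the "transferred" data was matched locally rather than sent.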
Something is happening server side that causes rsync to attempt a
retransfer of certain files, but rsync is smart and doesn't actually do
the transfer -- perhaps comparing header and checksum or the like and
deciding not to do it (at the cost of some data transfer to find out).

> I really don't get this behaviour. There must be a logical explanation
> for all this, but what could it be?

  a) Server is doing "something" to their archive each night that causes
rsync to suspect that certain files have been changed.

  b) Client checks those files but discovers that the server is mistaken
and doesn't actually transfer them over.

  c) Client skips altogether those files that are still marked up to
date.

  d) Client dutifully gives full statistics on the "transfer", but
reports a large disparity between the "number of files transferred"
(hundreds, totalling more than 400 MB in size) and "total bytes
written/read" (a few MB).

See if rsync -av doesn't make its output a bit more rational.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@xxxxxxxxxxxx