Hello, On Sunday 03 June 2007, Sergei Shtylyov wrote: > Geller Sandor wrote: > Hello. > > >>>>> The log of a typical IDE reset is available here: > > >>>>> http://petra.hos.u-szeged.hu/~wildy/syslog.gz > > >>>>> This was the worst case: the IDE bus was resetted during the system > >>>>> boot. > > >>>> Could you try setting HPT374_ALLOW_ATA133_6 to 0 in > >>>> drivers/ide/pci/hpt366.c and rebuild/reboot the kernel? > > >>> Hi Sergei, > > >>> This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce > >>> the problem within a few seconds. With the above modification the > >>> machine > >>> is running under heavy disk I/O without problems since 30 minutes... > > >> Did it fix the problem for good? > > > It seems so far. There hasn't been any problem since I've applied the fix. > > >> Sergei, do we need to disallow UDMA6 completely on HPT734 or > >> is it only an issue with some problematic devices (=> blacklist)? > > Note that I didn't change what the old code was doing in this regard -- > although the HPT374 spec does *not* say that UDMA6 is supported, it had been > enabled. What have *really* changed for HPT374 was: > > - in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table > instead of the old one, matching 50 MHz (and so, severely underclocked); > > - in 2.6.2-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock. > > Disallowing UDMA6 would clock the chip with 50 MHz DPLL, howewer, the I felt inspired by this explanation (thanks!) and took a look at hpt374-opensource-v2.10 vendor driver. Here is something interesting: glbdata.c: ... #ifdef CLOCK_66MHZ ULONG setting370_66[] = { 0xd029d5e, 0xd029d26, 0xc829ca6, 0xc829c84, 0xc829c62, 0x2c829d2c, 0x2c829c66, 0x2c829c62, 0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62, 0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62, }; ... hpt366.c: ... static u32 sixty_six_base_hpt37x[] = { /* XFER_UDMA_6 */ 0x1c869c62, /* XFER_UDMA_5 */ 0x1cae9c62, /* 0x1c8a9c62 */ ... So we are using Dual ATA Clock for UDMA5 whereas vendor driver doesn't (the only other mode which uses Dual ATA Clock, in both drivers, is rarely used UDMA3). Thanks to this UDMA cycle time should be equal 22.5ns instead of 30ns (spec defines it at 16.8ns, ide_timings[] uses 20ns) when using 66 MHz DPLL clock. In theory everything should play nice but the data manual for HPT374 contains weird note that Dual ATA Clock is meant to implement ATA100 read and write at different clocks (there is no more explanation to this). Geller reported that the problems started after migrating from 2.6.20.7 to 2.6.21.1 (the affected disks are using UDMA5) and at the same time the driver switched from 33 MHz PCI to 66 MHz DPLL clock. Also the issue is completely fixed by using 50 MHz DPLL clock (UDMA5 timing for 50 MHz DPLL clock is 0x12848242 so UDMA cycle time equals 20ns and is smaller than the one obtained using 66 MHz DPLL clock). It all makes me wonder whether it is really safe to use Dual ATA Clock for UDMA5 and whether we should just be using "the offical" timing instead... Sergei? > original report claimed that something has changed to worse between 2.6.21.1 > and .3 but nothing changed in drivers/ide/ between those releases... It could be that md changes from 2.6.21.3 have influenced the situation (by putting more stress on disks etc)... Thanks, Bart - To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html