On 01/30/2012 08:52 PM, Graeme Russ wrote:
> Hi Mulyadi,
>
> On Tue, Jan 31, 2012 at 3:25 PM, Mulyadi Santosa
> <mulyadi.santosa@xxxxxxxxx> wrote:
>> Hi :)
>>
>> On Tue, Jan 31, 2012 at 10:00, Graeme Russ <graeme.russ@xxxxxxxxx> wrote:
>>> I _think_ I've solved the problem - SDRAM Voltage
>>
>> You got my respect man, you're really stubborn :)
>>
>>> The SDRAM I am using has a rated operating voltage of 1.5V +/- 0.075V.
>>> It looked like the motherboard BIOS had decided to use the upper limit
>>> of 1.575V when set to 'Auto'. I changed it to 'Manual' and set the
>>> SDRAM voltage to 1.5V and it's been running stably for the longest
>>> time it ever has.
>>
>> Thanks (again) for sharing. So this indeed has a tight relationship with
>> RAM "misbehaviour". How did you know? Do you inspect every piece of
>> your hardware? I am curious to know (maybe others are too).
>
> The first symptom was that the screen would cycle through solid colours, so
> naturally the video 'card' was the first to be blamed. Of course, the i5
> has the video built into the CPU, so the likelihood of a fault there is
> probably minimal, so the graphics driver was next in line
>
> So I installed an nVidia 8600GT and ran the nouveau driver (now I did get
> a glitch using this combo, but it wasn't a hang so I set that aside as a
> driver bug as well... could be related)
>
> I then installed an nVidia G210 (it's a much smaller and quieter card). I
> experienced one hang with this combination (right, now things are getting
> interesting...)
>
> In the meantime, I had tried fiddling with the IGPU voltage offset - no
> luck of course
>
> I removed my Linux hard drives and installed a spare hard drive and
> proceeded to install Windows 7 (using the on-chip Intel graphics). The
> machine hung once before the Windows 7 drivers were installed (promising)
>
> I then installed the Windows 7 drivers and started downloading 3DMark 2006
>
> ...Off to Australia Day Lunch with friends, back later...
>
> OK, so 3DMark downloaded OK and the machine was still running some 6 hours
> later :(
>
> Before getting a chance to install 3DMark, I had some other things to
> attend to... Glancing over: bright flashing colours!!! Linux had been
> exonerated :)
>
> So I took it back to the shop I bought it from (long argument about voiding
> the warranty by taking off the cover blah blah blah). They ran a stress
> test without failure. I suggested they run memtest, which was met by 'Ah,
> yeah, I should have thought of that first' (and _I_ voided the warranty!)
>
> So memtest failed, they put in another pair of memory modules and memtest
> failed again. Now the plot thickens... They put the old memory back and
> memtest passed! (what the!) Then they put the new memory in and, you guessed
> it, memtest passed! So the old memory goes back in and more stress testing
> begins.
>
> It was run all day, no failure. So I went in and picked up the machine to
> take back home on the assumption that the problem was the seating of the
> memory modules - well I couldn't really fault that analysis (another
> argument about voiding warranty, 'parts still in warranty, labour to run
> the tests not', and 'Oh, it failed under Linux, must be software related,
> not covered by warranty' Me: 'It failed before I opened the case',
> Them: 'doesn't matter, you opened the case') - Anyway, I got it back
> without paying anything, mumbling 'idiots' under my breath...
>
> So I put my Linux drives back in and ran it overnight. It survived and so
> I thought the problem was solved but alas, it failed ten minutes after
> waking it up in the morning... bugger!
>
> So RAM modules not the problem, that leaves CPU, Motherboard and PSU...
>
> So I switched out the PSU - Fail (really quickly this time... interesting)
>
> So that's when I decided to look at the SDRAM voltage - I looked up the
> datasheet for the RAM and compared it to the BIOS setting...
> Hmm, right at the upper limit of the spec'd DIMM voltage, so I set it to
> 1.5V manually.
>
> Since then it has not skipped a beat (only been ~18 hours, but that's way
> longer than previously)
>
> Now if it fails again, I'm just going to buy another motherboard. If that
> works, I'm going to have a _very_ interesting time with the shop I
> bought it from (after all, the parts are under warranty hardy, har har!)
>
>> NB: it could be a good lesson that system lock up might have
>> absolutely nothing to do with kernel.
>
> Verily :)
>
> Regards,
>
> Graeme
>

Thank you Graeme for sharing this experience. Amazing persistence! I
would not have gone this far. :)

Sometimes you have to doubt even the nuts and bolts :)

-Fredrick

> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies@xxxxxxxxxxxxxxxxx
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies