----- Original Message -----
> From: "Joe Landman" <joe.landman@xxxxxxxxx>

> Ok. I've had power supplies take down memory in the past. You might be
> hitting a bad memory cell courtesy of the PS.

Possibly, though see below.

> >> Do you have EDAC (or mcelog) on? Any errors from this?
> >
> > I don't have mcelog on, and no, the memory isn't registered, but a
> > 4-pass run of Memtest+ came up clean, so I'm speculating that the
>
> Not registered (which is just buffered), but ECC. ECC does a parity
> computation on some number of bits, and provides you a rough "good/bad"
> binary state of a particular area of memory. If the parity bits stored
> don't match what is computed on read, then odds are that something is
> wrong. It's not foolproof, but it's a good mechanism to catch potential
> errors.

Sure. In my experience, all ECC is registered/buffered, and no non-ECC
is, so I use it as shorthand. No possible chance this northbridge would
do ECC, no. :-)

> We've had cases where Memtest(*) reported everything fine, yet I was
> able to generate ECC errors in a few minutes by running a memory
> intensive app. Memtest does do some hardware exercise, but it's not
> usually hitting memory the way apps do. That difference can be
> significant. This is in part why the day job stopped using memtest for
> testing a number of years ago. We now run heavy duty electronic
> structure codes, and pi/e/... computations for burn in.

Fair point. I did also run the non-+ version of Memtest, which I
understand uses a different algorithm, and a couple other things I found
on the UBCD, so I'm *relatively* confident I don't have a running RAM
problem, though as you say, not 100%.

> > *continuing* problem isn't hardware; I'm pretty sure it was just the
> > failing 12V rail on the dying PS. I just have to clean up after it
> > enough to get *one* of these 2 drives cleaned off, then I can make a
> > new FS, and play musical files.
>
> Ahhh ...
>
> I was running a Plex server on an old machine for a while. I had to
> shift over to a beefier box with ECC ram and more CPUs. Right now my
> Plex server has 8 cpus, 24 GB RAM, and about 1TB of disk (old). Once
> you start doing recoding on the fly (multi-resolution output), you
> need the ram and processor power.
>
> >
> > Or, I may just go grab a 3TB external after all. :-)
>
> If you do that, and you still hit the error, chances are you might
> need to swap out your MB and CPU/RAM to something newer (not to mention the
> PS). I'd recommend ECC based systems if at all possible. Xfs can and
> will get very unhappy if bits are flipped on its data structures while
> you are making changes to the file system.

As it happens, Dave helped me clean up a mess 4 or 5 years ago, where a
*wire opened up* on the PATA cable, and all my data structures had a
missing bit. Ghod was that a mess. We did end up getting the drive.

So assuming I can reliably read the big drive (I have a 3T, a 2T, and a
1T all with different problems), I'm going to move all the files from it
to the new 3T I just bought, and then play musical files down the chain
one at a time.

Thank ghod the new season hasn't started yet. ;-)

Thanks for the help, Joe.

Oh, and the script that Stan was so worried about? It's all rm and mv
commands. 5859 of them.

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                       jra@xxxxxxxxxxx
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA               #natog                      +1 727 647 1274

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs