On Mon, 2009-01-05 at 10:31 +0100, Lennert Buytenhek wrote:
> On Thu, Dec 11, 2008 at 08:59:40AM +0000, Russell King wrote:
>
> > > > Hacky patch that mlock()s rpmdb's environment mmap(2)s, in order to
> > > > attempt to avoid spurious rpmdb corruption issues on Linux that seem
> > > > to be somehow related to pagein/pageout occurring.
> > >
> > > Ick.
> > >
> > > No.
> >
> > The relevant questions are:
> >
> > 1. which kernel version is this occurring with?
> >
> > 2. what device is the swap on?
> >
> > 3. which drivers are being used?
>
> This issue goes back to May 2007 or so, when I noticed db4 corruption
> when using rpm. I started digging into it, and ran into an issue with
> fsx-linux, which you reported to linux-arch@ here:
>
> http://marc.info/?l=linux-arch&m=118026300719763&w=2
>
> Unfortunately, the issue seen with fsx-linux turned out to be unrelated
> to the rpm db4 corruption issue.
>
> I applied the hacky rpm db4 database mlock() patch (which was never
> meant to go upstream!) to see if that would make it go away, and it
> seems to have made it go away, since I haven't managed to reproduce
> it since and haven't had any reports about it since.
>
> Without the mlock patch, the corruption would happen even in
> qemu-system-arm, an environment in which cache aliasing effects don't
> exist, so I abandoned the theory of it being a cache aliasing issue at
> the time and theorised that somehow a dirty page was having its dirty
> data discarded and an older stale copy being swapped back in, although
> I've never been able to prove this -- after spending a week
> unsuccessfully trying to hunt it down at the time I haven't spent any
> more time on it since. (And everyone I mentioned this to seemed to
> agree that shared writeable mmap() is icky and yuck and booh and "hard
> to get right", and that didn't increase my motivation to look into it
> further either.)
>
> I don't even know if it's an issue anymore in recent kernels. I don't
> even know if it's (assuming that it _is_ indeed a kernel issue) an
> arch/arm issue or a kernel-wide issue that simply occurs more often on
> ARM because ARM systems generally have less memory and therefore
> generally have more memory pressure. (There's certainly enough reports
> of rpm database corruption on x86 as well, but in almost every report
> there are more factors involved, such as people Ctrl-C'ing and killing
> rpm processes as they are manipulating the database, etc.)
>

I have been running a few systems with a lot of rpm activity without
this patch, and I haven't seen a problem with them (probably because of
the rpm 4.4 to 4.6 transition?). I have taken that patch out of the F10
rpm patches that I had submitted earlier.

>
> thanks,
> Lennert

Kedar.

_______________________________________________
fedora-arm mailing list
fedora-arm@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-arm