On Tue, 13 Dec 2011 13:35:46 -0500
"Peter M. Petrakis" <peter.petrakis@canonical.com> wrote:

> Do you by any chance have active LVM snapshots? If so how many and
> how long have they been provisioned for?

I forgot to mention that. There are now three snapshots, one on each
of three LVs, that have been provisioned for a few hours. These LVs
aren't in active use; they are backups, synced daily. So basically the
only activity is rsync once daily, bandwidth limited to be fairly slow.
One logical volume that locked up when I tried to write to it had a
snapshot.

Prior to this most recent rebuild, there were a lot of snapshots:
three on each of fifteen LVs. I replaced that VG with a fresh one and
it seemed to work for a while. I thought the problem was likely related
to lots of long-lived snapshots, but after I deleted all snapshots and
completely rebuilt the VG, the problem recurred very quickly, before
there were many snapshots and before there was a lot of IO to the
snapshots.

I realize I'm somewhat abusing snapshots; they weren't designed to be
long-lived. Therefore my "torture test" usage may reveal problems that
wouldn't happen often with very short-lived snapshots. Another similar
server has more snapshots on more LVs, running the same rsyncs without
obvious trouble.

I should also have mentioned that sequential writes to one LV at a time
don't seem to trigger the problem. I copied the whole VG one LV at a
time with:

    dd if=/dev/oldvg/lv1 of=/dev/newvg/lv1

Copying the entire LVs sequentially showed no problems. Later, when I
tried to rsync to the LVs, the problem showed itself.

> >> filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|",
> >> "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
> >
> Is it intentional to include sd devices? Just because the MD uses
> them doesn't mean you have to make allowances for them here.

Some /dev/sdX devices were used in the past, but no longer, and I have
now removed sd.* and etherd from the filter.
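
For illustration only: assuming the remaining entries were left exactly
as quoted above (the edited file itself is not shown in this message),
the filter line after dropping the sd.* and etherd patterns would read
something like:

```
# Hypothetical result of removing the sd.* and etherd accept rules;
# the md accept rule and the reject rules are unchanged from the quote.
filter = [ "a|^/dev/md.*|", "r|^/dev/ram.*|", "r|block|", "r/.*/" ]
```

With this form only the md devices are accepted and the trailing
"r/.*/" rejects everything else.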

> > < locking_dir = "/var/lock/lvm"
> > ---
> >> locking_dir = "/dev/shm"
>
> Why?

This was changed AFTER the problem started, because a comment in the
file says:

    # Local non-LV directory that holds file-based locks while commands
    # are in progress.

Because /var/lock is on an LV, I tried switching it to a directory that
will never be on an LV. That didn't seem to have any effect.

--
Ray Morris
support@bettercgi.com

Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/

Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/

Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/