On Wed, 31.12.2003 at 18:25, Kevin Corry wrote:

> As Joe mentioned last week, we've been tossing around some ideas for changes
> and bug-fixes to the DM snapshot code. So here's a first crack at a to-do
> list. Perhaps Joe can put a copy of this on his web site. Anyone else with
> comments or ideas, feel free to add to this list.

Thank you. I'm currently going through the code myself because I had a server crash when the backup script tried to take snapshots (unfortunately I couldn't see the oops, and I've stopped the backup script for now). This apparently didn't happen with an older version of the snapshot code, but perhaps that was just luck. ;)

I can reproduce massive data corruption here when taking snapshots with reiserfs (on the origin device!), so reiserfs probably caused the oops. I want to investigate this further.

    while true; do
        cp -r /usr/src/linux-2.6.0/drivers/net /data/
        lvcreate -s -L 300M -n snap-data /dev/vg/data
        sync
        mount /dev/vg/snap-data /mnt/tmp
        rm -Rf /data/net
        umount /mnt/tmp
        lvremove -f /dev/vg/snap-data
    done

This makes reiserfs go crazy.

> 1. Reads to the snapshot
>
> Currently, a read for the snapshot is only submitted to the cow device when
> there's a completed-exception. If there's a pending-exception, the request is
> still sent to the origin device. Instead, the request should be queued on the
> pending-exception, just like for the write requests.

I also noticed this when looking through the code. Perhaps this is causing the trouble I'm seeing. I want to experiment a bit and see whether changing this fixes the problem. (Until now I was busy tracking down a bug in dm-crypt that someone was seeing; I think I found it. It's a nasty race condition which I can't reproduce here, but it is definitely a big bug...)

I also noticed that the snapshot code is reordering the BIOs: it uses something like a stack when queueing single bios instead of a FIFO.
And while blocks are being flushed, even the generic code allows new bios to be submitted in parallel instead of delaying them as well. Jens Axboe confirmed that this will cause trouble once there are BIO users that submit barriers.
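
To illustrate the reordering I mean, here is a minimal user-space sketch. It is not the dm-snapshot code: `struct req`, `queue_lifo`, `struct req_fifo`, `fifo_init` and `queue_fifo` are hypothetical names made up for this example, and `struct req` only stands in for a queued bio. It just shows that pushing each request onto the head of a singly linked list hands them back in reverse order, while appending at the tail keeps submission order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued request; NOT the kernel's struct bio. */
struct req {
    int id;
    struct req *next;
};

/* Stack-style queueing: every new request becomes the head of the list,
 * so walking the list from the head later yields the requests in reverse
 * submission order. */
static void queue_lifo(struct req **head, struct req *r)
{
    assert(r != NULL);
    r->next = *head;
    *head = r;
}

/* FIFO queueing: remember where the next request has to be linked in,
 * so requests are handed back in the order they were submitted. */
struct req_fifo {
    struct req *head;
    struct req **tail;  /* address of the pointer that gets the next request */
};

static void fifo_init(struct req_fifo *q)
{
    q->head = NULL;
    q->tail = &q->head;
}

static void queue_fifo(struct req_fifo *q, struct req *r)
{
    assert(r != NULL);
    r->next = NULL;
    *q->tail = r;        /* link behind the current last element */
    q->tail = &r->next;  /* the new element is now the last one */
}
```

With three requests submitted as 1, 2, 3, the stack variant would be walked as 3, 2, 1 when the list is flushed, and the FIFO variant as 1, 2, 3. Only the second ordering stays meaningful once a request in the middle is supposed to act as a barrier.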