Some of these patches are incomplete, but I'm hoping for comments on them before I finish them, to make sure I'm headed in the right direction. (Patches apply to 2.6.18.1.)

I'm providing all the patches in my set (even though some are already on their way upstream) so that later patches apply cleanly. I have not included any read-balancing work, and these patches do not depend on it being there.

These four patches have already been submitted to the list and are waiting for inclusion upstream.
  dm-multipath-add_path_order_fix.patch
  dm-raid1-log_function_enhancement_and_name_change.patch
  dm-raid1-reset_sync_search_on_resume.patch
  dm-raid1-proper_suspend_fix.patch

This patch adds the ability to report on the status line that the log device has failed. This is useful to user-space when responding to failures.
  dm-raid1-log_fault_detection.patch

This next patch fixes something which may be controversial, which is why it is not part of the previous patch. When the log is resumed, it can return an error if it failed to read the log device. However, the mirror can do nothing about it because the target resume function returns 'void'. Since the mirror will proceed regardless of a log resume failure, we have the log assume all regions are out-of-sync - just as you'd expect from a mirror with no persistent log.
  dm-raid1-log_fault_detection_part2.patch

This patch is already in 19-rc4-mm1.
  dm-raid1-status_line_fix.patch

This patch adds new options to the mirror mapping/constructor table. Specifically, it adds the ability to specify the 'handle_errors' feature. Other features can be specified here in the future as well (like 'async'). Note that this is a departure from having 'block_on_error' as a log argument, as was previously the case. I believe this feature is better kept in the mirroring code than in the logging code.
  dm-raid1-features_addition_to_table.patch

This patch was originally developed by NEC. It ensures that if a resync of a region fails, we don't mark the region clean - which could otherwise allow reads from unsynced regions. I've changed some things to support backwards compatibility. We still want correct behavior, but if we are ignoring failures as before, we must still mark the region's sync as finished. (This is necessary for pvmove to complete properly when moving off of a faulty device.)
  dm-raid1-handle_resync_failures.patch

This patch adds handling for the case where a mirror device dies on write. The log must reflect that the region is not properly in-sync. We also add the ability to print the status of mirror devices - allowing user-space to take appropriate action. Again, it is important to maintain backwards compatibility for those who didn't specify 'handle_errors' as a feature to the mirror. Note that this patch is not complete until the ability to requeue requests to device-mapper core is added (more detail in the patch header).
  dm-raid1-write_fault_tolerance.patch

This patch adds the ability to handle read failures. That is, to choose another device if the region is in-sync.
  dm-raid1-read_fault_tolerance.patch

The following patches are incomplete, but should provide useful insight for those interested. (BTW, if anyone is really good with netlink, I'd love the help.)
  dm-raid1-add_cluster_ability.patch
  dm-raid1-version-bump.patch
  dm-raid1-cluster_logging.patch

More comments can be found in the patch headers. All comments are welcome.

 brassow
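
For readers who want a concrete picture of two of the behaviors described above (the log assuming every region is out-of-sync when its resume fails, and a read being redirected to another device only when the region is in-sync), here is a rough user-space sketch in C. It is not taken from the patches: the struct, the function names, the fixed array sizes and the "default device" field are all hypothetical, and the real kernel code is organized differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_MIRRORS 2
#define NR_REGIONS 8

/* Hypothetical toy model; none of these names come from the patches. */
struct toy_mirror_set {
	bool dev_failed[NR_MIRRORS];      /* set once a device has returned an error */
	bool region_in_sync[NR_REGIONS];  /* what the log currently believes */
	size_t default_dev;               /* device normally used for reads */
};

/*
 * Log resume. If the log device could not be read, nothing it says can be
 * trusted, so treat every region as out-of-sync, as a mirror without a
 * persistent log would.
 */
static void toy_log_resume(struct toy_mirror_set *ms, bool log_read_ok)
{
	if (log_read_ok)
		return;
	for (size_t r = 0; r < NR_REGIONS; r++)
		ms->region_in_sync[r] = false;
}

/*
 * Pick a device for a read of the given region.
 * Returns a device index, or -1 if no device can safely serve the read.
 */
static int toy_choose_read_dev(const struct toy_mirror_set *ms, size_t region)
{
	assert(region < NR_REGIONS);

	/* Normal case: the default device is healthy. */
	if (!ms->dev_failed[ms->default_dev])
		return (int)ms->default_dev;

	/* Another device is only a valid source if the region is in-sync. */
	if (!ms->region_in_sync[region])
		return -1;

	for (size_t d = 0; d < NR_MIRRORS; d++)
		if (!ms->dev_failed[d])
			return (int)d;

	return -1;
}
```

In this toy model, a failed default device sends reads of an in-sync region to the surviving device, while a failed log resume leaves no region that is safe to read from an alternate device until it has been resynced.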
Attachment:
2.6.18.x-mirror_patches-11022006.tgz
Description: application/compressed-tar
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel