Hi Krutika, Leo,

Sounds promising. I will test this too, and report back tomorrow (or
maybe sooner, if corruption occurs again).

-- 
Sander

On 27-03-19 10:00, Krutika Dhananjay wrote:
> This is needed to prevent any inconsistencies stemming from buffered
> writes/caching of file data during live VM migration.
> Besides, for Gluster to truly honor direct-io behavior in qemu's
> 'cache=none' mode (which is what oVirt uses), one needs to turn on
> performance.strict-o-direct and disable remote-dio.
>
> -Krutika
>
> On Wed, Mar 27, 2019 at 12:24 PM Leo David <leoalex@xxxxxxxxx> wrote:
>
>     Hi,
>     I can confirm that after setting these two options, I haven't
>     encountered disk corruption anymore.
>     The downside is that, at least for me, it had a pretty big impact
>     on performance: running fio tests inside the VMs, iops really
>     went down.
>
>     On Wed, Mar 27, 2019, 07:03 Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
>
>         Could you enable strict-o-direct and disable remote-dio on the
>         src volume as well, restart the VMs on "old" and retry the
>         migration?
>
>         # gluster volume set <VOLNAME> performance.strict-o-direct on
>         # gluster volume set <VOLNAME> network.remote-dio off
>
>         -Krutika
>
>         On Tue, Mar 26, 2019 at 10:32 PM Sander Hoentjen <sander@xxxxxxxxxxx> wrote:
>
>             On 26-03-19 14:23, Sahina Bose wrote:
>             > +Krutika Dhananjay and gluster ml
>             >
>             > On Tue, Mar 26, 2019 at 6:16 PM Sander Hoentjen <sander@xxxxxxxxxxx> wrote:
>             >> Hello,
>             >>
>             >> tl;dr We have disk corruption when doing live storage
>             >> migration on oVirt 4.2 with gluster 3.12.15. Any idea why?
>             >>
>             >> We have a 3-node oVirt cluster that is both compute and
>             >> gluster-storage. The manager runs on separate hardware. We
>             >> were running out of space on the existing volume, so we
>             >> added another Gluster volume that is bigger, put a storage
>             >> domain on it and then migrated VMs to it with LSM. After
>             >> some time, we noticed that (some of) the migrated VMs had
>             >> corrupted filesystems. After moving everything back with
>             >> export-import to the old domain where possible, and
>             >> recovering from backups where needed, we set off to
>             >> investigate this issue.
>             >>
>             >> We are now at the point where we can reproduce this issue
>             >> within a day. What we have found so far:
>             >> 1) The corruption occurs at the very end of the replication
>             >> step, most probably between START and FINISH of
>             >> diskReplicateFinish, before the START merge step.
>             >> 2) In the corrupted VM, at some place where data should be,
>             >> this data is replaced by zeros. This can be file contents, a
>             >> directory structure, or whatever.
>             >> 3) The source gluster volume has different settings than
>             >> the destination (mostly because the defaults were different
>             >> at creation time):
>             >>
>             >> Setting                        old(src)  new(dst)
>             >> cluster.op-version             30800     30800 (the same)
>             >> cluster.max-op-version        31202     31202 (the same)
>             >> cluster.metadata-self-heal     off       on
>             >> cluster.data-self-heal         off       on
>             >> cluster.entry-self-heal        off       on
>             >> performance.low-prio-threads   16        32
>             >> performance.strict-o-direct    off       on
>             >> network.ping-timeout           42        30
>             >> network.remote-dio             enable    off
>             >> transport.address-family       -         inet
>             >> performance.stat-prefetch      off       on
>             >> features.shard-block-size      512MB     64MB
>             >> cluster.shd-max-threads        1         8
>             >> cluster.shd-wait-qlength       1024      10000
>             >> cluster.locking-scheme         full      granular
>             >> cluster.granular-entry-heal    no        enable
>             >>
>             >> 4) To test, we migrate some VMs back and forth. The
>             >> corruption does not occur every time. So far it has only
>             >> occurred from old to new, but we don't have enough data
>             >> points to be sure about that.
>             >>
>             >> Does anybody have an idea what is causing the corruption?
>             >> Is this the best list to ask, or should I ask on a Gluster
>             >> list? I am not sure whether this is oVirt-specific or
>             >> Gluster-specific, though.
>             > Do you have logs from the old and new gluster volumes? Any
>             > errors in the new volume's fuse mount logs?
>
>             Around the time of corruption I see the message:
>             The message "I [MSGID: 133017] [shard.c:4941:shard_seek]
>             0-ZoneA_Gluster1-shard: seek called on
>             7fabc273-3d8a-4a49-8906-b8ccbea4a49f. [Operation not
>             supported]" repeated 231 times between [2019-03-26
>             13:14:22.297333] and [2019-03-26 13:15:42.912170]
>
>             I also see this message at other times, when I don't see the
>             corruption occur, though.
>
>             -- 
>             Sander
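
For anyone retrying this, it may help to confirm that the two options
actually took effect on a volume before restarting the VMs and
re-attempting the migration. A sketch, using gluster's option query
(<VOLNAME> is a placeholder for the volume name):

# gluster volume get <VOLNAME> performance.strict-o-direct
# gluster volume get <VOLNAME> network.remote-dio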
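
And a minimal sketch of the kind of in-VM fio run Leo mentions for
gauging the iops impact; the job name, file path, size, and tunables
below are illustrative only, not the exact test he ran:

# fio --name=odirect-randwrite --filename=/var/tmp/fio.dat --size=1G \
      --ioengine=libaio --rw=randwrite --bs=4k --direct=1 --iodepth=32 \
      --runtime=60 --time_based --group_reporting
  (parameters above are examples only; --direct=1 bypasses the guest
  page cache, so the run measures the storage path rather than guest
  memory, which is what matters when comparing before/after
  strict-o-direct)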