On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
> It would be better to use sharding over stripe for your VM use case. It offers better distribution and utilisation of bricks and better heal performance. And it is well tested.

Basically, the "striping" feature is deprecated and "sharding" is its improved replacement. I expect to see "striping" dropped completely in the next major release.

Niels

> A couple of things to note before you do that:
> 1. Most of the bug fixes in sharding have gone into 3.7.8, so it is advised that you use 3.7.8 or above.
> 2. When you enable sharding on a volume, already existing files in the volume do not get sharded. Only files created after sharding is enabled will be. If you do want to shard the existing files, you would need to cp them to a temporary name within the volume and then rename them back to the original file name.
>
> HTH,
> Krutika
>
> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan <mahdi.adnan@xxxxxxxxxxxxxxxxx> wrote:
>
> > I couldn't find anything related to cache in the HBAs.
> > What logs are useful in my case? I see only the brick logs, which contain nothing during the failure.
> >
> > ###
> > [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed [File exists]
> > [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed [File exists]
> > [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash: rmdir issued on /.trashcan/, which is not permitted
> > [2016-03-13 18:07:55.027635] I [MSGID: 115056] [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) ==> (Operation not permitted) [Operation not permitted]
> > [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
> > [2016-03-13 18:11:34.353463] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version: 3.7.8)
> > [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
> > [2016-03-13 18:11:34.591173] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version: 3.7.8)
> > ###
> >
> > ESXi just keeps telling me:
> > "Cannot clone T: The virtual disk is either corrupted or not a supported format.
> > error
> > 3/13/2016 9:06:20 PM
> > Clone virtual machine
> > T
> > VCENTER.LOCAL\Administrator"
> >
> > My setup is two servers with a floating IP controlled by CTDB, and my ESXi server mounts the NFS export via the floating IP.
> >
> > On 03/13/2016 08:40 PM, pkoelle wrote:
> >
> >> On 13.03.2016 at 18:22, David Gossage wrote:
> >>
> >>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan <mahdi.adnan@xxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>>> My HBAs are LSISAS1068E, and the filesystem is XFS. I tried EXT4 and it did not help.
> >>>> I have created a striped volume on one server with two bricks; same issue,
> >>>> and I tried a replicated volume with just sharding enabled; same issue. As soon as I disable sharding it works just fine, so neither sharding nor striping works for me.
> >>>> I did follow up on some of the threads in the mailing list and tried some of the fixes that worked for others; none worked for me. :(
> >>>
> >>> Is it possible the LSI has write-cache enabled?
> >>
> >> Why is that relevant? Even the backing filesystem has no idea if there is a RAID or write cache or whatever. There are blocks and sync(), end of story.
> >> If you lose power and screw up your recovery, or do funky stuff with SAS multipathing, that might be an issue with a controller cache. AFAIK that's not what we are talking about.
> >>
> >> I'm afraid that unless the OP has some logs from the server, a reproducible test case, or a backtrace from the client or server, this isn't getting us anywhere.
> >>
> >> cheers
> >> Paul
> >>
> >>> On 03/13/2016 06:54 PM, David Gossage wrote:
> >>>>
> >>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <mahdi.adnan@xxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>> Okay, so I have enabled sharding on my test volume and it did not help. Stupidly enough, I have enabled it on a production volume (Distributed-Replicate) and it corrupted half of my VMs.
> >>>>> I have updated Gluster to the latest version and nothing seems to have changed in my situation.
> >>>>> Below is the info of my volume:
> >>>>
> >>>> I was pointing at the settings in that email as an example of fixing corruption. I wouldn't recommend enabling sharding if you haven't gotten the base working yet on that cluster. What HBAs are you using, and what is the filesystem layout for the bricks?
> >>>>
> >>>>> Number of Bricks: 3 x 2 = 6
> >>>>> Transport-type: tcp
> >>>>> Bricks:
> >>>>> Brick1: gfs001:/bricks/b001/vmware
> >>>>> Brick2: gfs002:/bricks/b004/vmware
> >>>>> Brick3: gfs001:/bricks/b002/vmware
> >>>>> Brick4: gfs002:/bricks/b005/vmware
> >>>>> Brick5: gfs001:/bricks/b003/vmware
> >>>>> Brick6: gfs002:/bricks/b006/vmware
> >>>>> Options Reconfigured:
> >>>>> performance.strict-write-ordering: on
> >>>>> cluster.server-quorum-type: server
> >>>>> cluster.quorum-type: auto
> >>>>> network.remote-dio: enable
> >>>>> performance.stat-prefetch: disable
> >>>>> performance.io-cache: off
> >>>>> performance.read-ahead: off
> >>>>> performance.quick-read: off
> >>>>> cluster.eager-lock: enable
> >>>>> features.shard-block-size: 16MB
> >>>>> features.shard: on
> >>>>> performance.readdir-ahead: off
> >>>>>
> >>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
> >>>>>
> >>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <mahdi.adnan@xxxxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>>> Both servers have HBAs, no RAID, and I can set up a replicated or dispersed volume without any issues.
> >>>>>> The logs are clean, and when I tried to migrate a VM and got the error, nothing showed up in the logs.
> >>>>>> I tried mounting the volume on my laptop and it mounted fine, but if I use dd to create a data file it just hangs and I can't cancel it, and I can't unmount it or anything; I just have to reboot.
> >>>>>> The same servers have another volume on other bricks in a distributed-replicate layout, and it works fine.
> >>>>>> I have even tried the same setup in a virtual environment (created two VMs, installed Gluster, and created a replicated striped volume) and again the same thing: data corruption.
> >>>>>
> >>>>> I'd look through the mail archives for a topic called "Shard in Production", I think. The shard portion may not be relevant, but it does discuss certain settings that had to be applied to avoid corruption with VMs. You may want to try disabling performance.readdir-ahead also.
> >>>>>
> >>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
> >>>>>>
> >>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <mahdi.adnan@xxxxxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> Thanks David,
> >>>>>>> My settings are all defaults; I have just created the pool and started it.
> >>>>>>> I have set the settings as you recommended and it seems to be the same issue:
> >>>>>>>
> >>>>>>> Type: Striped-Replicate
> >>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
> >>>>>>> Status: Started
> >>>>>>> Number of Bricks: 1 x 2 x 2 = 4
> >>>>>>> Transport-type: tcp
> >>>>>>> Bricks:
> >>>>>>> Brick1: gfs001:/bricks/t1/s
> >>>>>>> Brick2: gfs002:/bricks/t1/s
> >>>>>>> Brick3: gfs001:/bricks/t2/s
> >>>>>>> Brick4: gfs002:/bricks/t2/s
> >>>>>>> Options Reconfigured:
> >>>>>>> performance.stat-prefetch: off
> >>>>>>> network.remote-dio: on
> >>>>>>> cluster.eager-lock: enable
> >>>>>>> performance.io-cache: off
> >>>>>>> performance.read-ahead: off
> >>>>>>> performance.quick-read: off
> >>>>>>> performance.readdir-ahead: on
> >>>>>>
> >>>>>> Is there a RAID controller perhaps doing any caching?
> >>>>>> Are any errors being reported in the Gluster logs during the migration process? Since they aren't in use yet, have you tested making just mirrored bricks using different pairings of servers, two at a time, to see if the problem follows a certain machine or network port?
> >>>>>>
> >>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
> >>>>>>>
> >>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <mahdi.adnan@xxxxxxxxxxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>>> Dears,
> >>>>>>>> I have created a replicated striped volume with two bricks and two servers, but I can't use it because when I mount it in ESXi and try to migrate a VM to it, the data gets corrupted.
> >>>>>>>> Does anyone have any idea why this is happening?
> >>>>>>>>
> >>>>>>>> Dell 2950 x2
> >>>>>>>> Seagate 15k 600GB
> >>>>>>>> CentOS 7.2
> >>>>>>>> Gluster 3.7.8
> >>>>>>>>
> >>>>>>>> Appreciate your help.
> >>>>>>>
> >>>>>>> Most reports of this I have seen end up being settings related. Post your gluster volume info. Below are what I have seen as the most commonly recommended settings.
> >>>>>>> I'd hazard a guess you may have the read-ahead cache or prefetch on.
> >>>>>>>
> >>>>>>> quick-read=off
> >>>>>>> read-ahead=off
> >>>>>>> io-cache=off
> >>>>>>> stat-prefetch=off
> >>>>>>> eager-lock=enable
> >>>>>>> remote-dio=on
> >>>>>>>
> >>>>>>>> Mahdi Adnan
> >>>>>>>> System Admin
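
The option names David lists above correspond to the volume options that show up in the "Options Reconfigured" sections earlier in the thread. A minimal sketch of applying them with the gluster CLI, assuming a replica volume named "vmware" (the volume name is only an example) running Gluster 3.7.x:

    # Commonly recommended options for VM image workloads, as discussed above
    gluster volume set vmware performance.quick-read off
    gluster volume set vmware performance.read-ahead off
    gluster volume set vmware performance.io-cache off
    gluster volume set vmware performance.stat-prefetch off
    gluster volume set vmware cluster.eager-lock enable
    gluster volume set vmware network.remote-dio enable

    # Sharding, the replacement for stripe; 16MB matches the shard-block-size
    # used earlier in the thread
    gluster volume set vmware features.shard on
    gluster volume set vmware features.shard-block-size 16MB

    # Verify the options took effect
    gluster volume info vmware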
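
Krutika's note near the top applies to VM images that already existed on the volume before features.shard was enabled: they only become sharded once they are rewritten under a new name. A rough sketch from a FUSE mount of the volume, assuming it is mounted at /mnt/vmware and the disk image is vm01-flat.vmdk (both paths are hypothetical), with the VM powered off while you do this:

    cd /mnt/vmware/vm01
    # The copy is a newly created file, so it is written out as shards
    cp vm01-flat.vmdk vm01-flat.vmdk.sharded
    # Rename it back over the original file name, as described above
    mv vm01-flat.vmdk.sharded vm01-flat.vmdk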
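
David's suggestion of testing plain mirrored bricks in different server pairings can be done with throwaway replica 2 volumes. A sketch with hypothetical volume and brick names (adjust to your environment):

    # Create and start a simple two-brick replica volume across one pair of servers
    gluster volume create testpair replica 2 gfs001:/bricks/test/brick1 gfs002:/bricks/test/brick1
    gluster volume start testpair

    # Mount it, repeat the VM copy / dd test, then tear it down and try another pairing
    gluster volume stop testpair
    gluster volume delete testpair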