Hi Mauro,
The rebalance code started using fallocate in 3.10.5 (https://bugzilla.redhat.com/show_bug.cgi?id=1473132) which works fine on replicated volumes. However, we neglected to test this with EC volumes on 3.10. Once we discovered the issue, the EC fallocate implementation was made available in 3.11.
At this point, I'm afraid the only option I see is to upgrade to at least 3.12.
@Sunil, do you have anything to add?
Regards,
Nithya
On 13 September 2018 at 18:34, Mauro Tridici <mauro.tridici@xxxxxxx> wrote:
Hi Nithya,

thank you for involving the EC group. I will wait for your suggestions.

Regards,
Mauro

On 13 Sep 2018, at 13:38, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:

This looks like an issue because rebalance switched to using fallocate, which EC did not have implemented at that point.

@Pranith, @Ashish, which version of gluster had support for fallocate in EC?

Regards,
Nithya

On 12 September 2018 at 19:24, Mauro Tridici <mauro.tridici@xxxxxxx> wrote:

Dear All,

I recently added 3 servers (each one with 12 bricks) to an existing Gluster Distributed Disperse Volume.
Volume extension has been completed without error and I already executed the rebalance procedure with fix-layout option with no problem.
I just launched the rebalance procedure without fix-layout option, but, as you can see in the output below, I noticed that some failures have been detected.

[root@s01 glusterfs]# gluster v rebalance tier2 status
Node        Rebalanced-files   size     scanned   failures   skipped   status        run time in h:m:s
---------   ----------------   ------   -------   --------   -------   -----------   -----------------
localhost   71176              3.2MB    2137557   1530391    8128      in progress   13:59:05
s02-stg     0                  0Bytes   0         0          0         completed     11:53:28
s03-stg     0                  0Bytes   0         0          0         completed     11:53:32
s04-stg     0                  0Bytes   0         0          0         completed     0:00:06
s05-stg     15                 0Bytes   17055     0          18        completed     10:48:01
s06-stg     0                  0Bytes   0         0          0         completed     0:00:06
Estimated time left for rebalance to complete : 0:46:53
volume rebalance: tier2: success

In the volume rebalance log file, I detected a lot of error messages similar to the following ones:

[2018-09-12 13:15:50.756703] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: Create dst failed on - tier2-disperse-6 for file - /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-12_grid.nc
[2018-09-12 13:15:50.757025] E [MSGID: 109023] [dht-rebalance.c:2733:gf_defrag_migrate_single_file] 0-tier2-dht: migrate-data failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-12_grid.nc
[2018-09-12 13:15:50.759183] E [MSGID: 109023] [dht-rebalance.c:844:__dht_rebalance_create_dst_file] 0-tier2-dht: fallocate failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-09_grid.nc on tier2-disperse-9 (Operation not supported)
[2018-09-12 13:15:50.759206] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: Create dst failed on - tier2-disperse-9 for file - /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-09_grid.nc
[2018-09-12 13:15:50.759536] E [MSGID: 109023] [dht-rebalance.c:2733:gf_defrag_migrate_single_file] 0-tier2-dht: migrate-data failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-09_grid.nc
[2018-09-12 13:15:50.777219] E [MSGID: 109023] [dht-rebalance.c:844:__dht_rebalance_create_dst_file] 0-tier2-dht: fallocate failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2006-01_grid.nc on tier2-disperse-10 (Operation not supported)
[2018-09-12 13:15:50.777241] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: Create dst failed on - tier2-disperse-10 for file - /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2006-01_grid.nc
[2018-09-12 13:15:50.777676] E [MSGID: 109023] [dht-rebalance.c:2733:gf_defrag_migrate_single_file] 0-tier2-dht: migrate-data failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2006-01_grid.nc

Could you please help me to understand what is happening and how to solve it?
Our Gluster implementation is based on Gluster v.3.10.5

Thank you in advance,
Mauro
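
To see how the fallocate failures are spread across the disperse subvolumes, the rebalance log can be tallied with a short script. This is only a minimal sketch: it assumes the log lines follow the "fallocate failed for <path> on <subvolume> (<reason>)" pattern seen in the excerpt above, and the function name count_fallocate_failures is hypothetical, not part of Gluster.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... fallocate failed for <path> on tier2-disperse-9 (Operation not supported)
# The path may contain spaces, so it is matched greedily and the subvolume is
# taken from the last " on <name> (" in the line.
FALLOCATE_FAILED = re.compile(
    r"fallocate failed for (?P<path>.+) on (?P<subvol>\S+) \((?P<reason>[^)]*)\)"
)


def count_fallocate_failures(lines):
    """Return a Counter keyed by (subvolume, reason) for fallocate failures."""
    counts = Counter()
    for line in lines:
        match = FALLOCATE_FAILED.search(line)
        if match:
            counts[(match.group("subvol"), match.group("reason"))] += 1
    return counts
```

The function accepts any iterable of lines, so an open log file can be passed directly, for example count_fallocate_failures(open(log_path)), where log_path is wherever the rebalance log lives on the node running the rebalance. Lines that do not mention a fallocate failure, such as the "Create dst failed" and "migrate-data failed" entries, are ignored.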
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
-------------------------
Mauro Tridici

Fondazione CMCC
CMCC Supercomputing Center
presso Complesso Ecotekne - Università del Salento -
Strada Prov.le Lecce - Monteroni sn
73100 Lecce IT

mobile: (+39) 327 5630841
email: mauro.tridici@xxxxxxx