Hi Rusty,
A rebalance involves 2 steps:
- Setting a new layout on a directory
- Migrating any files inside that directory that hash to a different subvol based on the new layout set in step 1.
A few things to keep in mind:
- Any new content created on this volume will currently go to the newly added brick.
- Having a more equitable file distribution is beneficial, but you might not need a complete rebalance to get there. You can run the script on just enough directories to free up space on your older bricks. Pick directories which contain large files to speed this up.
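One way to judge when enough space has been freed is to watch the usage of the brick filesystem on the older nodes. A minimal sketch, assuming the bricks are mounted at /mnt/data (as in the df output quoted further down in this thread); the function name, the BRICK_MOUNT variable and the 80% threshold are only illustrative, not something gluster provides:

```shell
#!/usr/bin/env bash
# Print the used percentage (integer, no % sign) of the filesystem holding $1.
# df -P forces the POSIX one-line-per-filesystem format so column 5 is stable.
brick_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Illustrative values only; adjust for your setup.
threshold=80
brick=${BRICK_MOUNT:-/mnt/data}   # assumed brick mount point

if [ -d "$brick" ]; then
    used=$(brick_used_pct "$brick")
    if [ "$used" -gt "$threshold" ]; then
        echo "$brick is ${used}% full - keep migrating"
    else
        echo "$brick is ${used}% full - probably enough for now"
    fi
fi
```

Run it on datanode01 and datanode02 between batches of directories to see whether the older bricks are draining.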
Do the following on one of the server nodes:
- Create a tmp mount point and mount the volume using the rebalance volfile:
    - mkdir /mnt/rebal
    - glusterfs -s localhost --volfile-id rebalance/data /mnt/rebal
- Select a directory in the volume which contains a lot of large files and which has not been processed by the rebalance yet - the lower down in the tree the better. Check the rebalance logs to figure out which dirs have not been processed yet.
    - cd /mnt/rebal/<chosen_dir>
    - find . -type d -print0 | xargs -0 -n1 -P10 bash process_dir.sh
      (use the full path to process_dir.sh if it is not in the current directory)
- You can run this for different values of <chosen_dir> and on multiple server nodes in parallel as long as the directory trees for the different <chosen_dirs> don't overlap.
- Do this for multiple directories until the disk space used reduces on the older bricks.
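The steps above can be wrapped in two small functions. This is only a sketch: the function names are made up for illustration, the volume name "data" comes from this thread, and /path/to/process_dir.sh stands for wherever you saved the attached script. Directory names are passed NUL-separated so that names containing spaces survive.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the manual steps in two functions.

# Mount the volume using the rebalance volfile (volume name "data" as in this thread).
mount_rebal() {
    local mnt=${1:-/mnt/rebal}
    mkdir -p "$mnt"
    glusterfs -s localhost --volfile-id rebalance/data "$mnt"
}

# Run SCRIPT once per directory under TOP, at most JOBS at a time.
# SCRIPT must be an absolute path because we cd into TOP first.
process_tree() {
    local top=$1 script=$2 jobs=${3:-10}
    (
        cd "$top" || exit 1
        # -print0 / -0 keep directory names with whitespace intact.
        find . -type d -print0 | xargs -0 -n1 -P"$jobs" bash "$script"
    )
}

# Example (not run here):
#   mount_rebal /mnt/rebal
#   process_tree "/mnt/rebal/<chosen_dir>" /path/to/process_dir.sh 10
```

The subshell around cd means your own shell stays in its current directory after each run, which helps when you move on to the next <chosen_dir>.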
This is a very simple script. Let me know how it works - we can always tweak it for your particular data set.
> and performance is basically garbage while it rebalances
Can you provide more detail on this? What kind of effects are you seeing?
How many clients access this volume?
Regards,
Nithya
On 30 July 2018 at 22:18, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
I have not documented this yet - I will send you the steps tomorrow.
Regards,
Nithya

On 30 July 2018 at 20:23, Rusty Bower <rusty@xxxxxxxxxxxxxx> wrote:
That would be awesome. Where can I find these?
Rusty
Sent from my iPhone

Hi Rusty,
Sorry for the delay getting back to you. I had a quick look at the rebalance logs - it looks like the estimates are based on the time taken to rebalance the smaller files.
We do have a scripting option where we can use virtual xattrs to trigger file migration from a mount point. That would speed things up.
Regards,
Nithya

On 28 July 2018 at 07:11, Rusty Bower <rusty@xxxxxxxxxxxxxx> wrote:
Just wanted to ping this to see if you guys had any thoughts, or other scripts I can run for this stuff. It's still predicting another 90 days to rebalance this, and performance is basically garbage while it rebalances.
Rusty

On Mon, Jul 23, 2018 at 10:19 AM, Rusty Bower <rusty@xxxxxxxxxxxxxx> wrote:
datanode03 is the newest brick
the bricks had gotten pretty full, which I think might be part of the issue:
- datanode01 /dev/sda1 51T 48T 3.3T 94% /mnt/data
- datanode02 /dev/sda1 51T 48T 3.4T 94% /mnt/data
- datanode03 /dev/md0 128T 4.6T 123T 4% /mnt/data
each of the bricks are on a completely separate disk from the OS
I'll shoot you the log files offline :)
Thanks!
Rusty

On Mon, Jul 23, 2018 at 3:12 AM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
Hi Rusty,
Sorry I took so long to get back to you.
Which is the newly added brick? I see datanode02 has not picked up any files for migration which is odd.
How full are the individual bricks (df -h ) output.
Is each of your bricks in a separate partition?
Can you send me the rebalance logs from all 3 nodes (offline if you prefer)?
We can try using scripts to speed up the rebalance if you prefer.
Regards,
Nithya

On 16 July 2018 at 22:06, Rusty Bower <rusty@xxxxxxxxxxxxxx> wrote:
Thanks for the reply Nithya.
1. glusterfs 4.1.1
2.
Volume Name: data
Type: Distribute
Volume ID: 294d95ce-0ff3-4df9-bd8c-a52fc50442ba
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: datanode01:/mnt/data/bricks/data
Brick2: datanode02:/mnt/data/bricks/data
Brick3: datanode03:/mnt/data/bricks/data
Options Reconfigured:
performance.readdir-ahead: on
3.
Node         Rebalanced-files       size    scanned   failures    skipped        status   run time in h:m:s
---------    ----------------   --------   --------   --------   --------   -----------   -----------------
localhost               36822     11.3GB      50715          0          0   in progress            26:46:17
datanode02                  0     0Bytes       2852          0          0   in progress            26:46:16
datanode03               3128    513.7MB      11442          0       3128   in progress            26:46:17
Estimated time left for rebalance to complete : > 2 months. Please try again later.
volume rebalance: data: success
4. Directory structure is basically an rsync backup of some old systems as well as all of my personal media. I can elaborate more, but it's a pretty standard filesystem.
5. In some folders there might be up to like 12-15 levels of directories (especially the backups)
6. I'm honestly not sure, I can try to scrounge this number up
7. My guess would be > 100k
8. Most files are pretty large (media files), but there's a lot of small files (metadata and configuration files) as well
I've also appended a (moderately sanitized) snippet of the rebalance log (let me know if you need more)
[2018-07-16 17:37:59.979003] I [MSGID: 0] [dht-rebalance.c:1799:dht_migrate_file] 0-data-dht: destination for file - /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2040036.img.xml is changed to - data-client-2
[2018-07-16 17:38:00.004262] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2112002.img.xml from subvolume data-client-0 to data-client-2
[2018-07-16 17:38:00.725582] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108305980 tmp_cnt = 55419279917056,rate_processed= 446597.869797, elapsed = 96526.000000
[2018-07-16 17:38:00.725641] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124092127 seconds, seconds left = 123995601
[2018-07-16 17:38:00.725709] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96526.00 secs
[2018-07-16 17:38:00.725738] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36876, size: 12270259289, lookups: 50715, failures: 0, skipped: 0
[2018-07-16 17:38:02.769121] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108305980 tmp_cnt = 55419279917056,rate_processed= 446588.616567, elapsed = 96528.000000
[2018-07-16 17:38:02.769207] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124094698 seconds, seconds left = 123998170
[2018-07-16 17:38:02.769263] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96528.00 secs
[2018-07-16 17:38:02.769286] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36876, size: 12270259289, lookups: 50715, failures: 0, skipped: 0
[2018-07-16 17:38:03.410469] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201002.img.xml: attempting to move from data-client-0 to data-client-2
[2018-07-16 17:38:03.416127] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2040036.img.xml from subvolume data-client-0 to data-client-2
[2018-07-16 17:38:04.738885] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9110012.img.xml: attempting to move from data-client-0 to data-client-2
[2018-07-16 17:38:04.745722] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201002.img.xml from subvolume data-client-0 to data-client-2
[2018-07-16 17:38:04.812368] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108308134 tmp_cnt = 55419279917056,rate_processed= 446579.386035, elapsed = 96530.000000
[2018-07-16 17:38:04.812417] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124097263 seconds, seconds left = 124000733
[2018-07-16 17:38:04.812465] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96530.00 secs
[2018-07-16 17:38:04.812489] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36877, size: 12270261443, lookups: 50715, failures: 0, skipped: 0
[2018-07-16 17:38:04.992413] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2050000.img.xml: attempting to move from data-client-0 to data-client-2
[2018-07-16 17:38:04.994122] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9110012.img.xml from subvolume data-client-0 to data-client-2
[2018-07-16 17:38:06.855618] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108318798 tmp_cnt = 55419279917056,rate_processed= 446570.244043, elapsed = 96532.000000
[2018-07-16 17:38:06.855719] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124099804 seconds, seconds left = 124003272
[2018-07-16 17:38:06.855770] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96532.00 secs
[2018-07-16 17:38:06.855793] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36879, size: 12270266602, lookups: 50715, failures: 0, skipped: 0
[2018-07-16 17:38:08.511064] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201055.img.xml: attempting to move from data-client-0 to data-client-2
[2018-07-16 17:38:08.533029] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2050000.img.xml from subvolume data-client-0 to data-client-2
[2018-07-16 17:38:08.899708] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108318798 tmp_cnt = 55419279917056,rate_processed= 446560.991961, elapsed = 96534.000000
[2018-07-16 17:38:08.899791] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124102375 seconds, seconds left = 124005841
[2018-07-16 17:38:08.899842] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96534.00 secs
[2018-07-16 17:38:08.899865] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36879, size: 12270266602, lookups: 50715, failures: 0, skipped: 0

On Mon, Jul 16, 2018 at 7:37 AM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
If possible, please send the rebalance logs as well.

On 16 July 2018 at 10:14, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
Hi Rusty,
We need the following information:
- The exact gluster version you are running
- gluster volume info <volname>
- gluster volume rebalance <volname> status
- Information on the directory structure and file locations on your volume.
- How many levels of directories
- How many files and directories in each level
- How many directories and files in total (a rough estimate)
- Average file size
Please note that having a rebalance running in the background should not affect your volume access in any way. However I would like to know why only 6000 files have been scanned in 6 hours.
Regards,
Nithya

On 16 July 2018 at 06:13, Rusty Bower <rusty@xxxxxxxxxxxxxx> wrote:
Hey folks,
I just added a new brick to my existing gluster volume, but gluster volume rebalance data status is telling me the following: Estimated time left for rebalance to complete : > 2 months. Please try again later.
I already did a fix-mapping, but this thing is absolutely crawling trying to rebalance everything (last estimate was ~40 years)
Any thoughts on if this is a bug, or ways to speed this up? It's taking ~6 hours to scan 6000 files, which seems unreasonably slow.
Thanks
Rusty
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
Attachment:
process_dir.sh
Description: application/shellscript