Hi Tom,

The volume info doesn't show the hot bricks. I think you took the volume
info output before attaching the hot tier. Can you send the volume info of
the current setup where you see this issue?

The logs you sent are from a later point in time. The issue was hit before
the period these logs cover, so I need the logs from an earlier time. Along
with the entire tier logs, can you send the glusterd and brick logs too?

The rest of my comments are inline.

On Wed, Jan 10, 2018 at 9:03 PM, Tom Fite <tomfite@xxxxxxxxx> wrote:
> I should add that additional testing has shown that only accessing files is
> held up, IO is not interrupted for existing transfers. I think this points
> to the heat metadata in the sqlite DB for the tier, is it possible that a
> table is temporarily locked while the promotion daemon runs so the calls to
> update the access count on files are blocked?
>
>
> On Wed, Jan 10, 2018 at 10:17 AM, Tom Fite <tomfite@xxxxxxxxx> wrote:
>>
>> The sizes of the files are extremely varied, there are millions of small
>> (<1 MB) files and thousands of files larger than 1 GB.

The tier use case is meant for larger files; it is not the best fit for
small files. That can end up hindering IO.

>>
>> Attached is the tier log for gluster1 and gluster2. These are full of
>> "demotion failed" messages, which is also shown in the status:
>>
>> [root@pod-sjc1-gluster1 gv0]# gluster volume tier gv0 status
>> Node                Promoted files   Demoted files   Status        run time in h:m:s
>> ---------           ---------        ---------       ---------     ---------
>> localhost           25940            0               in progress   112:21:49
>> pod-sjc1-gluster2   0                2917154         in progress   112:21:49
>>
>> Is it normal to have promotions and demotions only happen on each server
>> but not both?

No, it's not normal.

>>
>> Volume info:
>>
>> [root@pod-sjc1-gluster1 ~]# gluster volume info
>>
>> Volume Name: gv0
>> Type: Distributed-Replicate
>> Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
>> Status: Started
>> Snapshot Count: 13
>> Number of Bricks: 3 x 2 = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: pod-sjc1-gluster1:/data/brick1/gv0
>> Brick2: pod-sjc1-gluster2:/data/brick1/gv0
>> Brick3: pod-sjc1-gluster1:/data/brick2/gv0
>> Brick4: pod-sjc1-gluster2:/data/brick2/gv0
>> Brick5: pod-sjc1-gluster1:/data/brick3/gv0
>> Brick6: pod-sjc1-gluster2:/data/brick3/gv0
>> Options Reconfigured:
>> performance.cache-refresh-timeout: 60
>> performance.stat-prefetch: on
>> server.allow-insecure: on
>> performance.flush-behind: on
>> performance.rda-cache-limit: 32MB
>> network.tcp-window-size: 1048576
>> performance.nfs.io-threads: on
>> performance.write-behind-window-size: 4MB
>> performance.nfs.write-behind-window-size: 512MB
>> performance.io-cache: on
>> performance.quick-read: on
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> network.inode-lru-limit: 90000
>> performance.cache-size: 4GB
>> server.event-threads: 16
>> client.event-threads: 16
>> features.barrier: disable
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: on
>> cluster.lookup-optimize: on
>> server.outstanding-rpc-limit: 1024
>> auto-delete: enable
>>
>>
>> # gluster volume status
>> Status of volume: gv0
>> Gluster process                              TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Hot Bricks:
>> Brick pod-sjc1-gluster2:/data/hot_tier/gv0   49219     0          Y       26714
>> Brick pod-sjc1-gluster1:/data/hot_tier/gv0   49199     0          Y       21325
>> Cold Bricks:
>> Brick pod-sjc1-gluster1:/data/brick1/gv0     49152     0          Y       3178
>> Brick pod-sjc1-gluster2:/data/brick1/gv0     49152     0          Y       4818
>> Brick pod-sjc1-gluster1:/data/brick2/gv0     49153     0          Y       3186
>> Brick pod-sjc1-gluster2:/data/brick2/gv0     49153     0          Y       4829
>> Brick pod-sjc1-gluster1:/data/brick3/gv0     49154     0          Y       3194
>> Brick pod-sjc1-gluster2:/data/brick3/gv0     49154     0          Y       4840
>> Tier Daemon on localhost                     N/A       N/A        Y       20313
>> Self-heal Daemon on localhost                N/A       N/A        Y       32023
>> Tier Daemon on pod-sjc1-gluster1             N/A       N/A        Y       24758
>> Self-heal Daemon on pod-sjc1-gluster2        N/A       N/A        Y       12349
>>
>> Task Status of Volume gv0
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>>
>> On Tue, Jan 9, 2018 at 10:33 PM, Hari Gowtham <hgowtham@xxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> Can you send the volume info, and volume status output and the tier logs.
>>> And I need to know the size of the files that are being stored.
>>>
>>> On Tue, Jan 9, 2018 at 9:51 PM, Tom Fite <tomfite@xxxxxxxxx> wrote:
>>> > I've recently enabled an SSD backed 2 TB hot tier on my 150 TB 2 server / 3
>>> > bricks per server distributed replicated volume.
>>> >
>>> > I'm seeing IO get blocked across all client FUSE threads for 10 to 15
>>> > seconds while the promotion daemon runs. I see the 'glustertierpro' thread
>>> > jump to 99% CPU usage on both boxes when these delays occur and they happen
>>> > every 25 minutes (my tier-promote-frequency setting).
>>> >
>>> > I suspect this has something to do with the heat database in sqlite, maybe
>>> > something is getting locked while it runs the query to determine files to
>>> > promote. My volume contains approximately 18 million files.
>>> >
>>> > Has anybody else seen this? I suspect that these delays will get worse as I
>>> > add more files to my volume which will cause significant problems.
>>> >
>>> > Here are my hot tier settings:
>>> >
>>> > # gluster volume get gv0 all | grep tier
>>> > cluster.tier-pause                      off
>>> > cluster.tier-promote-frequency          1500
>>> > cluster.tier-demote-frequency           3600
>>> > cluster.tier-mode                       cache
>>> > cluster.tier-max-promote-file-size      10485760
>>> > cluster.tier-max-mb                     64000
>>> > cluster.tier-max-files                  100000
>>> > cluster.tier-query-limit                100
>>> > cluster.tier-compact                    on
>>> > cluster.tier-hot-compact-frequency      86400
>>> > cluster.tier-cold-compact-frequency     86400
>>> >
>>> > # gluster volume get gv0 all | grep threshold
>>> > cluster.write-freq-threshold            2
>>> > cluster.read-freq-threshold             5
>>> >
>>> > # gluster volume get gv0 all | grep watermark
>>> > cluster.watermark-hi                    92
>>> > cluster.watermark-low                   75
>>> >
>>> > _______________________________________________
>>> > Gluster-users mailing list
>>> > Gluster-users@xxxxxxxxxxx
>>> > http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Hari Gowtham.
>>
>>
>

--
Regards,
Hari Gowtham.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
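
Tom's question about the heat database being locked can be tried out in
isolation. The sketch below only illustrates how SQLite itself behaves when
one connection holds an exclusive lock and a second connection tries to
update a row. It does not reproduce the tier daemon, and nothing in this
thread confirms that the tier takes this kind of lock. The file name, the
`heat` table and its columns, and the function name are all hypothetical.

```python
import os
import sqlite3
import tempfile


def demo_lock_contention(busy_timeout_s: float = 0.2) -> dict:
    """Show one SQLite connection being refused while another holds the lock."""
    bump = "UPDATE heat SET reads = reads + 1 WHERE path = '/file-a'"

    with tempfile.TemporaryDirectory() as tmpdir:
        # Hypothetical stand-in file; this is NOT Gluster's real tier database.
        path = os.path.join(tmpdir, "heat_demo.db")

        # isolation_level=None puts both connections in autocommit mode, so the
        # explicit BEGIN/COMMIT below are the only transaction boundaries.
        scanner = sqlite3.connect(path, isolation_level=None)
        updater = sqlite3.connect(
            path, timeout=busy_timeout_s, isolation_level=None
        )
        try:
            scanner.execute(
                "CREATE TABLE heat (path TEXT PRIMARY KEY, reads INTEGER NOT NULL)"
            )
            scanner.execute("INSERT INTO heat VALUES ('/file-a', 0)")

            # Assumption for illustration: a long-running pass that holds an
            # exclusive lock on the database while it queries for hot files.
            scanner.execute("BEGIN EXCLUSIVE")
            scanner.execute("SELECT path FROM heat WHERE reads >= 5").fetchall()

            # The access-count update waits up to busy_timeout_s, then gives up.
            blocked = False
            try:
                updater.execute(bump)
            except sqlite3.OperationalError as exc:
                blocked = "locked" in str(exc)

            # Once the lock is released, the same update goes through.
            scanner.execute("COMMIT")
            updater.execute(bump)
            reads = updater.execute("SELECT reads FROM heat").fetchone()[0]
        finally:
            scanner.close()
            updater.close()

    return {"blocked_while_locked": blocked, "reads_after_release": reads}


if __name__ == "__main__":
    print(demo_lock_contention())
```

In SQLite's default rollback-journal mode the second connection is refused
until the first one commits, which is the kind of stall Tom is asking about.
Whether the tier database is actually opened and locked this way would still
have to be checked against the real setup and the logs requested above.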