Re: 3.8.3 Bitrot signature process

Hi Kotresh,

It's the same behaviour on a replicated volume also: the brick pid opens an fd on the file after 120 seconds.

Calculating the signature for a 100MB file took 15m57s.


How can I increase the CPU usage? In your earlier mail you said, "To limit the usage of CPU, throttling is done using token bucket algorithm".
Is there any possibility of increasing the bitrot hash calculation speed?
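
If it is the token bucket that is pacing the signer, the numbers roughly add up: the bitd.log line quoted later in this thread reports "tokens/sec (rate): 131072, maxlimit: 524288", and at 131072 bytes/sec a 100MB file needs about 800 seconds (~13.3 minutes), which is in the same ballpark as the 15m57s above. A minimal token-bucket sketch of that pacing (my own illustration to show the arithmetic, not the actual bitd code; the per-iteration CHUNK size is an assumption):

#!/bin/bash
# Token-bucket pacing sketch: tokens (bytes) refill at RATE per second up to
# MAXLIMIT; hashing a CHUNK of the file spends that many tokens, and we sleep
# whenever the bucket runs dry.
RATE=131072        # tokens/sec, from the bitd.log line in this thread
MAXLIMIT=524288    # bucket capacity, from the same log line
CHUNK=131072       # assumed bytes hashed per iteration
size=$(stat -c %s "$1")    # file to pace through, e.g. ./sketch.sh bigfile.nul
tokens=$MAXLIMIT
hashed=0
while [ "$hashed" -lt "$size" ]; do
    if [ "$tokens" -lt "$CHUNK" ]; then
        sleep 1                                   # wait for a refill tick
        tokens=$(( tokens + RATE ))
        [ "$tokens" -gt "$MAXLIMIT" ] && tokens=$MAXLIMIT
    else
        tokens=$(( tokens - CHUNK ))              # spend tokens on one chunk
        hashed=$(( hashed + CHUNK ))
    fi
done
echo "paced through $size bytes; expect roughly size/RATE seconds of wall time"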


Thanks,
Amudhan 


On Thu, Sep 22, 2016 at 11:44 AM, Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
Hi Amudhan,

Thanks for the confirmation. If that's the case, please try with a dist-rep volume
and see if you observe similar behavior.

In any case, please raise a bug with your observations. We will work
on it.

Thanks and Regards,
Kotresh H R

----- Original Message -----
> From: "Amudhan P" <amudhan83@xxxxxxxxx>
> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> Cc: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> Sent: Thursday, September 22, 2016 11:25:28 AM
> Subject: Re: 3.8.3 Bitrot signature process
>
> Hi Kotresh,
>
> 2280 is a brick process; I have not tried with a dist-rep volume.
>
> I have not seen any fd in the bitd process on any of the nodes, and the bitd
> process's CPU usage is always 0%, occasionally rising to 0.3%.
>
>
>
> Thanks,
> Amudhan
>
> On Thursday, September 22, 2016, Kotresh Hiremath Ravishankar <
> khiremat@xxxxxxxxxx> wrote:
> > Hi Amudhan,
> >
> > No, bitrot signer is a different process by itself and is not part of the
> > brick process. I believe the process 2280 is a brick process? Did you check
> > with a dist-rep volume? Is the same behavior being observed there as well?
> > We need to figure out why the brick process is holding that fd for such a
> > long time.
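> >
> > For instance (assuming lsof is installed on the brick node), running lsof
> > on the gfid hardlink you listed should name whoever holds it open:
> >
> >   lsof /media/disk1/brick1/.glusterfs/4c/09/4c091145-4294-4846-8fff-e358482c63e1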
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > ----- Original Message -----
> >> From: "Amudhan P" <amudhan83@xxxxxxxxx>
> >> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> >> Sent: Wednesday, September 21, 2016 8:15:33 PM
> >> Subject: Re: 3.8.3 Bitrot signature process
> >>
> >> Hi Kotresh,
> >>
> >> As soon as the fd closes in the brick1 pid, I can see the bitrot
> >> signature for the file on the brick.
> >>
> >> So, it looks like the fd is opened by the brick process to calculate the
> >> signature.
> >>
> >> Output for the file:
> >>
> >> -rw-r--r-- 2 root root 250M Sep 21 18:32
> >> /media/disk1/brick1/data/G/test59-bs10M-c100.nul
> >>
> >> getfattr: Removing leading '/' from absolute path names
> >> # file: media/disk1/brick1/data/G/test59-bs10M-c100.nul
> >> trusted.bit-rot.signature=0x010200000000000000e9474e4cc673c0c227a6e807e04aa4ab1f88d3744243950a290869c53daa65df
> >> trusted.bit-rot.version=0x020000000000000057d6af3200012a13
> >> trusted.ec.config=0x0000080501000200
> >> trusted.ec.size=0x000000003e800000
> >> trusted.ec.version=0x0000000000001f400000000000001f40
> >> trusted.gfid=0x4c091145429448468fffe358482c63e1
> >>
> >> stat /media/disk1/brick1/data/G/test59-bs10M-c100.nul
> >>   File: ‘/media/disk1/brick1/data/G/test59-bs10M-c100.nul’
> >>   Size: 262144000       Blocks: 512000     IO Block: 4096   regular file
> >> Device: 811h/2065d      Inode: 402653311   Links: 2
> >> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> >> Access: 2016-09-21 18:34:43.722712751 +0530
> >> Modify: 2016-09-21 18:32:41.650712946 +0530
> >> Change: 2016-09-21 19:14:41.698708914 +0530
> >>  Birth: -
> >>
> >>
> >> On the other 2 bricks in the same set, the signature is still not updated
> >> for the same file.
> >>
> >>
> >> On Wed, Sep 21, 2016 at 6:48 PM, Amudhan P <amudhan83@xxxxxxxxx> wrote:
> >>
> >> > Hi Kotresh,
> >> >
> >> > I am very sure no read was going on from the mount point.
> >> >
> >> > I did the same test again, but this time I unmounted the mount point
> >> > after writing the data.
> >> >
> >> > After 120 seconds I am seeing this fd entry for the file in the brick 1 pid.
> >> >
> >> > getfattr -m. -e hex -d test59-bs10
> >> > # file: test59-bs10M-c100.nul
> >> > trusted.bit-rot.version=0x020000000000000057bed574000ed534
> >> > trusted.ec.config=0x0000080501000200
> >> > trusted.ec.size=0x000000003e800000
> >> > trusted.ec.version=0x0000000000001f400000000000001f40
> >> > trusted.gfid=0x4c091145429448468fffe358482c63e1
> >> >
> >> >
> >> > ls -l /proc/2280/fd
> >> > lr-x------ 1 root root 64 Sep 21 13:08 19 -> /media/disk1/brick1/.glusterfs/4c/09/4c091145-4294-4846-8fff-e358482c63e1
> >> >
> >> > The volume is EC 4+1.
> >> >
> >> > On Wed, Sep 21, 2016 at 6:17 PM, Kotresh Hiremath Ravishankar <
> >> > khiremat@xxxxxxxxxx> wrote:
> >> >
> >> >> Hi Amudhan,
> >> >>
> >> >> If you look at the ls output, some process has an fd opened in the
> >> >> backend. That is the reason bitrot is not considering the file for
> >> >> signing. Could you please observe whether the signing happens 120 secs
> >> >> after the closure of
> >> >> "/media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764"?
> >> >> If so, we need to figure out who holds this fd for such a long time.
> >> >> We also need to figure out whether this issue is specific to EC volumes.
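> >> >>
> >> >> A rough way to watch for that (reusing pid 30162 from your earlier ls
> >> >> output; assuming watch is available on the node):
> >> >>
> >> >>   watch -n 10 'ls -l /proc/30162/fd | grep 6e7c49e6'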
> >> >>
> >> >> Thanks and Regards,
> >> >> Kotresh H R
> >> >>
> >> >> ----- Original Message -----
> >> >> > From: "Amudhan P" <amudhan83@xxxxxxxxx>
> >> >> > To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> >> >> > Cc: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> >> >> > Sent: Wednesday, September 21, 2016 4:56:40 PM
> >> >> > Subject: Re: 3.8.3 Bitrot signature process
> >> >> >
> >> >> > Hi Kotresh,
> >> >> >
> >> >> >
> >> >> > Writing a new file.
> >> >> >
> >> >> > getfattr -m. -e hex -d /media/disk2/brick2/data/G/test58-bs10M-c100.nul
> >> >> > getfattr: Removing leading '/' from absolute path names
> >> >> > # file: media/disk2/brick2/data/G/test58-bs10M-c100.nul
> >> >> > trusted.bit-rot.version=0x020000000000000057da8b23000b120e
> >> >> > trusted.ec.config=0x0000080501000200
> >> >> > trusted.ec.size=0x000000003e800000
> >> >> > trusted.ec.version=0x0000000000001f400000000000001f40
> >> >> > trusted.gfid=0x6e7c49e6094e443585bff21f99fd8764
> >> >> >
> >> >> >
> >> >> > Running ls -l on the brick 2 pid:
> >> >> >
> >> >> > ls -l /proc/30162/fd
> >> >> >
> >> >> > lr-x------ 1 root root 64 Sep 21 16:22 59 ->
> >> >> > /media/disk2/brick2/.glusterfs/quanrantine
> >> >> > lrwx------ 1 root root 64 Sep 21 16:22 6 ->
> >> >> > /var/lib/glusterd/vols/glsvol1/run/10.1.2.2-media-disk2-brick2.pid
> >> >> > lr-x------ 1 root root 64 Sep 21 16:25 60 ->
> >> >> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764
> >> >> > lr-x------ 1 root root 64 Sep 21 16:22 61 ->
> >> >> > /media/disk2/brick2/.glusterfs/quanrantine
> >> >> >
> >> >> >
> >> >> > find /media/disk2/ -samefile
> >> >> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764
> >> >> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764
> >> >> > /media/disk2/brick2/data/G/test58-bs10M-c100.nul
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Wed, Sep 21, 2016 at 3:28 PM, Kotresh Hiremath Ravishankar <
> >> >> > khiremat@xxxxxxxxxx> wrote:
> >> >> >
> >> >> > > Hi Amudhan,
> >> >> > >
> >> >> > > Don't grep for the filename; glusterfs maintains a hardlink in the
> >> >> > > .glusterfs directory for each file. Just check
> >> >> > > 'ls -l /proc/<respective brick pid>/fd' for any fds opened on a file
> >> >> > > in .glusterfs and check whether it's the same file.
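> >> >> > >
> >> >> > > If you do find such an fd, you can map the gfid hardlink back to
> >> >> > > the named file, e.g. (gfid path illustrative):
> >> >> > >
> >> >> > >   find /media/disk2/brick2 -samefile /media/disk2/brick2/.glusterfs/<xx>/<yy>/<gfid>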
> >> >> > >
> >> >> > > Thanks and Regards,
> >> >> > > Kotresh H R
> >> >> > >
> >> >> > > ----- Original Message -----
> >> >> > > > From: "Amudhan P" <amudhan83@xxxxxxxxx>
> >> >> > > > To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> >> >> > > > Cc: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> >> >> > > > Sent: Wednesday, September 21, 2016 1:33:10 PM
> >> >> > > > Subject: Re: 3.8.3 Bitrot signature process
> >> >> > > >
> >> >> > > > Hi Kotresh,
> >> >> > > >
> >> >> > > > I have used the below command to verify whether there is any open
> >> >> > > > fd for the file:
> >> >> > > >
> >> >> > > > "ls -l /proc/*/fd | grep filename".
> >> >> > > >
> >> >> > > > As soon as the write completes there are no open fds. If there is
> >> >> > > > any alternate option, please let me know and I will try that too.
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > > Also, below is the scrub status on my test setup. The number of
> >> >> > > > skipped files is slowly reducing day by day. I think files are
> >> >> > > > skipped because the bitrot signature process is not completed yet.
> >> >> > > >
> >> >> > > > Where can I see which files the scrub skipped?
> >> >> > > >
> >> >> > > >
> >> >> > > > Volume name : glsvol1
> >> >> > > >
> >> >> > > > State of scrub: Active (Idle)
> >> >> > > >
> >> >> > > > Scrub impact: normal
> >> >> > > >
> >> >> > > > Scrub frequency: daily
> >> >> > > >
> >> >> > > > Bitrot error log location: /var/log/glusterfs/bitd.log
> >> >> > > >
> >> >> > > > Scrubber error log location: /var/log/glusterfs/scrub.log
> >> >> > > >
> >> >> > > >
> >> >> > > > =========================================================
> >> >> > > >
> >> >> > > > Node: localhost
> >> >> > > >
> >> >> > > > Number of Scrubbed files: 1644
> >> >> > > >
> >> >> > > > Number of Skipped files: 1001
> >> >> > > >
> >> >> > > > Last completed scrub time: 2016-09-20 11:59:58
> >> >> > > >
> >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:39:26
> >> >> > > >
> >> >> > > > Error count: 0
> >> >> > > >
> >> >> > > >
> >> >> > > > =========================================================
> >> >> > > >
> >> >> > > > Node: 10.1.2.3
> >> >> > > >
> >> >> > > > Number of Scrubbed files: 1644
> >> >> > > >
> >> >> > > > Number of Skipped files: 1001
> >> >> > > >
> >> >> > > > Last completed scrub time: 2016-09-20 10:50:00
> >> >> > > >
> >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:38:17
> >> >> > > >
> >> >> > > > Error count: 0
> >> >> > > >
> >> >> > > >
> >> >> > > > =========================================================
> >> >> > > >
> >> >> > > > Node: 10.1.2.4
> >> >> > > >
> >> >> > > > Number of Scrubbed files: 981
> >> >> > > >
> >> >> > > > Number of Skipped files: 1664
> >> >> > > >
> >> >> > > > Last completed scrub time: 2016-09-20 12:38:01
> >> >> > > >
> >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:35:19
> >> >> > > >
> >> >> > > > Error count: 0
> >> >> > > >
> >> >> > > >
> >> >> > > > =========================================================
> >> >> > > >
> >> >> > > > Node: 10.1.2.1
> >> >> > > >
> >> >> > > > Number of Scrubbed files: 1263
> >> >> > > >
> >> >> > > > Number of Skipped files: 1382
> >> >> > > >
> >> >> > > > Last completed scrub time: 2016-09-20 11:57:21
> >> >> > > >
> >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:37:17
> >> >> > > >
> >> >> > > > Error count: 0
> >> >> > > >
> >> >> > > >
> >> >> > > > =========================================================
> >> >> > > >
> >> >> > > > Node: 10.1.2.2
> >> >> > > >
> >> >> > > > Number of Scrubbed files: 1644
> >> >> > > >
> >> >> > > > Number of Skipped files: 1001
> >> >> > > >
> >> >> > > > Last completed scrub time: 2016-09-20 11:59:25
> >> >> > > >
> >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:39:18
> >> >> > > >
> >> >> > > > Error count: 0
> >> >> > > >
> >> >> > > > =========================================================
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > > Thanks
> >> >> > > > Amudhan
> >> >> > > >
> >> >> > > >
> >> >> > > > On Wed, Sep 21, 2016 at 11:45 AM, Kotresh Hiremath Ravishankar <
> >> >> > > > khiremat@xxxxxxxxxx> wrote:
> >> >> > > >
> >> >> > > > > Hi Amudhan,
> >> >> > > > >
> >> >> > > > > I don't think it's a limitation with reading data from the brick.
> >> >> > > > > To limit the usage of CPU, throttling is done using a token
> >> >> > > > > bucket algorithm. The log message you showed is related to it.
> >> >> > > > > But even then, I think it should not take 12 minutes for the
> >> >> > > > > checksum calculation unless there is an fd open (it might be
> >> >> > > > > internal). Could you please cross-verify whether any fds are
> >> >> > > > > opened on that file by looking into /proc? I will also test it
> >> >> > > > > out in the meantime and get back to you.
> >> >> > > > >
> >> >> > > > > Thanks and Regards,
> >> >> > > > > Kotresh H R
> >> >> > > > >
> >> >> > > > > ----- Original Message -----
> >> >> > > > > > From: "Amudhan P" <amudhan83@xxxxxxxxx>
> >> >> > > > > > To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> >> >> > > > > > Cc: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> >> >> > > > > > Sent: Tuesday, September 20, 2016 3:19:28 PM
> >> >> > > > > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process
> >> >> > > > > >
> >> >> > > > > > Hi Kotresh,
> >> >> > > > > >
> >> >> > > > > > Please correct me if I am wrong: once a file write completes
> >> >> > > > > > and its fds close, bitrot waits for 120 seconds, then starts
> >> >> > > > > > hashing and updates the signature for the file on the brick.
> >> >> > > > > >
> >> >> > > > > > But what I am feeling is that bitrot takes too much time to
> >> >> > > > > > complete the hashing.
> >> >> > > > > >
> >> >> > > > > > Below is a test result I would like to share.
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > Writing data to the below path using dd:
> >> >> > > > > >
> >> >> > > > > > /mnt/gluster/data/G (mount point)
> >> >> > > > > > -rw-r--r-- 1 root root  10M Sep 20 12:19 test53-bs10M-c1.nul
> >> >> > > > > > -rw-r--r-- 1 root root 100M Sep 20 12:19 test54-bs10M-c10.nul
> >> >> > > > > >
> >> >> > > > > > No other write or read process is going on.
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > Checking the file data on one of the bricks.
> >> >> > > > > >
> >> >> > > > > > -rw-r--r-- 2 root root 2.5M Sep 20 12:23 test53-bs10M-c1.nul
> >> >> > > > > > -rw-r--r-- 2 root root  25M Sep 20 12:23 test54-bs10M-c10.nul
> >> >> > > > > >
> >> >> > > > > > The file's stat and getfattr info from the brick, after the
> >> >> > > > > > write process completed:
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul
> >> >> > > > > >   File: ‘test53-bs10M-c1.nul’
> >> >> > > > > >   Size: 2621440         Blocks: 5120       IO Block: 4096   regular file
> >> >> > > > > > Device: 821h/2081d      Inode: 536874168   Links: 2
> >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> >> >> > > > > > Access: 2016-09-20 12:23:28.798886647 +0530
> >> >> > > > > > Modify: 2016-09-20 12:23:28.994886646 +0530
> >> >> > > > > > Change: 2016-09-20 12:23:28.998886646 +0530
> >> >> > > > > >  Birth: -
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul
> >> >> > > > > >   File: ‘test54-bs10M-c10.nul’
> >> >> > > > > >   Size: 26214400        Blocks: 51200      IO Block: 4096   regular file
> >> >> > > > > > Device: 821h/2081d      Inode: 536874169   Links: 2
> >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> >> >> > > > > > Access: 2016-09-20 12:23:42.902886624 +0530
> >> >> > > > > > Modify: 2016-09-20 12:23:44.378886622 +0530
> >> >> > > > > > Change: 2016-09-20 12:23:44.378886622 +0530
> >> >> > > > > >  Birth: -
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test53-bs10M-c1.nul
> >> >> > > > > > # file: test53-bs10M-c1.nul
> >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> >> >> > > > > > trusted.ec.config=0x0000080501000200
> >> >> > > > > > trusted.ec.size=0x0000000000a00000
> >> >> > > > > > trusted.ec.version=0x00000000000000500000000000000050
> >> >> > > > > > trusted.gfid=0xe2416bd1aae4403c88f44286273bbe99
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test54-bs10M-c10.nul
> >> >> > > > > > # file: test54-bs10M-c10.nul
> >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> >> >> > > > > > trusted.ec.config=0x0000080501000200
> >> >> > > > > > trusted.ec.size=0x0000000006400000
> >> >> > > > > > trusted.ec.version=0x00000000000003200000000000000320
> >> >> > > > > > trusted.gfid=0x54e018dd8c5a4bd79e0317729d8a57c5
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > The file's stat and getfattr info from the brick, after the
> >> >> > > > > > bitrot signature was updated:
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul
> >> >> > > > > >   File: ‘test53-bs10M-c1.nul’
> >> >> > > > > >   Size: 2621440         Blocks: 5120       IO Block: 4096   regular file
> >> >> > > > > > Device: 821h/2081d      Inode: 536874168   Links: 2
> >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> >> >> > > > > > Access: 2016-09-20 12:25:31.494886450 +0530
> >> >> > > > > > Modify: 2016-09-20 12:23:28.994886646 +0530
> >> >> > > > > > Change: 2016-09-20 12:27:00.994886307 +0530
> >> >> > > > > >  Birth: -
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test53-bs10M-c1.nul
> >> >> > > > > > # file: test53-bs10M-c1.nul
> >> >> > > > > > trusted.bit-rot.signature=0x0102000000000000006de7493c5c90f643357c268fbaaf461c1567e0334e4948023ce17268403aa37a
> >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> >> >> > > > > > trusted.ec.config=0x0000080501000200
> >> >> > > > > > trusted.ec.size=0x0000000000a00000
> >> >> > > > > > trusted.ec.version=0x00000000000000500000000000000050
> >> >> > > > > > trusted.gfid=0xe2416bd1aae4403c88f44286273bbe99
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul
> >> >> > > > > >   File: ‘test54-bs10M-c10.nul’
> >> >> > > > > >   Size: 26214400        Blocks: 51200      IO Block: 4096   regular file
> >> >> > > > > > Device: 821h/2081d      Inode: 536874169   Links: 2
> >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> >> >> > > > > > Access: 2016-09-20 12:25:47.510886425 +0530
> >> >> > > > > > Modify: 2016-09-20 12:23:44.378886622 +0530
> >> >> > > > > > Change: 2016-09-20 12:38:05.954885243 +0530
> >> >> > > > > >  Birth: -
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test54-bs10M-c10.nul
> >> >> > > > > > # file: test54-bs10M-c10.nul
> >> >> > > > > > trusted.bit-rot.signature=0x010200000000000000394c345f0b0c63ee652627a62eed069244d35c4d5134e4f07d4eabb51afda47e
> >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> >> >> > > > > > trusted.ec.config=0x0000080501000200
> >> >> > > > > > trusted.ec.size=0x0000000006400000
> >> >> > > > > > trusted.ec.version=0x00000000000003200000000000000320
> >> >> > > > > > trusted.gfid=0x54e018dd8c5a4bd79e0317729d8a57c5
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > (Actual time taken to read the file from the brick with md5sum)
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ time md5sum test53-bs10M-c1.nul
> >> >> > > > > > 8354dcaa18a1ecb52d0895bf00888c44  test53-bs10M-c1.nul
> >> >> > > > > >
> >> >> > > > > > real    0m0.045s
> >> >> > > > > > user    0m0.007s
> >> >> > > > > > sys     0m0.003s
> >> >> > > > > >
> >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ time md5sum test54-bs10M-c10.nul
> >> >> > > > > > bed3c0a4a1407f584989b4009e9ce33f  test54-bs10M-c10.nul
> >> >> > > > > >
> >> >> > > > > > real    0m0.166s
> >> >> > > > > > user    0m0.062s
> >> >> > > > > > sys     0m0.011s
> >> >> > > > > >
> >> >> > > > > > As you can see, the file 'test54-bs10M-c10.nul' took around 12
> >> >> > > > > > minutes to update the bitrot signature (please refer to the
> >> >> > > > > > stat output for the file).
> >> >> > > > > >
> >> >> > > > > > What would be the cause of such a slow read? Is there any
> >> >> > > > > > limitation on reading data from the brick?
> >> >> > > > > >
> >> >> > > > > > Also, I am seeing this line in bitd.log; what does it mean?
> >> >> > > > > > [bit-rot.c:1784:br_rate_limit_signer] 0-glsvol1-bit-rot-0: [Rate
> >> >> > > > > > Limit Info] "tokens/sec (rate): 131072, maxlimit: 524288
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > Thanks
> >> >> > > > > > Amudhan P
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > On Mon, Sep 19, 2016 at 1:00 PM, Kotresh Hiremath Ravishankar <
> >> >> > > > > > khiremat@xxxxxxxxxx> wrote:
> >> >> > > > > >
> >> >> > > > > > > Hi Amudhan,
> >> >> > > > > > >
> >> >> > > > > > > Thanks for testing out the bitrot feature, and sorry for the
> >> >> > > > > > > delayed response. Please find the answers inline.
> >> >> > > > > > >
> >> >> > > > > > > Thanks and Regards,
> >> >> > > > > > > Kotresh H R
> >> >> > > > > > >
> >> >> > > > > > > ----- Original Message -----
> >> >> > > > > > > > From: "Amudhan P" <amudhan83@xxxxxxxxx>
> >> >> > > > > > > > To: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> >> >> > > > > > > > Sent: Friday, September 16, 2016 4:14:10 PM
> >> >> > > > > > > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process
> >> >> > > > > > > >
> >> >> > > > > > > > Hi,
> >> >> > > > > > > >
> >> >> > > > > > > > Can anyone reply to this mail?
> >> >> > > > > > > >
> >> >> > > > > > > > On Tue, Sep 13, 2016 at 12:49 PM, Amudhan P <amudhan83@xxxxxxxxx> wrote:
> >> >> > > > > > > >
> >> >> > > > > > > >
> >> >> > > > > > > >
> >> >> > > > > > > > Hi,
> >> >> > > > > > > >
> >> >> > > > > > > > I am testing the bitrot feature in Gluster 3.8.3 with a
> >> >> > > > > > > > disperse EC volume (4+1).
> >> >> > > > > > > >
> >> >> > > > > > > > When I write a single small file (< 10MB), after 2 seconds
> >> >> > > > > > > > I can see the bitrot signature in the bricks for the file,
> >> >> > > > > > > > but when I write multiple files of different sizes (> 10MB)
> >> >> > > > > > > > it takes a long time (> 24hrs) to see the bitrot signature
> >> >> > > > > > > > on all the files.
> >> >> > > > > > >
> >> >> > > > > > >    The default timeout for signing to happen is 120 seconds.
> >> >> > > > > > >    So the signing will happen 120 secs after the last fd gets
> >> >> > > > > > >    closed on that file. So if the file is being written
> >> >> > > > > > >    continuously, it will not be signed until 120 secs after
> >> >> > > > > > >    its last fd is closed.
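> >> >> > > > > > >
> >> >> > > > > > >    One way to verify this (with the commands already used in
> >> >> > > > > > >    this thread) is to close all fds on a file and, after ~120
> >> >> > > > > > >    secs, check for the signature xattr on the brick (brick
> >> >> > > > > > >    path illustrative):
> >> >> > > > > > >
> >> >> > > > > > >      getfattr -n trusted.bit-rot.signature -e hex /path/on/brick/to/file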
> >> >> > > > > > > >
> >> >> > > > > > > > My questions are:
> >> >> > > > > > > > 1. I have enabled the scrub schedule as hourly and the
> >> >> > > > > > > > throttle as normal; does this have any impact in delaying
> >> >> > > > > > > > the bitrot signature?
> >> >> > > > > > >       No.
> >> >> > > > > > > > 2. Other than "bitd.log", where else can I watch the
> >> >> > > > > > > > current status of bitrot, like the number of files queued
> >> >> > > > > > > > for signing and each file's status?
> >> >> > > > > > >      Signing will happen 120 sec after the last fd closure,
> >> >> > > > > > >      as said above. There is no status command which tracks
> >> >> > > > > > >      the signing of files, but there is a bitrot status
> >> >> > > > > > >      command which tracks the number of files scrubbed:
> >> >> > > > > > >
> >> >> > > > > > >      #gluster vol bitrot <volname> scrub status
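> >> >> > > > > > >
> >> >> > > > > > >      The scrub schedule and throttle you mentioned are set
> >> >> > > > > > >      through the same CLI; they affect scrubbing only, not
> >> >> > > > > > >      signing speed. For example:
> >> >> > > > > > >
> >> >> > > > > > >      #gluster vol bitrot <volname> scrub-throttle {lazy|normal|aggressive}
> >> >> > > > > > >      #gluster vol bitrot <volname> scrub-frequency {hourly|daily|weekly|biweekly|monthly}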
> >> >> > > > > > >
> >> >> > > > > > >
> >> >> > > > > > > > 3. Where can I confirm that all the files in the brick are
> >> >> > > > > > > > bitrot signed?
> >> >> > > > > > >
> >> >> > > > > > >      As said, the signing information of files is not tracked.
> >> >> > > > > > >
> >> >> > > > > > > > 4. Is there any file read size limit in bitrot?
> >> >> > > > > > >
> >> >> > > > > > >      I didn't get this. Could you please elaborate?
> >> >> > > > > > >
> >> >> > > > > > > > 5. Are there options for tuning bitrot for faster signing
> >> >> > > > > > > > of files?
> >> >> > > > > > >
> >> >> > > > > > >      The bitrot feature is mainly to detect silent corruption
> >> >> > > > > > >      (bitflips) of files due to long-term storage, hence the
> >> >> > > > > > >      default of signing 120 sec after the last fd closure.
> >> >> > > > > > >      There is a tunable which can change the default 120 sec,
> >> >> > > > > > >      but that's only for testing purposes and we don't
> >> >> > > > > > >      recommend it:
> >> >> > > > > > >
> >> >> > > > > > >       gluster vol get master features.expiry-time
> >> >> > > > > > >
> >> >> > > > > > >      For testing purposes, you can change this default and
> >> >> > > > > > >      test.
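> >> >> > > > > > >      For example, to shorten it on a test volume (assuming
> >> >> > > > > > >      the option takes a value in seconds; don't do this in
> >> >> > > > > > >      production):
> >> >> > > > > > >
> >> >> > > > > > >       gluster vol set <volname> features.expiry-time 20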
> >> >> > > > > > > >
> >> >> > > > > > > > Thanks
> >> >> > > > > > > > Amudhan
> >> >> > > > > > > >
> >> >> > > > > > > >
> >> >> > > > > > > > _______________________________________________
> >> >> > > > > > > > Gluster-users mailing list
> >> >> > > > > > > > Gluster-users@xxxxxxxxxxx
> >> >> > > > > > > > http://www.gluster.org/mailman/listinfo/gluster-users
> >> >> > > > > > >
> >> >> > > > > >
> >> >> > > > >
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> >
> >> >
> >>
> >
>

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
