Re: Problems with ioc again

Dan,
 Can you run 'echo 3 > /proc/sys/vm/drop_caches' and see if the memory usage
comes back to normal?

avati
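
To make the before/after comparison concrete, one option is to record the resident set size of the glusterfs process before and after dropping caches. The following is only a sketch: the helper names are made up for illustration, it assumes a Linux /proc filesystem, and the sample text is invented from the numbers quoted below rather than captured from a real node.

```python
import re


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kb(handle.read())


# Illustrative sample only; on a real node call read_rss_kb(<glusterfs pid>)
# once before and once after writing to drop_caches, then compare.
sample = "Name:\tglusterfs\nVmRSS:\t11141304 kB\n"
print(parse_vm_rss_kb(sample))
```

If the number drops sharply after the echo, the memory was probably tied to entries the kernel was still holding; if it stays put, the growth is more likely inside the process itself.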

2008/12/22 Dan Parsons <dparsons@xxxxxxxx>:
> OK, I just had this problem again in a big way.
>
> root     26231  9.3 90.5 12676632 11141304 ?   Ssl  Dec17 659:31 [glusterfs]
>
> That's 90.5% of 12GB RAM. cache-size is set to 2048MB. Miraculously this
> node is still running; about 28 of my 33 nodes died over the weekend because
> of this issue. We wanted to run some big jobs over the holiday break, but
> these crashes are getting in the way.
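
For reference, the ps line above can be turned into numbers that are easier to compare against the configured cache size. This is a rough sketch, assuming the usual `ps aux` column order (USER PID %CPU %MEM VSZ RSS ...), that RSS is reported in KiB, and that "2048mb" means 2048 MiB; the function name is hypothetical.

```python
def parse_ps_aux_line(line):
    """Split one `ps aux` output line into the fields used here."""
    fields = line.split(None, 10)
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "mem_pct": float(fields[3]),
        "vsz_kb": int(fields[4]),
        "rss_kb": int(fields[5]),
        "command": fields[10],
    }


PS_LINE = "root     26231  9.3 90.5 12676632 11141304 ?   Ssl  Dec17 659:31 [glusterfs]"
CACHE_SIZE_KB = 2048 * 1024  # assumed reading of "cache-size 2048mb"

info = parse_ps_aux_line(PS_LINE)
rss_gib = info["rss_kb"] / 1024 ** 2
ratio = info["rss_kb"] / CACHE_SIZE_KB

print(f"RSS is about {rss_gib:.1f} GiB, {ratio:.1f}x the configured cache-size")
```

Under those assumptions the process is holding roughly five times what cache-size alone would account for.
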
>
> Is there *anything* that can be done?
>
> Dan Parsons
>
>
> On Dec 17, 2008, at 3:28 PM, Anand Avati wrote:
>
>> Dan,
>>  I have a vague memory of giving out a custom patch for io-cache. Was that
>> you? Can you mail me the diff so I can answer your question?
>>
>> Avati
>>
>> On Dec 17, 2008 2:34 PM, "Dan Parsons" <dparsons@xxxxxxxx> wrote:
>>
>> I'd love to use 1.4rc4, but are there any issues in it that would affect me?
>> I have 4 glusterfs servers, each with 2gbit ethernet (bonded), providing
>> sustained 8gbit/s to 33 client nodes. Below is my entire config file. If you
>> see anything in there using a system that is either buggy or non-optimal in
>> 1.4rc4, or would be difficult to upgrade, please let me know. If not, I can
>> possibly upgrade.
>>
>> Below is my current config file. The one I was using when gluster was using
>> all memory is identical, except that 'cache-size' was set to 4096MB and
>> 'page-size' was set to 512KB.
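
As a quick sanity check on those two settings, the arithmetic below works out how many pages a full cache would hold. It is a sketch only: it assumes the size suffixes are binary units and that io-cache caps itself at cache-size divided by page-size, neither of which is confirmed in this thread. `parse_size` is a hypothetical helper, not a GlusterFS function.

```python
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '512KB' into bytes (binary units assumed)."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


cache_bytes = parse_size("4096MB")
page_bytes = parse_size("512KB")
pages = cache_bytes // page_bytes

print(f"{pages} pages of {page_bytes} bytes fill the cache")
```
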
>>
>> -----------
>> ### Add client feature and attach to remote subvolume of server1
>> volume distfs01
>> type protocol/client
>> option transport-type tcp/client     # for TCP/IP transport
>> option remote-host 10.8.101.51      # IP address of the remote brick
>> option remote-subvolume brick        # name of the remote volume
>> end-volume
>>
>> ### Add client feature and attach to remote subvolume of server2
>> volume distfs02
>> type protocol/client
>> option transport-type tcp/client     # for TCP/IP transport
>> option remote-host 10.8.101.52      # IP address of the remote brick
>> option remote-subvolume brick        # name of the remote volume
>> end-volume
>>
>> volume distfs03
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 10.8.101.53
>> option remote-subvolume brick
>> end-volume
>>
>> volume distfs04
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 10.8.101.54
>> option remote-subvolume brick
>> end-volume
>>
>> volume stripe0
>> type cluster/stripe
>> option block-size *.gff:1KB,*.nt:1KB,*.best:1KB,*.txt3:1KB,*.nbest.info:1KB*:1MB
>> option scheduler alu
>> option alu.order read-usage:write-usage:disk-usage
>> option alu.read-usage.entry-threshold 20%
>> option alu.read-usage.exit-threshold 4%
>> option alu.write-usage.entry-threshold 20%
>> option alu.write-usage.exit-threshold 4%
>> option alu.disk-usage.entry-threshold 2GB
>> option alu.disk-usage.exit-threshold 100MB
>> subvolumes distfs01 distfs02 distfs03 distfs04
>> end-volume
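
To illustrate what the block-size option in the stripe volume is meant to express, here is a small sketch that maps file names to block sizes. It assumes glob-style matching where the first matching pattern wins, which may not be exactly how cluster/stripe evaluates the list. It also assumes a comma before the final `*:1MB` entry, which the archived line does not show. The function names and file names are made up for illustration.

```python
from fnmatch import fnmatch


def parse_block_size_option(value):
    """Split 'pattern:size,pattern:size' into a list of (pattern, size) pairs."""
    rules = []
    for item in value.split(","):
        pattern, _, size = item.rpartition(":")
        rules.append((pattern.strip(), size.strip()))
    return rules


def block_size_for(filename, rules):
    """Return the size of the first rule whose pattern matches, else None."""
    for pattern, size in rules:
        if fnmatch(filename, pattern):
            return size
    return None


# Assumed form of the option, with a comma added before the catch-all entry.
OPTION = "*.gff:1KB,*.nt:1KB,*.best:1KB,*.txt3:1KB,*.nbest.info:1KB,*:1MB"
rules = parse_block_size_option(OPTION)

for name in ("genome.gff", "run7.nbest.info", "reads.fastq"):
    print(name, block_size_for(name, rules))
```

With this reading, the listed extensions are striped in 1KB blocks and everything else falls through to the 1MB catch-all.
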
>>
>> volume ioc  type performance/io-cache  subvolumes stripe0         # In
>> this
>> example it is 'client...
>> volume fixed
>> type features/fixed-id
>> option fixed-uid 0
>> option fixed-gid 900
>> subvolumes ioc
>> end-volume
>>
>> Dan Parsons
>>
>> On Dec 17, 2008, at 2:09 PM, Anand Avati wrote:
>>> Dan,
>>> Is it feasible for you to try 1.4.0pre4...
>
>



