Re: Infiniband 40GB

Hi, 

I'm currently running some tests with XFS on Debian wheezy, with the standard libc6 (2.11.3-3) and a 3.2 kernel.

Watching iostat (3 nodes with 5 OSDs), I see constant writes to the disks, as if the data were flushed from the journal to disk every second.

The journal is big enough (20 GB on tmpfs) to absorb 30 s of writes.

Do you think this is related to the missing syncfs() support?
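
For reference, a quick way to tell whether a given libc is even expected to ship a syncfs() wrapper is to compare its version string against 2.14, the upstream glibc release that (as far as I know) first added the wrapper. The sketch below does only that comparison; the helper name glibc_has_syncfs is made up for illustration, and a version check says nothing about distribution backports such as a locally patched eglibc.

```shell
#!/bin/sh
# Sketch: guess from a glibc version string whether a syncfs() wrapper
# is expected. Assumption: the wrapper first appeared in glibc 2.14.
# glibc_has_syncfs is a hypothetical helper name, not an existing tool.

glibc_has_syncfs() {
    # $1: version string such as "2.11.3", "2.13" or "2.13-38"
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%.*}           # second component, may carry a suffix
    minor=${minor%%[!0-9]*}     # drop anything from the first non-digit
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "${minor:-0}" -ge 14 ]; }
}

# Usage on a live system (getconf prints e.g. "glibc 2.11.3"):
#   if glibc_has_syncfs "$(getconf GNU_LIBC_VERSION | cut -d' ' -f2)"; then
#       echo "libc should provide syncfs()"
#   else
#       echo "libc predates syncfs()"
#   fi
```

With the libc6 2.11.3-3 mentioned above, the helper returns non-zero, which matches the assumption that syncfs() is missing there.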

-Alexandre


----- Original message -----

From: "Amon Ott" <a.ott@xxxxxxxxxxxx>
To: "Yann Dupont" <Yann.Dupont@xxxxxxxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Sent: Monday 4 June 2012 11:47:22
Subject: Re: Infiniband 40GB

On Monday 04 June 2012 you wrote: 
> On 04/06/2012 10:23, Stefan Majer wrote:
> > Hi Hannes, 
> > 
> > our production environment is running on 10GB infrastructure. We had a
> > lot of trouble before we got to where we are today.
> > We use Intel X520 D2 cards in our OSDs and Nexus switch
> > infrastructure. All the other cards we tested failed horribly.
> 
> we have Intel Corporation 82599EB 10 Gigabit Dual Port Backplane 
> Connection (rev 01)... Don't know the 'commercial name'. ixgbe driver. 
> 
> > Some of the problems we encountered have been: 
> > - page allocation failures in the ixgbe driver --> fixed in upstream 
> > - problems with jumbo frames, we had to disable TSO, GRO, LRO -->
> > this is the most obscure thing
> > - various tuning via sysctl in the net.tcp and net.ipv4 area --> this
> > was also the outcome of Stefan's benchmarking odyssey.
> 
> some tuning we made : 
> 
> -> Turning off the virtualisation extensions in the BIOS. We don't know why,
> but with them enabled we got very poor performance. We usually turn them on
> because we use KVM a lot. In our case the OSDs run on bare metal, and
> disabling the virtualisation extensions gives us a very big boost.
> It may be a BIOS bug in our machines (DELL M610). 
> 
> -> One of my colleagues played with receive flow steering; the Intel
> card supports multiple queues, so it seems we can gain a little with it:
>
> #!/bin/sh
> # RPS: let CPUs 0-31 process packets from each of the 24 RX queues
> for x in $(seq 0 23); do echo FFFFFFFF > /sys/class/net/eth2/queues/rx-${x}/rps_cpus; done
> # RFS: size the global socket flow table
> echo 16384 > /proc/sys/net/core/rps_sock_flow_entries
> # RFS: size the per-queue flow table
> for x in $(seq 0 23); do echo 16384 > /sys/class/net/eth2/queues/rx-${x}/rps_flow_cnt; done
> 
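
A note on the quoted script: it hard-codes the mask FFFFFFFF and the queue range 0..23. A minimal sketch of deriving the mask from a CPU count follows. The helper name rps_mask is made up for illustration, it only covers up to 32 CPUs (larger masks need comma-separated 32-bit groups), and it assumes the shell does 64-bit arithmetic, as dash and bash do on common Linux builds.

```shell
#!/bin/sh
# Sketch: build the hex CPU bitmask that rps_cpus expects from a CPU count.
# rps_mask is a hypothetical helper name. Assumes 64-bit shell arithmetic.

rps_mask() {
    # $1: number of CPUs to enable, counted from CPU 0 (1..32)
    n=$1
    if [ "$n" -lt 1 ] || [ "$n" -gt 32 ]; then
        echo "rps_mask: CPU count must be between 1 and 32" >&2
        return 1
    fi
    # Set the lowest n bits and print them as lowercase hex
    printf '%x\n' $(( (1 << n) - 1 ))
}

# Usage (needs root; eth2 is the interface name from the script above):
#   mask=$(rps_mask 24)
#   for q in /sys/class/net/eth2/queues/rx-*; do
#       echo "$mask" > "$q/rps_cpus"
#   done
```

Globbing over rx-* instead of counting with seq avoids writing to queues that do not exist when the driver exposes fewer than 24 of them.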
> > But after all this we are actually quite happy and are only limited by
> > the speed of the drives (2TB SATA).
> > The fsync is in fact a fdatasync, which is available in newer glibc. If
> > you don't use btrfs (we use xfs) you need to use a recent glibc with
> > fdatasync support.
> 
> Might that explain why we see lousy performance with xfs right now?
> That's the main reason we're stuck with btrfs for the moment.
> 
> we're using debian 'stable' : libc is 
> libc6 2.11.3-3 
> probably too old ? 

One reason for performance problems with that libc6 version is missing 
syncfs() support. I backported a patch for 2.13, originally by Andreas 
Schwab, schwab@xxxxxxxxxx, to Debian stable code. Patch is attached. 

Copy the patch to eglibc's debian/patches/, add to debian/patches/series, 
rebuild eglibc packages (including libc6) with dpkg-buildpackage, install new 
libc6-dev, rebuild ceph packages against it, install and retry. AFAIK, not 
even libc6 in Debian experimental has syncfs() support. 
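
The first two steps above (copy the patch, add it to the series file) can be sketched as a small shell helper. This is only an illustration under assumptions: the helper name add_patch_to_series, the source directory name and the patch file name are placeholders, not the names used in the attachment.

```shell
#!/bin/sh
# Sketch: register a patch in a Debian source tree that uses debian/patches/series.
# add_patch_to_series is a hypothetical helper name.

add_patch_to_series() {
    # $1: unpacked source directory, $2: path to the patch file
    src_dir=$1
    patch_file=$2
    name=$(basename "$patch_file")
    cp "$patch_file" "$src_dir/debian/patches/$name" || return 1
    # Append the patch name only if the series file does not list it yet
    grep -qxF "$name" "$src_dir/debian/patches/series" 2>/dev/null ||
        echo "$name" >> "$src_dir/debian/patches/series"
}

# Usage (directory and patch names are placeholders):
#   apt-get source eglibc
#   add_patch_to_series eglibc-2.11.3 syncfs.diff
#   (cd eglibc-2.11.3 && dpkg-buildpackage -us -uc)
# Then install the new libc6 and libc6-dev and rebuild the ceph packages.
```

The -us -uc options only skip signing of the source package and changes file, which is usually what you want for a local rebuild.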

Also see thread "OSD deadlock with cephfs client and OSD on same machine" 

Amon Ott 
-- 
Dr. Amon Ott 
m-privacy GmbH Tel: +49 30 24342334 
Am Köllnischen Park 1 Fax: +49 30 24342336 
10179 Berlin http://www.m-privacy.de 

Amtsgericht Charlottenburg, HRB 84946 

Geschäftsführer: 
Dipl.-Kfm. Holger Maczkowsky, 
Roman Maczkowsky 

GnuPG-Key-ID: 0x2DD3A649 



-- 
Alexandre Derumier
Systems Engineer
Phone: 03 20 68 88 90
Fax: 03 20 68 90 81
45 Bvd du Général Leclerc 59100 Roubaix - France
12 rue Marivaux 75002 Paris - France