Re: read latency

Thanks Vladimir for the clarification!

Tony
> -----Original Message-----
> From: Vladimir Prokofev <v@xxxxxxxxxxx>
> Sent: Monday, November 2, 2020 3:46 AM
> Cc: ceph-users <ceph-users@xxxxxxx>
> Subject: Re: read latency
> 
> With sequential reads you get "read ahead" mechanics on top, which
> helps a lot.
> So let's say you do 4KB sequential reads with fio.
> By default, Ubuntu, for example, uses a 128KB read-ahead size. That
> means when you request those 4KB of data, the driver actually requests
> 128KB.
> When that IO is served and you request the next sequential 4KB, it is
> already in the VM's memory, so no new read IO is necessary.
> All of those 128KB will likely reside on the same OSD, depending on
> your Ceph object size.
> When you reach the end of that 128KB and request the next chunk, it
> will once again likely reside in the same RBD object as before,
> assuming a 4MB object size. So, depending on internal mechanics I'm
> not really familiar with, that data may already be in the host's
> memory, or at least in the OSD node's memory, so no real physical IO
> is necessary.
> What you're thinking about is the worst-case scenario: the 128KB is
> split between 2 objects residing on 2 different OSDs. You then get 2
> real physical IOs for that 1 virtual IO, and at that moment you have
> a slower request, but afterwards read-ahead again covers a long run
> of sequential IOs.
> In the end, read-ahead with sequential IO results in far fewer real
> physical reads than random reads do, hence the IOPS difference.
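>
> As a rough back-of-the-envelope sketch (not Ceph code, just the
> arithmetic, using the example values above - 4KB requests, 128KB
> read ahead, and an assumed 32MB of data read in total):
>
>   # How many reads actually leave the page cache and hit an OSD,
>   # for sequential (with read-ahead) vs. random 4KB reads.
>   READ_AHEAD = 128 * 1024          # typical Linux default, bytes
>   BLOCK = 4 * 1024                 # fio block size, bytes
>   TOTAL = 32 * 1024 * 1024         # total data read, bytes
>
>   seq_physical = TOTAL // READ_AHEAD   # 1 physical read per 128KB window
>   rand_physical = TOTAL // BLOCK       # every random 4KB read misses
>   print(seq_physical, rand_physical)   # 256 vs 8192, i.e. 32x fewer reads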
> 
> On Mon, 2 Nov 2020 at 06:20, Tony Liu <tonyliu0592@xxxxxxxxxxx> wrote:
> 
> > Another confusion about read vs. random read. My understanding is
> > that when fio does a read test, it reads from the test file
> > sequentially, and when it does a random read test, it reads from
> > the test file randomly.
> > That file read inside the VM comes down to a volume read handled by
> > the RBD client, which distributes reads to PGs and eventually to
> > OSDs. So a sequential file read inside the VM won't be a sequential
> > read on the OSD disk.
> > Is that right?
> > Then what difference do seq. and rand. reads make on the OSD disk?
> > Is it rand. read on the OSD disk in both cases?
> > Then how to explain the performance difference between seq. and
> > rand. reads inside the VM? (Seq. read IOPS is 20x that of rand.
> > read; Ceph has 21 HDDs on 3 nodes, 7 on each.)
> >
> > Thanks!
> > Tony
> > > -----Original Message-----
> > > From: Vladimir Prokofev <v@xxxxxxxxxxx>
> > > Sent: Sunday, November 1, 2020 5:58 PM
> > > Cc: ceph-users <ceph-users@xxxxxxx>
> > > Subject: Re: read latency
> > >
> > > Not exactly. You can also tune the network/software.
> > > Network - go for lower-latency interfaces. If you have 10G, go to
> > > 25G or 100G. 40G will not do though; AFAIK it's just 4x10G, so its
> > > latency is the same as 10G.
> > > Software - this is closely tied to your network card queues and
> > > processor cores. In short, tune affinity so that the packet
> > > receive queues and the OSD processes run on the same corresponding
> > > cores. Disabling processor power-saving features helps a lot. Also
> > > watch out for NUMA interference.
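> > >
> > > A minimal sketch of the affinity part in Python (Linux-only; the
> > > PID and core numbers below are made-up examples - in practice you
> > > would match them to the cores servicing your NIC's RX queue IRQs):
> > >
> > >   import os
> > >
> > >   osd_pid = 12345          # PID of a ceph-osd process (made-up example)
> > >   rx_queue_cores = {2, 3}  # cores handling the NIC RX IRQs (made-up)
> > >
> > >   # Pin the OSD process to the cores that receive its packets.
> > >   os.sched_setaffinity(osd_pid, rx_queue_cores)
> > >   print(os.sched_getaffinity(osd_pid))  # -> {2, 3}
> > >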
> > > But overall all these tricks will save you less than switching
> > > from HDD to SSD.
> > >
> > > On Mon, 2 Nov 2020 at 02:45, Tony Liu <tonyliu0592@xxxxxxxxxxx> wrote:
> > >
> > > > Hi,
> > > >
> > > > AFAIK, read latency primarily depends on HW latency; not much
> > > > can be tuned in SW. Is that right?
> > > >
> > > > I ran a fio random-read test with iodepth 1 inside a VM backed
> > > > by Ceph with HDD OSDs, and here is what I got.
> > > > =================
> > > >    read: IOPS=282, BW=1130KiB/s (1157kB/s)(33.1MiB/30001msec)
> > > >     slat (usec): min=4, max=181, avg=14.04, stdev=10.16
> > > >     clat (usec): min=178, max=393831, avg=3521.86, stdev=5771.35
> > > >      lat (usec): min=188, max=393858, avg=3536.38, stdev=5771.51
> > > > =================
> > > > I checked: the HDD's average latency is 2.9 ms. It looks like
> > > > the test result makes perfect sense, doesn't it?
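> > > >
> > > > Quick sanity check on those numbers (just arithmetic; with
> > > > iodepth 1 there is a single IO in flight, so IOPS ~= 1 / avg
> > > > completion latency):
> > > >
> > > >   avg_clat_s = 3521.86e-6       # fio's avg clat, in seconds
> > > >   print(round(1 / avg_clat_s))  # -> 284, close to the measured 282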
> > > >
> > > > If I want lower latency (more IOPS), I will have to go for
> > > > better disks, e.g. SSDs. Right?
> > > >
> > > >
> > > > Thanks!
> > > > Tony
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx