Re: I/O performance

Xavi Hernandez <xhernandez@xxxxxxxxxx> · Fri, 1 Feb 2019 13:51:54 +0100

On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah <pgurusid@xxxxxxxxxx> wrote:
Can the threads be categorised to do certain kinds of fops?

Could be, but creating multiple thread groups for different tasks is generally bad because many times you end up with lots of idle threads which waste resources and could increase contention. I think we should only differentiate threads if it's absolutely necessary.

Affinitise reads/writes to a certain set of threads, and the other metadata fops to another set of threads. So we limit the read/write threads and not the metadata threads? Also, if AIO is enabled in the backend, the threads will not be blocked on disk I/O, right?

If we don't block the thread but also don't prevent more requests from going to the disk, then we'll probably have the same problem. Anyway, I'll try to run some tests with AIO to see if anything changes.

All this is based on the assumption that a large number of parallel reads and writes hurts disk performance, but a large number of dentry and metadata ops does not. Is that true?

It depends. If metadata is not cached, it's as bad as a read or write since it requires a disk access (a clear example of this is the bad performance of 'ls' with a cold cache, which is basically metadata reads). In fact, cached data reads are also very fast, and data writes could go to the cache and be updated later in the background, so I think the important point is whether things are cached or not, rather than whether they are data or metadata. Since we don't have this information from the user side, it's hard to tell what's better. My opinion is that we shouldn't differentiate between data and metadata requests. If metadata requests happen to be faster, then that thread will be able to handle other requests immediately, which seems good enough.

However, there's one thing that I would do: differentiate reads (data or metadata) from writes. Normally writes come from cached information that is flushed to disk at some point, so this normally happens in the background. But reads tend to be in the foreground, meaning that someone (user or application) is waiting for them. So I would give preference to reads over writes. To do so effectively, we need to avoid saturating the backend, otherwise when we need to send a read, it will still have to wait for all pending requests to complete. If disks are not saturated, we can have the answer to the read quite fast, and then continue processing the remaining writes.
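To make the idea a bit more concrete, here is a minimal sketch in Python (not Gluster code) of a pending-request queue that prefers reads and caps the number of requests sent to the backend. The class name, the method names and the max_in_flight limit are all hypothetical, chosen only for illustration.

```python
import heapq
import itertools

READ, WRITE = 0, 1  # lower value is dispatched first


class ReadPreferringQueue:
    """Pending-request queue that prefers reads and caps in-flight requests."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight  # hypothetical tunable
        self.in_flight = 0
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within each kind

    def submit(self, kind, request):
        heapq.heappush(self._heap, (kind, next(self._seq), request))

    def next_request(self):
        """Return the next request to send to disk, or None if we must wait."""
        if self.in_flight >= self.max_in_flight or not self._heap:
            return None
        _, _, request = heapq.heappop(self._heap)
        self.in_flight += 1
        return request

    def complete(self):
        """Call when the backend finishes a request, freeing a slot."""
        self.in_flight -= 1


if __name__ == "__main__":
    q = ReadPreferringQueue(max_in_flight=2)
    for name in ("w1", "w2", "w3"):
        q.submit(WRITE, name)
    print(q.next_request(), q.next_request())  # w1 and w2 go to disk
    q.submit(READ, "r1")  # a read arrives while w3 is still queued
    print(q.next_request())  # None: the backend is at its limit
    q.complete()  # one write finishes
    print(q.next_request())  # r1 is dispatched ahead of w3
```

Because at most max_in_flight requests are outstanding, a new read only waits for those, not for every queued write. Note that a steady stream of reads could starve writes in this sketch, so a real implementation would probably need some form of aging.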

Anyway, I may be wrong, since all these things depend on too many factors. I haven't done any specific tests about this. This is more of a brainstorm. As soon as I can, I would like to experiment with this and get some empirical data.

Xavi

Thanks,
Poornima

On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus <manu@xxxxxxxxxx> wrote:
On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote:

> Perhaps we could throttle both aspects - number of I/O requests per disk

While there, it would be nice to detect and report a disk with lower than
peer performance: that happens sometimes when a disk is dying, and last
time I was hit by that performance problem, I had a hard time finding
the culprit.
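
One possible way to flag such a disk, sketched in Python under loud assumptions: per-disk request latencies are already being collected somewhere, and a disk is "slow" when its mean latency exceeds a multiple of the median of its peers. The function name, the input format and the factor of 3 are hypothetical, and the sample numbers are made up.

```python
from statistics import median


def find_slow_disks(latencies_ms, factor=3.0):
    """Return disks whose mean latency exceeds `factor` times the peer median.

    latencies_ms maps a disk name to a list of recent request latencies.
    Both the input shape and the default factor are illustrative assumptions.
    """
    means = {disk: sum(v) / len(v) for disk, v in latencies_ms.items() if v}
    slow = []
    for disk, mean in means.items():
        # Compare each disk against the other disks only, not itself.
        peers = [m for d, m in means.items() if d != disk]
        if peers and mean > factor * median(peers):
            slow.append(disk)
    return sorted(slow)


if __name__ == "__main__":
    samples = {
        "sda": [4.0, 5.0, 6.0],
        "sdb": [5.0, 5.0, 5.0],
        "sdc": [40.0, 55.0, 70.0],  # made-up numbers for a failing disk
        "sdd": [6.0, 4.0, 5.0],
    }
    print(find_slow_disks(samples))  # ['sdc']
```

Using the median of the peers rather than their mean keeps a single bad disk from dragging the baseline up, which matters when only a few disks are compared.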

-- 
Emmanuel Dreyfus
manu@xxxxxxxxxx

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel
