Re: What's the correct way to enable direct-IO?

This is great info - with a lot of options to take in :)

To summarise, to enable direct-io and bypass the kernel filesystem cache for a volume:
1. Mount the volume on the client with the direct-io-mode=enable option
2. Run gluster volume set <vol> performance.strict-o-direct on
3. Update the vol files with the 'o-direct' option in storage/posix (at least for now)

Is that right?



On Thu, Feb 25, 2016 at 5:56 PM, Raghavendra Gowdappa <rgowdapp@xxxxxxxxxx> wrote:


----- Original Message -----
> From: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
> To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>, "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> Cc: "Paul Cuzner" <pcuzner@xxxxxxxxxx>
> Sent: Thursday, February 25, 2016 7:28:30 AM
> Subject: What's the correct way to enable direct-IO?
>
> Hi,
>
> git-grep tells me there are multiple options in our code base for enabling
> direct-IO on a gluster volume, at several layers in the translator stack:
> i) use the mount option 'direct-io-mode=enable'

This option sits between the kernel and glusterfs. Specifically, it asks the fuse kernel module to bypass the page cache. Note that when this option is set, direct-io is enabled for _all_ fds, irrespective of whether applications used O_DIRECT in their open/create calls.
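
As an illustration, a client-side mount with this option could look like the sketch below. The server name, volume name and mount point are hypothetical placeholders, and the script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own server, volume and mount point.
SERVER="server1"
VOLUME="testvol"
MOUNTPOINT="/mnt/testvol"

# direct-io-mode=enable asks the fuse kernel module to bypass the page cache
# for every fd on this mount, whether or not the application uses O_DIRECT.
mount_cmd="mount -t glusterfs -o direct-io-mode=enable ${SERVER}:/${VOLUME} ${MOUNTPOINT}"

# Dry run by default; RUN=1 executes the command (needs root and a reachable volume).
if [ "${RUN:-0}" = "1" ]; then
    $mount_cmd
else
    echo "$mount_cmd"
fi
```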

> ii) enable 'network.remote-dio' which is a protocol/client option using
> volume set command

This option was introduced by [1] to _filter_ O_DIRECT flags out of open/create calls before those requests are sent to the server. The option name is misleading. Note, however, that this is the key (alias?) used by glusterd. The actual option name used by protocol/client is "filter_O_DIRECT", and it's fine. Perhaps we should file a bug on glusterd to change the name?

Coming to your use case, we don't want O_DIRECT to be filtered out before it reaches the brick. Hence, this option needs to be _off_ (it is disabled by default).

I am still not sure how this option is relevant to the bug it was introduced for. If we need direct-io, we have to pass the flag to the brick too, so that the backend fs on the brick is configured appropriately.

[1] http://review.gluster.org/4206
[2] https://bugzilla.redhat.com/show_bug.cgi?id=845213

> iii) enable performance.strict-o-direct which is a performance/write-behind
> option using volume-set command

Yes, write-behind honours O_DIRECT only if this option is set. So we need to enable it for your use case. Also note that applications still need to use O_DIRECT in their open/create calls.

To summarize, the following are the ways to bypass the write-behind cache:
1. Disable write-behind :).
2. Applications use O_SYNC/O_DSYNC in open calls.
3. Enable performance.strict-o-direct _and_ have applications use O_DIRECT in open/create calls.
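
For a quick check of options 2 and 3 from the application side, a sketch using GNU dd is given below. It assumes GNU coreutils, whose dd maps oflag=dsync to O_DSYNC and oflag=direct to O_DIRECT. TARGET_DIR and the file names are hypothetical and would point at a directory on the gluster mount.

```shell
#!/bin/sh
# Write one 4 KiB block with O_DSYNC (GNU dd maps oflag=dsync to O_DSYNC).
write_dsync() {
    dd if=/dev/zero of="$1" bs=4096 count=1 oflag=dsync 2>/dev/null
}

# Write one 4 KiB block with O_DIRECT (GNU dd maps oflag=direct to O_DIRECT).
# write-behind honours the flag only if performance.strict-o-direct is on.
# This can fail with EINVAL on filesystems that do not support O_DIRECT.
write_direct() {
    dd if=/dev/zero of="$1" bs=4096 count=1 oflag=direct 2>/dev/null
}

# Hypothetical usage: TARGET_DIR would be a directory on the gluster mount.
if [ -n "${TARGET_DIR:-}" ]; then
    write_dsync "${TARGET_DIR}/dsync-test.bin" && echo "O_DSYNC write ok"
    write_direct "${TARGET_DIR}/direct-test.bin" && echo "O_DIRECT write ok"
fi
```

A successful write only shows that the flag was accepted at open time; it does not by itself prove that every cache in the stack was bypassed.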

> iv) use 'o-direct' option in storage/posix, volume-set on which reports that
> the option doesn't exist.

The option exists in storage/posix, but there is no way to set it through the cli (you could probably send a patch to add that if necessary). With this option, O_DIRECT is passed with _every_ open/create call on the brick.
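
Since the cli cannot set it, the option has to be added to the brick volfile by hand. A minimal sketch of that edit follows; the sample volfile is a hypothetical, heavily trimmed one, and the helper name add_o_direct is made up for this example. It prints the modified volfile and leaves the original file alone.

```shell
#!/bin/sh
# Print the volfile given as $1 with "option o-direct on" added right after
# the "type storage/posix" line. The original file is not modified.
add_o_direct() {
    awk '
        { print }
        $1 == "type" && $2 == "storage/posix" { print "    option o-direct on" }
    ' "$1"
}

# Hypothetical, heavily trimmed brick volfile used only for demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
volume testvol-posix
    type storage/posix
    option directory /bricks/brick1
end-volume
EOF

add_o_direct "$sample"
rm -f "$sample"
```

Keep in mind that hand edits to volfiles managed by glusterd may be overwritten when the volfiles are regenerated, so the change may need to be reapplied.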

>
> So then the question is - what is a surefire way to get direct-io-like
> behavior on gluster volume(s)?

There is no single global option. You need to configure various translators in the stack. Probably [2] was asking for such a feature. Also, as you might have noticed above, the behavior/interpretation of these options is not the same across all translators (some are global, some apply only to an fd, etc.).

Also note that, apart from the options you listed above:
1. Quick-read is not aware of O_DIRECT. We need to make it disable caching if the open happens with O_DIRECT.
2. Handling of Quota Marker xattrs is not synchronous (though this is not exactly an O_DIRECT requirement), as marking is done after the reply to calls like writev has been sent.
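
Putting the volume-level settings together, a rough sketch could look like the one below. The function name direct_io_plan and the volume name are hypothetical. Turning quick-read off is an assumption of this sketch, a workaround for it not being O_DIRECT-aware, and not something established above. The mount option and the storage/posix 'o-direct' option are not covered here because they are not set through volume set.

```shell
#!/bin/sh
# Print the volume-set commands discussed in this thread for volume $1.
# Nothing is executed; pipe the output to sh on a server node to apply it.
direct_io_plan() {
    vol="$1"
    # Do not filter O_DIRECT in protocol/client (off is already the default).
    echo "gluster volume set ${vol} network.remote-dio off"
    # Make write-behind honour O_DIRECT.
    echo "gluster volume set ${vol} performance.strict-o-direct on"
    # Assumption, not from the thread: turn quick-read off because it is not
    # O_DIRECT-aware.
    echo "gluster volume set ${vol} performance.quick-read off"
}

direct_io_plan "testvol"   # "testvol" is a hypothetical volume name
```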

On a related note, I found article [3] to be informative.

[3] https://lwn.net/Articles/457667/

regards,
Raghavendra.
