The following changes since commit 1b10477b21157800f030c3ec91511a810e75e4c7:

  Add support for HDFS IO engine (2014-08-13 13:36:52 -0600)

are available in the git repository at:

  git://git.kernel.dk/fio.git master

for you to fetch changes up to b74e419ec6152ae2dd4b9f36c2559961f4fab5cf:

  Update libhdfs engine documention and options (2014-08-14 11:45:16 -0600)

----------------------------------------------------------------
Manish Mandlik (1):
      Update libhdfs engine documention and options

 HOWTO                | 16 +++++++++++++++-
 examples/libhdfs.fio |  8 ++++++++
 fio.1                | 12 ++++++++++--
 options.c            |  4 ++--
 4 files changed, 35 insertions(+), 5 deletions(-)
 create mode 100644 examples/libhdfs.fio

---

Diff of recent changes:

diff --git a/HOWTO b/HOWTO
index d728353..a0b89c8 100644
--- a/HOWTO
+++ b/HOWTO
@@ -694,7 +694,21 @@ ioengine=str	Defines how the job issues io to the file. The following
 				having to go through FUSE. This ioengine
 				defines engine specific options.
 
-			hdfs	Read and write through Hadoop (HDFS).
+			libhdfs	Read and write through Hadoop (HDFS).
+				The 'filename' option is used to specify host,
+				port of the hdfs name-node to connect. This
+				engine interprets offsets a little
+				differently. In HDFS, files once created
+				cannot be modified. So random writes are not
+				possible. To imitate this, libhdfs engine
+				expects bunch of small files to be created
+				over HDFS, and engine will randomly pick a
+				file out of those files based on the offset
+				generated by fio backend. (see the example
+				job file to create such files, use rw=write
+				option). Please note, you might want to set
+				necessary environment variables to work with
+				hdfs/libhdfs properly.
 
 			external	Prefix to specify loading an external
 				IO engine object file. Append the engine
diff --git a/examples/libhdfs.fio b/examples/libhdfs.fio
new file mode 100644
index 0000000..d5c0ba6
--- /dev/null
+++ b/examples/libhdfs.fio
@@ -0,0 +1,8 @@
+[global]
+runtime=300
+
+[hdfs]
+filename=dfs-perftest-base.dfs-perftest-base,9000
+ioengine=libhdfs
+rw=read
+bs=256k
diff --git a/fio.1 b/fio.1
index b5ff3cc..c61948b 100644
--- a/fio.1
+++ b/fio.1
@@ -613,8 +613,16 @@ Using Glusterfs libgfapi async interface to direct access to Glusterfs volumes w
 having to go through FUSE. This ioengine defines engine specific
 options.
 .TP
-.B hdfs
-Read and write through Hadoop (HDFS)
+.B libhdfs
+Read and write through Hadoop (HDFS). The \fBfilename\fR option is used to
+specify host,port of the hdfs name-node to connect. This engine interprets
+offsets a little differently. In HDFS, files once created cannot be modified.
+So random writes are not possible. To imitate this, libhdfs engine expects
+bunch of small files to be created over HDFS, and engine will randomly pick a
+file out of those files based on the offset generated by fio backend. (see the
+example job file to create such files, use rw=write option). Please note, you
+might want to set necessary environment variables to work with hdfs/libhdfs
+properly.
 .RE
 .P
 .RE
diff --git a/options.c b/options.c
index 484efc1..3acfdc8 100644
--- a/options.c
+++ b/options.c
@@ -672,7 +672,7 @@ static int str_numa_mpol_cb(void *data, char *input)
 		}
 		td->o.numa_memnodes = strdup(nodelist);
 		numa_free_nodemask(verify_bitmask);
-
+
 		break;
 	case MPOL_LOCAL:
 	case MPOL_DEFAULT:
@@ -1542,7 +1542,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 			  },
 #endif
 #ifdef CONFIG_LIBHDFS
-			  { .ival = "hdfs",
+			  { .ival = "libhdfs",
 			    .help = "Hadoop Distributed Filesystem (HDFS) engine"
 			  },
 #endif
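
The libhdfs text in the patch says the small files must exist before a
read job can pick among them, and that they are created by running the
example job with rw=write. A minimal sketch of such a job follows. It
assumes nothing beyond examples/libhdfs.fio with the rw value changed;
the section name "hdfs-create" is made up for illustration, and the
name-node host and port are simply the ones from the example file.

    ; Hypothetical write-side variant of examples/libhdfs.fio.
    ; Assumption: only rw differs from the read job in the patch.
    [global]
    runtime=300

    [hdfs-create]
    ; host,port of the HDFS name-node (values copied from the example)
    filename=dfs-perftest-base.dfs-perftest-base,9000
    ioengine=libhdfs
    ; create the small files that a later rw=read job will pick from
    rw=write
    bs=256k

Running this job first and the rw=read job afterwards matches the order
the documentation implies. Which environment variables libhdfs needs is
not stated in the patch, so that part is left to the local Hadoop setup.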