Linus Torvalds wrote:
So no, performance is not the only reason to move to kernel space. It can
easily be things like needing direct access to internal data queues (for a
iSCSI target, this could be things like barriers or just tagged commands -
yes, you can probably emulate things like that without access to the
actual IO queues, but are you sure the semantics will be entirely right?
The kernel/userland boundary is not just a performance boundary, it's an
abstraction boundary too, and these kinds of protocols tend to break
abstractions. NFS broke it by having "file handles" (which is not
something that really exists in user space, and is almost impossible to
emulate correctly), and I bet the same thing happens when emulating a SCSI
target in user space.
Well, speaking as a complete nutter who just finished the bare bones of
an NFSv4 userland server[1]... it depends on your approach.
If the userland server is the _only_ one accessing the data[2] -- i.e.
the database server model where ls(1) shows a couple multi-gigabyte
files or a raw partition -- then it's easy to get all the semantics
right, including file handles. You're not racing with local kernel
fileserving.
Couple that with sendfile(2), sync_file_range(2) and a few other
Linux-specific syscalls, and you've got an efficient NFS file server.
It becomes a solution similar to Apache or MySQL or Oracle.
I quite grant there are many good reasons to do NFS or iSCSI data path
in the kernel... my point is more that "impossible" is just from one
point of view ;-)
Maybe not. I _rally_ haven't looked into iSCSI, I'm just guessing there
would be things like ordering issues.
iSCSI and NBD were passe ideas at birth. :)
Networked block devices are attractive because the concepts and
implementation are more simple than networked filesystems... but usually
you want to run some sort of filesystem on top. At that point you might
as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your
networked block device (and associated complexity).
iSCSI is barely useful, because at least someone finally standardized
SCSI over LAN/WAN.
But you just don't need its complexity if your filesystem must have its
own authentication, distributed coordination, multiple-connection
management code of its own.
Jeff
P.S. Clearly my NFSv4 server is NOT intended to replace the kernel one.
It's more for experiments, and doing FUSE-like filesystem work.
[1] http://linux.yyz.us/projects/nfsv4.html
[2] well, outside of dd(1) and similar tricks... the same "going around
its back" tricks that can screw up a mounted filesystem.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html