On Wed, 2012-05-09 at 13:52 +0930, Tim wrote:
> On Tue, 2012-05-08 at 16:02 -0430, Patrick O'Callaghan wrote:
> > As I tried to explain, rewriting a couple of apps is not going to
> > hack it. The apps don't *know* they're using a networked filesystem,
> > they're just accessing files. They could find out and try to take
> > measures, but then what about all the other apps that also write
> > files? Rewrite tar, cpio, dd, cat, ...?
> >
> > The price of treating a networked fs as equivalent to a local one is
> > that you get screwed when it doesn't behave like a local one.
> > Dealing with this in a coherent and consistent way is hard. See the
> > literature on distributed filesystems. The semantics of an NFS
> > system are *not* the same as a local system. We brush this under the
> > carpet most of the time because it usually works, but sometimes the
> > differences bite.
>
> And thinking out loud... In Linux, when anything wants file system
> access, does it directly access the file system, or does it ask the
> system to access it?

How can a program possibly access the file system without being mediated
by the system? "File systems" are an abstraction maintained by the
system, and programs have no direct access to the media that store them
(using /dev/whatever is just a lower-level abstraction).

> If it's direct access, then I can see that you'd need to change every
> program that wants access. But if everything asks the system to access
> the drive, then you have the potential to change how the system works,
> solving the problem in (mostly) one place.

Yes, hypothetically you could rewrite the kernel to support a different
set of abstractions (not as simple as adding a new file system, since
you have to deal with the VFS layer, which covers all of them). Then
rewrite the application programs so they understand the new
abstractions, i.e. a new API.
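To make the mediation point concrete, here's a minimal sketch (Python
used purely for illustration; the temp-file path is made up, not from
this thread). Every os.* call below is a thin wrapper over the
corresponding system call, and the kernel's VFS layer decides which
filesystem the bytes actually hit; the program has no way to tell ext4
from NFS from tmpfs, which is exactly the point:

```python
# Sketch: every file operation is a request to the kernel, never direct
# media access.  os.open/os.write/os.read wrap the open(2)/write(2)/
# read(2) system calls; the VFS layer routes them to whatever filesystem
# backs the path.  The path below is illustrative only.
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "vfs-demo.txt")

fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)  # open(2)
os.write(fd, b"hello")                                          # write(2)
os.lseek(fd, 0, os.SEEK_SET)                                    # lseek(2)
data = os.read(fd, 16)                                          # read(2)
os.close(fd)                                                    # close(2)
os.unlink(path)                                                 # unlink(2)

print(data)  # b'hello'
```

Whether /tmp is tmpfs or a disk, the code is identical; that uniformity
is what makes "rewrite every app" the only alternative to changing the
kernel-side abstraction.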
It's not clear that what you would end up with could still be called
Linux, and you'd better make the new API an extension of the old one or
you won't get any users, but sure.

> i.e. The ability to set more reasonable timeout periods (seconds, not
> minutes or hours).

The overwhelming majority of programs don't deal in timeouts of any
length. Programmers prefer a simple file access abstraction: if I can
open the file, I can access it until I close it. It's worth noting that
the clean file model (no "access methods", no "fixed versus variable
length record structure", no "character versus binary" files, no "end
of file mark", etc.) was an important reason for the original success
of Unix, without which we wouldn't even be here.

> And for the system to report access success or failure to whatever
> wanted to access the drives, and that accessing program would have to
> accept failure (this part being a problem that has to be implemented
> in each application - though they should already have failure handling
> built in, unless programmed by a fool).

Most of the commonly used programs deal with simple failures such as
non-existent or protected files, out of disk space, etc. Few if any are
written to deal with the file suddenly disappearing in the middle of an
access.

> That'd prevent the infinite waits for a non-available file system, and
> deal with programs thinking they're writing files when they're really
> dumping data nowhere.

How do you know they're infinite? The answer is that you don't; you're
just guessing (see the recent "down vs disconnected" discussion). In
practice, "infinite" means "until the user's patience runs out". One
user's "infinite" wait is another user's slow system. The low-level
network ops are already timing out and retrying, and at some point they
may work, but there's no way to know. I strongly recommend Jerry
Saltzer's classic paper "End-to-End Arguments in System Design"
(http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf).
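On the point about applications having to accept failure: the catch
with a networked fs is *when* the failure is reported. A hedged sketch
(Python; write_carefully and the path are my own illustrative names,
nothing from this thread): on NFS, write(2) can "succeed" into the
client's page cache and the real error only surface at fsync(2) or
close(2), so a program that wants to know its data arrived has to check
all three.

```python
import os
import tempfile

def write_carefully(path, payload):
    """Return True only if the data demonstrably reached the filesystem.

    On NFS, write(2) may merely fill the client's cache; a server-side
    error often shows up only at fsync(2) or close(2), so both must be
    checked as well.  (Illustrative sketch, not from the thread.)
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    except OSError:
        return False                 # open itself failed
    ok = True
    try:
        if os.write(fd, payload) != len(payload):
            ok = False               # short write
        os.fsync(fd)                 # force data out; deferred errors surface here
    except OSError:
        ok = False
    try:
        os.close(fd)                 # close(2) can report deferred errors too
    except OSError:
        ok = False
    return ok

path = os.path.join(tempfile.gettempdir(), "careful-demo.txt")
print(write_carefully(path, b"some data"))  # True on a healthy local fs
```

Note that this still doesn't help with the hang itself: on a hard mount
with the server unreachable, the fsync simply blocks in D state, which
is exactly the "how do you know it's infinite?" problem above.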
In some circumstances, the lower levels *cannot* resolve every problem.
No matter how cleverly you design it, there will always be cases when
the abstraction of a reliable network breaks down and you have to deal
with messy reality at the only level where it's possible to make an
intelligent decision. The use of NFS soft mounts is an example where
the user can intervene and deal with the consequences, but D-state
hangs are not confined to network file systems.

poc
--
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org