On Apr 15, 2010, at 1:19 AM, ales-76@xxxxxxxxx wrote:
Hello,
I'm not sure if I understand your requirements correctly, but I
have my doubts about NILFS being a silver bullet for latency
constrained workloads.
Thanks for your valuable input, Ales Blaha et al.; I really appreciate your quick reply!
First, the idea behind log-structured file systems is to optimize
for writes - batching several unrelated write requests into a
single continuous write. Your workload seems to be read-mostly
I'm not so much looking for a single silver bullet from the good old days; instead I'm looking for read speeds from the OS more like 21st-century electronic bullet fire rates.
Correct, it is only reads. Basically, see it as a big ROM that *sometimes* gets used to terminate a search when the absolute truth about a specific entity (position) is already known.
(BTW, ReiserFS is noted for very good small-file performance and it has worked stably for me).
Well, I expect that if you only do reads, basically any FS will be reasonably bug-free.
I would not use ReiserFS for mission-critical stuff too quickly; I had bad experiences there, but that is an entirely different discussion, as I have no statistical evidence other than that it messed up too much for me :)
I had the same stability problems with ext3 and software RAID-0 in the Linux kernel. Always problems.
However, what matters here is a single mainboard where I want to do small I/O reads as fast as possible while the CPUs are busy with their artificial intelligence job, so it shouldn't eat massive system time either. It all comes down to the total time of a single read, repeated as many times a second as I can.
Acceptable is a load of 10%, so basically every core is allowed to lose 10% of its time to I/O reads. Now how many reads per second per core can I do within that budget?
So to speak, if getting 1 byte is really fast in a specific filesystem, I can move to the byte level, as I store 5 positions in 1 byte (3^5 = 243, so that fits in 1 byte). Each position can have, in a simplified form of fuzzy logic, 3 realities: win, draw or loss. So I can really limit the length of the reads big time.
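For illustration only (a sketch of the base-3 packing idea, not necessarily the exact encoding my generator uses), packing and unpacking 5 win/draw/loss values per byte could look like:

#include <stdint.h>

/* Win/draw/loss value of a single position, one base-3 digit. */
enum wdl { LOSS = 0, DRAW = 1, WIN = 2 };

/* Pack 5 WDL values into one byte: 3^5 = 243 <= 256, so it fits.
 * vals[0] ends up in the least significant base-3 digit. */
static uint8_t pack5(const enum wdl vals[5])
{
    uint8_t b = 0;
    for (int i = 4; i >= 0; i--)
        b = (uint8_t)(b * 3 + vals[i]);
    return b;
}

/* Extract the idx-th (0..4) WDL value from a packed byte. */
static enum wdl unpack1(uint8_t b, int idx)
{
    for (int i = 0; i < idx; i++)
        b /= 3;
    return (enum wdl)(b % 3);
}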
However, as things in the past were tuned to magnetic-disk latencies, things also get cached in RAM. For this I have designed an O(1) lookup table, which is quite sophisticated; the aim is to use the RAM efficiently.
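I won't paste the real table here, but as a minimal sketch of the general O(1) idea (a plain direct-mapped cache; the sizes and names below are made up for illustration, and it assumes position index 0 is never probed so 0 can mark an empty slot):

#include <stdint.h>

/* Minimal sketch of a direct-mapped O(1) cache: position index -> packed byte.
 * 4M entries of 16 bytes each is 64 MB; one such table per core keeps
 * 16 cores around 1 GB in total. Not the real design, just the idea. */
#define CACHE_ENTRIES (1u << 22)

struct cache_entry {
    uint64_t key;    /* position index stored in this slot (0 = empty) */
    uint8_t  value;  /* packed win/draw/loss byte                      */
};

static struct cache_entry cache[CACHE_ENTRIES];

static int cache_probe(uint64_t pos_index, uint8_t *out)
{
    struct cache_entry *e = &cache[pos_index & (CACHE_ENTRIES - 1)];
    if (e->key == pos_index) {      /* hit: a single indexed access, O(1) */
        *out = e->value;
        return 1;
    }
    return 0;                       /* miss: caller reads from the device */
}

static void cache_store(uint64_t pos_index, uint8_t value)
{
    struct cache_entry *e = &cache[pos_index & (CACHE_ENTRIES - 1)];
    e->key   = pos_index;           /* direct-mapped: just overwrite */
    e->value = value;
}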
Much depends upon the speed at which data can be read from the flash 'USB stick', so to speak, or SD card, or whatever I can get cheap at sizes of say 64 GB or more.
Second, performance of any file system depends on performance of the underlying hardware, and magnetic disks simply cannot deliver enough IOPS, especially for random reads/writes.
Of course we are all aware of that, but also that for 100 euro you can have a terabyte of magnetic disk, whereas for a terabyte of low-latency storage you can buy all of Greece nowadays.
So if you want really low latency and high IO rate you probably
need to either go for SSDs (and NILFS for that matter),
Apologies for my stupid mathematical logic in this, even though I have never studied math of course; I only kept myself busy with numbers and my own theories on how to manipulate them (google for probable primes and Diepeveen), as opposed to the math guys who apply lemmas in a braindead manner.
However, following logical thinking: you first say that I should consider ReiserFS for doing a lot of low-latency reads, and now suddenly I must consider NILFS for exactly the same thing? I'm not following that logic. Can you explain?
or keep the whole working set in memory (which is usually
prohibitive in terms of price).
The algorithms for the main search speed up exponentially with fast RAM accesses, so we are speaking of a shared-memory type of system; sure, my engine also ran on a supercomputer such as the SGI Origin 3800 with 1024 processors, of which I could use a partition of 512 processors for the search. However, that is very low-latency communication between the processors over the shared-memory network; not comparable with the latencies of gigabit ethernet, which in practice are a factor 1000 slower (forget about the paper claims here, it really is ugly slow), and which 'en passant' (French for 'in passing', excusez le mot) in its ugly slowness also jams all the cores of the processors while doing that (no DMA, huh?).
So, to keep it simple-minded, I'm looking at single-mainboard I/O speed, where speed is just the number of reads I can do with 16 cores. First this core does a read, then that core, and so on. The cores do not know of each other whether and when they do a read, if they do one at all. Chaining really happens a lot there; a core that is doing a read now has a high chance of doing another read next time. In fact, odds are good it is a read of a position close to a previously read one, which is why I cache it in RAM with some small buffer.
Say a gigabyte of RAM or so in total for all the caches of all cores together.
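Purely as an illustration of that buffering (not my actual scheme; the 4 KB block size and the "5 positions per byte" layout are just assumptions), a tiny per-core buffer of the last block read makes a run of nearby probes cost a single pread():

#include <stdint.h>
#include <unistd.h>

/* Per-core one-block read buffer that exploits locality of nearby probes.
 * BLOCK_SIZE and the "5 positions per byte" layout are illustrative
 * assumptions, not the real EGTB format. */
#define BLOCK_SIZE 4096

struct blockbuf {
    int     fd;                /* open tablebase file                  */
    int64_t block;             /* buffered block number, -1 = none yet */
    uint8_t data[BLOCK_SIZE];
};

/* Return the packed byte holding pos_index, or -1 on I/O error. */
static int probe_byte(struct blockbuf *bb, uint64_t pos_index)
{
    uint64_t byte_off = pos_index / 5;             /* 5 positions per byte */
    int64_t  blk      = (int64_t)(byte_off / BLOCK_SIZE);

    if (blk != bb->block) {                        /* miss: fetch the block */
        ssize_t n = pread(bb->fd, bb->data, BLOCK_SIZE,
                          (off_t)blk * BLOCK_SIZE);
        if (n <= 0)
            return -1;
        bb->block = blk;
    }
    return bb->data[byte_off % BLOCK_SIZE];        /* hit: pure RAM access */
}

Each core would keep its own struct blockbuf per open tablebase, so a hit never touches the device and a miss costs exactly one 4 KB pread().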
Third, there are other things than file systems that are probably more suited to the task. I mean, you can put several machines together to form a cluster and then run a distributed key-value store on it.
If you google me you'll see I'm also on the Beowulf mailing list. See above for clusters. I'm interested in maximizing read speed to a device, or even using several devices; for example I would use a USB stick of 16 GB and an SD card of 32 GB etc., were it not that the USB stick would jam everything (central locking somewhere?).
Of course 32 GB is rather small, so odds are it would have to be an SSD or something, were it not that just 128 GB is already 200 euro, far outside the budget; who knows, one day though...
3000 euro for a 1 TB SSD is rather expensive, I'd argue. Would those still be low latency?
Please note I'm not aware what latency we can expect from SATA as a standard, as I see some of those 1 TB SSDs are in fact PCI-E cards with a disk put on top of them.
Of course PCI-E, if we look at the fastest network cards, can deliver around 1 us latency hands down, which is really a lot faster than the 75 us quoted for SSDs, so I assume the card format is also needed for additional cooling.
Key-value stores are something in between a file system and a relational database. These are optimized for retrieving/storing small objects with low latency. The best-known key-value store is probably Amazon's Dynamo (http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html), but there are others. You can find a decent introduction here: http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
I understand that my answers are probably not what you expected, but that's what I think. And I'm sure others will come up with something more substantial and more related to NILFS performance for your workload.
Yeah, so they say; Amazon personnel living in the clouds (computing) might have plenty of time right now spamming the net...
Let's hope the discussion here stays about NILFS :)
Thanks for your contribution.
Vincent
Cheers
Ales Blaha
hi all,
I read an interesting article online on NILFS suggesting it would be OK for low latency. Very interesting.
Now my use case is rather simple. It is for read-only access to EGTBs (chess endgame tablebases).
During the game tree search of (for example) a chess program, when it reaches far enough into the endgame, it will go into the file system and do a lot of random reads.
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html