Random thoughts: Maybe something like a freely available dictionary would
work, with the key as the word and the value as the definition. You could
grab git commits from the Linux kernel and make the key the SHA and the
value the patch. There's a lot of text in Project Gutenberg. I guess you'd
have to decide what you want your average key/value lengths to be -- I
think most books there are longer than 16K. Maybe you could make the key
(book, page_number).

Colin

P.S. I've been meaning to set up a bigger tabled installation myself, as
soon as I get some time.

On Fri, Mar 5, 2010 at 10:33 AM, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
> On 03/05/2010 10:31 AM, Jeff Garzik wrote:
>>
>> Can anybody suggest a good test dataset for tabled?
>>
>> Hopefully something with a million or more keys, where the values are
>> large.
>>
>> I can certainly generate something like that artificially, but a
>> real-world dataset would be nice.
>
> Still looking for a good, real-world data set.
>
> A synthetic store+retrieve test of 1m keys @ 16K values worked without a
> hitch. I documented this on
> http://hail.wiki.kernel.org/index.php/Extended_status
>
> Jeff
--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
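
[Editor's note: the synthetic "1m keys @ 16K values" test mentioned above could be driven by a generator along these lines -- a minimal sketch, not the actual harness Jeff used. The function name `synthetic_kv` and the seed string are made up for illustration; the point is that deterministic values let a later retrieve pass regenerate each expected value and compare, without holding 16 GB in memory.]

```python
import hashlib


def synthetic_kv(num_keys, value_len=16 * 1024, seed=b"tabled-test"):
    """Yield (key, value) pairs: hex-string keys and value_len-byte values.

    Fully deterministic: re-running with the same arguments reproduces
    the same pairs, so a verification pass can recompute the expected
    value for any key instead of storing it.
    """
    for i in range(num_keys):
        # Key: SHA-1 of seed + index, as a 40-char hex string.
        key = hashlib.sha1(seed + str(i).encode()).hexdigest()
        # Value: the key's own SHA-1 digest repeated out to value_len
        # bytes -- cheap to generate and cheap to regenerate on readback.
        block = hashlib.sha1(key.encode()).digest()
        reps = value_len // len(block) + 1
        yield key, (block * reps)[:value_len]


# The generator is lazy, so asking for a million keys costs nothing
# until you iterate; here we just peek at the first pair.
k, v = next(synthetic_kv(1_000_000))
```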