Re: tabled test corpus?

On 03/05/2010 04:15 PM, Colin McCabe wrote:
> Random thoughts:
>
> Maybe something like a freely available dictionary would work, with
> the key as the word, and the value as the definition.
>
> You could grab git commits from the Linux kernel and make the key the
> SHA, and the value the patch.
>
> There's a lot of text in Project Gutenberg. I guess you'd have to
> decide what you want your average key / value lengths to be-- I think
> most books there are longer than 16K. Maybe you could make the key
> (book, page_number).
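
For illustration only, the (book, page_number) idea could be sketched roughly as below. This assumes a plain-text book already loaded as a string; the function name `paginate`, the `lines_per_page` parameter, and the "sample-book" identifier are all hypothetical and not part of tabled or any existing tool.

```python
def paginate(book_id, text, lines_per_page=40):
    """Yield ((book_id, page_number), page_text) pairs from a plain-text book.

    "Page" here is just a fixed number of lines, which is an assumption;
    plain-text books carry no real page boundaries.
    """
    lines = text.splitlines(keepends=True)
    for start in range(0, len(lines), lines_per_page):
        page_number = start // lines_per_page + 1
        yield (book_id, page_number), "".join(lines[start:start + lines_per_page])


# Tiny stand-in text; a real run would read a downloaded book file instead.
sample = "".join(f"line {i}\n" for i in range(100))
pairs = list(paginate("sample-book", sample, lines_per_page=40))
```

Raising `lines_per_page` is one way to push the average value size up if larger values are wanted.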

Yeah, I am definitely looking for something much larger than 16K. S3 values can run into the gigabytes each...

	Jeff