On Wed, 9 Nov 2011, Noah Watkins wrote:
> ----- Original Message -----
> > From: "Gregory Farnum" <gregory.farnum@xxxxxxxxxxxxx>
> >
> > I can't imagine anybody needing location data inside of a fast path,
> > in any case. So it probably shouldn't matter, and for any bizarre case
> > where it does we can probably expect a certain level of knowledge!
> > -Greg
>
> Interestingly, Hadoop's start-up phase is a big time sink. With 64 MB blocks
> and a 1 TB data set, we'd make something like 16,000 calls to get block
> location data which is serialized with the start of the job.

Hmm, interesting! In our case we just need the inodes in the libceph
client cache. If it's doing stats on files it knows are there, that'll be
O(num files), but if it is doing a readdir traversal type scan it'll be
O(num directories).

sage
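
For reference, the "something like 16,000 calls" figure quoted above can be
checked with a quick back-of-the-envelope sketch. It assumes binary units
(1 TB = 2^40 bytes, 64 MB = 2^26 bytes) and exactly one location lookup per
block. The function name is made up for illustration and is not part of
Hadoop or libceph.

```python
def block_location_calls(dataset_bytes: int, block_bytes: int) -> int:
    """Estimate per-block location lookups, assuming one call per block.

    Hypothetical helper for illustration only. Uses ceiling division so a
    trailing partial block still counts as one lookup.
    """
    if block_bytes <= 0:
        raise ValueError("block_bytes must be positive")
    return -(-dataset_bytes // block_bytes)


ONE_TB = 2**40          # assumed binary terabyte
BLOCK_64MB = 64 * 2**20  # assumed binary 64 MB block

calls = block_location_calls(ONE_TB, BLOCK_64MB)
print(calls)  # 16384
```

Under those assumptions the count comes out to 16,384, which matches the
rough 16,000 estimate in the quoted message.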