On Fri, 17 Jun 2016, Allen Samuels wrote:
> > -----Original Message-----
> > From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> > Sent: Friday, June 17, 2016 10:55 AM
> > To: Allen Samuels <Allen.Samuels@xxxxxxxxxxx>
> > Cc: ceph-devel@xxxxxxxxxxxxxxx
> > Subject: RE: encoding
> >
> > On Fri, 17 Jun 2016, Allen Samuels wrote:
> > > I'm just trying to work the encoding size.
> > >
> > > Igor has pointed out that unused could be replaced by a bitmap and
> > > suggested that it could be quite small (4 or 8 bytes) -- though he
> > > cites some particular examples about it being "small".
> > >
> > > Do we actually need a full bitmap? Might not a simple left-off pointer
> > > be almost as good?
> >
> > He's right that it's only useful for a min_alloc_size'd blob, where we expect
> > the bitmap to be about 16 bits. We could have a left/right boundary that
> > encodes into 8 bits (two 0..15 values) that would probably capture 70% of the
> > performance benefit and save one byte. I'm inclined to go for a bitmap for
> > now, though...
> >
> > > w.r.t. ref_map. The comment in the code suggests that it will always
> > > be empty for a non-shared blob. If that's correct perhaps it's not a
> > > big deal.
> >
> > That was the original thought, but it didn't end up being the case. We could
> > write a non-trivial chunk of code to rebuild that info at runtime from the
> > lextent map, but I'd put that pretty far down on the list too.
> > Instead, we can probably take advantage of the fact that most of the ref
> > counts will be 1 or 0 and combine those ~2 bits into a more efficient record_t
> > encoding. This should be sequenced after the first pass of varint encodings,
> > though, which I'm partway through stabilizing.
> > Should have a PR by Monday.
>
> With some deeper thinking, it seems to me that the
> rebuild-ref-map-on-deserialize is pretty trivial.
> Isn't it just a walk of the lextent map and for each entry a
> corresponding call to add_ref for that referenced blob (assuming
> add_ref correctly combines overlapping references)????

You're exactly right. We encode the "blob map" and append it to the
encoded onode or put it in the bnode key. I think we just need to make
2 variations of encode (to elide the ref_map), and to make the onode
decode case rebuild it as you say.

sage
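
For illustration, here is a minimal sketch of the rebuild-on-decode idea discussed above, written in Python rather than BlueStore's C++. RefMap, rebuild_ref_maps, and the (blob_id, blob_offset, length) tuple layout are hypothetical stand-ins, not the actual BlueStore types. The only behavior assumed is the one named in the thread: add_ref combines overlapping references.

```python
from collections import defaultdict


class RefMap:
    """Hypothetical stand-in for a blob's ref_map: ref counts per extent."""

    def __init__(self):
        # boundary offset -> net change in ref count at that offset
        self._delta = defaultdict(int)

    def add_ref(self, offset, length):
        # Overlapping references simply stack up in the sweep below.
        self._delta[offset] += 1
        self._delta[offset + length] -= 1

    def extents(self):
        """Return [(offset, length, refs)] for every referenced range."""
        out = []
        refs = 0
        prev = None
        for pos in sorted(self._delta):
            if refs > 0:
                last = out[-1] if out else None
                if last and last[2] == refs and last[0] + last[1] == prev:
                    # Adjacent range with the same count: extend it.
                    out[-1] = (last[0], last[1] + pos - prev, refs)
                else:
                    out.append((prev, pos - prev, refs))
            refs += self._delta[pos]
            prev = pos
        return out


def rebuild_ref_maps(lextent_map):
    """Walk the lextent map and add one ref per entry to the target blob.

    lextent_map: {logical_offset: (blob_id, blob_offset, length)}
    (assumed layout, for illustration only).
    """
    ref_maps = defaultdict(RefMap)
    for _logical, (blob_id, blob_offset, length) in sorted(lextent_map.items()):
        ref_maps[blob_id].add_ref(blob_offset, length)
    return dict(ref_maps)


if __name__ == "__main__":
    lextents = {
        0: (1, 0, 4096),
        4096: (1, 4096, 4096),
        8192: (2, 0, 8192),
        16384: (2, 4096, 4096),  # second reference into blob 2
    }
    for blob_id, ref_map in rebuild_ref_maps(lextents).items():
        print(blob_id, ref_map.extents())
```

With something along these lines on the decode side, the encoder for the onode case could skip the ref_map entirely, while the bnode (shared blob) case would still have to encode it, since a single onode's lextent map does not see every reference.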