On Thu, Feb 15, 2018 at 01:33:25PM +0300, Оля Тележная wrote: > 2018-02-15 8:53 GMT+03:00 Jeff King <peff@xxxxxxxx>: > > On Mon, Feb 12, 2018 at 08:08:54AM +0000, Olga Telezhnaya wrote: > > > >> Remove connection between expand_data variable > >> in cat-file and in ref-filter. > >> It will help further to get rid of using expand_data in cat-file. > > > > I have to admit I'm confused at this point about what is_cat_file is > > for, or even why we need cat_file_data. Shouldn't these items be handled > > by their matching ref-filter atoms at this point? > > We discussed that earlier outside of mailing list, and I even tried to > implement that idea and spent a couple of days to prove that it's not > possible. > The problem is that the list of atoms is made dynamically, and we > can't store pointers to any values in each atom. That's why we need > separate cat_file_info variable that is outside of main atom list. > We also need is_cat_file because we still have some part of logic that > is different for cat-file and for all other commands, and sometimes we > need to know that information. OK, right, I forgot about the pointers thing. This is definitely the sort of design decision that should go into the commit message. I went over our off-list discussion again to refresh my memory. Let me try to summarize a bit (for myself, or for other readerss). Naively, one would hope that you could do something like: 1. While parsing the %(objectsize) atom, add a pointer like: the_object_info->sizep = &atom.size; 2. Later when fulfilling the request for a specific object, if the_object_info is not empty, then call sha1_object_info() with it to hold the results. 3. Read the value out of atom.size when assembling the formatted string. But that has two problems: a. During parsing, we're resizing the atom array, so the &atom.size pointer is subject to change. This is the issue you mentioned above. I think we could solve it by taking a pass over the atoms after parsing and filling in the pointers then. But... b. Another problem is that the same atom may appear multiple times. So parsing "%(objectsize) %(objectsize)", we can't point the_object_info at _both_ of them. We need our call to sha1_object_info() to store the result in a single location, and then as we format each atom they can both use that (which works even if they might format it differently, say if you had "%(deltabase) %(deltabase:abbrev)". So we do need some kind of global storage to hold these results. This is sort of the same problem we have with parsing the objects at all. There we get by without a global because all of the logic is crammed into populate_value(). In theory we could do the same thing here, but there are just a lot of different items we might be asking for. So instead of "need_tagged", you'd have a multitude of need_objectsize_disk, need_deltabase, etc, which gets unwieldy. We may as well fill in a "struct object_info". So OK, I agree that something like "struct expand_data" is needed. But what seems non-ideal to me is the way that ref-filter depends on it in this cat-file-specific way. If it needs those values it should be calling sha1_object_info(). And if it doesn't, then it can skip the call. It shouldn't need to know about cat-file at all. And likewise cat-file should not know about expand_data. And I think that's true by the end of the series, though the steps in the middle left me a little confused. -Peff