karthik nayak <karthik.188@xxxxxxxxx> writes: > "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > >> From: Derrick Stolee <stolee@xxxxxxxxx> >> >> In anticipation of a few planned applications, introduce the most basic form >> of a path-walk API. It currently assumes that there are no UNINTERESTING >> objects, and does not include any complicated filters. It calls a function >> pointer on groups of tree and blob objects as grouped by path. This only >> includes objects the first time they are discovered, so an object that >> appears at multiple paths will not be included in two batches. >> >> These batches are collected in 'struct type_and_oid_list' objects, which >> store an object type and an oid_array of objects. >> >> The data structures are documented in 'struct path_walk_context', but in >> summary the most important are: >> >> * 'paths_to_lists' is a strmap that connects a path to a >> type_and_oid_list for that path. To avoid conflicts in path names, >> we make sure that tree paths end in "/" (except the root path with >> is an empty string) and blob paths do not end in "/". >> >> * 'path_stack' is a string list that is added to in an append-only >> way. This stores the stack of our depth-first search on the heap >> instead of using recursion. >> >> * 'path_stack_pushed' is a strmap that stores path names that were >> already added to 'path_stack', to avoid repeating paths in the >> stack. Mostly, this saves us from quadratic lookups from doing >> unsorted checks into the string_list. >> >> The coupling of 'path_stack' and 'path_stack_pushed' is protected by the >> push_to_stack() method. Call this instead of inserting into these >> structures directly. >> >> The walk_objects_by_path() method initializes these structures and >> starts walking commits from the given rev_info struct. The commits are >> used to find the list of root trees which populate the start of our >> depth-first search. > > Isn't this more of breadth-first search? Reading through the code, the > algorithm seems something like: > > - For each commit in list of commits (from rev_info) > - Tackle each root tree, add root path to the stack. > - For each path in stack left > - Call the callback provided by client. > - Find all its first level children, add each to the stack. > > So wouldn't this go through the tree in level by level basis? Making it > a BFS? My bad here, thinking more about it, it is DFS indeed. Although we add all the children of a level to the stack, we pop each of them from the stack and end up traversing down that level. > > Apart from this, the patch itself looks solid. I ended up writing a > small client to play with this API, and was very pleased how quickly I > could get it running. > > [snip]
Attachment:
signature.asc
Description: PGP signature