On Fri, Feb 05, 2021 at 02:30:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> Add the capability to read the table of contents, then pair the chunks
> with necessary logic using read_chunk_fn pointers. Callers will be added
> in future changes, but the typical outline will be:
>
>  1. initialize a 'struct chunkfile' with init_chunkfile(NULL).
>  2. call read_table_of_contents().

A reader should call read_table_of_contents(), noted.

>  3. for each chunk to parse,
>     a. call pair_chunk() to assign a pointer with the chunk position, or
>     b. call read_chunk() to run a callback on the chunk start and size.
>  4. call free_chunkfile() to clear the 'struct chunkfile' data.

How could a user of this API learn about all chunks present in the
chunkfile, including unrecognized chunks?

> We are re-using the anonymous 'struct chunkfile' data, as it is internal
> to the chunk-format API. This gives it essentially two modes: write and
> read. If the same struct instance was used for both reads and writes,
> then there would be failures.
>
> Helped-by: Junio C Hamano <gitster@xxxxxxxxx>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>

> diff --git a/chunk-format.h b/chunk-format.h
> index 9a1d770accec..0edcc57db4e7 100644
> --- a/chunk-format.h
> +++ b/chunk-format.h
> @@ -6,6 +6,19 @@
>  struct hashfile;
>  struct chunkfile;
>
> +/*
> + * Initialize a 'struct chunkfile' for writing _or_ reading a file
> + * with the chunk format.
> + *
> + * If writing a file, supply a non-NULL 'struct hashfile *' that will
> + * be used to write.
> + *
> + * If reading a file, then supply the memory-mapped data to the
> + * pair_chunk() or read_chunk() methods, as appropriate.

And call read_table_of_contents() in between.

> + *
> + * DO NOT MIX THESE MODES. Use different 'struct chunkfile' instances
> + * for reading and writing.
> + */
>  struct chunkfile *init_chunkfile(struct hashfile *f);
>  void free_chunkfile(struct chunkfile *cf);
>  int get_num_chunks(struct chunkfile *cf);
> @@ -16,4 +29,37 @@ void add_chunk(struct chunkfile *cf,
>                 chunk_write_fn fn);
>  int write_chunkfile(struct chunkfile *cf, void *data);
>
> +int read_table_of_contents(struct chunkfile *cf,
> +                           const unsigned char *mfile,
> +                           size_t mfile_size,
> +                           uint64_t toc_offset,
> +                           int toc_length);
> +
> +#define CHUNK_NOT_FOUND (-2)
> +
> +/*
> + * Find 'chunk_id' in the given chunkfile and assign the
> + * given pointer to the position in the mmap'd file where
> + * that chunk begins.
> + *
> + * Returns CHUNK_NOT_FOUND if the chunk does not exist.
> + */
> +int pair_chunk(struct chunkfile *cf,
> +               uint32_t chunk_id,
> +               const unsigned char **p);
> +
> +typedef int (*chunk_read_fn)(const unsigned char *chunk_start,
> +                             size_t chunk_size, void *data);
> +/*
> + * Find 'chunk_id' in the given chunkfile and call the
> + * given chunk_read_fn method with the information for
> + * that chunk.
> + *
> + * Returns CHUNK_NOT_FOUND if the chunk does not exist.
> + */
> +int read_chunk(struct chunkfile *cf,
> +               uint32_t chunk_id,
> +               chunk_read_fn fn,
> +               void *data);
> +
>  #endif
> --
> gitgitgadget
>
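
To make the question about unrecognized chunks concrete, here is a rough,
self-contained sketch of a walker that visits every table-of-contents entry
and hands each chunk to a callback, whether or not the caller knows its ID.
It is not part of this patch: for_each_chunk_sketch(), chunk_visit_fn,
'struct chunk_log' and record_chunk() are hypothetical names, and the entry
layout (4-byte ID plus 8-byte offset, big-endian, followed by a terminating
entry whose offset marks the end of the last chunk) is an assumption taken
from the existing commit-graph and multi-pack-index formats.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed TOC entry layout: 4-byte chunk ID followed by 8-byte offset. */
#define TOC_ENTRY_SIZE 12

/* Hypothetical callback type; unlike chunk_read_fn it also gets the ID. */
typedef int (*chunk_visit_fn)(uint32_t chunk_id,
			      const unsigned char *chunk_start,
			      size_t chunk_size, void *data);

static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t load_be64(const unsigned char *p)
{
	return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}

/*
 * Hypothetical helper: call 'fn' once per TOC entry, in file order.
 * Returns 0 on success, -1 if the TOC does not fit in the file or an
 * entry is out of bounds, or the first non-zero value returned by 'fn'.
 */
int for_each_chunk_sketch(const unsigned char *mfile, size_t mfile_size,
			  uint64_t toc_offset, int toc_length,
			  chunk_visit_fn fn, void *data)
{
	int i;

	/* toc_length entries plus the terminating entry must fit. */
	if (toc_length < 0 || toc_offset > mfile_size ||
	    (uint64_t)toc_length + 1 > (mfile_size - toc_offset) / TOC_ENTRY_SIZE)
		return -1;

	for (i = 0; i < toc_length; i++) {
		const unsigned char *entry =
			mfile + toc_offset + (size_t)i * TOC_ENTRY_SIZE;
		uint32_t id = load_be32(entry);
		uint64_t start = load_be64(entry + 4);
		/* The next entry's offset is where this chunk ends. */
		uint64_t end = load_be64(entry + TOC_ENTRY_SIZE + 4);
		int ret;

		if (start > end || end > mfile_size)
			return -1;
		ret = fn(id, mfile + start, (size_t)(end - start), data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback (also hypothetical): remember each ID and size seen. */
struct chunk_log {
	int nr;
	uint32_t ids[8];
	size_t sizes[8];
};

int record_chunk(uint32_t chunk_id, const unsigned char *chunk_start,
		 size_t chunk_size, void *data)
{
	struct chunk_log *log = data;

	(void)chunk_start; /* contents are not inspected here */
	if (log->nr >= 8)
		return -1;
	log->ids[log->nr] = chunk_id;
	log->sizes[log->nr] = chunk_size;
	log->nr++;
	return 0;
}
```

A caller could pass record_chunk() with a 'struct chunk_log' to list every
ID in the file, then decide which of them to hand to pair_chunk() or
read_chunk().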