[PATCH] Strbuf documentation: document most functions

All functions in strbuf.h are documented, except launch_editor().

Signed-off-by: Miklos Vajna <vmiklos@xxxxxxxxxxxxxx>
---

Here is an improved version. Details below.

On Tue, Jun 03, 2008 at 01:00:33AM +0100, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
> In general, I'd rather leave the "->" from the members, since you have
> many instances where you access them with ".".

Ok, removed, except in the code snippet where the strbuf is obviously a
pointer.

> > +strbuf's can be use in many ways: as a byte array, or to store
> > arbitrary
> > +long, overflow safe strings.
>
> I think that you should not suggest using strbufs as byte array, even
> if
> that is certainly possible.  Rather, you should say something like:
>
>       An strbuf is NUL terminated for convenience, but no function in
>       the strbuf API actually relies on the string being free of NULs.

Changed.

> > +. The `->buf` member is always malloc-ed, hence strbuf's can be
> > used to
> > +  build complex strings/buffers whose final size isn't easily
> > known.
>
> Is this true?  I thought the initial string is empty, but not
> alloc'ed.
>
> So I'd rather have something like
>
>       The "buf" member is never NULL, so you can safely strcmp() it.

Right, corrected.

> I'd like to see a comment that strbuf's _have_ to be initialized
> either by
> strbuf_init() or by "= STRBUF_INIT" before the invariants, though.
>
> > +`strbuf_attach`::
> > +
> > +   Attaches a string to a buffer. You should specify the string to
> > attach,
> > +   the current length of the string and the amount of allocated
> > memory.
>
> ... This string _must_ be malloc()ed, and after attaching, the pointer
> cannot be relied upon anymore, and neither be free()d directly.

Added. Also fixed the rest of the typos you pointed out.

On Tue, Jun 03, 2008 at 10:42:40AM +0200, Pierre Habouzit <madcoder@xxxxxxxxxx> wrote:
>   (1) it is totally safe to touch anything in the buffer pointed by
>   the
>       buf member between the index 0 and buf->len excluded.
>
>   (1b) what you write later: it's also possible to write after
>   buf->len
>        if not after strbuf_avail() _BUT_ then you have when you're
>        done
>        the task to reset the (2) invariant yourself, using
>        strbuf_setlen().
>
>   (2) ->buf[->len] == '\0' holds _ALL TIME_.
>
>   (3) ->buf is never ever NULL so it can be used in any usual C string
>       ops safely.
>
>   (4) do NOT assume anything on what ->buf really is (allocated memory
>       or not e.g.), use strbuf_detach to unwrap a memory buffer from
>       its
>       strbuf shell in a safe way. That is the sole supported way. This
>       will give you a malloced buffer that you can later free().

Thanks, I improved the introduction to include these; hopefully I haven't
missed anything.

> > +* Life cycle
> > +
> > +`strbuf_init`::
> > +
> > +   Initializes the structure. The second parameter can be zero or a
> > bigger
> > +   number to allocate memory, in case you want to prevent further
> > reallocs.
>
>   I'd add that it is _MANDATORY_ to initialize strbufs, and that a
> static allocation (for global variables e.g.) can be done using
> the STRBUF_INIT static initializer.

Added.

>
> > +`strbuf_release`::
> > +
> > +   Releases a string buffer and the memory it used. You should not
> > use the
> > +   string buffer after using this function, unless you initialize
> > it again.
>
>   Actually this is wrong because strbuf_release performs a new init
> since init allocates 0 memory and that it's idiot-proof. But it could
> be
> changed in the future and it should not be relied upon.

That's why I think 'should' is the right term and not 'must'. I don't
think the reader needs to be told about the current implementation.

> > +`strbuf_detach`::
> > +
> > +   Detaches the string from the string buffer. The function returns
> > a
> > +   pointer to the old string and empties the buffer.
>
>   Not really strbuf_detach unwraps the embedded buffer for sure, but
>   it
> doesn't "empties" the buffer, strbuf_detach is like strbuf_release:
> after a release, strbuf should be init-ed again (even if for now
> strbuf_release does so).

Right, thanks for pointing this out.

> > +`strbuf_attach`::
> > +
> > +   Attaches a string to a buffer. You should specify the string to
> > attach,
> > +   the current length of the string and the amount of allocated
> > memory.
>
>   In addition to what Johannes said: size must be > len. Because the
> string you pass is supposed to be a NUL-terminated string.

Added.

> > +`strbuf_grow`::
> > +
> > +   Allocated extra memory for the buffer.
>
>   I'd put that this way: ensure that at least this amount of available
> memory is available. This is used when you know a typical size for
> what
> you will do and want to avoid repetitive automatic resize of the
> underlying buffer. This is never a needed operation, but can be
> critical
> for performance in some cases.

Changed.

> > +`strbuf_setlen`::
> > +
> > +   Sets the length of the buffer to a given value.
>
>   This function does NOT allocate new memory, so you should not
>   perform
> a strbuf_setlen to a length that is larger than sb->len +
> strbuf_avail(sb).
> strbuf_setlen is just meant as a "please fix invariants from this
> strbuf
> I just messed with)"

Added.
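
To illustrate the grow / write / setlen pattern, here is a minimal sketch.
It assumes a stand-in `struct sketch_buf` and hypothetical `sketch_*()`
helpers written for this example only; none of this is the code in
strbuf.c, and unlike a real strbuf the stand-in starts with a NULL `buf`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for illustration only, not git's struct strbuf. */
struct sketch_buf {
	size_t alloc, len;
	char *buf;
};

/* Bytes that are allocated but unused (one byte is kept for the NUL). */
static size_t sketch_avail(const struct sketch_buf *sb)
{
	return sb->alloc ? sb->alloc - sb->len - 1 : 0;
}

/* Make sure at least `extra` unused bytes follow the current contents. */
static void sketch_grow(struct sketch_buf *sb, size_t extra)
{
	if (sb->len + extra + 1 > sb->alloc) {
		size_t alloc = sb->len + extra + 1;
		char *p = realloc(sb->buf, alloc);

		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = alloc;
		sb->buf[sb->len] = '\0';
	}
}

/* Does not allocate: only records the new length and restores the NUL. */
static void sketch_setlen(struct sketch_buf *sb, size_t len)
{
	assert(len <= sb->len + sketch_avail(sb));
	sb->len = len;
	sb->buf[len] = '\0';
}

/* Append n copies of c by writing into the spare room directly. */
static void sketch_fill(struct sketch_buf *sb, char c, size_t n)
{
	sketch_grow(sb, n);
	memset(sb->buf + sb->len, c, n);
	sketch_setlen(sb, sb->len + n);
}
```

The point of the sketch is the order of operations in `sketch_fill()`:
reserve room first, write into it, then fix up the length and the NUL.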

> > +`strbuf_add`::
> > +
> > +   Add data of given length to the buffer.
> > +
> > +`strbuf_addstr`::
> > +
> > +   Add a NULL-terminated string to the buffer.
>
>   Please use NUL, '\0' is NUL (as in its ascii name), NULL is (void
>   *)0.
> In addition to that, I'd say that strbuf_addstr will ALWAYS be
> implemented as an inline or a macro that expands to:
>   strbuf_add(..., s, strlen(s))
>
> Meaning that this is efficient to write things like:
>   strbuf_addstr(sb, "immediate string").

Changed and added.
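
As a rough sketch of why the inline wrapper is cheap, assuming a stand-in
`struct sketch_buf` and hypothetical `sketch_add()` / `sketch_addstr()`
helpers (these are made up for illustration and are not git's
implementation):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for illustration only, not git's struct strbuf. */
struct sketch_buf {
	size_t alloc, len;
	char *buf;
};

/* Start out as an empty, NUL-terminated string. */
static void sketch_init(struct sketch_buf *sb)
{
	sb->alloc = 1;
	sb->len = 0;
	sb->buf = malloc(1);
	if (!sb->buf)
		abort();
	sb->buf[0] = '\0';
}

/* Append n bytes (NULs allowed) and keep buf[len] == '\0'. */
static void sketch_add(struct sketch_buf *sb, const void *data, size_t n)
{
	assert(sb->buf[sb->len] == '\0');
	if (sb->len + n + 1 > sb->alloc) {
		size_t alloc = (sb->len + n + 1) * 2;
		char *p = realloc(sb->buf, alloc);

		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = alloc;
	}
	memcpy(sb->buf + sb->len, data, n);
	sb->len += n;
	sb->buf[sb->len] = '\0';
}

/* Thin inline wrapper: the only extra work is the strlen() call. */
static inline void sketch_addstr(struct sketch_buf *sb, const char *s)
{
	sketch_add(sb, s, strlen(s));
}
```

With a string literal as the argument, a compiler is free to evaluate the
`strlen()` at compile time once the wrapper is inlined.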

> > +`strbuf_expand`::
>
>   This function is a pretty printer that expands magic formats string
> thanks to callbacks, so that it's done in a generic way. It's what is
> used to generate git-log e.g. I'm not its author, so I'm not really
> best
> placed to describe it.

OK. ;-)

> > +`strbuf_fread`::
> > +
> > +   Read a given size of data from a FILE* pointer to the buffer.
> > +
> > +`strbuf_read`::
> > +
> > +   Read the contents of a given file descriptor. The third argument
> > can be
> > +   used to give a hint about the file, to avoid reallocs.
> > +
> > +`strbuf_read_file`::
> > +
> > +   Read the contents of a file, specified by its path. The third
> > argument
> > +   can be used to give a hint about the file, to avoid reallocs.
> > +
> > +`strbuf_getline`::
> > +
> > +   Read a line from a FILE* pointer. The second argument specifies
> > the line
> > +   terminator character, like `'\n'`.
> > +
>
>   For all: the buffer is rewinded if the read fails.
>   If -1 is returned, errno must be consulted, like you would do for
> read(3).

Added.

> >     An strbuf is NUL terminated for convenience, but no function in
> >     the strbuf API actually relies on the string being free of NULs.
>
>   ACK. I'd add the fact that strbuf are meant to be used with all the
> usual C string and memory APIs. Given that the length of the buffer is
> known, it's often better to use the mem* functions than a str* one
> (memchr vs. strchr e.g.). Though, one has to be careful about the fact
> that str* functions often stop on NULs and that strbufs may have
> embedded NULs.

Added.
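
A small sketch of the mem* vs. str* point, using only the standard C
library. `count_byte()` is a hypothetical helper made up for this example;
it takes a plain pointer and length, which is what a strbuf's `buf` and
`len` members would provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count occurrences of byte c in the first len bytes of buf.
 * memchr() is bounded by the length, so embedded NULs do not stop
 * the scan the way they would with strchr().
 */
static size_t count_byte(const char *buf, size_t len, int c)
{
	const char *p = buf;
	const char *end = buf + len;
	size_t n = 0;

	assert(buf != NULL);
	while (p < end && (p = memchr(p, c, (size_t)(end - p))) != NULL) {
		n++;
		p++;
	}
	return n;
}
```

For a 7-byte buffer holding `a:b`, a NUL, then `c:d`, this counts both
colons, whereas `strlen()` and `strchr()` only ever see the first three
bytes.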

On Tue, Jun 03, 2008 at 05:41:38PM +0200, René Scharfe <rene.scharfe@xxxxxxxxxxxxxx> wrote:
> > Actually this is a bit of request for help, I haven't figured out
> > what
> > strbuf_expand() does [...]
>
> It can be used to expand a format string containing placeholders.  To
> that end, it parses the string and calls the specified function for
> every percent sign found.
>
> The callback function is given a pointer to the character after the
> '%'
> and a pointer to the struct strbuf.  It is expected to add the
> expanded
> version of the placeholder to the strbuf, e.g. to add a newline
> character if the letter 'n' appears after a '%'.  The function returns
> the length of the placeholder recognized and strbuf_expand skips over
> it.
>
> All other characters (non-percent and not skipped ones) are copied
> verbatim to the strbuf.  If the callback returned zero, meaning that
> the
> placeholder is unknown, then the percent sign is copied, too.
>
> In order to facilitate caching and to make it possible to give
> parameters to the callback, strbuf_expand passes a context pointer,
> which can be used by the programmer of the callback as she sees fit.

Thanks, added.
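
To check my understanding of that description, here is a minimal sketch
of the mechanism. Everything in it (`struct sketch_buf`, `sketch_add()`,
`sketch_expand()`, `sketch_demo_cb()`) is a stand-in made up for
illustration and is not the code in strbuf.c; the `%n` and `%u`
placeholders are likewise only examples.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for illustration only, not git's struct strbuf. */
struct sketch_buf {
	size_t alloc, len;
	char *buf;
};

/* Append n bytes and keep the result NUL-terminated. */
static void sketch_add(struct sketch_buf *sb, const void *data, size_t n)
{
	if (sb->len + n + 1 > sb->alloc) {
		size_t alloc = (sb->len + n + 1) * 2;
		char *p = realloc(sb->buf, alloc);

		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = alloc;
	}
	memcpy(sb->buf + sb->len, data, n);
	sb->len += n;
	sb->buf[sb->len] = '\0';
}

/* Returns how many placeholder characters were consumed, 0 if unknown. */
typedef size_t (*sketch_expand_fn)(struct sketch_buf *sb,
				   const char *placeholder, void *context);

static void sketch_expand(struct sketch_buf *sb, const char *format,
			  sketch_expand_fn fn, void *context)
{
	assert(sb && format && fn);
	for (;;) {
		const char *percent = strchr(format, '%');
		size_t consumed;

		if (!percent) {
			/* No more placeholders: copy the rest verbatim. */
			sketch_add(sb, format, strlen(format));
			return;
		}
		sketch_add(sb, format, (size_t)(percent - format));
		format = percent + 1;
		consumed = fn(sb, format, context);
		if (consumed)
			format += consumed;	/* skip the placeholder */
		else
			sketch_add(sb, "%", 1);	/* unknown: keep the '%' */
	}
}

/* Example callback: %n is a newline, %u is the string in context. */
static size_t sketch_demo_cb(struct sketch_buf *sb,
			     const char *placeholder, void *context)
{
	const char *user = context;

	switch (placeholder[0]) {
	case 'n':
		sketch_add(sb, "\n", 1);
		return 1;
	case 'u':
		sketch_add(sb, user, strlen(user));
		return 1;
	default:
		return 0;
	}
}
```

The context pointer is how the caller hands per-call data (here, a user
name) to the callback without resorting to globals.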

 Documentation/technical/api-strbuf.txt |  234 +++++++++++++++++++++++++++++++-
 1 files changed, 232 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
index a52e4f3..15f66b2 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -1,6 +1,236 @@
 strbuf API
 ==========
 
-Talk about <strbuf.h>
+An strbuf is NUL terminated for convenience, but no function in the
+strbuf API actually relies on the string being free of NULs.
 
-(Pierre, JC)
+strbuf's are meant to be used with all the usual C string and memory
+APIs. Given that the length of the buffer is known, it's often better to
+use the mem* functions than a str* one (memchr vs. strchr e.g.).
+Though, one has to be careful about the fact that str* functions often
+stop on NULs and that strbufs may have embedded NULs.
+
+strbufs have some invariants that are very important to keep in mind:
+
+. The `buf` member is never NULL, so it can be used in any usual C
+string operations safely. strbufs _have_ to be initialized either by
+`strbuf_init()` or by `= STRBUF_INIT` before the invariants hold, though.
++
+Do *not* assume anything about what `buf` really is (e.g. whether it is
+allocated memory or not); use `strbuf_detach()` to unwrap a memory
+buffer from its strbuf shell in a safe way. That is the sole supported
+way. This will give you a malloced buffer that you can later `free()`.
++
+However, it is totally safe to touch anything in the buffer pointed to by
+the `buf` member between index `0` and index `len` (excluded).
+
+. The `buf` member is a byte array that has at least `len + 1` bytes
+  allocated. The extra byte is used to store a `'\0'`, allowing the
+  `buf` member to be a valid C-string. Every strbuf function ensures this
+  invariant is preserved.
++
+NOTE: It is OK to "play" with the buffer directly if you do it this
+      way:
++
+----
+strbuf_grow(sb, SOME_SIZE); <1>
+strbuf_setlen(sb, sb->len + SOME_OTHER_SIZE);
+----
+<1> Here, the memory array starting at `buf`, and of length
+`strbuf_avail(sb)` is all yours, and you can be sure that
+`strbuf_avail(sb)` is at least `SOME_SIZE`.
++
+Of course, `SOME_OTHER_SIZE` must be smaller than or equal to `strbuf_avail(sb)`.
++
+Doing so is safe, though if it has to be done in many places, adding the
+missing API to the strbuf module is the way to go.
++
+WARNING: Do _not_ assume that the area that is yours is of size `alloc
+- 1` even if it's true in the current implementation. `alloc` is in effect a
+"private" member that should not be messed with.
+
+Data structures
+---------------
+
+* `struct strbuf`
+
+This is the string buffer structure. The `len` member can be used to
+determine the current length of the string, and `buf` provides access
+to the string itself.
+
+Functions
+---------
+
+* Life cycle
+
+`strbuf_init`::
+
+	Initializes the structure. The second parameter is a size hint: pass zero,
+	or a larger number to preallocate memory if you want to avoid later reallocs.
+
+`strbuf_release`::
+
+	Releases a string buffer and the memory it used. You should not use the
+	string buffer after using this function, unless you initialize it again.
+
+`strbuf_detach`::
+
+	Detaches the string from the string buffer. The function returns a
+	pointer to the old string and releases the buffer, so that if you want to
+	use it again, you should initialize it before doing so.
+
+`strbuf_attach`::
+
+	Attaches a string to a buffer. You should specify the string to attach,
+	the current length of the string and the amount of allocated memory.
+	The amount must be larger than the string length, because the string you
+	pass is supposed to be a NUL-terminated string.  This string _must_ be
+	malloc()ed, and after attaching, the pointer can no longer be relied
+	upon, nor be free()d directly.
+
+`strbuf_swap`::
+
+	Swaps the contents of two string buffers.
+
+* Related to the size of the buffer
+
+`strbuf_avail`::
+
+	Determines the amount of allocated but not used memory.
+
+`strbuf_grow`::
+
+	Ensure that at least this amount of unused memory is available. This
+	is used when you know a typical size for what you will do and want to
+	avoid repeated automatic resizing of the underlying buffer. This
+	operation is never required, but can be critical for performance in some
+	cases.
+
+`strbuf_setlen`::
+
+	Sets the length of the buffer to a given value. This function does *not*
+	allocate new memory, so you should not perform a `strbuf_setlen()` to a
+	length that is larger than `len + strbuf_avail()`. `strbuf_setlen()` is
+	just meant as a 'please fix invariants from this strbuf I just messed
+	with'.
+
+`strbuf_reset`::
+
+	Empties the buffer by setting the size of it to zero.
+
+* Related to the contents of the buffer
+
+`strbuf_rtrim`::
+
+	Strip whitespace from the end of a string.
+
+`strbuf_cmp`::
+
+	Compares two buffers. Returns an integer less than, equal to, or greater
+	than zero if the first buffer is found, respectively, to be less than,
+	to match, or be greater than the second buffer.
+
+* Adding data to the buffer
+
+`strbuf_addch`::
+
+	Adds a single character to the buffer.
+
+`strbuf_insert`::
+
+	Insert data at the given position of the buffer. The remaining contents
+	will be shifted, not overwritten.
+
+`strbuf_remove`::
+
+	Remove a given amount of data from a given position of the buffer.
+
+`strbuf_splice`::
+
+	Replace the range pos..pos+len with the given data.
+
+`strbuf_add`::
+
+	Add data of given length to the buffer.
+
+`strbuf_addstr`::
+
+Add a NUL-terminated string to the buffer.
++
+NOTE: This function will *always* be implemented as an inline or a macro
+that expands to:
++
+----
+strbuf_add(..., s, strlen(s));
+----
++
+This means that it is efficient to write things like:
++
+----
+strbuf_addstr(sb, "immediate string");
+----
+
+`strbuf_addbuf`::
+
+	Add another buffer to the current one.
+
+`strbuf_adddup`::
+
+	Copy part of the buffer, starting at a given position and of a given
+	length, to the end of the buffer.
+
+`strbuf_expand`::
+
+	This function can be used to expand a format string containing
+	placeholders. To that end, it parses the string and calls the specified
+	function for every percent sign found.
++
+The callback function is given a pointer to the character after the `%`
+and a pointer to the struct strbuf.  It is expected to add the expanded
+version of the placeholder to the strbuf, e.g. to add a newline
+character if the letter `n` appears after a `%`.  The function returns
+the length of the placeholder recognized and `strbuf_expand()` skips
+over it.
++
+All other characters (non-percent and not skipped ones) are copied
+verbatim to the strbuf.  If the callback returned zero, meaning that the
+placeholder is unknown, then the percent sign is copied, too.
++
+In order to facilitate caching and to make it possible to give
+parameters to the callback, `strbuf_expand()` passes a context pointer,
+which can be used by the programmer of the callback as she sees fit.
+
+`strbuf_addf`::
+
+	Add a formatted string to the buffer.
+
+`strbuf_fread`::
+
+	Read a given amount of data from a FILE* pointer into the buffer.
++
+NOTE: The buffer is rewound if the read fails. If -1 is returned,
+`errno` must be consulted, like you would do for `read(3)`.
+`strbuf_read()`, `strbuf_read_file()` and `strbuf_getline()` have the
+same behaviour as well.
+
+`strbuf_read`::
+
+	Read the contents of a given file descriptor. The third argument can be
+	used to give a hint about the file size, to avoid reallocs.
+
+`strbuf_read_file`::
+
+	Read the contents of a file, specified by its path. The third argument
+	can be used to give a hint about the file size, to avoid reallocs.
+
+`strbuf_getline`::
+
+	Read a line from a FILE* pointer. The second argument specifies the line
+	terminator character, typically `'\n'`.
+
+`stripspace`::
+
+	Strips whitespace from a buffer. The second parameter controls whether
+	comments are considered contents to be removed as well.
+
+`launch_editor`::
-- 
1.5.6.rc0.dirty
