On 26.01.2018 04:22, Tanu Kaskinen wrote: > On Thu, 2018-01-25 at 09:08 +0100, Georg Chini wrote: >> On 24.01.2018 01:40, Tanu Kaskinen wrote: >>> On Mon, 2018-01-22 at 16:19 +0100, Georg Chini wrote: >>>> On 21.01.2018 01:03, Tanu Kaskinen wrote: >>>>> It looks like we're anyway going to >>>>> need a bunch of "helper" functions, so there's not that much difference >>>>> between our approaches, mainly list parsing details are affected >>>>> somewhat. How to organize the helper functions is an open question, >>>>> since you have so far opposed my pa_message idea, and haven't provided >>>>> a proper alternative for keeping parameter writing state. >>>> >>>> I was thinking of hiding the use of strbuf behind some wrappers, >>>> so that that you call for example pa_message_params_new() >>>> instead of pa_strbuf_new() to avoid having a mixed name space. >>>> Not sure if this is a good idea, because some of the functions >>>> (like _new() or _write_string()) would really not be more than >>>> simple wrappers. > Did you plan to return pa_strbuf from pa_message_params_new(), or did > you plan adding a new pa_message_params wrapper object? I would prefer > a pa_message_params object (it would be quite close to my original > pa_message proposal). > > If clients are going to interact with pa_strbuf directly, then I don't > think duplicating already-existing functions from the pa_strbuf API > makes sense. I can use a wrapper object as well. > >> Do you think we should have a new file (pulse/message-helper.c/.h) >> for the helper functions or should they go into pulse/util.c? > They should go into pulse/message-params.[ch]. OK. > >> Which functions do we need for a start? >> >> Reading: >> >> pa_message_params_split_list() - splits the list, see below >> pa_message_params_read_string() - reads a string >> pa_message_params_read_double() - reads a float value >> pa_message_params_read_int() - reads a long integer >> pa_message_params_read_unsigned() - reads an unsigned long >> pa_message_param_read_bool() - reads a boolean >> >> Anything else? > That's already more than what we strictly need in the beginning, since > we only use string values, but I'm not opposed at all to having these. > > We shouldn't specify that the integer functions deal with "long > integers", because the server and the client may have different ideas > about what that means. I'd have at least int64 and uint64 types, > possibly also int32 and uint32. Yes, I wanted to use (u)int64 for the integer functions. > >> Writing: >> >> pa_message_params_new() - returns a new strbuf >> pa_message_params_to_string() - converts the strbuf to a string >> pa_message_params_write_string() - writes a string >> pa_message_params_write_escaped_string() - writes an escaped string >> pa_message_params_write_double() - writes a float value >> pa_message_params_write_int() - writes a long integer >> pa_message_params_write_unsigned() - writes an unsigned long >> pa_message_param_write_bool() - writes a boolean >> >> Anything else? Should _write_escaped_string() and _write_string() >> be two different functions or should we pass a boolean to >> _write_string() to indicate if we want escaping or not? > If write_string() adds the curly braces around the value, then it > doesn't make sense to have an unescaped version. It still makes sense. You might have prepared a sub-list in advance and want to add the whole sub-list as one string to the buffer. Then you would not want escaping. Also there are many cases where you know in advance that the string is "well behaved" and escaping is not necessary. > > If you don't want to have write_list_start() and write_list_end(), then > we need a function for appending arbitrary stuff to the parameters (if > clients have direct access to the pa_strbuf object, then that function > already exists). Meanwhile I tend to add the write_list_start/end() functions, but we still need a function to write raw strings for the case that we cannot write the whole element at once. > >> Concerning the problem of writing/reading floats: Do you think >> it is OK if we search for "." or "," in the string and replace the >> character with the decimal point character of the current locale >> before conversion? This would work as long as the float is written >> without thousand separator and we can ensure that in the >> _write_float() function. > What do you mean by "the string"? write_double() should not take a > string as an argument. I was talking about the reading function. The write function can just use the current locale. When reading, we have to check anyway if the decimal point character matches the server locale, so it does not matter which one the client writes. > >> Any further ideas/suggestions/restrictions that need to be taken >> into account? >> >>>>> How should reading and writing nulls work? Writing nulls is not >>>>> complicated, we could just have pa_message_write_null() (or since we >>>>> probably will have a "raw string append" function, it's trivial to just >>>>> append "{}" manually). Reading nulls is less obvious. >>>>> >>>>> All reading functions could have an "is_null" parameter: >>>>> >>>>> double double_var; >>>>> int is_null; >>>>> ret = pa_message_read_double(msg, &double_var, &is_null); >>>>> >>>>> If the is_null parameter isn't provided (i.e. is itself NULL), the >>>>> function could fail if it encounters null. >>>> If we define that NULL is represented by the "special element" >>>> {} it is very easy to determine if an element is empty, because >>>> in this case, the length returned by the parsing function will be 0. >>>> So I don't think we need to include an allow_null parameter in >>>> the helper function (unless we treat a sequence of white spaces >>>> like an empty element), because in cases where it is important, >>>> we can check the length of the element before calling the helper. >>>> Currently, parsing a value (including full error check) would look >>>> something like >>>> >>>> ret = pa_split_message_parameter_string(list, max_length, &start_pos, >>>> &length, NULL, &state); >>>> if (ret == 0) >>>> handle empty element; >>>> if (ret == -1) >>>> handle parse error; >>>> >>>> /* start_pos contains now the first valid character of the element >>>> * after { and length the length of the element without closing >>>> * bracket */ >>>> >>>> ret = pa_message_read_double(start_pos, length, &double_var) >>>> >>>> >>>> Strings would not need a (read) helper function because they can >>>> be directly returned by the parsing function. >>> So you're proposing that reading a double should take two function >>> calls. I think the two steps should be merged into one >>> pa_message_params_read_double() function. Then error handling needs to >>> be done only once, rather than twice per value. With this approach, >>> however, the question arises how to deal with nulls. >>> >> You are right, it should be folded into one function. I guess the >> easiest way to handle NULL's is to have a special return value, >> let's say 1 if parsing was OK, 0 if NULL was found and -1 on >> parse error, similar to the split function. >> >> Actually I noticed, that the code above is not correct. After >> pa_split_message_parameter_string() three checks need to >> be performed: >> >> 1) ret = 0: This does NOT mean that the element is empty, >> it means that the element is completely missing (end of list). >> It can still be treated like an empty element. > Why should it ever be treated like an empty element? Doesn't that make > it impossible to distinguish between an empty element and end-of-list? The functions are retrieving a single element of a list. In that case there is no difference if the element is empty or the surrounding list ended (and the element is missing). > >> 2) ret= -1: Parse error >> 3) length = 0: This means the element is empty. >> >> I'll rename the split function to pa_message_params_split_list() >> to match the name space. Or do you have a better idea? > pa_message_params_split_list() seems good name to me. > > The argument list of pa_message_params_split_list() is not what I had > in mind when the idea of making it process the string in-place was > first brought up. I don't even know what all those parameters in your > example are used for. > > This is what I had in mind: > > int pa_message_params_split_list(char *params, char **value, void *state); > > params is the original string, value is the returned string and state > is for remembering the current position in the original string (so > internally it has type char*). state is initialized to NULL. > > Alternatively the params and state arguments can be merged: > > int pa_message_params_split_list(char **state, char **value); I prefer the first form as a reminder which string you are working on. > > In this case the calling code has to initialize and use state like > this: > > int ret; > char *state = params; > char *value; > while ((ret = pa_message_params_split_list(&state, &value)) > 0) { > ... > } > > params in the first version and state in the second version is not > declared const, because the function replaces the closing bracket of > the read value with a null byte. That will make the returned value > string null-terminated. This simple idea never occurred to me, first because the original string is a const char and second because I did not want to modify it. But your idea makes things much easier. Do you see an issue with modifying the const char like   /* Replace } with 0 */   *(char*) current = 0; where current is a const char* pointing to the "}"? This will always work because we never change the length of the string. Or should I allocate a new string with pa_xstrdup and pass that to the function? Or should I pass a char* to the callback function and the callback is responsible to free the string? Personally I would prefer modifying the const char but that is not a very strong opinion because I know it is a bit ugly.