On 01.08.2018 13:13, Tanu Kaskinen wrote: > On Mon, 2018-07-30 at 13:23 +0200, Georg Chini wrote: >> On 30.07.2018 11:18, Tanu Kaskinen wrote: >>> On Sun, 2018-07-29 at 22:36 +0200, Georg Chini wrote: >>>> On 29.07.2018 21:47, Tanu Kaskinen wrote: >>>>> On Fri, 2018-07-27 at 10:51 +0200, Georg Chini wrote: >>>>>> On 27.07.2018 10:08, Tanu Kaskinen wrote: >>>>>>> On Thu, 2018-07-26 at 18:02 +0200, Georg Chini wrote: >>>>>>>> On 26.07.2018 12:37, Tanu Kaskinen wrote: >>>>>>>>> On Sun, 2018-07-22 at 21:11 +0200, Georg Chini wrote: >>>>>>>>>> On 22.07.2018 17:48, Tanu Kaskinen wrote: >>>>>>>>>>> On Sun, 2018-07-22 at 16:02 +0200, Georg Chini wrote: >>>>>>>>>>>> On 21.07.2018 20:17, Tanu Kaskinen wrote: >>>>>>>>>>>>> On Mon, 2018-04-09 at 19:35 +0200, Georg Chini wrote: >>>>>>>>>>>>>> + >>>>>>>>>>>>>> +/* Read functions */ >>>>>>>>>>>>>> + >>>>>>>>>>>>>> /* Split the specified string into elements. An element is defined as >>>>>>>>>>>>>> * a sub-string between curly braces. The function is needed to parse >>>>>>>>>>>>>> * the parameters of messages. Each time it is called returns the position >>>>>>>>>>>>>> * of the current element in result and the state pointer is advanced to >>>>>>>>>>>>>> - * the next list element. >>>>>>>>>>>>>> + * the next list element. On return, the parameter *is_unpacked indicates >>>>>>>>>>>>>> + * if the string is plain text or contains a sub-list. is_unpacked may >>>>>>>>>>>>>> + * be NULL. >>>>>>>>>>>>> is_unpacked looks like unnecessary complexity. >>>>>>>>>>>>> pa_message_params_read_string() should always unescape the value. >>>>>> This is not about parameter type, it is just to distinguish between >>>>>> a list and a simple value. One example comes to my mind immediately: >>>>>> Consider a parameter list that consists of strings and a couple of >>>>>> arrays. Now you can read this list as an array of strings (patch 8 >>>>>> provides a function for that) and then examine those strings that >>>>>> contain arrays separately. Having the flag (and using it in read_string()) >>>>>> provides a more flexible and convenient way to parse a parameter list. >>>>>> >>>>>> The flag may not be strictly necessary at the moment, but I would still >>>>>> like to keep it. >>>>> To continue a familiar theme of my review: if there's a >>>>> read_string_array() function, I want that to be used only for string >>>>> arrays, not a mishmash of random types. There could be a separate >>>>> function split_list_into_array() that does something similar to >>>>> what you wanted to do with read_string_array(). >>>>> split_list_into_array() wouldn't do special string handling, >>>>> though, so unescaping would be left to the application. I find that >>>>> only logical, since other basic types don't get special handling >>>>> either (i.e. floats aren't converted to C floats). >>>>> >>>>> Your use case could be served with a vararg function that instead of >>>>> producing a string array would read into separate variables, like this: >>>>> >>>>> pa_message_params_read(c, state, >>>>> PA_TYPE_STRING, ¶m1, >>>>> PA_TYPE_FLOAT, ¶m2, >>>>> PA_TYPE_RAW, ¶m3, >>>>> 0); >>>>> >>>>> PA_TYPE_RAW would be useful for reading a list (or anything else) into >>>>> a C string without unescaping or other processing. There could be also >>>>> PA_TYPE_IGNORE for uninteresting variables, and PA_TYPE_*_ARRAY for >>>>> arrays of basic types. >>>>> >>>>> (Now it occurred to me that the 'c' argument name that is used in the >>>>> parsing functions is a bit weird and unhelpful. Maybe "param_list" >>>>> would be good?) >>>>> >>>> I still don't see your point. Again, the use of is_unpacked is >>>> transparent for the >>>> user of read_string(), so the user just reads a string without having to >>>> care about >>>> unescaping. The flag does not complicate the API, it simplifies it >>>> because the >>>> unescaping is done automatically. >>> The flag complicates the split_list() function parameters, but also the >>> read_string() semantics: you need to explain to the user that unlike >>> all the other read_foo() functions, read_string() can read any value >>> and unescaping becomes conditional because of this. >> read_string() is not supposed to read any value, it is only supposed >> to read strings. Actually, the user does not need to know anything >> about escaping, because read_string() and write_string() do the >> necessary escaping/unescaping completely transparent for the user. >> >> The is_unpacked flag is at least useful to check if something returned >> by split_list() is really a simple type and not a structure or array. I am >> currently not using it in the read_foo() functions, but I think I should. > I guess you'd use is_unpacked for catching errors in the read_foo() > functions? That seems reasonable, but I'd like that to be done in an > internal function. You said that maybe split_list() could be made an > internal function, and that seems like a good idea. An alternative > would be to have an internal function for which split_list() would be a > simplified wrapper. I guess I can make split_list() internal. > >>>> Your approach seems unnecessary >>>> complicated to me. A string is a string, even if it contains >>>> sub-structures. Your >>>> split_list_into_array() function would basically return an array of >>>> strings, so what >>>> would be the advantage? It would only make parsing more cumbersome, because >>>> the task of unescaping is given to the user instead of doing it >>>> automatically where >>>> necessary. >>> That's why I came up with the vararg function. I agree that >>> split_list_into_array() is unlikely to be very useful. >>> >>>> Also there is no "mishmash of random types", they are all strings It is >>>> only the >>>> difference between a "simple" string which needs no further processing and a >>>> "complex" string which needs further parsing. >>> The API defines types for the parameters. There are no separate >>> "simple" and "complex" strings in the type system. The string type is >>> different than the list type. What you call complex strings are in my >>> mind just the raw unprocessed serialized data, which is a different >>> abstraction level than the values that the various read_foo() functions >>> return. It feels just very wrong to have a function that returns values >>> in both domains: unparsed raw data and parsed values. Especially when >>> the function is a family of read_foo() functions where all other >>> functions only operate on their designated types. >> Yes, that is some additional functionality that the read_string() >> function provides on top of reading what I called "simple strings" >> above. But that does not hinder the function to work exactly >> like expected when reading a plain string. Maybe a better name >> for the function would be read_string_element(), indicating that >> it can read either a plain string or an element of the parameter >> list as a string. >> I could split this into two functions, read_string() which always >> does unescaping and read_element() which would be a wrapper >> around split_list() and would never do unescaping. In both functions, >> the is_unpacked flag can be useful to check if the input matches >> the type "plain string" or "serialized data". > A separate function for reading the raw data sounds good to me. Are you > attached to the read_element() name, or could it be read_raw() (or > perhaps read_raw_value() or read_raw_element())? Somehow read_raw() > seems better to me, but I can't make strong arguments for why it should > be called that. > read_raw() seems OK as a counterpart to write_raw().