Re: [v4 2/5] virsh: Improve readline generators and readline completion

Eric Blake <eblake@xxxxxxxxxx> · Tue, 17 Sep 2013 21:22:06 -0600

On 09/10/2013 09:54 AM, Tomas Meszaros wrote:
> This patch is rather big and introduces several new functions,
> but I kept all new functions in the one patch because they are
> all connected together.
> 
> I had to extend vshReadlineOptionsGenerator() a lot, so it is possible
> to fully complete options by calling appropriate opt->completer().
> 
> vshReadlineOptionsCompletionGenerator() is simple new function
> which is used only when we need to fully complete specific option,
> e.g.: virsh # vol-key --vol <TAB>
> 
> ---

I still think you are mixing a bit too much into one patch, and that you
are reimplementing pieces of the parser instead of refactoring what is
already existing in other places for easier reuse.

On the other hand, I'm seeing some nice improvements. Below, I'm using
^^ to highlight what I typed; best viewed in fixed width font.

Pre-patch, I'm seeing:

virsh# vol-k<TAB>
       ^^^^^^^^^^
virsh# vol-key <TAB>
               ^^^^^
virsh# vol-key --pool <TAB>
                      ^^^^^
virsh# vol-key --pool --pool

while post-patch, it changes to:

virsh# vol-k<TAB>
       ^^^^^^^^^^
virsh# vol-key --<TAB>
                 ^^^^^
--help  --pool  --vol
virsh# vol-key --<TAB>
                 ^^^^^
--help  --pool  --vol
virsh# vol-key --p<TAB>
                 ^^^^^^
virsh# vol-key --pool <TAB>
                      ^^^^^
--help  --vol

Which is definitely better, but still not perfect (--pool takes a
mandatory argument, so we should NOT be offering --vol as a valid
completion once --pool has been typed, until AFTER the argument of pool
has also been typed).

> +++ b/tools/virsh.c
> @@ -2587,6 +2587,118 @@ vshCloseLogFile(vshControl *ctl)
>   * -----------------
>   */
>  
> +static const vshCmdDef *
> +vshDetermineCommandName(void)
> +{
> +    const vshCmdDef *cmd = NULL;
> +    char *p;
> +    char *cmdname;
> +
> +    if (!(p = strchr(rl_line_buffer, ' ')))
> +        return NULL;
> +
> +    cmdname = vshMalloc(NULL, (p - rl_line_buffer) + 1);
> +    memcpy(cmdname, rl_line_buffer, p - rl_line_buffer);
> +
> +    cmd = vshCmddefSearch(cmdname);
> +    VIR_FREE(cmdname);
> +
> +    return cmd;
> +}

This particular part of the patch looks like it is just code motion
(moving it earlier in the file to avoid having to do a forward
reference).  If so, please split that into a separate patch.

Both pre- and post-patch, I noticed:

virsh# e''<TAB>
       ^^^^^^^^
list of files in current directory starting with e

although what I really want is:

virsh# e''<TAB>
       ^^^^^^^^
echo  edit  emulatorpin  exit
virsh# e''cho hi<ENTER>
          ^^^^^^^^^^^^^
hi

that is, the virsh parser is already pretty sophisticated about doing
quote removal, but the readline parser is not.  I think this function
needs to take advantage of what parsing we already do.

Ideally, you should be using stuff like vshCommandStringGetArg to take
rl_line_buffer and split that into de-quoted words, and only then try to
figure out what completions to attempt.  That is, instead of trying to
complete on "e''" (which is not a prefix of any command, and therefore
falls back to the default completion of file names), you would be
completing on "e" (the results after vshCommandStringGetArg removes
quotes).  Then again, if you do that, then you also have to figure out
the readline hooks so that when the completion is not ambiguous (or when
it is ambiguous but has a non-ambiguous prefix and the user has
requested that <TAB> auto-type the common prefix), you have to know
which part of the completion is the suffix to append to what the user
typed without stripping the quotes they already typed.  It can get
tricky fast, so maybe the current behavior is okay, but food for thought.

> +
> +/*
> + * Check if option with opt_name is already fully completed.
> + */
> +static bool
> +vshOptFullyCompleted(const char *opt_name)
> +{
> +    const vshCmdDef *cmd = NULL;
> +    char *completed_name = NULL;
> +    char **completed_list = NULL;
> +    size_t completed_list_index;
> +    size_t opts_index = 0;
> +    bool opt_fully_completed = false;
> +
> +    if (!opt_name)
> +        return opt_fully_completed;
> +
> +    cmd = vshDetermineCommandName();

Here's an interesting exercise - run under gdb and put a breakpoint on
vshDetermineCommandName() (that gets tricky - I find it easiest to start
virsh in batch mode in one terminal, then use 'gdb --pid $(pidof
lt-virsh)' in another), and see how many times you are calling this
function (not all necessarily from this call site, but still
instructive).  For example, I counted:

virsh# e<TAB> => once
virsh# ec<TAB> => once
virsh# echo <TAB> => seven times
virsh# echo --s<TAB> => six times

Wow.  Once should be sufficient; which means we aren't caching what
we've learned about rl_line_buffer.  Really, when the user hits <TAB>,
we should COMPLETELY parse what they've typed so far (tokenize it into
an array of arguments with quotes stripped, with special handling if the
final argument is an unterminated quote such as hitting <TAB> after
unmatched '), and then ALL of our completion functions should refer back
to that cached structure for the remainder of their work, without having
to reparse the line.  Readline already gives you the hints you need for
proper memory management of that cache (argument of 0 when <TAB> is hit,
non-zero for all subsequent work until you finally return NULL to end
the completion list).  (Yeah, I wish it were via an opaque pointer,
rather than requiring you to store it in global variables, but such is
life when dealing with super-old APIs that fortunately still happen to
only need single-threading even today).

Your only saving grace is that tab completion is interactive and that we
don't have any command with more than 32 options by virtue of how we
track which options have been seen so far in an int bitmap, so even
though your behavior feels like O(n*2) (or maybe even O(n*3)), we don't
necessarily hit the penalties of scale where the user will notice the
wasted cpu cycles.  On the other hand, I did NOT test is whether you
call vshDetermineCommandName() once per volume in the per-volume
completer - and there, we KNOW there are users that stick literally
thousands of volumes in one pool, where the cost of calling
vshDetermineCommandName() per volume would quickly become a noticeable
delay.  There is no reason why we cannot get the parsing down to O(1)
(once per tab).

> +
> +    if (!cmd)
> +        return opt_fully_completed;
> +
> +    while (cmd->opts[opts_index].name) {
> +        const vshCmdOptDef *opt = &cmd->opts[opts_index];
> +        opts_index++;
> +
> +        if (!STREQ(opt->name, opt_name) || !opt->completer)
> +            continue;
> +
> +        completed_list_index = 0;
> +        completed_list = opt->completer(opt->completer_flags);
> +
> +        if (!completed_list)
> +            continue;
> +
> +        while ((completed_name = completed_list[completed_list_index])) {
> +            completed_list_index++;
> +            if (strstr(rl_line_buffer, completed_name))

strstr() is NOT the right function to be using.  A demonstration that
strstr is not right:

virsh# echo "string with --shell in middle" --shell<ENTER>
'string with --shell in middle'
virsh# echo "string with --shell in middle" --<TAB>
--help  --str   --xml

Oops.  Notice that the way virsh parses command lines, it is VALID to
put --shell AFTER the first --str argument; but your strstr mistakenly
hits on the embedded --shell inside the middle of the (implicit --str)
argument and refuses to treat --shell as a valid completion at the point
where I hit TAB.

Again, I argue that you should parse the complete line into an array of
strings, and then evaluate those strings in order, similar to how
vshCmddefOptParse is behaving.  In fact, I think that your first order
of business is to prepare a preliminary patch that refactors the body of
vshCmddefOptParse into calls to a helper function, so that both it and
your new code share the same parse engine, and the only place they
differ is what they do once the parse is complete - vshCmddefOptParse
raises errors if an option is missing an argument, if a required option
is missing, if an unknown option is found, or invokes the command; while
tab completion knows that an option missing an argument or a missing
required option means to do the per-option completer for that particular
option, unknown options mean that there is no longer anything to
complete, and the end result is displaying the completion instead of
running the command.

I also think it would be wise, for testing purposes, to implement my
proposed new 'virsh complete ...' command EARLY in the series, so that
you can more easily test scenarios from the command line instead of
having to fire up interactive gdb sessions across two terminals (that
is, it's easier to debug 'gdb --args virsh complete echo -' than it is
to fire up 'virsh' in one window, attach gdb in the other window, type
'echo -<TAB>' in the first window, then interact with gdb in the
second).  That, and having 'virsh complete' working would let us write
unit tests (tests/virshtest.c and tests/virsh-optparse already do a lot
of argument parsing testing on 'echo', these tests would be a great
starting point for a new test that virsh complete returns what we think
it should).

Remember, if you parse this exactly once, you would be left with:

argv[0] = "echo", associated with cmd name
argv[1] = "string with --shell in middle", associated with --str
argv[2] = "--", needs completion

along with the knowledge that --help, --shell, --str, and --xml are all
still valid at that point (--help because it is always valid if we
aren't waiting for an option's required argument; --str because it can
appear more than once thanks to its VSH_OT_ARGV data-type, and the other
two because they haven't been seen yet).

At this point, I'm torn on whether it is effective to review the rest of
your patch as-is, knowing that parts of it may be irrelevant if you
first do the cleanup to fix the algorithms behind the patch.  I'm glad
that you're making progress, and I want to encourage you to keep trying
(especially since I know this is a summer of code project, where you are
under a timeline to get something submitted to qualify for benefits
outside the scope of this list).  And I also apologize that it has taken
me a week to review your patch (you posted on the 10th, and it is now
the 17th), as that cuts into your timeline - if nothing else, I hope
that this serves as a positive learning experience that getting the
design right up front is KEY to getting the code approved later with
minimal churn (coding first, only to have the design ripped to shreds
and having to do more iterations of code, can be frustrating, both to
the coder and to the reviewers).  Please, even if the summer of code
deadlines pass, continue your work on this until it is in - I definitely
want this in virsh.

Revisiting some examples from my earlier mails:

>> virsh# vol-key vol <TAB>
>> pool1 pool2 --pool

The one-time parser should split this into:

argv[0] = "vol-key", cmd name
argv[1] = "vol", argument to implicit --vol
argv[2] = "", needs completion
at this point, --pool is the next available option, so the completion is
the union of the per-option pool completer (but with names beginning
with - filtered out, because the parser would treat leading - as the
start of an option rather than a pool name) and the possible remaining
options (--pool and the ever-present --help)

>> 
>> virsh# vol-key -<TAB>
>> --vol --pool

argv[0] = "vol-key", cmd name
argv[1] = "-", needs completion
at this point, we know we have to complete an option, so we don't use
any per-option completer (neither volume nor pool names can appear
here), and use just the option completer (--vol, --pool, and the
ever-present --help)

>> 
>> virsh# vol-key v<TAB>
>> vol1 vol2

argv[0] = "vol-key", cmd name
argv[1] = "v", needs completion
at this point, we know we have to complete a non-option; as no options
have been consumed yet, this is the first option that requires an
argument, so we treat it as an implicit --vol, and run just the volume
name completer (filtered to just names beginning with 'v').

>> 
>> virsh# vol-key --pool <TAB>
>> pool1 pool2

argv[0] = "vol-key", cmd name
argv[1] = "--pool", --pool option requires argument
argv[2] = "", needs completion
at this point, we know we have to supply the argument to the --pool
option, so we run just the pool name completer, and have nothing to
filter on (all pool names are valid, even those beginning with '-').
Guess what - even though I said "--help" is ever-present, here it does
not get listed (--help is only present when an option can appear, but
here we expect only an argument)

>> 
>> virsh# vol-key --pool pool1 <TAB>
>> vol1 vol2 --vol

argv[0] = "vol-key", cmd name
argv[1] = "--pool", --pool option requires argument
argv[2] = "pool1", provides argument to --pool
argv[3] = "", needs completion
at this point, we can take either an option or an argument to an
implicit option; the next available option needing an argument is --vol,
so we run the union of the per-option volume completer (filter out
leading '-') and the remaining option completer (filter out --pool,
leaving just --vol and the ever-present --help).

>> 
>> virsh# vol-key --pool=[<TAB>
>> --pool=pool1 --pool=pool2

argv[0] = "vol-key", cmd name
argv[1] = "--pool", --pool option requires an argument
argv[2] = "[", needs completion
ooh tricky - I split "--pool=<TAB>" into two arguments, and then
complete on just the per-option pool completer (filtered to names
beginning with 'p'); then have to reconstruct it back into a single
string for display.  But having the parser normalize the code once, then
generate the completion list, then munge the generated list back into
display format at the end, will be easier than having to write the
generation list to handle '--option arg' and '--option=arg' as
equivalent throughout the entire body of completer code.

[Oh, and I have homework too - I STILL need to submit my patch for a
much simpler ./configure --without-readline.  It's a challenge to see
who can post the next patch :) ]

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment:
signature.asc

Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list