Derrick Stolee wrote: > On 8/3/2022 9:45 PM, Victoria Dye via GitGitGadget wrote: >> From: Victoria Dye <vdye@xxxxxxxxxx> > >> +--diagnose[=(basic|all)]:: >> + Create a zip archive of information about the repository including logs > > logs? I think the reflogs are not included unless "all" is specified. Perhaps > we can unify this description with the beginning of git-diagnose.txt: > > Collects detailed information about the user's machine, Git client, and > repository state and packages that information into a zip archive. > > resulting in > > Create a zip archive containing information about the user's machine, > Git client, and repository state. > [...] >> + and certain statistics describing the data shape of the repository. The >> + archive is written to the same output directory as the bug report and is >> + named 'git-diagnostics-<formatted suffix>'. >> ++ >> +By default, `--diagnose` (equivalent to `--diagnose=basic`) will collect only >> +statistics and summarized data about the repository and filesystem. Specifying >> +`--diagnose=all` will create an archive with the same contents generated by `git >> +diagnose --all`; this archive will be much larger, and will contain potentially >> +sensitive information about the repository. See linkgit:git-diagnose[1] for more >> +details on the contents of the diagnostic archive. > > Perhaps here (and git-diagnose.txt) should be really explicit about sharing the > "all" mode output only with trusted parties. Let the user decide what level of > trust is necessary depending on their situation (we don't need to say "open source > repos are fine to share" or something). Both of these documentation suggestions make sense to me; I'll update accordingly in V3. > >> +enum diagnose_mode { >> + DIAGNOSE_NONE, >> + DIAGNOSE_BASIC, >> + DIAGNOSE_ALL >> +}; > > This enum makes me think that it might be nice to use this in diagnose.h > along with an array that pairs strings with the enum. We could unify the > options by having 'git diagnose --mode=(basic|all)' which could be > extended in the future with another mode that might be in between the two. > > It may also be a waste of time to set up that infrastructure without it > actually mattering in the future, but I thought I'd mention it as an > alternative, in case that inspires you. Your suggestion is more extensible than the boolean "include_everything" I'm using right now in 'create_diagnostics_archive()'. I'll incorporate it into my next re-roll, thanks! > >> static void get_system_info(struct strbuf *sys_info) >> { >> @@ -91,6 +97,23 @@ static void get_header(struct strbuf *buf, const char *title) >> strbuf_addf(buf, "\n\n[%s]\n", title); >> } >> >> +static int option_parse_diagnose(const struct option *opt, >> + const char *arg, int unset) >> +{ >> + enum diagnose_mode *diagnose = opt->value; >> + >> + BUG_ON_OPT_NEG(unset); >> + >> + if (!arg || !strcmp(arg, "basic")) >> + *diagnose = DIAGNOSE_BASIC; >> + else if (!strcmp(arg, "all")) >> + *diagnose = DIAGNOSE_ALL; > > Should we allow "none" to reset the value to DIAGNOSE_NONE? As far as I can tell, while some builtins have options that match the default behavior of the command (e.g., '--no-autosquash' in 'git rebase'), those options typically exist to override a config setting (e.g., 'rebase.autosquash'). No config exists for 'bugreport --diagnose' (and I don't think it would make sense to add one), so '--diagnose=none' would only be used to override another '--diagnose' specification in the same command/alias (e.g., 'git bugreport --diagnose=basic --diagnose=none'). That use case seems unlikely, but if there's precedent or use cases I'm not accounting for, I'm happy to add the option. > >> + else >> + die(_("diagnose mode must be either 'basic' or 'all'")); > > I wondered initially if this should be a usage() call instead. But we have > plenty of examples of using die() to report an issue with a single option > or a combination of options. After looking at other 'OPT_CALLBACK_F' examples (e.g., the 'diff_opt_stat' options in 'diff.c'), I think this should at least change to 'return error(<something>)' to get a more appropriate return code (129): else return error(_("invalid --%s mode '%s'"), opt->long_name, arg); I also looked into using 'usage_msg_opt()' to print both an error message and the usage string, but doing so would require making the 'bugreport' options structure & variables static so they're accessible by 'option_parse_diagnose()'. If you think it would be valuable to include that information, I'll add an extra refactoring patch to include it. > >> const struct option bugreport_options[] = { >> + OPT_CALLBACK_F(0, "diagnose", &diagnose, N_("(basic|all)"), >> + N_("create an additional zip archive of detailed diagnostics"), >> + PARSE_OPT_NONEG | PARSE_OPT_OPTARG, option_parse_diagnose), > > The biggest reason for this to be an OPT_CALLBACK_F is because of the > '--diagnose' option (without '='), so an OPT_STRING would not be > appropriate here. > >> @@ -119,6 +147,7 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix) >> option_output ? option_output : ""); >> strbuf_addstr(&report_path, prefixed_filename); >> strbuf_complete(&report_path, '/'); >> + output_path_len = report_path.len; > > Perhaps this should be renamed to output_dir_len, since we know this is > a directory that will contain all of the output files. Agreed, 'path' is a bit ambiguous and 'dir' more clearly refers to the directory containing the output files. > >> + /* Prepare diagnostics, if requested */ >> + if (diagnose != DIAGNOSE_NONE) { >> + struct strbuf zip_path = STRBUF_INIT; >> + strbuf_add(&zip_path, report_path.buf, output_path_len); >> + strbuf_addstr(&zip_path, "git-diagnostics-"); >> + strbuf_addftime(&zip_path, option_suffix, localtime_r(&now, &tm), 0, 0); >> + strbuf_addstr(&zip_path, ".zip"); >> + >> + if (create_diagnostics_archive(&zip_path, diagnose == DIAGNOSE_ALL)) > > (Just pausing to say this could be create_diagnostics_archive(&zip_path, diagnose) > if we use the enum inside diagnose.c. > > Thanks, > -Stolee