[no subject]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> "a directory" -> "an existing directory"?
> 
> I am not comfortable to see the word "directory" used in this
> proposed log message, as some refs could be stored in the packed
> backend and are referenced by the symbolic ref you are inspecting
> (this comment also refers to the "refs/ directory" you mentioned
> earlier as "the first check").
> 
>     Lastly, a symbolic ref MUST either point to an existing ref,
>     or if the referent does not exist, it MUST NOT be a leading
>     subpath for another existing ref (e.g., when "refs/heads/main"
>     exists, a symbolic ref that points at "refs/heads" is a no-no).
> 
> or something (but again, I am open to a phrasing better than
> "subpath").
> 
> Design question.  What do we want to do when we have no loose refs
> under the "refs/heads/historical/" hiearchy, (i.e. all of them are
> in packed-refs file) hence ".git/refs/heads/historical" directory
> does not exist on the filesystem.  And a symbolic ref points at
> "refs/heads/historical".  Shouldn't we give the same error whether
> the .git/refs/heads/historical directory exist or not, as long as
> the refs/heads/historical/main branch exists (in the packed-refs
> backend)?
> 

I guess I need to think carefully here. Actually, my intention is that I
want to concentrate on the loose refs and then take consideration about
the packed refs.

However, from what you have said above, it seems I could not do this.
They are connected. But at current, I am not so familiar with packed
refs behavior, I could not answer all the questions above.

I decide to understand what packed-ref done. So, this series may be
stalled sometime until I have a good knowledge and re-think the design
here.

> > +`escapeReferent`::
> > +	(ERROR) The referent of a symref is outside the "ref" directory.
> 
> I am not sure starting this as ERROR is wise.  Users and third-party
> tools make creative uses of the system and I cannot offhand think of
> an argument why it should be forbidden to create a symbolic link to
> our own HEAD or to some worktree-specific ref in another worktree.
> 

Do we allow this cross-access (hack)? It might cause some trouble from
my perspective.

> > +	if (referent->buf[referent->len - 1] != '\n') {
> 
> As you initialized "len" to "referent->len-1" earlier, wouldn't it
> more natural to use it here?  That would match the incrementing of
> len++ later in this block.
> 

Yes, exactly.

> > +		ret = fsck_report_ref(o, report,
> > +				      FSCK_MSG_REF_MISSING_NEWLINE,
> > +				      "missing newline");
> > +		len++;
> > +	}
> 
> Having said that, the above should be simplified more like:
> 
>  * declare but not initialize "len".  better yet, declare "orig_len"
>    and leave it uninitialized.
> 
>  * do not touch "len++" in the above block (actually, you can
>    discard the above "if(it does not end with LF)" block, see
>    below).
> 
>  * instead grab "referent->len" in "len" (or "orig_len") immediately
>    before you first modify referent, i.e. before strbuf_rtrim() call.
> 
> 	orig_len = referent->len;
> 	orig_last_byte = referent->buf[orig_len - 1];
> 

I agree.

> > +	strbuf_rtrim(referent);
> > +	if (check_refname_format(referent->buf, 0)) {
> > +		ret = fsck_report_ref(o, report,
> > +				      FSCK_MSG_BAD_REFERENT_NAME,
> > +				      "points to refname with invalid format");
> 
> Similar to an earlier step, the message does not give any more
> information than the enum.  Wouldn't the user who got this error
> want to learn what referent->buf said and which part of it was bad
> in the same message, instead of having to look it up on their own
> after fsck finishes?
> 

Yes, I agree. I will improve this.

> > +		goto out;
> > +	}
> 
> At this point we know check_refname_format() is happy with what is
> left after rtrimming the referent.  There are four cases:
> 
>  - rtrim() did not trim anything (orig_len == referent->len); the file
>    lacked the terminating LF.
> 
>  - rtrim() trimmed one byte (orig_len - 1 == referent->len) and
>    the byte was not LF (orig_last_byte != '\n').  The file lacked
>    the terminating LF.
> 
>  - rtrim() trimmed exactly one byte (orig_len - 1 == referent->len)
>    and the byte was LF (orig_last_byte == '\n').  There is no error.
> 
>  - all other cases, i.e., rtrim() trimmed two or more bytes.  The
>    file had trailing whitespaces after a valid referent that passed
>    check_refname_format().
> 

That's so clear. My implementation is not good compared with this.

> So in short,
> 
> 	if (referent->len == orig_len ||
> 	    referent->len == orig_len - 1 && orig_last_byte != '\n') {
> 		FSCK_MSG_REF_MISSING_NEWLINE;
> 	} else if (referent->len < orig_len - 1) {
> 		FSCK_MSG_REF_TRAILING_WHITESPACE;
> 	}
> 
> can replace the next block you wrote, and we can also remove the
> earlier "it is an error if it does not end with '\n'", I think.
> 
> > +	if (len != referent->len) {
> > +		ret = fsck_report_ref(o, report,
> > +				      FSCK_MSG_TRAILING_REF_CONTENT,
> > +				      "trailing garbage in ref");
> 
> As check_refname_format() was happy, the difference between orig_len
> and referent->len are only coming from trailing whitespaces, i.e. it
> is not that it had arbitrary garbage.  Shouldn't we be more explicit
> about that?
> 

Yes, I made a lot of mistakes when calling the "fsck_report_ref". I will
report the exact garbage content to the user.

> > +	/*
> > +	 * Dangling symrefs are common and so we don't report them.
> > +	 */
> > +	if (lstat(referent_path->buf, &st)) {
> > +		if (errno != ENOENT) {
> > +			ret = error_errno(_("unable to stat '%s'"),
> > +					  referent_path->buf);
> > +		}
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * We cannot distinguish whether "refs/heads/a" is a directory or not by
> > +	 * using "check_refname_format(referent->buf, 0)". Instead, we need to
> > +	 * check the file type of the target.
> > +	 */
> > +	if (S_ISDIR(st.st_mode)) {
> > +		ret = fsck_report_ref(o, report,
> > +				      FSCK_MSG_BAD_REFERENT_FILETYPE,
> > +				      "points to the directory");
> > +		goto out;
> > +	}
> 
> If referent_path->buf refers to "refs/heads/historical/", and all
> the branches under the hierarchy have been sent to packed-refs,
> then this check will not trigger.
> 

Yes, because "refs/heads/historical" will not appear in the filesystem.

> I wonder if this check is the right thing to enforce in the first
> place, though.
> 
> As far as the end user is concerned, refs/heads/historical/master
> branch stil exists, and there is no refs/heads/historical branch, so
> such a symbolic ref, for all intents and purposes, is the same as
> any other dangling symbolic refs, no?
> 
> Of course, "git update-ref SUCH_A_SYMREF HEAD" will complain because
> there is refs/heads/historical, with something like 
> 
>     "refs/heads/historical/master" exists, cannot create "refs/heads/historical"
> 
> but that is to be expected.  If you remove the last branch in the
> refs/heads/historical hierarchy, you should be able to do such an
> update-ref to instanciate refs/heads/historical as a regular ref.
> 

I am a little shocked here. I do this in action and find the directory
will be automatically converted to a regular file in the filesystem. So,
I agree with you here. We should never check this, because we allow
symref to point to a directory. As long as there is no loose refs and
packed refs under this directory, we could use "git update-ref" for this
symref.

Thanks,

> > @@ -3484,12 +3553,24 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
> >  					      "trailing garbage in ref");
> >  			goto cleanup;
> >  		}
> > +	} else {
> > +		strbuf_addf(&referent_path, "%s/%s",
> > +			    ref_store->gitdir, referent.buf);
> > +		/*
> > +		 * the referent may contain the spaces and the newline, need to
> > +		 * trim for path.
> > +		 */
> > +		strbuf_rtrim(&referent_path);
> 
> I doubt this is a good design.  We have referent, and the symbolic
> ref checker knows that the true referent refname may be followed by
> whitespaces, so instead of inventing referent _path here, it would
> be a better design to let the files_fsck_symref_target() to decide
> what file to open and check based on referent, no?  Give it the
> refstore or refstore's gitdir and have the concatenation with the
> rtrimmed contents in the referent->buf after it inspected it
> instead, perhaps?
> 

Yes, I agree with you here. We should use "files_fsck_symref_target" to
do this.


----


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux