Re: [WIP-PATCH 1/2] send-email: create email parser subroutine

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/29/2016 01:33 AM, Eric Wong wrote:
Matthieu Moy <Matthieu.Moy@xxxxxxxxxxxxxxx> wrote:
Samuel GROOT <samuel.groot@xxxxxxxxxxxxxxxx> writes:

Parsing and processing in send-email is done in the same loop.

To make the code more maintainable, we create two subroutines:
- `parse_email` to separate header and body
- `parse_header` to retrieve data from header

These routines are not specific to git send-email, nor to Git.

Does it make sense to use an external library, like
http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
either by depending on it, or by copying it in Git's source tree ?

That might be overkill and increase installation/maintenance
burden.  Bundling it would probably be problematic to distros,
too.

I have no opinion on that topic, but it could be interesting to have other opinions. For the first patch I thought it would be easier and quicker to use code already written, and maybe use another method in the next iteration.

Email::Simple is licensed under Perl's Artistic License or GPL (v1 or any later version), so it's fine to bundle it.

If not, I think it would be better to introduce an email parsing library
in a dedicated Perl module in perl/ in our source tree, to keep
git-send-email.perl more focused on the "send-email" logic.

Sounds good, Git.pm already has parse_mailboxes

I agree, I will look into that.

+sub parse_email {
+	my @header = ();
+	my @body = ();
+	my $fh = shift;
+
+	# First unfold multiline header fields
+	while (<$fh>) {
+		last if /^\s*$/;
+		if (/^\s+\S/ and @header) {
+			chomp($header[$#header]);
+			s/^\s+/ /;
+			$header[$#header] .= $_;
+		} else {
+			push(@header, $_);
+		}
+	}
+
+	# Now unfold the message body

Why "unfold"? Don't you mean "split message body into a list of lines"?

+	while (<$fh>) {
+		push @body, $_;
+	}

I'd rather avoid the loops entirely and do this:

	local $/ = "\n"; # in case caller clobbers $/
	@body = (<$fh>);

I didn't know this method before, thanks for suggesting it!

+	return (@header, @body);
+}


+		if (defined $input_format && $input_format eq 'mbox') {
+			if (/^Subject:\s+(.*)$/i) {
+				$subject = $1;
+			} elsif (/^From:\s+(.*)$/i) {
+				$from = $1;

Not sure we need thes if/elsif/ for generic headers. Email::Simple's API
seems much simpler and general: $email->header("From");

Right.  Reading this, it would've been easier to parse headers into a
hash (normalized keys to lowercase) up front inside parse_email.

So should we merge parse_email and parse_header in one unique subroutine?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]