W dniu 31.07.2016 o 21:49, Lars Schneider pisze: > On 31 Jul 2016, at 11:42, Jakub Narębski <jnareb@xxxxxxxxx> wrote: >> W dniu 31.07.2016 o 00:05, Jakub Narębski pisze: >>> W dniu 30.07.2016 o 01:38, larsxschneider@xxxxxxxxx pisze: [...] >>> I think it would be nice to have here at least summary of the benchmarks >>> you did in https://github.com/github/git-lfs/pull/1382 This would be nice to have in the commit message: real benchmarks. >> >> Note that this feature is especially useful if startup time is long, >> that is if you are using an operating system with costly fork / new process >> startup time like MS Windows (which you have mentioned), or writing >> filter in a programming language with large startup time like Java >> or Python (the latter may have changed since). >> >> https://gnustavo.wordpress.com/2012/06/28/programming-languages-start-up-times/ > > OK, I will add this. Is it OK to add the link to the commit message? > (since I don't know how long the link will be available). I don't think it is needed. Perhaps only a sentence or half to notify where you could get most from this feature, but even then it is not necessary. I'm sorry for the confusion. >> See below for proposal with two places to signal errors: before sending >> first byte, and after. > > Right now the protocol is implemented covering the following cases: > > ## CASE 1 - no stream success It is less "stream", more "size unknown". Real streaming is interleaving reading and writing, which is currently not supported due to lack of start_async() - I think. > > packet: git< size=57\n > packet: git< SMUDGED_CONTENT > packet: git< 0000 > packet: git< success\n Right. What happens if either length(SMUDGED_CONTENT) < size, or length(SMUDGED_CONTENT) > size? It could conceivably happen, e.g. due to an error in size calculation. NOTE that without using flush packet to signal end of contents, we would be not able to signal a situation when filter encounters an error (per-file, or long temporary) when it have written some content already. For example this may happen for git-LFS filter, if the server hosting artifactory (or even whole network) gets down during cleanup / smudging. Well, unless we would use other special packets: - empty packet, that is "0004" pkt-line - invalid packet, that is "0001", "0002", "0003" pkt-line to signal premature end of SMUDGED_CONTENT. > > > ## CASE 2 - no stream success but 0 byte response > > packet: git< size=0\n > packet: git< success\n Why there is need to special case 0 byte (empty file) response? packet: git< size=0\n packet: git< 0000 packet: git< success\n is perfectly fine. > ## CASE 3 - no stream filter; filter doesn't want to process the file > > packet: git< size=0\n > packet: git< reject\n Why not simply packet: git< reject\n Or, if we are going success/reject/whatever route packet: git< size=0\n packet: git< 0000 packet: git< reject\n > ## CASE 4 - no stream filter; filter error > > packet: git< size=57\n > packet: git< SMUDGED_CONTENT > packet: git< 0000 > packet: git< error\n > > CASE 4 is not explicitly checked. If a final message is neither > "success" nor "reject" then it is interpreted as error. If that > happens then Git will shutdown and restart the filter process > if there is another file to filter. This should be documented. > > Alternatively a filter process can shutdown itself, too, to signal > an error. > > The corresponding stream filter look like this: > > ## CASE 1 - stream success > > packet: git< SMUDGED_CONTENT > packet: git< 0000 > packet: git< success\n > > > ## CASE 2 - stream success but 0 byte response > > packet: git< 0000 > packet: git< success\n > > > ## CASE 3 - stream filter; filter doesn't want to process the file > > packet: git< 0000 > packet: git< reject\n > > > ## CASE 4 - stream filter; filter error > > packet: git< SMUDGED_CONTENT > packet: git< 0000 > packet: git< error\n > > -- > > I just realized that the size 0 case is a bit inconsistent > in the no stream case as it has no flush packet. Maybe I > should indeed remove the flush packet in the no stream case > completely?! That's what I wrote about SPOT (single point of truth), of using either size or flush packet, but not both. But... As I wrote, you need some mechanism to signal premature end of contents, and start of an error description. > > Do the cases above make sense to you? Except for the inconsistency of the size 0 case. This what I meant to say. > > Regarding error handling. I would prefer it if the filter prints > all errors to STDERR by itself. I think that is the safest > option to communicate errors to the users because if the communication > got into a bad state then Git might not be able to read the errors > properly. > > See Peff's response on the topic, too: > http://public-inbox.org/git/20160729165018.GA6553%40sigill.intra.peff.net/ Actually it looks like Peff is slightly against using stderr. JK> Git-LFS sends to stderr because there's no other option. I wonder if it JK> would be nicer to make it Git's responsibility to talk to the user, JK> because then it could respect things like "--quiet". I guess error JK> messages are generally printed regardless of verbosity, though, so JK> printing them unconditionally is OK. I think it should be O.K., and it makes writing filter drivers simpler if we don't have multiplex channels. >> NOTE: there is a bit of mixed and possibly confusing notation, that >> is 0000 is flush packet, not packet with 0000 as content. Perhaps >> write pkt-line in full? > > I am not sure I understand what you mean (maybe it's too late for me...). > Can you try to rephrase or give an example? Compare packet: git< 0000 with packet: git< success\n The former as pkt-line is git< 0000 the latter is git< 000csuccess\n ^^^^ \-- packet header -- Jakub Narębski -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html