[WIP 0/7] CDN offloading of fetch response

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



It's been a while, so here is an updated version of what I previously
sent [1]. The main difference is that fetch-pack now actually downloads
whatever the server tells it to. The second main difference is that
we no longer buffer progress messages and suspend keepalives - we
no longer need to, because the sideband-all patches have been merged.

I think I've address all the technical comments in [1], except one
comment of Junio's [2] that I still don't understand:

> And the "fix" I was alluding to was to update that "else" clause to
> make it a no-op that keeps os->used non-zero, which would not call
> send-client-data.
>
> When that fix happens, the part that early in the function this
> patch added "now we know we will call send-client-data, so let's say
> 'here comes packdata' unless we have already said that" is making
> the decision too early.  Depending on the value of os->used when we
> enter the code and the number of bytes xread() reads from the
> upstream, we might not call send-client-data yet (namely, when we
> have no buffered data and we happened to get only one byte).

With or without this fix, I don't think there is ever a time when we
say "here comes packdata" without calling send-client-data - we say
"here comes packdata" only when we see the string "PACK", which forms
part of the packfile, and thus we should have no problem sending any
client data. Having said that, maybe this is a moot point - Junio says
that this only happens when the fix is implemented, and the fix is not
implemented.

There are probably some more design discussions to be had:

 - Requirement that all pointed-to CDNs support byte ranges for
   resumability, and to guarantee that the packfiles will be there
   permanently. After some thought, it is a good idea for CDNs to
   support that, but I think that we should support CDNs that can only
   give temporal guarantees (e.g. if/when we start implementing
   resumption, we could read the Cache headers). I didn't add any
   mention of this issue in the documentation.

 - Client-side whitelist of protocol and hostnames. I've implemented
   whitelist of protocol, but not hostnames.

 - Any sort of follow-up fetch - for example, if the download from the
   CDN fails or if we allow the server to tell us of best-effort
   packfiles (but the client still must check and formulate the correct
   request to the server to fetch the rest). This protocol seems like a
   prerequisite to all those, and is independently useful, so maybe all
   of those can be future work.

Please take a look. Feel free to comment on anything, but I prefer
comments on the major things first (e.g. my usage of a separate process
(http-fetch) to fetch packfiles, since as far as I know, Git doesn't
link to libcurl; any of the design decisions I described above). I know
that there are some implementation details that could be improved (e.g.
parallelization of the CDN downloads, starting CDN downloads *after*
closing the first HTTP request, holding on to the .keep locks until
after the refs are set), but will work on those once the overall design
is more or less finalized.

Note that the first patch is exactly the same as one I've previously
sent [3].

[1] https://public-inbox.org/git/cover.1543879256.git.jonathantanmy@xxxxxxxxxx/
[2] https://public-inbox.org/git/xmqqmupi89ub.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx/
[3] https://public-inbox.org/git/20190221001447.124088-1-jonathantanmy@xxxxxxxxxx/

Jonathan Tan (7):
  http: use --stdin and --keep when downloading pack
  http: improve documentation of http_pack_request
  http-fetch: support fetching packfiles by URL
  Documentation: order protocol v2 sections
  Documentation: add Packfile URIs design doc
  upload-pack: refactor reading of pack-objects out
  upload-pack: send part of packfile response as uri

 Documentation/git-http-fetch.txt         |   7 +-
 Documentation/technical/packfile-uri.txt |  79 ++++++++++++
 Documentation/technical/protocol-v2.txt  |  22 ++--
 builtin/pack-objects.c                   |  63 +++++++++
 fetch-pack.c                             |  58 +++++++++
 http-fetch.c                             |  65 ++++++++--
 http-push.c                              |   7 +-
 http-walker.c                            |   5 +-
 http.c                                   |  83 +++++++-----
 http.h                                   |  26 +++-
 t/t5550-http-fetch-dumb.sh               |  18 +++
 t/t5702-protocol-v2.sh                   |  54 ++++++++
 upload-pack.c                            | 155 +++++++++++++++++------
 13 files changed, 542 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

-- 
2.19.0.271.gfe8321ec05.dirty




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux