Re: Dropping gitolite and breaking stg


On Fri, Feb 19, 2016 at 12:14:30PM +0100, Pierre-Yves Chibon wrote:
> On Wed, Feb 17, 2016 at 11:51:51AM -0500, Ralph Bean wrote:
> > On Wed, Feb 17, 2016 at 12:11:09PM +0100, Pierre-Yves Chibon wrote:
> > Just as a point of clarification -- there are two systems in play for
> > pkgs.fp.o currently:
> > 
> > - There is 'cgit' which provides the web-based view of the repos.
> > - There is 'gitolite' which provides the backend ACL controls over who
> >   is allowed to push to what.
> 
> And there is pkgdb, which holds the info about these ACLs; they are then
> propagated to gitolite, which applies them.
> 
> > Using pagure as the read-only view of the repos is definitely a good idea.
> > What's being discussed here is the "if we should" and "how we should"
> > of replacing the backend acl controls with pagure.  Generally, I think
> > it's a good direction to move in.  Maybe we're already on the same
> > page about that.
> 
> I have two comments about this paragraph:
> * Pagure isn't read-only, it's read and write. We need to write for
>   online-editing as well as to allow the pull-request mechanism.
>   So pagure isn't just a replacement for cgit, it is a little more than this :)

Yes, correct.  My mistake for reducing it to RO.  (This will be cool).

> * Then there is the ACLs question. I don't think we want to drop pkgdb, and
>   pkgdb is where ACLs are stored.
>   So the current workflow looks like:
>      pkgdb -> script -> gitolite
>   With pagure (imho) it will look like:
>      pkgdb -> script -> gitolite
>                      \_ pagure
>   or eventually:
>      pkgdb -> script  -> gitolite
>            \_ script2 -> pagure
>   or with the proposal made here:
>      pkgdb -> service
>            \_ script -> pagure
>   What I was proposing here is that we drop the script and gitolite in favor of
>   our own service/REST server that would grant/deny access based on the info in
>   pkgdb (very like what gitolite does atm).
>  
> > > We could have the async service directly linked to pkgdb's DB. This means:
> > > - Changes to pkgdb are directly propagated to pkgs.fp.o
> > > - We can rely on the collections information directly retrieved from the DB
> > > - We can use our current set-up (one shell account / packager) and not tweak
> > >   gitolite until it behaves as we want (which is only supported by gitolite to
> > >   please us).
> > > - No need for the alias warning for namespacing, we can check if a namespace was
> > >   specified and use ``rpms`` if not
> > > - May be easier to hack/maintain in the long term (maybe not, hard to say in a
> > >   way :))

FWIW, the namespace warning is a bit of a red herring.
- We can configure gitolite to not issue the warning now, as things
  stand.
- Once we patch fedpkg to adjust the git urls to use the correct
  namespace, the warning would go away anyways.  (Just need some round
  'tuits there.)

The other points are right on.
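
For what it's worth, the "use ``rpms`` if no namespace was specified" point
could look roughly like the sketch below. It is only an illustration: the
function name, the KNOWN_NAMESPACES set and the path handling are assumptions,
not anything that exists in fedpkg, gitolite or pagure today.

```python
# Hypothetical helper; all names here are made up for illustration.
KNOWN_NAMESPACES = {"rpms"}   # assumed; other namespaces would be added here
DEFAULT_NAMESPACE = "rpms"


def split_namespace(repo_path):
    """Return (namespace, package) for a requested repo path.

    Paths without a known namespace prefix fall back to the default
    namespace instead of triggering a warning.
    """
    path = repo_path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    namespace, sep, package = path.partition("/")
    if sep and namespace in KNOWN_NAMESPACES:
        return namespace, package
    # No (known) namespace given: treat the whole path as the package name.
    return DEFAULT_NAMESPACE, path
```

With this, both "rpms/kernel.git" and a bare "kernel.git" resolve to the same
package in the ``rpms`` namespace.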

> > I'm not sure about querying pkgdb DB directly from the git hook -- or
> > even querying pkgdb's JSON API directly.
> > 
> > - If we connect directly to the DB, it exposes the internals of pkgdb
> >   in a way that could make it much harder to upgrade that schema in
> >   the future.  Better to connect over the REST API.
> 
> +1 there
> 
> > - If we connect over the REST API, we could run into system
> >   interdependence problems in the future.  If pkgdb goes down
> >   accidentally, or if it needs to go down for maintenance, then the
> >   dist-git repos will be dead in the water.  No one will be able to
> >   push anything.
> 
> So my idea here was to not rely on pkgdb itself (also because of the number of
> requests we may have to deal with), but rather on a small service (much like
> mdapi) that would connect directly to pkgdb's DB.
> The service could also run on pkgs01 directly (but would likely be py3, meaning
> some porting work required on pkgdb, which is a good thing anyway).

I like the idea of a microservice that provides a read-only interface
to pkgdb ACLs.  Currently, querying for ACLs from pkgdb can be quite
slow.
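
To make "read-only interface" concrete, here is a minimal sketch of such a
service as a plain WSGI app. Everything in it is an assumption for
illustration: the /acl/<package>/<branch> URL layout, the JSON shape, the
in-memory ACLS dict (standing in for whatever pkgdb data the real service
would read) and the user names.

```python
import json

# Assumed snapshot of ACLs; a real service would load this from pkgdb data.
ACLS = {("kernel", "master"): ["alice", "bob"]}


def app(environ, start_response):
    """Hypothetical read-only ACL lookup: GET /acl/<package>/<branch>."""
    # Anything other than GET is refused, which keeps the interface read-only.
    if environ.get("REQUEST_METHOD") != "GET":
        start_response("405 Method Not Allowed", [("Allow", "GET")])
        return [b""]

    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    if len(parts) != 3 or parts[0] != "acl":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']

    package, branch = parts[1], parts[2]
    body = json.dumps(
        {
            "package": package,
            "branch": branch,
            "committers": ACLS.get((package, branch), []),
        }
    ).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

For a quick local try, the standard library's
wsgiref.simple_server.make_server("", 8000, app).serve_forever() would serve
it; that is fine for poking at it, not for production.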

> This would mean:
> - git remains independent from pkgdb (good)
> - git is blocked if the DB server goes down (bad), but if our DB server goes
>   down, most of our infra will have troubles anyway.

Heh, yeah.  But let's not put more problems on that pile if we can
avoid it, I guess.

Could the pkgdb read-only API service use an on-disk cache of the ACLs
just like mdapi does?  A sqlite stash?
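
Roughly what I have in mind for the sqlite stash, as a sketch only. The table
layout and function names are assumptions (this is not how mdapi or pkgdb
store anything), and the snapshot rows would come from whatever sync
mechanism we end up with.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (and if needed create) the hypothetical on-disk ACL cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS acls ("
        " package TEXT NOT NULL,"
        " branch TEXT NOT NULL,"
        " username TEXT NOT NULL,"
        " PRIMARY KEY (package, branch, username))"
    )
    return conn


def replace_acls(conn, rows):
    """Swap in a fresh snapshot of (package, branch, username) rows.

    The connection context manager commits on success and rolls back on
    error, so readers never see a half-written snapshot.
    """
    with conn:
        conn.execute("DELETE FROM acls")
        conn.executemany("INSERT INTO acls VALUES (?, ?, ?)", rows)


def can_commit(conn, package, branch, username):
    """Answer an ACL question from the local cache only."""
    cur = conn.execute(
        "SELECT 1 FROM acls WHERE package = ? AND branch = ? AND username = ?",
        (package, branch, username),
    )
    return cur.fetchone() is not None
```

The point of the design is that can_commit() never talks to pkgdb or the DB
server at request time; only replace_acls() does, whenever a sync runs.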

> > - Right now, we basically cache *all* of the pkgdb acls on disk as
> >   gitolite perms.  This has the advantage of decoupling the systems at
> >   request time.  It has the disadvantage of synchronization lag.  When
> >   ACLs get updated in pkgdb, we have to wait for those to sync to
> >   gitolite to be meaningful in practice.  We used to have a cronjob on
> >   which we waited forever.. we now have that fedmsg-genacls updater
> >   that makes it much quicker, but not instant.  Can we keep this
> >   same arrangement for a pagure replacement of gitolite?
> 
> We could make the service rely on a small local version of pkgdb's DB, but it
> would be a little more work, would make the process a little more fragile (cf
> mdapi's error when the sqlite DB is corrupted) but would bring the advantage of
> still allowing us to commit/build packages when we reboot our server (iirc, koji
> doesn't need FAS, does it?).
> There are pros and cons to this approach. The fact that so many of our apps would
> be impacted anyway when the DB server goes down (including pagure itself for
> login, but not for users logged in) weighs in a little more for no on-disk
> caching for me, but I can be convinced otherwise :)

Cool - we're on the same page here.

> > > We would still need to have a service to create the git repo and eventually the
> > > branches.
> > > And for pkgs.fp.o we will definitely need a git hook (but there is already one)
> > > to prevent branches from being deleted and do branch-based ACL control.
> > 
> > Yeah, since we still need a service to create the git repo and the
> > branches, we might as well sync pagure ACLs at the same time, no?
> 
> We will need to sync to pagure, not from pagure I think, but yes, we'll need a
> sync script anyway.
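
On the hook itself: a sketch of the decision logic for "no branch deletion
plus branch-based ACLs" could look like the following. It assumes a git
"update" hook, which git calls with the ref name, the old object name and the
new object name, and where an all-zero new object name means the ref is being
deleted. The function name, the allowed_users mapping and how the pushing
user is determined are all assumptions.

```python
def check_update(refname, old_sha, new_sha, username, allowed_users):
    """Decide whether one ref update should be accepted.

    allowed_users is an assumed mapping of branch name -> set of usernames,
    filled from whatever ACL source the hook has access to.
    Returns (accepted, message).
    """
    prefix = "refs/heads/"
    if not refname.startswith(prefix):
        # Tags and other refs are outside the scope of this sketch.
        return True, "not a branch, no branch ACL applied"

    branch = refname[len(prefix):]

    # An all-zero new object name means the push deletes the ref.
    if set(new_sha) == {"0"}:
        return False, "deleting branch %s is not allowed" % branch

    if username not in allowed_users.get(branch, set()):
        return False, "%s may not push to branch %s" % (username, branch)

    return True, "ok"
```

The real hook would call this with its three command-line arguments and exit
non-zero when the first element of the result is False, which makes git
reject that ref update.
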
> 
> > > Note that I might still pursue this for pagure, not entirely sure yet though.
> > > Just a thought while writing this, using this approach might actually make it
> > > easier to deploy pagure for pkgs.fp.o since then we could indeed just use pkgdb
> > > as data source and have a fedmsg-based updater to sync ACLs from pkgdb to
> > > pagure.
> > 
> > I guess I don't understand what's easier about using pkgdb as a data
> > source for pagure.  It seems harder to me (both to write up front as
> > well as to maintain long term).
> 
> Well, imho, pkgdb should remain the canonical place to store and manage ACLs for
> packages. Otherwise we'll duplicate things, managing branches on one side and
> ACLs on the other, and we'd need to update the script syncing info to bugzilla
> as well. I'd prefer either to adjust pagure to get its info from pkgdb or to
> drop pkgdb entirely; managing the ACLs in pagure itself and the rest in pkgdb
> doesn't quite appeal to me.

Agreed on the role of pkgdb.  If it has one job, that job is storing
and managing ACLs.  Let's not duplicate that.

> Note: In a way dropping pkgdb is tempting but it brings a number of new
> questions:
>   - What do we do about the watch* ACLs?
>   - How do we determine the PoC? How do we change it? (Required for bugzilla)
>     -> Or we drop bugzilla as well, but then the ticketing system of pagure will
>     need much much more work (including a good search feature).
>   - How are branches managed? (Request, creation, deletion)
>   - How do we manage status (Maintain/Orphan/Retire)?
> Of course we could build this in pagure as well, but I am afraid this would make
> it too Fedora-specific and much less a self-hostable forge.

Agreed.  Let's not overload pagure.

> Anyway, food for thought, maybe we could find a way, micro-services?
> 
> 
> I have quickly written the service for pagure itself; it led me to make
> pagure work with py3. I'll need to test it some more "in condition" but it's
> promising.

Awesome :)


_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
http://lists.fedoraproject.org/admin/lists/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
