Re: Web application frameworks and the future

Xavier Lamien <laxathom@xxxxxxxxxxxxxxxxx> · Fri, 6 Jul 2012 02:59:26 +0200

On Thu, Jun 28, 2012 at 10:23 PM, Toshio Kuratomi <a.badger@xxxxxxxxx> wrote:

"""

In the beginning there was cgi.  And everything was slow but simple.  And

lo, one day we began to crave faster speeds, MVC, and other features that

plain cgi did not provide.  And thus we entered the age of web

frameworks....

"""

At last week's infrastructure meeting, I brought up the fact that we seem to

have a proliferation of web application frameworks for the new apps that we

are creating.  In some ways this is good as it lets us experiment with new

technologies as a group and lets us fit the needs of a specific application

or programmer's style with the framework.  However, it has downsides as

well; mostly in the realm of ongoing maintenance of the apps.  We need to

take a moment to figure out where we want to go with this.

== Some issues ==

* Retaining group knowledge of many different application frameworks

  even when the original author stops being an active contributor

* Maintaining the packages in EPEL and Infrastructure for these

  * Maintaining some knowledge of the frameworks' code and involvement with

    their upstreams to fix bugs in the frameworks themselves.

* Deployment of multiple frameworks that may have conflicting deps.

* Deployment of multiple frameworks taking up more memory on the servers.

We think to some extent we currently have ways to manage the deployment

problems:

* Separate app servers for individual apps.  As long as we have an inflow of

  hardware resources we can continue to separate out applications onto

  different machines instead of running them all on app* as our first

  generation of apps was.  This would be an ongoing expense.  We should

  continue to allocate at least two servers to each application so that we

  can do things like reboots and updates transparently to the users.

* Openshift.  Hosting applications on a cloud service like openshift allows

  us to separate out applications and parcel out memory as a resource

  differently than if we're managing multiple apps on a single host.

While these factors do change the game as far as hardware allocation is

concerned, they don't help our manpower resources.  As we spin up more hosts

for each web application, we need sysadmin time to spin those hosts up.  As

we deploy to openshift we need to figure out how we're going to integrate

configuration and deployment to those hosts into our existing puppet

configurations (I don't think that any of our current openshift deployed

services are puppet managed) and how we're going to manage load balancing

and failover.

== Where are we now? ==

.. note:: I would like this section to be an inventory of everything that

    we're deploying and writing but I don't have a complete picture.  If you

    have more things, feel free to update this on the wiki page:

    https://fedoraproject.org/wiki/Infrastructure_Services_Survey

TG1 => TurboGears1, SQLAlchemy and genshi/mako

Old TG1 => TurboGears1, SQLObject and kid

TG2 => TurboGears2

Pyramid => Current successor to TG2 but a break from the current TG1 style;

           may have a new layer built on top of it at a later date that is

           more TG-ish.

Flask => Easy to get started with and wrap your head around. Great for small

         projects.  Not a huge stack of deps.

Application        Host       Framework   Notes

-----------        ----       ---------   -----

bodhi              app*       old TG1     has a pyramid branch

bodhi              releng*    old TG1     has a pyramid branch

busmon             ?          TG2/moksha  Not yet deployed

copr(2)            ?          flask       not yet deployed. Loosely,

                                          "buildsys for fedorapeople repos"

datagrepper        ?          flask?      Not yet deployed

dataviewer         ?          flask?      Not yet deployed

dpsearch           ?          perl/C      Not yet deployed; testing on

                                          search01-dev

elections          app*       TG1         has a TG2 branch and ianweller

                                          is trying a flask branch for

                                          comparison

fas                fas*       TG1

fedorabadges       ?          pyramid     Not yet deployed

fedoracommunity    app07?     TG2/moksha  Only runs on RHEL5.  We're

                                          retiring this once datanommer is

                                          deployed, or when we get tired of

                                          keeping app07.  (Is

                                          the version of moksha here old as

                                          well?)

fedorahosted-reg   openshift? flask       not yet deployed

freemedia          app*       php         In Puppet. Looks like it would be

                                          very simple to port to something

                                          lightweight like Flask if we

                                          wanted to get away from PHP.

fudcon-reg         openshift  flask       registration application for

                                          fudcon.  Not currently configured

                                          in puppet, load balanced, etc.

koji               koji*      custom      was mod_python.  plans to move to

                                          mod_wsgi.  (Current status?)

mirrorlist-server  app*       custom      lightweight, mod_wsgi process.  No

                                          real framework

mirrormanager      app*       old TG1     has an older TG2 branch

packagedb          app*       TG1

packages           packages*  TG2

pager              app*, noc* CGI

raffle             app*       TG2         Disposable -- no promises to keep

                                          maintaining have been made

smolt              value*     TG1         We're planning to get rid of this

                                          in favor of census on openshift

                                          (Are we still running the process

                                          on app* even though it isn't

                                          actively serving pages?)

tagger             packages*  TG2

We deploy but do not code for:

Application        Host       Framework   Notes

-----------        ----       ---------   -----

askbot             ask*       django      Uses openid login

darkserver         darkserver django

insight            insight*   drupal/php  I'm not sure what level of coding

                                          we do on this.

gitweb(-caching)   pkgs*      cgi?        thinking of replacing with cgit

                   hosted*

hg?                hosted*    cgi?

loggerhead         hosted*    mod_wsgi

mailman webui      hosted*    python cgi  mailman web frontend for

                   collab*                lists.fp.o and lists.fh.o

mediawiki          app*       php

reviewboard        hosted*    django      we've talked about moving this to

                                          openshift and/or app servers

trac               hosted*    mod_wsgi    genshi templates

Deployed but only for our sysadmins: collectd, nagios, awstats

== Some analysis ==

Right now we're deploying against the following frameworks for applications

in our critical path:

* TG1

* mod_wsgi/mod_python
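
For readers unfamiliar with the "custom" / mod_wsgi entries, here is a
minimal sketch of what a framework-less WSGI app generally looks like.  It
is not the mirrorlist-server or koji code; the /ping route and the call()
helper are made up for illustration.  The only convention relied on is that
mod_wsgi looks for a callable named "application" in the script file.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """WSGI entry point; mod_wsgi looks up a callable with this name."""
    path = environ.get("PATH_INFO", "/")
    if path == "/ping":  # hypothetical route, for illustration only
        status, body = "200 OK", b"pong\n"
    else:
        status, body = "404 Not Found", b"not found\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call(path):
    """Invoke the app in-process (no web server); return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Routing, templating, sessions and database access all have to be written by
hand in this style, which is the trade-off against a framework.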

We also have a few additional applications that are not currently critical

to creating Fedora but are value adds that we've worked hard on.  These

applications are written against

* TG1

* TG2

* flask

The new applications that we're writing seem to be written against:

* TG2

* flask

* pyramid
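
The three lists above can be cross-checked against the inventory table with
a short script.  This is only a sketch: the rows are copied from the "apps
we code for" table, while the "deployed" flag (False where the notes say
"not yet deployed") and the family() grouping, which folds "old TG1" into
TG1 and drops "?" and "/moksha" suffixes, are my own simplifications.

```python
from collections import defaultdict

# (application, framework, deployed) taken from the inventory table.
INVENTORY = [
    ("bodhi", "old TG1", True),
    ("busmon", "TG2/moksha", False),
    ("copr", "flask", False),
    ("datagrepper", "flask?", False),
    ("dataviewer", "flask?", False),
    ("dpsearch", "perl/C", False),
    ("elections", "TG1", True),
    ("fas", "TG1", True),
    ("fedorabadges", "pyramid", False),
    ("fedoracommunity", "TG2/moksha", True),
    ("fedorahosted-reg", "flask", False),
    ("freemedia", "php", True),
    ("fudcon-reg", "flask", True),
    ("koji", "custom", True),
    ("mirrorlist-server", "custom", True),
    ("mirrormanager", "old TG1", True),
    ("packagedb", "TG1", True),
    ("packages", "TG2", True),
    ("pager", "CGI", True),
    ("raffle", "TG2", True),
    ("smolt", "TG1", True),
    ("tagger", "TG2", True),
]


def family(framework):
    """Collapse a table label into a framework family (assumed grouping)."""
    name = framework.rstrip("?").split("/")[0]  # "TG2/moksha" -> "TG2"
    return "TG1" if name == "old TG1" else name


def by_framework(inventory, deployed_only=False):
    """Group application names by framework family."""
    groups = defaultdict(list)
    for app, framework, deployed in inventory:
        if deployed_only and not deployed:
            continue
        groups[family(framework)].append(app)
    return dict(groups)


if __name__ == "__main__":
    groups = by_framework(INVENTORY)
    for name, apps in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        print(f"{name:8} {len(apps):2}  {', '.join(apps)}")
```

Passing deployed_only=True restricts the tally to what is running today,
which separates the stacks we already maintain from the ones that only the
not-yet-deployed apps would add.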

== Some thoughts ==

=== Openshift ===

Although openshift is attractive from a hardware-provisioning perspective,

we haven't figured out how to manage configs for it for any of our currently

deployed services.  So, for instance, if there was evidence that one of our

openshift instances had been compromised we wouldn't have the benefit of

configs checked into puppet to refer to and to help us reconstruct that

instance.  We probably also don't have these hosts as part of our backups

(don't know if openshift manages backups for us).  We should figure out

disaster recovery for these hosts before we go too much further here.

We also don't currently have any openshift hosts working in a load balanced

fashion so, for instance, doing an update of an app could require user

visible downtime.

If we're going to use openshift for deploying production apps, we should

come up with answers for these tasks.

=== Getting rid of TG1 ===

At some point I want to get rid of the TG1 stack.  Upstream is in

maintenance-only mode for it.  And increasingly, they are moving to the

somewhat incompatible TG-1.5.x stack for their maintenance while

simultaneously pushing people to write their apps for TG2 or pyramid.  While

TG1.1 "just works" for us right now, we're eventually going to run up against

things that upstream isn't handling (whether bugfixes in the TG-1.1.x

branch, security fixes, or porting of the stack to new versions of dependent

libraries).  While the maintenance burden of the TG1.1 stack is low at this

time, it's just going to get higher over time.

In order to port away from the TG1 stack, I want to figure out what we

should be porting to.  Last year we thought that should be TG2 because

moksha was intrinsically linked to TG2 and we were deploying on

fedoracommunity which needed moksha.  Now, neither of those is true.

(moksha can now run on other frameworks besides TG2.  fedoracommunity is

going away in the future.)  However, there's no clear successor.

=== Plethora of frameworks ===

We're writing and deploying apps written against an ever expanding number of

frameworks.  I am a bit afraid of this.  While it is nice to know that we

have exactly the right tool for the job among the many choices of framework,

I think that maintaining apps written in a variety of frameworks is going to

cause us pain as frameworks die off or change radically and current

contributors move on to other things.  With that in mind I think we should

commit to using only a few frameworks in our coding for infrastructure and

those frameworks will serve to be where we concentrate on gathering our

experience, what we write new apps against, what we design our

infrastructure to support, and what we port our apps to as time goes on.

From browsing the list of frameworks we're currently deploying:

Django has a good track record of making new releases with clear porting

guides for making changes in your old code to run on the new versions.

However, it is conceptually something of an application server (like JBoss),

not a pure framework like TurboGears.  At the least, this would require some

thought on our part on how to deploy and code for it.

Flask seems to be lighter weight in terms of its deps and in terms of its

learning curve.  It's pretty easy to run a flask app in openshift.  If we

were to choose just two frameworks, it might make sense to choose flask as

an entry level framework for smaller applications and one other framework

with lots of bells and whistles for things that need those features.

Sounds like a good candidate for the purpose --> project requirements.

TurboGears2 is still developed upstream.  Some of the main developers have

moved on to work on pyramid but others are continuing to work on TG2.

Upstream has committed to doing the necessary work to port TG2 to python3

but much of the TG2 underlying stack is in maintenance mode so the TG2 devs

have had to do some of that work themselves.

Pyramid is a merging of certain segments of the zope community and the

pylons community.  If pylons has a successor, this is it.  Since TG2 was

built on pylons, pyramid might be the next logical step (or a web framework

built on top of pyramid).

So we have to choose between a full-stack framework and a low-level one...
Or are you suggesting that we should go for TG2, since we could expect it to have pyramid as its core base?

== Final thoughts ==

My primary goal is to decide what framework to port our old TG1 code to so

that we can stop maintaining the TG1 stack before upstream stops working on

it at all.  My secondary concern is that we stop growing the other stacks

that we're maintaining and concentrate on one or two which will make

maintenance easier.  Can we choose two frameworks right now that will suit

our needs?  It seems that flask can serve a niche and maybe should be one of

them.  What should our bells and whistles framework be?  TG2 or pyramid or

something else entirely?

Ruby_on_rails?
Seriously, the "something else entirely" should depend on the dev team's capabilities,
and I can only see TG2 for now, unless people are hacking on a new web framework?

Besides, I would love to get rid of TG1 and actually move my FAS work to TG2.

-- 
Xavier.t Lamien
--
http://fedoraproject.org/wiki/XavierLamien

GPG-Key ID: F3903DEB
Fingerprint: 0F2A 7A17 0F1B 82EE FCBF 1F51 76B7 A28D F390 3DEB

_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure