On Mon, Mar 05, 2007 at 05:14:51PM -0800, Toshio Kuratomi wrote:
> > A presentation was given at this year's PyCon called "Scaling Python for
> > High-Load Web Sites"[0], I definitely recommend checking it out.
> >
> Really cool. My reading of the talk is if our loads match up with their
> sample application then we're probably okay with just a single cherrypy
> instance behind apache for nearly everything. Load balancing could get
> us the rest of the way for all of our "internal" apps (meaning: Apps
> meant for contributors to the project rather than the Fedora Userbase.)
> Of course, in your proposal, once we have one thing behind the load
> balancers we should be able to put everything behind the load balancers
> without too much effort.
>
> The wiki/plone, bugzilla and other end-user facing applications need
> more than that. Unfortunately, we aren't in charge of coding those so
> we don't have as many choices in terms of getting it to scale at the
> moment. With moin moin, for instance, my impression is that moin
> wouldn't be able to lock files if we had two instances running so we're
> unable to use load balancing as an optimization.

Yeah, I agree that we definitely need to work on optimizing some of our
current software; I mean, seriously, have you tried saving a Wiki page lately?

> > I recommend that we load balance dynamic page requests from our proxy
> > servers to our application servers, and let the proxies serve out cached
> > static content. We definitely want to hide CherryPy behind apache,
> > because having HTTP/1.1 and SSL support is nice, among many other
> > benefits. Whether or not we use mod_{python,proxy,rewrite} to connect to
> > CherryPy is up for discussion. mod_python is the fastest option, and the
> > only downside really is that it is harder to configure, and that you have to
> > restart Apache every time you change your CherryPy code. I give a +1 for
> > mod_python, at least until WSGI support in CherryPy solidifies.
> >
> It appears that TG + mod_python is very slow ATM::
> http://tinyurl.com/3xyznr

Interesting. To get a better idea of the performance of the TurboGears stack
in our infrastructure, I think it would be extremely valuable to perform some
stress tests before F7. This way, we can know for sure the best options for
our needs, with regard to:

    o Apache mod_{rewrite,python,proxy}
    o SQL{Object,Alchemy}
    o Xen instances vs. CherryPy instances

If anyone is interested in heading this up (as my stress-testing-fu is weak),
I would definitely be willing to help out.

> > Since each application server will have its own connection pool with the
> > db servers, increasing our scalability will simply consist of adding
> > another Xen guest behind our load balancer.
> >
> Why do we even need to add Xen guests? From the pycon talk it looked
> like just adding additional cherrypy servers would increase our ability
> to serve more pages.

True.

> We'd want to run benchmarks to see but I'd suspect that having one guest
> with five cherrypy instances that we load balance between will give us
> more bang for the resources used than five guests on the same Xen host
> running one cherrypy server apiece.

Yeah, I think that benchmarking this will yield extremely useful data that
would benefit many.

> Additional guests could enhance reliability, though. If our load
> balancer detects whether a guest has stopped responding and serves
> requests to the other guests that are running the cherrypy servers, we
> could take a guest down for maintenance and then return it to the pool
> without interrupting service. Having them on separate Xen hosts would
> mean we could lose a physical machine and still survive (at half
> capacity).

Yep, this will help mitigate much suffering on our end :)

> > So from here we might want to look into creating a standard guest image optimized
> > for our TurboGears Xen guests.
> > publictest2 was running FC6 (it still might be,
> > but as far as I can tell it seems to be down), and I'm not sure what our
> > other TG systems are running, but I think we should be consistent. I tend to
> > lean towards RHEL{5,4}, which will help us get TurboGears & friends whipped into
> > shape for EPEL
> >
> RHEL4 would be python2.3. RHEL5 is python2.4 like FC6. F7 will be
> python2.5....
>
> python2.4 has decorators which TG makes heavy use of so I think we want
> to have at least that version. It'll feel constraining to run 2.5 for
> local development on our home machines with Fedora7+ but having to
> develop for python 2.4 because that's what comes with RHEL5 (Unified
> try:except:finally and ternary operators being the features I'll miss
> the most) but I suspect that's a tradeoff that we'll want to make so we
> aren't upgrading every six months.

I have yet to start utilizing any Python 2.5 features in my code, so I'm not
really partial either way.

luke
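
For reference, the two Python 2.5 features mentioned in the quoted text (unified try/except/finally and the conditional "ternary" expression) can be compared with their 2.4-compatible spellings. This is only a rough sketch: the function names and the port-parsing task are made up for illustration and do not come from any of our apps.

```python
# Illustrative only: read_port_py24 / read_port_py25 are hypothetical helpers.

def read_port_py24(text, default=8080):
    """Spelling that also works on Python 2.4."""
    log = []
    # 2.4 cannot mix except and finally in one try statement,
    # so the try/except has to be nested inside a try/finally.
    try:
        try:
            port = int(text)
        except ValueError:
            port = default
    finally:
        log.append("parsed")
    # and/or idiom; only safe because "default" is a truthy value.
    label = (port == default) and "default" or "custom"
    return port, label, log


def read_port_py25(text, default=8080):
    """Same behaviour using the Python 2.5+ syntax."""
    log = []
    # Unified try/except/finally (2.5+).
    try:
        port = int(text)
    except ValueError:
        port = default
    finally:
        log.append("parsed")
    # Conditional expression (2.5+).
    label = "default" if port == default else "custom"
    return port, label, log
```

Both versions return the same results, so code written in the 2.4 style should keep working unchanged if we move to 2.5 later; the cost of targeting RHEL5 is mostly the extra nesting and the less readable and/or idiom.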