Archive for the tag 'Seaside'

Scaling Seaside Redux: Enter the Penguin

In my last post titled My Journey to Linux I spoke about Linux. Having decided to use Linux to host my Seaside servers, I was now faced with picking a distro. I chose Ubuntu Server. I’ve heard good things about Debian and Ubuntu and I wanted something clean, free of bloat, and designed to be used as a server. I also wanted to try a distro I’d never tried before. In the past, I’ve tried Red Hat, its offshoot Fedora, and Slackware. I almost tried Suse but it seemed too aimed at the desktop and I was looking for something to use as a server. Ubuntu seemed to fit the bill nicely. I grabbed a spare test box, and decided to explore the idea of hosting Seaside on a Linux server by installing Ubuntu.

I popped in the CD burned from the downloaded ISO image and was greeted with a nice menu asking me what I wanted to do. I chose to install the server to the hard drive and proceeded through an install that reminded me very much of the old NT4 install, but better. The install finished, successfully detected all the hardware, auto-configured the network (using the local DHCP server), and left me with a command line login prompt, exactly what I wanted. At this point I was very impressed: setting up a Linux server took a fraction of the time of setting up a Windows server and was completely trouble free. I wouldn’t use DHCP for a production server, but it’ll do fine for testing purposes.

Now, on to what I thought would be the hard part, setting up the necessary software. I haven’t yet loaded emacs, so I’ll use pico to whip the server into shape.

I spent a few minutes looking into Debian’s package system, learned what I needed, and then away I went. First I needed to ensure the server was up to date.

sudo apt-get update
sudo apt-get upgrade

OK, that was easy. Now I need to install some software. Three of the main things I wanted weren’t in the default Ubuntu repositories: Squeak, Daemontools, and HAProxy. Squeak for hosting Seaside, Daemontools because I found out that’s what Lukas uses to maintain his Seaside services, and HAProxy to load balance the processes. I quickly found repositories for all of them and added them to the list of repositories available.

sudo pico /etc/apt/sources.list

and added…

#daemontools
deb http://smarden.org/pape/Debian/ sarge unofficial
deb-src http://smarden.org/pape/Debian/ sarge unofficial
#squeak
deb http://ftp.squeak.org/debian/ stable main
deb-src http://ftp.squeak.org/debian/ stable main
#haproxy
deb http://ftp.sysif.net/debian sid main

Install the signing key for the HAProxy repository

wget http://ftp.sysif.net/debian/apt_key.asc
sudo apt-key add apt_key.asc

Now fetch the new lists by updating again….

sudo apt-get update

OK, time to install everything I want. I need Squeak, an FTP server, Apache, Daemontools for managing services, an SSH server for remote access, HAProxy for load balancing, Samba for networking with my other Windows servers, and Emacs because I’m going to be using it. I chose HAProxy for two reasons: there’s no official Ubuntu version of Apache 2.2 with mod_proxy_balancer, so installing it is a pain, and even with mod_proxy_balancer I haven’t been able to get it to work successfully with Seaside. So, on to the install, simple enough…

sudo apt-get install squeak daemontools vsftpd
sudo apt-get install apache2 apache2-utils
sudo apt-get install openssh-server emacs21 haproxy samba

This was actually one command but 3 fits nicer on the blog. Now configure the ftp server…

sudo emacs /etc/vsftpd.conf

and make a few changes…

anonymous_enable=NO
local_enable=YES
write_enable=YES

and restart the ftp service…

sudo /etc/init.d/vsftpd restart
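
If you'd rather script that edit so it can be replayed on the next box, something along these lines should do it. Treat it as a sketch: set_conf_option is a helper name made up for this example, it assumes GNU sed and a plain key=value file like vsftpd.conf, and it would need to run as root against the real file.

```shell
#!/bin/bash
# Sketch: set KEY=VALUE in a simple key=value config file.
# set_conf_option is a hypothetical helper, not part of vsftpd.
set_conf_option() {
    local file=$1 key=$2 value=$3
    if grep -Eq "^#?[[:space:]]*${key}=" "$file"; then
        # Replace the existing (possibly commented-out) line in place.
        sed -i -E "s|^#?[[:space:]]*${key}=.*|${key}=${value}|" "$file"
    else
        # Key is not in the file at all, so append it.
        echo "${key}=${value}" >> "$file"
    fi
}

# Usage (as root, against the real file):
#   set_conf_option /etc/vsftpd.conf anonymous_enable NO
#   set_conf_option /etc/vsftpd.conf local_enable YES
#   set_conf_option /etc/vsftpd.conf write_enable YES
```

The service still needs the restart shown above before the new settings take effect.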

Now setup Apache with the modules I need…

sudo a2enmod rewrite
sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod deflate

and restart the Apache service…

sudo /etc/init.d/apache2 restart

And I’m mostly setup. Wow, that was easy, much easier than I’d assumed it would be.

Time to setup Seaside. I’m no Linux expert, but I see other services running out of /etc/serviceName so I’ll setup my Seaside services the same way. I’m going to run 10 processes, so I’ll create one directory per process, i.e. squeak1, squeak2, etc. Using daemontools I simply have to setup a directory for my service with the files it needs and a shell script called “run” that will kick off the service. Here’s the script…

#!/bin/bash
exec squeakvm -mmap 200m -headless SqueakProd "" port 3001

I’m using mmap to limit each process to a maximum of 200 megs of ram and then feed Seaside the port number to start on, so squeak1 runs on port 3001, squeak2 on 3002, etc. In each folder I put SqueakProd.image, SqueakProd.changes, SqueakV39.sources and chmod 755 the run script.

Because I’m using daemontools, I can now start my services by simply creating a symbolic link in the /service directory to the /etc/squeakX directory for each directory I created.

sudo ln -s /etc/squeak1 /service/squeak1
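
Doing that ten times by hand gets old, so here is a rough sketch of the same steps in a loop. The function name and its four arguments are invented for illustration; the run script, file names, and port numbering are the ones described above. It copies the image files before linking, so daemontools doesn't try to start a service that has nothing to run.

```shell
#!/bin/bash
# Sketch: create squeak1..squeakN service directories, each with its own
# copy of the image files and a run script on its own port, then link
# them into the directory daemontools watches.
# make_seaside_services and its arguments are illustrative names.
make_seaside_services() {
    local image_dir=$1 etc_dir=$2 service_dir=$3 count=$4
    local i dir
    for i in $(seq 1 "$count"); do
        dir="$etc_dir/squeak$i"
        mkdir -p "$dir"
        cp "$image_dir/SqueakProd.image" "$image_dir/SqueakProd.changes" \
           "$image_dir/SqueakV39.sources" "$dir/"
        # Same run script as above, with the port derived from the index.
        printf '#!/bin/bash\nexec squeakvm -mmap 200m -headless SqueakProd "" port %d\n' \
            "$((3000 + i))" > "$dir/run"
        chmod 755 "$dir/run"
        # Linking is what tells daemontools to start supervising it.
        ln -sfn "$dir" "$service_dir/squeak$i"
    done
}

# Usage (as root, with the image files in the current directory):
#   make_seaside_services . /etc /service 10
```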

My services are started within a few seconds and maintained by daemontools. Now let’s set up the load balancer. Seaside requires session persistence to do the magic it does, so we need to configure HAProxy to use a cookie to ensure a user gets routed to the same server each time. All that talk about statelessness being necessary is crap, an old onion in the web framework recipe that isn’t at all necessary and is actually crippling.

sudo emacs /etc/haproxy/haproxy.cfg

using these settings for now (these could change based on load testing later)…

global
    log 127.0.0.1 local0
    maxconn 32000
    chroot /usr/share/haproxy
    pidfile /var/run/haproxy.pid
    uid 33
    gid 33
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    redispatch
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen smalltalk 0.0.0.0:8080
    mode http
    cookie SEASIDE insert nocache
    balance roundrobin
    server app1 127.0.0.1:3001 cookie app1inst1 check
    server app2 127.0.0.1:3002 cookie app1inst2 check
    server app3 127.0.0.1:3003 cookie app1inst3 check
    server app4 127.0.0.1:3004 cookie app1inst4 check
    server app5 127.0.0.1:3005 cookie app1inst5 check
    server app6 127.0.0.1:3006 cookie app1inst6 check
    server app7 127.0.0.1:3007 cookie app1inst7 check
    server app8 127.0.0.1:3008 cookie app1inst8 check
    server app9 127.0.0.1:3009 cookie app1inst9 check
    server app10 127.0.0.1:3010 cookie app1inst10 check
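
The ten server lines follow an obvious pattern, so if the pool size changes they can be generated rather than retyped. A small sketch, where haproxy_server_lines is a made-up helper name and the line format is copied straight from the listen block above:

```shell
#!/bin/bash
# Sketch: print the HAProxy "server" lines for N local Seaside instances.
# haproxy_server_lines is a hypothetical helper; ports start at 3001 to
# match the run scripts.
haproxy_server_lines() {
    local count=$1 i
    for i in $(seq 1 "$count"); do
        printf '    server app%d 127.0.0.1:%d cookie app1inst%d check\n' \
            "$i" "$((3000 + i))" "$i"
    done
}

haproxy_server_lines 10
```

The output can be pasted under the listen section in place of the hand-written lines.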

Then I need to enable the proxy, it’s disabled by default.

sudo emacs /etc/default/haproxy

and set…

STARTUP=1

then restart the service

sudo /etc/init.d/haproxy restart

Now I hit the box on port 8080 with a web browser to ensure everything works, it does. I now have 10 load balanced sticky session enabled instances of Seaside running behind HAProxy, sweet! Time to setup Apache as the front end to this beast to offload all the static content, URL rewriting, HTTP compression, HTTPS, and logging to it. Only dynamic requests will be proxied to the Seaside cluster. This also allows me to mix in other frameworks when necessary (Rails, .Net), all proxied behind Apache.
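
The stickiness can also be checked from the command line instead of a browser. This is a sketch under one assumption: that HAProxy answers the first request with a header of the form "Set-Cookie: SEASIDE=<server cookie>", which is what the cookie SEASIDE insert line asks it to do. seaside_cookie is a name invented for this example.

```shell
#!/bin/bash
# Sketch: pull the SEASIDE cookie value out of HTTP response headers read
# from stdin. seaside_cookie is a hypothetical helper name.
seaside_cookie() {
    tr -d '\r' \
        | sed -n 's/^[Ss]et-[Cc]ookie: *SEASIDE=\([^;]*\).*/\1/p' \
        | head -n 1
}

# Usage against the balancer from this post:
#   cookie=$(curl -s -D - -o /dev/null http://localhost:8080/seaside/config | seaside_cookie)
#   echo "first request was routed to: $cookie"
#   # Sending the cookie back should land on the same instance every time:
#   curl -s -o /dev/null -b "SEASIDE=$cookie" http://localhost:8080/seaside/config
```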

Now I want to disable the default site and create a new virtual host…

sudo a2dissite default
sudo emacs /etc/apache2/sites-available/linuxweb1

with the following settings…

NameVirtualHost *:80

<VirtualHost *:80>
    ServerName linuxweb1
    DocumentRoot /var/www
    RewriteEngine On
    ProxyRequests Off
    ProxyPreserveHost On
    UseCanonicalName Off

    # http compression
    DeflateCompressionLevel 5
    SetOutputFilter DEFLATE
    AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    BrowserMatch ^Mozilla/4.0[678] no-gzip
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

    #proxy to seaside if file not found
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
    RewriteRule ^/(.*)$ http://localhost:8080/$1 [P,L]

    # Logfiles
    ErrorLog  /var/log/apache2/linuxweb1.error.log
    CustomLog /var/log/apache2/linuxweb1.access.log combined
</VirtualHost>
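
The rewrite pair in that virtual host boils down to one decision per request. Spelled out as a shell sketch (route_request is an illustrative name, not anything Apache provides):

```shell
#!/bin/bash
# Sketch of the decision the RewriteCond/RewriteRule pair makes:
# serve the file if it exists under the document root, otherwise hand
# the request to the HAProxy listener on port 8080.
# route_request is a hypothetical name used only for this illustration.
route_request() {
    local docroot=$1 path=$2
    if [ -f "$docroot$path" ]; then
        echo "static $docroot$path"
    else
        echo "proxy http://localhost:8080$path"
    fi
}

# Usage:
#   route_request /var/www /css/site.css
#   route_request /var/www /seaside/config
```

Because the test is for a regular file, a request for a directory such as / is proxied to Seaside as well.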

Now I need to enable Apache’s proxy, it’s disabled by default.

<Proxy *>
    Order allow,deny
    Allow from all
</Proxy>

ProxyVia Block

and enable the new site and reload Apache

sudo a2ensite linuxweb1
sudo /etc/init.d/apache2 force-reload

Now I just check the box by hitting it in my web browser at http://linuxweb1/seaside/config and verify everything works. It does, my session sticks to a single Seaside instance. It’s easy to tell if something is set up wrong: without sticky sessions, no Seaside link works. All in all, I’m very pleased with this setup, it runs great, performs well, and being Linux, should be easy to replicate to new servers as I add them.

I’ll just pretend everything actually went this smoothly. In reality, there were many points during this install where I had to stop and learn something new about Linux, or about configuring the various components required. My brain hurts a little after all this, but in a good way. I tackled a lot of new stuff all at once, but I had a lot of fun doing it.

If anyone has a better setup or pointers about mistakes I’ve made, I’d love to hear about it, this is my first crack at using Linux as a web server.

NOTE: I left out the samba setup because it deals with various things with my local windows network that aren’t relevant to this setup. As with any article of this detail, I may have overlooked some step I took but forgot to write down while doing so, so I’ll make corrections as they come up in any comments.

UPDATE: Thanks to a tip in the comments, and having learned Linux a little better now, I use “aptitude update/upgrade/install” rather than “apt-get update/upgrade/install” because aptitude will track dependencies and clean them out automatically when I uninstall stuff.

UPDATE: Since I’ve switched to Linux, my Seaside site has become incredibly stable and much faster than when it was hosted on Windows 2003 server. Part of the speedup comes from HTTP compression, however, the CPU load is now less than before. The hardware is slightly different as well. I went from 3 dual processor windows servers to 2 dual (both hyper-threaded so looks like 4) processor Linux servers. So it’s not an apples to apples comparison, but I’ll give Linux the credit anyway ;)

My Journey to Linux

I’m a Windows guy, I’ve always been a Windows guy. Windows today is more stable than ever. Seems now would be the best time of all to be a Windows guy. Slowly but surely though, I’m becoming a Linux guy.

Truth is, I was always a Microsoft guy, and that simply included Windows along with all of their development products. I used to be a hardware/network technician. I’d set up and maintain networks for medium to small businesses. Windows was always the way to go here, it’s what the users were accustomed to and expected. I’d usually set up a Windows NT server and from a dozen to maybe 30 client computers running various versions of Windows including NT workstation. So Windows was just something I was always familiar with.

Even back then, I had the occasional urge to try other things. One of my first experiences with Linux involved using it as a firewall for a windows network on some cheap throwaway hardware that wasn’t good for much else. But it always seemed a pain to use, and I didn’t really understand it, despite having it working quite well for what I intended. I just didn’t see the point of not having a nice GUI and using cryptic commands to do everything.

Later, I learned to program in VBScript and VB using ASP and SQL. I became a web developer and abandoned the hardware gig. Software was so much more interesting. ASP became ASP.net, and VB became C# when I realized how crappy a language VB actually was. What made me want to change was my discovery of the original Wiki. I found a place where real programmers hung out and discussed anything and everything. I realized the world was bigger than VB. VB.Net fixed many of the issues with VB and is pretty much equivalent to C# in all but one area… culture.

What I really was abandoning was the VB culture. I’d outgrown it, I wanted to be involved in a culture that cared more about programming well. The VB culture is dominated by amateur programmers that are just happy to get something working, they tend to care very little about things like architecture, or patterns, or the aesthetics of good code. They don’t think of themselves as amateurs, many of them consider themselves experts, but start talking about object oriented programming or functional programming and the confused looks on their faces tell you they’ve not really looked into such things very deeply. Many think simply using classes makes code object oriented.

I was still firmly in the Microsoft camp at this point, though my change to C# had opened my eyes to Java, and more importantly object oriented programming. It was the Wiki that introduced me to Smalltalk. I just couldn’t help but notice how much Smalltalk was referenced whenever object oriented programming was discussed, nor how many famous authors’ credentials included a Smalltalk background. I decided I had to check out this Smalltalk thing. Now, at the same time, I was checking out the Lisp thing as well, but that’s not relevant to this story.

So I’m a web developer, my seeking tends to be guided by the need to make my job easier, to find better ways to automate myself. Obviously, I discovered Seaside. Seaside got me into a non Microsoft language. Around the same time, a buddy of mine who I’d met on the Wiki suggested cygwin. I’d been talking about wanting to learn a little more about Linux and he said I could do so without leaving Windows by using a better shell. Cygwin was the beginning of the end for Windows.

I started finding reasons to grep, cat, sed, sort, uniq. This was pretty cool, I was still in Windows but had a Linux command line and the shell became a bigger part of my toolbox. Now I find myself using a non Microsoft programming language, and having discovered PostgreSQL, a non Microsoft database. And now bash for my shell. Hmm…

So now I’m still hosting my apps on Windows servers, but I keep having problems crop up. I recently did a write up on Scaling Seaside which included a bash script for making sure the Seaside services were always up and running. Problem is, turns out the only thing making my Seaside services seem to die, was the bash script itself. Somehow lynx gums up Windows after a certain period of time and Windows starts having random network errors. I’ve taken the script down and now have another one running that uses wget and simply notifies me should any site I’m monitoring go down, or come back up.
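
That replacement monitor isn't shown here, but the idea is simple enough to sketch. What follows is a guess at the shape of it, not the actual script: notify_on_change, the state file, and the pluggable probe are all invented for illustration, and the default probe assumes wget is installed.

```shell
#!/bin/bash
# Sketch: report only when a site changes state (up -> down or back).
# All names here are hypothetical; this is not the script from the post.
probe_with_wget() {
    # --spider checks the URL without downloading the body.
    wget -q --spider --timeout=10 "$1"
}

notify_on_change() {
    local url=$1 state_file=$2 probe=${3:-probe_with_wget}
    local previous=up current=up
    # A missing state file is treated as "was up last time".
    [ -f "$state_file" ] && previous=$(cat "$state_file")
    "$probe" "$url" || current=down
    echo "$current" > "$state_file"
    if [ "$current" != "$previous" ]; then
        echo "$url is now $current"   # swap echo for a mail command
    fi
}

# Usage from cron or the task scheduler:
#   notify_on_change http://serverName:3001/seaside/someApp /tmp/someApp1.state
```

Keeping the last known state in a file is what stops it from sending a message on every run while a site stays down.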

So I find myself using all open source non Microsoft tools for everything except for the server’s operating system. Having become quite comfortable on the command line, it finally hit me: stop screwing with all these problems on Windows and try Linux again. Setting up new Seaside services on Windows is a multi step pain in the butt. I thought I’d give Linux a try and see how far it’s come since the last time I tried it. Boy was I surprised. In the next post I’ll detail my experience setting up a Linux server for hosting Seaside.

Scaling Seaside

UPDATE: This advice is a bit outdated and only applied back when I was hosting Seaside on Windows servers (which I advise against), see Scaling Seaside Redux Enter the Penguin for more up to date advice.

I’ve been busy with non Seaside projects lately, but one of the things I have squeezed in was a bit of configuration to make Seaside scale better. I was having some performance problems when more than a few sessions were running concurrently and after discussing the issue on the Seaside developers list, Avi popped in from DabbleDB and told us the way to scale was load balance many VMs with a few sessions running on each. So I had to read up a bit on the issue and learn what to do.

It turns out one of the best ways to learn about scaling Seaside is to read about scaling Ruby on Rails. The architecture for scaling both is pretty much the same. Ruby developers use a web server called Mongrel, a light weight single threaded server that works well with Rails but isn’t a heavy duty web server like Apache. This is much the same position Seaside is in with Comanche, though not single threaded, the Squeak VM can’t take advantage of multiple processors and doesn’t do well with too many concurrent connections.

UPDATE: I neglected to mention one major requirement of Seaside, your load balancer must support sticky sessions. Seaside uses sessions heavily and does not support the shared nothing approach where every request can hit a different server. This isn’t an issue unique to Seaside, many frameworks use sessions and must deal with this. Other frameworks handle such issues either by sticking a session to a server, or by having a shared session cache that all the servers can access, such as memcached or a sql server. Currently, to the best of my knowledge, no one has externalized Seaside sessions in this manner, so sticky sessions is the only viable approach.

The solution for both is quite simple really, set up a heavier duty web server/load balancer (Apache/lighttpd) to serve up static content, and load balance and proxy connections to a farm of light weight application servers (Mongrel/Comanche) running on other ports. I’m actually using an F5 as my front end load balancer, but Apache has all the necessary features including the ability to create pools of virtual servers which it will load balance requests across with its new mod_proxy.

One need only Google scaling Rails to find many examples of detailed setup information and articles for such a setup, so I won’t repeat it here. I will mention that neither Mongrel nor Comanche is anywhere near as rock solid stable as Apache, so one thing you’ll want when having a setup like this to ensure maximum uptime is a process running to poll all of your servers to ensure they aren’t hung up for any reason.

UPDATE: I don’t want to imply Comanche is unstable, it is very rare that a service needs reset, I only do this because it “can” happen, not because it happens a lot. Seaside is very stable and under normal conditions, doesn’t crash.

Here’s a little bash script I found somewhere that makes checking a site for a specific response string simple and easy to use from the command line allowing you to easily schedule some scripts to reset any hung processes. Oh, I use cygwin on all my servers so I can have a decent Unix command line on my Windows servers.

checkSite.sh UPDATE: This script eventually freaks out Windows and causes random network errors because lynx somehow eats up network resources. Don’t use it, Seaside is quite stable without it.

#!/bin/bash

if [ $# -lt 2 ]; then
    exit
fi

URL=$1
findText=$2

lynx -dump -error_file=/tmp/x$$ $URL 2>/dev/null | grep "$findText" &>/dev/null
if [ $? != 0 ]; then
    echo WARNING: Specified search text was not found
fi

stcode=`awk '/STATUS/{print $2}' /tmp/x$$ 2>/dev/null`
if [ $? != 0 ]; then
    echo site is down
fi

for code in $stcode
do
    case $code in
      200) echo OK;;
      302) echo redirecting
           awk -F/ '/URL/{print "  ",$3}' /tmp/x$$;;
      *)   echo $code
    esac
done

if [ -f /tmp/x$$ ]; then
    rm /tmp/x$$
fi

Then I use this in another script built specifically to monitor instances of my Seaside app. Though this could be more generic, I haven’t bothered yet because I only have one Seaside site in production to worry about.

checkSeaside.sh

echo "$1 $2 $3..."
sh checkSite.sh http://$1:$3/seaside/someApp "Some Required Text" | grep WARNING &>/dev/null
if [ $? = 0 ]; then
    echo "restarting $2 on $1"
    psservice \\\\$1 restart "$2" >/dev/null #NT util to restart services on remote machines
    echo "Restarting $1 $2" | wsendmail -- -CGI -Ssome.mail.server.com -s"App Monitor reset $1 $2" someEmail@someAddress.com -Fmonitor@someAddress.com -P1
fi

Then on each web server, I’m running a pool of 10 instances of Seaside setup as services and ensuring they’re up by scheduling a simple batch file with the windows task scheduler.

monitorSomeApp.cmd

@echo off
bash checkSeaside.sh serverName "Some Service1" 3001
bash checkSeaside.sh serverName "Some Service2" 3002
bash checkSeaside.sh serverName "Some Service3" 3003
bash checkSeaside.sh serverName "Some Service4" 3004
bash checkSeaside.sh serverName "Some Service5" 3005
bash checkSeaside.sh serverName "Some Service6" 3006
bash checkSeaside.sh serverName "Some Service7" 3007
bash checkSeaside.sh serverName "Some Service8" 3008
bash checkSeaside.sh serverName "Some Service9" 3009
bash checkSeaside.sh serverName "Some Service10" 3010
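
Since bash is already on the box, those ten lines could also be generated from a loop. A sketch, with monitor_commands as a made-up name; it only prints the commands so they can be looked over first, or piped to bash to actually run them:

```shell
#!/bin/bash
# Sketch: generate the checkSeaside.sh calls instead of listing them.
# monitor_commands is a hypothetical helper; service names and ports
# follow the pattern in monitorSomeApp.cmd.
monitor_commands() {
    local server=$1 count=$2 i
    for i in $(seq 1 "$count"); do
        printf 'bash checkSeaside.sh %s "Some Service%d" %d\n' \
            "$server" "$i" "$((3000 + i))"
    done
}

monitor_commands serverName 10
```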

Yea, I’m mixing and matching bash and dos scripts, so sue me! Anyway, the setup works great, I’m running 30 instances of Squeak across 3 servers and these scripts ensure they’re always up and responding, and reset them and email me if they go down for any reason. Response time is now much better and I can fully take advantage of the multiple processors on the web boxes.

My process isn’t nearly as fancy as Avi’s (he’s dynamically bringing images up and down based on the host header), but balancing the connections across a bunch of fixed sized pools works well. I started with 10 processes per box, just for the hell of it, but I’ll increase or decrease the size of the pool as load dictates to eventually find the sweet spot for the pool size. For now, 10 per box works, I’ve got plenty of spare ram.

Of course, doing a setup like this means you’ll need to automate your deployment process for new code as well. So far this means keeping a master image I upgrade with new code and test, then a script to take down each process, copy the image file over the old one, and bring the process back up. Seems Avi’s doing the same thing, works pretty well so far.
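
The deployment script itself isn't included here, so this is only a sketch of the loop that paragraph describes. deploy_image and the two hook arguments are hypothetical names; the hooks would wrap whatever actually controls the services, for example psservice on Windows or svc -d and svc -u under daemontools.

```shell
#!/bin/bash
# Sketch: roll a tested master image out to each instance, one at a time.
# deploy_image, stop_instance and start_instance are illustrative names,
# not the script used in this post.
deploy_image() {
    local master=$1 stop_instance=$2 start_instance=$3
    shift 3
    local dir
    for dir in "$@"; do
        "$stop_instance" "$dir"                      # take the process down
        cp "$master" "$dir/$(basename "$master")"    # overwrite the old image
        "$start_instance" "$dir"                     # bring it back up
    done
}

# Usage with daemontools-style hooks (both hooks are assumptions):
#   stop_svc()  { svc -d "$1"; }
#   start_svc() { svc -u "$1"; }
#   deploy_image ./SqueakProd.image stop_svc start_svc /service/squeak1 /service/squeak2
```

Only one instance is down at any moment, so the rest of the pool keeps serving while the new image goes out.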

Giles Screencast on Seaside and Rails

For those who may not be on the Seaside mailing list, Giles just posted a couple of screencasts about Seaside and Rails. If you’re a Rails guy, go watch them, you’ll learn something about Seaside. If you’re a Seaside guy, watch them to learn a little about the Rails approach. In either case, go watch them, the second one especially, it’s an excellent screencast. This may be a preview of something he might present at OSCON 2007. My favorite quote from the video…

“Although I love Rails, I’m going to find a way to do stuff in Seaside as quickly as I can, because it’s just so cool.”

Seems people are starting to understand that having the power to write desktop style applications (i.e. insanely complex) on the web might be something worth having. I think the Seaside community is going to have quite a few Rails converts over the next year or two. Ruby seems to be a gateway drug to Smalltalk and Rails I think, will be the gateway drug to Seaside.
