Scaling Seaside Redux: Enter the Penguin

In my last post, My Journey to Linux, I explained why I decided to give Linux another try. Having decided to use Linux to host my Seaside servers, I was now faced with picking a distro. I chose Ubuntu Server. I’ve heard good things about Debian and Ubuntu and I wanted something clean, free of bloat, and designed to be used as a server. I also wanted to try a distro I’d never tried before. In the past, I’ve tried Red Hat, its offshoot Fedora, and Slackware. I almost tried SUSE but it seemed too aimed at the desktop and I was looking for something to use as a server. Ubuntu seemed to fit the bill nicely. I grabbed a spare test box and decided to explore the idea of hosting Seaside on a Linux server by installing Ubuntu.

I popped in the CD burned from the downloaded ISO image and was greeted with a nice menu asking me what I wanted to do. I chose to install the server to the hard drive and proceeded through an install that reminded me very much of the old NT4 install, but better. The install finished, successfully detected all the hardware, auto-configured the network (using the local DHCP server), and left me with a command line login prompt, exactly what I wanted. At this point, I was very impressed: setting up a Linux server took a fraction of the time of setting up a Windows server and was completely trouble free. I wouldn’t use DHCP for a production server, but it’ll do fine for testing purposes.

Now, on to what I thought would be the hard part: setting up the necessary software. I haven’t yet loaded emacs, so I’ll use pico to whip the server into shape.

I spent a few minutes looking into Debian’s package system, learned what I needed, and then away I went. First I needed to ensure the server was up to date.

sudo apt-get update
sudo apt-get upgrade

OK, that was easy. Now I need to install some software. Three of the main things I wanted weren’t in the default Ubuntu repositories: Squeak, Daemontools, and HAProxy. Squeak is for hosting Seaside, Daemontools is what Lukas uses to maintain his Seaside services, and HAProxy will load balance the processes. I quickly found repositories for all of them and added them to the list of repositories available.

sudo pico /etc/apt/sources.list

and added…

#daemontools
deb http://smarden.org/pape/Debian/ sarge unofficial
deb-src http://smarden.org/pape/Debian/ sarge unofficial
#squeak
deb http://ftp.squeak.org/debian/ stable main
deb-src http://ftp.squeak.org/debian/ stable main
#haproxy
deb http://ftp.sysif.net/debian sid main

Install the signing key for the HAProxy repository

wget http://ftp.sysif.net/debian/apt_key.asc
sudo apt-key add apt_key.asc

Now fetch the new lists by updating again….

sudo apt-get update

OK, time to install everything I want. I need Squeak, an FTP server, Apache, Daemontools for managing services, an SSH server for remote access, HAProxy for load balancing, Samba for networking with my other Windows servers, and Emacs because I’m going to be using it. I chose HAProxy for two reasons: there’s no official Ubuntu version of Apache 2.2 with mod_proxy_balancer, so installing it is a pain, and even with mod_proxy_balancer I haven’t been able to get it to work successfully with Seaside. So, on to the install, simple enough…

sudo apt-get install squeak daemontools vsftpd
sudo apt-get install apache2 apache2-utils
sudo apt-get install openssh-server emacs21 haproxy samba

This was actually one command, but three fit more nicely on the blog. Now configure the FTP server…

sudo emacs /etc/vsftpd.conf

and make a few changes…

anonymous_enable=NO
local_enable=YES
write_enable=YES

and restart the ftp service…

sudo /etc/init.d/vsftpd restart

Now setup Apache with the modules I need…

sudo a2enmod rewrite
sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod deflate

and restart the Apache service…

sudo /etc/init.d/apache2 restart

And I’m mostly set up. Wow, that was easy, much easier than I’d assumed it would be.

Time to set up Seaside. I’m no Linux expert, but I see other services running out of /etc/serviceName so I’ll set up my Seaside services the same way. I’m going to run 10 processes, so I’ll create one directory per process, i.e. squeak1, squeak2, etc. Using daemontools, I simply have to set up a directory for my service with the files it needs and a shell script called “run” that will kick off the service. Here’s the script…

#!/bin/bash
exec squeakvm -mmap 200m -headless SqueakProd "" port 3001

I’m using mmap to limit each process to a maximum of 200 MB of RAM and then feeding Seaside the port number to start on, so squeak1 runs on port 3001, squeak2 on 3002, etc. In each folder I put SqueakProd.image, SqueakProd.changes, and SqueakV39.sources, and chmod 755 the run script.
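Creating ten near-identical directories by hand gets tedious, so the layout described above can be scripted. What follows is only a minimal sketch, not what I actually ran: BASE_DIR, COUNT, and make_run_script are names made up for illustration, and it writes into a staging directory rather than /etc so it can be tried without root.

```shell
#!/bin/bash
# Sketch only: build one daemontools service directory per Seaside process.
# BASE_DIR, COUNT and make_run_script are illustrative names (assumptions).
BASE_DIR="${BASE_DIR:-./seaside-services}"
COUNT="${COUNT:-10}"

# Print the run script for instance $1; instance N listens on port 3000 + N.
make_run_script() {
    local port=$((3000 + $1))
    printf '#!/bin/bash\n'
    printf 'exec squeakvm -mmap 200m -headless SqueakProd "" port %d\n' "$port"
}

for i in $(seq 1 "$COUNT"); do
    dir="$BASE_DIR/squeak$i"
    mkdir -p "$dir"
    make_run_script "$i" > "$dir/run"
    chmod 755 "$dir/run"
    # The image, changes and sources files still need to be copied into $dir.
done
```

Once the staging directory looks right, the squeakN directories can be moved into /etc with the image files alongside each run script.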

Because I’m using daemontools, I can now start my services by simply creating a symbolic link in the /service directory to the /etc/squeakX directory for each directory I created.

sudo ln -s /etc/squeak1 /service/squeak1
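The same link has to be made for every squeakN directory, so a loop saves some typing. This is a hedged sketch: link_services and its two arguments are hypothetical names, and it skips any link that already exists rather than replacing it.

```shell
#!/bin/bash
# Sketch only: link every squeakN directory into the daemontools scan directory.
# link_services and its arguments are illustrative names (assumptions).
link_services() {
    local src_root="$1" service_root="$2" dir target
    for dir in "$src_root"/squeak*; do
        [ -d "$dir" ] || continue
        target="$service_root/$(basename "$dir")"
        # Leave an existing link alone so a re-run is harmless.
        [ -e "$target" ] || ln -s "$dir" "$target"
    done
}

# With the paths used in this post (run as root):
# link_services /etc /service
```

After linking, daemontools’ svstat (for example, sudo svstat /service/squeak1) reports whether a given service is up.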

My services are started within a few seconds and maintained by daemontools. Now let’s set up the load balancer. Seaside requires session persistence to do the magic it does, so we need to configure HAProxy to use a cookie to ensure a user gets routed to the same server each time. All that talk about statelessness being necessary is crap, an old onion in the web framework recipe that isn’t at all necessary and is actually crippling.

sudo emacs /etc/haproxy/haproxy.cfg

using these settings for now (these could change based on load testing later)…

global
    log 127.0.0.1 local0
    maxconn 32000
    chroot /usr/share/haproxy
    pidfile /var/run/haproxy.pid
    uid 33
    gid 33
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    redispatch
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen smalltalk 0.0.0.0:8080
    mode http
    cookie SEASIDE insert nocache
    balance roundrobin
    server app1 127.0.0.1:3001 cookie app1inst1 check
    server app2 127.0.0.1:3002 cookie app1inst2 check
    server app3 127.0.0.1:3003 cookie app1inst3 check
    server app4 127.0.0.1:3004 cookie app1inst4 check
    server app5 127.0.0.1:3005 cookie app1inst5 check
    server app6 127.0.0.1:3006 cookie app1inst6 check
    server app7 127.0.0.1:3007 cookie app1inst7 check
    server app8 127.0.0.1:3008 cookie app1inst8 check
    server app9 127.0.0.1:3009 cookie app1inst9 check
    server app10 127.0.0.1:3010 cookie app1inst10 check
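The ten server lines follow a fixed pattern (port 3000 + N, cookie value app1instN), so they can be generated instead of typed. A small sketch, assuming a helper I’ve named haproxy_server_lines purely for illustration; its output would be pasted into the listen section by hand.

```shell
#!/bin/bash
# Sketch only: print the "server" lines for N Seaside instances.
# haproxy_server_lines is an illustrative helper name (an assumption).
haproxy_server_lines() {
    local count="$1" i
    for i in $(seq 1 "$count"); do
        printf '    server app%d 127.0.0.1:%d cookie app1inst%d check\n' \
            "$i" "$((3000 + i))" "$i"
    done
}

haproxy_server_lines 10
```

Changing the argument is all it takes to match a different number of Seaside processes.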

Then I need to enable the proxy, since it’s disabled by default.

sudo emacs /etc/default/haproxy

and set…

STARTUP=1

then restart the service

sudo /etc/init.d/haproxy restart

Now I hit the box on port 8080 with a web browser to ensure everything works, and it does. I now have 10 load-balanced, sticky-session-enabled instances of Seaside running behind HAProxy, sweet! Time to set up Apache as the front end to this beast to offload all the static content, URL rewriting, HTTP compression, HTTPS, and logging to it. Only dynamic requests will be proxied to the Seaside cluster. This also allows me to mix in other frameworks when necessary (Rails, .Net), all proxied behind Apache.
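The sticky cookie can also be checked from the command line rather than a browser. The sketch below assumes HAProxy’s insert mode sends a header of the form Set-Cookie: SEASIDE=app1instN; seaside_cookie is a name I made up, and it only parses headers fed to it on stdin.

```shell
#!/bin/bash
# Sketch only: pull the HAProxy-inserted SEASIDE cookie value out of response
# headers read from stdin. seaside_cookie is an illustrative name (assumption).
seaside_cookie() {
    # Strip carriage returns, then keep only the value of the SEASIDE cookie.
    tr -d '\r' | sed -n 's/^[Ss]et-[Cc]ookie: *SEASIDE=\([^;]*\).*/\1/p' | head -n 1
}

# Example against the balancer from this post (GET request, headers only):
#   curl -s -D - -o /dev/null http://localhost:8080/seaside/config | seaside_cookie
```

Repeating the request without a cookie should cycle through the instances, while sending the value back with curl -b 'SEASIDE=app1inst4' should keep landing on the same one.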

Now I want to disable the default site and create a new virtual host…

sudo a2dissite default
sudo emacs /etc/apache2/sites-available/linuxweb1

with the following settings…

NameVirtualHost *:80

<VirtualHost *:80>
    ServerName linuxweb1
    DocumentRoot /var/www
    RewriteEngine On
    ProxyRequests Off
    ProxyPreserveHost On
    UseCanonicalName Off

    # http compression
    DeflateCompressionLevel 5
    SetOutputFilter DEFLATE
    AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    BrowserMatch ^Mozilla/4\.0[678] no-gzip
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

    #proxy to seaside if file not found
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
    RewriteRule ^/(.*)$ http://localhost:8080/$1 [P,L]

    # Logfiles
    ErrorLog  /var/log/apache2/linuxweb1.error.log
    CustomLog /var/log/apache2/linuxweb1.access.log combined
</VirtualHost>

Now I need to enable Apache’s proxy, since it’s disabled by default.

<Proxy *>
    Order allow,deny
    Allow from all
</Proxy>

ProxyVia Block

and enable the new site and reload Apache

sudo a2ensite linuxweb1
sudo /etc/init.d/apache2 force-reload

Now I just check the box by hitting it in my web browser, http://linuxweb1/seaside/config, and verify everything works. It does: my session sticks to a single Seaside instance. It’s easy to tell if something is set up wrong, because without sticky sessions no Seaside link works. All in all, I’m very pleased with this setup. It runs great, performs well, and being Linux, should be easy to replicate to new servers as I add them.

I’ll just pretend everything actually went this smoothly. In reality, there were many points during this install where I had to stop and learn something new about Linux, or about configuring the various components required. My brain hurts a little after all this, but in a good way. I tackled a lot of new stuff all at once, but I had a lot of fun doing it.

If anyone has a better setup or pointers about mistakes I’ve made, I’d love to hear about it. This is my first crack at using Linux as a web server.

NOTE: I left out the Samba setup because it deals with details of my local Windows network that aren’t relevant to this setup. As with any article of this detail, I may have overlooked some step I took but forgot to write down, so I’ll make corrections as they come up in the comments.

UPDATE: Thanks to a tip in the comments, and having learned Linux a little better now, I use “aptitude update/upgrade/install” rather than “apt-get update/upgrade/install” because aptitude will track dependencies and clean them out automatically when I uninstall stuff.

UPDATE: Since I’ve switched to Linux, my Seaside site has become incredibly stable and much faster than when it was hosted on Windows Server 2003. Part of the speedup comes from HTTP compression; however, the CPU load is now lower than before. The hardware is slightly different as well. I went from 3 dual processor Windows servers to 2 dual processor Linux servers (both hyper-threaded, so each looks like 4). So it’s not an apples to apples comparison, but I’ll give Linux the credit anyway ;)

My Journey to Linux

I’m a Windows guy, I’ve always been a Windows guy. Windows today is more stable than ever. Seems now would be the best time of all to be a Windows guy. Slowly but surely though, I’m becoming a Linux guy.

Truth is, I was always a Microsoft guy, and that simply included Windows along with all of their development products. I used to be a hardware/network technician. I’d set up and maintain networks for small to medium businesses. Windows was always the way to go here; it’s what the users were accustomed to and expected. I’d usually set up a Windows NT server and from a dozen to maybe 30 client computers running various versions of Windows, including NT Workstation. So Windows was just something I was always familiar with.

Even back then, I had the occasional urge to try other things. One of my first experiences with Linux involved using it as a firewall for a Windows network on some cheap throwaway hardware that wasn’t good for much else. But it always seemed a pain to use, and I didn’t really understand it, despite having it working quite well for what I intended. I just didn’t see the point of not having a nice GUI and using cryptic commands to do everything.

Later, I learned to program in VBScript and VB using ASP and SQL. I became a web developer and abandoned the hardware gig. Software was so much more interesting. ASP became ASP.net, and VB became C# when I realized how crappy a language VB actually was. What made me want to change was my discovery of the original Wiki. I found a place where real programmers hung out and discussed anything and everything. I realized the world was bigger than VB. VB.Net fixed many of the issues with VB and is pretty much equivalent to C# in all but one area… culture.

What I really was abandoning was the VB culture. I’d outgrown it; I wanted to be involved in a culture that cared more about programming well. The VB culture is dominated by amateur programmers who are just happy to get something working. They tend to care very little about things like architecture, or patterns, or the aesthetics of good code. They don’t think of themselves as amateurs, and many of them consider themselves experts, but start talking about object oriented programming or functional programming and the confused looks on their faces tell you they’ve not really looked into such things very deeply. Many think simply using classes makes code object oriented.

I was still firmly in the Microsoft camp at this point, though my change to C# had opened my eyes to Java, and more importantly object oriented programming. It was the Wiki that introduced me to Smalltalk. I just couldn’t help but notice how much Smalltalk was referenced whenever object oriented programming was discussed, nor how many famous authors’ credentials included a Smalltalk background. I decided I had to check out this Smalltalk thing. Now, at the same time, I was checking out the Lisp thing as well, but that’s not relevant to this story.

So I’m a web developer, and my seeking tends to be guided by the need to make my job easier, to find better ways to automate myself. Naturally, I discovered Seaside. Seaside got me into a non-Microsoft language. Around the same time, a buddy of mine who I’d met on the Wiki suggested Cygwin. I’d been talking about wanting to learn a little more about Linux and he said I could do so without leaving Windows by using a better shell. Cygwin was the beginning of the end for Windows.

I started finding reasons to grep, cat, sed, sort, uniq. This was pretty cool: I was still in Windows but had a Linux command line, and the shell became a bigger part of my toolbox. Now I find myself using a non-Microsoft programming language and, having discovered PostgreSQL, a non-Microsoft database. And now bash for my shell. Hmm…

So now I’m still hosting my apps on Windows servers, but I keep having problems crop up. I recently did a write-up on Scaling Seaside which included a bash script for making sure the Seaside services were always up and running. Problem is, it turns out the only thing making my Seaside services seem to die was the bash script itself. Somehow lynx gums up Windows after a certain period of time and Windows starts having random network errors. I’ve taken the script down and now have another one running that uses wget and simply notifies me should any site I’m monitoring go down, or come back up.
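For the curious, the replacement monitor boils down to something like the following. This is a reconstruction of the idea under stated assumptions, not the script I actually run: site_state, report_change, the 60-second interval, and the example.test URL are all placeholders.

```shell
#!/bin/bash
# Sketch only: report when a monitored site changes between up and down.
# site_state, report_change and the URL below are illustrative (assumptions).

# Print "up" if the URL answers, "down" otherwise.
site_state() {
    if wget -q --spider --timeout=10 --tries=1 "$1"; then
        echo up
    else
        echo down
    fi
}

# Print a message only when the state differs from the previous one.
report_change() {
    local url="$1" previous="$2" current="$3"
    if [ "$previous" != "$current" ]; then
        echo "$url is now $current (was $previous)"
    fi
}

# Example loop, left commented out because it runs forever:
# previous=up
# while sleep 60; do
#     current=$(site_state "http://example.test/")
#     report_change "http://example.test/" "$previous" "$current"
#     previous=$current
# done
```

Because report_change only prints on a transition, its output can be piped to whatever notification method is handy without flooding it.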

So I find myself using all open source, non-Microsoft tools for everything except the server’s operating system. Having become quite comfortable on the command line, it finally hit me: stop screwing with all these problems on Windows and try Linux again. Setting up new Seaside services on Windows is a multi-step pain in the butt. I thought I’d give Linux a try and see how far it’s come since the last time I tried it. Boy, was I surprised. In the next post I’ll detail my experience setting up a Linux server for hosting Seaside.