Archive for the 'Databases' Category

Simple Image Based Persistence in Squeak

One of the nicest things about prototyping in Smalltalk is that you can delay the need to hook up a database during much of your development, and if you’re lucky, possibly even forever.

It’s a mistake to assume every application needs a relational database, or even a proper database at all. It’s all too common for developers to wield a relational database as a golden hammer that solves all problems, but for many applications it introduces a level of complexity that can make development feel like wading through a pond full of molasses, where you spend much of your time trying to keep the database schema and the object schema in sync. It kills both productivity and fun, and god dammit, programming should be fun!

This complexity is sometimes justified, but many times it’s not. Many business applications and prototypes are built to replace manual processes using Email, Word, and Excel. Word and Excel, by the way, aren’t ACID compliant, don’t support transactions, and manage to successfully run most small businesses. MySQL became wildly popular long before it supported transactions, so it’s pretty clear a wide range of apps just don’t need that, no matter how much relational weenies say it’s required.

It shouldn’t come as a surprise that one can take a single step up the complexity ladder, and build simple applications that aren’t ACID compliant, don’t support transactions, and manage to successfully run most small businesses BETTER than Word and Excel while purposely not taking a further step and moving up to a real database which would introduce a level of complexity that might blow the budget and make the app infeasible.

No object relational mapping layer (not even Rails and ActiveRecord) can match the simplicity, performance, and speed of development one can get just using plain old objects that are kept in memory all the time. Most small office apps with no more than a handful of users can easily fit everything into memory; this is the idea behind Prevayler.

The basic idea is to use a command pattern to apply changes to your model; you can then log the commands, snapshot the model, and replay the log in case of a crash to bring the last snapshot up to date. Nice idea, if you’re OK creating commands for every state changing action in your application and being careful with how you use timestamps so replaying the logs works properly. I’m not OK with that; it introduces a level of complexity that is overkill for many apps and is likely the reason more people don’t use a Prevayler like approach.

One might attempt to use the Smalltalk image itself as a database (and many try), but this is rife with problems. My average image is well over 30 megs, saving it takes a bit of time, and saving it while processing http requests risks all kinds of things going wrong as the image prepares for what is essentially a shutdown/restart cycle.

Using a ReferenceStream to serialize objects to disk Prevayler style, but ignoring the command pattern part and just treating it more like crash proof image persistence, is a viable option if your app won’t ever have that much data. Rather than trying to minimize writes with commands, you just snapshot the entire model on every change. This isn’t as crazy as it might sound; most apps just don’t have that much data. This blog for example, a year and a half old, around 100 posts, 1500 comments, has a 2.1 megabyte MySQL database, which would be much smaller as serialized objects.

If you’re going to have a lot of data, clearly this is a bad approach, but if you’re already thinking about how to use the image for simple persistence because you know your data will fit in RAM, here’s how I do it.

It only takes a few lines of code in a single abstract class that you can subclass for each project to make a Squeak image fairly robust and crash proof, and more than capable enough to allow you to just use the image, no database necessary. We’ll start with a class…

Object subclass: #SMFileDatabase
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'SimpleFileDb'

SMFileDatabase class
	instanceVariableNames: 'lock'

All the methods that follow are class side methods. First, we’ll need a method to fetch the directory where rolling snapshots are kept.

backupDirectory
	^ (FileDirectory default directoryNamed: self name) assureExistence.

The approach I’m going to take is simple: a subclass will implement #repositories to return the root object that needs to be serialized. I just return an array containing the root collection of each domain class.

repositories
	self subclassResponsibility

The subclass will also implement #restoreRepositories: which will restore those repositories back to wherever they belong in the image for the application to use them.

restoreRepositories: someRepositories
	self subclassResponsibility
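
As a rough sketch of what a concrete subclass might look like, assume a hypothetical application with two domain classes, Customer and Order, each keeping its root collection in a class side variable reachable through hypothetical #repository and #repository: accessors. None of these names come from the code in this post; they are only here to show the shape of the two hooks.

```smalltalk
"Hypothetical subclass; MyAppDatabase, Customer, and Order are illustrative names."
SMFileDatabase subclass: #MyAppDatabase
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MyApp'

"Class side methods of MyAppDatabase"
repositories
	"Answer the root collection of each domain class, in a fixed order."
	^ Array
		with: Customer repository
		with: Order repository

restoreRepositories: someRepositories
	"Put each collection back where the application expects it, in the same order as #repositories."
	Customer repository: someRepositories first.
	Order repository: someRepositories second
```

The only contract between the two methods is the order of the array, so whatever #repositories answers, #restoreRepositories: has to unpack the same way.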

Should the image crash for any reason, I want the last backup to be fetched from disk and restored. So I need a method to detect the latest version of the backup file, which I’ll stick a version number in when saving.

lastBackupFile
	^ self backupDirectory fileNames
		detectMax: [:each | each name asInteger]

Once I have the file name, I’ll deserialize it with a read only reference stream (don’t want to lock the file if I don’t plan on editing it)

lastBackup
	| lastBackup |
	lastBackup := self lastBackupFile.
	lastBackup ifNil: [ ^ nil ].
	^ ReferenceStream
		readOnlyFileNamed: (self backupDirectory fullNameFor: lastBackup)
		do: [ : f | f next ]

This requires extending ReferenceStream with #readOnlyFileNamed:do:. Just steal the code from FileStream, so nicely provided by Avi Bryant, that encapsulates the #close of the stream behind #do:. Much nicer than having to remember to close your streams.
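
For readers who do not have that FileStream code handy, here is a minimal sketch of what such an extension could look like. It is not the original extension; it assumes ReferenceStream class>>on: wraps an already opened FileStream (as DataStream does) and that closing the ReferenceStream closes the underlying file.

```smalltalk
"Class side extension of ReferenceStream; a sketch under the assumptions stated above."
readOnlyFileNamed: aName do: aBlock
	"Open aName read only, pass the stream to aBlock, answer the block's result,
	and close the stream even if the block fails."
	| stream |
	stream := self on: (FileStream readOnlyFileNamed: aName).
	^ [ aBlock value: stream ] ensure: [ stream close ]
```

The #ensure: is the whole point: the caller never sees the stream outside the block, so it can never forget to close it.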

Now I can provide a method to actually restore the latest backup. Later, I’ll make sure this happens automatically.

restoreLastBackup
	self lastBackup ifNotNilDo: [ : backup | self restoreRepositories: backup ]

I like to keep around the last x number of snapshots to give me a warm fuzzy feeling that I can get old versions should something crazy happen. I’ll provide a hook for an overridable default value in case I want to adjust this for different projects.

defaultHistoryCount
	^ 15

Now, a quick method to trim the older versions so I’m not filling up the disk with data I don’t need.

trimBackups
	| entries versionsToKeep |
	versionsToKeep := self defaultHistoryCount.
	entries := self backupDirectory entries.
	entries size < versionsToKeep ifTrue: [ ^ self ].
	((entries sortBy: [ : a : b | a first asInteger < b first asInteger ])
		allButLast: versionsToKeep)
			do: [ : entry | self backupDirectory deleteFileNamed: entry first ]

OK, I’m ready to actually serialize the data. I don’t want multiple processes all trying to do this at the same time, so I’ll wrap the save in a critical section, call #trimBackups, figure out the next version number, and serialize the data (#newFileNamed:do: is another stolen FileStream method), making sure to #flush it to disk before continuing (don’t want the OS doing any write caching).

saveRepository
	| version |
	lock critical:
		[ self trimBackups.
		version := self lastBackupFile
			ifNil: [ 1 ]
			ifNotNil: [ self lastBackupFile name asInteger + 1 ].
		ReferenceStream
			newFileNamed: (self backupDirectory fullNameFor: self name) , '.' , version asString
			do: [ : f | f nextPut: self repositories ; flush ] ]

So far so good, let’s automate it. I’ll add a method to schedule the subclass to be added to the start up and shutdown sequence. You must call this for each subclass, not for this class itself.

UPDATE: This method also initializes the lock and must be called prior to using #saveRepository, this seems cleaner.

enablePersistence
	lock := Semaphore forMutualExclusion.
	Smalltalk addToStartUpList: self.
	Smalltalk addToShutDownList: self

So on shutdown, if the image is actually going down, just save the current data to disk.

shutDown: isGoingDown
	isGoingDown ifTrue: [ self saveRepository ]

And on startup we can #restoreLastBackup.

startUp: isComingUp
	isComingUp ifTrue: [ self restoreLastBackup ]

Now, if you want a little extra snappiness and would rather not make the user wait for the flush to disk, here’s a little convenience method for saving the repository on a background thread.

takeSnapshot
	[self saveRepository] forkAt: Processor systemBackgroundPriority
		named: 'snapshot: ' , self name

And that’s it, half a Prevayler and a more robust easy to use method that’s a bit better than trying to shoehorn the image into being your database for those small projects where you really really don’t want to bother with a real database (blogs, wikis, small apps, etc). Just sprinkle a few MyFileDbSubclass saveRepository or MyFileDbSubclass takeSnapshot’s around your application whenever you feel it important, and you’re done.
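
Put together, usage might look something like the following workspace snippet. MyAppDatabase is a hypothetical subclass of SMFileDatabase that implements #repositories and #restoreRepositories:; it is not part of the file out.

```smalltalk
"One time setup, for example from a workspace. Creates the lock and hooks startup/shutdown."
MyAppDatabase enablePersistence.

"After a change that must survive a crash, block until the snapshot is written."
MyAppDatabase saveRepository.

"After a less critical change, write the snapshot in the background instead."
MyAppDatabase takeSnapshot.
```

Since #enablePersistence is what creates the lock, it has to run before the first #saveRepository or #takeSnapshot.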

Here’s a file out if you just want the code fast, SMFileDatabase.st

22 February 2007 > Squeak Image Updated

Just a quick notification that I’ve updated my squeak image. I do this occasionally to keep my base up to date with the latest and greatest of the frameworks I use.

In this update I’m sharing my current 3.9 image. It’s based on Damien Cassou’s Squeak Dev Image, an awesome base image with all the necessary goodies a developer needs. Of course I’ve loaded up my window customizations, nicer looking fonts, and have all the preferences set the way I like.

This image includes PostgreSQL drivers and Glorp, since I’m now doing some development with them and consider them part of my base tool set. If my image isn’t to your liking, I highly recommend learning to build and maintain your own using Damien’s as a starting point. It will save you a lot of time, and he’s done an awesome job building and sharing these images. Thanks Damien.

A Smalltalk ActiveRecord using Magritte, Seaside, and Glorp

I’ve been working on a side project that’s given me reason to want to use Glorp with Seaside. Having just mapped the sample blog written from my screencast into Glorp manually, by writing Glorp descriptors, I decided that I wanted something simpler, something more like Ruby on Rails, automatic persistence, with almost no configuration.

Having used Magritte for a while to describe my Seaside UI’s, I decided that those same descriptions contained all the necessary meta data to write Glorp descriptions from. Unlike the ActiveRecord in Rails, or the one Alan Knight is working on, I’m not using the database as the source of my metadata, I’m using the objects themselves with meta data from Magritte instead.

I sat down and started hacking out my own ActiveRecord implementation, which is really just a small framework that glues these three existing frameworks together for me and makes using Seaside against a Postgres database easy for me. Needless to say, knowledge of Magritte is a prerequisite for using this code.

I just open sourced the code I’ve been using on SqueakSource. Add this repository to Monticello to get a copy:

MCHttpRepository
    location: 'http://www.squeaksource.com/MagritteGlorp'
    user: ''
    password: ''

This is a first cut Alpha release. I make no guarantees; only the brave need attempt using it. Using it requires the following.

Subclass MGActiveRecord; this will be your root class, and all your biz classes can descend from it.

MGActiveRecord subclass: #SBActiveRecord
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'SeasideBlog-Glorp'

Subclass MGDescriptorSystem and override #rootClass, returning the class above, like so…

rootClass
	^SBActiveRecord

and on the class side, override #defaultLogin

defaultLogin
    ^(Login new)
        database: PostgreSQLPlatform new;
        username: 'xxxx';
        password: 'xxxx';
        connectString: '127.0.0.1_yourDatabaseName'

and optionally #initializeDatabase: if you want to insert some test data on creation of the schema…

initializeDatabase: aSession
    aSession inUnitOfWorkDo: [aSession register: SBPost testPost]

Now, describe your classes with Magritte, here’s an example…

descriptionCategories
    ^ (MAMultipleOptionDescription selector: #categories label: 'Categories' priority: 1000)
        options: [SBCategory findAll execute] dynamicallyRefreshed ;
        classes:{SBCategory};
        reference: SBCategory description;
        componentClass: MACheckboxGroupComponent;
        yourself

[SBCategory findAll execute] dynamicallyRefreshed is a shortcut I implemented for (MADynamicObject on: [SBCategory findAll execute]). Since Magritte descriptions are cached, I do this to ensure that each time a UI is rendered, a fresh query is run against the database.
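
Going only by that description, the shortcut could be as small as the following extension method. Where it lives is an assumption: in Squeak 3.9 blocks are instances of BlockContext, so that is the class extended in this sketch, and the real package may do it differently.

```smalltalk
"Instance side extension of BlockContext (assumed to be the class of blocks in this image)."
dynamicallyRefreshed
	"Wrap the receiver so it is evaluated again each time the description asks for the value,
	instead of being evaluated once and cached."
	^ MADynamicObject on: self
```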

You must manually create your database in Postgres. Once created, to create your schema, simply call #createSchema on your MGDescriptorSystem subclass like so…

SBGlorpDescriptions createSchema.

It will use your default connection to infer and create the schema necessary to support your object model. I’ve only used this on two schemas so far, but it seems to work OK, though I’m sure there must be bugs.

Now, to use this all in Seaside, subclass MGGlorpSession and override #glorpDescriptionClass like so…

glorpDescriptionClass
    ^SBGlorpDescriptions

Once done, from Seaside, I can execute queries like so…

blogPosts
    "Grab published blog posts from database and return them in reverse order"

    ^(SBPost findAll)
        limit: self numberOfPostsToShow;
        where: [:each | each isPublished];
        orderBy: [:each | each timestamp descending];
        execute

I have the full power of Glorp available from two class methods, #find and #findAll, which return Glorp queries wrapped in a decorator that allows you to call execute on them for the current Seaside context. All classes are commented. If anyone is brave enough to use this, I’d appreciate any feedback on any trouble you run into, or just general discussion about the approach. As I said before, I make no guarantees, but I’m using this code myself, and so far, it seems OK.

UPDATE: The tests included in the package are there to demonstrate a missing feature, automatic inheritance mapping. They do not need to be run and have nothing to do with the base package. If createSchema works, you’re done, just start using it.

Making a Connection Pool for Glorp in Seaside

First let me say… Glorp rocks! Kudos to Alan Knight for this framework. I’m really liking it. Having written two home brew O/R frameworks in the past (mostly to learn how, and in less capable languages than Smalltalk), I can appreciate the flexibility of its design. This is going on my list of programs to read thoroughly from time to time. It’s a very nice example of a well written OO system that anyone could learn a lot from. I’d add both Seaside and Magritte to that list as well. Reading great code is a lost art that too few programmers practice these days.

Glorp really gives me that object oriented feel and allows me to at least pretend I’m working with an object database, while keeping all the benefits of a relational database like constraints, indexing, and random queries. It’s far more capable than I thought it’d be and totally pluggable if you need to add any capabilities. It’ll do things Rails couldn’t dream of as far as mapping and querying goes, and it does it in native Smalltalk syntax.

OK, enough of the Glorp envy. I’ve been working to get Glorp, Seaside, and Magritte all tied together so I can have a full stack framework to work with that allows me to work in Smalltalk at every level. I’ve used many languages and nothing comes close to the productivity I feel in Smalltalk, so naturally, I’m looking forward to finally using it from top to bottom, html, biz objects, and sql queries, all in Smalltalk.

While working on an implementation of a sort of ActiveRecord, but using Magritte for the meta data instead of the database, I quickly found I needed to marry one Glorp session to one Seaside session to keep everything simple and intuitive. It’s a good match; however, I don’t want each session to hold on to a PostgreSQL connection. Connections are too valuable a resource to leave sitting idle and unused for 10 minutes while a session times out.

I played around a bit and found, at least so far, that within Glorp, the Squeak and Postgres adapters don’t really maintain any state and can be swapped in and out of an existing GlorpSession. I decided that I’d write a connection pool for Glorp’s SqueakDatabaseAccessor allowing me to tie a PostgreSQL connection to a request allowing much more scalability in the web scenario without risking running out of connections during peak loads. So, after a bit of playing around, I came up with a class called MGConnectionPool. I’m using the class side for all this code, taking advantage of the simple fact that in Smalltalk a class “is” a singleton ensuring there’s only one instance of the pool in the image.
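
The methods that follow refer to lock and connections without declaring them. Since everything lives on the class side, they are presumably class side instance variables, declared the same way as lock in SMFileDatabase. A sketch of that declaration, with a guessed category name:

```smalltalk
"Assumed declaration; the category name is a guess."
Object subclass: #MGConnectionPool
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MagritteGlorp'

MGConnectionPool class
	instanceVariableNames: 'lock connections'
```

Class side #initialize normally runs when the class is loaded from a package or file in; if the class is typed in by hand, evaluate MGConnectionPool initialize once before using the pool.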

MGConnectionPool class>>initialize
    lock := Monitor new.
    connections := Dictionary new.

MGConnectionPool class>>poolTimeout
    ^30 seconds

MGConnectionPool class>>withUser: aUser password: aPassword
    server: aServer database: aDatabase in: aBlock

    | connection result expired |
    "Grab a connection from the pool until you find one that
    isn't expired, logout the expired ones"
    expired := true.
    [expired]
        whileTrue: [connection := self
                        getConnectionUser: aUser
                        password: aPassword
                        server: aServer
                        database: aDatabase.
            expired := DateAndTime now - connection value > self poolTimeout.
            expired ifTrue: [connection key logout]].

    "pass the connection through the block, which will be the page
    request, and ensure it is returned to the pool when done"
    [result := aBlock value: connection key]
        ensure: [self
                returnConnection: connection
                forKey: (self
                        makeKeyUser: aUser
                        password: aPassword
                        server: aServer
                        database: aDatabase)].
    ^result

MGConnectionPool class>>getConnectionUser: aUser password: aPassword
    server: aServer database: aDatabase 

    | key matchingConnections |
    ^ lock
        critical: [key := self
                        makeKeyUser: aUser
                        password: aPassword
                        server: aServer
                        database: aDatabase.
            matchingConnections := connections
                        at: key
                        ifAbsentPut: [OrderedCollection new].
            matchingConnections
                ifEmpty: [matchingConnections add: (self
                            newLoginForUser: aUser
                            password: aPassword
                            server: aServer
                            database: aDatabase)
                            -> DateAndTime now].
            matchingConnections removeFirst]

MGConnectionPool class>>makeKeyUser: aUser password: aPassword
    server: aServer database: aDatabase 

    ^aUser , '~' , aPassword , '~' , aServer , '~' , aDatabase

MGConnectionPool class>>newLoginForUser: aUser password: aPassword
    server: aServer database: aDatabase 

    ^ (SqueakDatabaseAccessor forLogin:
        (Login new database: PostgreSQLPlatform new;
             username: aUser;
             password: aPassword;
             connectString: aServer , '_' , aDatabase))
         login;
         yourself

MGConnectionPool class>>returnConnection: aConnection forKey: aKey
    lock
        critical: [aConnection value: DateAndTime now.
            (connections at: aKey)
                add: aConnection]

This allows me to create a subclass of a Seaside session and glue Glorp and Seaside together like this.
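
The methods use a database instance variable that holds the GlorpSession, so the class itself presumably looks something like the sketch below. WASession as the superclass and the category name are assumptions.

```smalltalk
"Assumed declaration; 'database' holds the GlorpSession created in #ensureGlorpSessionOn:."
WASession subclass: #MGGlorpSession
	instanceVariableNames: 'database'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MagritteGlorp'
```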

MGGlorpSession>>commit: aBlock
	^database inUnitOfWorkDo: aBlock

MGGlorpSession>>execute: aQuery
	^database execute: aQuery

MGGlorpSession>>register: anObject
	^database register: anObject

MGGlorpSession>>ensureGlorpSessionOn: aDbAccessor
	database
		ifNil: [database := GlorpSession new
						 system: (SBGlorpDescriptions
                            forPlatform: aDbAccessor
                                currentLogin database);
						 accessor: aDbAccessor;
						 yourself]

MGGlorpSession>>responseForRequest: aRequest
	^ MGConnectionPool
		withUser: (self application preferenceAt: #glorpUserName)
		password: (self application preferenceAt: #glorpPassword)
		server: (self application preferenceAt: #glorpServer)
		database: (self application preferenceAt: #glorpDatabase)
		in: [:dbAccessor |
			self ensureGlorpSessionOn: dbAccessor.
			database accessor: dbAccessor.
			super responseForRequest: aRequest]

The authentication details are read from the current application config. Now I can feel safe using Glorp from Seaside, and from within any Seaside component, run Glorp queries with ease in a scalable manner. The code is totally generic and can be reused for every future application by simply subclassing. Programming is getting more fun by the day; it’s going to be a good year for Seaside. All the necessary frameworks exist to build a truly awesome and “fully” object oriented web stack that doesn’t drag you down into the request response cycle inherent to other frameworks, even Rails. Glorp + PostgreSQL + Seaside + Magritte + Scriptaculous + Albatross + a little glue == Slick totally object oriented full stack Ajax web framework of the future, today!
