Clean URLs in Seaside
By Ramon Leon - 1 December 2008 under Seaside, Smalltalk, Programming, Performance
Seaside is known as a heretic framework when it comes to URLs; by default, they aren't very pretty. This is both a blessing and a curse. It speeds up development tremendously but confuses the crap out of your users, who don't understand why they can't copy URLs and instant message or email them to you.
These URLs come from callbacks, but you don't want to get rid of all callbacks, since they're a major part of what makes programming in Seaside so enjoyable: they remove the need to manually marshal state in URLs. Once your app is working well enough that you are concerned about the URLs, you can identify the parts of your application that are mostly just navigation from one component to the next and start replacing those callbacks with clean URLs, encoding the necessary state in the URL as every other framework does. This works well for the more page-like parts of your site, where you don't really need complex callbacks anyway.
Doing clean URLs in Seaside isn't very difficult, but unlike using callbacks, how you pass state with them is rather application-specific. Seaside doesn't have simplistic controllers that receive requests and dispatch to views, but an actual control tree that maintains state between requests in the session. Since the control tree is application-specific and varies depending on the developer's personal style, the URL routing, which has to build or change parts of the control tree, is also necessarily application-specific.
Basically, there are two things you have to do: get rid of the _s and get rid of the _k. I picked up these ideas from the squeak-dev list when Adrian from cmsbox explained what they were doing. He didn't post any code, just a quick description of the method, but it was more than enough to get me going.
Getting rid of the _s is trivial; it's a configuration option on the application config page. Using that method, however, will cost you an initial redirect: Seaside sets a cookie and then redirects so it can detect the cookie on the following request. This gets rid of the _s but immediately leaves you sitting on a page with a _k and c in the URL when you haven't done anything but hit the site root; it's ugly.
If you want clean URLs, at least for something as simple as page-to-page navigation where nothing fancy is going on, you probably don't want this initial redirect; bots don't like it either. The fix is to not enable cookie sessions via the config but to do it manually, by tagging the response with the cookie on the way out if it isn't already there.
On your WASession subclass just override #returnResponse: with something like this:
    returnResponse: aResponse
        (self currentRequest cookieAt: self application handlerCookieName)
            ifNil: [ aResponse addCookie: self sessionCookie ].
        ^ super returnResponse: aResponse
This adds the same cookie the config screen would without the redirect and thus without the initial ugly URL when a new session is instantiated.
We also have to remove the _s from generated callback URLs. Add another override to extend the behavior of #actionUrlForKey: to strip the _s when a session cookie is found:
    actionUrlForKey: aString
        | url |
        url := super actionUrlForKey: aString.
        (self currentRequest cookieAt: self application handlerCookieName)
            ifNotNil: [ url parameters removeKey: self application handlerField ].
        ^ url
This takes care of the _s; you'll never see it again.
The _k is a little more interesting, so I'll use this blog as my example.
I tend to use a root component which acts as an outer frame and has an instance variable for the current body, header, and footer components. Sometimes some of this stuff in the root component might be expensive to get, so I don't want to have to do it more than once per session, or I just want it to persist between requests.
Normally when a request comes in without a _k, the current session will be invoked to create a new render loop main, which will be invoked to create a new instance of your root component and render it.
I want to avoid this (though this part isn't strictly necessary if you're OK with each request creating a new instance of your root) and keep the existing instance of the root component, as well as parse the URL to decide what component should be loaded as the current body. This requires a custom WARenderLoopMain subclass installed as the main class in the configuration.
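As a rough sketch of that setup, assuming Seaside 2.8 on Squeak: the class names SBRenderLoopMain, SBSession, and SBBlogRoot and the category 'SB-Blog' are made up for illustration, and the #sessionClass and #mainClass preference keys are my assumption of what the config page's "Session Class" and "Main Class" options map to. Setting both options by hand on the config page works just as well.

```smalltalk
"Hypothetical render loop subclass; the blog instance variable will hold the reused root component."
WARenderLoopMain subclass: #SBRenderLoopMain
    instanceVariableNames: 'blog'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'SB-Blog'.

"Register the (hypothetical) root component and point the application at the custom classes.
 Assumes #sessionClass and #mainClass are the preference keys behind the config page options."
| app |
app := SBBlogRoot registerAsApplication: 'blog'.
app preferenceAt: #sessionClass put: SBSession.
app preferenceAt: #mainClass put: SBRenderLoopMain
```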
This all starts at, go figure, #start: on the session class. So we'll override the default implementation with this:
    start: aRequest
        ^ self mainClass new
            blog: blog;
            start: aRequest
Here we see blog, which is an instance of the root component that I want to reuse. I'm simply passing the root component instance on to the custom WARenderLoopMain subclass.
This means my session class needs to keep track of the root component, which is easy enough to do in the #initialize of the root component:
    initialize
        super initialize.
        self session blog: self.
        currentBody := SBPostsView new.
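For completeness, here is a minimal sketch of the accessors this relies on. It assumes both the session subclass and the render loop subclass declare an instance variable named blog; the accessors themselves are plain Smalltalk and not part of Seaside.

```smalltalk
"On the WASession subclass: remembers the root component for the life of the session."
blog
    ^ blog

blog: aComponent
    blog := aComponent

"On the WARenderLoopMain subclass: receives the root from the session on every hit.
 It is nil on the very first request of a new session."
blog: aComponent
    blog := aComponent
```

The render loop instance does not survive the request, so the session is the one that has to hold on to the root and hand it over each time.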
Now the custom WARenderLoopMain subclass has the root component. A simple override of the #createRoot factory method allows me to return the same instance each time instead of creating a new one:
    createRoot
        ^ blog ifNil: [ self rootClass new ]
At this point, I could override #start: and do all my URL parsing now, in the render loop, but I won't because I prefer to let each component parse the URL for itself taking its relevant state and loading itself up however it wishes. The default behavior of #start: already allows this by invoking #initialRequest: on each visible component.
Now, this won't be an initial request, but a subsequent request on an already initialized component; however, about the only thing I ever use #initialRequest: for is parsing URLs, so I'm happy to just treat each request without a _k as an #initialRequest: to a component.
    initialRequest: aRequest
        "parse aRequest url however you like"
Of course, parsing your URL is quite naturally application-specific so I'll leave this as an exercise to the reader.
For this blog, I just grab the path from the URL and do a quick search for a blog post with a matching URL slug. If one is found, I load that post as the current page; if not, I check for a tag that matches the slug and load the posts under that tag. If nothing is found, I issue a 404 status and render the home page.
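The lookup order described above could be sketched roughly like this. Every helper here (#slugFrom:, #postWithSlug:, #postsTagged:, #showPost:, #showPosts:, #showHome, #markNotFound) is a hypothetical method you would write yourself, not part of Seaside; how you pull the slug out of the request and how you set the 404 status depends on your Seaside version, so those details are deliberately left behind helpers.

```smalltalk
initialRequest: aRequest
    | slug post posts |
    super initialRequest: aRequest.
    "Hypothetical helper: answers the last path segment, or nil for the site root."
    slug := self slugFrom: aRequest.
    slug isNil ifTrue: [ ^ self showHome ].
    "First try: a post whose URL slug matches."
    post := self postWithSlug: slug.
    post notNil ifTrue: [ ^ self showPost: post ].
    "Second try: a tag with that name; show its posts."
    posts := self postsTagged: slug.
    posts notEmpty ifTrue: [ ^ self showPosts: posts ].
    "Nothing matched: flag a 404 (hypothetical helper) and fall back to the home page."
    self markNotFound.
    self showHome
```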
The only thing left to do now is render the URLs cleanly instead of with callbacks in your render methods. Something like:
    html anchor
        url: (self baseUrl addToPath: eachPost slug) asString;
        with: eachPost name
Note here that #baseUrl is not actually a method on a component but on the session. After some profiling I found #baseUrl to be a very expensive method to call and since it never really changes it pays off very well to cache it in my component base class:
    baseUrl
        ^ (baseUrl ifNil: [ baseUrl := self session baseUrl ]) copy
Note that when I'm rendering the anchor I'm actually modifying the path of the URL that #baseUrl returns, so the accessor has to hand out a copy of the cached URL each time rather than the cached instance itself.
Anchors with callbacks, when clicked in Seaside, result in two HTTP requests to the server: the initial one, which looks up the callback and invokes it, and a quick 302 redirect to the final URL to render that page. By not using callbacks and rendering ordinary URLs, I'm bypassing the callback phase completely and loading up my state in #initialRequest:, which eliminates one of the HTTP requests. Combined with the caching of the #baseUrl, this is what makes the page navigation feel so snappy.
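To make the contrast concrete, here are the two styles side by side inside a render method. SBPostView and its #on: constructor are hypothetical names used only for illustration; the second anchor is the same one shown earlier.

```smalltalk
"Callback style: the click hits the server, runs the block, then redirects to the rendered page."
html anchor
    callback: [ currentBody := SBPostView on: eachPost ];
    with: eachPost name.

"Clean URL style: a single request; the state is rebuilt in #initialRequest: from the path."
html anchor
    url: (self baseUrl addToPath: eachPost slug) asString;
    with: eachPost name
```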
I'm sure some of this will likely change in 2.9 but at the moment I have no idea. In any case you get clean URLs with no parameters that are bookmarkable and won't confuse users trying to pass URLs around among themselves.
Comments
Good article, especially the #baseUrl part. Thank you.
As always, very interesting, thanks a lot for this blog!
By coincidence, I started playing with Seaside again last weekend after a two-year hiatus, and just yesterday wanted to do exactly what you describe in this post. I'm glad that my solution is not drastically different ;-) (though I only used initialRequest instead of having a custom render loop).
One thing I was wondering how to do was to automatically write the generated page to disk; coupled with an Apache rule to not call Seaside if the page exists, that would make all those static pages extremely fast to serve, while still keeping the possibility to easily create and manage them.
Of course, this would only be good for static pages, but this covers a wide range of possibilities.
That would only work for totally static pages, and I have yet to build a Seaside site that didn't use a callback on a page somewhere. This blog, for instance, still uses the built-in pager, which uses callbacks.
Yes, only for totally static pages, but many websites actually are built mostly as static pages -- what I'm thinking of is a small company website, or a photo website presenting work for a photographer, this kind of thing.
Some pages will be dynamic, surely, but the majority could be done statically. For modifications, the site admin could just log in, get a dynamic page that allows editing, and write out the result when validating. You'd get the advantage of Seaside to build a nice dynamic web application whose purpose is to edit and manage the website, while keeping things very fast. Though you are probably right that in practice this would be 1/ overkill 2/ impractical ... :)
> small company website
Small company websites don't have enough load to justify the complexity of caching; just serve everything dynamically.
Seaside's fast enough for any small site, as this blog proves. Seaside's fast enough for a larger site as well without caching. I'm doing 14k page views a day at work with it, caching the occasional db query but rendering all the pages dynamically; works fine.
(A question I asked Ramon by email)
> I'm confused with blog. Is blog an inst-var in the custom render loop, in the custom session or both ? ...
Ramon answered: blog is an inst var in both, the session passes it to the newly instantiated render loop on each hit. Lazily initializing the render loop doesn't make sense because it won't survive the current request. On a new session, session will pass a nil blog to render loop, so it uses the ifNil part to create a new root component. On all further hits the session passes a non nil blog to the render loop which will then use it instead of using the ifNil block.
Thanks for this post. I am working on something similar and your post is very helpful. By the way, I see you write that the comments are formatted using Markdown. Did you write some Smalltalk code for this or did you reuse some public classes?
No, I literally mean the comments are formatted with Markdown.pl, the original Perl version. I just open a pipe in Smalltalk with OSProcess, pump the text through the Perl script, and read the result back. It works great and saves me the trouble of implementing a half-baked version in Smalltalk.
Seems really interesting doing this. Later I'll try it at home and see how it works with the implementation of Comet I'm doing.
Thanks for the post!