19 September 2006
Establishing Session Contexts

In a recent large Rails project I’ve been developing, I had an elusive error: some views that I was rendering contained the wrong data. It escaped testing - the unit, integration and functional tests were all green - but every once in a while bad data would come up. Once a day, more or less, I’d be looking at an impossible page. After too much investigation I identified the problem: a piece of data held in the session was being modified by two relatively independent parts of the application. It wasn’t intentional - it was just by coincidence that I’d used the same name for two different pieces of data.

While this is not a big deal for small applications, in large projects or ones in which multiple developers are working on different parts of the code, the potential for namespace collisions in the session can become significant. And when that happens, finding the problem can be painful. And unfortunately, the remedy may be a challenge - how to obtain an unused symbol that doesn’t overlap with anything else already in use.

Rails took on the problem of namespace collision in Controllers and Views early on (see the section on Grouping Controllers into Modules in Agile Web Development with Rails by Dave Thomas and David Heinemeier Hansson), but no such relief was given for sessions. This may be by design: the intention is that persistent state should be stored in the database. The state I’m concerned with, however, sits somewhere between persistent and volatile, and as I’ve developed web apps that act more like standalone apps, session state has become a requirement.

What I’ve done is build virtually invisible session context switching into my Rails apps, with a mechanism that is only noticeable when crossing contextual boundaries. It is now generalized into a nice clean mechanism that addresses what is described below as the Context Problem.

The Context Problem in the Rails Session

Say I’ve written a FooController that refers to a session variable as session[:mumble]:

# foo_controller.rb

class FooController < ApplicationController
  def edit
    ...
    session[:mumble] = mumble(foo)
    ...
  end
end

A few weeks later I get a request from a client who needs some Bar views which I create and back with a BarController. Because I didn’t have to think too much when I was creating the part of the app that worked with Foo, I forgot that I used session[:mumble], and inadvertently used it again in Bar.

# bar_controller.rb

class BarController < ApplicationController
  def expand value
    ...
    session[:mumble] = mumble(@bar,value*2)
    ...
  end
end

Unaccountably, the tests run fine, as the two parts are relatively unrelated and the contexts don’t happen to cross in the test cases. But the error is lurking and there’s no good way to find it short of more testing or a series of exhaustive code reviews. And neither is a guarantee. Adding more tests may not necessarily find the overlap. As the size of the application increases, the number of tests needed to identify such situations may grow disproportionately, and the problem may still not be found. Code reviews may help, but if the code base is large (yes, we are talking about Rails-in-the-Large here) the system probably gets reviewed in contextually-related chunks, clearly missing the problem again.

Couldn’t it be attacked through automation? Sometimes it can - a scan could reveal session keys that are the same in different parts of the code. But given the dynamism that makes Ruby so great, the keys may be computed, or the code that is messing with the session may be generated through metaprogramming. Then the namespace collision won’t happen until it’s released in the field and it’ll probably be at your biggest customer in the middle of the night. Of course.
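As a rough illustration of such a scan, here is a minimal sketch. The helper name colliding_session_keys and the sample sources are made up for this example, and it only catches literal keys written as session[:name]; computed keys and generated code slip straight past it, which is exactly the limitation described above.

```ruby
# Hypothetical helper: report literal session keys used in more than one file.
# `sources` maps a file name to that file's source text.
SESSION_KEY = /session\[\s*:(\w+)\s*\]/

def colliding_session_keys(sources)
  files_by_key = Hash.new { |hash, key| hash[key] = [] }
  sources.each do |file, text|
    # scan returns one-element arrays (one capture group), so flatten them
    text.scan(SESSION_KEY).flatten.uniq.each do |key|
      files_by_key[key.to_sym] << file
    end
  end
  # keep only the keys that show up in two or more files
  files_by_key.select { |_key, files| files.size > 1 }
end

# Made-up sample input standing in for real controller files
sources = {
  "foo_controller.rb" => "session[:mumble] = mumble(foo)",
  "bar_controller.rb" => "session[:mumble] = mumble(@bar, value * 2)\nsession[:fraz] = 1"
}
collisions = colliding_session_keys(sources)
# collisions maps :mumble to both file names; :fraz is not reported
```

In a real project the sources hash could be filled by reading each file under app/ with File.read, but a shared key is only a hint: two controllers may share a key on purpose.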

The issue isn’t how to find and fix such problems when they occur, but how to avoid the problem entirely.

Brute Force and Blunt Trauma

The problem can be solved entirely via discipline - at least for those who are disciplined. By prefixing the context onto the symbol, there’s no longer an overlap:

# foo_controller.rb

class FooController < ApplicationController
  def edit
    ...
    session[:foo_mumble] = mumble(foo)
    ...
  end
end

# bar_controller.rb

class BarController < ApplicationController
  def expand value
    ...
    session[:bar_mumble] = mumble(@bar,value*2)
    ...
  end
end

That works well but it’s easy to forget and painful to type. If you have long context names it’s even more painful. Just think of all the typos. And if the code is generated, the mechanisms needed to add context everywhere need to be put into place. You would always need to be very disciplined.

It’s fragile, but it does work. An advantage of this method is that it’s readily clear what the context of a session variable is just by looking at it. But it definitely isn’t the Rails way. More typing means there’s probably a better way to do it. But first, we’ll consider the issue of discarding old state.

Forgetting

Interestingly, in cognitive science (in a sense, applications are but a mechanization of cognitive science) the hardest thing to do is to carefully forget information. The effort needed to remember things in a coordinated way is almost trivial in comparison with that of forgetting. Getting rid of integrated information is hard. Especially when we need to be selective about what values we must forget and make sure that losing the information at the wrong time doesn’t make a mess.

If we use the simplest scenario, our controllers are adapted to cleanse themselves completely on demand by erasing the session variables that start with their prefix:

# foo_controller.rb

class FooController < ApplicationController
  def cleanse
    ...
    session[:foo_mumble] = nil
    session[:foo_womp] = nil
    session[:foo_wuzzle] = nil
    ...
  end
end

# bar_controller.rb

class BarController < ApplicationController
  def cleanse
    ...
    session[:bar_mumble] = nil
    session[:bar_fraz] = nil
    session[:bar_fribble] = nil
    ...
  end
end

This certainly seems ripe for a little refactoring. We can lift the methods and put some smarts into the ApplicationController:

# application.rb

class ApplicationController < ActionController::Base
  def cleanse context
    # Match only keys that begin with "<context>_"
    regexp = Regexp.new("^#{context}_")
    session.keys.each { |key| session[key] = nil if regexp.match(key.to_s) }
  end
end

Now sending "foo" or "bar" to the cleanse method will nil-out the session context whose variables start with "foo_" or "bar_" respectively. Unfortunately, it will also clean out state starting with "foo_fighter_" or "bar_room_brawl_" - so we can still run into the same problem of context overlap. So though this may or may not be a good solution for you, it wasn’t enough for me.
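The overlap is easy to reproduce outside Rails. The following is a minimal sketch in which a plain Hash stands in for the session; the helper cleanse_by_prefix and the key names are invented for the illustration and are modelled on the prefix-matching idea above.

```ruby
# A plain Hash stands in for the Rails session in this sketch.
def cleanse_by_prefix(session, context)
  pattern = Regexp.new("^#{context}_")
  # nil-out every key that starts with "<context>_"
  session.keys.each { |key| session[key] = nil if pattern.match(key.to_s) }
  session
end

session = {
  :foo_mumble        => 1,  # belongs to the "foo" context
  :foo_fighter_score => 2,  # belongs to a separate "foo_fighter" context
  :bar_fraz          => 3   # belongs to the "bar" context
}

cleanse_by_prefix(session, "foo")
# session[:foo_mumble] is now nil, as intended
# session[:foo_fighter_score] is also nil, although it belongs elsewhere
# session[:bar_fraz] is untouched
```

Any scheme that encodes the context in the key name has this weakness whenever one context name is a prefix of another.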

A Better Session

Let us go back to basics. A session is a hash that holds state. For our purposes, context is a partitioning of session. If we consider contexts to be disjoint (non-overlapping) then a context is also a hash that holds state. We want to store our state in the context hashes rather than in the session hash.

This is all great, but there’s an awful lot of Rails code that already references session and, while I love to refactor, I hate changing lots of stuff. And since the session hash is the predominant state storage mechanism, with lots of code already written for different back-end session stores, whatever needs to be done has to piggyback on what’s already there.

The trick is to build context hashes in the session hash as needed, and do a little dance to pull up the right context when a session is requested. Whatever the mechanism, it needs to be virtually invisible, be able to be expanded for all sorts of uses, and it needs to work without requiring changes to existing Rails code, at least for simple controllers that don’t cross context.

So let’s let contexts be hashes, and for consistency’s sake we’ll call them subsessions. We’ll access a subsession in the same way as session in Rails, as a key-value hash. What’s more, we’ll change the accessor to session into supersession, and on demand, return the subsession context when the session accessor is requested.

Of course, a subsession has to come into existence at the right time. Subsessions need to overlay the session in ActionController::Base and be available to all of our controllers, so it makes sense to put the methods into the ApplicationController. A before_filter can create or look up our subsession each time a controller is created to service a request, all behind the scenes. What’s more, other contexts can piggyback on the subsessions scheme just by utilizing the context creation and access mechanisms.

We also need to be able to forget - we have to be able to cleanse contexts as needed. So we create a cleansing method that, given a context identifier, removes the subsession data. However, it’d also be nice to purge a whole lot of subsessions using a single key. The way to do this is to store the set of context identifiers in the supersession, each with an associated key that can be the same for multiple contexts. The cleanse method can purge a single context, or purge multiple contexts based on a test of the associated key. Having the context list is also nice for investigative purposes, possibly providing important clues when things have gone wrong.
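The registry idea can be sketched with plain hashes before any Rails plumbing is involved. Everything in this sketch is an assumption made for illustration: the context names, the :wizard and :report keys, and the helpers purge_context and purge_by_cleanser. It also deletes entries outright rather than setting them to nil.

```ruby
# A plain Hash stands in for the supersession. The :subsessions entry is the
# registry: it maps each context identifier to its associated cleanser key.
supersession = {
  :subsessions => { :Foo => :wizard, :Bar => :wizard, :Baz => :report },
  :Foo => { :mumble => 1 },
  :Bar => { :mumble => 2 },
  :Baz => { :mumble => 3 }
}

# Remove a single context: its data and its registry entry.
def purge_context(supersession, context)
  supersession.delete(context)
  supersession[:subsessions].delete(context)
end

# Remove every context registered under the given cleanser key.
def purge_by_cleanser(supersession, cleanser)
  # collect the matching identifiers first, then delete, so the registry
  # is never modified while it is being iterated
  matching = supersession[:subsessions].select { |_context, key| key == cleanser }.keys
  matching.each { |context| purge_context(supersession, context) }
end

purge_by_cleanser(supersession, :wizard)
# Only the :Baz context and its registry entry remain
```

Because the registry survives in the supersession, dumping supersession[:subsessions] shows at a glance which contexts a given user has touched.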

Finally, since we need to be able to make and access contexts, we need to decide on what basis a context can be identified. We’ll do what makes sense and is as versatile as possible. If we’re supplied a symbol, that’s a great accessor. If we’re given a string, we turn it into a symbol. If we’re given a Class or Module, we use the symbol corresponding to its name. And if we’re given an instance of an object, we use its class’s name. We add two more shortcuts when accessing the session - one to get the current subsession, and the other to get the supersession.
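Taken on its own, the identification rule is small. Here is a minimal sketch of it as a standalone function; the name context_name and the Billing and Invoice examples are hypothetical. Note that it uses case matching, so subclasses of String or Symbol are treated like their parents, which is slightly looser than a strict instance_of? test.

```ruby
# Hypothetical helper: map any supported identifier to a context symbol.
def context_name(identifier)
  case identifier
  when Symbol then identifier
  when String then identifier.to_sym
  when Module then identifier.name.to_sym   # also matches classes: Class inherits from Module
  else identifier.class.name.to_sym         # any other object: use its class's name
  end
end

# Made-up module and class for the demonstration
module Billing; end
class Invoice; end

context_name(:bar)          # the symbol itself
context_name("bar")         # the same context as :bar
context_name(Billing)       # named after the module
context_name(Invoice)       # named after the class
context_name(Invoice.new)   # the same context as Invoice
```

One consequence worth keeping in mind: a string is always read as a context name, never as "an instance of String".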

# application.rb

class ApplicationController < ActionController::Base
  before_filter :establish_session

  def cleanser
    true
  end

  alias_method :supersession, :session

  def session(klass = nil)
    if klass == nil
      @subsession
    elsif klass == :super
      supersession
    elsif klass.instance_of? String
      session klass.to_sym
    elsif klass.instance_of? Symbol
      affirm_subsessions
      affirm_subsession klass
    elsif (klass.instance_of? Class) || (klass.instance_of? Module)
      session klass.name
    else
      session klass.class.name
    end
  end

  def establish_session(klass = self, cleanser = self.cleanser)
    klass ||= self
    if klass.instance_of? String
      establish_session klass.to_sym, cleanser
    elsif klass.instance_of? Symbol
      affirm_subsessions
      @subsession = affirm_subsession klass, cleanser
    elsif (klass.instance_of? Class) || (klass.instance_of? Module)
      establish_session klass.name, cleanser
    else
      establish_session klass.class.name, cleanser
    end
  end

  def cleanse(klass = nil, cleanser = self.cleanser)
    if klass
      if klass.instance_of? String
        cleanse klass.to_sym
      elsif klass.instance_of? Symbol
        affirm_subsessions
        affirm_subsession klass
        if supersession[:subsessions][klass]
          supersession[klass] = nil
          supersession[:subsessions][klass] = nil
        end
      elsif (klass.instance_of? Class) || (klass.instance_of? Module)
        cleanse klass.name
      else
        cleanse klass.class.name
      end
    else
      supersession[:subsessions].each { |klass, autocleanse|
        cleanse klass if autocleanse == cleanser }
    end
  end

  def cleanse_and_establish_session(klass = nil, cleanser = self.cleanser)
    cleanse klass, cleanser
    establish_session klass, cleanser
  end

private

  def affirm_subsessions
    supersession[:subsessions] ||= {}
  end

  def affirm_subsession(klass = self, cleanser = self.cleanser)
    if ! (subsession = supersession[klass])
      supersession[:subsessions][klass] = cleanser
      subsession = supersession[klass] = {}
    end
    subsession
  end
end

All the mechanics are there: a context is created for each controller and is accessed through session as if nothing had happened.

Crossing Contexts

…that is unless we have to break out of our context. Happily, this is just a logical change to the way we access a session:

session[:foo]            foo in the current context
session(:super)[:foo]    foo in the global (supersession) context
session("bar")[:foo]     foo in the bar context
session(:bar)[:foo]      foo in the bar context
session(Mod)[:foo]       foo in the Mod context (where Mod is a Module)
session(Klass)[:foo]     foo in the Klass context (where Klass is a Class)
session(klass)[:foo]     foo in the Klass context (where klass is an instance of Klass)
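To see the effect without a running Rails app, here is a deliberately simplified stand-in. The ContextDemo class is hypothetical and is not the ApplicationController code: it handles only symbols and :super, and takes a plain Hash as the shared store. It does show the essential behaviour, which is that the same key no longer collides across contexts.

```ruby
# Simplified stand-in for a controller with a context-aware session accessor.
class ContextDemo
  attr_reader :supersession

  def initialize(supersession, context)
    @supersession = supersession   # the store shared by every "controller"
    @context = context             # this controller's own context
  end

  # No argument: the current context. :super: the whole store.
  # Any other symbol: that context, created on first use.
  def session(context = @context)
    return supersession if context == :super
    supersession[context] ||= {}
  end
end

store = {}                          # plays the role of one user's session
foo = ContextDemo.new(store, :foo)
bar = ContextDemo.new(store, :bar)

foo.session[:mumble] = "from foo"
bar.session[:mumble] = "from bar"   # same key, different context

foo.session[:mumble]                # still "from foo"
bar.session(:foo)[:mumble]          # "from foo", read across the boundary
foo.session(:super).keys            # the two contexts, :foo and :bar
```

Crossing a boundary is always explicit in the call, which makes such accesses easy to spot in a code review.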

Simple web applications won’t need much of this, as crossing contextual boundaries isn’t typical. But when Rails projects get large, as mine have begun to, maintaining disjoint contexts and being explicit about boundaries will save me a lot of sleep. And since it pretty much works invisibly, I can rest even easier.