by Dan Rosanova

Statelessness Part 1

Opinion
May 18, 2010
Enterprise Architecture

The Principles of Service Orientation series continues with this first part of a piece on statelessness.

I’ve hinted at the importance of statelessness in the Loose Coupling piece, and here we’ll look more deeply at what state is and its impact on project success.  State is one of the most difficult types of coupling to avoid because it is so common in computer systems. 

What is State?

State is the current configuration and settings of a system at a given point in time.  When I put my laptop down on the table it remains there until I move it; it maintains state.  When I save this document I am writing it stays that way until I work on it next; it too retains state.  On a more technical level, when I set a variable to a value my code now has state.  State is a natural concept for software developers; one they learn early on.  The fact that state is constant in all our human activities probably exacerbates the issue. 

Returning to my previous shopping cart example, the cart itself is a stateful construct.  Adding items to the cart or removing them changes its state.  When multiple operations are part of a single task they share state, in that the task is only complete when all the operations happen.  Most often these operations must occur in a specific order (they have temporal state), and they must share data to accomplish their task.  This is a type of dependency; each operation depends on the operations before it.  This data and order are part of the state of the task.  The more operations and data are shared, the more state there is, and the more cumbersome the management of such operations becomes.

Any time two or more operations (meaning requests) are conjoined as a single unit of work (or are required to perform a single task), state is necessarily involved.  This is because the requests are related and someone has to pay a cost for correlating them; that is, for matching the first request with the second.
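
To make the correlation cost concrete, here is a minimal sketch of a stateful cart service.  The class, its method names, and the use of a generated session ID are all assumptions made for illustration; they are not taken from any particular framework.

```python
import uuid


class StatefulCartService:
    """Hypothetical service that keeps each cart on the server between requests."""

    def __init__(self):
        # Server-side state: one entry per in-progress task, keyed by session ID.
        self._sessions = {}

    def begin(self):
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = []
        return session_id

    def add_item(self, session_id, sku, quantity):
        # Correlation cost: every request must be matched to the earlier ones.
        cart = self._sessions[session_id]  # KeyError if the session is unknown
        cart.append((sku, quantity))

    def checkout(self, session_id):
        # The task completes only if the earlier operations happened, in order.
        cart = self._sessions.pop(session_id)
        if not cart:
            raise ValueError("checkout called before any add_item")
        return {"lines": cart}


service = StatefulCartService()
sid = service.begin()
service.add_item(sid, "BOOK-1", 2)
order = service.checkout(sid)
```

Every call after begin() depends on an earlier call having succeeded, and the server has to hold the dictionary of sessions for as long as any consumer might come back.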

Many developers who came of age (in the programming sense) during the Client Server era have the state mentality deeply ingrained and will fall back on it when tackling newer problems, despite the fact that modern tools and technologies are designed for a different approach.  This is a dangerous path into stateful, tightly coupled code.

Why is State Expensive?

So what is wrong with all this?  Why is state bad?  State is bad because it increases complexity.  State involves several parts moving in unison in order to work correctly.  The separate operations must be coordinated and the progress tracked (normally on both the client and the server); this is most often done through the use of Sessions.  Sessions are conduits through which this data passes as well as the mechanism for connecting separate operations into the single task.  Any time you open a database connection, or even a terminal screen, you are using a session.  Sessions must do a lot of coordination in order to facilitate the work being performed.  They must match multiple requests from the client to the appropriate session in progress (normally a specific thread of execution).  Most frameworks do this for us, but as we scale these sessions become more cumbersome.  The reason is that a session is in effect an open connection, and maintaining one across a disconnected, stateless protocol like HTTP requires a lot of abstraction and overhead.

Even the quasi-automatic routing of requests to the server that holds their session, often called session affinity, becomes an issue as you scale.  If the session is held in a web server process, every request from that user for that session must go to the same web server.  Alternatively, the session could be stored in a shared resource, like a database, but doing so slows every operation as the session is loaded back into process by the application (the web server).
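
The affinity problem can be sketched in a few lines.  The WebServer class, the pick_server function, and the hash-based routing rule below are hypothetical stand-ins for what a load balancer and web framework would do; real products use a variety of mechanisms.

```python
import hashlib


class WebServer:
    """Hypothetical web server that keeps sessions in its own process memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def handle(self, session_id, item):
        cart = self.sessions.setdefault(session_id, [])
        cart.append(item)
        return list(cart)


def pick_server(session_id, servers):
    # Affinity: a stable hash of the session ID always selects the same server.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


servers = [WebServer("web-1"), WebServer("web-2"), WebServer("web-3")]
home = pick_server("session-42", servers)
home.handle("session-42", "book")
cart_on_home = home.handle("session-42", "pen")

# Any other server has never seen this session and starts from an empty cart.
other = next(s for s in servers if s is not home)
cart_on_other = other.handle("session-42", "ink")
```

As long as routing is stable the cart survives, but a request that lands on any other server sees none of the earlier items.  Adding or removing a server also changes the modulus, which would move existing sessions away from the process that holds them.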

And it gets worse. Besides simply coordinating requests and data, there are issues around session lifetime management.  How long should a session be valid?  What should be done when a session is no longer valid?  Who cleans up whatever resources the session used during its lifetime?  For some older client server applications this was easy; when the application closed the session was over.  This shows how tightly bound the application was to its session.  In the distributed world this is a much more difficult problem to solve.  Unless you’re using a single, open, persistent TCP/IP connection there is no definite lifetime for a session.

How does state relate to coupling?

Finally, the sequence of steps and data exchanged in state-heavy processes creates a condition we discussed in Loose Coupling.  This prescribed sequence of operations and data exchanges creates glue that makes our programs stay in a specific form.  This is an implicit contract, and is especially ill-advised, because it is not expressed via the service itself, but only through outside documentation, if at all. 

State makes it harder to change our services and applications.  You can’t just create a new mandatory middle step without coordinating with all of your consumers.  That would break their existing applications.  State affects coupling: generally, more state increases it and less state decreases it.

There are many technical reasons why state is bad, but for now let’s stick with these: it’s complicated, it requires a lot of resources and overhead, and it creates an implicit contract.

How can you avoid state?

There are several strategies one can use to avoid state in services and they work better in specific situations.  The easiest, perhaps, is to defer to the consumer.  This is what classic client server applications did.  The client was responsible for tracking and taking care of its own session.  This works, and our shopping cart example fits this type of state deferral pattern.  The web portal must keep its own shopping cart and, when the user is ready, submit it to the Order service. 
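
Deferring to the consumer might look like the following sketch.  PortalCart, submit_order, and the shape of the order message are assumptions made for illustration; the point is only that the Order service receives one self-contained request.

```python
from dataclasses import dataclass, field


@dataclass
class PortalCart:
    """Cart kept by the consumer (the web portal), not by the Order service."""
    lines: dict = field(default_factory=dict)

    def add(self, sku, quantity):
        self.lines[sku] = self.lines.get(sku, 0) + quantity

    def remove(self, sku):
        self.lines.pop(sku, None)

    def to_order(self, customer_id):
        # One self-contained message carrying everything the service needs.
        return {"customer_id": customer_id, "lines": dict(self.lines)}


def submit_order(order):
    """Hypothetical Order service operation: no session, no earlier calls required."""
    if not order["lines"]:
        raise ValueError("order has no lines")
    return {"status": "accepted", "line_count": len(order["lines"])}


cart = PortalCart()
cart.add("BOOK-1", 2)
cart.add("PEN-7", 1)
cart.remove("PEN-7")
result = submit_order(cart.to_order("customer-1"))
```

All the adding and removing happens on the portal's side.  The service never learns that a pen was briefly in the cart, and it has no session to create, look up, or clean up.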

Another popular technique is to defer to the ultimate destination application to which the service is a conduit.  This is less desirable, though also an option.  I say less desirable because a service so designed needs to carry information that really has nothing to do with the service itself, perhaps a session ID or token of some sort.  If this route is necessary, using a custom SOAP header for this information rather than placing it in the request message bodies is better.  The reason is twofold.  First, this information has nothing to do with the operation you are providing; it is a technical detail.  Second, it is likely to be needed on other operations, so creating a SOAP header for it saves developers from having to put this information into every one of their operation input messages.
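
As a rough sketch of the header approach, the snippet below builds a SOAP 1.1 envelope with Python's standard xml.etree.ElementTree module.  The SessionToken element, its urn:example:session namespace, and the GetOrderStatus operation are hypothetical names chosen for illustration; a real service would define its own in its contract.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SESSION_NS = "urn:example:session"  # hypothetical namespace for the custom header


def build_envelope(session_token, body_element):
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    # The token travels in the header, apart from the operation's own data.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    token = ET.SubElement(header, f"{{{SESSION_NS}}}SessionToken")
    token.text = session_token
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(body_element)
    return envelope


# Hypothetical operation message: it contains only business data.
request = ET.Element("GetOrderStatus")
ET.SubElement(request, "OrderId").text = "1001"

envelope = build_envelope("abc123", request)
xml_text = ET.tostring(envelope, encoding="unicode")
```

Because build_envelope accepts any body element, the same header can wrap every operation without the token appearing in any of the operation input messages.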

Often the best approach is simply to design services not to need state at all.  This can be done by making messages more coarsely grained and requiring them to carry more information in their payload than would be the case for stateful services.  Thinking in more coarse-grained terms can also be paired with bunching operations together. Both these approaches will be covered in Avoiding State by Design. 

Where does this all fit into Services?

Statelessness is critical to service orientation because of the unpredictable nature of networks and distributed computing in general.  We may take this for granted, especially on small scale implementations. But large distributed systems feel this impact as transient errors, slow performance, and difficulty in change. 

Services of all flavors are built on the premise and success of HTTP (i.e. the Web).  HTTP is perhaps the most successful protocol ever created.  It can be argued that the very success of the Internet is largely due to HTTP’s stateless nature.  This has allowed the web to grow in the way that it has.

This argument throws off a lot of people, even technical people, because almost everything done on the Internet these days appears to have state.  Yet the best-designed and most scalable services and applications actually do not have much state.

Next Steps

This piece ran longer than I planned, so I have broken out Avoiding State by Design into its own piece, which will be posted shortly.