Command Query Separation and SOA
Monday, August 11th, 2008.

One of the common questions I receive from people starting to use nServiceBus is how one-way messaging fits with showing the user a grid (or list) of data. Thinking about publish/subscribe usually just gets them even more confused. Trying to resolve all this with Service Oriented Architecture leaves them wondering - why bother?

In regular client-server development, the server is responsible for providing the client with all CRUD (create, read, update, and delete) capabilities. However, when users look at data they do not often require it to be up to date to the second (given that they often look at the same screen for several seconds to minutes at a time). As such, retrieving data from the same table as that being used for highly consistent transaction processing creates contention resulting in poor performance for all CRUD actions under higher load.
A Scalable Solution
One of the common answers to this question is for the server/service to publish a message when data changes (say, as the result of processing a message) and for clients to subscribe to these messages. When such a notification arrives at a client, the client would cache the data it needs. Then, when the user wants to see a grid of data, that data is already on the client. Of course, this solution doesn’t work so well for older client machines (like some point of service devices) or if there are millions of rows of data.
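To make the idea concrete, here is a minimal sketch of a client that caches published changes and serves the grid from that cache. It does not use the nServiceBus API; `Bus`, `ClientCache`, and the `CustomerChanged` message type are hypothetical names invented for illustration, and the bus is a simple in-process stand-in for a real transport.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Bus:
    """Hypothetical in-process stand-in for a publish/subscribe transport."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, message_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[message_type].append(handler)

    def publish(self, message_type: str, message: dict) -> None:
        # Deliver the message to every subscriber of this message type.
        for handler in self._handlers[message_type]:
            handler(message)


class ClientCache:
    """Client-side cache kept current by change notifications."""

    def __init__(self, bus: Bus) -> None:
        self._rows: Dict[int, dict] = {}
        bus.subscribe("CustomerChanged", self._on_customer_changed)

    def _on_customer_changed(self, message: dict) -> None:
        # The latest notification for an id replaces the cached row.
        self._rows[message["id"]] = message

    def grid_rows(self) -> List[dict]:
        # The grid reads local data only; no server round trip is needed.
        return [self._rows[key] for key in sorted(self._rows)]


bus = Bus()
cache = ClientCache(bus)
bus.publish("CustomerChanged", {"id": 2, "name": "Bob"})
bus.publish("CustomerChanged", {"id": 1, "name": "Alice"})
bus.publish("CustomerChanged", {"id": 2, "name": "Robert"})
```

By the time the user opens the grid, `cache.grid_rows()` already holds the latest version of each row, which is also why memory on the client becomes the limiting factor for very large data sets.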
The thing is that this solution is one implementation of a more general pattern - command query separation (CQS).
Command Query Separation
Wikipedia describes CQS as a pattern where "… every method should either be a command that performs an action, or a query that returns data to the caller, but not both. More formally, methods should return a value only if they are referentially transparent and hence possess no side effects."
Martin Fowler is less strict about the use of CQS allowing for exceptions: "Popping a stack is a good example of a modifier that modifies state. Meyer correctly says that you can avoid having this method, but it is a useful idiom. So I prefer to follow this principle when I can, but I’m prepared to break it to get my pop."
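As an illustration of both views, here is a small sketch of a stack written in strict CQS style, with the convenience `pop` layered on top. The class and method names (`top`, `remove_top`) are my own choices for this example, not taken from Meyer or Fowler.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        # Command: changes state, returns nothing.
        self._items.append(item)

    def top(self) -> T:
        # Query: returns data, changes nothing; safe to call repeatedly.
        return self._items[-1]

    def remove_top(self) -> None:
        # Command: changes state, returns nothing.
        del self._items[-1]

    def pop(self) -> T:
        # The pragmatic exception: a query and a command in one call.
        item = self.top()
        self.remove_top()
        return item


stack: Stack[int] = Stack()
stack.push(1)
stack.push(2)
first_look = stack.top()
second_look = stack.top()  # Same answer as before: the query had no side effects.
popped = stack.pop()       # Returns a value and modifies the stack.
```

Calling `top` twice gives the same answer, while calling `pop` twice would not; that difference is exactly what the principle is trying to make visible in the method signatures.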
So, how does separating commands from queries and SOA help at all in getting data to and from a UI? The answer is based on Pat Helland’s thinking as described in his article Data on the Inside vs. Data on the Outside.
Services Cross Boxes
The biggest lie around SOA is that services run.
Let that sink in a second.
Sure services have runnable components, but that’s not why they’re important.
I’ll skip the books of background and cut to the chase:
Services communicate with each other using publish/subscribe and one-way messaging. Services have components inside them. Inside a service, these components can communicate with each other using synchronous RPC, or any other mechanism. Also, these components can reside on different machines.
This is broader than just scaling out a service. There can be service components running on the client as well as the server.
SOA & CQS
Combining these two concepts together, here’s what comes out:
In this solution there are two services that span both client and server - one in charge of commands (create, update, delete), the other in charge of queries (read). These services communicate only via messages - one cannot access the database of the other.
The command service publishes messages about changes to data, to which the query service subscribes. When the query service receives such notifications, it saves the data in its own data store which may well have a different schema (optimized for queries like a star schema).
The client component which is in charge of showing grids of data to the user behaves the same as it would in a regular layered/tiered architecture, using synchronous blocking request/response to get its data - SOA doesn’t change that.
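The interaction between the two services can be sketched roughly as follows. This is an in-process approximation under stated assumptions: `CommandService`, `QueryService`, and the `OrderPlaced` event are hypothetical names, a direct callback stands in for real publish/subscribe messaging, and plain dictionaries stand in for the two databases.

```python
from typing import Callable, Dict, List, Tuple


class CommandService:
    """Owns the transactional store and publishes every change."""

    def __init__(self) -> None:
        self.orders: Dict[int, dict] = {}  # Store private to this service.
        self._subscribers: List[Callable[[dict], None]] = []

    def subscribe(self, handler: Callable[[dict], None]) -> None:
        self._subscribers.append(handler)

    def place_order(self, order_id: int, customer: str, amount: int) -> None:
        # Command: change state, then notify subscribers; nothing is returned.
        self.orders[order_id] = {"customer": customer, "amount": amount}
        event = {"type": "OrderPlaced", "order_id": order_id,
                 "customer": customer, "amount": amount}
        for handler in self._subscribers:
            handler(event)


class QueryService:
    """Keeps its own store, shaped for reading rather than for transactions."""

    def __init__(self) -> None:
        self._totals_by_customer: Dict[str, int] = {}

    def handle(self, event: dict) -> None:
        # Only messages reach this store; it never reads the command database.
        if event["type"] == "OrderPlaced":
            customer = event["customer"]
            current = self._totals_by_customer.get(customer, 0)
            self._totals_by_customer[customer] = current + event["amount"]

    def customer_totals(self) -> List[Tuple[str, int]]:
        # Query: a plain blocking call, as in any layered architecture.
        return sorted(self._totals_by_customer.items())


commands = CommandService()
queries = QueryService()
commands.subscribe(queries.handle)

commands.place_order(1, "alice", 100)
commands.place_order(2, "bob", 30)
commands.place_order(3, "alice", 50)
grid = queries.customer_totals()
```

The query side stores totals per customer rather than individual orders, a small example of a read store whose schema differs from the transactional one.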
Composite Applications
Although the client side components of both the command and query services are hosted in the same process, they are very much independent of each other. That being said, from an interoperability perspective (the one that most people attribute to SOA), all of the client-side components will likely be developed using the same technology - although there are already ways to host Java code in .NET and vice-versa.
Of course, once we talk about web UIs things are a bit different - but still similar. While web-server-side there may be a level of independence, for browser-side inter-component communications we’re still likely to target JavaScript. There, I’ve managed to say something technical supporting mashups and SOA without lying through my teeth.
On the Microsoft side with the recent release of the Composite Application Guidance & Library (pronounced "prism") I hope that more of these principles will be reaching the "smart client". The command pattern is especially critical in maintaining the separation while enabling communication to still occur so I’m glad that, as one of the Prism advisors, I was able to simplify that part (Glenn still has nightmares about that rooftop conversation).
Publish / Subscribe
In the "scalable solution" section up top I mentioned how publish/subscribe to the smart client is really just one implementation of CQS and SOA. So, how different is it really?
Well, there will probably be a different technology mapping. Instead of a star-schema OLAP product, we might simply store the published data in memory on the client. That is, if you designed your components to be technology agnostic.
In terms of the use of nServiceBus, the same component is going to be subscribing to the same type of message - all that’s different is that now every client will be having data pushed to them rather than this occurring server-side only.
You could have the same code deployed differently in the same system - stronger clients subscribing themselves, weaker ones using a remote server. Web servers would probably be considered stronger clients. This kind of flexible deployment has proven to be extremely valuable for my larger clients. The added benefit of enabling users to work (view data) even while offline (somewhere there’s no WIFI) is just icing on the cake.
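One way to picture this deployment flexibility is a grid that depends only on a small interface, with the subscribing component hosted either on the client or on a server. The sketch below is an assumption-laden illustration: `GridSource`, `SubscriberStore`, `RemoteSource`, and `render_grid` are hypothetical names, and a plain function call stands in for the blocking remote request.

```python
from typing import Callable, Dict, List, Protocol


class GridSource(Protocol):
    def rows(self) -> List[dict]: ...


class SubscriberStore:
    """The subscribing component: the same code on a strong client or a server."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}

    def handle(self, message: dict) -> None:
        self._rows[message["id"]] = message

    def rows(self) -> List[dict]:
        return [self._rows[key] for key in sorted(self._rows)]


class RemoteSource:
    """For weaker clients: forwards a blocking request to a remote store."""

    def __init__(self, fetch: Callable[[], List[dict]]) -> None:
        self._fetch = fetch  # Stand-in for a synchronous request/response call.

    def rows(self) -> List[dict]:
        return self._fetch()


def render_grid(source: GridSource) -> List[str]:
    # The grid code is identical no matter where the data lives.
    return [f"{row['id']}: {row['name']}" for row in source.rows()]


message = {"id": 1, "name": "Alice"}

strong_client = SubscriberStore()  # Subscribes itself; data held in memory.
strong_client.handle(message)

server_store = SubscriberStore()   # Same class, deployed server-side.
server_store.handle(message)
weak_client = RemoteSource(server_store.rows)

strong_view = render_grid(strong_client)
weak_view = render_grid(weak_client)
```

Both clients render the same grid; the only thing that changed between them is configuration, not the subscribing code.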
A Word of Warning
Once the client starts receiving notifications, and handling those on a background thread (as it should) the code becomes susceptible to deadlocks and data races. Juval does a good job of outlining some of those with respect to the use of WCF. Prism doesn’t provide any assurances in this area either.
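As a rough illustration of one defensive approach (not something nServiceBus, WCF, or Prism provides for you), the sketch below guards the client cache with a lock and hands the UI a copy of the data. `ThreadSafeCache` and `receive_notifications` are hypothetical names, and the worker threads simulate notifications arriving in the background.

```python
import threading
from typing import Dict, List


class ThreadSafeCache:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._rows: Dict[int, dict] = {}

    def apply(self, message: dict) -> None:
        # Called on background threads; the lock is held only for the update.
        with self._lock:
            self._rows[message["id"]] = dict(message)

    def snapshot(self) -> List[dict]:
        # Called by the UI; it gets copies, so later updates cannot
        # change rows that are already being displayed.
        with self._lock:
            return [dict(self._rows[key]) for key in sorted(self._rows)]


cache = ThreadSafeCache()


def receive_notifications(start: int, count: int) -> None:
    for i in range(start, start + count):
        cache.apply({"id": i, "name": f"row {i}"})


workers = [
    threading.Thread(target=receive_notifications, args=(n * 100, 100))
    for n in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

rows = cache.snapshot()
```

Keeping the locked sections short, and never calling into UI code while holding the lock, goes a long way toward avoiding the deadlocks mentioned above.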
Summary
nServiceBus is not designed to be used for any and all types of communication in a given architecture. In the examples above, nServiceBus handles the publish/subscribe but leaves the synchronous RPC to existing solutions like WCF. Not only that, but synchronous RPC does have its place in architecture, just not across service boundaries. In all cases, data is served to users from a store different from that which transaction processing logic uses.
Command Query Separation is not only a good idea at the method/class level but has advantages at the SOA/System level as well - yet another good idea from 20 years ago that services build upon. Making use of CQS requires understanding your data and its uses - SOA builds on that by looking into data volatility and the freshness business requirements around it.
Finally, designing the components of your services in such a way that their dependency on technology is limited buys a lot of flexibility in terms of deployment and, consequently, significant performance and scalability gains.
Simple, it is. Easy, it is not.
15 Comments
August 11th, 2008 at 8:34 am
Short question on CQS.
The “usual” return value of the CREATE command is a surrogate key of the inserted (or sometimes upserted) entity. Even if a natural key exists, it may be too clumsy to use. How should this (getting the identity of the created entity) be handled in CQS?
I mentioned upsert to underline that it’s not always possible to generate the identity on the client side.
August 11th, 2008 at 10:54 am
It’s kind of funny that the Wikipedia article illustrates CQS with a function that returns a value and changes the state. I think they’re saying that returning a value via an out parameter isn’t ‘returning a value’.
So, to my understanding, CQS is really ‘only change state in methods with returntype == void’.
August 11th, 2008 at 1:33 pm
Alex,
Actually, the SQL insert command doesn’t return the value of an identity column - you specifically have to query for it. That being said, all SQL statements do return the number of rows affected, an apparent violation of CQS.
Personally, I liked GUIDs for IDs.
August 11th, 2008 at 1:35 pm
Commenter,
Wikipedia is the outer join of the wisdom of crowds and the ignorance of herds, or so someone once told me
August 11th, 2008 at 3:06 pm
Even though you didn’t answer my question, your response gave me an idea: use a GUID as a correlation value for a subsequent query. As I explained, I cannot generate the ID on the client, since the record being “created” may already exist (the service automatically removes duplicate facts).
However, in this case, a new question arises: how long should the service keep the correlation value?
August 11th, 2008 at 3:22 pm
Why don’t you use async messages for any potentially remote communication? Why at all do you encourage people to use RPC? (Let’s leave streaming aside.) Is your position more like “RPC is OK for in-service remote communication” or more like “Avoid using Messages for in-service communication”? Could you elaborate more on this?
August 11th, 2008 at 3:28 pm
Off topic: the comments feed seems to be broken, I can’t subscribe. See http://feedvalidator.org/check.cgi?url=http%3a%2f%2fwww.udidahan.com%2fcomments%2ffeed%2f
August 11th, 2008 at 3:50 pm
Thanks for the great post, Udi. This applies to my current situation, and it has helped clarify things greatly.
Alex, we’re toying with the idea of using an ID broker service for things that we can’t have GUIDs for. This way a client can send a RequestNextId message, the server generates the next id (and reserves it so no future entities can use it), and then sends back a message to the original sender. The correlation is done automatically for you in nServiceBus (via the IdForCorrelation property of the TransportMessage class if you’re using the MSMQTransport). The client receives the message and knows which RequestNextId message it correlates to.
August 11th, 2008 at 4:19 pm
Alex,
I’m not quite sure I understand your question, but does the REST style solution I described in a previous post correlate to the way you’re thinking about using GUIDs?
If so, then I’d use a long-running process like nServiceBus’ sagas to manage the allocation and cleanup of those GUIDs. You’d probably want to experiment with various cleanup times until you settle on a cleanup policy that works for you.
Hope that helps.
August 11th, 2008 at 5:26 pm
Oh, now I see from where this idea has come to me
Thank you very much, indeed.
August 11th, 2008 at 10:11 pm
[…] the system within that paradigm, even when the fit seemed just terrible. Udi Dahan was kind enough to post about that exact topic in his blog last night, setting a critical thought-balloon free and allowing a system design […]
August 13th, 2008 at 2:20 am
Sergey,
> Why don’t you use async messages for any potentially remote communication?
When the architecture itself is synchronous, introducing one-way messaging brings quite a lot of complexity. Also, the robustness and scalability benefits are thwarted by the synchronous architecture.
You can’t be going both left and right at the same time.
> Why at all do you encourage people to use RPC?
I’m trying to encourage people to understand the tradeoffs between the various choices - one way/pub sub/rpc. What makes sense where, and why.
> Is your position more like “RPC is OK for in-service remote communication” or more like “Avoid using Messages for in-service communication”?
My position is that, if there’s any place where RPC will cause the minimal amount of damage, it’s within service boundaries.
I would definitely NOT suggest the latter. Messaging is such a strong paradigm that it requires the overall architecture to be aligned with it. When that occurs, you get tremendous benefits from using messaging within service boundaries as well.
Finally, the main point is to understand the different contexts for making decisions. Between services, under no circumstances should RPC be used - unless it’s an insignificant implementation detail in supporting a higher level messaging infrastructure (MSMQ makes use of RPC under the covers).
Hope that clears it up.
I’ll check on the comment feed, thanks.
October 21st, 2008 at 2:53 pm
@udidahan
Would it be fair to say that the upper layers mainly communicate with the Query services synchronously? I’m thinking here of getting all Customers, getting a Customer to edit and so on.
October 22nd, 2008 at 2:30 am
Colin,
That’s correct.
November 1st, 2008 at 4:57 pm
[…] How client interaction fits with SOA […]