<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Udi Dahan - The Software Simplist &#187; Scalability</title>
	<atom:link href="http://www.udidahan.com/category/scalability/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.udidahan.com</link>
	<description>Enterprise Development Expert &#38; SOA Specialist</description>
	<lastBuildDate>Tue, 31 Aug 2010 09:56:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>CQRS Video Online</title>
		<link>http://www.udidahan.com/2010/02/26/cqrs-video-online/</link>
		<comments>http://www.udidahan.com/2010/02/26/cqrs-video-online/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 09:42:45 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[CQRS]]></category>
		<category><![CDATA[Community]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Validation]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/?p=1184</guid>
		<description><![CDATA[A couple of weeks ago I gave a talk on Command/Query Responsibility Segregation in London. 
The recording of the talk is online here.
There is one important thing that I didn&#8217;t have enough time to cover, but I want you to keep in mind as you&#8217;re watching this. It is that CQRS is applicable only *within* [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago I gave a talk on Command/Query Responsibility Segregation in London. </p>
<p>The recording of the talk is online <a href="http://skillsmatter.com/podcast/open-source-dot-net/udi-dahan-command-query-responsibility-segregation/rl-311">here</a>.</p>
<p>There is one important thing that I didn&#8217;t have enough time to cover, but I want you to keep in mind as you&#8217;re watching this. It is that CQRS is applicable only *within* the context of a single service/BC &#8211; NOT across or between them.</p>
<p>Let me know what you think.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2010/02/26/cqrs-video-online/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Scalability Podcast on Herding Code</title>
		<link>http://www.udidahan.com/2010/01/11/scalability-podcast-on-herding-code/</link>
		<comments>http://www.udidahan.com/2010/01/11/scalability-podcast-on-herding-code/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 19:44:49 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Community]]></category>
		<category><![CDATA[Podcast]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/?p=1168</guid>
		<description><![CDATA[The great folks over at Herding Code were nice enough to interview me back in November as I was over in Paris giving my 5-day SOA course. We talked about quite a lot of topics related to scalability.
Click here for the full list of topics and to download the podcast.
Let me know what you think [...]]]></description>
			<content:encoded><![CDATA[<p>The great folks over at <a href="http://www.herdingcode.com">Herding Code</a> were nice enough to interview me back in November as I was over in Paris giving my <a href="http://www.UdiDahan.com/training">5-day SOA course</a>. We talked about quite a lot of topics related to scalability.</p>
<p><a href="http://herdingcode.com/?p=229">Click here</a> for the full list of topics and to download the podcast.</p>
<p>Let me know what you think, or post any questions you may have, in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2010/01/11/scalability-podcast-on-herding-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Clarified CQRS</title>
		<link>http://www.udidahan.com/2009/12/09/clarified-cqrs/</link>
		<comments>http://www.udidahan.com/2009/12/09/clarified-cqrs/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 14:57:19 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Validation]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/?p=1149</guid>
		<description><![CDATA[
After listening to how the community has interpreted Command-Query Responsibility Segregation I think that the time has come for some clarification. Some have been tying it to Event Sourcing. Most have been overlaying their previous layered architecture assumptions on it. Here I hope to identify CQRS itself, and describe in which places it can connect [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/wp-content/uploads/clarification.png" style="float:right; margin-left:10px; margin-bottom:10px" alt="clarification" title="clarification" /><br />
After listening to how the community has interpreted Command-Query Responsibility Segregation I think that the time has come for some clarification. Some have been tying it to Event Sourcing. Most have been overlaying their previous layered architecture assumptions on it. Here I hope to identify CQRS itself, and describe in which places it can connect to other patterns.</p>
<p><a href="/wp-content/uploads/Clarified_CQRS.pdf">Download as PDF</a> &#8211; this is quite a long post.</p>
<h3>Why CQRS</h3>
<p>Before describing the details of CQRS we need to understand the two main driving forces behind it: collaboration and staleness.</p>
<p>Collaboration refers to circumstances under which multiple actors will be using/modifying the same set of data &#8211; whether or not the intention of the actors is actually to collaborate with each other. There are often rules which indicate which user can perform which kind of modification and modifications that may have been acceptable in one case may not be acceptable in others. We&#8217;ll give some examples shortly. Actors can be human like normal users, or automated like software. </p>
<p>Staleness refers to the fact that in a collaborative environment, once data has been shown to a user, that same data may have been changed by another actor &#8211; it is stale. Almost any system which makes use of a cache is serving stale data &#8211; often for performance reasons. What this means is that we cannot entirely trust our users&#8217; decisions, as they could have been made based on out-of-date information.</p>
<p>Standard layered architectures don&#8217;t explicitly deal with either of these issues. While putting everything in the same database may be one step in the direction of handling collaboration, staleness is usually exacerbated in those architectures by the use of caches as a performance-improving afterthought.</p>
<h3>A picture for reference</h3>
<p>I&#8217;ve given some talks about CQRS using this diagram to explain it:</p>
<p><img src="/wp-content/uploads/cqrs.png" width="500" height="319" alt="CQRS" title="CQRS" /></p>
<p>The boxes named AC are Autonomous Components. We&#8217;ll describe what makes them autonomous when discussing commands. But before we go into the complicated parts, let&#8217;s start with queries:</p>
<h3>Queries</h3>
<p>If the data we&#8217;re going to be showing users is stale anyway, is it really necessary to go to the master database and get it from there? Why transform those 3rd normal form structures to domain objects if we just want data &#8211; not any rule-preserving behaviors? Why transform those domain objects to DTOs to transfer them across a wire, and who said that wire has to be exactly there? Why transform those DTOs to view model objects?</p>
<p>In short, it looks like we&#8217;re doing a heck of a lot of unnecessary work based on the assumption that reusing code that has already been written will be easier than just solving the problem at hand. Let&#8217;s try a different approach:</p>
<p>How about we create an additional data store whose data can be a bit out of sync with the master database &#8211; I mean, the data we&#8217;re showing the user is stale anyway, so why not reflect that in the data store itself? We&#8217;ll come up with an approach later to keep this data store more or less in sync.</p>
<p>Now, what would be the correct structure for this data store? How about just like the view model? One table for each view. Then our client could simply SELECT * FROM MyViewTable (or possibly pass in an ID in a where clause), and bind the result to the screen. That would be just as simple as can be. You could wrap that up with a thin facade if you feel the need, or with stored procedures, or using <a href="http://automapper.codeplex.com/">AutoMapper</a> which can simply map from a data reader to your view model class. The thing is that the view model structures are already wire-friendly, so you don&#8217;t need to transform them to anything else.</p>
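<p>To make the "one table for each view" idea concrete, here is a minimal Python sketch using an in-memory SQLite database as a stand-in for the query data store. The table name, the columns, and the <code>CustomerSummaryView</code> class are assumptions made up for illustration; the point is only that the read path is a single SELECT bound straight to the view model.</p>

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class CustomerSummaryView:
    # Hypothetical view model: one field per column shown on the screen.
    customer_id: int
    display_name: str
    is_preferred: bool


def load_customer_summary(conn, customer_id):
    """Read one row from the per-view table and bind it to the view model."""
    row = conn.execute(
        "SELECT customer_id, display_name, is_preferred "
        "FROM CustomerSummaryView WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        return None
    return CustomerSummaryView(row[0], row[1], bool(row[2]))


# Stand-in for the query data store; something else is assumed to keep it in sync.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerSummaryView ("
    "customer_id INTEGER PRIMARY KEY, display_name TEXT, is_preferred INTEGER)"
)
conn.execute("INSERT INTO CustomerSummaryView VALUES (1, 'Ms. Jane Doe', 0)")

view = load_customer_summary(conn, 1)
missing = load_customer_summary(conn, 2)
```

<p>There are no joins, no domain objects and no DTOs on this path: the row already has the shape of the screen.</p>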
<p>You could even consider taking that data store and putting it in your web tier. It&#8217;s just as secure as an in-memory cache in your web tier. Give your web servers SELECT only permissions on those tables and you should be fine.</p>
<h3>Query Data Storage</h3>
<p>While you can use a regular database as your query data store it isn&#8217;t the only option. Consider that the query schema is in essence identical to your view model. You don&#8217;t have any relationships between your various view model classes, so you shouldn&#8217;t need any relationships between the tables in the query data store.</p>
<p>So do you actually need a <i>relational</i> database?</p>
<p>The answer is no, but for all practical purposes and due to organizational inertia, it is probably your best choice (for now).</p>
<h3>Scaling Queries</h3>
<p>Since your queries are now being performed against a data store separate from your master database, and there is no assumption that the data that&#8217;s being served is 100% up to date, you can easily add more instances of these stores without worrying that they don&#8217;t contain the exact same data. The same mechanism that updates one instance can be used for many instances, as we&#8217;ll see later.</p>
<p>This gives you cheap horizontal scaling for your queries. Also, since you&#8217;re not doing nearly as much transformation, the latency per query goes down as well. Simple code is fast code.</p>
<h3>Data modifications</h3>
<p>Since our users are making decisions based on stale data, we need to be more discerning about which things we let through. Here&#8217;s a scenario explaining why:</p>
<p>Let&#8217;s say we have a customer service representative who is on the phone with a customer. This user is looking at the customer&#8217;s details on the screen and wants to make them a &#8216;preferred&#8217; customer, as well as modifying their address, changing their title from Ms to Mrs, changing their last name, and indicating that they&#8217;re now married. What the user doesn&#8217;t know is that after opening the screen, an event arrived from the billing department indicating that this same customer doesn&#8217;t pay their bills &#8211; they&#8217;re delinquent. At this point, our user submits their changes.</p>
<p>Should we accept their changes?</p>
<p>Well, we should accept some of them, but not the change to &#8216;preferred&#8217;, since the customer is delinquent. But writing those kinds of checks is a pain &#8211; we need to do a diff on the data, infer what the changes mean, which ones are related to each other (name change, title change) and which are separate, identify which data to check against &#8211; not just compared to the data the user retrieved, but compared to the current state in the database, and then reject or accept. </p>
<p>Unfortunately for our users, we tend to reject the whole thing if any part of it is off. At that point, our users have to refresh their screen to get the up-to-date data, and retype in all the previous changes, hoping that this time we won&#8217;t yell at them because of an optimistic concurrency conflict.</p>
<p>As we get larger entities with more fields on them, we also get more actors working with those same entities, and the higher the likelihood that something will touch some attribute of them at any given time, increasing the number of concurrency conflicts. </p>
<p>If only there were some way for our users to provide us with the right level of granularity and intent when modifying data. That&#8217;s what commands are all about.</p>
<h3>Commands</h3>
<p>A core element of CQRS is rethinking the design of the user interface to enable us to capture our users&#8217; intent such that making a customer preferred is a different unit of work for the user than indicating that the customer has moved or that they&#8217;ve gotten married. Using an Excel-like UI for data changes doesn&#8217;t capture intent, as we saw above.</p>
<p>We could even consider allowing our users to submit a new command even before they&#8217;ve received confirmation on the previous one. We could have a little widget on the side showing the user their pending commands, checking them off asynchronously as we receive confirmation from the server, or marking them with an X if they fail. The user could then double-click that failed task to find information about what happened.</p>
<p>Note that the client <i>sends</i> commands to the server &#8211; it doesn&#8217;t publish them. Publishing is reserved for events which state a fact &#8211; that something has happened, and that the publisher has no concern about what receivers of that event do with it.</p>
<h3>Commands and Validation</h3>
<p>In thinking through what could make a command fail, one topic that comes up is validation. Validation is different from business rules in that it states a context-independent fact about a command. Either a command is valid, or it isn&#8217;t. Business rules on the other hand are context dependent.</p>
<p>In the example we saw before, the data our customer service rep submitted was valid; it was only the earlier arrival of the billing event that required the command to be rejected. Had that billing event not arrived, the data would have been accepted.</p>
<p>Even though a command may be valid, there still may be reasons to reject it.</p>
<p>As such, validation can be performed on the client, checking that all fields required for that command are there, number and date ranges are OK, that kind of thing. The server would still validate all commands that arrive, not trusting clients to do the validation.</p>
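<p>The distinction between validation and business rules can be sketched in a few lines of Python. The command class, its single field, and the set of delinquent customer IDs are all hypothetical names chosen for this illustration, not part of any real framework.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MakeCustomerPreferred:
    # Hypothetical command shape; only the ID travels with it.
    customer_id: int


def validate(command):
    """Context-independent: the answer never depends on server state."""
    errors = []
    if not isinstance(command.customer_id, int) or not command.customer_id >= 1:
        errors.append("customer_id must be a positive integer")
    return errors


def violates_business_rule(command, delinquent_customer_ids):
    """Context-dependent: the answer changes as other actors change state."""
    return command.customer_id in delinquent_customer_ids


command = MakeCustomerPreferred(customer_id=42)
is_valid = validate(command) == []

# The same valid command is accepted or rejected depending on the billing event.
rejected_before_billing_event = violates_business_rule(command, set())
rejected_after_billing_event = violates_business_rule(command, {42})
```

<p>The <code>validate</code> function could run unchanged on both client and server, while the business rule check only makes sense on the server, next to the current state.</p>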
<h3>Rethinking UIs and commands in light of validation</h3>
<p>The client can make use of the query data store when validating commands. For example, before submitting a command that the customer has moved, we can check that the street name exists in the query data store.</p>
<p>At that point, we may rethink the UI and have an auto-completing text box for the street name, thus ensuring that the street name we&#8217;ll pass in the command will be valid. But why not take things a step further? Why not pass in the street ID instead of its name? Have the command represent the street not as a string, but as an ID (int, guid, whatever).</p>
<p>On the server side, the only reason that such a command would fail would be due to concurrency &#8211; that someone had deleted that street and that that hadn&#8217;t been reflected in the query store yet; a fairly exceptional set of circumstances. </p>
<h3>Reasons valid commands fail and what to do about it</h3>
<p>So we&#8217;ve got a well-behaved client that is sending valid commands, yet the server still decides to reject them. Often the circumstances for the rejection are related to other actors changing state relevant to the processing of that command.</p>
<p>In the CRM example above, it is only because the billing event arrived first. But &#8220;first&#8221; could be a millisecond before our command. What if our user pressed the button a millisecond earlier? Should that actually change the <b>business outcome</b>? Shouldn&#8217;t we expect our system to behave the same when observed from the outside?</p>
<p>So, if the billing event arrived second, shouldn&#8217;t that revert preferred customers to regular ones? Not only that, but shouldn&#8217;t the customer be notified of this, like by sending them an email? In which case, why not have this be the behavior for the case where the billing event arrives first? And if we&#8217;ve already got a notification model set up, do we really need to return an error to the customer service rep? I mean, it&#8217;s not like they can do anything about it <b>other than notifying the customer</b>.</p>
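<p>One way to picture this order-independence is the following Python sketch. The <code>Customer</code> class, the two handler functions, and the notification text are assumptions for illustration only. Each handler is written so that the final state and the notification sent are the same whichever message arrives first.</p>

```python
from dataclasses import dataclass, field

NOTICE = "preferred status not available: account is delinquent"


@dataclass
class Customer:
    preferred: bool = False
    delinquent: bool = False
    notifications: list = field(default_factory=list)


def handle_make_preferred(customer):
    """Command from the customer service rep."""
    if customer.delinquent:
        customer.notifications.append(NOTICE)   # notify instead of returning an error
    else:
        customer.preferred = True


def handle_marked_delinquent(customer):
    """Event from billing."""
    customer.delinquent = True
    if customer.preferred:
        customer.preferred = False              # revert, then notify the same way
        customer.notifications.append(NOTICE)


def run(handlers):
    customer = Customer()
    for handler in handlers:
        handler(customer)
    return customer


command_first = run([handle_make_preferred, handle_marked_delinquent])
event_first = run([handle_marked_delinquent, handle_make_preferred])
```

<p>In both orderings the customer ends up delinquent, not preferred, with exactly one notification, so a millisecond of difference in timing does not change the business outcome.</p>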
<p>So, if we&#8217;re not returning errors to the client (who is already sending us valid commands), maybe all we need to do on the client when sending a command is to tell the user &#8220;thank you, you will receive confirmation via email shortly&#8221;. We don&#8217;t even need the UI widget showing pending commands. </p>
<h3>Commands and Autonomy</h3>
<p>What we see is that in this model, commands don&#8217;t need to be processed immediately &#8211; they can be queued. How fast they get processed is a question of Service-Level Agreement (SLA) and not architecturally significant. This is one of the things that makes that node that processes commands autonomous from a runtime perspective &#8211; we don&#8217;t require an always-on connection to the client.</p>
<p>Also, we shouldn&#8217;t need to access the query store to process commands &#8211; any state that is needed should be managed by the autonomous component &#8211; that&#8217;s part of the meaning of autonomy.</p>
<p>Another part is the issue of failed message processing due to the database being down or hitting a deadlock. There is no reason that such errors should be returned to the client &#8211; we can just roll back and try again. When an administrator brings the database back up, all the messages waiting in the queue will then be processed successfully and our users receive confirmation.</p>
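<p>A toy version of this retry behavior, assuming an in-process queue and a simulated database that fails a fixed number of times, might look like the Python sketch below. <code>FlakyDatabase</code>, <code>DatabaseUnavailable</code>, and <code>drain</code> are hypothetical names; a real message bus would handle this for you and would normally wait between attempts.</p>

```python
from collections import deque


class DatabaseUnavailable(Exception):
    """Stand-in for a transient failure such as an outage or a deadlock."""


class FlakyDatabase:
    # Hypothetical store that fails a fixed number of times before recovering.
    def __init__(self, failures_before_recovery):
        self.remaining_failures = failures_before_recovery
        self.committed = []

    def apply(self, message):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise DatabaseUnavailable()
        self.committed.append(message)


def drain(queue, database, max_attempts=10):
    """Process messages in order; a failed message stays at the head of the queue."""
    attempts = 0
    for _ in range(max_attempts):
        if not queue:
            break
        attempts += 1
        try:
            database.apply(queue[0])    # peek; remove only after a successful commit
        except DatabaseUnavailable:
            continue                    # nothing was removed, so simply try again
        queue.popleft()
    return attempts


queue = deque(["MakeCustomerPreferred:42", "ChangeAddress:42"])
database = FlakyDatabase(failures_before_recovery=3)
attempts = drain(queue, database)
```

<p>The client never sees the three failures; it only learns, eventually, that both commands were processed.</p>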
<p>The system as a whole is quite a bit more robust to any error conditions.</p>
<p>Also, since we don&#8217;t have queries going through this database any more, the database itself is able to keep more rows/pages in memory which serve commands, improving performance. When both commands and queries were being served off of the same tables, the database server was always juggling rows between the two.</p>
<h3>Autonomous Components</h3>
<p>While in the picture above we see all commands going to the same AC, we could logically have each command processed by a different AC, each with its own queue. That would give us visibility into which queue was the longest, letting us see very easily which part of the system was the bottleneck. While this is interesting for developers, it is critical for system administrators.</p>
<p>Since commands wait in queues, we can now add more processing nodes behind those queues (using the distributor with NServiceBus) so that we&#8217;re only scaling the part of the system that&#8217;s slow. No need to waste servers on any other requests.</p>
<h3>Service Layers</h3>
<p>Our command processing objects in the various autonomous components actually make up our service layer. The reason you don&#8217;t see this layer explicitly represented in CQRS is that it isn&#8217;t really there, at least not as an identifiable logical collection of related objects &#8211; here&#8217;s why:</p>
<p>In the <a href="http://en.wikipedia.org/wiki/Multitier_architecture">layered architecture</a> (AKA 3-Tier) approach, there is no statement about dependencies between objects within a layer, or rather it is implied to be allowed. However, when taking a command-oriented view on the service layer, what we see are objects handling different types of commands. Each command is independent of the other, so why should we allow the objects which handle them to depend on each other?</p>
<p>Dependencies are things which should be avoided, unless there is good reason for them.</p>
<p>Keeping the command handling objects independent of each other will allow us to more easily version our system, one command at a time, not needing even to bring down the entire system, given that the new version is backwards compatible with the previous one.</p>
<p>Therefore, keep each command handler in its own VS project, or possibly even in its own solution, thus guiding developers away from introducing dependencies in the name of reuse (it&#8217;s a <a href="http://www.udidahan.com/2009/06/07/the-fallacy-of-reuse/">fallacy</a>). If you do decide, <b>as a deployment concern</b>, that you want to put them all in the same process feeding off of the same queue, you can ILMerge those assemblies and host them together, but understand that you will be undoing many of the benefits of your autonomous components.</p>
<h3>Whither the domain model?</h3>
<p>Although in the diagram above you can see the domain model beside the command-processing autonomous components, it&#8217;s actually an implementation detail. There is nothing that states that all commands <i>must</i> be processed by the same domain model. Arguably, you could have some commands be processed by <a href="http://martinfowler.com/eaaCatalog/transactionScript.html">transaction script</a>, others using <a href="http://martinfowler.com/eaaCatalog/tableModule.html">table module</a> (AKA active record), as well as those using the <a href="http://martinfowler.com/eaaCatalog/domainModel.html">domain model</a>. Event-sourcing is another possible implementation.</p>
<p>Another thing to understand about the domain model is that it now isn&#8217;t used to serve queries. So the question is, why do you need to have so many relationships between entities in your domain model?</p>
<p>(You may want to take a second to let that sink in.)</p>
<p>Do we really need a collection of orders on the customer entity? In what command would we need to navigate that collection? In fact, what kind of command would need <i>any</i> one-to-many relationship? And if that&#8217;s the case for one-to-many, many-to-many would definitely be out as well. I mean, most commands only contain one or two IDs in them anyway.</p>
<p>Any aggregate operations that may have been calculated by looping over child entities could be pre-calculated and stored as properties on the parent entity. Following this process across all the entities in our domain would result in isolated entities needing nothing more than a couple of properties for the IDs of their related entities &#8211; &#8220;children&#8221; holding the parent ID, like in databases.</p>
<p>In this form, commands could be entirely processed by a single entity &#8211; voil&#224;, an aggregate root that is a consistency boundary.</p>
<h3>Persistence for command processing</h3>
<p>Given that the database used for command processing is not used for querying, and that most (if not all) commands contain the IDs of the rows they&#8217;re going to affect, do we really need to have a column for every single domain object property? What if we just serialized the domain entity and put it into a single column, and had another column containing the ID? This sounds quite similar to key-value storage that is available in the various cloud providers. In which case, would you really need an object-relational mapper to persist to this kind of storage? </p>
<p>You could also pull out an additional property per piece of data where you&#8217;d want the &#8220;database&#8221; to enforce uniqueness. </p>
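<p>As a rough sketch of that storage shape, the Python below keeps each entity as a serialized blob next to its ID, with one extra column pulled out purely so the store can enforce uniqueness. SQLite and JSON are stand-ins chosen for the example, and the table, column, and field names are assumptions.</p>

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id TEXT PRIMARY KEY, "      # the key the commands carry
    "email TEXT UNIQUE, "        # pulled out only so the store enforces uniqueness
    "body TEXT NOT NULL)"        # the whole entity, serialized
)


def save(customer):
    """Update the row for this ID if it exists, otherwise insert it."""
    body = json.dumps(customer)
    cursor = conn.execute(
        "UPDATE customers SET email = ?, body = ? WHERE id = ?",
        (customer["email"], body, customer["id"]),
    )
    if cursor.rowcount == 0:
        conn.execute(
            "INSERT INTO customers (id, email, body) VALUES (?, ?, ?)",
            (customer["id"], customer["email"], body),
        )


def load(customer_id):
    row = conn.execute(
        "SELECT body FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


save({"id": "c1", "email": "jane@example.com", "preferred": False})
save({"id": "c1", "email": "jane@example.com", "preferred": True})   # update in place

try:
    save({"id": "c2", "email": "jane@example.com", "preferred": False})
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

<p>Nothing here needs an object-relational mapper: loading is a lookup by ID, and saving is a single write of the serialized entity.</p>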
<p>I&#8217;m not suggesting that you do this in all cases &#8211; rather just trying to get you to rethink some basic assumptions.</p>
<h3>Let me reiterate</h3>
<p>How you process the commands is an implementation detail of CQRS.</p>
<h3>Keeping the query store in sync</h3>
<p>After the command-processing autonomous component has decided to accept a command, modifying its persistent store as needed, it publishes an event notifying the world about it. This event often is the &#8220;past tense&#8221; of the command submitted:</p>
<p>MakeCustomerPreferredCommand -> CustomerHasBeenMadePreferredEvent</p>
<p>The publishing of the event is done transactionally together with the processing of the command and the changes to its database. That way, any kind of failure on commit will result in the event not being sent. This is something that should be handled by default by your message bus, and if you&#8217;re using MSMQ as your underlying transport, requires the use of transactional queues.</p>
<p>The autonomous component which processes those events and updates the query data store is fairly simple, translating from the event structure to the persistent view model structure. I suggest having an event handler per view model class (AKA per table). </p>
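<p>The "one event handler per view model" suggestion could be sketched in Python as follows. The two dictionaries and the set stand in for persistent view model tables, and the handler and dispatcher names are hypothetical; a real system would receive the event from the message bus rather than from a direct function call.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerHasBeenMadePreferredEvent:
    customer_id: int


# Stand-ins for two persistent view model tables in the query data store.
customer_summary_view = {7: {"name": "Jane Doe", "preferred": False}}
preferred_customer_ids_view = set()


def update_customer_summary(event):
    """Handler for the customer summary view only."""
    customer_summary_view[event.customer_id]["preferred"] = True


def update_preferred_customer_ids(event):
    """Handler for the preferred customers view only."""
    preferred_customer_ids_view.add(event.customer_id)


# One independent handler per view model, registered against the event type.
HANDLERS = {
    CustomerHasBeenMadePreferredEvent: [
        update_customer_summary,
        update_preferred_customer_ids,
    ],
}


def dispatch(event):
    for handler in HANDLERS.get(type(event), []):
        handler(event)


dispatch(CustomerHasBeenMadePreferredEvent(customer_id=7))
```

<p>Because each handler touches exactly one view, adding a new screen means adding a new table and a new handler, without touching the existing ones.</p>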
<p>Here&#8217;s the picture of all the pieces again:</p>
<p><img src="/wp-content/uploads/cqrs.png" width="500" height="319" alt="CQRS" title="CQRS" /></p>
<h3>Bounded Contexts</h3>
<p>While CQRS touches on many pieces of software architecture, it is still not at the top of the food chain. CQRS, if used, is employed within a bounded context (DDD) or a business component (SOA) &#8211; a cohesive piece of the problem domain. The events published by one BC are subscribed to by other BCs, each updating their query and command data stores as needed.</p>
<p>UIs from the CQRS found in each BC can be &#8220;mashed up&#8221; in a single application, providing users a single composite view on all parts of the problem domain. Composite UI frameworks are very useful for these cases.</p>
<h3>Summary</h3>
<p>CQRS is about coming up with an appropriate architecture for multi-user collaborative applications. It explicitly takes into account factors like data staleness and volatility and exploits those characteristics for creating simpler and more scalable constructs.</p>
<p>One cannot truly enjoy the benefits of CQRS without considering the user-interface, making it capture user intent explicitly. When taking into account client-side validation, command structures may be somewhat adjusted. Thinking through the order in which commands and events are processed can lead to notification patterns which make returning errors unnecessary.</p>
<p>While the result of applying CQRS to a given project is a more maintainable and performant code base, this simplicity and scalability require understanding the detailed business requirements and are not the result of any technical &#8220;best practice&#8221;. If anything, we can see a plethora of approaches to apparently similar problems being used together &#8211; data readers and domain models, one-way messaging and synchronous calls.</p>
<p>Although this blog post is over 3000 words (a record for this blog), I know that it doesn&#8217;t go into enough depth on the topic (it takes about 3 days out of the 5 of my <a href="http://www.udidahan.com/training/">Advanced Distributed Systems Design course</a> to cover everything in enough depth). Still, I hope it has given you the understanding of why CQRS is the way it is and possibly opened your eyes to other ways of looking at the design of distributed systems.</p>
<p>Questions and comments are most welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2009/12/09/clarified-cqrs/feed/</wfw:commentRss>
		<slash:comments>94</slash:comments>
		</item>
		<item>
		<title>MySpace Architecture Considered Expensive</title>
		<link>http://www.udidahan.com/2009/10/09/myspace-architecture-considered-expensive/</link>
		<comments>http://www.udidahan.com/2009/10/09/myspace-architecture-considered-expensive/#comments</comments>
		<pubDate>Fri, 09 Oct 2009 21:24:09 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Caching]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/?p=1126</guid>
		<description><![CDATA[I just finished listening to the Microsoft presentation on how they use the Concurrency &#038; Coordination Runtime (CCR) in MySpace (the stated largest web site running .NET).
Some interesting numbers were stated in the talk.

Tens of thousands to hundreds of thousands of requests per second
Over 3 thousand web servers
Over a thousand mid-tier servers

No wonder most big [...]]]></description>
			<content:encoded><![CDATA[<p>I just finished listening to the Microsoft <a href="http://channel9.msdn.com/shows/Communicating/CCR-at-MySpace/">presentation</a> on how they use the <a href="http://msdn.microsoft.com/en-us/library/bb905470.aspx">Concurrency &#038; Coordination Runtime (CCR)</a> in MySpace (the stated largest web site running .NET).</p>
<p>Some interesting numbers were stated in the talk.</p>
<ul>
<li>Tens of thousands to hundreds of thousands of requests per second</li>
<li>Over 3 thousand web servers</li>
<li>Over a thousand mid-tier servers</li>
</ul>
<p>No wonder most big web sites don&#8217;t run .NET. The Windows licenses would put them out of business.</p>
<p>Well, that is if you follow those same architectural practices.</p>
<p>I&#8217;ve written in the past about alternative architectural approaches that can scale to those levels with easily an order of magnitude less hardware (I think it&#8217;s closer to two orders of magnitude) &#8211; here&#8217;s one of them on the topic of weather:</p>
<p><a href="http://www.udidahan.com/2008/12/29/building-super-scalable-web-systems-with-rest/">Building Super-Scalable Web Systems with REST</a>.</p>
<p>By the way, the client quoted in that post is now well above 60 million users with only small incremental increases in hardware. Oh, and they&#8217;re running everything on Windows and .NET. The question is not &#8220;can it scale&#8221;, but rather &#8220;how much will it cost to scale&#8221;.</p>
<p>Architecture pays itself back faster than ever in the Web 2.0 world.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2009/10/09/myspace-architecture-considered-expensive/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>MSDN Magazine Smart Client Article</title>
		<link>http://www.udidahan.com/2009/03/28/msdn-magazine-smart-client-article/</link>
		<comments>http://www.udidahan.com/2009/03/28/msdn-magazine-smart-client-article/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 19:16:39 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Smart Client]]></category>
		<category><![CDATA[WCF]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2009/03/28/msdn-magazine-smart-client-article/</guid>
		<description><![CDATA[
My article on “optimizing a large-scale Software+Services application” has been published in the April edition of MSDN Magazine.
Here’s a short excerpt:
“We had to juggle occasional connectivity, data synchronization, and publish/subscribe all at the same time. We learned that we couldn’t solve all problems either client-side or server-side, but rather that an integrated approach was needed [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://msdn.microsoft.com/en-us/magazine/dd569749.aspx"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="244" alt="image" src="http://www.udidahan.com/wp-content/uploads/MSDNMagazineSmartClientArticle_13E17/image.png" width="189" align="right" border="0" /></a></p>
<p>My article on “optimizing a large-scale Software+Services application” has been published in the April edition of MSDN Magazine.</p>
<p>Here’s a short excerpt:</p>
<blockquote><p>“We had to juggle occasional connectivity, data synchronization, and publish/subscribe all at the same time. We learned that we couldn’t solve all problems either client-side or server-side, but rather that an integrated approach was needed since any changes on one side needed corresponding changes on the other side.”</p></blockquote>
<p><a href="http://msdn.microsoft.com/en-us/magazine/dd569749.aspx">Continue reading… </a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2009/03/28/msdn-magazine-smart-client-article/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Messaging ROI</title>
		<link>http://www.udidahan.com/2009/02/22/messaging-roi/</link>
		<comments>http://www.udidahan.com/2009/02/22/messaging-roi/#comments</comments>
		<pubDate>Sun, 22 Feb 2009 10:12:59 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2009/02/22/messaging-roi/</guid>
		<description><![CDATA[There&#8217;s been some recent discussion as to the &#8220;cost&#8221; of messaging:
Greg Young asserts: 
&#8220;I believe that this shows there to be a rather negligible cost associated with the use of such a model. There is however a small cost, this cost however I believe only exists when one looks at the system in isolation.&#8221;

Ayende adds [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s been some recent discussion as to the &#8220;cost&#8221; of messaging:</p>
<p>Greg Young <a href="http://codebetter.com/blogs/gregyoung/archive/2009/02/09/cost.aspx">asserts</a>:<a href="http://codebetter.com/blogs/gregyoung/archive/2009/02/09/cost.aspx"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="79" alt="image" src="http://www.udidahan.com/wp-content/uploads/image54.png" width="79" align="right" border="0"></a> </p>
<blockquote><p>&#8220;I believe that this shows there to be a rather negligible cost associated with the use of such a model. There is however a small cost, this cost however I believe only exists when one looks at the system in isolation.&#8221;</p>
</blockquote>
<p>Ayende adds <a href="http://ayende.com/Blog/archive/2009/02/09/the-cost-of-messaging.aspx">his perspective</a>:<a href="http://ayende.com/Blog/archive/2009/02/09/the-cost-of-messaging.aspx"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="77" alt="image" src="http://www.udidahan.com/wp-content/uploads/image55.png" width="85" align="right" border="0"></a> </p>
<blockquote><p>&#8220;The cost of messaging, and a very real one, comes when you need to understand the system. In a system where message exchange is the form of communication, it can be significantly harder to understand what is going on.&#8221;</p>
</blockquote>
<p>Of course, both these intelligent fellows are right. The reason for the apparent disparity in viewpoints has to do with which part of the following graph you look at. Ayende zooms in on the left side:</p>
<p><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="225" alt="left graph" src="http://www.udidahan.com/wp-content/uploads/image56.png" width="404" border="0"> </p>
<p>As systems get larger, though, the only way to understand them is by working at higher levels of abstraction. That&#8217;s where messaging really shines, as the incremental complexity remains the same by maintaining the same modularity as before:</p>
<p><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="232" alt="full graph" src="http://www.udidahan.com/wp-content/uploads/image57.png" width="404" border="0"> </p>
<p>In Ayende&#8217;s post, he follows the design I described a while back on using messaging for user management and <a href="http://www.udidahan.com/2007/11/10/asynchronous-high-performance-login-for-web-farms/">login for a high-scale web scenario</a>. In his comments, he agrees with the above stating:</p>
<blockquote><p>&#8220;I certainly think that a similar solution using RPC would be much more complex and likely more brittle.&#8221;</p>
</blockquote>
<p>I feel quite conservative in saying that most enterprise solutions fall on the right side of the intersection in the graph.</p>
<p>That being said, don&#8217;t underestimate the learning curve developers go through with messaging. While the mechanics are similar, the mindset is very different. Think about it like this:<a href="http://www.udidahan.com/wp-content/uploads/image58.png"><img style="border-right: 0px; border-top: 0px; margin: 5px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="100" alt="image" src="http://www.udidahan.com/wp-content/uploads/image-thumb36.png" width="80" align="right" border="0"></a> </p>
<blockquote><p>You&#8217;ve driven a car for years in the US. It&#8217;s practically second nature. Then you fly to the UK, rent a car, and all of a sudden, your brain is in meltdown. (or vice versa for those going from the UK to the US)</p>
</blockquote>
<h3>Summary</h3>
<p>If you are going down the messaging route, please be aware that there are shades of gray there as well. You don&#8217;t <em>have</em> to implement your user management and login the way I outlined in my post if you don&#8217;t require such high levels of scalability, but even lower levels of scalability can benefit from messaging.</p>
<p>Just as there isn&#8217;t a single correct design for non-messaging solutions, the same is true for those using messaging. Finding the right balance is tricky, and critical. </p>
<p>When the code is simple in every part of the system, and the asynchronous interactions are what provide for the necessary complexity the problem domain requires, that&#8217;s when you know you&#8217;ve got it just right.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2009/02/22/messaging-roi/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Building Super-Scalable Web Systems with REST</title>
		<link>http://www.udidahan.com/2008/12/29/building-super-scalable-web-systems-with-rest/</link>
		<comments>http://www.udidahan.com/2008/12/29/building-super-scalable-web-systems-with-rest/#comments</comments>
		<pubDate>Mon, 29 Dec 2008 21:38:58 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Caching]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/12/29/building-super-scalable-web-systems-with-rest/</guid>
		<description><![CDATA[I&#8217;ve been consulting with a client who has a wildly successful web-based system, with well over 10 million users and looking at a tenfold growth in the near future. One of the recent features in their system was to show users their local weather and it almost maxed out their capacity. That raised certain warning [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been consulting with a client who has a wildly successful web-based system, with well over 10 million users and looking at a tenfold growth in the near future. One of the recent features in their system was to show users their local weather and it almost maxed out their capacity. That raised certain warning flags as to the ability of their current architecture to scale to the levels that the business was taking them.</p>
<p> <center><img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="139" alt="danger" src="http://www.udidahan.com/wp-content/uploads/image51.png" width="408" border="0"></center>
</p>
<h3>On Web 2.0 Mashups</h3>
<p>One would think that sites like Weather.com and friends would be the first choice for implementing such a feature. Only thing is that they were strongly against being mashed-up Web 2.0 style on the client &#8211; they had enough scalability problems of their own. Interestingly enough (or not), these partners were quite happy to publish their weather data to us and let us handle the whole scalability issue.</p>
<h3>Implementation 1.0</h3>
<p>The current implementation was fairly straightforward &#8211; the client issues a regular web service request to the GetWeather webmethod, the server uses the user&#8217;s IP address to find out their location, then uses that location to look up the weather in the database, and returns that to the user. Standard fare for most dynamic data and the way most everybody would tell you to do it.</p>
<p>Only thing is that it scales like a dog.</p>
<h3>Add Some Caching</h3>
<p>The first thing you do when you have scalability problems and the database is the bottleneck is to cache, well, that&#8217;s what everybody says (same everybody as above).</p>
<p>The thing is that holding all the weather of the entire globe in memory, well, takes a lot of memory. More than is reasonable. In which case, there&#8217;s a fairly decent chance that a given request can&#8217;t be served from the cache, resulting in a query to the database, an update to the cache, which bumps out something else, in short, not a very good hit rate.</p>
<p>Not much bang for the buck.</p>
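<p>To get a feel for why the hit rate suffers, here is a minimal sketch in Python (not the .NET stack the client actually runs). It simulates a least-recently-used cache that is much smaller than the number of locations. The function name, the sizes, and the uniformly random request pattern are all assumptions made for illustration; real traffic is skewed towards populated areas, so treat the result as a rough lower bound rather than a measurement.</p>

```python
import random
from collections import OrderedDict


def lru_hit_rate(num_locations, cache_size, num_requests, seed=0):
    """Fraction of requests served from a fixed-size LRU cache.

    Assumes every location is equally likely to be requested,
    which is a deliberate simplification.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # location id -> cached marker, oldest first
    hits = 0
    for _ in range(num_requests):
        key = rng.randrange(num_locations)
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True  # stands in for the database query result
            if len(cache) > cache_size:
                cache.popitem(last=False)  # bump out the oldest entry
    return hits / num_requests
```

<p>Under the uniform assumption the hit rate settles at roughly cache_size / num_locations, so a cache holding 5% of the locations serves only about 5% of the requests.</p>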
<p>If you have a single datacenter, having a caching tier that stores this data is possible, but costly. If you want a highly available, business continuity supportable, multi-datacenter infrastructure, the costs add up quite a bit quicker &#8211; to the point of not being cost effective (&#8220;You need HOW much money for weather?! We&#8217;ve got dozens more features like that in the pipe!&#8221;)</p>
<p>What we can do is to tell the client we&#8217;re responding to that they can cache the result, but that isn&#8217;t close to being enough for us to scale.</p>
<h3>Look at the Data, Leverage the Internet</h3>
<p>When you find yourself in this sort of situation, there&#8217;s really only one thing to do:</p>
<div style="border-right: black 1px solid; border-top: black 1px solid; float: right; margin-left: 5px; border-left: black 1px solid; width: 220px; border-bottom: black 1px solid; background-color: beige">
<div style="font-size: 12px; margin: 5px">
<p>In order to save on bandwidth, the most precious commodity of the internet, the various ISPs and backbone providers cache aggressively. In fact, HTTP is designed exactly for that. </p>
<p>If user A asks for some html page, the various intermediaries between his browser and the server hosting that page will cache that page (based on HTTP headers). When user B asks for that same page, and their request goes through one of the intermediaries that user A&#8217;s request went through, that intermediary will serve back its cached copy of the page rather than calling the hosting server.</p>
<p>Also, users located in the same geographic region by and large go through the same intermediaries when calling a remote site.</p>
</div>
</div>
<p>Leverage the Internet</p>
<p>The internet is the biggest, most scalable data serving infrastructure that mankind was lucky enough to have happen to it. However, in order to leverage it &#8211; you need to understand your data and how your users use it, and finally align yourself with the way the internet works.</p>
<p>Let&#8217;s say we have 1,000 users in London. All of them are going to have the same weather. If all these users come to our site in the period of a few hours and ask for the weather, they all are going to get the exact same data. The thing is that the response semantics of the GetWeather webmethod must prevent intermediaries from caching so that users in Dublin and Glasgow don&#8217;t get London weather (although at times I bet they&#8217;d like to).</p>
<h3>REST Helps You Leverage the Internet</h3>
<p>Rather than thinking of getting the weather as an operation/webmethod, we can represent the various locations&#8217; weather data as explicit web resources, each with its own URI. Thus, the weather in London would be <strong>http://weather.myclient.com/UK/London</strong>.</p>
<p>If we were able to make our clients in London perform an HTTP GET on <strong>http://weather.myclient.com/UK/London</strong> then we could return headers in the HTTP response telling the intermediaries that they can cache the response for an hour, or however long we want.</p>
<p>That way, after the first user in London gets the weather from our servers, all the other 999 users will be getting the same data served to them from one of the intermediaries. Instead of getting hammered by millions of requests a day, the internet would shoulder easily 90% of that load making it much easier to scale. <a href="http://www.perkel.com/politics/gore/internet.htm">Thanks Al</a>.</p>
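<p>The effect can be sketched with a toy model in Python. Everything here is an assumption for illustration: the Origin and SharedCache classes, a single intermediary shared by all London users, and a one-hour max-age. The only real protocol element is the Cache-Control response header with the public and max-age directives.</p>

```python
MAX_AGE_SECONDS = 3600  # assumed one-hour freshness window


class Origin:
    """Stand-in for our own web servers; counts how often it is actually hit."""

    def __init__(self):
        self.calls = 0

    def get(self, path):
        self.calls += 1
        headers = {"Cache-Control": "public, max-age=%d" % MAX_AGE_SECONDS}
        return headers, "weather for " + path


class SharedCache:
    """Toy intermediary (an ISP proxy, say) that honours max-age."""

    def __init__(self, origin):
        self._origin = origin
        self._entries = {}  # path -> (expires_at, body)

    def get(self, path, now):
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: the origin is not contacted
        headers, body = self._origin.get(path)
        max_age = int(headers["Cache-Control"].split("max-age=")[1])
        self._entries[path] = (now + max_age, body)
        return body


def origin_hits(users, path="/UK/London"):
    """Send one request per simulated second through a single shared cache."""
    origin = Origin()
    cache = SharedCache(origin)
    for second in range(users):
        cache.get(path, now=second)
    return origin.calls
```

<p>With these assumptions, origin_hits(1000) is 1: only the first request reaches the origin. Real traffic is spread over many intermediaries, so the actual number of origin hits will be higher, but it grows with the number of caches rather than the number of users.</p>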
<p>This isn&#8217;t a &#8220;cheap trick&#8221;. While it is straightforward for something like weather, understanding the nature of your data and intelligently mapping that to a URI space is critical to building a scalable system, and reaping the benefits of REST.</p>
<h3>What&#8217;s left?</h3>
<p>The only thing that&#8217;s left is to get the client to know which URI to call. A simple matter, really. </p>
<p>When the user logs in, we perform the IP to location lookup and then write a cookie to the client with their location (UK/London). That cookie then stays with the user saving us from having to perform that IP to location lookup all the time. On subsequent logins, if the cookie is already there, we don&#8217;t do the lookup.</p>
<blockquote><p>BTW, we also show the user &#8220;you&#8217;re in London, <font color="#0000ff"><strong><u>aren&#8217;t you</u></strong></font>?&#8221; with the link allowing the user to change their location, which we then update the cookie with and change the URI we get the weather from.</p>
</blockquote>
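<p>A rough Python sketch of that flow follows. The cookie is modelled as a plain dictionary, and lookup_location_by_ip with its one-entry table is a hypothetical placeholder for whatever IP-to-location service is really used; only the host name and the UK/London path shape come from the example above.</p>

```python
from urllib.parse import quote

BASE_URI = "http://weather.myclient.com"  # host taken from the example above


def lookup_location_by_ip(ip_address):
    """Hypothetical IP-to-location lookup; a real one would query a geo database."""
    table = {"203.0.113.7": "UK/London"}
    return table.get(ip_address, "UK/London")


def location_for_login(cookies, ip_address):
    """Return the user's location, doing the IP lookup only when no cookie exists."""
    if "location" not in cookies:
        cookies["location"] = lookup_location_by_ip(ip_address)
    return cookies["location"]


def change_location(cookies, new_location):
    """Handles the 'aren't you?' link: the user corrects the guessed location."""
    cookies["location"] = new_location


def weather_uri(location):
    """Map a 'Country/City' location to the cacheable weather resource URI."""
    country, city = location.split("/", 1)
    return "%s/%s/%s" % (BASE_URI, quote(country), quote(city))
```

<p>Because the URI is derived purely from the cookie value, every user with the same location asks for the same resource, which is what lets the intermediaries serve them from cache.</p>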
<h3>In Closing</h3>
<p>While web services are great for getting a system up and running quickly and interoperably, scalability often suffers. Not so much as to be in your face, but after you&#8217;ve gone quite a ways and invested a fair amount of development in it, you find it standing between you and the scalability you seek.</p>
<p>Moving to REST is not about turning on the &#8220;make it restful&#8221; switch in your technology stack (ASP.NET MVC and WCF, I&#8217;m talking to you). Just like with databases there is no &#8220;make it go fast&#8221; switch &#8211; you really do need to understand your data, the various users&#8217; access patterns, and the volatility of the data so that you can map it to the &#8220;right&#8221; resources and URIs.</p>
<p>If you do walk the RESTful path, you&#8217;ll find that the scalability that was once so distant is now within your grasp.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/12/29/building-super-scalable-web-systems-with-rest/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>SOA, REST, and Pub/Sub</title>
		<link>http://www.udidahan.com/2008/12/15/soa-rest-and-pubsub/</link>
		<comments>http://www.udidahan.com/2008/12/15/soa-rest-and-pubsub/#comments</comments>
		<pubDate>Mon, 15 Dec 2008 08:34:24 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[Integrated Simplicity]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/12/15/soa-rest-and-pubsub/</guid>
		<description><![CDATA[From Integrated Simplicity:
 
The question of how web-based (or 3rd party) consumers can work with pub/sub based services comes up a lot.
Many developers are used to implementing web services exposing methods on them like GetAllCustomers.
When moving to pub/sub and other more loosely coupled messaging patterns, developers look to implement the same pattern, opting for something [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://www.IntegratedSimplicity.com">Integrated Simplicity</a>:</p>
<p><a href="http://www.udidahan.com/wp-content/uploads/image49.png"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px; border-left: 0px; border-bottom: 0px" height="277" alt="SOA &amp; Web" src="http://www.udidahan.com/wp-content/uploads/image-thumb34.png" width="526" border="0"></a> </p>
<p>The question of how web-based (or 3rd party) consumers can work with pub/sub based services comes up a lot.</p>
<p>Many developers are used to implementing web services exposing methods on them like GetAllCustomers.</p>
<p>When moving to pub/sub and other more loosely coupled messaging patterns, developers look to implement the same pattern, opting for something like duplex GetCustomersRequest and GetCustomersResponse. The reasoning is simple and straightforward &#8211; it is difficult to push data over the web to consumers.</p>
<p>However, there are still ways to disconnect the preparation of the data from its usage thus gaining many of the advantages of pub/sub.</p>
<p>By employing REST principles and modelling our customer list as an explicit resource, web-based consumers would simply perform regular HTTP GET operations on the URI to get the list of customers.</p>
<p>The resource itself could be a simple XML file &#8211; it wouldn&#8217;t need to be dynamic at all.</p>
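<p>As a minimal sketch of that idea in Python: whichever service owns the customer data regenerates a static XML file whenever the list changes, and consumers simply GET the file. The function names, the element names, and the file path are all assumptions for illustration, not part of any particular framework.</p>

```python
import xml.etree.ElementTree as ET


def publish_customer_list(customers, path):
    """Write the current customer list to a static XML file.

    'customers' is an iterable of (id, name) pairs. This would be called
    by the service that handles customer changes, not by web consumers.
    """
    root = ET.Element("customers")
    for customer_id, name in customers:
        item = ET.SubElement(root, "customer", id=str(customer_id))
        item.text = name
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def read_customer_names(path):
    """What a web-based consumer would see after a plain HTTP GET of the file."""
    return [node.text for node in ET.parse(path).getroot().findall("customer")]
```

<p>The preparation of the data (publish_customer_list) never runs as part of a consumer request, so the web tier only ever serves a file.</p>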
<p>You can get all the scalability benefits of pub/sub for web based consumers. All you need is a bit of REST <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/12/15/soa-rest-and-pubsub/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Reliability, Availability, and Scalability</title>
		<link>http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/</link>
		<comments>http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/#comments</comments>
		<pubDate>Sat, 15 Nov 2008 21:20:20 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/</guid>
		<description><![CDATA[The great people at IASA have made the recording for my webcast available online.
You can find it here.
The slides can be found here.
I also gave this talk at TechEd Barcelona and wanted to thank the attendee who posted this comment:

“You’ve done it again. Everytime I attend a session of yours I leave the room with [...]]]></description>
			<content:encoded><![CDATA[<p>The great people at IASA have made the recording for my <a href="http://www.udidahan.com/2008/09/25/presentation-reliability-scalability-and-availability/">webcast</a> available online.</p>
<p>You can find it <a href="http://www.iasahome.org/flash/global/udiras.wmv">here</a>.<br />
The slides can be found <a href="http://cid-c8ad44874742a74d.skydrive.live.com/self.aspx/Blog/Reliability|_Availability|_Scalability.pdf">here</a>.</p>
<p>I also gave this talk at TechEd Barcelona and wanted to thank the attendee who posted this comment:</p>
<blockquote><p>
<b>“You’ve done it again. Everytime I attend a session of yours I leave the room with new insights and inspiration on how to improve my software…”</b>
</p></blockquote>
<p>You made my day.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An Answer of Scale</title>
		<link>http://www.udidahan.com/2008/08/13/an-answer-of-scale/</link>
		<comments>http://www.udidahan.com/2008/08/13/an-answer-of-scale/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 11:22:27 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/08/13/an-answer-of-scale/</guid>
		<description><![CDATA[To the question of scale Ayende brings up, I thought I&#8217;d tap my concept map.
First of all, I wanted to address the relationship between various topics related to scalability:
 
And on the connection between scalability and throughput:
&#160; 
The important message here is that the scalability of a system is a cost function that gives throughput [...]]]></description>
			<content:encoded><![CDATA[<p>To the <a href="http://ayende.com/Blog/archive/2008/08/11/A-question-of-ScaleAgain.aspx">question of scale</a> Ayende brings up, I thought I&#8217;d tap my <a href="http://www.udidahan.com/2008/08/04/distributed-systems-concept-map/">concept map</a>.</p>
<p>First of all, I wanted to address the relationship between various topics related to scalability:</p>
<p><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="305" alt="performance topics" src="http://www.udidahan.com/wp-content/uploads/image40.png" width="550" border="0" /> </p>
<p>And on the connection between scalability and throughput:</p>
<p>&#160;<img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="334" alt="scalability topics" src="http://www.udidahan.com/wp-content/uploads/image41.png" width="550" border="0" /> </p>
<p>The important message here is that the scalability of a system is a cost function that gives throughput as a function of recurring costs and one time costs &#8211; servers and other hardware, and the join of buy &amp; build:</p>
<blockquote><p>Did you write your own <a href="http://ayende.com/Blog/archive/2008/08/09/Patterns-for-using-Distributed-Hash-Tables-Conclusion.aspx">locking/transaction mechanism</a> on top of an open source distributed cache or did you buy a license for a <a href="http://www.udidahan.com/2007/05/05/using-spaces-with-web-services/">space-based technology</a>?</p>
</blockquote>
<p>Also, don&#8217;t forget that people need to administer all the servers that you have. Those people cost money (easily 100K per year). Maybe, because you haven&#8217;t invested in management or monitoring tools, you need one person for every two servers. This will influence the breakdown of up front costs and recurring costs. Also, the level of availability you require will impact this as well.</p>
<p>In my experience, architects don&#8217;t consider often enough the operations environment in their &quot;scalability calculations&quot;.</p>
<p>What this means is that there&#8217;s no such thing as technically &quot;not being able to scale&quot;.</p>
<p>Rather, the problem is that the cost (up front + recurring) of supporting higher throughput grows faster than the function of revenue per user/request/whatever.</p>
<p>Sometimes, the solution is just to find ways to make more money per customer.</p>
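<p>A back-of-the-envelope version of that argument, sketched in Python. The 100K per administrator and the one-person-per-two-servers ratio are the figures mentioned above; the per-server cost, user counts, and revenue figures passed in are purely hypothetical inputs, and up front costs are left out to keep the sketch short.</p>

```python
import math

ADMIN_COST_PER_YEAR = 100_000  # figure quoted above
SERVERS_PER_ADMIN = 2          # the pessimistic ratio described above


def yearly_cost(servers, server_cost_per_year):
    """Recurring cost: hardware plus the people needed to administer it."""
    admins = math.ceil(servers / SERVERS_PER_ADMIN)
    return servers * server_cost_per_year + admins * ADMIN_COST_PER_YEAR


def scales_economically(servers, server_cost_per_year, users, revenue_per_user):
    """True when yearly revenue covers the yearly cost of serving those users."""
    return users * revenue_per_user >= yearly_cost(servers, server_cost_per_year)
```

<p>For example, ten servers at a hypothetical 5,000 a year each cost 550,000 once five administrators are included. A million users at 0.50 of revenue each do not cover that, while the same users at 0.60 each do, with no change to the technology at all.</p>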
<p>For more technical solutions, take a look at <a href="http://www.udidahan.com/2007/12/12/scalability-you-wish-youre-gonna-need-it/">the difference between capacity and scalability</a> and how <a href="http://www.udidahan.com/2007/02/02/queues-scalability-availability/">the competing consumer pattern helps scale out</a>.</p>
<p>Scalability, it&#8217;s all about the money.</p>
<p>&#8211;</p>
<p>Oh, I almost forgot, I also had a great conversation with Carl and Richard about scaling web sites that&#8217;s <a href="http://www.dotnetrocks.com/default.aspx?showNum=367">now up</a> on the .NET Rocks site. Enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/08/13/an-answer-of-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling Long Running Web Services</title>
		<link>http://www.udidahan.com/2008/07/30/scaling-long-running-web-services/</link>
		<comments>http://www.udidahan.com/2008/07/30/scaling-long-running-web-services/#comments</comments>
		<pubDate>Wed, 30 Jul 2008 12:06:38 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/07/30/scaling-long-running-web-services/</guid>
		<description><![CDATA[While I was at TechEd USA I had an attendee, Will, come up and ask me an interesting question about how to handle web service calls that can take a long time to complete. He has a number of these kinds of requests ranging from computationally intensive tasks to those requiring sifting through large amounts [...]]]></description>
			<content:encoded><![CDATA[<p>While I was at TechEd USA I had an attendee, Will, come up and ask me an interesting question about how to handle web service calls that can take a long time to complete. He has a number of these kinds of requests ranging from computationally intensive tasks to those requiring sifting through large amounts of data. What Will was having problems with was preventing too many of these resource-intensive tasks from running concurrently (causing increased memory usage, paging, and eventually the server becoming unavailable). </p>
<p>For comparison later, here&#8217;s a diagram showing the trivial interaction:</p>
<p><a href="http://www.udidahan.com/wp-content/uploads/image30.png"><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="354" alt="image" src="http://www.udidahan.com/wp-content/uploads/image-thumb26.png" width="484" border="0"></a> </p>
<p>One solution that he&#8217;d tried was to set up the web server to throttle those requests and keep a much smaller maximum thread-pool size for that application pool. The unfortunate side effect of that solution was that clients would get &#8220;turned away&#8221; by a not-so-pleasant Connection Refused exception.</p>
<p>Will had been to my <a href="http://www.udidahan.com/2008/06/06/web-scalability-slides-and-code/">web scalability talk</a> and was curious about how I was using queues behind my web services. I&#8217;ve also heard this question from people just getting started with <a href="http://www.nServiceBus.com">nServiceBus</a> when looking at the Web Services Bridge sample. Here&#8217;s the code that&#8217;s in the sample and in just a second I&#8217;ll tell you why you shouldn&#8217;t do this:</p>
<p><!-- code formatted by http://manoli.net/csharpformat/ --></p>
<div class="csharpcode" style="overflow: scroll; width: 95%">
<pre class="alt">[WebMethod]</pre>
<pre><span class="kwrd">public</span> ErrorCodes Process(Command request)</pre>
<pre class="alt">{</pre>
<pre>    <span class="kwrd">object</span> result = ErrorCodes.None;</pre>
<pre class="alt">&nbsp;</pre>
<pre>    IAsyncResult sync = Global.Bus.Send(request).Register(</pre>
<pre class="alt">        <span class="kwrd">delegate</span>(IAsyncResult asyncResult)</pre>
<pre>          {</pre>
<pre class="alt">              CompletionResult completionResult = asyncResult.AsyncState <span class="kwrd">as</span> CompletionResult;</pre>
<pre>              <span class="kwrd">if</span> (completionResult != <span class="kwrd">null</span>)</pre>
<pre class="alt">              {</pre>
<pre>                  result = (ErrorCodes) completionResult.ErrorCode;</pre>
<pre class="alt">              }</pre>
<pre>          },</pre>
<pre class="alt">          <span class="kwrd">null</span></pre>
<pre>          );</pre>
<pre class="alt">&nbsp;</pre>
<pre>    sync.AsyncWaitHandle.WaitOne();</pre>
<pre class="alt">&nbsp;</pre>
<pre>    <span class="kwrd">return</span> (ErrorCodes)result;</pre>
<pre class="alt">}</pre>
</div>
<p>Let me repeat, this is demo-ware. Do not use this in production.</p>
<p>What&#8217;s happening is that in this web service call we&#8217;re putting a message in a queue for some other process/machine to process. When that processing is complete, we&#8217;ll get a message back in our local queue (which you don&#8217;t see) which is correlated to our original request, firing off the callback. We block the web method from completing (using the WaitOne call) thus keeping the HTTP connection to the client open.</p>
<p>The problem here is that we&#8217;re wasting resources (the HTTP connection and the thread) while waiting for a response which, as already mentioned, can take a long time. In B2B or other server to server integration environments there are all sorts of middleware solutions that help us solve these problems, however in Will&#8217;s case browsers needed to interact with this web service. All he had was HTTP.</p>
<h4>HTTP Solutions</h4>
<p>Another attendee who was listening in (sorry I don&#8217;t remember your name) said that he was solving similar problems using polling but that he was having scalability problems as well.</p>
<p>What often surprises my clients when we deal with these same issues is that I <em>do</em> suggest a polling based solution, but one that still uses messaging, and this is what I described to Will:</p>
<p>Since we can&#8217;t actually push a message to a browser over HTTP from our server when processing is complete, the browser itself will be responsible for pulling the response. We still don&#8217;t want to leave costly resources like HTTP connections open a long time, however if the browser is going to be polling for a response, we&#8217;ll need some way to correlate those follow-up requests with the original one. What we&#8217;re going to do is use the <a href="http://www.smallmemory.com/almanac/PyaraliEtc98.html">Asynchronous Completion Token</a> pattern, and later I&#8217;ll show how to optimize it for web server technology. </p>
<h4>Basic Polling</h4>
<p><a href="http://www.udidahan.com/wp-content/uploads/image31.png"><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="453" alt="image" src="http://www.udidahan.com/wp-content/uploads/image-thumb27.png" width="574" border="0"></a> </p>
<p>When the browser calls the web service, the web service will generate a Guid, put it in the message that it sends for processing, and return that guid to the browser. When the processing of the message is complete, the result will be written to some kind of database, indexed by that guid. The browser will periodically call another web method, passing in the guid it previously received as a parameter. That web method will check the database for a response using the guid, returning null if no response is there. If the browser receives a null response, it will &#8220;sleep&#8221; a bit and then retry.</p>
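<p>The moving parts of this interaction can be sketched in a few lines of Python. This is an in-memory illustration only: the deque stands in for the message queue, ResultStore for the database, and the function names are made up for the sketch rather than taken from nServiceBus or any web framework.</p>

```python
import uuid
from collections import deque


class ResultStore:
    """In-memory stand-in for the database indexed by guid."""

    def __init__(self):
        self._results = {}

    def save(self, token, result):
        self._results[token] = result

    def find(self, token):
        return self._results.get(token)  # None means "not ready yet"


def submit(queue, payload):
    """First web method: enqueue the work and return the token immediately."""
    token = str(uuid.uuid4())
    queue.append((token, payload))
    return token


def process_next(queue, store):
    """Back-end worker: handle one message and record its result."""
    token, payload = queue.popleft()
    store.save(token, payload.upper())  # placeholder for the long-running work


def poll(store, token):
    """Second web method: the result, or None if it is not there yet."""
    return store.find(token)
```

<p>The browser would call submit once, then call poll with the returned token, sleeping between attempts for as long as it gets None back.</p>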
<p>One of the problems with this solution is that polling uses up server resources &#8211; both on the web server and our DB; threads, memory, DB connections. A better solution would decrease the resource cost of the polling. Let&#8217;s use the fundamental building blocks of the web to our advantage &#8211; HTTP GET and resources:</p>
<h4>REST-full Polling</h4>
<p>Instead of using a guid to represent the id of the response, let&#8217;s consider the REST principle of &#8220;everything&#8217;s a resource&#8221;. That would mean that the response itself would be a resource. And since every resource has a URI, we might as well use that URI in lieu of the guid. So, instead of our web service returning a guid, let&#8217;s return a URI &#8211; something like:</p>
<p><a href="http://www.acme.com/responses/88ec5359-a5d8-4491-a570-3bfe469f3a64.xml">http://www.acme.com/responses/88ec5359-a5d8-4491-a570-3bfe469f3a64.xml</a></p>
<p>As you can see, the guid is still there. So, what&#8217;s different?</p>
<p><a href="http://www.udidahan.com/wp-content/uploads/image32.png"><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="486" alt="image" src="http://www.udidahan.com/wp-content/uploads/image-thumb28.png" width="574" border="0"></a> </p>
<p>What&#8217;s different is that instead of having the processing code write the response to the database, it writes it to a resource. This can be done by writing some XML to a file on the SAN in the case of a webfarm. Also, the browser wouldn&#8217;t need to call a web service to get the response; it would just do an HTTP GET on the URI. If it gets an HTTP 404, it would sleep and retry as before. The reason that the SAN is needed is that, as the browser polls, it may have its requests arrive at various web servers so the response needs to be accessible from any one of them. </p>
<blockquote>
<p>Just as an aside, it would be better to free the processing node as quickly as possible and have something else write the response to the SAN. That would be done simply by sending a message from the processing node to a different node whose only job is to write responses to disk.</p>
</blockquote>
<p>The reason that the URI makes a difference is that serving &#8220;static&#8221; resources is something that web servers do <em>extremely efficiently</em> without requiring any managed resources (like ASP.NET threads). That&#8217;s a big deal.</p>
<p>We&#8217;re still using HTTP connections for the polling but that&#8217;s something whose effect can be mitigated to a certain degree.</p>
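<p>The resource-based variant can be sketched the same way. In this Python sketch a temporary directory stands in for the SAN-backed web root and <code>fetch</code> stands in for the browser&#8217;s HTTP GET; both are assumptions for illustration. Writing to a temporary file and renaming it is also an addition that is not in the description above, meant to keep a poller from reading a half-written response:</p>

```python
import os
import tempfile
import time

# Hypothetical shared directory standing in for the SAN-backed web root.
RESPONSE_DIR = tempfile.mkdtemp()


def response_path(token):
    return os.path.join(RESPONSE_DIR, token + ".xml")


def publish_response(token, xml_text):
    """Write to a temp file first, then rename it into place."""
    temp_path = response_path(token) + ".tmp"
    with open(temp_path, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
    os.replace(temp_path, response_path(token))


def fetch(token):
    """Stand-in for an HTTP GET on the response URI: returns (status, body)."""
    try:
        with open(response_path(token), encoding="utf-8") as handle:
            return 200, handle.read()
    except FileNotFoundError:
        return 404, None


def poll(token, interval_seconds, max_attempts):
    """Sleep and retry on 404, give up after max_attempts."""
    for _ in range(max_attempts):
        status, body = fetch(token)
        if status == 200:
            return body
        time.sleep(interval_seconds)
    return None


token = "88ec5359-a5d8-4491-a570-3bfe469f3a64"
status_before, _ = fetch(token)
publish_response(token, "<result>done</result>")
body = poll(token, interval_seconds=0.01, max_attempts=3)
```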
<h4>Timed REST-full Polling<a href="http://www.udidahan.com/wp-content/uploads/image33.png"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="221" alt="image" src="http://www.udidahan.com/wp-content/uploads/image-thumb29.png" width="134" align="right" border="0"></a> </h4>
<p>Since various requests can take varying amounts of time to process, it&#8217;s difficult to know at what rate the browser should poll. So, why don&#8217;t we have the web service <em>tell it</em>? As a part of the response to the original web service call, instead of just returning a URI, we could also return the polling interval &#8211; 1 second, 5 seconds, whatever is appropriate for the type of request. This value could easily be configurable [RequestType, PollingInterval].</p>
<p>An even more advanced solution would allow you to change these values dynamically. The advantage that would be gained would be that your operations team could better manage the load on your servers. When a large number of users are hitting your system, you could decrease the rate at which your servers would be polled, thus leaving more HTTP connections for other users. </p>
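<p>A small sketch of what that [RequestType, PollingInterval] table and the first response could look like, in Python. The request types, the interval values, and the field names are hypothetical:</p>

```python
# Hypothetical request types and intervals (seconds); the only requirement
# stated above is that the value is configurable per request type.
POLLING_INTERVALS = {"quote": 1, "report": 5}
DEFAULT_INTERVAL = 2


def initial_response(request_type, token):
    """What the first web service call might hand back to the browser."""
    return {
        "uri": "http://www.acme.com/responses/" + token + ".xml",
        "pollingIntervalSeconds": POLLING_INTERVALS.get(request_type, DEFAULT_INTERVAL),
    }


def set_polling_interval(request_type, seconds):
    """Operations hook: change an interval at runtime."""
    POLLING_INTERVALS[request_type] = seconds


before = initial_response("report", "abc")
set_polling_interval("report", 15)   # load is high, so ask browsers to back off
after = initial_response("report", "abc")
```

<p>Because the interval travels with each response, a change only affects requests submitted after it, unless the polled resource itself also carries an updated value.</p>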
<h4>Scaling and Adaptive Polling</h4>
<p>You&#8217;d probably also want to scale out the number of processing nodes behind your queue. The nice thing is that you could change the polling interval as you scale the various processing nodes per request type providing better responsiveness for the more critical requests. Once we add virtualization, things get really fun:</p>
<p>We had separate queues per request type, so that we could easily see the load we were under for each type of request. That way, we could scale out the processing nodes per request type as well as change the polling interval. By virtualizing our processing nodes, and writing scripts to monitor queue sizes, we had those scripts automatically provisioning (and de-provisioning) nodes as well as changing the polling interval of the browsers.</p>
<p>This had the enormous benefit of the system automatically shifting resources to provide the appropriate relative allocation for the current load as its macroscopic make-up changed.</p>
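<p>The monitoring scripts themselves aren&#8217;t shown here, but the decision they make can be sketched as a function of queue depth. The heuristic below (drain the queue within a target time, cap the node count, and slow polling to about half the estimated drain time) is an assumption for illustration, not the rule that was actually used:</p>

```python
import math


def plan_capacity(queue_depth, messages_per_node_per_second, max_nodes,
                  target_drain_seconds, base_interval_seconds):
    """Return (nodes, polling_interval_seconds) for one request type."""
    needed_rate = queue_depth / target_drain_seconds
    wanted_nodes = max(1, math.ceil(needed_rate / messages_per_node_per_second))
    nodes = min(wanted_nodes, max_nodes)
    # Estimated time to drain the queue with the nodes we can actually get.
    drain_seconds = queue_depth / (nodes * messages_per_node_per_second)
    # Never poll faster than the base rate; slow down when the backlog is deep.
    interval = max(base_interval_seconds, math.ceil(drain_seconds / 2))
    return nodes, interval


light_load = plan_capacity(100, 10, 8, 5, 1)    # small backlog
heavy_load = plan_capacity(4000, 10, 8, 5, 1)   # backlog beyond max capacity
idle = plan_capacity(0, 10, 8, 5, 1)            # empty queue
```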
<h4>Summary</h4>
<p>Will was well-pleased with the solution which, although more complicated than what he had originally tried, was flexible enough to meet his needs. As opposed to pure server-based solutions, here we make more use of the browser (writing our own JavaScript) instead of putting our faith in some Ajax-y library. That&#8217;s not to say that you couldn&#8217;t wrap this up into a library &#8211; in essence, it is a kind of messaging transport for browser to server communication allowing duplex conversations.</p>
<p>In fact, what could be done is to return multiple responses to the browser over a long period of time. In the response that comes back to the browser could be an additional URI where the next response will be. This can be used for reporting the status of a long running process, paging results, and in many other scenarios.</p>
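<p>Chaining responses this way amounts to following links. A minimal Python sketch, where the <code>next</code> field name and the in-memory <code>resources</code> dict are hypothetical stand-ins for the response format and the web server:</p>

```python
# Hypothetical resource store: URI -> response document. Each response
# carries the URI of the next one, or None when the conversation is over.
resources = {
    "/responses/1.xml": {"status": "10% complete", "next": "/responses/2.xml"},
    "/responses/2.xml": {"status": "60% complete", "next": "/responses/3.xml"},
    "/responses/3.xml": {"status": "done", "next": None},
}


def follow_responses(first_uri, get):
    """Yield each response in order by following the 'next' links."""
    uri = first_uri
    while uri is not None:
        response = get(uri)
        if response is None:
            break  # not written yet; a real client would sleep and retry
        yield response
        uri = response["next"]


statuses = [r["status"] for r in follow_responses("/responses/1.xml", resources.get)]
```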
<p>And, one parting thought, could this not be used for all browser to web service communication?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/07/30/scaling-long-running-web-services/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>Durable Messaging Dilemmas</title>
		<link>http://www.udidahan.com/2008/07/17/durable-messaging-dilemmas/</link>
		<comments>http://www.udidahan.com/2008/07/17/durable-messaging-dilemmas/#comments</comments>
		<pubDate>Thu, 17 Jul 2008 22:18:47 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/07/17/durable-messaging-dilemmas/</guid>
		<description><![CDATA[I&#8217;ve received some great feedback on my MSDN article and some really great questions that I think more people are wondering about, so I think I&#8217;ll try to do a post per question and see how that goes.
Libor asks:
&#8220;Would you recommend using durable messaging for systems where there are similar requirements with respect to data [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve received some great feedback on my <a href="http://msdn.microsoft.com/en-us/magazine/cc663023.aspx">MSDN article</a> and some really great questions that I think more people are wondering about, so I think I&#8217;ll try to do a post per question and see how that goes.<img style="margin: 10px 0px 10px 10px" height="175" src="http://www.ewashtenaw.org/government/departments/cmhpsm/provider_information/Provider%20Training%20Resources/images/scales_of_justice.jpg" width="143" align="right"></p>
<p>Libor asks:</p>
<blockquote><p>&#8220;Would you recommend using durable messaging for systems where there are similar requirements with respect to data reliability as you had – ie. not losing any messages? If so, then why didn&#8217;t the final version of your solution use it? If not, can you explain why?&#8221;</p>
</blockquote>
<p>The answer is, as always, it depends, but here&#8217;s what it depends on:</p>
<p>When designing a system, we need to take a good, hard look at how we manage state, and what properties that state has. In a system of reasonable size we can expect various families of state with respect to their business value, data volatility, and fault-tolerance window. Each family needs to be treated differently. While durable messaging may be suitable for one, it may be overkill or underkill for another.</p>
<p>So, here&#8217;s what we&#8217;re going to be looking at:</p>
<ol>
<li>Business Value</li>
<li>Data Volatility</li>
<li>Fault-Tolerance Window</li>
</ol>
<h4>Business Value</h4>
<p>When talking about business value, I want to talk about what &#8220;not losing any messages&#8221; means. The question is under what conditions the messages will not be lost, or rather, what the threshold conditions are where messages may start getting lost. If all our datacenters are nuked, we will lose data. It&#8217;s likely the business is OK with that (as much as can be expected under those circumstances). If a single server goes down, it&#8217;s likely the business would not be OK with losing messages containing financial data. However, if a message requesting the health of a server were to get lost under those same conditions, that would probably be all right. In other words, what does that message represent in business terms?</p>
<h4>Data Volatility</h4>
<p><img style="margin: 0px 10px 0px 0px" height="150" src="http://www.classicdriver.com/upload/images/_de/3161/img02.jpg" width="270" align="left">Data volatility also has an impact. Let&#8217;s say that we&#8217;re building a financial trading system. The time between receiving an event (message) saying that the cost of a certain financial instrument has changed and sending the message requesting to buy that security is critical. Let&#8217;s say that has to be done in under 10ms. Now, some failure has occurred preventing our message from reaching its destination for 20ms. What should we do with that message? Should we keep it around, making sure it doesn&#8217;t get lost? Not in this domain. On the contrary, that message should be thrown away as its &#8220;business lifetime&#8221; has been exceeded. Furthermore, even during that original period of 10ms, the use of durable messaging may make it close to impossible to maintain our response times.</p>
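<p>One way to make a &#8220;business lifetime&#8221; explicit is to carry it on the message and check it before sending or retrying. The Python sketch below uses hypothetical names (<code>BuyOrder</code>, <code>deliver</code>) and passes the clock in as a parameter so the behaviour is easy to follow:</p>

```python
from dataclasses import dataclass


@dataclass
class BuyOrder:
    instrument: str
    created_at_ms: float   # when the triggering price event was observed
    lifetime_ms: float     # hypothetical "business lifetime" of the message


def deliver(order, now_ms, send):
    """Send the order only if it is still worth acting on."""
    age_ms = now_ms - order.created_at_ms
    if age_ms > order.lifetime_ms:
        return False  # expired: drop it instead of persisting or retrying
    send(order)
    return True


sent = []
order = BuyOrder("ACME", created_at_ms=1000.0, lifetime_ms=10.0)
on_time = deliver(order, now_ms=1004.0, send=sent.append)    # 4ms old
too_late = deliver(order, now_ms=1020.0, send=sent.append)   # 20ms old
```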
<h4>Fault-Tolerance Window</h4>
<p>These two topics feed into the third and more architectural one &#8211; fault-tolerance window: for what period of time do we require fault tolerance, and with respect to how many (and what kind of) faults? This will lead us into an analysis of how many machines we need to copy a message to before we release the calling thread. We&#8217;d also look at which datacenters those machines reside in. This will also impact (or be impacted by) the kinds of links we have to these datacenters if we want to maintain response times. These numbers will need to change when the system identifies a disaster &#8211; degrading itself to a lower level of fault-tolerance after a hurricane knocks out a datacenter, and returning to normal once it comes back up.</p>
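<p>To make &#8220;copy a message to N machines before releasing the caller&#8221; concrete, here is a deliberately simplified Python sketch. It copies sequentially rather than in parallel, and the policy in <code>copies_for</code> (three copies normally, two while a datacenter is out) is a made-up example rather than a recommendation:</p>

```python
def replicate(message, replicas, required_copies):
    """Copy the message to replicas until enough have acknowledged it.

    Returns True as soon as the caller may be released.
    """
    acknowledged = 0
    for store in replicas:
        try:
            store(message)
        except ConnectionError:
            continue  # this replica is unreachable; try the next one
        acknowledged += 1
        if acknowledged >= required_copies:
            return True
    return False


def copies_for(healthy_datacenters, normal_copies=3, degraded_copies=2):
    """Hypothetical policy: lower the bar while a datacenter is down."""
    return normal_copies if healthy_datacenters >= normal_copies else degraded_copies


def dc_down(message):
    raise ConnectionError("datacenter unreachable")


dc_a, dc_b = [], []
replicas = [dc_a.append, dc_down, dc_b.append]

# Before the outage is detected we still demand three copies and cannot get them.
normal_ok = replicate("m1", replicas, copies_for(healthy_datacenters=3))
# Once the system has degraded itself, two copies are enough.
degraded_ok = replicate("m2", replicas, copies_for(healthy_datacenters=2))
```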
<h4>Re-Evaluating Durable Messaging</h4>
<p>Durable messaging may be used at various points in each part of the solution, but we need to look at message size, the rate those messages are being written to disk, how fast the disk is, how much available disk we have (so we don&#8217;t make things worse in the case of degraded service), etc. Companies like Amazon also take into account disk failure rates, replacement rates (disks aren&#8217;t replaced <em>immediately</em> you know), and many other factors when making these decisions.<img style="margin: 10px 0px 10px 10px;" height="231" alt="image" src="http://udidahan.weblogs.us/wp-content/uploads/image-thumb25.png" width="143" align="right" border="0"> </p>
<h4>Summary</h4>
<p>Our job as architects when designing the system is to find that cost-benefit balance for the various parts of the system according to these very applicative parameters. No, it&#8217;s not easy. No, cloud computing will not magically solve all of this for us. But, we are getting more technical tools to work with, operations staff are getting better at working with us in the design phase, and our thought processes are becoming more rigorous in dealing with the scary conditions of the real world. </p>
<p>To your question, Libor, as to why we didn&#8217;t eventually use durable messaging in our solution, the answer is that we solved the overall state management problem by setting up an applicative protocol with our partners which was resilient in the face of faults by using idempotent messages that could be resent as many times as necessary. You can read more about it <a href="http://udidahan.weblogs.us/2008/04/10/scalability-article-up-on-infoq/">here</a>. This solution isn&#8217;t viable for other kinds of interactions but was just what we needed to get the job done.</p>
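<p>The protocol itself is described in the linked article, not here, so the following Python sketch only illustrates the general idea of resending safely. It assumes each message carries a unique id and that the receiver remembers which ids it has already applied; both are assumptions, as are the names:</p>

```python
class AccountHandler:
    """Applies each message at most once, keyed by message id."""

    def __init__(self):
        self.balance = 0
        self.seen_ids = set()

    def handle(self, message):
        if message["id"] in self.seen_ids:
            return "duplicate"   # already applied; a resend changes nothing
        self.balance += message["amount"]
        self.seen_ids.add(message["id"])
        return "applied"


def send_until_acknowledged(message, handler, lost_acks):
    """Resend until an ack arrives; the first acks are 'lost' to simulate faults."""
    attempts = 0
    while True:
        attempts += 1
        handler.handle(message)
        if attempts > lost_acks:
            return attempts


handler = AccountHandler()
attempts = send_until_acknowledged({"id": "m-1", "amount": 50}, handler, lost_acks=2)
```

<p>The sender never needs to know whether an earlier copy arrived, which is why durable storage on the sending side becomes less critical for this kind of interaction.</p>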
<p>Hope that helps.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/07/17/durable-messaging-dilemmas/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Make WCF and WF as Scalable and Robust as NServiceBus</title>
		<link>http://www.udidahan.com/2008/06/30/make-wcf-and-wf-as-scalable-and-robust-as-nservicebus/</link>
		<comments>http://www.udidahan.com/2008/06/30/make-wcf-and-wf-as-scalable-and-robust-as-nservicebus/#comments</comments>
		<pubDate>Mon, 30 Jun 2008 14:47:08 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[WCF]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/06/30/make-wcf-and-wf-as-scalable-and-robust-as-nservicebus/</guid>
		<description><![CDATA[This topic is getting more play as more people are using WCF and WF in real-world scenarios, so I thought I&#8217;d pull the things that I&#8217;ve been watching in this space together:
Reliability 
Locking in SqlWorkflowPersistenceService (via Ron Jacobs) where, if you want predictable persistence (MS: &#8216;none of our customers asked for this to be easy&#8217;), [...]]]></description>
			<content:encoded><![CDATA[<p>This topic is getting more play as more people are using WCF and WF in real-world scenarios, so I thought I&#8217;d pull the things that I&#8217;ve been watching in this space together:</p>
<h3>Reliability<a href="http://udidahan.weblogs.us/wp-content/uploads/doctor1.png"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="244" alt="doctor" src="http://udidahan.weblogs.us/wp-content/uploads/doctor-thumb1.png" width="225" align="right" border="0"></a> </h3>
<p><a href="http://blogs.msdn.com/rjacobs/archive/2008/06/27/locking-in-sqlworkflowpersistenceservice.aspx">Locking in SqlWorkflowPersistenceService</a> (via Ron Jacobs) where, if you want predictable persistence (MS: &#8216;none of our customers asked for this to be easy&#8217;), you need to use a custom activity (which Ron was kind enough to supply).</p>
<blockquote><p>&#8220;Given what I learned today I&#8217;d have to say that I&#8217;d be very careful about using workflows with an optimistic locking.&nbsp; Detecting these types of situations is not that simple.&#8221;</p>
</blockquote>
<p>Let&#8217;s think about that. If we&#8217;re doing pessimistic locking, we run into a problem: if a host restarts (as the result of a critical Windows patch or some other unexpected occurrence), the workflow can&#8217;t be handled by any other host in the meantime (you didn&#8217;t care so much about your SLA, did you?).</p>
<p>Luckily, someone&#8217;s come up with a hack that works around this robustness problem in <a href="http://www.topxml.com/rbnews/Orchestration---Workflow/re-78382_Scaleable-Workflow-Persistence-and-Ownership.aspx">Scalable Workflow Persistence and Ownership</a>.</p>
<blockquote><p>&#8220;So this code will attempt to load workflow instances with expired locks every second. Is it a hack? Yes. But without one of two things in the SqlWorkflowPersistenceService its the sort of code you have to write to pick up unlocked workflow instances robustly.&#8221;</p>
</blockquote>
<p>This will seriously churn the table used to store your workflows, decreasing performance of workflows that haven&#8217;t timed out. Oh well.</p>
<h3>Testability</h3>
<p><a href="http://blogs.msdn.com/ploeh/archive/2008/06/26/implementing-wcf-services-without-referencing-wcf.aspx">Implementing WCF Services without Referencing WCF</a> (via Mark Seemann): </p>
<blockquote><p>&#8220;More than a year ago, I wrote my first post on <a href="http://blogs.msdn.com/ploeh/archive/2006/12/03/UnitTestingWCFServices.aspx">unit testing WCF services</a>. One of my points back then was that you have to be careful that the service implementation doesn&#8217;t use any of the services provided by the WCF runtime environment (if you want to keep the service testable). As soon as you invoke something like <a href="http://msdn.microsoft.com/en-us/library/system.servicemodel.operationcontext.current.aspx">OperationContext.Current</a>, your code is not going to work in a unit testing scenario, but only when hosted by WCF.&#8221;</p>
</blockquote>
<p>After pointing out some of the more basic difficulties in testability a straightforward WCF implementation brings, Mark turns the heat up in his follow-up post, <a href="http://blogs.msdn.com/ploeh/archive/2008/06/27/modifying-behavior-of-wcf-free-service-implementations.aspx">Modifying Behavior of WCF-Free Service Implementations</a>:</p>
<blockquote><p>&#8220;Perhaps you need to control the service&#8217;s <a href="http://msdn.microsoft.com/en-us/library/system.servicemodel.servicebehaviorattribute.concurrencymode.aspx">ConcurrencyMode</a>, or perhaps you need to set <a href="http://msdn.microsoft.com/en-us/library/system.servicemodel.servicebehaviorattribute.usesynchronizationcontext.aspx">UseSynchronizationContext</a>. These options are typically controlled by the <a href="http://msdn.microsoft.com/en-us/library/system.servicemodel.servicebehaviorattribute.aspx">ServiceBehaviorAttribute</a>. You may also want to provide an <a href="http://msdn.microsoft.com/en-us/library/system.servicemodel.dispatcher.iinstanceprovider.aspx">IInstanceProvider</a> via a custom attribute that implements <a href="http://msdn.microsoft.com/en-us/library/system.servicemodel.description.icontractbehavior.aspx">IContractBehavior</a>. However, you can&#8217;t set these attributes on the service implementation itself, since it mustn&#8217;t have a reference to System.ServiceModel.&#8221;</p>
</blockquote>
<p>Wow &#8211; all the things required to make a WCF service scalable and thread-safe make it difficult to test. In the end, we&#8217;re beginning to see how many hoops we have to go through in order to get separation of concerns, but until we can take all this and get it out of our application code, it&#8217;s an untenable solution. I hope Mark will continue with this series, if only so I can take the framework that might grow out of it and use it as a generic WCF transport for NServiceBus.</p>
<h3>Comparison<a href="http://udidahan.weblogs.us/wp-content/uploads/apples-and-oranges.jpg"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="244" alt="apples and oranges" src="http://udidahan.weblogs.us/wp-content/uploads/apples-and-oranges-thumb.jpg" width="184" align="right" border="0"></a> </h3>
<p>After the <a href="http://samgentile.com/blogs/samgentile/archive/2008/05/21/response-to-nservicebus-performance.aspx">Neuron-NServiceBus comparison</a> that Sam and I had, we talked some more. After going through some of the rationale and thinking, Sam even <a href="http://samgentile.com/blogs/samgentile/archive/2008/06/24/looking-at-nservicebus-added-to-tonight-s-presentation.aspx">put nServiceBus into his WCF-Neuron comparison talk</a>. Sam had this to say about nServiceBus:</p>
<blockquote><p>&#8220;The bottom line is: I like what I see. Although it&#8217;s a framework, not an ESB product like Neuron, it&#8217;s a powerful framework that takes the right approach on SOA and enforces a paradigm of reliable one-way, *non-blocking* calls. That is the point of the talk tonight overall; we need to get away from the stack world of synchronous RPC calls to true asynchronous non-blocking message based SOA systems.&#8221;</p>
</blockquote>
<p>The main concern I have with a WCF+WF based solution is that developers need to know a lot in order to make it testable, scalable, and robust. In nServiceBus, that&#8217;s baked into the design. It would be extremely difficult for a developer writing application logic to interfere with when persistence needs to happen, or the concurrency strategy of long-running workflows. The fact that message handlers in the service layer don&#8217;t need concurrency modes, instance providers, or any of that junk makes them testable by default.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/06/30/make-wcf-and-wf-as-scalable-and-robust-as-nservicebus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Object Relational Mapping Sucks!</title>
		<link>http://www.udidahan.com/2008/06/25/object-relational-mapping-sucks/</link>
		<comments>http://www.udidahan.com/2008/06/25/object-relational-mapping-sucks/#comments</comments>
		<pubDate>Wed, 25 Jun 2008 11:32:06 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Data Access]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/06/25/object-relational-mapping-sucks/</guid>
		<description><![CDATA[For reporting, that is. 
And doesn&#8217;t handle concurrency!
Unless you don&#8217;t expose setters.
I guess it depends, doesn&#8217;t it? 
Well, that was Ted&#8217;s assertion in his recent Pragmatic Architecture column on data access.
But, &#8220;it depends&#8221; doesn&#8217;t get the system built, does it?
So, here are some rules for using o/r mapping that will get you 99% of the [...]]]></description>
			<content:encoded><![CDATA[<p>For reporting, that is.<a href="http://udidahan.weblogs.us/wp-content/uploads/image26.png"><img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; margin: 0px 0px 10px 10px; border-right-width: 0px" height="160" alt="image" src="http://udidahan.weblogs.us/wp-content/uploads/image-thumb22.png" width="244" align="right" border="0"></a> </p>
<p>And doesn&#8217;t handle concurrency!</p>
<p>Unless you don&#8217;t expose setters.</p>
<p>I guess <em>it depends</em>, doesn&#8217;t it? </p>
<p>Well, that was Ted&#8217;s assertion in his <a href="http://msdn2.microsoft.com/en-us/library/cc178936.aspx">recent Pragmatic Architecture column on data access</a>.</p>
<p>But, &#8220;it depends&#8221; doesn&#8217;t get the system built, does it?</p>
<p>So, here are some rules for using o/r mapping that will get you 99% of the way there. </p>
<p>Yes, you heard me. </p>
<p><strong>Rules</strong>. </p>
<p>They do not depend. </p>
<p>If you&#8217;re doing something significantly bigger than enterprise-scale development, and you are already doing this, and it isn&#8217;t enough, <a href="mailto:consulting@UdiDahan.com">give me a call</a>. Here we go.</p>
<ol>
<li>No reporting.<br />
<blockquote>
<p>I mean it. Don&#8217;t report off of live data. <br />This isn&#8217;t just an o/r mapping thing. <br />Users can tolerate some, if not quite a lot of latency.</p>
<p>And it&#8217;s not like <em>objects</em> are even used. It&#8217;s just rolled up data. Not a single behaviour for miles.</p>
</blockquote>
<li>Don&#8217;t expose setters<br />
<blockquote>
<p>You want multiple users sharing and collaborating on data, right? Then don&#8217;t force them to either overwrite each other&#8217;s data, or throw away their own. There is one simple way to avoid that: Get an object, call a method. Once the object has the most up-to-date data, pass all the client data in via a method call. The object will decide if it&#8217;s valid, from a business perspective as well, and then update the appropriate fields. </p>
<p>Now your DBAs can vertically partition tables accordingly, and improve throughput. After that, you can increase the isolation level, to improve safety, without hurting throughput. </p>
<p>This will also keep your logic encapsulated, bringing you closer to a true Domain Model.</p>
<p>If your O/R mapping tool requires you to have setters on your domain classes, hide those from your service layer behind an interface. </p>
</blockquote>
<li>Grids are like reports.<br />
<blockquote>
<p>No o/r mapping required there either. While you probably won&#8217;t be showing grids of yesterday&#8217;s data to users in an interactive environment, it&#8217;s still just data &#8211; no behaviour.</p>
<p>However, users should NOT update data in those grids. This gets back to rule 2. Have users select a specific task they want to perform, pop open a window, and have them do it there. Change customer address. Discount order. You get the picture. That way you&#8217;ll know what method to call on those objects you designed in rule 2.</p>
</blockquote>
</li>
</ol>
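<p>Rule 2 (&#8220;get an object, call a method&#8221;) can be illustrated with a tiny sketch. It is written in Python rather than .NET, and the <code>Customer</code> class and its validation rule are hypothetical:</p>

```python
class Customer:
    def __init__(self, name, street, city):
        self._name = name
        self._street = street
        self._city = city

    @property
    def address(self):
        return (self._street, self._city)   # read-only: no setter is exposed

    def change_address(self, street, city):
        """Task-shaped entry point: validate, then update only these fields."""
        if not street.strip() or not city.strip():
            raise ValueError("an address needs both a street and a city")
        self._street = street.strip()
        self._city = city.strip()


customer = Customer("Will", "1 Old Rd", "London")
customer.change_address(" 2 New St ", "Leeds")

try:
    customer.address = ("3 Hack Ave", "Nowhere")   # bypassing the method fails
    setter_blocked = False
except AttributeError:
    setter_blocked = True
```

<p>Because <code>change_address</code> touches only the address fields, a concurrent &#8220;discount order&#8221; style call on other fields has nothing to overwrite, which is what makes the vertical partitioning mentioned in rule 2 possible.</p>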
<p>Before wrapping up, one small thing.</p>
<p>You <em>can</em> use an O/R mapping tool to do reporting, just, for the love of Bill, don&#8217;t use the same classes you designed for your OLTP domain model. But, just because you can, doesn&#8217;t necessarily mean you should. <strike>Datasets</strike> datatables are probably just as viable a solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/06/25/object-relational-mapping-sucks/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Sagas Solve Stupid Transaction Timeouts</title>
		<link>http://www.udidahan.com/2008/06/23/sagas-solve-stupid-transaction-timeouts/</link>
		<comments>http://www.udidahan.com/2008/06/23/sagas-solve-stupid-transaction-timeouts/#comments</comments>
		<pubDate>Mon, 23 Jun 2008 07:09:31 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Databases]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/06/23/sagas-solve-stupid-transaction-timeouts/</guid>
		<description><![CDATA[It turns out that there was a subtle, yet dangerous problem in the use of System.Transactions &#8211; a transaction could timeout, rollback, and the connection bound to that transaction could still change data in the database.  
Think about that a second.
Scary, isn&#8217;t it?
At TechEd Israel I had a discussion with Manu on this very [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://weblogs.asp.net/ryangaraygay/archive/2008/04/14/issue-with-system-transactions-sqlconnection-and-timeout.aspx">It turns out</a> that there was a subtle, yet dangerous problem in the use of System.Transactions &#8211; a transaction could timeout, rollback, and the connection bound to that transaction could still change data in the database. <a href="http://udidahan.weblogs.us/wp-content/uploads/image25.png"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="117" alt="image" src="http://udidahan.weblogs.us/wp-content/uploads/image-thumb21.png" width="84" align="right" border="0"></a> </p>
<p>Think about that a second.</p>
<p>Scary, isn&#8217;t it?</p>
<p>At TechEd Israel I had a discussion with <a href="http://blogs.microsoft.co.il/blogs/applisec/">Manu</a> on this very issue, just under a different hat: </p>
<blockquote><p>What&#8217;s the difference between a short-running workflow and a long-running one?</p>
</blockquote>
<p>Manu suggested that we look at the actual time that things ran to differentiate between them. I asserted that if any external communication was involved in some part of state-management logic, that logic should automatically be treated as long-running.</p>
<p>Manu&#8217;s reasoning was that the complexity involved in writing long-running workflows was not justified for things that ran quickly, even if there was communication involved. Many developers don&#8217;t think twice about synchronously calling some web services in the middle of their database transaction logic. In the many Microsoft presentations I&#8217;ve been at on WF, not once has it been mentioned that state machines should be used when external communication is involved.</p>
<p>The problem that I have with this guidance is how do you know how quickly a remote call will return?</p>
<p>Do you just run it all locally on your machine, measure, and if it doesn&#8217;t take more than a second or so, then you&#8217;re OK?</p>
<p>The fact of the matter is that we can never know what the response time of a remote call will be. Maybe the remote machine is down. Maybe the remote process is down. Maybe someone changed the firewall settings and now we&#8217;re doing 10KB/s instead of 10MB/s. Maybe the local service is down and we&#8217;re communicating with the backup on the other side of the Pacific Ocean.</p>
<p>But the thing is, Manu&#8217;s right.</p>
<p>Writing long-running workflows (with WF) is more complex than is justified. My guess is that since WF wasn&#8217;t specifically designed for long-running workflows <em>only</em>, this complexity crept in.<a href="http://www.nServiceBus.com"><img style="border-right: 0px; border-top: 0px; margin: 0px 0px 10px 10px; border-left: 0px; border-bottom: 0px" height="43" alt="nservicebus_logo_small" src="http://udidahan.weblogs.us/wp-content/uploads/nservicebus-logo-small.png" width="153" align="right" border="0"></a></p>
<p>Sagas in <a href="http://www.nServiceBus.com">nServiceBus</a> <em>were</em> specifically designed for long-running workflows only. </p>
<p>Maybe that&#8217;s what kept them simple.</p>
<p>Since all external communication is done via one-way, non-blocking messaging only, each step of a saga runs as quickly as if no communication were done at all. This keeps the time the transaction in charge of handling a message is open as short as possible. That, in turn, leads to the database being able to support more concurrent users. </p>
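<p>To show the shape of this without implying anything about the actual nServiceBus API, here is a language-neutral sketch in Python. The <code>OrderSaga</code> class, the message types, and the list used as an outbox are all hypothetical. The point is only that each handler updates state, queues one-way messages, and returns without waiting on anything remote:</p>

```python
class OrderSaga:
    """One saga instance; its state would be persisted between messages."""

    def __init__(self, order_id, outbox):
        self.order_id = order_id
        self.outbox = outbox          # one-way sends: append and return
        self.state = "new"

    def handle(self, message):
        kind = message["type"]
        if kind == "OrderPlaced" and self.state == "new":
            self.outbox.append({"type": "ChargeCard", "order_id": self.order_id})
            self.state = "awaiting_payment"
        elif kind == "CardCharged" and self.state == "awaiting_payment":
            self.outbox.append({"type": "ShipOrder", "order_id": self.order_id})
            self.state = "complete"
        # Anything else (for example a duplicate) leaves the saga unchanged.


outbox = []
saga = OrderSaga("o-1", outbox)
saga.handle({"type": "OrderPlaced"})
state_after_first_step = saga.state
saga.handle({"type": "CardCharged"})
sent_types = [m["type"] for m in outbox]
```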
<p>In short, sagas are both more scalable and more robust.</p>
<p>No need to worry about garbaging-up your database.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/06/23/sagas-solve-stupid-transaction-timeouts/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>[Podcast] Highly Scalable Web Architectures</title>
		<link>http://www.udidahan.com/2008/06/19/podcast-highly-scalable-web-architectures/</link>
		<comments>http://www.udidahan.com/2008/06/19/podcast-highly-scalable-web-architectures/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 20:42:34 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Caching]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/06/19/podcast-highly-scalable-web-architectures/</guid>
		<description><![CDATA[For those people who couldn&#8217;t come to TechEd USA and didn&#8217;t see my talks on how to build highly scalable web architectures, you&#8217;re in luck &#8211; Craig, the man behind the Polymorphic Podcast sat down with me and we chatted about the problems, common solutions, and effective tactics in this space. For [...]]]></description>
			<content:encoded><![CDATA[<p>For those people who couldn&#8217;t come to TechEd USA and didn&#8217;t see my talks on how to build highly scalable web architectures, you&#8217;re in luck &#8211; Craig, the man behind the <a href="http://polymorphicpodcast.com">Polymorphic Podcast</a> sat down with me and we chatted about the problems, common solutions, and effective tactics in this space. For those of you who <em>were</em> at TechEd and still <em>didn&#8217;t</em> come to my talk &#8211; <em>what were you thinking?!</em></p>
<p> <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><a href="http://polymorphicpodcast.com/shows/scaletheweb/">Check it out.</a></p>
<p>Some of this stuff is a bit counter-intuitive (and not readily supported by the tools available in Visual Studio) so please, do feel free to ask questions (in the comments below).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/06/19/podcast-highly-scalable-web-architectures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NServiceBus Performance</title>
		<link>http://www.udidahan.com/2008/05/21/nservicebus-performance/</link>
		<comments>http://www.udidahan.com/2008/05/21/nservicebus-performance/#comments</comments>
		<pubDate>Wed, 21 May 2008 07:08:05 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/05/21/nservicebus-performance/</guid>
		<description><![CDATA[I&#8217;ve gotten this question several times already but now companies are beginning to look for performance comparisons in making decisions around the use of nServiceBus. It&#8217;s often compared to straight WCF, BizTalk, and now Neuron ESB. In Sam&#8217;s recent post he posts to a case study of Neuron doing 28 million messages an hour. That&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve gotten this question several times already but now companies are beginning to look for performance comparisons in making decisions around the use of nServiceBus. It&#8217;s often compared to straight WCF, BizTalk, and now Neuron ESB. In Sam&#8217;s <a href="http://samgentile.com/blogs/samgentile/archive/2008/05/19/new-and-notable-243.aspx">recent post</a> he posts to a case study of Neuron doing 28 million messages an hour. That&#8217;s far more than I&#8217;ve ever heard quoted for BizTalk.</p>
<h3>Disclaimer</h3>
<p>Before giving some numbers, please keep in mind that high performance of system infrastructure does not necessarily by itself mean that the system above it is running that fast. For instance, you may have server heartbeats running really quickly but the time it takes to save a purchase order borders on a minute. So, please, take all benchmarks with a grain of salt, or two, or a whole shaker-full.</p>
<p>While I&#8217;m not at liberty to say on which specific domain/company these numbers were measured, I can say that we had the full gamut of &#8220;stateless services&#8221;, stateful services (sagas), number crunching, large data sets, many users, complex visualization, etc. Also, this wasn&#8217;t the largest installation of nServiceBus that I&#8217;m aware of, but it&#8217;s the one I have the most specific numbers for.</p>
<h3>Setup</h3>
<p>OK, so with the default nServiceBus distribution running on MSMQ, on servers where the queue files themselves were on separate SCSI RAID disks, we were pumping around 1000 durable, transactionally processed messages per second, per server. That means that, similar to the Neuron case, no messages would be lost in the case of a single fault per server per window. (The window is the time to replace a failed disk, set at 3 hours from failure, through detection, to replacement, per site &#8211; but that&#8217;s more an operational staffing concern than a property of the technology itself.)</p>
<p>So, that&#8217;s 3.6 million messages per hour per server, at full load. We had a total of 98 servers doing these kinds of processing, not including web servers, databases, etc. Keep in mind that web servers would be communicating with other servers using nServiceBus, but that would maybe be an unfair comparison to the Neuron numbers.</p>
<h3>Server Breakdown</h3>
<p>Anyway, the 48 number crunching servers (blade centers) we had were at full load, so we were pumping more than 170 million messages an hour there. Keep in mind that those servers had a really fast backbone so weren&#8217;t held up by IO. Your environment may be different.</p>
<p>Another 30 (regular pizza boxes) were doing our sagas. Saga state was stored in a distributed in-memory &#8220;cache&#8221;, so once again IO wasn&#8217;t an issue for processing those messages. We were at about 70% utilization there, coming to just over 100 million messages an hour.</p>
<p>The last 20 were clustered boxes (fairly expensive) that handled the various nServiceBus distributor and timeout manager processes. They were at full load, since they handled control messages for all the servers as well as dynamically routing the load. However, since they had to feed everything else, on those boxes we used much higher performance disks for the messages, capable of doing, on average, around 5000 messages a second per box. That adds up to 360 million messages an hour.</p>
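<p>The hourly figures above follow from simple arithmetic, sketched below in Python. The <code>per_hour</code> helper is a hypothetical name, and the only inputs are the server counts and per-second rates quoted in this post. The saga tier is left out because the post gives its utilization but does not break down how its hourly figure was reached.</p>

```python
SECONDS_PER_HOUR = 3600


def per_hour(servers: int, msgs_per_second: int, utilization: float = 1.0) -> int:
    """Messages per hour for a tier of identical servers."""
    return round(servers * msgs_per_second * utilization * SECONDS_PER_HOUR)


single_server = per_hour(1, 1000)   # one server at 1000 msg/s
crunching = per_hour(48, 1000)      # 48 number crunching servers, full load
distributors = per_hour(20, 5000)   # 20 clustered boxes at 5000 msg/s

print(single_server)   # 3600000
print(crunching)       # 172800000
print(distributors)    # 360000000
```

<p>These match the 3.6 million, &#8220;more than 170 million&#8221;, and 360 million figures quoted in the text.</p>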
<h3>Unnecessary Durability</h3>
<p>Later, we moved a bunch of messages that didn&#8217;t need all that durability and transactionality off the disks, pushing the total throughput over 1 billion messages an hour. That was about 100 million per hour durable, 900 million per hour non-durable. You can guess that we were left with plenty of IO to spare at that point while we weren&#8217;t yet pushing the limit of our memory.</p>
<p>One thing that&#8217;s important to understand is that the messages that didn&#8217;t require durability were each smaller than 1MB, with most weighing in under 10KB. Also, since most of those messages were published, less state management was required around them, enabling us to further improve performance.</p>
<h3>Summary</h3>
<p>NServiceBus didn&#8217;t give us all that by itself. It was the result of skilled architects, developers, and operations staff working together for many iterations, deploying, monitoring, re-designing, etc. You need to understand your technology, your hardware, and your specific performance, availability, and fault-tolerance requirements if you want to get anywhere.</p>
<p>There&#8217;s no magic.</p>
<p>I didn&#8217;t see the number or kinds of servers involved in the Neuron case study so this wasn&#8217;t ever really a comparison. Nor are we talking about the same system here.</p>
<p>So, please, don&#8217;t base your decisions on arbitrary numbers. Spend some time setting up a scaled down version of your target architecture with all the relevant technologies and <em>measure</em>. Be aware that you want high performance end to end, not just of the messaging part. At times, it makes sense to actively throw away messages (of the non-durable, published kind) to help a server come online faster especially after a restart.</p>
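<p>As an illustration of throwing away non-durable messages after a restart, here is a minimal Python sketch. The <code>Message</code> type, the <code>backlog_after_restart</code> function, and the 30 second freshness cutoff are all assumptions made up for this example; they are not part of nServiceBus. The idea is simply that durable messages are always kept, while stale published data is dropped so the server has less backlog to chew through.</p>

```python
from dataclasses import dataclass


@dataclass
class Message:
    body: str
    durable: bool
    age_seconds: float  # how long the message has been waiting


def backlog_after_restart(backlog, max_age_seconds=30.0):
    """Keep every durable message; keep non-durable ones only if fresh."""
    return [
        m for m in backlog
        if m.durable or max_age_seconds >= m.age_seconds
    ]


backlog = [
    Message("order-1", durable=True, age_seconds=600),
    Message("price-tick-1", durable=False, age_seconds=600),
    Message("price-tick-2", durable=False, age_seconds=5),
]

kept = backlog_after_restart(backlog)
print([m.body for m in kept])  # ['order-1', 'price-tick-2']
```

<p>Whether a cutoff like this is acceptable depends entirely on the business meaning of the messages, which is why it only ever applies to the non-durable, published kind.</p>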
<p>Thus ends the tale of another &#8220;benchmark&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/05/21/nservicebus-performance/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>[Video] Messaging and Architecture Discussion at ALT.NET</title>
		<link>http://www.udidahan.com/2008/04/28/video-messaging-and-architecture-discussion-at-altnet/</link>
		<comments>http://www.udidahan.com/2008/04/28/video-messaging-and-architecture-discussion-at-altnet/#comments</comments>
		<pubDate>Mon, 28 Apr 2008 21:53:00 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/04/28/video-messaging-and-architecture-discussion-at-altnet/</guid>
		<description><![CDATA[In this video, Greg Young, Martin Fowler, Evan Hoff, Dru Sellers, myself and some others discussed various aspects of event-based systems, how Domain-Driven Design works with them, what role messaging has, and how all these connect to architectural properties like scalability and fault tolerance.
One of the questions that Martin started answering was how teams can [...]]]></description>
			<content:encoded><![CDATA[<p>In this video, <a href="http://codebetter.com/blogs/gregyoung/">Greg Young</a>, <a href="http://www.martinfowler.com/bliki/">Martin Fowler</a>, <a href="http://www.lostechies.com/blogs/evan_hoff/">Evan Hoff</a>, <a href="http://geekswithblogs.net/dsellers/Default.aspx">Dru Sellers</a>, myself and some others discussed various aspects of event-based systems, how Domain-Driven Design works with them, what role messaging has, and how all these connect to architectural properties like scalability and fault tolerance.</p>
<p>One of the questions that Martin started answering was how teams can start getting into the messaging state-of-mind. Unfortunately, the conversation veered off into what kind of messaging interactions are appropriate leaving the original question unanswered.</p>
<p>I&#8217;m hoping to address this topic with some of the information I&#8217;m putting up on the <a href="http://www.nServiceBus.com">nServiceBus site</a>. There&#8217;s always Gregor and Bobby&#8217;s excellent <a href="http://www.amazon.com/exec/obidos/redirect?link_code=ur2&amp;camp=1789&amp;tag=thesoftwaresi-20&amp;creative=9325&amp;path=tg/detail/-/0321200683/qid=1117231639/sr=8-1/ref=pd_csp_1?v=glance%26s=books%26n=507846">EIP book</a> that I think is a must for anybody writing distributed systems.</p>
<p> <object id="viddler" height="370" width="437" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"><param name="_cx" value="11562"><param name="_cy" value="9790"><param name="FlashVars" value=""><param name="Movie" value="http://www.viddler.com/player/e8529fc5/"><param name="Src" value="http://www.viddler.com/player/e8529fc5/"><param name="WMode" value="Window"><param name="Play" value="0"><param name="Loop" value="-1"><param name="Quality" value="High"><param name="SAlign" value="LT"><param name="Menu" value="0"><param name="Base" value=""><param name="AllowScriptAccess" value="always"><param name="Scale" value="NoScale"><param name="DeviceFont" value="0"><param name="EmbedMovie" value="0"><param name="BGColor" value=""><param name="SWRemote" value=""><param name="MovieData" value=""><param name="SeamlessTabbing" value="1"><param name="Profile" value="0"><param name="ProfileAddress" value=""><param name="ProfilePort" value="0"><param name="AllowNetworking" value="all"><param name="AllowFullScreen" value="true"><embed src="http://www.viddler.com/player/e8529fc5/" width="437" height="370" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" name="viddler"></embed></object></p>
<p>Enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/04/28/video-messaging-and-architecture-discussion-at-altnet/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Scalability Article up on InfoQ</title>
		<link>http://www.udidahan.com/2008/04/10/scalability-article-up-on-infoq/</link>
		<comments>http://www.udidahan.com/2008/04/10/scalability-article-up-on-infoq/#comments</comments>
		<pubDate>Fri, 11 Apr 2008 05:59:38 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Articles]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/04/10/scalability-article-up-on-infoq/</guid>
		<description><![CDATA[I&#8217;ve published a new article on performance and scalability on InfoQ:
Spectacular Scalability with Smart Service Contracts

In this article, I attempt to debunk some of the myths around stateless-ness as the key to scalability.
Here&#8217;s how it starts:
It was a sunny day in June 2005 and our spirits were high as we watched the new ordering system [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve published a new article on performance and scalability on InfoQ:</p>
<blockquote><p><a href="http://www.infoq.com/articles/scale-with-service-contracts">Spectacular Scalability with Smart Service Contracts</a></p>
</blockquote>
<p>In this article, I attempt to debunk some of the myths around stateless-ness as the key to scalability.</p>
<p>Here&#8217;s how it starts:</p>
<blockquote><p>It was a sunny day in June 2005 and our spirits were high as we watched the new ordering system we&#8217;d worked on for the past 2 years go live in our production environment. Our partners began sending us orders and our monitoring system showed us that everything looked good. After an hour or so, our COO sent out an email to our strategic partners letting them know that they should send their orders to the new system. 5 minutes later, one server went down. A minute after that, 2 more went down. Partners started calling in. We knew that we wouldn&#8217;t be seeing any of that sun for a while.</p>
<p>The system that was supposed to increase the profitability of orders from strategic partners crumbled. The then seething COO emailed the strategic partners again, this time to ask them to return to the old system. The weird thing was that although we had servers to spare, just a few orders from a strategic customer could bring a server to its knees. The system could scale to large numbers of regular partners, but couldn&#8217;t handle even a few strategic partners.</p>
<p>This is the story of what we did wrong, what we did to fix it, and how it all worked out.</p>
<p><a href="http://www.infoq.com/articles/scale-with-service-contracts">Continue reading&#8230;</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/04/10/scalability-article-up-on-infoq/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Distributed Architecture on ARCast.TV Rapid Response</title>
		<link>http://www.udidahan.com/2008/01/14/distributed-architecture-on-arcasttv-rapid-response/</link>
		<comments>http://www.udidahan.com/2008/01/14/distributed-architecture-on-arcasttv-rapid-response/#comments</comments>
		<pubDate>Mon, 14 Jan 2008 23:45:34 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[Podcast]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[WCF]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/01/14/distributed-architecture-on-arcasttv-rapid-response/</guid>
		<description><![CDATA[A while ago, Ron Jacobs and I (virtually) got together and did a couple of &#8220;rapid responses&#8221; to questions on the MSDN architecture forums, and I just noticed that they&#8217;re online. The really great thing is that there are transcripts! For your convenience, I&#8217;ve included them here.
By the way, if you&#8217;re looking for more Q&#38;A style [...]]]></description>
			<content:encoded><![CDATA[<p>A while ago, Ron Jacobs and I (virtually) got together and did a couple of &#8220;rapid responses&#8221; to questions on the MSDN architecture forums, and I just noticed that they&#8217;re <a href="http://channel9.msdn.com/ShowPost.aspx?PostID=348243#348243">online</a>. The really great thing is that there are transcripts! For your convenience, I&#8217;ve included them here.</p>
<p>By the way, if you&#8217;re looking for more Q&amp;A style info, check out the <a href="/category/ask-udi-podcast/">Ask Udi podcast</a>. If you have a pressing question and need a shorter turnaround time than the month or so it usually takes me for the podcast, send an email to <a href="mailto:OnlineConsultation@UdiDahan.com">OnlineConsultation@UdiDahan.com</a>.</p>
<h3>Number 1</h3>
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> Hey, welcome back to ARCast Rapid  Response. This is your host Ron Jacobs and today I&#8217;m looking at the MSDN  architecture forum where I see this message from &#8220;theking2.&#8221; Yeah? OK, so  &#8220;king,&#8221; he says, he&#8217;s building a distributed architecture that has a number of  external systems. These external systems interface through a telnet connection  and so they accept commands and return results as ACKS or  NACKS.</p>
<p>Typically these systems have limited resources for the number of  simultaneous sessions you can open, so, five to fifty depending on the system.  What he did to get around this, was, he created some Enterprise Services objects  and some pooled objects that set up these connections and then he has some Web  services. The Web services are going to receive an incoming message. They&#8217;re  going to call these pooled COM+ objects and they&#8217;re going to make the telnet  calls to the external systems. Sounds interesting.</p>
<p>He says, after a year  of production it has become apparent that some of the external systems are not  performing very well. He says the bulk of the requests, but not all, to the  external systems can be done asynchronously. So, he&#8217;s opting for a message  queue-based solution using pseudosynchronous calls whenever a direct response is  needed.</p>
<p>So, the question is, at what layer would message queuing make  most sense?</p>
<p>So, should the clients, this Web service that receives the  message &#8212; should it do a queue? Put a message in the queue and then the COM+  objects would pop off or they have some central Web services that would pop it.  So, the central Web services or these Enterprise Service objects? Or maybe just  a communication at the top of the telnet. He says this is the first time when  he&#8217;s using message queuing.</p>
<p>On the line with me I have Udi Dahan, the  Software Simplist from Israel.</p>
<p>Udi, this is a very interesting  application and my first gut reaction is, does it really matter where you put  the queuing?</p></blockquote>
<blockquote class="speaker_4_text"><p><cite class="speaker_4"><strong>Udi  Dahan:</strong></cite> Well, actually I took a look at it as well and I&#8217;d have  to say that it does because the problem that he&#8217;s trying to solve isn&#8217;t that  clear. We know that there is some sort of performance problem but we&#8217;re not  quite sure where it is. We know that there are long and varying latencies in the  responses but we&#8217;re not really quite sure why.</p>
<p>We know that their external system is a bit slow, but our choice of where to put the queue will probably have an impact, obviously, on the development model of the clients and the Web services as well as on how those external systems would work. So, I&#8217;d have to say that choosing the correct place to put the queue is important.</p></blockquote>
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> Well, let me interject something  here because what you said just made me think. Now, if the problem is that these  external systems are slow and limited number of connections, the first question  we ought to ask is, does queuing help this situation at all?</p></blockquote>
<blockquote class="speaker_5_text"><p><cite class="speaker_5"><strong>Udi:</strong></cite> Well, that&#8217;s probably a good first  step. I mean every single time someone comes with a solution and then says, &#8220;OK,  what&#8217;s the problem,&#8221; it&#8217;s always a good thing to check that solution  first.</p>
<p>It looks like the problem that he has here has to deal with or the  reason that he wants to use a queue is to do some kind of load leveling. He&#8217;s  getting too many requests or at too high a rate from his clients and external  Web services and external web applications more than his back end systems can  use. So, using a queue as a load leveling mechanism is definitely the right way  to go. So, from that perspective I think that putting a queue somewhere in there  is a good idea.</p></blockquote>
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> OK. So then if you put a queue, it  seems to me that it&#8217;s not going to make that much difference which layer you put  the queue, would it?</p></blockquote>
<blockquote class="speaker_5_text"><p><cite class="speaker_5"><strong>Udi:</strong></cite> Well, it might for the main reason  that you really have to look at where his bottleneck is and that&#8217;s his back end  systems. The bottleneck also has to do with the number of connections that can  be opened and the number of sessions that can be opened. The place that I&#8217;d be  looking at doing that is probably between those pooled COM+ objects and his  central Web services for the main reason that that really gives a nice  encapsulation in terms of the Web services towards both his organization&#8217;s  internal services if they are other Web services, web applications or clients  and everybody else that&#8217;s going on out there while keeping that abstraction out  of the way.</p>
<p>So, the choice of using pooled COM objects is one of the ways  he does the load leveling now. One of the problems he has is that it doesn&#8217;t  seem to be doing that much for him because the switches and knobs that are  available in COM+ in order to do that load leveling aren&#8217;t that great. What I&#8217;d  be looking at in his situation is to put a queue in there but on the back side  of that queue, not talking directly to the external system but doing something  with WCF.</p>
<p>WCF has an incredible amount of switches and knobs in order to  do the load throttling and the number of threads that are open. He could also do  that on a large number of URIs in order to sort of split up the load from that  perspective allowing him to cache results quite a bit better. So, that&#8217;s where  I&#8217;d be looking at too. Just throw away those COM+ objects, put WCF in there, use  the MSMQ binding and start configuring things from there.</p></blockquote>
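<p>For readers who want to see the load leveling idea in code, here is a minimal sketch in Python rather than WCF or MSMQ. Everything in it is an assumption for illustration: the limit of 5 sessions, the <code>call_backend</code> stand-in for the telnet call, and the in-process queue standing in for a real message queue. The point it shows is that a burst of requests lands in the queue, while a fixed number of workers, one per allowed session, drains it at whatever pace the back end can sustain.</p>

```python
import queue
import threading

MAX_SESSIONS = 5  # assumed session limit of the external system

requests = queue.Queue()
results = []
lock = threading.Lock()
active = 0
peak_active = 0


def call_backend(command):
    """Stand-in for the real telnet call; always acknowledges."""
    return "ACK " + command


def worker():
    """One worker owns one session and drains the queue at its own pace."""
    global active, peak_active
    while True:
        command = requests.get()
        if command is None:  # sentinel: no more work for this worker
            return
        with lock:
            active += 1
            peak_active = max(peak_active, active)
        reply = call_backend(command)
        with lock:
            active -= 1
            results.append(reply)


# A burst of 100 requests arrives far faster than the back end can take them.
for i in range(100):
    requests.put("cmd-%d" % i)

workers = [threading.Thread(target=worker) for _ in range(MAX_SESSIONS)]
for _ in workers:
    requests.put(None)  # one sentinel per worker, queued behind the real work
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(results))                  # 100
print(MAX_SESSIONS >= peak_active)   # True
```

<p>The number of workers plays the role of the throttling knob: the back end never sees more than <code>MAX_SESSIONS</code> calls in flight, no matter how large the burst in front of the queue.</p>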
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> There&#8217;s a lot of stuff in the  message, but I think his core concern is performance. He mentions  pseudosynchronous calls. I think by that he means, a message comes in to the web  service, he&#8217;s going to drop something on the queue and then hold that message  response until he gets a response back from a queue. So, it&#8217;s sort of  synchronous but sort of not synchronous. So, in effect he&#8217;s kind of waiting on a  queue instead of waiting on the pooled object to make this outbound telnet  call.</p>
<p>I could agree if you said, &#8220;Well, look, our big problem is that we keep getting timeouts because when we go to get a COM+ object from the pool, COM+ waits for a while and then it says, &#8216;Hey, there&#8217;s no object available&#8217; and it returns an error,&#8221; then the queue is definitely going to help that problem. But in terms of the sheer throughput or performance of the system, this is not going to help at all. It&#8217;s going to still be the same performance.</p>
<p>Now if you said, &#8220;Oh look, we can do some of this work kind of at a later point in time,&#8221; well, queuing does allow you to time shift the work. Right? So, if you said, &#8220;Look, we can rethink this solution.&#8221; So you get a message in, we stuff something in a queue that we&#8217;ll deal with later, and then very quickly return a response like some kind of a number like, hey, &#8220;your transaction number blah, blah, blah, will be processed later, it&#8217;s queued for processing,&#8221; whatever.</p>
<p>I mean that introduces a lot of complexity in the system but it  clearly would provide better response at the Web service layer. What do you  think?</p></blockquote>
<blockquote class="speaker_5_text"><p><cite class="speaker_5"><strong>Udi:</strong></cite> Well I think that at the most basic  level, his throughput is dictated by his back end systems. From what he seems to  be describing, every single request that is going through there, has to hit that  back end system. If he has a limited number of back end systems that are  supporting a limited number of connections, that&#8217;s going to limit his throughput  no matter what technology he puts in front of that. So that&#8217;s at the core level.  You just can&#8217;t get away from that.</p>
<p>The one thing that I would agree with  you in your description there is the choice of using those COM+ objects. I mean  COM+ was a great technology when it came out. The problem occurs, of course,  when we start getting into larger and larger delays around the response time and  we start getting all sorts of time out exceptions and things like that. So in  that respect, I definitely say you know, take a step back from there.</p>
<p>But  in terms of everything that he has around there, the queue isn&#8217;t going to make  the back end system run any faster. What it will do is definitely complicate his  system because he&#8217;s taking something that used to be synchronous and making it  asynchronous. Writing Web services in order to handle that, I mean just adding a  bunch of threads in order to listen to queues is not going to make things any  simpler.</p>
<p>However, what it might do is to improve the resource usage of  those Web services, OK? So instead of having those Web services have a bunch of  threads open, waiting for the response coming back from those COM+ pooled  objects, those threads could be relinquished and really just be triggered back  up when a response comes back from the queue.</p>
<p>So I don&#8217;t see an  improvement in the kind of solution that MSMQ or queuing would put in there in  terms of the latency &#8212; how long it would take for a response to get back.  However, I do see an improvement in terms of the resource usage of all the other  players in the system.</p></blockquote>
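<p>The &#8220;transaction number&#8221; idea from this exchange can be sketched as follows. This is a minimal Python illustration under stated assumptions: the <code>accept</code> and <code>process_next</code> functions, the in-memory <code>status</code> table, and the sample requests are all hypothetical. The web service side returns a ticket immediately and holds no thread open; the back end side picks the work up later.</p>

```python
import itertools
import queue

pending = queue.Queue()
status = {}
_ticket_numbers = itertools.count(1)


def accept(request):
    """Web service side: enqueue and return at once with a ticket."""
    ticket = next(_ticket_numbers)
    status[ticket] = "queued"
    pending.put((ticket, request))
    return ticket


def process_next():
    """Back end side: handle one queued request when capacity allows."""
    ticket, request = pending.get()
    status[ticket] = "done: " + request.upper()


t1 = accept("provision line 7")
t2 = accept("cancel line 3")
print(status[t1])   # queued
process_next()
print(status[t1])   # done: PROVISION LINE 7
print(status[t2])   # queued
```

<p>As both speakers note, this does not make the back end any faster. It only changes what the caller waits for, at the cost of the caller having to check back on the ticket.</p>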
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> I would agree with that. I would just say though that if you make the Web server that is hosting these Web services more resource efficient, maybe all you&#8217;re going to do is enable it to get more requests into the queue more quickly. Ultimately, this solution I think is going to solve a lot of problems related to timeouts and server busy errors and that sort of thing, thread contentions, but not likely to increase overall performance.</p>
<p>But I definitely agree though. I would move this solution  forward to WCF. I used to be on the COM+ team. COM+ was rolled into WCF so that  it would have similar capabilities for pooling, instancing behavior,  transactional support, those sorts of things. I would definitely move that  forward into WCF.</p>
<p>OK! So great answer, Udi. Thank you so much for being  on this ARCast Rapid Response.</p></blockquote>
<h3>Number 2</h3>
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> Hey this is Ron Jacobs back with  another ARCast.TV Rapid Response. Today I&#8217;m joined by Udi Dahan, the Software  Simplist from Israel.</p>
<p>Udi, I&#8217;m looking at the MSDN Architecture Forum and  here&#8217;s a question from &#8220;blast.&#8221; Blast says he&#8217;s looking for where to put  business rules. He&#8217;s developing a WinForm application. He uses data sets as the  data layer, he says. He&#8217;s thinking about business rules and where to put  them.</p>
<p>He says obviously, the more organized and centralized business rules are, the better. He&#8217;s tempted to put the business rules in the UI layer especially with the typed dataset. It makes a lot of sense there but not all rules belong on the client. He says some rules belong on the server, perhaps in a trigger.</p>
<p>So he&#8217;s asking where do you put your rules? How do you think  about this problem, Udi?</p></blockquote>
<blockquote class="speaker_5_text"><p><cite class="speaker_5"><strong>Udi:</strong></cite> Well, it looks like what he&#8217;s doing  here is developing a two-tier client that is using WinForms and using datasets  and speaking directly to the database. That in essence is part of his problem in  that in terms of performance, he&#8217;d like to run more rules in the UI layer so  that the user won&#8217;t be sending garbage to the database.</p>
<p>He also  understands that because he&#8217;s building a multi-user system, there is a limited  capability, in terms of concurrency, of actually having all the rules run  correctly in order to make sure that everything is correct. So, his choice of an  architecture, working two-tier is the main problem of why he has to fragment his  business rules.</p>
<p>If he were to move towards a three-tier solution, that is  put an application server between his smart client and the database, it would be  a lot easier to put those business rules there. Now, once the business rules are  out of the database, because again, we don&#8217;t have to deal with the concurrency  issues once we have an application server and we&#8217;re using transactions there and  we don&#8217;t have any disconnected problems, then what we can do is use those same  DLLs, that same CLR code that runs the business rules, and deploy it client-side  and use it there.</p>
<p>So, in terms of deployment, what we&#8217;d have is we&#8217;d have  the same rules, both running client-side and server-side, whereas from a  development perspective, we&#8217;d have them organized and centralized. That&#8217;s the  way that I&#8217;d go about it.</p></blockquote>
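<p>The &#8220;same rules, deployed on both sides&#8221; idea can be sketched like this. The example is in Python rather than CLR code, and the <code>validate_order</code> rules, field names, and helper functions are all hypothetical. What matters is that the client and the server call one shared function instead of each carrying its own copy of the rules.</p>

```python
def validate_order(order):
    """Single source of truth for order rules; returns a list of errors."""
    errors = []
    if not order.get("customer_id"):
        errors.append("customer_id is required")
    if not (order.get("quantity", 0) > 0):
        errors.append("quantity must be positive")
    return errors


def client_can_submit(order):
    """Client side: used to enable or disable the Save button."""
    return not validate_order(order)


def server_save(order, store):
    """Server side: re-checks the same rules before accepting the data."""
    errors = validate_order(order)
    if errors:
        raise ValueError("; ".join(errors))
    store.append(order)


bad = {"customer_id": "", "quantity": 0}
good = {"customer_id": "C1", "quantity": 2}

print(client_can_submit(bad))   # False
print(client_can_submit(good))  # True

store = []
server_save(good, store)
print(len(store))               # 1
```

<p>The client check is there for responsiveness; the server check is the one that actually protects the data, since the server cannot assume every caller ran the client code.</p>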
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> Yeah, you know, I think  conceptually I agree with you that a multi-tier solution would be a very good  idea here. What I would probably think about conceptually, is breaking down  rules into things that really ought to happen on the client-side. In particular,  rules related to validation of data, so you know that you&#8217;ve got good and  complete data before you ship it off to the server-side. Oftentimes you have to  do that anyway because you have a button that shouldn&#8217;t be enabled until the  data is valid, or something like that.</p></blockquote>
<blockquote class="speaker_5_text"><p><cite class="speaker_5"><strong>Udi:</strong></cite> Absolutely.</p></blockquote>
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> Of course, we all know that if you  have middle-tier web services, you must do validation both on the client-side  and the server-side, because you must ensure that the valid data is received on  the web server. So I agree with you that creating an assembly that you deploy on  both sides is a good idea.</p>
<p>I would just expand on what you said a little bit and think about maybe on the server-side using Workflow Foundation and business rules and workflow as a way to handle a lot of the heavier lifting, server-side validations and business rules that might require maybe sifting through more data or whatever kind of things, but server-side business rules that are more oriented towards business logic, and even if you have very, very data-intensive rules, then maybe some of those might even happen in the database. Don&#8217;t you agree?</p></blockquote>
<blockquote class="speaker_5_text"><p><cite class="speaker_5"><strong>Udi:</strong></cite> Oh, absolutely. Absolutely. That&#8217;s  something that I think often gets swept under the rug too much. Things like  unique constraints and things like that are kinds of business rules. They  protect the referential integrity and if we look at the alternatives, sometimes  getting 10 million rows out of the database, in order to do some sort of unique  email validation upon them, that&#8217;s just going to kill your  performance.</p>
<p>There are certain things that it just makes sense to do them  in the database, it&#8217;s just the best way to do it. The hard part, from a  development perspective, is maintaining the coherence of your business rules.  When you say, &#8220;OK, I want a single perspective, what are all the rules in my  system?&#8221;</p>
<p>Even though we might try to keep it all CLR based, some of the  things like unique constraints, like referential integrity, will be in the  database. So, what I sometimes suggest to do is to have a separate solution, in  terms of your development team, where you put all your business  rules.</p>
<p>This includes both the SQL statements for defining your unique  constraints and your referential integrity. Also put in that validation logic,  your workflow that you&#8217;re going to be running server-side. If it&#8217;s AJAX controls  and regular expressions that you&#8217;re going to be doing client-side in order to  validate that data, absolutely make sure you have, from a development  perspective, one place where you can go where you can see everything, because if  you don&#8217;t do that [inaudible] can be running, and when things stop working, you  won&#8217;t know how to debug it.</p></blockquote>
<blockquote class="speaker_3_text"><p><cite class="speaker_3"><strong>Ron:</strong></cite> All right. Well, excellent answer.  Udi, thank you so much.</p></blockquote>
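<p>Udi&#8217;s point about unique constraints can be sketched as follows. This is a minimal illustration, assuming SQLite as a stand-in for whatever database the system uses; the <code>customers</code> table and the <code>register</code> helper are hypothetical names. The rule lives in the schema, so the application never loads existing rows to check for duplicates.</p>

```python
import sqlite3

# In-memory SQLite database standing in for the real server-side database.
conn = sqlite3.connect(":memory:")

# The business rule "emails are unique" is declared once, in the schema.
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, "
    "email TEXT NOT NULL UNIQUE)"
)


def register(email):
    """Return True if the row was stored, False if the database rejected a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO customers (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The unique constraint fired; no rows were pulled into the application.
        return False


first = register("a@example.com")
second = register("a@example.com")
print(first, second)
```

<p>Keeping the <code>CREATE TABLE</code> statement in the same place as the rest of the validation logic is one way to get the single view of the rules described above.</p>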
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/01/14/distributed-architecture-on-arcasttv-rapid-response/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Durable Messaging Is Not Enough</title>
		<link>http://www.udidahan.com/2008/01/09/durable-messaging-is-not-enough/</link>
		<comments>http://www.udidahan.com/2008/01/09/durable-messaging-is-not-enough/#comments</comments>
		<pubDate>Wed, 09 Jan 2008 23:17:27 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/01/09/durable-messaging-is-not-enough/</guid>
		<description><![CDATA[I&#8217;ve been sitting on this post for a while: before outlining all the kinds of problems durable messaging doesn&#8217;t solve, I wanted to have a solution handy. Harry Pierson begins to outline the goodness that durable messaging brings to SOA, and in a later post on idempotence describes in general terms how it ties [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been sitting on this post for a while: before outlining all the kinds of problems durable messaging doesn&#8217;t solve, I wanted to have a solution handy. Harry Pierson begins to outline the goodness that <a href="http://devhawk.net/2007/05/30/The+Case+For+Durable+Messaging+In+Service+Orientation.aspx">durable messaging brings to SOA</a>, and in a <a href="http://devhawk.net/2007/11/09/The+Importance+Of+Idempotence.aspx">later post on idempotence</a> describes in general terms how it ties back into durable messaging and transactions &#8211; in essence describing a <a href="http://udidahan.weblogs.us/2007/12/17/no-more-workflow-for-nservicebus-please-welcome-the-saga/">saga</a>. Let&#8217;s do this in story form.</p>
<p>Since you&#8217;re concerned that your shipping company&#8217;s servers may be down for some kind of planned (or unplanned) maintenance just as you&#8217;re trying to fulfill orders, you use a durable messaging solution there. Messages get written to disk on your end, and the messaging infrastructure keeps retrying the transfer until it succeeds. So what&#8217;s wrong with that?</p>
<p>Well, let&#8217;s say that the shipping company&#8217;s servers went up in smoke (true story &#8211; broken down air conditioners + poor ventilation, you get the picture). Those servers aren&#8217;t going to be coming back online any second now. So, you have all these order messages buffering on your disk. Taking into account all the data, meta-data, XML, SOAP, encryption and everything, we may get up to 1MB per message.</p>
<p>And now it&#8217;s holiday season and your company&#8217;s selling hand over fist: hundreds of orders per second from all over the world. That means we&#8217;re eating up at least 100MB of disk per second, or 6GB a minute, and in under an hour of our shipping company&#8217;s servers going down &#8211; so do ours.</p>
<p>Durable messaging &#8211; yay? We don&#8217;t want to lose those orders, right? In short, durable messaging is an important part of the solution, but it&#8217;s not the whole solution.</p>
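<p>The arithmetic in the story can be checked with a few lines. This is a back-of-the-envelope sketch: the message size and order rate come from the text (taking the low end of &#8220;hundreds&#8221;), while the 300GB of free disk is an assumed figure chosen only for illustration.</p>

```python
# Figures from the story; FREE_DISK_GB is a hypothetical value, not from the post.
MESSAGE_SIZE_MB = 1        # about 1MB per order message, all overhead included
ORDERS_PER_SECOND = 100    # the low end of "hundreds of orders per second"
FREE_DISK_GB = 300         # assumed free space on the sending server


def seconds_until_full(free_gb, orders_per_second, message_mb):
    """Return how long the disk lasts while every outgoing message is buffered."""
    mb_per_second = orders_per_second * message_mb
    return free_gb * 1000 / mb_per_second


mb_per_second = ORDERS_PER_SECOND * MESSAGE_SIZE_MB
gb_per_minute = mb_per_second * 60 / 1000
minutes_left = seconds_until_full(FREE_DISK_GB, ORDERS_PER_SECOND, MESSAGE_SIZE_MB) / 60

print(f"{mb_per_second} MB/s, {gb_per_minute:g} GB/min, disk full in {minutes_left:g} minutes")
```

<p>With these numbers the buffer fills in 50 minutes; a larger disk or smaller messages only move the deadline, they don&#8217;t remove it.</p>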
<p>[Continued next time...]</p>
<p>If you&#8217;re impatient and just want the solution, yes, <a href="http://www.nServiceBus.com">nServiceBus</a> gives you all the tools you need.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/01/09/durable-messaging-is-not-enough/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Israel Grid Technologies Association Presentation on NServiceBus</title>
		<link>http://www.udidahan.com/2008/01/08/israel-grid-technologies-association-presentation-on-nservicebus/</link>
		<comments>http://www.udidahan.com/2008/01/08/israel-grid-technologies-association-presentation-on-nservicebus/#comments</comments>
		<pubDate>Tue, 08 Jan 2008 22:20:58 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[ESB]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/01/08/israel-grid-technologies-association-presentation-on-nservicebus/</guid>
		<description><![CDATA[I know that I&#8217;ve been alluding to the grid-like capabilities that are gained when working with nServiceBus, and I&#8217;ll be giving a presentation on that next week.
Here&#8217;s the info:
Despite the recent flood of technologies and releases, distributed enterprise .net solution development remains as hard as ever. WCF and WF provide valuable runtime components, yet still [...]]]></description>
			<content:encoded><![CDATA[<p>I know that I&#8217;ve been alluding to the grid-like capabilities that are gained when working with nServiceBus, and I&#8217;ll be giving a presentation on that next week.</p>
<p>Here&#8217;s the <a href="http://www.grid.org.il/?CategoryID=384&amp;ArticleID=32&amp;Page=1">info</a>:</p>
<blockquote><p>Despite the recent flood of technologies and releases, distributed enterprise .net solution development remains as hard as ever. WCF and WF provide valuable runtime components, yet still leave open the risk of developers using the wrong combination of options and ending up with an unscalable solution.</p></blockquote>
<blockquote><p>In this session we&#8217;ll be looking at nServiceBus, an open-source communications framework that guides developers into a style of development that is scalable by design. Including publish/subscribe facilities and long-running process state management, nServiceBus solves many of the challenges found in the enterprise. Finally, we&#8217;ll see the dynamic load-balancing features that enable endpoints to automatically adjust resource allocation in a grid-style deployment.</p></blockquote>
<blockquote><p><strong>Date</strong> Jan 17, 2008 14:00 &#8211; 16:00</p>
<p><strong>Location</strong> IGT Offices, Maskit 4, 5th Floor, Hertzelia</p></blockquote>
<p>As usual, I&#8217;ll be putting up the slides and example code after the presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/01/08/israel-grid-technologies-association-presentation-on-nservicebus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WCF Everywhere? Not on my watch.</title>
		<link>http://www.udidahan.com/2007/12/29/wcf-everywhere-not-on-my-watch/</link>
		<comments>http://www.udidahan.com/2007/12/29/wcf-everywhere-not-on-my-watch/#comments</comments>
		<pubDate>Sat, 29 Dec 2007 15:00:19 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[WCF]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/12/29/wcf-everywhere-not-on-my-watch/</guid>
		<description><![CDATA[ The other day I was at Juval&#8217;s presentation where the main message was WCF is a better .NET. In other words, if you use WCF on every one of your classes, you&#8217;ll benefit. I don&#8217;t know about you, but I&#8217;m quite wary of silver bullets &#8211; they tend to inflict quite a bit of [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://udidahan.weblogs.us/wp-content/uploads/silver_bullets.png" alt="silver bullets" style="border: 0px none ; margin: 0px 20px 20px" align="right" border="0" height="139" width="98" /> The other day I was at <a href="http://www.idesign.net/">Juval&#8217;s</a> presentation where the main message was WCF is a better .NET. In other words, if you use WCF on every one of your classes, you&#8217;ll benefit. I don&#8217;t know about you, but I&#8217;m quite wary of silver bullets &#8211; they tend to inflict quite a bit of pain when used indiscriminately. This post is my response to all the people who came up to me at the end of the presentation and wanted to know if I agreed with these far-reaching architectural statements.</p>
<p><a href="http://udidahan.weblogs.us/wp-content/uploads/oz1.jpg"><img src="http://udidahan.weblogs.us/wp-content/uploads/oz-thumb1.jpg" alt="oz" style="border: 0px none ; margin: 0px 20px 20px 0px" align="left" border="0" height="137" width="134" /></a>  First of all let me say that Juval is indeed a master presenter. The &#8220;looks like a class, walks like a class, quacks like a class&#8221; bit was excellent. I could tell that most people didn&#8217;t notice the speedy hands quickly deleting all attributes from the classes before the &#8220;looks like a class&#8230;&#8221; bit. At times, I got flashbacks from the Wizard of Oz &#8211; &#8220;pay no attention to the man behind the curtain&#8221;. If all attributes in WCF only went on the interfaces, then this might actually fly, but we all know that that&#8217;s not the case.</p>
<p>One of the interesting comparisons Juval made with WCF was the introduction of .NET. Few people in the audience seemed to remember (or maybe were just professionally younger than .NET&#8217;s 8 years), but when it came out .NET was marketed as being mainly about XML Web Services. Juval stated that this was done to play down the fact that .NET made the previous Windows programming technologies obsolete. He then drew the same conclusion about WCF &#8211; that it&#8217;s as much .NET 3.0 as .NET was the next version of MFC; besides being written in a language that resembles the previous technology, it&#8217;s really all different. I don&#8217;t think that anyone would argue the difference, but is it really a &#8220;plain .NET&#8221; killer?</p>
<p><a title="answer" name="answer"></a>The answer seemed to come around the overhead of WCF &#8211; yet Juval deftly deflected that issue with a demo showing WCF doing 200 calls a second. And everybody just bought it &#8211; I was shocked. That&#8217;s 5ms per call. If you actually take Juval&#8217;s advice and use WCF on all your classes, you&#8217;ve bought yourself one hell of a performance nightmare. Say you have around 20 of your objects involved in a sequence to handle a user action &#8211; not that many actually. With a 5ms lag per object interaction, that user action is going to take 100ms &#8211; not including any database or webservice stuff you might be doing. If you do that in a server environment, you&#8217;ll be doing roughly 10 concurrent users per core. And that&#8217;s not even doing any heavy calculations or anything. Moderately sized systems are running upwards of 1000 concurrent users &#8211; if they needed 100 cores (or dozens of servers) for that, I&#8217;m guessing that they&#8217;d be out of business.</p>
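<p>The numbers above work out as follows. This is only a rough sketch of the same reasoning, assuming the calls in one user action run one after another on a single core and that each concurrent user issues about one action per second; neither assumption is measured.</p>

```python
# Inputs taken from the paragraph above.
CALLS_PER_SECOND = 200          # throughput shown in the demo
OBJECTS_PER_USER_ACTION = 20    # object interactions behind one user action
TARGET_CONCURRENT_USERS = 1000  # a moderately sized system

# Latency of a single call, in milliseconds.
ms_per_call = 1000 / CALLS_PER_SECOND

# Sequential calls add up: 20 interactions at 5ms each.
ms_per_action = ms_per_call * OBJECTS_PER_USER_ACTION

# Assumption: one action per user per second, so this is also users per core.
users_per_core = 1000 / ms_per_action

cores_needed = TARGET_CONCURRENT_USERS / users_per_core

print(f"{ms_per_call:g} ms/call, {ms_per_action:g} ms/action, "
      f"{users_per_core:g} users/core, {cores_needed:g} cores")
```

<p>Database access, web service calls and real computation would all come on top of these figures.</p>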
<p>Let&#8217;s cut this short &#8211; WCF everywhere doesn&#8217;t scale, doesn&#8217;t perform, isn&#8217;t maintainable, or testable either. In other words &#8211; don&#8217;t do it. I know Juval is a brilliant guy, and an amazing presenter &#8211; but I don&#8217;t believe he would be employing this with his own clients. This actually bears repeating. WCF is a fine technology for your application&#8217;s boundaries, but don&#8217;t be pushing it in.</p>
<h1>Don&#8217;t do it.</h1>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/12/29/wcf-everywhere-not-on-my-watch/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
		<item>
		<title>BizTalk Blogs and UdiDahan.com, strange bedfellows?</title>
		<link>http://www.udidahan.com/2007/12/19/biztalk-blogs-and-udidahancom-strange-bedfellows/</link>
		<comments>http://www.udidahan.com/2007/12/19/biztalk-blogs-and-udidahancom-strange-bedfellows/#comments</comments>
		<pubDate>Wed, 19 Dec 2007 21:04:57 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[BizTalk]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/12/19/biztalk-blogs-and-udidahancom-strange-bedfellows/</guid>
		<description><![CDATA[So, it turns out that Microsoft has quietly launched a new community-style site.
Titled &#34;BizTalk Blogs&#34;, I wasn&#8217;t quite sure what my blog was doing there. It&#8217;s not that I never write about BizTalk &#8211; every once in a while I even find something nice to say about it   My quick post on BizTalk [...]]]></description>
			<content:encoded><![CDATA[<p>So, it turns out that Microsoft has quietly launched a new community-style site.<a href="http://www.biztalkblogs.com/"><img style="float: right; margin: 0px 20px" src="http://www.wedsg.com/images/biztalkblogs.gif" align="right" border="0" /></a></p>
<p>Titled &quot;BizTalk Blogs&quot;, I wasn&#8217;t quite sure what my blog was doing there. It&#8217;s not that I never write about BizTalk &#8211; every once in a while I even find something nice to say about it <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  My quick post on <a href="http://udidahan.weblogs.us/2007/05/02/biztalk-and-performance/">BizTalk and Performance</a> is one such example. But, let&#8217;s face it, a lot of the work I do is to provide BizTalk-like features like routing, transaction-management, and choreography (orchestration) without the actual product.</p>
<p>Apparently, I&#8217;m not the only non-BizTalk-only blogger there.</p>
<p>Including such names as <a href="http://blogs.thinktecture.com/cweyer/">Christian Weyer</a> and <a href="http://www.dasblonde.net/default.aspx">Michelle Leroux Bustamante</a>, there is a veritable who&#8217;s who in the Microsoft Connected Systems ecosystem and, quite frankly, I&#8217;m surprised the bouncer let me in the door.</p>
<p>So, this post is for my readers who, like me, have pretty much ignored anything looking like BizTalk for the past few years. Don&#8217;t let the name fool you. BizTalk Blogs is a valuable resource even for people who don&#8217;t care about BizTalk &#8211; and hey, you might even like what you start hearing about the future directions Microsoft is taking it.</p>
<p>But that&#8217;s enough of that. We&#8217;ll be back with your regularly scheduled BizTalk bashing right after this break&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/12/19/biztalk-blogs-and-udidahancom-strange-bedfellows/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Scalability &#8211; you wish you&#8217;re gonna need it</title>
		<link>http://www.udidahan.com/2007/12/12/scalability-you-wish-youre-gonna-need-it/</link>
		<comments>http://www.udidahan.com/2007/12/12/scalability-you-wish-youre-gonna-need-it/#comments</comments>
		<pubDate>Wed, 12 Dec 2007 21:28:33 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/12/12/scalability-you-wish-youre-gonna-need-it/</guid>
		<description><![CDATA[&#8220;Is it still valid to assume it is more expensive to design a scalable system?&#8221;
In Gavin Terrill&#8217;s news post on InfoQ, Big Architecture Up Front &#8211; A Case of Premature Scalaculation? he covers one of the questions so many of my clients deal with:
&#8220;How hard will it be to make it scale later?&#8221;
This is a [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Is it still valid to assume it is more expensive to design a scalable system?&#8221;</p>
<p>In Gavin Terrill&#8217;s news post on InfoQ, <a href="http://www.infoq.com/news/2007/12/bauf">Big Architecture Up Front &#8211; A Case of Premature Scalaculation?</a> he covers one of the questions so many of my clients deal with:</p>
<p>&#8220;How hard will it be to make it scale later?&#8221;</p>
<p>This is a valid question, especially for companies/products just beginning their lifecycle. When the product/web site isn&#8217;t bringing in any revenue yet, how much money should we spend on getting it ready for that future success?</p>
<p>The answer to that question lies in treating capacity and scalability differently (<a href="http://www.pervasivecode.com/blog/2007/11/13/capacity-vs-scalability/">source</a>).</p>
<p>What I mean by that is designing for scalability, yet separating out all technological aspects of the scaling from the core solution. That way, you can start with simple, low capacity technologies that won&#8217;t be too expensive. As you grow, upgrade that infrastructure and plug it in to your solution. Arnon&#8217;s recent post on <a href="http://www.rgoarchitects.com/nblog/2007/12/11/WhyArbitraryTiersplittingIsBad.aspx">Tier Splitting</a> touches on a project we worked on together where we designed it in a way that we could scale down to a single process on a single machine and scale out to a server farm, all without changing the core system.</p>
<p>Let me take the design of <a href="http://www.nServiceBus.com">nServiceBus</a> as an example:</p>
<p>One primary property of scalable systems is the explicit treatment of all IO/communication. This can be seen in the one-way messaging exposed by the Bus object. There is no immediately evident way to do synchronous RPC-style request/response. This design decision is taken up front. However, the way that messages are passed around is abstracted behind an ITransport interface. You can deploy the first version of your system on MSMQ, and as load increases, switch to a more performant solution like RV or MQ, just by changing configuration. WCF does this kind of abstraction as well.</p>
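<p>The shape of that design can be sketched in a few lines. This is an illustration only, written in Python rather than .NET: the <code>Transport</code> protocol stands in for the role ITransport plays, and every class, method and configuration key here is a hypothetical name, not the actual nServiceBus API.</p>

```python
from collections import deque
from typing import Protocol


class Transport(Protocol):
    """Hypothetical stand-in for the transport abstraction; not the real interface."""

    def send(self, destination: str, message: dict) -> None: ...


class InMemoryTransport:
    """Simple, low-capacity transport for a first deployment or for tests."""

    def __init__(self) -> None:
        self.queues: dict[str, deque] = {}

    def send(self, destination: str, message: dict) -> None:
        self.queues.setdefault(destination, deque()).append(message)


class CountingTransport:
    """Placeholder for a heavier product; it only counts what it would send."""

    def __init__(self) -> None:
        self.sent: dict[str, int] = {}

    def send(self, destination: str, message: dict) -> None:
        self.sent[destination] = self.sent.get(destination, 0) + 1


class Bus:
    """One-way messaging only: send returns nothing, so there is no reply to wait on."""

    def __init__(self, transport: Transport) -> None:
        self.transport = transport

    def send(self, destination: str, message: dict) -> None:
        self.transport.send(destination, message)


# The transport is picked by configuration, never by the business code.
TRANSPORTS = {"in-memory": InMemoryTransport, "counting": CountingTransport}


def build_bus(config: dict) -> Bus:
    return Bus(TRANSPORTS[config["transport"]]())


bus = build_bus({"transport": "in-memory"})
bus.send("shipping", {"order_id": 42})
print(list(bus.transport.queues["shipping"]))
```

<p>Code that calls <code>bus.send</code> stays the same when the configuration value changes, which is the property the paragraph above is after.</p>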
<p>Another important element of the scalability of a system is how workflow instances are persisted. This behaviour is also abstracted behind an interface &#8211; IWorkflowPersister. Start out persisting workflows to a database. As you grow, swap that out for a replicated in-memory cache. In any case, the interaction between workflow and messaging at the logical level is set up front. All the pieces of the design are there. Up front. Helping you design your core application in a way that won&#8217;t <strong>limit</strong> your scalability in the future.</p>
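<p>The same idea applied to workflow state might look like the sketch below. Again this is Python for illustration, with hypothetical method names; the real IWorkflowPersister members are not shown in this post. SQLite stands in for the database, and a plain dictionary stands in for the replicated cache, which it is not.</p>

```python
import json
import sqlite3
from typing import Optional, Protocol


class WorkflowPersister(Protocol):
    """Hypothetical persister interface; save and load are assumed names."""

    def save(self, workflow_id: str, state: dict) -> None: ...

    def load(self, workflow_id: str) -> Optional[dict]: ...


class DatabasePersister:
    """Starting point: workflow state stored as JSON in a database table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS workflows "
            "(id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )

    def save(self, workflow_id: str, state: dict) -> None:
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO workflows (id, state) VALUES (?, ?)",
                (workflow_id, json.dumps(state)),
            )

    def load(self, workflow_id: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT state FROM workflows WHERE id = ?", (workflow_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


class InMemoryPersister:
    """Stand-in for a cache-backed persister; a local dict, not replicated."""

    def __init__(self) -> None:
        self._states: dict = {}

    def save(self, workflow_id: str, state: dict) -> None:
        self._states[workflow_id] = dict(state)

    def load(self, workflow_id: str) -> Optional[dict]:
        state = self._states.get(workflow_id)
        return dict(state) if state is not None else None


def handle_order_shipped(persister: WorkflowPersister, workflow_id: str) -> dict:
    """Core logic: it sees only the interface, never the storage technology."""
    state = persister.load(workflow_id) or {"shipped": False}
    state["shipped"] = True
    persister.save(workflow_id, state)
    return state


db_persister = DatabasePersister(sqlite3.connect(":memory:"))
cache_persister = InMemoryPersister()
for persister in (db_persister, cache_persister):
    handle_order_shipped(persister, "order-42")
    print(persister.load("order-42"))
```

<p><code>handle_order_shipped</code> runs unchanged against either persister, so the storage decision can be deferred until load actually demands it.</p>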
<p>This is plain Separation-of-Concerns; code that works with your specific ESB kept out of your business logic.</p>
<p>Design first, scale with technology later.</p>
<p>[Full disclosure: I'm an editor for InfoQ]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/12/12/scalability-you-wish-youre-gonna-need-it/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
