<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Udi Dahan - The Software Simplist &#187; Availability</title>
	<atom:link href="http://www.udidahan.com/category/availability/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.udidahan.com</link>
	<description>Enterprise Development Expert &#38; SOA Specialist</description>
	<lastBuildDate>Mon, 08 Mar 2010 14:34:24 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Reliability, Availability, and Scalability</title>
		<link>http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/</link>
		<comments>http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/#comments</comments>
		<pubDate>Sat, 15 Nov 2008 21:20:20 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/</guid>
		<description><![CDATA[The great people at IASA have made the recording for my webcast available online.
You can find it here.
The slides can be found here.
I also gave this talk at TechEd Barcelona and wanted to thank the attendee who posted this comment:

“You’ve done it again. Everytime I attend a session of yours I leave the room with [...]]]></description>
			<content:encoded><![CDATA[<p>The great people at IASA have made the recording for my <a href="http://www.udidahan.com/2008/09/25/presentation-reliability-scalability-and-availability/">webcast</a> available online.</p>
<p>You can find it <a href="http://www.iasahome.org/flash/global/udiras.wmv">here</a>.<br />
The slides can be found <a href="http://cid-c8ad44874742a74d.skydrive.live.com/self.aspx/Blog/Reliability|_Availability|_Scalability.pdf">here</a>.</p>
<p>I also gave this talk at TechEd Barcelona and wanted to thank the attendee who posted this comment:</p>
<blockquote><p>
<b>“You’ve done it again. Everytime I attend a session of yours I leave the room with new insights and inspiration on how to improve my software…”</b>
</p></blockquote>
<p>You made my day.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/11/15/reliability-availability-and-scalability/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An Answer of Scale</title>
		<link>http://www.udidahan.com/2008/08/13/an-answer-of-scale/</link>
		<comments>http://www.udidahan.com/2008/08/13/an-answer-of-scale/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 11:22:27 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.udidahan.com/2008/08/13/an-answer-of-scale/</guid>
		<description><![CDATA[To the question of scale Ayende brings up, I thought I&#8217;d tap my concept map.
First of all, I wanted to address the relationship between various topics related to scalability:
 
And on the connection between scalability and throughput:
&#160; 
The important message here is that the scalability of a system is a cost function that gives throughput [...]]]></description>
			<content:encoded><![CDATA[<p>To the <a href="http://ayende.com/Blog/archive/2008/08/11/A-question-of-ScaleAgain.aspx">question of scale</a> Ayende brings up, I thought I&#8217;d tap my <a href="http://www.udidahan.com/2008/08/04/distributed-systems-concept-map/">concept map</a>.</p>
<p>First of all, I wanted to address the relationship between various topics related to scalability:</p>
<p><img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="305" alt="performance topics" src="http://www.udidahan.com/wp-content/uploads/image40.png" width="550" border="0" /> </p>
<p>And on the connection between scalability and throughput:</p>
<p>&#160;<img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="334" alt="scalability topics" src="http://www.udidahan.com/wp-content/uploads/image41.png" width="550" border="0" /> </p>
<p>The important message here is that the scalability of a system is a cost function: it gives throughput as a function of recurring costs and one-time costs &#8211; servers and other hardware, and the mix of what you buy &amp; what you build:</p>
<blockquote><p>Did you write your own <a href="http://ayende.com/Blog/archive/2008/08/09/Patterns-for-using-Distributed-Hash-Tables-Conclusion.aspx">locking/transaction mechanism</a> on top of an open source distributed cache or did you buy a license for a <a href="http://www.udidahan.com/2007/05/05/using-spaces-with-web-services/">space-based technology</a>?</p>
</blockquote>
<p>Also, don&#8217;t forget that people need to administer all the servers that you have. Those people cost money (easily 100K per year). Maybe, because you haven&#8217;t invested in management or monitoring tools, you need one person for every two servers. This will influence the breakdown of up-front costs and recurring costs. The level of availability you require will impact this as well.</p>
<p>In my experience, architects don&#8217;t consider the operations environment often enough in their &quot;scalability calculations&quot;.</p>
<p>What this means is that there&#8217;s no such thing as technically &quot;not being able to scale&quot;.</p>
<p>Rather, it means that the cost (up front + recurring) of supporting higher throughput grows faster than the revenue you get per user/request/whatever.</p>
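<p>As a rough sketch of that comparison, the snippet below checks whether growing from one load to another adds more cost than revenue. Every figure and name in it (server capacity, server cost, revenue per unit of throughput, <code>CostOutgrowsRevenue</code>) is a made-up assumption for illustration; only the 100K administrator and the one-person-per-two-servers ratio come from the text above.</p>
<div class="csharpcode">
<pre>using System;

public static class ScalabilityCost
{
    // All figures are hypothetical; only the shape of the comparison matters.
    const double RequestsPerSecondPerServer = 1000;
    const double ServerCostPerYear = 5000;
    const double AdminCostPerYear = 100000;
    const double RevenuePerYearPerRequestPerSecond = 50;

    public static double YearlyCost(double requestsPerSecond, double serversPerAdmin)
    {
        double servers = Math.Ceiling(requestsPerSecond / RequestsPerSecondPerServer);
        double admins = Math.Ceiling(servers / serversPerAdmin);
        return servers * ServerCostPerYear + admins * AdminCostPerYear;
    }

    public static double YearlyRevenue(double requestsPerSecond)
    {
        return requestsPerSecond * RevenuePerYearPerRequestPerSecond;
    }

    // True when growing from one load to another adds more cost than revenue.
    public static bool CostOutgrowsRevenue(double fromLoad, double toLoad, double serversPerAdmin)
    {
        double addedCost = YearlyCost(toLoad, serversPerAdmin) - YearlyCost(fromLoad, serversPerAdmin);
        double addedRevenue = YearlyRevenue(toLoad) - YearlyRevenue(fromLoad);
        return addedCost &gt; addedRevenue;
    }

    public static void Main()
    {
        // One administrator per two servers versus one per twenty.
        Console.WriteLine(CostOutgrowsRevenue(2000, 10000, 2));
        Console.WriteLine(CostOutgrowsRevenue(2000, 10000, 20));
    }
}</pre>
</div>
<p>With these assumed numbers, the same system &quot;can&#8217;t scale&quot; at one administrator per two servers and scales fine at one per twenty, without any change to the code it runs.</p>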
<p>Sometimes, the solution is just to find ways to make more money per customer.</p>
<p>For more technical solutions, take a look at <a href="http://www.udidahan.com/2007/12/12/scalability-you-wish-youre-gonna-need-it/">the difference between capacity and scalability</a> and how <a href="http://www.udidahan.com/2007/02/02/queues-scalability-availability/">the competing consumer pattern helps scale out</a>.</p>
<p>Scalability, it&#8217;s all about the money.</p>
<p>&#8211;</p>
<p>Oh, I almost forgot, I also had a great conversation with Carl and Richard about scaling web sites that&#8217;s <a href="http://www.dotnetrocks.com/default.aspx?showNum=367">now up</a> on the .NET Rocks site. Enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/08/13/an-answer-of-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Durable Messaging Dilemmas</title>
		<link>http://www.udidahan.com/2008/07/17/durable-messaging-dilemmas/</link>
		<comments>http://www.udidahan.com/2008/07/17/durable-messaging-dilemmas/#comments</comments>
		<pubDate>Thu, 17 Jul 2008 22:18:47 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/07/17/durable-messaging-dilemmas/</guid>
		<description><![CDATA[I&#8217;ve received some great feedback on my MSDN article and some really great questions that I think more people are wondering about, so I think I&#8217;ll try to do a post per question and see how that goes.
Libor asks:
&#8220;Would you recommend using durable messaging for systems where there are similar requirements with respect to data [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve received some great feedback on my <a href="http://msdn.microsoft.com/en-us/magazine/cc663023.aspx">MSDN article</a> and some really great questions that I think more people are wondering about, so I think I&#8217;ll try to do a post per question and see how that goes.<img style="margin: 10px 0px 10px 10px" height="175" src="http://www.ewashtenaw.org/government/departments/cmhpsm/provider_information/Provider%20Training%20Resources/images/scales_of_justice.jpg" width="143" align="right"></p>
<p>Libor asks:</p>
<blockquote><p>&#8220;Would you recommend using durable messaging for systems where there are similar requirements with respect to data reliability as you had – ie. not losing any messages? If so, then why didn&#8217;t the final version of your solution use it? If not, can you explain why?&#8221;</p>
</blockquote>
<p>The answer, as always, is that it depends, but here&#8217;s what it depends on:</p>
<p>When designing a system, we need to take a good, hard look at how we manage state, and what properties that state has. In a system of reasonable size we can expect various families of state with respect to their business value, data volatility, and fault-tolerance window. Each family needs to be treated differently. While durable messaging may be suitable for one, it may be overkill or underkill for another.</p>
<p>So, here&#8217;s what we&#8217;re going to be looking at:</p>
<ol>
<li>Business Value</li>
<li>Data Volatility</li>
<li>Fault-Tolerance Window</li>
</ol>
<h4>Business Value</h4>
<p>When talking about business value, I want to talk about what &#8220;not losing any messages&#8221; means. The question is under what conditions messages will not be lost, or rather, what the threshold conditions are at which messages may start getting lost. If all our datacenters are nuked, we will lose data. It&#8217;s likely the business is OK with that (as much as can be expected under those circumstances). If a single server goes down, it&#8217;s likely the business would not be OK with losing messages containing financial data. However, if a message requesting the health of a server were to get lost under those same conditions, that would probably be alright. In other words, what does that message represent in business terms?</p>
<h4>Data Volatility</h4>
<p><img style="margin: 0px 10px 0px 0px" height="150" src="http://www.classicdriver.com/upload/images/_de/3161/img02.jpg" width="270" align="left">Data volatility also has an impact. Let&#8217;s say that we&#8217;re building a financial trading system. The time between receiving an event (message) saying that the price of a certain financial instrument has changed and sending the message requesting to buy that security is critical. Let&#8217;s say that has to be done in under 10ms. Now, some failure has occurred, preventing our message from reaching its destination for 20ms. What should we do with that message? Should we keep it around, making sure it doesn&#8217;t get lost? Not in this domain. On the contrary, that message should be thrown away, as its &#8220;business lifetime&#8221; has been exceeded. Furthermore, even during that original period of 10ms, the use of durable messaging may make it close to impossible to maintain our response times.</p>
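<p>A minimal sketch of that &#8220;business lifetime&#8221; check might look like the following. The message type, its fields and the <code>IsExpired</code> helper are hypothetical names invented for this illustration, and the sketch assumes the sender stamps each message and that clocks are synchronized closely enough for a 10ms window to be meaningful.</p>
<div class="csharpcode">
<pre>using System;

public class PriceChangedMessage
{
    public string Symbol;
    public decimal Price;
    public DateTime SentAtUtc;   // stamped by the sender (assumes synchronized clocks)
}

public static class BusinessLifetime
{
    // 10ms, the response window used in the example above.
    public static readonly TimeSpan MaxAge = TimeSpan.FromMilliseconds(10);

    public static bool IsExpired(PriceChangedMessage message, DateTime nowUtc)
    {
        return nowUtc - message.SentAtUtc &gt; MaxAge;
    }

    public static void Main()
    {
        DateTime sent = new DateTime(2008, 7, 17, 12, 0, 0, DateTimeKind.Utc);
        var message = new PriceChangedMessage { Symbol = "XYZ", Price = 10.5m, SentAtUtc = sent };

        // Delivered 20ms late: drop it instead of acting on stale data.
        DateTime now = sent.AddMilliseconds(20);
        Console.WriteLine(IsExpired(message, now) ? "discard" : "process");
    }
}</pre>
</div>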
<h4>Fault-Tolerance Window</h4>
<p>These two topics feed into the third and more architectural one &#8211; fault-tolerance window: for what period of time do we require fault tolerance, and with respect to how many (and what kind of) faults? This leads us into an analysis of how many machines we need to copy a message to before we release the calling thread. We&#8217;d also look at which datacenters those machines reside in. This will also impact (or be impacted by) the kinds of links we have to these datacenters if we want to maintain response times. These numbers will need to change when the system identifies a disaster &#8211; degrading itself to a lower level of fault-tolerance after a hurricane knocks out a datacenter, and returning to normal once it comes back up.</p>
<h4>Re-Evaluating Durable Messaging</h4>
<p>Durable messaging may be used at various points in each part of the solution, but we need to look at message size, the rate at which those messages are written to disk, how fast the disk is, how much available disk we have (so we don&#8217;t make things worse in the case of degraded service), etc. Companies like Amazon also take into account disk failure rates, replacement rates (disks aren&#8217;t replaced <em>immediately</em>, you know), and many other factors when making these decisions.<img style="margin: 10px 0px 10px 10px;" height="231" alt="image" src="http://udidahan.weblogs.us/wp-content/uploads/image-thumb25.png" width="143" align="right" border="0"> </p>
<h4>Summary</h4>
<p>Our job as architects when designing the system is to find that cost-benefit balance for the various parts of the system according to these very application-specific parameters. No, it&#8217;s not easy. No, cloud computing will not magically solve all of this for us. But we are getting more technical tools to work with, operations staff are getting better at working with us in the design phase, and our thought processes are becoming more rigorous in dealing with the scary conditions of the real world. </p>
<p>To your question, Libor, as to why we didn&#8217;t eventually use durable messaging in our solution, the answer is that we solved the overall state management problem by setting up an application-level protocol with our partners which was resilient in the face of faults by using idempotent messages that could be resent as many times as necessary. You can read more about it <a href="http://udidahan.weblogs.us/2008/04/10/scalability-article-up-on-infoq/">here</a>. This solution isn&#8217;t viable for other kinds of interactions but was just what we needed to get the job done.</p>
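<p>To make &#8220;idempotent&#8221; concrete, here is a generic sketch of a handler that can safely receive the same message any number of times. It is not the protocol from the article: the <code>OrderMessage</code> and <code>IdempotentOrderHandler</code> types are hypothetical, and the in-memory set of processed IDs stands in for what would need to be stored durably, together with the state it protects.</p>
<div class="csharpcode">
<pre>using System;
using System.Collections.Generic;

public class OrderMessage
{
    public Guid MessageId;   // assigned once by the sender and reused on every resend
    public decimal Amount;
}

public class IdempotentOrderHandler
{
    private readonly HashSet&lt;Guid&gt; processed = new HashSet&lt;Guid&gt;();
    public decimal Total { get; private set; }

    // Returns true if the message was applied, false if it was a duplicate.
    public bool Handle(OrderMessage message)
    {
        if (!processed.Add(message.MessageId))
            return false;   // already seen: resending has no further effect

        Total += message.Amount;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var handler = new IdempotentOrderHandler();
        var message = new OrderMessage { MessageId = Guid.NewGuid(), Amount = 100m };

        handler.Handle(message);
        handler.Handle(message);   // resent after a suspected fault
        Console.WriteLine(handler.Total);   // the order is counted only once
    }
}</pre>
</div>
<p>Because a duplicate changes nothing, the sender is free to resend whenever it is unsure whether a message arrived, which is what removes the need for the transport itself to guarantee delivery.</p>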
<p>Hope that helps.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/07/17/durable-messaging-dilemmas/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Advanced Messaging with a dash of DDD</title>
		<link>http://www.udidahan.com/2008/02/18/advanced-messaging-with-a-dash-of-ddd/</link>
		<comments>http://www.udidahan.com/2008/02/18/advanced-messaging-with-a-dash-of-ddd/#comments</comments>
		<pubDate>Mon, 18 Feb 2008 11:07:48 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[Messaging]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Reliability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/02/18/advanced-messaging-with-a-dash-of-ddd/</guid>
		<description><![CDATA[Following my last post (From CRUD to Domain-Driven Fluency) a bunch of questions have started popping up. One that I received via email from a client up in Ireland particularly caught my eye, so here it is:
Hi Udi, I think&#160; I see the point about the domain-driven approach but I&#8217;m wondering what my messages will [...]]]></description>
			<content:encoded><![CDATA[<p>Following my last post (<a href="http://udidahan.weblogs.us/2008/02/15/from-crud-to-domain-driven-fluency/">From CRUD to Domain-Driven Fluency</a>) a bunch of questions have started popping up. One that I received via email from a client up in Ireland particularly caught my eye, so here it is:</p>
<blockquote><p>Hi Udi, I think&nbsp; I see the point about the domain-driven approach but I&#8217;m wondering what my messages will look like. If it&#8217;s this:
<p><font face="Consolas" size="2">IAppointment InsertInterview(Guid recruiterId, Guid applicantId, Guid appointmentId); </font>OR
<p><font face="Consolas" size="2">IRecuiter UpdateRecuiter(IRecuiter recruiter); </font>(passing in an operation flag attached to the IRecuiter object) OR
<p><font face="Consolas" size="2">IRecuiter UpdateRecuiter(IRecuiter recruiter); </font>(setting a state flag on the relevant object and have the business object check the flag and behave according the state change)
<p>Hope I’m not way off
<p>Sean</p>
</blockquote>
<p>Well, Sean, first of all &#8211; messages don&#8217;t look like functions. They&#8217;re a lot more like structures &#8211; data transfer objects. In this case, you&#8217;d probably be looking at a ScheduleInterviewMessage that had the relevant fields. It would look something like this:
<p><!-- code formatted by http://manoli.net/csharpformat/ --></p>
<style type="text/css">
.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
</style>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">using</span> System;</pre>
<pre><span class="kwrd">using</span> NServiceBus;</pre>
<pre class="alt"><span class="kwrd">using</span> System.Xml.Serialization;</pre>
<pre>&nbsp;</pre>
<pre class="alt"><span class="kwrd">namespace</span> Messages</pre>
<pre>{</pre>
<pre class="alt">    [Serializable]</pre>
<pre>    <span class="rem">// Durable, store-and-forward delivery: the message survives server and network failures.</span></pre>
<pre class="alt">    [Recoverable]</pre>
<pre>    <span class="rem">// Delete the message if it has not been received within one minute.</span></pre>
<pre class="alt">    [TimeToBeReceived(<span class="str">"0:01:00.000"</span>)]</pre>
<pre>    <span class="kwrd">public</span> <span class="kwrd">class</span> ScheduleInterviewMessage : IMessage</pre>
<pre class="alt">    {</pre>
<pre>        <span class="kwrd">public</span> Guid InterviewerId;</pre>
<pre class="alt">        <span class="kwrd">public</span> Guid CandidateId;</pre>
<pre>        <span class="kwrd">public</span> DateTime RequestedTime;</pre>
<pre class="alt">&nbsp;</pre>
<pre>        <span class="rem">// Holds fields added by other versions of the schema.</span></pre>
<pre class="alt">        [XmlAnyElement]</pre>
<pre>        <span class="kwrd">public</span> <span class="kwrd">object</span> extra;</pre>
<pre class="alt">    }</pre>
<pre>}</pre>
</div>
<p>Before we go on, I want to explain what we see. The &#8220;recoverable&#8221; attribute is the way we indicate to the infrastructure that these messages should not be lost in case a server fails, there are network problems, etc. In essence, it does durable, store-and-forward messaging. This will create an environment in which, in the case of network problems, these messages will be written to disk. That&#8217;s a good thing, since once connectivity comes back or the server boots back up, the messages will still be around and can be sent.</p>
<p>Now these messages are fairly small, so even at a relatively high load, we shouldn&#8217;t be chewing through too much of our expensive, small, high performance local disks. However, if these messages were bigger, we may fill up our disks before connectivity comes back, and we all know what happens to Windows boxes when there&#8217;s no room left on the file system:</p>
<p><img style="margin: 0px 0px 0px 20px" src="http://www.ferzkopp.net/joomla/images/stories/bsod.gif"></p>
<p>In order to prevent our system from <strong>Denial-of-Servicing</strong> itself we need to make those messages clean themselves up. That&#8217;s what the &#8220;TimeToBeReceived&#8221; attribute is for. It specifies the amount of time after which a message that has not yet been received by the other side will be deleted. This also covers the case where the message made it to the other machine, but the process handling those messages was down. You wouldn&#8217;t want to be filling the other side&#8217;s disk either, causing them to crash, would you? This protects both parties.</p>
<p>The way to figure out how long to set it is to take the smallest amount of durable storage you have available at your nodes, divide that by the size of the average message, and then again by the rate at which you need to process messages &#8211; and leave yourself at least 100% spare.</p>
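<p>That rule of thumb can be sketched as a small calculation. The figures below are hypothetical, the <code>Estimate</code> helper is a name invented for this example, and &#8220;100% spare&#8221; is interpreted here as using at most half of the available storage.</p>
<div class="csharpcode">
<pre>using System;

public static class TimeToBeReceivedEstimate
{
    // How long the smallest node can buffer messages while keeping 100% spare
    // (interpreted here as using at most half of the available storage).
    public static TimeSpan Estimate(double storageBytes, double averageMessageBytes, double messagesPerSecond)
    {
        double messagesThatFit = storageBytes / averageMessageBytes;
        double secondsToFill = messagesThatFit / messagesPerSecond;
        return TimeSpan.FromSeconds(secondsToFill / 2);
    }

    public static void Main()
    {
        // Hypothetical figures: 10 GB free, 10 KB messages, 500 messages per second.
        TimeSpan ttbr = Estimate(10e9, 10e3, 500);
        Console.WriteLine(ttbr);   // 00:16:40
    }
}</pre>
</div>
<p>With those numbers the disk would fill in 2000 seconds, so a time-to-be-received of roughly 16 minutes or less keeps the node within half of its storage.</p>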
<h2>In other words, to build a robust system you will not only need to deal with lost messages, you will also be actively throwing messages away.</h2>
<p>Finally, that last &#8220;XmlAnyElement&#8221; attribute is there for versioning. As we version our system and schema, we&#8217;ll be adding fields to the message. However, an old client may be talking to a new server, or vice versa, and we wouldn&#8217;t want data to get lost just because of serialization. In a future post, I&#8217;ll show how to set up a message handler pipeline exactly for these issues.</p>
<p>Now that we&#8217;ve covered all the intricacies around messaging, we can see how the code that handles the message above looks:</p>
<p><!-- code formatted by http://manoli.net/csharpformat/ --></p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">using</span> System;</pre>
<pre><span class="kwrd">using</span> Messages;</pre>
<pre class="alt"><span class="kwrd">using</span> NServiceBus;</pre>
<pre><span class="kwrd">using</span> NHibernate;</pre>
<pre class="alt">&nbsp;</pre>
<pre><span class="kwrd">namespace</span> Server</pre>
<pre class="alt">{</pre>
<pre>    <span class="kwrd">public</span> <span class="kwrd">class</span> ScheduleInterviewMessageHandler :</pre>
<pre class="alt">                 BaseMessageHandler&lt;ScheduleInterviewMessage&gt;</pre>
<pre>    {</pre>
<pre class="alt">        <span class="kwrd">public</span> <span class="kwrd">override</span> <span class="kwrd">void</span> Handle(ScheduleInterviewMessage message)</pre>
<pre>        {</pre>
<pre class="alt">            <span class="kwrd">using</span> (ISession session = SessionFactory.OpenSession())</pre>
<pre>            <span class="kwrd">using</span> (ITransaction tx = session.BeginTransaction())</pre>
<pre class="alt">            {</pre>
<pre>                ICandidateInterviewer interviewer = session.Get&lt;ICandidateInterviewer&gt;(</pre>
<pre class="alt">                        message.InterviewerId);</pre>
<pre>                ICandidate candidate = session.Get&lt;ICandidate&gt;(</pre>
<pre class="alt">                        message.CandidateId);</pre>
<pre>&nbsp;</pre>
<pre class="alt">                interviewer.ScheduleInterviewWith(candidate)</pre>
<pre>                        .At(message.RequestedTime);</pre>
<pre class="alt">&nbsp;</pre>
<pre>                tx.Commit();</pre>
<pre class="alt">            }</pre>
<pre>&nbsp;</pre>
<pre class="alt">            <span class="rem">// publish new appointment data</span></pre>
<pre>        }</pre>
<pre class="alt">    }</pre>
<pre>}</pre>
</div>
<p>If you&#8217;ve read this far and have more questions, please feel free to <a href="mailto:questions@UdiDahan.com">send them my way</a>. If you&#8217;re at a more time-critical part of your project and need an answer quickly, we can <a href="mailto:OnlineConsultation@UdiDahan.com">set up a Skype call</a>. This has been working quite well for many of my overseas clients (shout out to the guys in Ireland and Florida).</p>
<p>Until next time <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/02/18/advanced-messaging-with-a-dash-of-ddd/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Durable Messaging Is Not Enough</title>
		<link>http://www.udidahan.com/2008/01/09/durable-messaging-is-not-enough/</link>
		<comments>http://www.udidahan.com/2008/01/09/durable-messaging-is-not-enough/#comments</comments>
		<pubDate>Wed, 09 Jan 2008 23:17:27 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2008/01/09/durable-messaging-is-not-enough/</guid>
		<description><![CDATA[I&#8217;ve been sitting on this post for a while, waiting, before outlining all the kinds of problems durable messaging doesn&#8217;t solve, I wanted to have a solution handy. Harry Pierson begins to outline the goodness that durable messaging brings to SOA, and in a later post on idempotence describes in general terms how it ties [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been sitting on this post for a while: before outlining all the kinds of problems durable messaging doesn&#8217;t solve, I wanted to have a solution handy. Harry Pierson begins to outline the goodness that <a href="http://devhawk.net/2007/05/30/The+Case+For+Durable+Messaging+In+Service+Orientation.aspx">durable messaging brings to SOA</a>, and in a <a href="http://devhawk.net/2007/11/09/The+Importance+Of+Idempotence.aspx">later post on idempotence</a> describes in general terms how it ties back into durable messaging and transactions &#8211; in essence describing a <a href="http://udidahan.weblogs.us/2007/12/17/no-more-workflow-for-nservicebus-please-welcome-the-saga/">saga</a>. Let&#8217;s do this in story form.</p>
<p>Say you&#8217;re concerned that your shipping company&#8217;s servers may be down for some kind of planned (or unplanned) maintenance just as you&#8217;re trying to fulfill orders, so you use a durable messaging solution there. Messages get written to disk on your end, and the messaging infrastructure keeps trying to transfer them until it succeeds. So what&#8217;s wrong with that?</p>
<p>Well, let&#8217;s say that the shipping company&#8217;s servers went up in smoke (true story &#8211; broken down air conditioners + poor ventilation, you get the picture). Those servers aren&#8217;t going to be coming back online any second now. So, you have all these order messages buffering on your disk. Taking into account all the data, meta-data, XML, SOAP, encryption and everything, we may get up to 1MB per message.</p>
<p>And now it&#8217;s holiday season and your company&#8217;s selling hand over fist, hundreds of orders per second from all over the world. Even at just a hundred orders per second, that means we&#8217;re eating up 100MB of disk per second. That&#8217;s 6GB a minute, and in under an hour of our shipping company&#8217;s servers going down &#8211; so do ours.</p>
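<p>The arithmetic behind that estimate can be sketched as follows. The 1MB per message and 100MB of disk per second come from the story above; the 300GB of free disk and the <code>TimeUntilFull</code> helper are assumptions made for this example.</p>
<div class="csharpcode">
<pre>using System;

public static class DiskFillEstimate
{
    public static TimeSpan TimeUntilFull(double freeDiskBytes, double messageBytes, double messagesPerSecond)
    {
        return TimeSpan.FromSeconds(freeDiskBytes / (messageBytes * messagesPerSecond));
    }

    public static void Main()
    {
        const double MB = 1e6, GB = 1e9;
        // 1MB per message at 100 messages per second is the 100MB per second from the story;
        // the 300GB of free disk is an assumed figure.
        Console.WriteLine(TimeUntilFull(300 * GB, 1 * MB, 100));   // 00:50:00
    }
}</pre>
</div>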
<p>Durable messaging &#8211; yay? We don&#8217;t want to lose those orders, right? In short, durable messaging is an important part of the solution, but it&#8217;s not the whole solution.</p>
<p>[Continued next time...]</p>
<p>If you&#8217;re impatient and just want the solution, yes, <a href="http://www.nServiceBus.com">nServiceBus</a> gives you all the tools you need.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2008/01/09/durable-messaging-is-not-enough/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Handling messages out of order</title>
		<link>http://www.udidahan.com/2007/12/15/handling-messages-out-of-order/</link>
		<comments>http://www.udidahan.com/2007/12/15/handling-messages-out-of-order/#comments</comments>
		<pubDate>Sat, 15 Dec 2007 23:22:24 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Data Access]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Threading]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/12/15/handling-messages-out-of-order/</guid>
		<description><![CDATA[I wanted to follow up on my recent post, &#8220;In order messaging a myth?&#8221; by showing the exact code that solves the issue. I have a podcast waiting to come online that deals with the specifics, so keep your eye out for that too.
The important thing to note is that if we just automatically return [...]]]></description>
			<content:encoded><![CDATA[<p>I wanted to follow up on my recent post, &#8220;<a href="http://udidahan.weblogs.us/2007/12/09/in-order-messaging-a-myth/">In order messaging a myth?</a>&#8221; by showing the exact code that solves the issue. I have a podcast waiting to come online that deals with the specifics, so keep your eye out for that too.</p>
<p>The important thing to note is that if we just automatically return the message to the queue, we may get &#8220;stuck&#8221; with that message if the first PolicyCreatedMessage never arrived. This opens us up to a Denial-of-Service attack by quite simply flooding us with a bunch of messages that never get cleaned up.</p>
<p>Anyway, the general idea is to first try the regular happy path, and only if we see that prerequisite data isn&#8217;t available do we check whether another thread may be working on that data. This is done by decreasing the isolation level of our transaction from the regular ReadCommitted to ReadUncommitted. This will enable our thread to see if some other thread inserted the policy into the Policies table but hasn&#8217;t committed its transaction yet.</p>
<div style="overflow: auto; ">
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">public</span> <span style="COLOR: blue">class</span> <span style="COLOR: #2b91af">PolicyApprovedMessageHandler</span> : BaseDBMessageHandler&lt;PolicyApprovedMessage&gt;<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp; </span>{<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">public</span> <span style="COLOR: blue">override</span> <span style="COLOR: blue">void</span> Handle(PolicyApprovedMessage message)<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>{<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">bool</span> policyExists = <span style="COLOR: blue">true</span>;<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><o:p><font size="3">&nbsp;</font></o:p></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">using</span> (<span style="COLOR: #2b91af">ISession</span> s = OpenSession())<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">using</span> (<span style="COLOR: #2b91af">ITransaction</span> tx = s.BeginTransaction(<span style="COLOR: #2b91af">IsolationLevel</span>.ReadCommitted))<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>{<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>Policy p = s.Get&lt;Policy&gt;(message.PolicyId);<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><o:p><font size="3">&nbsp;</font></o:p></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">if</span> (p != <span style="COLOR: blue">null</span>)<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>{<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>p.Approve();<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>tx.Commit();<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>}<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">else<o:p></o:p></span></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>policyExists = <span style="COLOR: blue">false</span>;<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>}<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><o:p><font size="3">&nbsp;</font></o:p></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">if</span> (!policyExists) <span style="COLOR: green">// check to make sure<o:p></o:p></span></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">using</span> (<span style="COLOR: #2b91af">ISession</span> s = OpenSession())<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="mso-spacerun: yes">&nbsp;</span><span style="COLOR: blue">using</span> (<span style="COLOR: #2b91af">ITransaction</span> tx = s.BeginTransaction(<span style="COLOR: #2b91af">IsolationLevel</span>.ReadUncommitted))<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>{<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>Policy p = s.Get&lt;Policy&gt;(message.PolicyId);<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><o:p><font size="3">&nbsp;</font></o:p></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">if</span> (p != <span style="COLOR: blue">null</span>) <span style="COLOR: green">// another thread hasn&#8217;t committed its tx yet, so try message again later<o:p></o:p></span></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">this</span>.bus.HandleCurrentMessageLater();<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">else<o:p></o:p></span></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><span style="COLOR: blue">this</span>.bus.Return((<span style="COLOR: blue">int</span>)ErrorCodes.PolicyNotFound);<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>}<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt; LINE-HEIGHT: normal; mso-layout-grid-align: none"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>}<o:p></o:p></font></span></p>
<p class="MsoNormal" style="MARGIN: 0cm 0cm 10pt"><span style="FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-no-proof: yes"><font size="3"><span style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp; </span>}</font></span><span style="FONT-SIZE: 9pt; LINE-HEIGHT: 115%"><o:p></o:p></span></p>
</div>
<p>The next step will be to take this code and make it generic, so that we don&#8217;t have to write the same code over and over again for the different kinds of message handlers we have.</p>
<p>But that will have to wait until the next installment <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/12/15/handling-messages-out-of-order/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Asynchronous, High-Performance Login for Web Farms</title>
		<link>http://www.udidahan.com/2007/11/10/asynchronous-high-performance-login-for-web-farms/</link>
		<comments>http://www.udidahan.com/2007/11/10/asynchronous-high-performance-login-for-web-farms/#comments</comments>
		<pubDate>Sat, 10 Nov 2007 16:08:46 +0000</pubDate>
		<dc:creator>udidahan</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Caching]]></category>
		<category><![CDATA[Data Access]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[NServiceBus]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Web Services]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/11/10/asynchronous-high-performance-login-for-web-farms/</guid>
		<description><![CDATA[Often during my consulting engagements I run into people who say, &#34;some things just can&#8217;t be made asynchronous&#34; even after they agree about the inherent scalability that asynchronous communication patterns bring. One often-cited example is user authentication &#8211; taking a username and password combo and authenticating it against some back-end store. For the purpose of [...]]]></description>
			<content:encoded><![CDATA[<p>Often during my consulting engagements I run into people who say, &quot;some things just can&#8217;t be made asynchronous&quot; even after they agree about the inherent scalability that asynchronous communication patterns bring. One often-cited example is user authentication &#8211; taking a username and password combo and authenticating it against some back-end store. For the purpose of this post, I&#8217;m going to assume a database. Also, I&#8217;m not going to be showing more advanced features like ETags to further improve the solution.</p>
<h3>The Setup</h3>
<p>Just so that the example is in itself secure, we&#8217;ll assume that the password is one-way hashed before being stored. Also, given a reasonable network infrastructure our web servers will be isolated in the <a href="http://en.wikipedia.org/wiki/Demilitarized_zone_(computing)">DMZ</a> and will have to access some application server which, in turn, will communicate with the DB. There&#8217;s also a good chance for something like round-robin load-balancing between web servers, especially for things like user login.</p>
<p>Before diving into the meat of it, I wanted to preface with a few words. One of the commonalities I&#8217;ve found when people dismiss asynchrony is that they don&#8217;t consider a real deployment environment, or scaling up a solution to multiple servers, farms, or datacenters.</p>
<h3>The Synchronous Solution</h3>
<p>In the synchronous solution, each one of our web servers will be contacting the app server for each user login request. In other words, the load on the app server and, consequently, on the database server will be proportional to the number of logins. One property of this load is its data locality, or rather, the lack of it. Given that user U logged in, the DB won&#8217;t necessarily gain any performance benefits by loading all username/password data into memory for the same page as user U. Another property is that this data is very non-volatile &#8211; it doesn&#8217;t change that often.</p>
<p>I won&#8217;t go too far into the synchronous solution since it&#8217;s been <a href="http://www.michaelnygard.com/blog/2007/11/two_ways_to_boost_your_flaggin.html">analysed</a> numerous times before. The bottom line is that the database is the bottleneck. You could use sharding solutions. Many of the large sites have numerous read-only databases for this kind of data, with one master for updates &#8211; replicating out to the read-only replicas. That&#8217;s <a href="http://www.michaelnygard.com/blog/2007/11/two_quick_observations.html">great</a> if you&#8217;re using a nice cheap database like MySQL (of LAMP), not so nice if you&#8217;re running Oracle or MS SQL Server.</p>
<p>Regardless of what you&#8217;re doing in your data tier, you&#8217;re still going there for every login. Wouldn&#8217;t it be nice to close the loop in the web servers? Even if you are using Apache, that&#8217;s going to be less iron, electricity, and cooling all around. That&#8217;s what the asynchronous solution is all about &#8211; capitalizing on the low cost of memory to save on other things.</p>
<h3>The Asynchronous Solution</h3>
<p>In the asynchronous solution, we cache username/hashed-password pairs in memory on our web servers, and authenticate against that. Let&#8217;s analyse how much memory that takes:</p>
<p>Usernames are usually 12 characters or fewer, but let&#8217;s assume 32 to be safe. Using Unicode at 2 bytes per character, we get 64 bytes for the username. Hashed passwords run between 256 and 512 <em>bits</em> depending on the algorithm; divide by 8 and you have at most 64 bytes. That&#8217;s about 128 bytes altogether, so we can safely cache 8 million of these with 1GB of memory per web server. If you&#8217;ve got a million users, first of all, good for you <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Second, that&#8217;s just 128 MB of memory &#8211; next to nothing even for a cheap 2GB web server. </p>
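<p>The arithmetic above can be checked with a few lines of C#. This is only a back-of-the-envelope sketch: the class and constant names are made up for illustration, it assumes UTF-16 strings (2 bytes per character, as .NET uses), and it ignores per-object and dictionary overhead, which would add to the real footprint.</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>using System; <br />
&#160; <br />
// Hypothetical helper, not part of NServiceBus <br />
public static class CacheSizeEstimate <br />
{ <br />
&#160;&#160;&#160; public static void Main() <br />
&#160;&#160;&#160; { <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; const int usernameChars = 32; // generous figure used in the text <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; const int usernameBytes = usernameChars * 2; // UTF-16: 64 bytes <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; const int hashBytes = 512 / 8; // largest hash considered: 64 bytes <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; const int bytesPerUser = usernameBytes + hashBytes; // 128 bytes <br />
&#160; <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; const long oneGigabyte = 1024L * 1024L * 1024L; <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; long usersPerGigabyte = oneGigabyte / bytesPerUser; <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; long bytesForMillionUsers = 1000000L * bytesPerUser; <br />
&#160; <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; Console.WriteLine(usersPerGigabyte); // 8388608, i.e. roughly 8 million <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; Console.WriteLine(bytesForMillionUsers); // 128000000, i.e. 128 MB <br />
&#160;&#160;&#160; } <br />
}</p>
</div>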
<p>Also, consider the fact that when registering a new user we can check if such a username is already taken at the web server level. That doesn&#8217;t mean it won&#8217;t be checked again in the DB to account for <a href="http://udidahan.weblogs.us/2007/01/22/realistic-concurrency/">concurrency issues</a>, but that the load on the DB is further reduced. Other things to notice include no read-only replicas and no replication. Simple. Our web servers are the &quot;replicas&quot;.</p>
<h3>The Authentication Service</h3>
<p>What makes it all work is the &quot;Authentication Service&quot; on the app server. This was always there in the synchronous solution. It is what used to field all the login requests from the web servers, and, of course, allowed them to register new users and all the regular stuff. The difference is that now it publishes a message when a new user is registered (or rather, is validated &#8211; all a part of the internal long-running workflow). It also allows subscribers to receive the list of all username/hashed-password pairs. It&#8217;s also quite likely that it would keep the same data in memory too.</p>
<p>When using <a href="http://www.NServiceBus.com">NServiceBus</a>, the same message can be used both to publish single updates and to return the full list. Let&#8217;s define the message:</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>[Serializable]      <br />public class UsernameInUseMessage : IMessage       <br />{       <br />&#160;&#160;&#160; private string username;       <br />&#160;&#160;&#160; public string Username       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; get { return username; }       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; set { username = value; }       <br />&#160;&#160;&#160; } </p>
</p>
<p>&#160;&#160;&#160; private byte[] hashedPassword;      <br />&#160;&#160;&#160; public byte[] HashedPassword       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; get { return hashedPassword; }       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; set { hashedPassword = value; }       <br />&#160;&#160;&#160; }       <br />} </p>
</p></div>
<p>And the message that the web server sends when it wants the full list:</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>[Serializable]      <br />public class GetAllUsernamesMessage : IMessage       <br />{ </p>
<p>} </p>
</p></div>
<p>And the code that the web server runs on startup looks like this (assuming constructor injection):</p>
<p>&#160;</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>public class UserAuthenticationServiceAgent      <br />{&#160; <br />&#160;&#160;&#160; private readonly IBus bus; // injected through the constructor       <br />&#160; <br />&#160;&#160;&#160; public UserAuthenticationServiceAgent(IBus bus)&#160; <br />&#160;&#160;&#160; {&#160; <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; this.bus = bus;       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; // receive every future update, then ask for the full list once       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; bus.Subscribe(typeof(UsernameInUseMessage));&#160; <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; bus.Send(new GetAllUsernamesMessage());       <br />&#160;&#160;&#160; } </p>
<p> }</p></div>
<p>And the code that runs in the Authentication Service when the GetAllUsernamesMessage is received:</p>
</p>
<p>&#160;</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>public class GetAllUsernamesMessageHandler : BaseMessageHandler&lt;GetAllUsernamesMessage&gt;      <br />{       <br />&#160;&#160;&#160; public override void Handle(GetAllUsernamesMessage message)       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; this.Bus.Reply(Cache.GetAll&lt;UsernameInUseMessage&gt;());       <br />&#160;&#160;&#160; }       <br />}</p>
</p></div>
<p>&#160;</p>
<p>And the class on the web server that handles a UsernameInUseMessage when it arrives:</p>
</p>
<p>&#160;</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>public class UsernameInUseMessageHandler : BaseMessageHandler&lt;UsernameInUseMessage&gt;      <br />{       <br />&#160;&#160;&#160; public override void Handle(UsernameInUseMessage message)       <br />&#160;&#160;&#160; {&#160; <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; WebCache.SaveOrUpdate(message.Username, message.HashedPassword);&#160; <br />&#160;&#160;&#160; }       <br />}</p>
</p></div>
<p>When the app server sends the full list, multiple objects of the type UsernameInUseMessage are sent in one physical message to that web server. However, the bus object that runs on the web server dispatches each of these logical messages one at a time to the message handler above.</p>
<p>So, when it comes time to actually authenticate a user, this is what the web page (or controller, if you&#8217;re doing MVC) would call:</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
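<p>public class UserAuthenticationServiceAgent      <br />{       <br />&#160;&#160;&#160; public bool Authenticate(string username, string password)       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; byte[] existingHashedPassword = WebCache[username];       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; if (existingHashedPassword == null)       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; return false;       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; // == on byte[] compares references, so compare the contents instead       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; return HashesMatch(existingHashedPassword, this.Hash(password));       <br />&#160;&#160;&#160; } </p>
<p>&#160;&#160;&#160; // checks every byte without exiting early, so timing doesn&#8217;t reveal where the hashes differ      <br />&#160;&#160;&#160; private static bool HashesMatch(byte[] a, byte[] b)       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; if (a.Length != b.Length)       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; return false;       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; int diff = 0;       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; for (int i = 0; i &lt; a.Length; i++)       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; diff |= a[i] ^ b[i];       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; return diff == 0;       <br />&#160;&#160;&#160; }       <br />}</p>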
</p></div>
<p>&#160;</p>
<p>When registering a new user, the web server would of course first check its cache, and then send a RegisterUserMessage that contained the username and the hashed password.</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>[Serializable]      <br />[StartsWorkflow]       <br />public class RegisterUserMessage : IMessage       <br />{       <br />&#160;&#160;&#160; private string username;       <br />&#160;&#160;&#160; public string Username       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; get { return username; }       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; set { username = value; }       <br />&#160;&#160;&#160; } </p>
</p>
<p>&#160;&#160;&#160; private string email;      <br />&#160;&#160;&#160; public string Email       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; get { return email; }       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; set { email = value; }       <br />&#160;&#160;&#160; } </p>
</p>
<p>&#160;&#160;&#160; private byte[] hashedPassword;      <br />&#160;&#160;&#160; public byte[] HashedPassword       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; get { return hashedPassword; }       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; set { hashedPassword = value; }       <br />&#160;&#160;&#160; }       <br />} </p>
</p></div>
<p>&#160;</p>
<p>When the RegisterUserMessage arrives at the app server, a new long-running workflow is kicked off to handle the process:</p>
</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>public class RegisterUserWorkflow :      <br />&#160;&#160;&#160; BaseWorkflow&lt;RegisterUserMessage&gt;, IMessageHandler&lt;UserValidatedMessage&gt;       <br />{       <br />&#160;&#160;&#160; public void Handle(RegisterUserMessage message)       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; //send validation request to message.Email containing this.Id (a guid)       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; // as a part of the URL       <br />&#160;&#160;&#160; } </p>
<p>&#160;&#160;&#160; /// &lt;summary&gt;      <br />&#160;&#160;&#160; /// When a user clicks the validation link in the email, the web server       <br />&#160;&#160;&#160; /// sends this message (containing the workflow Id)       <br />&#160;&#160;&#160; /// &lt;/summary&gt;       <br />&#160;&#160;&#160; /// &lt;param name=&quot;message&quot;&gt;&lt;/param&gt;       <br />&#160;&#160;&#160; public void Handle(UserValidatedMessage message)       <br />&#160;&#160;&#160; {       <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; // write user to the DB </p>
<p>&#160;&#160;&#160;&#160;&#160;&#160;&#160; this.Bus.Publish(new UsernameInUseMessage(      <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; message.Username, message.HashedPassword));       <br />&#160;&#160;&#160; }       <br />}</p>
</p></div>
<p>That UsernameInUseMessage would eventually arrive at all the web servers subscribed.</p>
<h3>Performance/Security Trade-Offs</h3>
<p>When looking deeper into this workflow we realize that it could be implemented as two separate message handlers, and have the email address take the place of the workflow Id. The problem with this alternate, better performing solution has to do with security. By removing the dependence on the workflow Id, we&#8217;ve in essence stated that we&#8217;re willing to receive a UserValidatedMessage without having previously received the RegisterUserMessage. </p>
<p>Since the processing of the UserValidatedMessage is relatively expensive &#8211; writing to the DB and publishing messages to <em>all</em> web servers, a malicious user could perform a denial of service (<a href="http://en.wikipedia.org/wiki/Denial-of-service_attack">DOS</a>) attack without that many messages, thus flying under the radar of many detection systems. Spoofing a guid that would result in a valid workflow instance is much more difficult. Also, since workflow instances would probably be stored in some in-memory, replicated data grid the relative cost of a lookup would be quite small &#8211; small enough to avoid a DOS until a detection system picked it up.</p>
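<p>To make the trade-off concrete, here is a minimal sketch of the cheap check that the workflow Id buys us. It is not how NServiceBus stores workflow instances: the PendingRegistrations class and its methods are hypothetical, and a plain in-memory dictionary stands in for the replicated data grid mentioned above.</p>
<div style="border-right: black 1px solid; padding-right: 1em; border-top: black 1px solid; padding-left: 1em; padding-bottom: 0em; overflow: auto; border-left: black 1px solid; padding-top: 0em; border-bottom: black 1px solid; font-family: courier; background-color: beige">
<p>using System; <br />
using System.Collections.Generic; <br />
&#160; <br />
// Hypothetical stand-in for the workflow instance store <br />
public class PendingRegistrations <br />
{ <br />
&#160;&#160;&#160; private readonly object sync = new object(); <br />
&#160;&#160;&#160; private readonly Dictionary&lt;Guid, string&gt; pending = new Dictionary&lt;Guid, string&gt;(); <br />
&#160; <br />
&#160;&#160;&#160; // Called when a registration starts; the returned id goes into the validation email <br />
&#160;&#160;&#160; public Guid Start(string username) <br />
&#160;&#160;&#160; { <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; Guid id = Guid.NewGuid(); <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; lock (sync) { pending[id] = username; } <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; return id; <br />
&#160;&#160;&#160; } <br />
&#160; <br />
&#160;&#160;&#160; // Called when a validation message arrives. Unknown ids are rejected with <br />
&#160;&#160;&#160; // a single lookup, before any DB write or publish takes place. <br />
&#160;&#160;&#160; public bool TryComplete(Guid workflowId, out string username) <br />
&#160;&#160;&#160; { <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; lock (sync) <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; { <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; if (!pending.TryGetValue(workflowId, out username)) <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; return false; <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; pending.Remove(workflowId); // each id can be used only once <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; return true; <br />
&#160;&#160;&#160;&#160;&#160;&#160;&#160; } <br />
&#160;&#160;&#160; } <br />
}</p>
</div>
<p>A spoofed message carrying a random guid costs one dictionary lookup and nothing else, which is the property the argument above relies on.</p>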
<h3>Improved Bandwidth &amp; Latency</h3>
<p>The bottom line is that you&#8217;re getting much more out of your web tier this way, rather than hammering your data tier and having to scale it out much sooner. Also, notice that there is much less network traffic this way. Not such a big deal for usernames and passwords, but other scenarios built in the same way may need more data. Of course, the time it takes us to log a user in is much shorter as well since we don&#8217;t have to cross back and forth from the web server (in the DMZ) to the app server, to the db server.</p>
<p>The important thing to remember about this solution is that it is built on pub/sub. NServiceBus merely provides a simple API for designing the system around pub/sub. And publishing is where you get the serious scalability. As you get more users, you&#8217;ll obviously need to get more web servers. The thing is that you probably won&#8217;t need more database servers <em>just to handle logins</em>. In this case, you also get <a href="http://www.michaelnygard.com/blog/2007/11/architecting_for_latency.html">lower latency</a> per request since all the work that needs to be done can be done locally on the server that received the request. </p>
<h3>ETags make it even better</h3>
<p>For the more advanced crowd, I&#8217;ll wrap it up with the <a href="http://en.wikipedia.org/wiki/HTTP_ETag">ETags</a>. Since web servers do go down, and the cache will be cleared, what we can do is to write that cache to disk (probably in a background thread), and &quot;tag&quot; it with something that the server gave us along with the last UsernameInUseMessage we received. That way, when the web server comes back up, it can send that ETag along with its GetAllUsernamesMessage so that the app server will only send the changes that occurred since. This drives down network usage even more at the insignificant cost of some disk space on the web servers.</p>
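<p>As a rough illustration, the ETag exchange could look like the Python sketch below. It assumes the tag is simply the position in an append-only log of usernames kept by the app server; the class and method names are hypothetical, and removals are not handled.</p>

```python
class AppServer:
    """Keeps an ordered log of usernames; the ETag is the log position."""

    def __init__(self):
        self._log = []

    def register(self, username):
        self._log.append(username)

    def get_usernames(self, etag=0):
        # Return only the entries added after the caller's ETag,
        # together with the new ETag to store next to the cache.
        return self._log[etag:], len(self._log)


class WebServerCache:
    def __init__(self):
        self.usernames = set()
        self.etag = 0  # would be persisted to disk along with the cache

    def sync(self, app_server):
        changes, self.etag = app_server.get_usernames(self.etag)
        self.usernames.update(changes)
        return len(changes)  # number of entries sent over the wire
```

<p>A web server that restarts with its cache and tag intact only pulls the entries it missed, rather than the full list of usernames.</p>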
<h3>And in closing&#8230;</h3>
<p>Even if you don&#8217;t have anything more than a single physical server today, and it acts as both your web server and database server, this solution won&#8217;t slow things down. If anything, it&#8217;ll speed things up. Regardless, you&#8217;re much better prepared to scale out than before &#8211; no need to rip and replace your entire architecture just as you get 8 million Facebook users banging down your front door.</p>
<p>So, go check out <a href="http://www.NServiceBus.com">NServiceBus</a> and get the most out of your iron.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/11/10/asynchronous-high-performance-login-for-web-farms/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Make non-stop software simple, make it possible</title>
		<link>http://www.udidahan.com/2007/08/24/make-non-stop-software-simple-make-it-possible/</link>
		<comments>http://www.udidahan.com/2007/08/24/make-non-stop-software-simple-make-it-possible/#comments</comments>
		<pubDate>Fri, 24 Aug 2007 06:08:35 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/08/24/make-non-stop-software-simple-make-it-possible/</guid>
		<description><![CDATA[Dan Pritchett is calling for Non-Stop Software and I for one am picking up the cry as well. I&#8217;d also like non-stop operating systems, databases, and tools to support it. I&#8217;d like a system where I won&#8217;t be afraid to back up the database while it&#8217;s running. I&#8217;d like a system where I can incrementally upgrade [...]]]></description>
			<content:encoded><![CDATA[<p>Dan Pritchett is calling for <a href="http://www.addsimplicity.com/adding_simplicity_an_engi/2007/08/in-support-of-n.html">Non-Stop Software</a> and I for one am picking up the cry as well. I&#8217;d also like non-stop operating systems, databases, and tools to support it. I&#8217;d like a system where I won&#8217;t be afraid to back up the database while it&#8217;s running. I&#8217;d like a system where I can incrementally upgrade the middleware while it&#8217;s running. I&#8217;d like programming models where all of this is hidden from the programmer &#8211; I understand that this means preventing programmers from working in certain ways (synchronous RPC for one).</p>
<p>I don&#8217;t have that today. All I have are patterns which, when strictly applied, bring me close. This is a real challenge with geographically dispersed teams. It&#8217;s even hard with everyone sitting in the same room.</p>
<p>That is why this call is more directed at the platform and tool vendors than anyone else. Stop changing paradigms all the time. Stick with one or two, make them rock solid. Make it simple to get it right.</p>
<p>We need non-stop software.</p>
<p>Let&#8217;s start talking about how to get there from here.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/08/24/make-non-stop-software-simple-make-it-possible/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>What happens if it fails &#8211; circa 1989</title>
		<link>http://www.udidahan.com/2007/08/13/what-happens-if-it-fails-circa-1989/</link>
		<comments>http://www.udidahan.com/2007/08/13/what-happens-if-it-fails-circa-1989/#comments</comments>
		<pubDate>Mon, 13 Aug 2007 12:55:17 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[Simplicity]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/08/13/what-happens-if-it-fails-circa-1989/</guid>
		<description><![CDATA[Via Patrick Logan&#8217;s Bits of Wisdom: HOPL III&#8217;s History of Erlang:

1989 also provided us with one of our first opportunities to present Erlang to the world outside Ericsson. This was when we presented a paper at the SETSS conference in Bournemouth. This conference was interesting not so much for the paper but for the discussions [...]]]></description>
			<content:encoded><![CDATA[<p>Via Patrick Logan&#8217;s <a href="http://patricklogan.blogspot.com/2007/08/bits-of-wisdon-hopl-iiis-history-of.html">Bits of Wisdom: HOPL III&#8217;s History of Erlang</a>:</p>
<blockquote><p>
1989 also provided us with one of our first opportunities to present Erlang to the world outside Ericsson. This was when we presented a paper at the SETSS conference in Bournemouth. This conference was interesting not so much for the paper but for the discussions we had in the meetings and for the contacts we made with people from Bellcore. It was during this conference that we realised that the work we were doing on Erlang was very different from a lot of mainstream work in telecommunications programming. Our major concern at the time was with detecting and recovering from errors. I remember Mike, Robert and I having great fun asking the same question over and over again: “what happens if it fails?”— the answer we got was almost always a variant on “our model assumes no failures.” We seemed to be the only people in the world designing a system that could recover from software failures&#8230;
</p></blockquote>
<p>Those of you who&#8217;ve had the chance to hear me speak about technology and architecture know that one of my favorite ways to win an argument is by saying, &#8220;yeah, but what if the server restarts?&#8221;</p>
<p>Almost 20 years later, the more things change, the more they stay the same.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/08/13/what-happens-if-it-fails-circa-1989/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>No such thing as a centralized ESB</title>
		<link>http://www.udidahan.com/2007/06/30/no-such-thing-as-a-centralized-esb/</link>
		<comments>http://www.udidahan.com/2007/06/30/no-such-thing-as-a-centralized-esb/#comments</comments>
		<pubDate>Sat, 30 Jun 2007 13:09:55 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[BizTalk]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[SCA & SDO]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/06/30/no-such-thing-as-a-centralized-esb/</guid>
		<description><![CDATA[Via David McGhee&#8217;s Q&#038;A with Dr. Don Ferguson, but read the whole thing.

Q: Could you tell you your thoughts or preference for a distributed or centralized ESB? 
DON: there is no such thing as a centralized ESB.

This is the problem with a lot of the products that call themselves ESBs. They are centralized brokers which [...]]]></description>
			<content:encoded><![CDATA[<p>Via <a href="http://blogs.msdn.com/davidmcg/archive/2007/06/28/soa-esb-integration-in-the-real-world.aspx">David McGhee&#8217;s Q&#038;A</a> with <a href="http://www.microsoft.com/presspass/exec/techfellow/Ferguson/default.mspx">Dr. Don Ferguson</a>, but read the whole thing.</p>
<blockquote><p>
Q: Could you tell you your thoughts or preference for a distributed or centralized ESB? </p>
<p>DON: there is no such thing as a centralized ESB.
</p></blockquote>
<p>This is the problem with a lot of the products that call themselves ESBs. They are centralized brokers which may be clustered for availability. But they are in no way an implementation of the Bus Architectural Pattern. Please check this before cutting a check to your vendor.</p>
<p>Also, understand that if you do security-related things in your ESB, possibly as part of your routing rules, and the security infrastructure is centralized, then your ESB is centralized too. Even if it really was distributed to begin with.</p>
<p>Buyer beware.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/06/30/no-such-thing-as-a-centralized-esb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Space-Based Architecture – scalable, but not much to do with SOA</title>
		<link>http://www.udidahan.com/2007/06/20/space-based-architecture-%e2%80%93-scalable-but-not-much-to-do-with-soa/</link>
		<comments>http://www.udidahan.com/2007/06/20/space-based-architecture-%e2%80%93-scalable-but-not-much-to-do-with-soa/#comments</comments>
		<pubDate>Wed, 20 Jun 2007 21:31:25 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Simplicity]]></category>
		<category><![CDATA[Space-Based Architecture]]></category>
		<category><![CDATA[Web Services]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/06/20/space-based-architecture-%e2%80%93-scalable-but-not-much-to-do-with-soa/</guid>
		<description><![CDATA[Space-Based Architecture (or SBA for short) just might be in your future if you&#8217;re building large-scale distributed systems. By focusing on high-throughput and low latency, SBA joins messaging and in-memory data caching and adds a good measure of load partitioning. However, with the entire industry enamoured with SOA, what place is left for SBA?
Before going [...]]]></description>
			<content:encoded><![CDATA[<p>Space-Based Architecture (or SBA for short) just might be in your future if you&#8217;re building large-scale distributed systems. By focusing on high-throughput and low latency, SBA joins messaging and in-memory data caching and adds a good measure of load partitioning. However, with the entire industry enamoured with SOA, what place is left for SBA?</p>
<p>Before going too far ahead, you might want to take a look at my previous post <a href="http://udidahan.weblogs.us/2007/01/20/space-based-architectural-thinking/">Space-Based Architectural Thinking</a>, or listen to my podcast <a href="http://udidahan.weblogs.us/2007/04/10/podcast-space-based-architectures-for-the-web/">Space-Based Architecture for the Web</a>. There&#8217;s also a 30-minute webcast online describing SBA more fully <a href="http://www.bejug.org/confluenceBeJUG/display/PARLEYS/SBA+-+Scalable+SOA">here</a>. I&#8217;m also going to try to stay away from things concerning Jini this time after already discussing <a href="http://udidahan.weblogs.us/2007/06/12/don%e2%80%99t-let-the-jini-out-of-the-bottle-on-soa/">the connection between Jini and SOA</a>, and the tradeoffs between two general approaches: <a href="http://udidahan.weblogs.us/2007/05/10/tasks-and-spaces-versus-messages-and-handlers/">Tasks and Spaces vs Messages and Handlers</a>.</p>
<p>OK, so the issue of state-management is a big one. Everybody wants to work stateless, because it scales. The only problem is that the business processes that we are automating are long running, meaning that there are external systems or people involved. This makes these processes inherently stateful. So, we need a way to scale statefully &#8211; SBA gives us that. For some background on the &#8220;Shared Nothing Architecture&#8221;, I suggest reading <a href="http://www.dehora.net/journal/2004/04/interprocess_soa.html">this post on inter-process SOA</a> and <a href="http://www.zefhemel.com/archives/2004/09/01/the-share-nothing-architecture">this one as well</a>.</p>
<p>Availability also has to be handled, not only in terms of having enough servers online to handle the required load but also in having all the data required to process each request be accessible. This has often been handled by the database using ACID transactions &#8211; with durability being the property that solves the availability issue, but also the one that hurts latency the most. The problem with saving the state of our long-running business processes/workflows in the database is the load and the responsiveness requirements. In many verticals (telcos, finance, and defense, to name a few) we need millisecond-level latency on each stage of the workflow. This is what leads SBA to the in-memory, replicated data grid.</p>
<p>Note that SBA only intends to take these workflows out of the database, and not anything else &#8211; especially not Master Data. The lifetime of these workflows is incredibly short compared to that of master data like customers and products. They will have very different backup strategies as well. In terms of load, these workflows will be heavy on reads and writes together in the same transactions, but quite low in terms of just reads. If we have workflows that perform work in parallel, we easily end up with concurrency requirements that make DBAs cringe under the barrage of short transactions.</p>
<blockquote style="background-color:white"><p>
If you&#8217;re worried that Workflow Foundation (WF) won&#8217;t scale because of the above, you needn&#8217;t be. You can (more or less easily) replace the persistence mechanism of WF with your own, saving your workflow instances to an in-memory replicated data grid.
</p></blockquote>
<p>By enabling the objects in the grid to call back into logic on your servers, you have, in essence, done messaging and more. The added benefit that SBA receives from this is a unification of technology between caching and messaging. This translates directly to savings when it comes time to cluster each of those technologies&#8217; environments.</p>
<p>Finally, if we can find an attribute in the incoming stream of messages that creates a nice even distribution, we can then partition our load between our servers by that key. This will work up to the point where the load per key increases beyond a single server&#8217;s capacity, and then we have to look at re-partitioning, a non-trivial problem. However, if we put objects in our grid that represent the master data, and tie them to our workflow instances with both of those tied to the key of our load, a smart infrastructure can make sure all that data is already resident on the server that is handling that piece of the load. That decreases latency even more since we no longer have to pay network roundtrips to collect all the data needed before we can process it. That&#8217;s a substantial advantage for the above verticals.</p>
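<p>A minimal Python sketch of that key-based partitioning follows. The function and class names are hypothetical, the hash choice is an assumption, and the sketch deliberately ignores re-partitioning: changing the partition count moves keys around, which is exactly the non-trivial part.</p>

```python
import hashlib


def partition_for(key, partition_count):
    """Map a message key to a partition with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


class Partition:
    """One server's slice of the grid: master data and workflows together."""

    def __init__(self):
        self.master_data = {}
        self.workflows = {}


def route(partitions, key):
    # Every message with the same key lands on the same partition,
    # so the data it needs is already resident there.
    return partitions[partition_for(key, len(partitions))]
```

<p>Because the master data and the workflow instances are stored under the same key, handling a message for that key needs no extra network roundtrips to collect data.</p>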
<p>But all of this has nothing to do with SOA.</p>
<p>Sure, it&#8217;ll change how we implement our Services internally, but it has no impact on their interfaces or the top-level service decomposition. In the Java community, the word &#8220;service&#8221; is often used to describe the logic of a system. Great significance is placed on keeping these &#8220;services&#8221; simple, as in Plain-Old Java Objects. The fact of the matter is that the logic of the system should be simple and independent of other concerns like data access and communications (a la Web Services), but that does not make it a service, not in the SOA sense.</p>
<p>For more information on what Services in SOA are like, check out this podcast on <a href="http://udidahan.weblogs.us/2006/08/28/podcast-business-and-autonomous-components-in-soa/">Business and Autonomous Components in SOA</a>. Actually, SBA will probably have the biggest impact on the way <a href="http://udidahan.weblogs.us/2007/06/02/podcast-using-autonomous-components-for-slas-in-soa/">autonomous components will handle service-level agreements</a>.</p>
<p>So, it appears that even with SOA, SBA has its place. The former deals with business-level agility, the latter with all the technical aspects of supporting that agility. If you&#8217;re tasked with designing the architecture of a scalable, available, high-throughput, low-latency distributed system, I&#8217;d strongly advise you to look at SBA &#8211; the technical value is overwhelming. Even if you don&#8217;t utilize all elements of SBA and choose the <a href="http://www.theserverside.com/tt/articles/article.tss?l=DistCompute">Master Worker Pattern</a> instead of load partitioning, you&#8217;ll find the technologies supporting SBA to be quite flexible in that respect.</p>
<p>Will Space-Based Architectures be a part of your future? I don&#8217;t know for sure, but they&#8217;re a most welcome part of my present.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/06/20/space-based-architecture-%e2%80%93-scalable-but-not-much-to-do-with-soa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On Intermediation And SOA</title>
		<link>http://www.udidahan.com/2007/05/19/on-intermediation-and-soa/</link>
		<comments>http://www.udidahan.com/2007/05/19/on-intermediation-and-soa/#comments</comments>
		<pubDate>Sun, 20 May 2007 04:27:07 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SCA & SDO]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/05/19/on-intermediation-and-soa/</guid>
		<description><![CDATA[Nick Malik has an interesting post on The value of intermediation in SOA where he starts out suggesting a couple of books that stand at the basis of much of today&#8217;s SOA thinking. I agree that far too few people seem to have read them.
In his previous post Is it service-oriented if the message cannot [...]]]></description>
			<content:encoded><![CDATA[<p>Nick Malik has an interesting post on <a href="http://blogs.msdn.com/nickmalik/archive/2007/05/15/the-value-of-intermediation-in-soa.aspx">The value of intermediation in SOA</a> where he starts out suggesting a couple of <a href="http://udidahan.weblogs.us/books/">books</a> that stand at the basis of much of today&#8217;s SOA thinking. I agree that far too few people seem to have read them.</p>
<p>In his previous post <a href="http://blogs.msdn.com/nickmalik/archive/2007/05/14/is-it-service-oriented-if-the-message-cannot-be-intermediated.aspx">Is it service-oriented if the message cannot be intermediated</a>, Nick defines intermediability as <i>&#8220;SOA should give us the ability to intercept a message going from point A to point B, and react to that message without informing either end of that pipe.&#8221;</i>. I&#8217;ll respond to this in due course.</p>
<p>Anyway, he continues on by saying &#8220;SOA [is] an architecture for Enterprise Application Integration.&#8221;</p>
<p>I can&#8217;t agree with that statement. The main reason is that EAI puts the application in the center, and that integrating existing applications is one of its primary purposes. It is my assertion that in order to solve many of the problems that we are having today, we need to take a broader, business-based view of the enterprise and model that with services. A service may be implemented with one or more applications. However, my experience has been that these services tend to use parts of existing applications, with multiple services using different parts of the same application. The reason for this is that the applications we have today, especially the ERP monoliths, do a lot, and at the same time, not everything. This is part of the reality that EAI tried to solve, but then got mired down in cross-system hell. You just can&#8217;t solve poor business decomposition in the technology domain.</p>
<p>Putting services at the fore makes it possible to gradually phase out and evolve legacy applications, and migrate costly mainframe apps bit by bit without having these changes ripple out and break other services. The same is true for those systems&#8217; data &#8211; backup strategies are defined at the service level, impacted primarily by their Service-Level Agreements.</p>
<p>While I whole-heartedly agree with what Nick has to say in terms of <a href="http://udidahan.weblogs.us/category/oo/">OO intermediation</a> of the <a href="http://udidahan.weblogs.us/category/dependency-inection/">Dependency Injection</a> variety, and that <a href="http://udidahan.weblogs.us/category/scalability/">scaling up those same concepts in terms of messaging</a> is the right way to go, I take issue with orchestration in the intermediation area. These &#8220;tactical changes&#8221; need to be done in the context of the top, business-level service strategy. That means that all logic belongs within a service. The &#8220;network&#8221; between services is just that, a &#8220;dumb&#8221; network &#8211; no business logic of any kind, just technological capabilities like knowing which physical server to route messages to.</p>
<p>In this spirit, I&#8217;d like to suggest an alternative solution to the example Nick gives. Here&#8217;s the scenario:</p>
<blockquote><p>
Let&#8217;s say that system 1 generates an invoice.  It sends an event to the world saying &#8220;invoice here&#8221; and system 2 captures that message.  System 2 asks for details about the invoice&#8230; perhaps it will place the information on a web site for internal support teams.</p>
<p>Let&#8217;s say that we are moving to a CRM solution in our internal support groups.  We want to create the information in the CRM system related to the invoices that specific customers have been issued.  We need to integrate these two systems.  The existing web app needs to have a link to the CRM system&#8217;s data, to allow the user to move across easily.
</p></blockquote>
<p>And here is the solution he prescribes:</p>
<blockquote><p>
We can intercept the request for further information from the web app to the publisher.  When the publisher responds with information about the invoice, we can insert the invoice in the CRM system, add a link to the CRM record for that invoice to the data structure, and resume our response to the web app.  Assuming that our canonical schema has a field for &#8216;foreign key&#8217;, we have just integrated our CRM and web information portal&#8230; without changing either one.
</p></blockquote>
<p>Without getting into the business-level analysis of what the correct service decomposition might be, here&#8217;s what I suggest (although all of these &#8220;systems&#8221; might just end up within the same service, or having parts of them being used by multiple services).</p>
<p>First of all, have all information about the invoice available via the message only. This could be done by actually putting all the invoice data in the message, or by placing a URI instead where other systems can HTTP GET it from &#8211; <a href="http://udidahan.weblogs.us/category/rest/">REST style</a>. This decreases coupling between the <a href="http://udidahan.weblogs.us/category/pubsub/">publisher and its subscribers</a>. However, we haven&#8217;t solved the problem of our web apps getting access to the relevant data in the CRM system.</p>
<p>The solution presents itself at the business level. The invoice is not &#8220;complete&#8221; without the appropriate CRM data. Therefore, it does not make sense for a service to publish it that way. Let&#8217;s call this service the Purchasing Service. It would handle the workflow of receiving the first system&#8217;s event, adding the invoice to the CRM system, and taking the resulting full invoice data and publishing that. All external systems like the web apps would see just the final event. Orchestration, if there even is such a thing, occurs within the service boundary. This technology-level intermediation isn&#8217;t even a blip at the business level. We can also imagine other services, say a Sales Service, that would use the CRM system as well.</p>
<p>In summary, when moving to <a href="http://udidahan.weblogs.us/category/soa/">SOA</a>, intermediation provides many technological benefits in getting data and behavior to work across existing systems and applications; however, it&#8217;s largely a NO-OP at the service level. After phasing out many of those existing applications behind the service boundaries, the same service-level interactions would persist. Your Service-Oriented Architecture would not be any different. That&#8217;s the technical agility aspect of SOA.</p>
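<p>In code, the idea might look like the following Python sketch. Everything here is hypothetical: PurchasingService, FakeCrm, the add_invoice call, and the crm_link field are illustrative names only, not the API of any real CRM or messaging product.</p>

```python
class PurchasingService:
    """Hypothetical service owning the invoice workflow described above."""

    def __init__(self, crm, publish):
        self._crm = crm          # assumed to expose add_invoice(invoice)
        self._publish = publish  # callable that publishes the final event

    def on_invoice_generated(self, invoice):
        # Internal step: other services never see this intermediate event.
        crm_link = self._crm.add_invoice(invoice)
        full_invoice = dict(invoice, crm_link=crm_link)
        # Only the complete invoice crosses the service boundary.
        self._publish(full_invoice)


class FakeCrm:
    """In-memory stand-in for the CRM system, for illustration only."""

    def __init__(self):
        self.records = []

    def add_invoice(self, invoice):
        self.records.append(invoice)
        return "crm://invoices/" + str(invoice["id"])
```

<p>Subscribers such as the web apps receive one event that already carries the CRM link, so nothing has to intercept messages between services.</p>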
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/05/19/on-intermediation-and-soa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Grid computing and SOA</title>
		<link>http://www.udidahan.com/2007/05/18/grid-computing-and-soa/</link>
		<comments>http://www.udidahan.com/2007/05/18/grid-computing-and-soa/#comments</comments>
		<pubDate>Fri, 18 May 2007 18:32:51 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/05/18/grid-computing-and-soa/</guid>
		<description><![CDATA[For a great description of what grid computing really is, read this.
What they say about the connection to SOA, though, requires some clarification. Here&#8217;s the quote:

There is a lot of talk going on about synergy between Grid Computing and SOA. It is however driven primarily by implementation concerns at this point rather than by any [...]]]></description>
			<content:encoded><![CDATA[<p>For a great description of what grid computing really is, read <a href="http://jroller.com/page/nivanov?entry=what_is_grid_computing">this</a>.</p>
<p>What they say about the connection to SOA, though, requires some clarification. Here&#8217;s the quote:</p>
<blockquote><p>
There is a lot of talk going on about synergy between Grid Computing and SOA. It is however driven primarily by implementation concerns at this point rather than by any deeper considerations. Clearly, Grid Computing can deliver unchanged value without SOA, yet WS-* based implementation (such as Globus) can be beneficial in some cases (highly distributed heterogeneous environments that should only exist in unfortunate legacy-support situations).
</p></blockquote>
<p>The main thing that I want to call out is that &#8220;grids&#8221; don&#8217;t cross service boundaries &#8211; not at the logical level anyway. That said, if you did share a single grid infrastructure between service implementations, you might have problems maintaining service-level agreements, and autonomy could be put in danger.</p>
<p>Just something to keep in mind.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/05/18/grid-computing-and-soa/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Using spaces with web services</title>
		<link>http://www.udidahan.com/2007/05/05/using-spaces-with-web-services/</link>
		<comments>http://www.udidahan.com/2007/05/05/using-spaces-with-web-services/#comments</comments>
		<pubDate>Sat, 05 May 2007 09:16:38 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Space-Based Architecture]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/05/05/using-spaces-with-web-services/</guid>
		<description><![CDATA[William Brogden has an article up on SearchWebServices.com on How Web Services can use JavaSpaces. I don&#8217;t want all the Microsoft folks tuning out now that they&#8217;ve heard the &#8220;J&#8221; word, so let me just say that there are technologies out there for .NET too.
A &#8220;JavaSpace&#8221; is really just a space, which is, at the [...]]]></description>
			<content:encoded><![CDATA[<p>William Brogden has an article up on SearchWebServices.com on <a href="http://searchwebservices.techtarget.com/tip/0,289483,sid26_gci1251765,00.html?track=NL-449&#038;ad=586258&#038;asrc=EM_NLT_1307248&#038;uid=5532089">How Web Services can use JavaSpaces</a>. I don&#8217;t want all the Microsoft folks tuning out now that they&#8217;ve heard the &#8220;J&#8221; word, so let me just say that there are technologies out there for .NET too.</p>
<p>A &#8220;JavaSpace&#8221; is really just a <a href="http://udidahan.weblogs.us/2007/01/20/space-based-architectural-thinking/">space</a>, which is, at the end of the day, a queryable distributed in-memory hashtable. Something many of us are already doing for caching. The reason you shouldn&#8217;t be doing this yourself is simple. While keeping a single hashtable in memory on a single computer and synchronizing it against changes to your database is simple, doing that in a highly available manner across multiple servers is not. Vendors providing solutions in this space include:</p>
<ul>
<li><a href="http://www.gigaspaces.com">Gigaspaces</a></li>
<li><a href="http://www.ibm.com/developerworks/websphere/downloads/objectgrid/">IBM ObjectGrid</a></li>
<li><a href="http://www.alachisoft.com/ncache/index.html">Alachisoft&#8217;s NCache</a></li>
<li><a href="http://www.gemstone.com">GemStone</a></li>
<li><a href="http://www.tangosol.com/coherence-.net.jsp">Tangosol&#8217;s Coherence</a></li>
</ul>
<p>But there are others as well. Bottom line: don&#8217;t develop one of your own. Do a proof of concept with your short list of vendors and go from there.</p>
<p>The article sums it up nicely like this:</p>
<blockquote><p>
Although JavaSpaces servers are not trivial to set up, they are much easier than any other type of grid computing server. Furthermore, the simplicity of the interface makes the learning curve easier. The greatest advantage of the JavaSpaces approach is the ease with which additional workers can be added to the grid. </p>
<p>It should be clear from the example that there is a lot of extra communication traffic in a JavaSpaces solution so the only reason to use JavaSpaces or any other form of grid computing in support of a Web service is a requirement for computing power or special resources that are not feasible to supply on the server directly.
</p></blockquote>
<p>I have this to add to it. Whereas most traditional systems keep the idea of message-based communication and data caching separate, spaces allow you to kill two birds with one stone. Even if you don&#8217;t go the whole Space-Based Architecture route, you&#8217;ll find that spaces will fit nicely in your distributed architecture toolkit &#8211; I know I did.</p>
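<p>To show what &#8220;two birds with one stone&#8221; means, here is a toy, single-process Python sketch of a space. It is not the JavaSpaces API and none of the vendors above work this way internally; the class and its write, read, take, and notify methods are simplified assumptions that leave out distribution, replication, and transactions entirely.</p>

```python
class Space:
    """Toy single-process space: a queryable table plus notifications."""

    def __init__(self):
        self._entries = []
        self._listeners = []

    def notify(self, template, callback):
        # Messaging side: call back when a matching entry is written.
        self._listeners.append((template, callback))

    def write(self, entry):
        self._entries.append(entry)
        for template, callback in self._listeners:
            if self._matches(entry, template):
                callback(entry)

    def read(self, template):
        # Caching side: query by example without removing the entry.
        for entry in self._entries:
            if self._matches(entry, template):
                return entry
        return None

    def take(self, template):
        # Like read, but removes the entry so only one worker gets it.
        entry = self.read(template)
        if entry is not None:
            self._entries.remove(entry)
        return entry

    @staticmethod
    def _matches(entry, template):
        return all(entry.get(k) == v for k, v in template.items())
```

<p>The same structure answers cache-style queries through read and delivers message-style notifications through notify, which is the unification I&#8217;m talking about. Making it highly available across servers is the hard part the vendors solve.</p>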
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/05/05/using-spaces-with-web-services/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Database performance optimization</title>
		<link>http://www.udidahan.com/2007/04/30/database-performance-optimization/</link>
		<comments>http://www.udidahan.com/2007/04/30/database-performance-optimization/#comments</comments>
		<pubDate>Mon, 30 Apr 2007 11:12:20 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Availability]]></category>
		<category><![CDATA[Data Access]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/04/30/database-performance-optimization/</guid>
		<description><![CDATA[I&#8217;ve been doing quite a bit of consulting these past weeks around performance issues for database intensive systems. The fact that we use O/R Mapping makes the business logic possible to get right (using the Domain Model Pattern), but adds another dimension to the performance optimization &#8211; primarily around limiting the number of roundtrips to [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been doing quite a bit of consulting these past weeks around performance issues for database intensive systems. The fact that we use <a href="/category/nhibernate/">O/R Mapping</a> makes the business logic possible to get right (using the <a href="http://udidahan.weblogs.us/2007/04/21/domain-model-pattern/">Domain Model Pattern</a>), but adds another dimension to the performance optimization &#8211; primarily around limiting the number of roundtrips to the database.</p>
<p>However, in the truly large scale scenarios, that isn&#8217;t enough.</p>
<p>One of my customers is having to scale up from 500 concurrent users to 50,000. You need to get seriously close to the metal to handle that. Here&#8217;s a great post on <a href="http://blogs.smugmug.com/don/2007/04/27/the-perfect-db-storage-array/">the kind of hardware storage considerations</a> that I went through there. Of course, sometimes a database is just the wrong hammer for your screw &#8211; sometimes what you really need is a <a href="http://udidahan.weblogs.us/2007/01/20/space-based-architectural-thinking/">space</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/04/30/database-performance-optimization/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Enterprise Service Bus and Your SOA</title>
		<link>http://www.udidahan.com/2007/04/28/the-enterprise-service-bus-and-your-soa/</link>
		<comments>http://www.udidahan.com/2007/04/28/the-enterprise-service-bus-and-your-soa/#comments</comments>
		<pubDate>Sat, 28 Apr 2007 19:53:14 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/2007/04/28/the-enterprise-service-bus-and-your-soa/</guid>
		<description><![CDATA[It was about a year or so back—I was in the middle of figuring out how to pass an authorization token between trust boundaries when I got a call from our CIO. He had just come back from some conference sponsored by *** (Vendor&#8217;s name withheld to protect the clueless) and was brimming with new [...]]]></description>
			<content:encoded><![CDATA[<p>It was about a year or so back—I was in the middle of figuring out how to pass an authorization token between trust boundaries when I got a call from our CIO. He had just come back from some conference sponsored by *** (Vendor&#8217;s name withheld to protect the clueless) and was brimming with new acronyms. </p>
<p>&#8220;Udi&#8221;, he says, &#8220;I just heard that for us to realize the potential of our SOA, we should be using an ESB.&#8221; </p>
<p>I&#8217;m sure he said a lot more than that, but that first sentence was enough for me to tune out. I managed to get through the conversation without catching a case of acronym-itis, but my train of thought was broken. I wasted one Google search to find out that ESB meant &#8220;Enterprise Service Bus,&#8221; got fed up, and went to Starbucks. Of all the overused terms in software, &#8220;Enterprise&#8221; is by far the most annoying—although &#8220;Service&#8221; seems to be moving up in the world.</p>
<p>I&#8217;ve been developing loosely-coupled systems for a while now, using all kinds of technologies, and never thought that I was doing anything particularly ground-breaking. So when the CIO came down and introduced me to the army of high-priced consultants that were going to help us redesign our software to become more &#8220;service-oriented&#8221; I really was interested in seeing what would be different. Soon after, I realized that the vendor-driven SOA meant nothing more than Web Services, XML, and loose coupling—with the mindset of loose coupling being the most important. I&#8217;d been &#8220;service-oriented&#8221; all this time and didn&#8217;t even know it. Oh, and it turns out that we didn&#8217;t have to redesign anything. </p>
<p>Imagine my utter joy in hearing that the reason our project was in trouble was that we didn&#8217;t have an ESB. And all this time I thought it was because the requirements were changing every two weeks. </p>
<p>It was about a month after that that our project manager got promoted for doing such a great SOA implementation and was now in charge of making the whole company, oops, sorry—enterprise, service oriented. I was made &#8220;acting project manager&#8221; and managed to do one really smart thing pretty quick—get our software through testing and deployed within the month, before too many new requests came in (I think it was possible because most of the stakeholders were on holiday). The system wasn&#8217;t that complex; we had the standard HR, accounting, inventory, etc., functionality split up into the same high-level components. Each top-level component had its own server-cluster and database. We pulled the data from each of the databases to the data warehouse using classic ETL. Data flowed between the components primarily using publish-subscribe semantics while the client-side software just request-responded what it needed from each component. </p>
<p>We kept on working, pushing out features as fast as we could, until one morning I found the sysadmin at my door. </p>
<p>&#8220;We had a problem with some of the disks on the accounting cluster, so we installed it on a different cluster and brought the first one down. We tried a simple test to make sure everything was OK, and it wasn&#8217;t. A bunch of things don&#8217;t seem to be working.&#8221; </p>
<p>After looking around for a bit I found out that the sysadmins had forgotten to update the config files on the servers and the clients. We restarted the server components and they worked fine, but we couldn&#8217;t really go around restarting and changing config files on all the clients. Luckily, the sysadmins had it set up so that every client on our domain that logged in to the network could be sent a script to run, so pushing out the new config files was easy. As for the restarting part, we called up the help desk and told them that if (when) someone called about why their software wasn&#8217;t working properly, just to tell them to close it and run it again—which apparently is their standard first suggestion anyway. </p>
<p>Well, things lurched along like that for a while as we put out more and more functionality—put up a Web front-end, tied in some business partners, etc. I thought I had everything under control until our COO charged in, with the CIO in tow. </p>
<p>&#8220;Our business partners haven&#8217;t been able to send us orders for almost a week,&#8221; he fumed. &#8220;What did you do?!&#8221; </p>
<p>When money talks, you&#8217;d better believe everybody listens. After some seriously hectic hours of peering through diffs between the deployed source and the previously deployed source, we were getting nowhere. Somebody, I don&#8217;t remember who, had the common sense to get the sysadmins in there too. It turned out to have been the same problem as before, but this time with the inventory cluster. So we used the same solution. The problem was that our business partners&#8217; software didn&#8217;t get the updated config files. While I was pondering how we could get the same system to work with external partners, my boss called me in for an urgent meeting. </p>
<p>&#8220;I just got a call from Jim (the CIO) and he wants you and me to help him explain what happened to the CEO.&#8221; </p>
<p>I started to get that sinking feeling, the kind where you know things are going from bad to worse. That afternoon, we all filed in to the chief&#8217;s office, bracing ourselves for the worst. He got directly to the point. </p>
<p>&#8220;If anything like this happens again, you three are fired. Now get the hell out of my office.&#8221; </p>
<p>How&#8217;s that for motivation? </p>
<p>And to top it all off, before he hurried off to another meeting, Jim asks us &#8220;You guys know about our first audit for Sarbanes-Oxley in three months, right? I don&#8217;t want to see any more screw-ups, and this SOX stuff is getting people anxious. We need full audit trails on everything.&#8221; </p>
<p>It looks like Moore&#8217;s law will continue indefinitely: You will need to handle twice as much crap today as you did 18 months ago. </p>
<p>It was at about this point where I realized that I needed help. I called up one of my old partners in crime, Clem, who had been doing large-scale distributed systems development for a while. I told him my sorry tale and asked if he could give me a hand. Unfortunately, he was in the middle of some serious crunch time, but he left me with these pearls of wisdom: </p>
<p>&#8220;Udi, it&#8217;s all in the message. Forget about remote method invocations and pub-subbing events—down on the wire it&#8217;s all just messages. The trick is to think of your system as passing messages at the application level as well. </p>
<p>Asynchronous message passing over queues. It&#8217;s really quite simple. </p>
<p>Once you&#8217;ve packaged everything into the message, that message can be dynamically routed anywhere, and so can its responses. The application doesn&#8217;t need to bind against any specific endpoint—it just drops a message addressed to some logical location. Infrastructure can make sure that messages get to the logical recipient, even if they change physical locations. </p>
<p>That infrastructure is what brings about the &#8220;Bus&#8221; architectural style between your distributed components.&#8221; </p>
<p>Luckily I was writing down what he said, because I had to re-read it at least a dozen times for it to sink in. Flashback to that original conversation with the CIO—so that&#8217;s what ESBs are for! Well, you wouldn&#8217;t have guessed it with all the hype going on—IT/Business Alignment, like that&#8217;s going to happen any time soon. </p>
<p>After talking with some ESB vendors, I understood some nuances in what Clem told me. The message passing at the application level is really passing logical messages—a message is an object just like any other. The transformation that logical message undergoes in order to be sent across the wire is something else entirely. We can transform our message to and from XML, binary, text based key-value pairs—anything we need. Finally, the transport used to pass that wire-representation between machines is an infrastructure detail that is also independent of the logical message. </p>
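<p>A minimal sketch of those ideas in Python, assuming a hypothetical in-process <code>Bus</code> class (not any vendor&#8217;s product): the sender addresses a logical name, the bus decides which physical queue that name maps to, and the wire format (JSON here, purely as an example) is chosen independently of the logical message.</p>

```python
import json
import queue


class Bus:
    """Hypothetical bus: senders name a logical address; the bus owns wire format and routes."""

    def __init__(self):
        self._routes = {}

    def register(self, logical_address, physical_queue):
        # Operations call this when a service is installed on, or moved to, a cluster.
        self._routes[logical_address] = physical_queue

    def send(self, logical_address, message):
        # The logical message is a plain object; JSON is just one possible wire format.
        wire_bytes = json.dumps(message).encode("utf-8")
        self._routes[logical_address].put(wire_bytes)


def receive(physical_queue):
    # The receiving side turns the wire representation back into a logical message.
    return json.loads(physical_queue.get_nowait().decode("utf-8"))


bus = Bus()
cluster_a = queue.Queue()
cluster_b = queue.Queue()

bus.register("accounting", cluster_a)
bus.send("accounting", {"type": "InvoiceRequested", "order_id": 17})

# Accounting moves to another cluster: only the route changes, not the sender.
bus.register("accounting", cluster_b)
bus.send("accounting", {"type": "InvoiceRequested", "order_id": 18})

first = receive(cluster_a)
second = receive(cluster_b)
```

<p>The sending code is identical before and after the move, which is the property that would have spared the clients and business partners a config file update.</p>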
<p>Once my mind warped itself around asynchronous messaging, the whole SOA thing became clear. The top-level components we were developing were providing top-level services—requests would queue up like people would at the teller in a bank. A component could send out the exact same message either as a broadcast or a unicast, where the recipients would be able to use the same semantics either way. Exposing a method, subscribing to an event, and handling a message were all the same, both internally and externally. </p>
<p>I can&#8217;t explain how much this simplified my view of the distributed world. It kind of felt like dominos—as one thing fell into place, it knocked down something else. I was finally beginning to understand what needed to be changed in our system—and all it took was a multi-million dollar mistake and nearly getting fired. </p>
<p>Needless to say, the whole SOX thing caused all hell to break loose. Our team wasn&#8217;t compliant, but then neither was any other team. The same goes for most of the company&#8217;s software. But the reassuring thing for me was knowing where I was going with our system. It took some time—we redesigned most of the communication paths, found a vendor whose product met our needs (at the right price), and several months later, we rolled out the new version. I wouldn&#8217;t say that the rollout was flawless, but I will tell you this—when the sysadmins moved a service from one cluster to another, no config files needed to be pushed out in order for things to work, and, more importantly, no orders were lost. I even got promoted from &#8220;Acting Project Manager&#8221; to &#8220;Project Manager&#8221; <img src='http://www.udidahan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/04/28/the-enterprise-service-bus-and-your-soa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Queues, Scalability, &amp; Availability</title>
		<link>http://www.udidahan.com/2007/02/02/queues-scalability-availability/</link>
		<comments>http://www.udidahan.com/2007/02/02/queues-scalability-availability/#comments</comments>
		<pubDate>Sat, 03 Feb 2007 00:08:02 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Availability]]></category>
		<category><![CDATA[MSMQ]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[WCF]]></category>

		<guid isPermaLink="false">http://udidahan.weblogs.us/archives/371</guid>
		<description><![CDATA[Dr. Nick has a great post up on scalability, queues, and WCF. For some reason, everybody&#8217;s always talking about scalability, but availability gets much less play. For many systems, availability is actually the more important *ility.
Anyway, when it comes to scaling out, queues help a lot. Although not explicitly mentioned in the above post, having [...]]]></description>
			<content:encoded><![CDATA[<p>Dr. Nick has <a href="http://blogs.msdn.com/drnick/archive/2007/01/30/queue-scalability.aspx">a great post</a> up on scalability, queues, and WCF. For some reason, everybody&#8217;s always talking about scalability, but availability gets much less play. For many systems, availability is actually the more important *ility.</p>
<p>Anyway, when it comes to scaling out, queues help a lot. Although not explicitly mentioned in the above post, having multiple machines feeding off of the same queue is the key to scalability and is known as the Competing Consumer pattern. The added benefit of such a design is that you get availability without any additional work, given that you have more than one consuming machine per queue.</p>
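<p>The Competing Consumer pattern can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: threads stand in for separate machines, and the standard library&#8217;s in-memory <code>queue.Queue</code> stands in for a real message queue such as MSMQ.</p>

```python
import queue
import threading

work = queue.Queue()
processed = []
processed_lock = threading.Lock()


def consumer(name):
    # Each consumer stands in for one machine feeding off the shared queue.
    while True:
        message = work.get()
        if message is None:  # sentinel: shut this consumer down
            work.task_done()
            return
        with processed_lock:
            processed.append((name, message))
        work.task_done()


consumers = [threading.Thread(target=consumer, args=(f"server-{i}",)) for i in range(3)]
for thread in consumers:
    thread.start()

for order_id in range(10):
    work.put(order_id)
for _ in consumers:
    work.put(None)  # one shutdown sentinel per consumer

work.join()
for thread in consumers:
    thread.join()
```

<p>Every message is handled by exactly one consumer, and which one is not decided up front. Adding a consumer adds capacity, and losing one leaves the others draining the queue.</p>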
<p>One thing to keep in mind about the Microsoft platform today is that MSMQ does not currently support remote, transactional receives (<b>Update:</b> MSMQ 4, released as part of Vista / Server 2008, now supports this, yet people in the know have told me to avoid this feature). What this means is that, in the above design, if one of the servers fails while processing a message from the queue, you cannot be sure that the message will return to the queue. For some kinds of messages this isn&#8217;t a big deal (like stock prices), but in other cases (like money transfers) this isn&#8217;t acceptable.</p>
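<p>What a transactional receive buys you can be sketched as follows. This is an in-process illustration only: the <code>transactional_receive</code> helper is a hypothetical name, it rolls back when the handler raises an exception, and it does nothing for a real process or machine crash, which is the case a transactional queue is there to cover.</p>

```python
import queue
from contextlib import contextmanager


@contextmanager
def transactional_receive(source):
    """Hand out one message; put it back if the handler raises (a rollback)."""
    message = source.get_nowait()
    try:
        yield message
    except Exception:
        source.put(message)  # rollback: the message is available to another consumer
        raise


transfers = queue.Queue()
transfers.put({"from": "A", "to": "B", "amount": 100})

try:
    with transactional_receive(transfers) as transfer:
        raise RuntimeError("server failed while processing")
except RuntimeError:
    pass

still_queued = transfers.qsize()  # the failed attempt did not lose the transfer

with transactional_receive(transfers) as transfer:
    completed = transfer["amount"]  # a second attempt succeeds and consumes it
```

<p>Without the rollback, the first failure would have silently dropped the money transfer. With it, the message waits in the queue for the next consumer.</p>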
<p>So, bottom line is that queues and other asynchronous transports (JMSs and topics) enable robust systems to be built using proven patterns, but be aware of any limitations of the technology and what ramifications they may have.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2007/02/02/queues-scalability-availability/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Autonomous Services &#8211; a step beyond Service Orientation</title>
		<link>http://www.udidahan.com/2006/01/08/autonomous-services-a-step-beyond-service-orientation/</link>
		<comments>http://www.udidahan.com/2006/01/08/autonomous-services-a-step-beyond-service-orientation/#comments</comments>
		<pubDate>Mon, 09 Jan 2006 05:09:31 +0000</pubDate>
		<dc:creator>thesoftwaresimplist</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Autonomous Services]]></category>
		<category><![CDATA[Availability]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESB]]></category>
		<category><![CDATA[Pub/Sub]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[SCA & SDO]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://wp_630.weblogs.us/archives/245</guid>
		<description><![CDATA[After starting to write a whitepaper on Workflow in Service Oriented Architectures, I wanted to reference some prior work published on autonomous services (so that the whitepaper wouldn&#8217;t turn into a book). Anyway, after some futile googling, I&#8217;ve decided to give in and write it up myself.
The tenets of Service Orientation as put forth by [...]]]></description>
			<content:encoded><![CDATA[<p>After starting to write a whitepaper on Workflow in Service Oriented Architectures, I wanted to reference some prior work published on autonomous services (so that the whitepaper wouldn&#8217;t turn into a book). Anyway, after some futile googling, I&#8217;ve decided to give in and write it up myself.</p>
<p>The tenets of Service Orientation as put forth by Microsoft include one about “autonomy”. The tenet states that “Services should be autonomous”. After some digging, I found out that the intent of the authors was “The teams that develop different services should not be dependent on each other”, or in shortened form “Autonomous Teams”. This revelation was surprising to me since the real meaning of the tenet was less profound than what I had imagined – autonomous computing.</p>
<p>The idea of autonomous computing has been around for some time and presents a view of the world in which computing units cooperate to achieve global goals yet are not dependent even on the existence of other computing units to function. (If you’re envisioning tiny robots playing soccer, you’re not far off.)</p>
<p>So when I first saw the autonomy tenet, I was thinking of autonomous services: services so loosely coupled that the correct functioning of a service would not be dependent on the correct functioning of other cooperating services. Services loosely coupled in time as well as in code. Obviously this would mean that if Service A needed to cooperate with Service B, and Service B was not even available, Service A would continue to function, and live up to its service-level-agreement. But before we start drifting off into the outer reaches of business-IT alignment, let’s bring this down to earth.</p>
<p>Before we get into a detailed analysis of the how, let’s first agree on the why. As a matter of historical trend, architectures these days tend to be more loosely coupled than before. Loose coupling is a good thing: it enables us to better manage the complexity inherent in large software projects. The practical test of loose coupling in a system is changing the public interface of a class and seeing how much of the system doesn’t compile any more (dynamic languages aside). Service Orientation brings us tenets that, when followed, lead to more loosely coupled architectures than if we did not follow them. I think that we can agree then that if we could somehow achieve the loose coupling in time mentioned above, without paying an arm and a leg, that would move our architectures another step forward.</p>
<p>Looking back on the evolution of the field of distributed computing, we can see that, over time, fewer and fewer things are being assumed. It is now well understood that anything that goes over the network takes much longer than calls that stay on the same machine, yet systems were once built that abstracted network communication into looking just like local calls. The performance of those systems was matched only by their lifetime. With the advent of autonomous computing, the assumption that the called service is available and will respond in a timely fashion is called into question. In the real world, servers crash and network equipment goes up in smoke – we can no longer take for granted that communication will always be available, and that its quality will be good enough. In essence, this marks the end of synchronous RPC/RMI. The following code just won’t cut it in this brave new world:</p>
<p>localhost.service1 s1 = new localhost.service1();<br />
orderReply = s1.HandleOrder(orderRequest);</p>
<p>If the service is unavailable, what will happen to our order request? Will it just get lost?<br />
If the service takes a long time to respond, will our server tie up resources for the same amount of time? If this happens under peak load, might it cause our server to crash?</p>
<p>Performing the above code on a different thread won’t make any difference; autonomous services mean the end of Request/Response as we know it.</p>
<p>“No Request/Response between services?!”, you ask incredulously.</p>
<p>The simple answer is “yes”, but there is another level of meaning to it. If you have two software entities that between them you just HAVE to have request/response communication, then they should be in the same service. This is where the real architectural guidance comes in. </p>
<p>In component-orientation and object-orientation, the division of the solution into the right number of parts, with each part having the right amount of responsibility was a kind of black magic passed from master to apprentice. Getting the boundaries right was paramount, but difficult. A number of litmus tests are used to catch the gross errors, and the rest is just gut. So too, the request/response test helps us catch gross errors in service boundary demarcation.</p>
<p>The interesting thing that happens after separating our services out this way is that we often end up with services that mirror the way the business side is structured. Voila, business-IT alignment with your hands closed and one eye tied behind your back! Well, it’s one step in the right direction anyway.</p>
<p>This leaves us with the original types of one-way communication (fire-and-forget, pub-sub, etc) and with one kind of two-way communication: duplex. Duplex is really just two one-way communications (A to B, then B to A) that are correlated. First, I send a message to you, mark it with an id number, and save that number. At some future point in time, you get the message, process it, and send a message back with its own id number. But, you’ll have to put my original id number on the message too, so that I’ll know that your message is a response to mine. At some even more distant point in the future, I get a message from you, look at it, and see that it is the long-awaited response to the request I sent way back when.</p>
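<p>The correlation described above can be sketched in Python as two one-way queues plus a table of saved ids. The field name <code>correlation_id</code> and the helper functions are assumptions made for this illustration, and the in-memory queues stand in for whatever transport actually sits between the two services.</p>

```python
import queue
import uuid

requests = queue.Queue()   # one-way channel, A to B
replies = queue.Queue()    # one-way channel, B to A
pending = {}               # ids that A has saved, mapped to what was asked


def send_request(body):
    message_id = str(uuid.uuid4())
    pending[message_id] = body  # save the id so the reply can be matched later
    requests.put({"id": message_id, "body": body})
    return message_id


def handle_next_request():
    request = requests.get_nowait()
    replies.put({
        "id": str(uuid.uuid4()),           # the reply has its own id
        "correlation_id": request["id"],   # plus the id of the request it answers
        "body": "accepted: " + request["body"],
    })


def receive_reply():
    reply = replies.get_nowait()
    original = pending.pop(reply["correlation_id"])
    return original, reply["body"]


sent_id = send_request("order 42")
handle_next_request()   # could happen much later, on another machine
original, answer = receive_reply()
```

<p>Nothing in the sender blocks while waiting. The only state it keeps is the saved id, which is what lets the two one-way messages behave as one conversation.</p>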
<p>If I had to sum up the difference autonomous services bring to the styles of communication used between services, I’d say this: You get a message, look at it, and figure out what it means and what you should do. This isn’t an infrastructure issue. There are application-level timeouts to deal with (if I don’t get a response back in 3 days, then notify the supervisor), and long-running workflows to manage (next whitepaper).</p>
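<p>An application-level timeout of that kind is ordinary business logic over saved state, not a socket setting. A minimal sketch in Python, assuming a hypothetical table that maps each outstanding request id to the time it was sent:</p>

```python
from datetime import datetime, timedelta

RESPONSE_TIMEOUT = timedelta(days=3)


def overdue_requests(pending, now, timeout=RESPONSE_TIMEOUT):
    """Return ids of requests still waiting for a reply after the timeout."""
    return sorted(
        request_id
        for request_id, sent_at in pending.items()
        if now - sent_at > timeout
    )


now = datetime(2006, 1, 9, 12, 0)
pending = {
    "order-1": now - timedelta(days=4),   # no reply for four days
    "order-2": now - timedelta(hours=2),  # still within the window
}

to_escalate = overdue_requests(pending, now)  # candidates for notifying the supervisor
```

<p>A scheduled job could run a check like this and send its own one-way message to whoever handles escalations, so the timeout stays inside the service that cares about it.</p>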
<p>If there is one thing to pay attention to in this whole “autonomous services paradigm” it is that the focus has shifted from between services to within a single service. In parting, I want to let you know that systems can be, and are being, built this way. It works. It better than works. Systems created this way are more robust to failures (seeing as they’re designed for failures makes it less impressive) and easier to manage. Give it a try. You didn’t really think that SOA would fizzle away into a bunch of WS specs, did you?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.udidahan.com/2006/01/08/autonomous-services-a-step-beyond-service-orientation/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
