Udi Dahan - The Software Simplist
 
  

Scaling Long Running Web Services

Wednesday, July 30th, 2008.

While I was at TechEd USA I had an attendee, Will, come up and ask me an interesting question about how to handle web service calls that can take a long time to complete. He has a number of these kinds of requests, ranging from computationally intensive tasks to those requiring sifting through large amounts of data. Will’s problem was preventing too many of these resource-intensive tasks from running concurrently, which caused increased memory usage, paging, and eventually the server becoming unavailable.

For comparison later, here’s a diagram showing the trivial interaction:

[Diagram: the trivial interaction]

One solution that he’d tried was to set up the web server to throttle those requests and keep a much smaller maximum thread-pool size for that application pool. The unfortunate side effect of that solution was that clients would get “turned away” by a not-so-pleasant Connection Refused exception.

Will had been to my web scalability talk and was curious about how I was using queues behind my web services. I’ve also heard this question from people just getting started with nServiceBus when looking at the Web Services Bridge sample. Here’s the code that’s in the sample and in just a second I’ll tell you why you shouldn’t do this:

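[WebMethod]
public ErrorCodes Process(Command request)
{
    object result = ErrorCodes.None;

    // Send the request to the queue and register a callback that runs
    // when the correlated reply message arrives.
    IAsyncResult sync = Global.Bus.Send(request).Register(
        delegate(IAsyncResult asyncResult)
        {
            CompletionResult completionResult = asyncResult.AsyncState as CompletionResult;
            if (completionResult != null)
            {
                result = (ErrorCodes) completionResult.ErrorCode;
            }
        },
        null
        );

    // Block this request thread (and keep the HTTP connection open)
    // until the reply has arrived.
    sync.AsyncWaitHandle.WaitOne();

    return (ErrorCodes)result;
}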

Let me repeat, this is demo-ware. Do not use this in production.

What’s happening is that in this web service call we’re putting a message in a queue for some other process/machine to process. When that processing is complete, we’ll get a message back in our local queue (which you don’t see) which is correlated to our original request, firing off the callback. We block the web method from completing (using the WaitOne call) thus keeping the HTTP connection to the client open.

The problem here is that we’re wasting resources (the HTTP connection and the thread) while waiting for a response which, as already mentioned, can take a long time. In B2B or other server-to-server integration environments there are all sorts of middleware solutions that help us solve these problems. In Will’s case, however, browsers needed to interact with this web service. All he had was HTTP.

HTTP Solutions

Another attendee who was listening in (sorry I don’t remember your name) said that he was solving similar problems using polling but that he was having scalability problems as well.

What often surprises my clients when we deal with these same issues is that I do suggest a polling based solution, but one that still uses messaging, and this is what I described to Will:

Since we can’t actually push a message to a browser over HTTP from our server when processing is complete, the browser itself will be responsible for pulling the response. We still don’t want to leave costly resources like HTTP connections open for a long time. However, if the browser is going to be polling for a response, we’ll need some way to correlate those subsequent requests with the original one. What we’re going to do is use the Asynchronous Completion Token pattern, and later I’ll show how to optimize it for web server technology.

Basic Polling

[Diagram: basic polling]

When the browser calls the web service, the web service will generate a Guid, put it in the message that it sends for processing, and return that guid to the browser. When the processing of the message is complete, the result will be written to some kind of database, indexed by that guid. The browser will periodically call another web method, passing in the guid it previously received as a parameter. That web method will check the database for a response using the guid, returning null if no response is there. If the browser receives a null response, it will “sleep” a bit and then retry.
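The two web methods described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `PollingService`, `Submit`, `StoreResult` and `GetResult` are hypothetical names, an in-memory dictionary stands in for the database, and a delegate stands in for sending the message to the queue.

```csharp
using System;
using System.Collections.Concurrent;

// Sketch of the two web methods behind basic polling (all names are hypothetical).
public class PollingService
{
    // Stand-in for the results database, indexed by the guid.
    private readonly ConcurrentDictionary<Guid, string> _results =
        new ConcurrentDictionary<Guid, string>();

    // Stand-in for sending a message to the processing queue.
    private readonly Action<Guid, string> _enqueue;

    public PollingService(Action<Guid, string> enqueue)
    {
        _enqueue = enqueue;
    }

    // First web method: hand the work off and return the guid immediately.
    public Guid Submit(string request)
    {
        Guid token = Guid.NewGuid();
        _enqueue(token, request);
        return token;
    }

    // Called by the processing side once the work is complete.
    public void StoreResult(Guid token, string result)
    {
        _results[token] = result;
    }

    // Second web method: null means "no response yet, sleep and retry".
    public string GetResult(Guid token)
    {
        string result;
        return _results.TryGetValue(token, out result) ? result : null;
    }
}
```

Note that every call to `GetResult` still costs a managed thread and a lookup against the store, which is the cost the rest of this post tries to reduce.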

One of the problems with this solution is that polling uses up server resources - both on the web server and our DB; threads, memory, DB connections. A better solution would decrease the resource cost of the polling. Let’s use the fundamental building blocks of the web to our advantage - HTTP GET and resources:

REST-full Polling

Instead of using a guid to represent the id of the response, let’s consider the REST principle of “everything’s a resource”. That would mean that the response itself would be a resource. And since every resource has a URI, we might as well use that URI in lieu of the guid. So, instead of our web service returning a guid, let’s return a URI - something like:

http://www.acme.com/responses/88ec5359-a5d8-4491-a570-3bfe469f3a64.xml

As you can see, the guid is still there. So, what’s different?

[Diagram: REST-full polling]

What’s different is that instead of having the processing code write the response to the database, it writes it to a resource. This can be done by writing some XML to a file on the SAN in the case of a webfarm. Also, the browser wouldn’t need to call a web service to get the response; it would just do an HTTP GET on the URI. If it gets an HTTP 404, it would sleep and retry as before. The reason that the SAN is needed is that, as the browser polls, it may have its requests arrive at various web servers, so the response needs to be accessible from any one of them.

Just as an aside, it would be better to free the processing node as quickly as possible and have something else write the response to the SAN. That would be done simply by sending a message from the processing node to a different node whose only job is to write responses to disk.

The reason that the URI makes a difference is that serving “static” resources is something that web servers do extremely efficiently without requiring any managed resources (like ASP.NET threads). That’s a big deal.
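As a rough sketch of the writing side, assuming the shared folder is mapped to a virtual directory such as `http://www.acme.com/responses/`: the `ResponseResources` class and its method names are hypothetical, and writing to a temporary name before renaming is my own assumption, added so that a poll is less likely to pick up a half-written file.

```csharp
using System;
using System.IO;

// Hypothetical helper for publishing responses as static resources.
public static class ResponseResources
{
    // Builds the URI handed back to the browser.
    // baseUri must end with a slash, e.g. http://www.acme.com/responses/
    public static Uri UriFor(Uri baseUri, Guid token)
    {
        return new Uri(baseUri, token.ToString("D") + ".xml");
    }

    // Writes the finished response into the shared directory.
    // The file is written under a temporary name and then renamed, so the
    // final name only appears once the content is complete.
    public static string Publish(string sharedDirectory, Guid token, string xml)
    {
        string finalPath = Path.Combine(sharedDirectory, token.ToString("D") + ".xml");
        string tempPath = finalPath + ".tmp";
        File.WriteAllText(tempPath, xml);
        File.Move(tempPath, finalPath);
        return finalPath;
    }
}
```

Until `Publish` has run, a GET on the URI finds no file and the web server answers with a 404 on its own, without any application code being involved.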

We’re still using HTTP connections for the polling but that’s something whose effect can be mitigated to a certain degree.

Timed REST-full Polling

[Diagram: timed REST-full polling]

Since various requests can take varying amounts of time to process, it’s difficult to know at what rate the browser should poll. So, why don’t we have the web service tell it? As a part of the response to the original web service call, instead of just returning a URI, we could also return the polling interval - 1 second, 5 seconds, whatever is appropriate for the type of request. This value could easily be made configurable per request type, as [RequestType, PollingInterval] pairs.
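One possible shape for this, purely as a sketch: `PollingTicket` and `PollingIntervals` are hypothetical names, and a dictionary stands in for wherever the [RequestType, PollingInterval] configuration actually lives.

```csharp
using System;
using System.Collections.Generic;

// What the original web service call returns: where to poll and how often.
public class PollingTicket
{
    public Uri ResponseUri { get; set; }
    public TimeSpan PollingInterval { get; set; }
}

// Hypothetical lookup of the configured interval per request type.
public class PollingIntervals
{
    private readonly Dictionary<string, TimeSpan> _byRequestType;
    private readonly TimeSpan _fallback;

    public PollingIntervals(Dictionary<string, TimeSpan> byRequestType, TimeSpan fallback)
    {
        _byRequestType = byRequestType;
        _fallback = fallback;
    }

    // Returns the configured interval, or the fallback for unknown request types.
    public TimeSpan IntervalFor(string requestType)
    {
        TimeSpan interval;
        return _byRequestType.TryGetValue(requestType, out interval) ? interval : _fallback;
    }
}
```

The browser would then wait `PollingInterval` between GETs on `ResponseUri` instead of using a hard-coded delay.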

An even more advanced solution would allow you to change these values dynamically. The advantage that would be gained would be that your operations team could better manage the load on your servers. When a large number of users are hitting your system, you could decrease the rate at which your servers would be polled, thus leaving more HTTP connections for other users.

Scaling and Adaptive Polling

You’d probably also want to scale out the number of processing nodes behind your queue. The nice thing is that you could change the polling interval as you scale the various processing nodes per request type providing better responsiveness for the more critical requests. Once we add virtualization, things get really fun:

We had separate queues per request type, so that we could easily see the load we were under for each type of request. That way, we could scale out the processing nodes per request type as well as change the polling interval. By virtualizing our processing nodes, and writing scripts to monitor queue sizes, we had those scripts automatically provisioning (and de-provisioning) nodes as well as changing the polling interval of the browsers.

This had the enormous benefit of the system automatically shifting resources to provide the appropriate relative allocation for the current load as its macroscopic make-up changed.
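To make the idea concrete, here is one way a monitoring script might derive a polling interval from the queue size. The formula is an assumption for illustration only, not what those scripts actually did: it estimates how long a newly queued request would wait and clamps that between a minimum and a maximum. `AdaptivePolling` and `NextInterval` are hypothetical names.

```csharp
using System;

// Hypothetical calculation of a polling interval from the current load.
public static class AdaptivePolling
{
    public static TimeSpan NextInterval(
        int queueDepth,
        int nodes,
        TimeSpan averageProcessingTime,
        TimeSpan min,
        TimeSpan max)
    {
        if (nodes < 1) throw new ArgumentOutOfRangeException(nameof(nodes));

        // Rough estimate of how long a new request waits before being processed.
        double estimatedWaitSeconds =
            queueDepth * averageProcessingTime.TotalSeconds / nodes;

        // Never poll faster than min or slower than max.
        double seconds = Math.Max(min.TotalSeconds,
                         Math.Min(max.TotalSeconds, estimatedWaitSeconds));
        return TimeSpan.FromSeconds(seconds);
    }
}
```

With this shape, adding processing nodes for a request type shortens its interval, and a growing queue lengthens it, which matches the behaviour described above.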

Summary

Will was well-pleased with the solution which, although more complicated than what he had originally tried, was flexible enough to meet his needs. As opposed to pure server-based solutions, here we make more use of the browser (writing our own JavaScript) instead of putting our faith in some Ajax-y library. That’s not to say that you couldn’t wrap this up into a library - in essence, it is a kind of messaging transport for browser-to-server communication allowing duplex conversations.

In fact, what could be done is to return multiple responses to the browser over a long period of time. In the response that comes back to the browser could be an additional URI where the next response will be. This can be used for reporting the status of a long running process, paging results, and in many other scenarios.
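The chaining of responses might look something like the loop below. In the browser this would be written in JavaScript; it is shown in C# only to match the other code in this post. `ChainedResponse`, `ResponseChain` and the `fetch` and `wait` delegates are hypothetical: `fetch` is assumed to return null when the resource does not exist yet (the HTTP 404 case).

```csharp
using System;
using System.Collections.Generic;

// One response in the chain, with an optional pointer to the next one.
public class ChainedResponse
{
    public string Payload { get; set; }
    public Uri Next { get; set; } // null when this is the final response
}

public static class ResponseChain
{
    // Follows the chain starting at 'first', yielding each payload as it appears.
    public static IEnumerable<string> Follow(
        Uri first,
        Func<Uri, ChainedResponse> fetch,
        Action wait)
    {
        Uri current = first;
        while (current != null)
        {
            ChainedResponse response = fetch(current);
            if (response == null)
            {
                wait(); // not there yet: sleep, then retry the same URI
                continue;
            }

            yield return response.Payload;
            current = response.Next;
        }
    }
}
```

Each payload could be a status update or a page of results, with the last response simply omitting the next URI.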

And, one parting thought, could this not be used for all browser to web service communication?

  

18 Comments

  1. Klaus Hebsgaard Says:

    Could this not be handled using Asynchronous Pages in ASP.NET (see http://msdn.microsoft.com/en-us/magazine/cc163725.aspx ), as I understand it this technique works with WCF services as well….


  2. udidahan Says:

    Klaus,

    I actually have another sample with nServiceBus that demonstrates that, but it only works for ASP.NET pages. If you want to call the web service directly from the browser, there is no async model set up.


  3. Mike Says:

    What process cleans up all the static resources?


  4. chiph Says:

    Like Mike said — you’d have to have something clean up the static files after a while. In my experience, NTFS starts to have problems when you’ve got 10,000+ items in a directory, so depending on your transaction volume, you’d need to run the cleanup possibly several times during the day. As well as schedule your defragger to run.

    The trick would be knowing when an item becomes eligible for cleanup. I’m not sure you can delete it immediately after a successful (non-404 response) status request from the user (assuming you can even detect that under IIS, I don’t know). You might want to give them a little more time in case they have system problems at their end.


  5. Interesting Finds: 2008.07.30~2008.08.01 - gOODiDEA.NET Says:

    […] Scaling Long Running Web Services […]


  6. Long running web services - Sunny Nagi Says:

    […] Udi Dahan has just posted an excellent blog post about long running web services. […]


  7. udidahan Says:

    Chiph,

    You’re absolutely right - which is why the process which actually writes the response to the disk is the beginning of a saga which may have its final (delete) phase triggered either by a read, a certain number of reads, and/or time.


  8. Dan Finucane Says:

    I love this solution especially the REST piece. I have a system where sometimes the web service operation will complete in a minute or two and other times it may run for four hours. Your solution is a perfect fit. I have one question though - since the operation results are retrieved via plain vanilla HTTP GET’s how would you document/communicate the content of an operations result. Currently I use data contracts to describe the XML schema for the data I return. If I use the REST approach you outline would you continue to use the data contract to document the schema of the result or would you leave the result structure out of the WSDL and communicate the form through supplemental documentation?

    In some cases I could see leaving the schema out of the contract because it makes it easier to add elements in future releases without breaking existing code.


  9. udidahan Says:

    Dan,

    Glad you like it. Let me know how it works out for you.

    I use XSD to define the structure of the data returned. Sometimes additional documentation is needed anyway.

    BTW, the X in XML stands for eXtensible (which you already knew), but I’m just reiterating that to say that it is quite easy to add elements in future releases without breaking existing code.


  10. Colin Jack Says:

    Great stuff

    I wondered if you’d looked at the duplex WCF functionality that should help remove the need to do so much polling in Silverlight apps:

    http://weblogs.asp.net/dwahlin/archive/2008/06/16/pushing-data-to-a-silverlight-client-with-wcf-duplex-service-part-i.aspx

    It’s also really interesting reading this article. I worked on an XML over HTTP project many years ago that used many of these patterns for long running async jobs and it’s good to see you’ve adopted the same techniques.


  11. udidahan Says:

    Colin,

    I agree with the first commenter on that post - Rob.
    IIS and ASP.NET are not designed to do comet in a scalable way.

    There is the other problem of synchronous communication from server to client where server threads end up being blocked while waiting for the communication to succeed.

    Hope that helps.


  12. Colin Jack Says:

    @Udi
    Good point on the threads, hadn’t thought that through.

    Also I found this page quite interesting (particularly the distributed observer pattern near the end):

    http://duncan-cragg.org/blog/post/distributed-observer-pattern-rest-dialogues/


  13. udidahan Says:

    Colin,

    What he’s describing is using REST/HTTP to do messaging and pub/sub. I think that’s great. However, from a “how do I get my head thinking the right way” perspective, I find that plain messaging and pub/sub is simpler to grasp. Once you understand the applicative protocol you want to set up, mapping that to resources and GETs and POSTs isn’t very difficult.

    Does that make sense?


  14. Colin Jack Says:

    @Udi
    Yeah it definitely makes sense but I have one more question, do you use REST in your architectures and if so how do you find it works alongside SOA and DDD?


  15. udidahan Says:

    Colin,

    I do use REST where it makes sense - but primarily as a kind of message serialization mechanism.


  16. Jan Van Ryswyck Says:

    Hi Udi,

    I listened to your latest DNR episode and after reading this post I must say that this is a really awesome approach. Thx for sharing.

    I have a small nitpicking question though: what about security of the response resources (supposing that they contain sensitive information, which is not unlikely)? I know it’s not easy to determine a GUID at the right time (before deleting the resource), but those kid hackers of today can do just about anything. Any thoughts about this topic?

    Again, great stuff.


  17. udidahan Says:

    Jan,

    Glad you liked it.

    Security is a big topic. The question is what threat profile we’re trying to protect against.

    One option is for the same saga that created the resource to protect it with an ACL.

    The thing is that you need to understand that probably the only way for an external attacker to know the guid/uri of the resource is for them to go for a man-in-the-middle attack. You’d need HTTPS to protect against that. Once you have that on the request, and you don’t allow anyone to list the response resources, you’re probably secure enough not to need HTTPS on the response.


  18. Jonathan Dickinson Says:

    How about using an async HTTP handler with your restful stuff. This way you are not wasting _bandwidth_ (far more expensive than CPU/Memory resources).

    I.e.
    string MyWSStart(string bla) -> Returns URL “wswait.ashx?id=100”.
    Open connection to wswait.ashx and wait for response.
    string MyWSEnd() -> Returns result.

    I will have it up on my blog at http://www.geekswithblogs.net/jcdickinson/ in a few minutes.



Creative Commons License  © Copyright 2008, Udi Dahan.