Sunday, November 15, 2015

Learning NServiceBus - great read!

Last weekend, I finished the book 'Learning NServiceBus' that I got while attending a session on NServiceBus by Dennis van der Stelt.

For anyone interested in learning about message-based architectures - this is a great read. Especially - obviously - if you want to work with NServiceBus. And to @dvdstel - thank you for the book and the interesting session.

Monday, October 26, 2015

David Fowler - ASP.NET 5 Deconstructed

Super interesting talk by David Fowler on the new changes in vNext. Check it out if you want to get an idea of how the request lifetime in vNext differs from the 'current stack' 4.5.1!

Tuesday, October 20, 2015

Cool library for method interception: CodeCop

Just listened to .NET Rocks! 1198 about the CodeCop library. I'm used to doing method interception using Ninject interceptors - but this looks very nice as well.

Go check it out here:

Saturday, October 17, 2015

Trying to figure out Azure ML performance

As of last week, we've moved to Azure Machine Learning to host our predictive services, which run R code under the hood.

We used the PlumbR package just to get a proof of concept up, but now that we wanted to get ready for higher loads, we needed something more scalable.

So we checked whether Azure ML was a feasible option - and it looked very promising. We ported our R code into an 'Execute R script' module in Azure ML, hooked up the data sources (an Azure SQL DB) and published a web service.
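
For anyone wondering what that looks like: inside the 'Execute R script' module you basically wrap your existing R code between the input and output ports. A rough sketch - maml.mapInputPort / maml.mapOutputPort are the bridge functions the module itself provides, and run_prediction() is just a placeholder for our own scoring code:

# Sketch of the code inside an 'Execute R script' module in Azure ML Studio.
# maml.mapInputPort / maml.mapOutputPort are supplied by the module;
# run_prediction() is a placeholder for your own scoring logic.

dataset1 <- maml.mapInputPort(1)                 # rows coming in from the Azure SQL reader

predictions <- run_prediction(dataset1)          # placeholder: the existing R prediction code

data.set <- data.frame(dataset1, prediction = predictions)

maml.mapOutputPort("data.set")                   # hand the result back to the web service output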

Then we started testing this webservice, first through R, and got some unexpected results:

Request Id de01a645-4a3b-4ba8-827d-758f7e2ca0b5 took 2.43389391899109 seconds
Request Id fe6383ae-f2a7-467e-a580-0e215d9c79df took 23.6202960014343 seconds
Request Id 9895fb95-e261-4518-a062-276c6f9d3d11 took 2.31389212608337 seconds
Request Id 1d15ccc0-6d3d-414c-9320-c737ee0976b8 took 27.6219758987427 seconds
Request Id a8bf5942-8640-4b25-97df-ce7960109ec2 took 23.4478011131287 seconds
Request Id 88b8fd54-fe31-4261-a601-d57f13babed8 took 18.3553009033203 seconds
Request Id cde2311b-fa53-441e-a099-6bfcbd66860e took 2.30307912826538 seconds
Request Id a0a00ece-1b39-4c24-a636-f43bf9566a25 took 24.1078689098358 seconds
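
For reference, the timing test from R was little more than a loop around the web service call. A minimal sketch - the endpoint URL, API key and request body here are placeholders; the body follows the generic Azure ML request/response format, so adjust it to your own service definition:

# Time a handful of sequential calls to the published Azure ML web service.
# Endpoint, API key and input columns below are placeholders.
library(httr)

endpoint <- "https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/execute?api-version=2.0"
api_key  <- "<your-api-key>"

request_body <- list(Inputs = list(input1 = list(
  ColumnNames = list("feature1", "feature2"),
  Values      = list(list("1", "2"))
)))

for (i in 1:8) {
  started  <- Sys.time()
  response <- POST(endpoint,
                   add_headers(Authorization = paste("Bearer", api_key)),
                   body = request_body, encode = "json")
  elapsed  <- as.numeric(difftime(Sys.time(), started, units = "secs"))
  cat("Request", i, "returned", status_code(response), "and took", elapsed, "seconds\n")
}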

Admittedly, we were aiming for a sub-second response but would settle for 2 seconds if the solution was at least scalable and leveraging PaaS. The 23 seconds was way too much though, and we had a hard time figuring out where that came from. So - we decided to scale out the webservice - to see whether it would run faster.
Now, the strange thing was - the average response time went up! That was against all our expectations. Thanks to the chat button in Azure ML I got in contact with some of the guys over at Microsoft and they helped me out.

What happens is that if you set the slider to '5 concurrent calls', 5 environments will be provisioned for you in the background. However - these need a bit of warmup time on the first call (approx. 20 seconds in our setup). So - in the requests mentioned above - the first call hit a 'warm' environment, the second call a 'cold' one, and so on.
If you fire sequential calls (the calls from R), there is no way of telling whether you will hit a pre-warmed-up environment or not. That's why the response times differ so much. However - once everything is warmed up - response times will be consistent at approx. 2 seconds.

Now - if you scale out your environment to, say, 200 concurrent calls - you will have to sit out the warmup time 200 times, provided you fire enough calls. That means that your average response time goes up! However - once they're all warmed up - you can get 2-second response times for 200 concurrent calls consistently.
So that explains why we were confused by our initial tests.
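
A practical consequence: if you want clean numbers, warm every container up first. A rough sketch of how you could do that from R, firing one call per provisioned container in parallel - call_service() is a placeholder wrapper around the scoring call sketched above, and 'concurrency' should match the concurrent-calls slider of the web service:

# Warm up all provisioned containers by firing one call per container in parallel.
# call_service() is a placeholder wrapper around the scoring call shown earlier;
# set concurrency to the value of the concurrent-calls slider.
library(parallel)
library(httr)

call_service <- function() {
  POST("https://<your-endpoint>",
       add_headers(Authorization = "Bearer <your-api-key>"),
       body = list(), encode = "json")
}

concurrency <- 20
cl <- makeCluster(concurrency)
clusterExport(cl, "call_service")
clusterEvalQ(cl, library(httr))

warmup_times <- parSapply(cl, seq_len(concurrency), function(i) {
  started <- Sys.time()
  call_service()
  as.numeric(difftime(Sys.time(), started, units = "secs"))
})
stopCluster(cl)

print(warmup_times)   # expect roughly 20 seconds per cold container, ~2 seconds once warm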

Now - another thing to keep in mind is that if your Azure ML service consumes a data source - that data source needs to be able to deal with the higher loads too. We're talking to an Azure SQL database, and we had 2-second response times on 20 concurrent calls - but 7 seconds on 50 concurrent calls. After scaling up our SQL Azure DTUs - that response time went back down to 2 seconds, and all was good again.

Let me share some screenshots - simple VS load test on 20 concurrent calls:

I think the containers were already pretty much warmed up - because the performance is around 2.5 seconds from the get-go.

If you raise the constant load to 25 - while your service is set to 20 concurrent calls - you'll get exceptions:

This is something you want to keep in mind - and probably implement a retry strategy somewhere.
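
Something like the wrapper below would do - a sketch only; how many attempts, how long to back off and which status codes to retry really depends on your own service:

# Naive retry wrapper around the scoring call - a sketch, not a full retry policy.
library(httr)

call_with_retry <- function(endpoint, api_key, request_body, max_attempts = 3) {
  for (attempt in seq_len(max_attempts)) {
    response <- POST(endpoint,
                     add_headers(Authorization = paste("Bearer", api_key)),
                     body = request_body, encode = "json")
    if (status_code(response) < 500) {
      return(response)                   # success, or a client error that retrying won't fix
    }
    Sys.sleep(2 ^ attempt)               # back off before the next attempt
  }
  stop("Scoring call still failing after ", max_attempts, " attempts")
}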

Now scaling out to 50 concurrent calls:

We noticed that the test response times increased threefold, whereas we were hoping they would remain stable. Checking the DB load revealed the problem, however:

Every peak is a load test that we ran. If you zoom in on the peak - it turns out that the DB was running at around 90% of the DTU capacity. So - we scaled it up to 50 DTUs:

Re-ran the tests - and presto:

Test response times even went under 2 seconds. In other words - the DB was a bit of a bottleneck in the first tests too.

Anyway - last Friday we switched from PlumbR to Azure ML and all our predictions should now be able to scale out nicely.

Big thanks to Yoav Helfman, Akshaya Annavajhala and Pavel Dournov over at Microsoft for helping me out discovering this.

Sunday, October 11, 2015

I'm a Code Avenger!

A couple of months ago - I volunteered to be part of the Dutch Code Avengers (partially thanks to my buddy Rob van Uden). Last weekend - I did a full day with two groups of kids, aged 7 to 11. We did some computer basics and quickly moved on to some basic JavaScript coding. Awesome stuff!

It was an intense day, but it's very interesting to see what's going on in the minds of these little people! I can't wait until Code Avengers organizes 3-day courses in The Netherlands so we can go a bit more in depth.

Thanks Rob for pushing me to do this!