Posts Tagged ‘bootstrap’

Evaluation of AngularJS, EmberJS, BackboneJS + MarionetteJS

December 28, 2013

This post will continue to be modified for at least a month from the publish date. I just didn’t want to wait another month before publishing, so people can start to get some use out of it early. If you have resources, comments, anything you think that could be useful to others, please add a comment and if it makes sense, I may add it to the post. This will also be used as a resource for the attendees to the CHC.JS MV* Battle Royale meet-up.

Recently I’ve undertaken the task of reviewing some JavaScript MV* frameworks to help organise/structure the client side code within an application I’m currently working on. This is about the third time I’ve done this. Each time has been for a different type of application with completely different requirements, frameworks and libraries to consider.
Unlike Angular and Ember, Backbone is a small library. Marionette adds quite a lot of extra functionality and provides some nice abstractions on top . All mentioned frameworks/libraries are free and open source.

I found a useful tool for helping with the selection process about a year ago. It’s called TodoMVC and it contains a generous collection of applications all satisfying the requirements of a single specification (a small web app that allows the person using it to add todo notes etc.). So basically they all do the same thing, but use a different JavaScript framework or library to do it. It’s still being maintained. Addy Osmani’s blog post on the project is here.

The idea is that you can work through a decent size selection of applications that all do the same thing.
This assists the R&D developer or architect to make informed decisions on which JavaScript framework or library will suite their purposes, if any.
There are also a couple of Todo apps (vanillajs and jquery) that don’t use a framework at all.
There’s a template to use as a starting point, so you can create your own.

Just bear in mind though, that the TodoMVC app doesn’t really show case what Ember and Angular has to offer.

On Addy’s post There are a collection of good points on how to create your selection criterion under the heading “Our Suggested Criteria For Selecting A Framework”.

I’ve heard a few times that “all you really need to do in order to make an informed decision on which framework or library to go with is just write a small app for each of the frameworks, do a bit of reading and maybe watch a few screen casts. Shouldn’t take more than a day”. I disagree with this. I don’t think there is any way you can learn all or most of the pros and cons of each framework in a day or even two. Depending on how much time you have, my recommended approach would be to go through the following activities in the following order (give or take). Spending as little or as much time as you have, ideally in a few iterations, for each of the offerings you’re investigating.

  1. Listen to a pod-cast (say, on your way to/from a clients or even in your sleep. Good time savers.)
  2. Read some of the documentation
  3. Watch a screen-cast on each one
  4. Play with some examples
  5. Evaluate on features you (definitely or may) want verses features available. Features need to be learned. If you don’t need them, you will probably be better going with the offering that doesn’t have the features you don’t need, but has the architecture to add them (thinking Backbone) if/when you do need them.
  6. Are the features implemented in an architecture that you believe is good (I.E. are the layers muddied)?
  7. Read some blog posts, tutorials.
  8. Read some opinions and evaluate for yourself.
  9. Start testing it’s limits
  10. Decide whether you like its opinions imposed
  11. Does it impose enough or to many opinions for you and your team

As the JavaScript MV* landscape is constantly and very quickly changing, the outcome of your evaluation will have a short use by date.

This is my attempt to distil the attributes of the discussed offerings. I’ve attempted to come at this with an open mind. Hopefully this will help save some work for those that come after me. lists are sorted in the order of most useful to me. I make no apologies for the abundance of links, as I’ve also used this for a resource collection point and hope that this post will fall into the category of a “one stop shop” for what I consider to “currently” be the top three contenders in the client side MV* line-up. In saying that though, there are other strong contenders like Meteor not discussed here, as it’s more than just a client side MV* framework. Without further ado, here they are…

Angular.js

AngularJS

Intro

Opinionated framework that has Models, Views and Controllers, but does not conform to the MVC pattern.

Core Team

Igor Minár, Miško Hevery, Vojta Jína.
All work at google.

Backed by the commercial giant Google (you decide whether that’s a good thing).

Community

Conferences

  1. ng-conf

Statistics

  • Version: 1.2.5
  • Payload Size: Depends on handlebars development version 85kb
    1. development version 716.7kb
    2. minified 99.8kb
    3. minified and compressed < 36kb
  • Age: Initial Github commits: January 2010

Performance

See Backbone Performance below.

Documentation

Pod-casts

  1. Angular.js

Screen-casts

  1. AngularJS on YouTube
  2. EggHead.io Lessons

Blog Posts, Tutorials, etc.

  1. Learning AngularJS in one day
  2. Angular docs Tutorial

Features

  • Directives: used via Non HTML compliant tags, attributes, comment and class names. Although there are options to make it compliant:  via the class (not recommended) and data attributes.
  • scope. The first half of this video shows how the scope may be confusing to those new to Angular. If I can not tell how code will work without running it, it violates the Principle of Least Astonishment (PoLA). It seems quite clunky to me.

Positive

  1. Good for long running and complex applications with deep nested view hierarchies
  2. Two-way data binding
  3. All tests run against IE8 (good for those that are locked into legacy MS)
  4. Test driven (and more vocal about it than Ember)
  5. Payload is about 1/3 smaller than Ember

Negative

  1. Steep learning curve compared to Backbone, but not as steep as Ember.
  2. Dirty checking to keep views and models in sync is costly. Ember keeps sync in a more elegant way. Possible perceived downside to this is Ember models have to inherit from DS.Model (next point addresses this as a positive though). Also discussed here under the “Performance issues” heading.
  3. Models are Plain Old JavaScript Objects (POJO’s). Doesn’t have to be anything special. Now there’s an argument here that attempts to explain this as being a selling point of Angular, but in reality what happens is a violation of the Uniform Access Principle, thus creating tight coupling. How’s that? Well now the view needs to know too much about the model’s members. Discussed in more detail here. For example if one of the models properties is a function, the view has to know this. So you see this sort of thing in the view {{area()}} (so we’re pulling our JavaScript into our view.) where as with Ember because it’s models are well defined and you can use computed properties on them, all the view needs to specify is an identifier, then you’ll see this sort of thing in the view {{area}}. The Ember model then creates a computed property with the same name. The opposing view is that in ES5 you can just hide your functions etc. behind property getters and setters. Most developers take the path of least resistance, so I think most will be doing it the wrong way.

Interesting Plug-ins

  1. ?

Useful Tools

  • “AngularJS Batarang” for Chrome browser (it’s an extension)

Ember.js

EmberJS

Intro

Opinionated framework that has Models, Views and Controllers, but does not conform to the MVC pattern.

Core Team

Yehuda Katz, formerly of Rails and SproutCore projects.
Tom Dale, Peter Wagenet, Trek Glowacki, Erik Bryn, Kris Selden, Stefan Penner, Leah Silber, Alex Matchneer.

Backed by the JavaScript community.

Community

Conferences

Statistics

  • Version: 1.2.0
  • Payload Size:
    1. development version 1.1MB
    2. production version 1.0MB
    3. minified and GZipped 67kb
  • Age: Initial Github commits: April 2011

Performance

See Backbone Performance below.

Documentation

Pod-casts

  1. JavaScript Jabber Ember Tools
  2. JavaScript Jabber Ember.js (also covers some backbone)
  3. JavaScript Jabber Ember.js & Discourse
  4. EmberWatch

Screen-casts

  1. Building an Ember.js Application
  2. Ember101
  3. EmberWatch
  4. tutsplus
  5. EmberWatch 

Blog Posts, Tutorials, etc.

Features

  • the ember-application class gets added to the root element (body) in the ember JavaScript file. I was wondering how this class was magically added to the markup. Couldn’t find any documentation on it, so had to look through the JavaScript.

Positive

  1. Good for long running and complex applications with deep nested view hierarchies
  2. Aggregates model data changes and update the DOM late in the RunLoop.
  3. Well defined models and computed properties (See Angular negative point around this).
  4. Test driven

Negative

  1. Steepest learning curve out of the three. Why? Because there’s more in it. If you need it, great! Maybe you don’t. If not, is the extra learning worth using it? Part of the “more in it” may also be around the elegant way things have been designed, I.E. more constraints to push the users down the right path, thus higher chance of less friction and pain in the future of your application, that is of course if your application does things they way Ember says they must be done. I’m seeing some of these things in the likes of the well defined models and computed properties.
  2. Payload is the largest out of all three.

Interesting Plug-ins

  1. ?

Useful Tools

  • “Ember Inspector” for Chrome browser (it’s an extension)
  • ember-tools. Listen to and/or read the pod-cast linked to above. Provides file organisation, scaffolding, template pre-compilation, generators, CommonJS (that’s node.js style) modules. and other goodies. Useful for setting up your project to conform to the Ember conventions, so you don’t end up fighting them.

Angular versus Ember views

Pod-casts

  1. Angular vs Ember Cage Match NDC

Screen-casts

  1. Angular vs Ember Cage Match NDC

Blog Posts, Tutorials, etc.

  1. Evil Trout Ember versus Angular (possible bias toward Ember)
  2. Why AngularJS beat EmberJS

My Thoughts

Both Frameworks Appear to be Targeting a Similar Problem Space

Don’t believe everything you read. Test it before you buy it. I’ve come across quite a few articles that are just incorrect. Even by reputable people. Sometimes because the frameworks have changed how they do things and/or their documentation has changed. So don’t just take it all at face value. The concept of MVC has changed over the past decade. Although concepts have changed, a pattern doesn’t change, that’s why it’s a pattern. Something everybody familiar with a pattern understands. If an implementation starts to change, then it no longer conforms to the pattern and should not be named after the pattern, as this just brings confusion. Microsoft’s ASP.NET MVC framework is a perfect example of this. It does not follow the MVC pattern (documented in 1979) and so should never have been named MVC. Ask me in the comments to explain if your not aware of how this is. In the MVC pattern Models are not injected into views by Controllers. With the MVC pattern, Views listen to events from a Model (The View is actually oblivious to the Model) which the Controller has hooked up, since the Controller knows about both the View and the Model. This may not be your understanding of MVC? More than likely this is due to certain frameworks being labelled as MVC when they are not, thus bringing the confusion. The following image provided by Gang-Of-Four depicts the MVC pattern.

MVC

Angular Doesn’t Pretend JavaScript has Real Classes

Personally I find both frameworks have opinions that make me nauseous.
Like Angular’s scope and Embers class-hierarchy abstraction. Yes Harmony will have pseudo classes for the classical programmers that struggle with JavaScripts declarative prototypal inheritance. (disclaimer: my roots are in classical OO languages) The way I feel about it: Say a whole lot of JavaScript programmers start using a classical OO language and decide they don’t like the way it does classical inheritance, so the classical object oriented language authorities decide to add syntactic sugar on top of the language to make it’s classical inheritance “look” more like prototypal inheritance for those that struggle with the classical paradigm. Now seriously, why would you muddy the language to cater for those that are not prepared to spend the time learning how it works?

Another and probably the most obvious reason why JavaScript didn’t have classes, is so that object hierarchies could be built up via composition (only inheriting what is actually needed) rather than having to inherit every member needlessly from a base class (essentially knowing far more than is actually needed).  Once you have to re-factor your way out of a code base that has abused inheritance thus creating very tightly coupled code by violating one of the object-oriented design principles (information hiding), the perils of over using inheritance will become very clear.

I’m open to exploring what the other client side JavaScript frameworks and libraries have to offer and I’d love to hear from everyone that’s had experience with them.

Angular and Ember do a Lot For You

With all the bells and whistles, both frameworks impose strong opinions that you must follow in order to make the magic (in a lot of the cases convention) work. Once you’ve learnt Angular and/or Ember, productivity is maximised. But… you must be building your application the way the framework creators want you to. At this stage, I’m not supper comfortable with that. This is where Backbone and friends comes in to its own.

Backbone.jsMarionette.js

BackboneJS + MarionetteJS

Intro

Backbone is an unopinionated library that has Models, Views but no Controllers out of the box. That’s right, a library rather than a framework because your code needs to know about it, rather than it knowing about and executing your code. It does not follow the MVC, MVP or MVVM patterns. It’s views and routers act similarly to a controller. Marionette brings the controller to Backbone (if you want or need it), thus you can keep your router doing what it should be doing (just routing, with no controller logic).

What I find strange is that a Backbone view contains a model. I’m not sure I’d even call this a MV* library, as it may introduce confusion.

Backbone’s sweet spot is providing the user with brief and casual interaction. Doesn’t provide help or guidance with deallocating memory and detaching events. Assumptions are that the user isn’t going to be using this application all day without closing the browser window. Although in saying that, there are many applications that use Backbone for this type of thing, but they must provide explicit code to release event handlers. Marionette provided some help here for older versions of Backbone. and Backbone has improved things with newer versions. You will still need to keep in mind that event handlers need to be released though (Backbone’s view.remove takes care of this now). Marionette provides abstractions to deal with these like the close method which provides a place to add clean-up code and then calls Backbone’s remove. Failing to remove event handlers are the largest cause of memory leaks in Backbone.

Core Team

Backbone: Jeremy Ashkenas

Marionette: Derick Bailey

Community

IRC: #marionette on FreeNode. Little activity.

Conferences

  1. BackboneConf

Statistics

  • Version: 1.1.0
  • Payload size: Depends on Underscore development version 43kb or minified and gzipped 4.9kb
    1. Backbone development version 59kb
    2. Backbone minified and gzipped 6.4kb
  • Age: Backbone: Initial Github commits: September 2010

Performance

The second half of this video shows the difference between Backbone and Ember performance. What I’ve seen to date, is that in terms of performance, Backbone leads, second is Ember, third is Angular. You need to decide how much performance matters to your situation and whether or not it’s “good enough” for the framework/library you choose.

Documentation

Pod-casts

  1. Marionette.js
  2. JavaScript Jabber Ember.js (also covers some backbone)
  3. Backbone.js

Screen-casts

  1. How to build modular Backbone applications using MarionetteJS
  2. Tuts+ Intro to Marionette
  3. Plugging in MarionetteJS. This resource is about adding Marionette to a MongoDB document explorer. Also features source code.
  4. Github
  5. BackboneConf 2013 Talks

Blog Posts, Tutorials, etc.

  1. Github
  2. backbone and ember
  3. Marionette Wiki

Books

  1. Backbone Fundamentals

Features

  • ?

Positive

  1. Free to use any templating engine. You can use underscore as it’s the only dependency of backbone, or any other of your choosing.
  2. A lot of excellent documentation
  3. Very flexible in how you may want to use it
  4. Minimalist library
  5. Easy to learn (not a lot of it).
  6. Payload including dependencies is the smallest out of all three. About 9 times smaller than Ember.

Negative

  1. No two way data-binding. Although if you want/need it, you could use the likes of the data binding offerings below in the Interesting Plug-ins section.
  2. No provision for handling nested views. This is where the likes of Marionette’s Backbone.BabySitter comes in
  3. More work required to build large scale applications than the likes of Angular or Ember (just a library after all).
  4. If your large complex application is written in Backbone, chances are you have added a lot of boiler plate code. Any new developers coming onto the project will have to get up to speed on this code. If your large complex application uses Angular or Ember and the new developers coming onto the project have worked with these frameworks, they more than likely won’t have to learn the boiler plate code that they would have to with the likes of Backbone, because it’s part of the framework.

Interesting Plug-ins

  1. There is a similar offering: backbone.layoutmanager which I haven’t really looked into, but according to Derick Bailey (Marionette BDFL) is more of a framework where as Marionette is a library.
  2. Two way data binding with Rivets.jsKnockback.jsbackbone.stickit
    NYTimes backbone.stickit “is a Backbone data binding plug-in that binds Model attributes to View elements with a myriad of options for fine-tuning a rich application experience”. What looks to be nice about this is that unlike most model binding plug-ins I’ve seen, it doesn’t require you to add any extra tags like Angular to your view. In fact your views are not contaminated at all.
  3. Backbone.routefilter plug-in allows you to add behaviour that will be executed immediately before and/or after a route (Backbone.Router or Marionette.AppRouter) executes.

Useful Tools

  • “Backbone Debugger” for Chrome browser (it’s an extension)
  • Frameworks that leverage backbone and provide more functionality
    1. chaplinJS
    2. thoraxJS (adds handlebars integration plus other functionality)

Now a few more concepts that I think are important to know about if your serious about using a client side JavaScript MV* framework/library and in regards to module loading, this applies to the server side also.

Templating

Blog Posts etc.

  1. net tuts+ Best Practices When Working With JavaScript Templates
  2. net tuts+ An Introduction to Handlebars

Some Offerings

I covered some of the template engines here under “Templating Engines evaluated”, or just use the likes of the Template Engine Chooser

Coupling Domain with Framework

As Boris Smus has said and I think it’s right on the money (although I disagree with his comments around JavaScript class as per my comments above):
Once you bite the bullet and decide to invest in a framework, you often have no easy way to move your code out of it.
If you pick Backbone, but decide mid-cycle that it’s not for you, you are in for a world of hurt:
If you have core functionality that you want to release, release it in pure JavaScript, not as a jQuery plug-in, or some MV* module.

Because there are so many JavaScript frameworks coming and going, and we don’t want to invest to heavily into any one of them,
we really need to keep our investment separate from the library/framework code.

To avoid library/framework and class-system lock-in, a good approach in regards to JavaScript MV* libraries/frameworks,
Is to keep the core functionality separate from the user interface code, thus giving us two separate layers.
This gives us flexibility to swap user interfaces as they come and go, yet still keep the majority of our code in an API layer.
The API layer being a logical single layer, but can be modularised, and loaded as needed, AMD style.
With this separation, we can implement the two layers in the following manner.

1) Build the base layer using pure JavaScript prototypal inheritance.
This is the part you write with the intention of keeping and possibly using parts for other projects also.
This base layer will implement an API that you will want to spend a bit of time getting right.
This is the code that will make the most use of unit tests.
To get the separation clear in your head, think of the user interface code as a client that uses this API as if it was service API sitting on the server.
This way you can avoid creating leaky abstractions.

2) Use an MV* library/framework to implement the user interface, and call into the base layer directly.
This lets you move quickly and focus entirely on writing the user interface.
This architecture should facilitate building your user interface on a solid foundation and avoid investing heavily into an offering that you may want to swap out further down the track.

Modules

In most browsers, just including a script tag will cause the rest of the page to stop rendering until the script has loaded then executed.
Which is why if loading scripts synchronously, they should be concatenated, minified, compressed and included at the bottom.
Loading scripts asynchronously don’t block, which is why you can load multiple scripts in parallel where ever you want (any more than 2-3 concurrently and performance will degrade). Make sure to concatenate your scripts though.

What we see as our projects get larger, is that scripts start to have many dependencies in a way that may overlap and nest.

The simplest way to load asynchronously is to create a script tag and inject it into an existing DOM element on your page.
Because the DOM element already exists, the rendering is not blocked.
See the first code example here

// Create a new script element.
var script = document.createElement('script');

// Find an existing script element on the page (usually the one this code is in).
var firstScript = document.getElementsByTagName('script')[0];

// Set the location of the script.
script.src = "http://example.com/myscript.min.js";

// Inject with insertBefore to avoid appendChild errors.
firstScript.parentNode.insertBefore( script, firstScript );

If you want or need to get serious about script loading (which you’re probably going to have to do at some stage), use a best-of-breed script loader. This will also push you down the path of defining modular JavaScirpt (AKA modules).

Next we look at employing script loaders to load our modules…

Formats available for Writing and Using Modular JavaScript

Asynchronous Module Definition (AMD)

For writing modular JavaScript in the browser. To save re-writing what’s already been done… http://addyosmani.com/writing-modular-js/ see “AMD” section, explains it well. What does AMD actually give us? http://requirejs.org/docs/whyamd.html#amd Separation of Concerns, essentially placing value on interface rather than implementation. Mapping of module IDs to different paths. Lots more. Allows asynchronous loading of modules and their dependencies, which is something we need on the client side, but is not generally a requirement for the server side. For getting started, see “Getting Started With Modules” under the AMD section here. Also check out the AMD specification and of course the most common AMD implementation: RequireJS. Then at some stage you’re probably going to want to concatenate and minify your modules and that’s where the likes of r.js comes in. r.js also has a node.js adapter which allows you to use node’s implementation of  require.

Tom Dale (core team member on Ember) also has some interesting ideas around why he thinks AMD is not the answer.

CommonJS API (Optimised for the server)

Although we have the likes of browserify a CommonJS module implementation that can run in the browser or browser-build… makes CommonJS modules available in the browser and is very fast. Ryan Florence discusses module loaders in the pod-cast listed above “JavaScript Jabber Ember Tools” where he decided to move to CommonJS rather than RequireJS for his Ember Tools mostly due to speed. So it’s horses for courses. Decide what your requirements are, then decide which module loader satisfies the most of them. Also see “writing modular js” under the “CommonJS” section.
Providing a rich standard library. The intention is that an application developer will be able to write an application using the CommonJS APIs and then run that application across different JavaScript interpreters and host environments. With CommonJS-compliant systems, you can use JavaScript to write:

  • Server-side JavaScript applications
  • Command line tools
  • Desktop GUI-based applications
  • Hybrid applications (Titanium, Adobe AIR)

Why it doesn’t excel in the browser “out of the box”: http://requirejs.org/docs/whyamd.html#commonjs
ES Harmony (Modules implemented in the language. were not quite there yet, but the current offerings look to be a pretty good step in the right direction).

http://addyosmani.com/writing-modular-js/ (specifically “ES Harmony” section) discusses where TC39 are going in regards to implementing modules in ES.next.

So AMD and CommonJS can be used on server side or client side. In some cases one will work better for you than the other. You’ll need to do your homework as to what to use in which scenarios. Both have advantages and disadvantages that may work for or against you.

I’m keen to get a discussion going here on peoples experiences with the three MV* offerings mentioned. Especially those that have experience with two or more.

Advertisements

Up and Running with Express on Node.js … and friends

July 27, 2013

This is a result of a lot of trial and error, reading, notes taken, advice from more knowledgeable people than myself over a period of a few months in my spare time. This is the basis of a web site I’m writing for a new business endeavour.

Web Frameworks evaluated

  1. ExpressJS Version 3.1 I talked to quite a few people on the #Node.js IRC channel and the preference in most cases was Express. I took notes around the web frameworks, but as there were not that many good contenders, and I hadn’t thought about pushing this to a blog post at the time, I’ve pretty much just got a decision here.
  2. Geddy Version 0.6

MV* Frameworks evaluated

  1. CompoundJS (old name = RailwayJS) Version 1.1.2-7
  2. Locomotive Version 0.3.6. built on Express

At this stage I worked out that I don’t really need a server side MV* framework, as Express.js routes are near enough to controllers. My mind may change on  this further down the track, if and when it does, I’ll re-evaluate.

Templating Engines evaluated

  1. jade Version 0.28.2, but reasonably mature and stable. 2.5 years old. A handful of active contributors headed by Chuk Holoway. Plenty of support on the net. NPM: 4696 downloads in the last day, 54 739 downloads in the last week, 233 570 downloads in the last month (as of 2013-04-01). Documentation: Excellent. The default view engine when running the express binary without specifying the desired view engine. Discussion on LinkedIn. Discussed in the Learning Node book. Easy to read and intuitive. Encourages you down the path of keeping your logic out of the view. The documentation is found here and you can test it out here.
  2. handlebars Version 1.0.10 A handful of active contributors. NPM: 191 downloads in the last day, 15 657 downloads in the last week, 72 174 downloads in the last month (as of 2013-04-01). Documentation: Excellent: nettuts. Also discussed in Nicholas C. Zakas’s book under Chapter 5 “Loose Coupling of UI Layers”.
  3. EJS Most of the work done by the Chuk Holoway (BDFL). NPM: 258 downloads in the last day, 13 875 downloads in the last week, 56 962 downloads in the last month (as of 2013-04-01). Documentation: possibly a little lacking, but the ASP.NET syntax makes it kind of intuitive for developers from the ASP.NET world. Discussion on LinkedIn. Discussed in the “Learning Node” book by Shelley Powers. Plenty of support on the net. deoxxa from #Node.js mentioned: “if you’re generating literally anything other than all-html-all-the-time, you’re going to have a tough time getting the job done with something like jade or handlebars (though EJS can be a good contender there). For this reason, I ended up writing node-ginger a while back. I wouldn’t suggest using it in production at this stage, but it’s a good example of how you don’t need all the abstractions that some of the other libraries provide to achieve the same effects.”
  4. mu (Mustache template engine for Node.js) NPM: 0 downloads in the last day, 46 downloads in the last week, 161 downloads in the last month (as of 2013-04-01).
  5. hogan-express NPM: 1 downloads in the last day, 183 downloads in the last week, 692 downloads in the last month (as of 2013-04-01). Documentation: lacking

Middleware AKA filters

Connect

Details here https://npmjs.org/package/connect express.js shows that connect().use([takes a path defaulting to ‘/’ here], andACallbackHere) http://expressjs.com/api.html#app.use the body of andACallbackHere will only get executed if the request had the sub directory that matches the first parameter of connect().use

Styling extensions etc evaluated

  1. less (CSS3 extension and (preprocessor) compilation to CSS3) Version 1.4.0 Beta. A couple of solid committers plus many others. runs on both server-side and client-side. NPM: 269 downloads in the last day, 16 688 downloads in the last week, 74 992 downloads in the last month (as of 2013-04-01). Documentation: Excellent. Wiki. Introduction.
  2. stylus (CSS3 extension and (preprocessor) compilation to CSS3) Worked on since 2010-12. Written by the Chuk Holoway (BDFL) that created Express, Connect, Jade and many more. NPM: 282 downloads in the last day, 16 284 downloads in the last week, 74 500 downloads in the last month (as of 2013-04-01).
  3. sass (CSS3 extension and (preprocessor) compilation to CSS3) Version 3.2.7. Worked on since 2006-06. Still active. One solid committer with lots of other help. NPM: 12 downloads in the last day, 417 downloads in the last week, 1754 downloads in the last month (as of 2013-04-01). Documentation: Looks pretty good. Community looks strong: #sass on irc.freenode.net. forum. less, stylus, sass comparison on nettuts.
  • rework (processor) Version 0.13.2. Worked on since 2012-08. Written by the Chuk Holoway (BDFL) that created Express, Connect, Jade and many more. NPM: 77 downloads in the last week, 383 downloads in the last month (as of 2013-04-01). As explained and recommended by mikeal from #Node.js its basically a library for building something like stylus and less, but you can turn on the features you need and add them easily.  No new syntax to learn. Just CSS syntax, enables removal of prefixes and provides variables. Basically I think the idea is that rework is going to use the likes of less, stylus, sass, etc as plugins. So by using rework you get what you need (extensibility) and nothing more.

Responsive Design (CSS grid system for Responsive Web Design (RWD))

There are a good number of offerings here to help guide the designer in creating styles that work with the medium they are displayed on (leveraging media queries).

Keeping your Node.js server running

Development

During development nodemon works a treat. Automatically restarts node when any source file is changed and notifies you of the event. I install it locally:

$ npm install nodemon

Start your node app wrapped in nodemon:

$ nodemon [your node app]

Production

There are a few modules here that will keep your node process running and restart it if it dies or gets into a faulted state. forever seems to be one of the best options. forever usage. deoxxa’s jesus seems to be a reasonable option also, ningu from #Node.js is using it as forever was broken for a bit due to problems with lazy.

Reverse Proxy

I’ve been looking at reverse proxies to forward requests to different process’s on the same machine based on different domain names and cname prefixes. At this stage the picks have been node-http-proxy and NGinx. node-http-proxy looks perfect for what I’m trying to do. It’s always worth chatting to the hoards of developers on #Node.js for personal experience. If using Express, you’ll need to enable the ‘trust proxy’ setting.

Adding less-middleware

I decided to add less after I had created my project and structure with the express executable.
To do this, I needed to do the following:
Update my package.json in the projects root directory by adding the following line to the dependencies object.
“less-middleware”: “*”

Usually you’d specify the version, so that when you update in the future, npm will see that you want to stay on a particular version, this way npm won’t update a particular version and potentially break your app. By using the “*” npm will download the latest package. So now I just copy the version of the less-middleware and replace the “*”.

Run npm install from within your project root directory:

my-command-prompt npm install
npm WARN package.json my-apps-name@0.0.1 No README.md file found!
npm http GET https://registry.npmjs.org/less-middleware
npm http 200 https://registry.npmjs.org/less-middleware
npm http GET https://registry.npmjs.org/less-middleware/-/less-middleware-0.1.11.tgz
npm http 200 https://registry.npmjs.org/less-middleware/-/less-middleware-0.1.11.tgz
npm http GET https://registry.npmjs.org/less
npm http GET https://registry.npmjs.org/mkdirp
npm http 200 https://registry.npmjs.org/mkdirp
npm http 200 https://registry.npmjs.org/less
npm http GET https://registry.npmjs.org/less/-/less-1.3.3.tgz
npm http 200 https://registry.npmjs.org/less/-/less-1.3.3.tgz
npm http GET https://registry.npmjs.org/ycssmin
npm http 200 https://registry.npmjs.org/ycssmin
npm http GET https://registry.npmjs.org/ycssmin/-/ycssmin-1.0.1.tgz
npm http 200 https://registry.npmjs.org/ycssmin/-/ycssmin-1.0.1.tgz
less-middleware@0.1.11 node_modules/less-middleware
├── mkdirp@0.3.5
└── less@1.3.3 (ycssmin@1.0.1)

So you can see that less-middleware pulls in less as well.
Now you need to require your new middleware and tell express to use it.
Add the following to your app.js in your root directory.

var lessMiddleware = require('less-middleware');

and within your function that you pass to app.configure, add the following.

app.use(lessMiddleware({
   src : __dirname + "/public",
   // If you want a different location for your destination style sheets, uncomment the next two lines.
   // dest: __dirname + "/public/css",
   // prefix: "/css",
   // if you're using a different src/dest directory, you MUST include the prefix, which matches the dest public directory
   // force true recompiles on every request... not the best for production, but fine in debug while working through changes. Uncomment to activate.
   // force: true
   compress : true,
   // I'm also using the debug option...
   debug: true
}));

Now you can just rename your css files to .less and less will compile to css for you.
Generally you’ll want to exclude the compiled styles (.css) from your source control.

The middleware is made to watch for any requests for a .css file and check if there is a corresponding .less file. If there is a less file it checks to see if it has been modified. To prevent re-parsing when not needed, the .less file is only reprocessed when changes have been made or there isn’t a matching .css file.
less-middleware documentation

Bootstrap

Twitters Bootstap is also really helpful for getting up and running and comes with allot of helpful components and ideas to get you kick started.
Getting started.
Docs
.

Bootstrap-for-jade

As I decided to use the Node Jade templating engine, Bootstrap-for-Jade also came in useful for getting started with ideas and helping me work out how things could fit together. In saying that, I came across some problems.

ReferenceError: home.jade:23

body is not defined
    at eval (eval at <anonymous> (MySite/node_modules/jade/lib/jade.js:171:8), <anonymous>:238:64)
    at MySite/node_modules/jade/lib/jade.js:172:35
    at Object.exports.render (MySite/node_modules/jade/lib/jade.js:206:14)
    at View.exports.renderFile [as engine] (MySite/node_modules/jade/lib/jade.js:233:13)
    at View.render (MySite/node_modules/express/lib/view.js:75:8)
    at Function.app.render (MySite/node_modules/express/lib/application.js:506:10)
    at ServerResponse.res.render (MySite/node_modules/express/lib/response.js:756:7)
    at exports.home (MySite/routes/index.js:19:7)
    at callbacks (MySite/node_modules/express/lib/router/index.js:161:37)
    at param (MySite/node_modules/express/lib/router/index.js:135:11)
GET /home 500 22ms

I found a fix and submitted a pull request. Details here.

I may make a follow up post to this titled something like “Going Steady with Express on Node.js … and friends'”

Sanitising User Input from Browser. part 2

November 16, 2012

Untrusted data (data entered by a user), should always be treated as though it contains attack code.
This data should not be sent anywhere without taking the necessary steps to detect and neutralise the malicious code.
With applications becoming more interconnected, attacks being buried in user input and decoded and/or executed by a downstream interpreter is becoming all the more common.
Input validation, that’s restricting user input to allow only certain white listed characters and restricting field lengths are only two forms of defence.
Any decent attacker can get around client side validation, so you need to employ defence in depth.
validation and escaping also needs to be performed on the server side.

Leveraging existing libraries

  1. Microsofts AntiXSS is not extensible,
    it doesn’t allow the user to define their own whitelist.
    It didn’t allow me to add behaviour to the routines.
    I want to know how many instances of HTML encoded values there were.
    There was certainly a lot of code in there, but I didn’t find it very useful.
  2. The OWASP encoding project (Reform)(as mentioned in part 1 of this series).
    This is quite a useful set of projects for different technologies.
  3. System.Net.WebUtility from the System.Web.dll.
    Now this did most of what I needed other than provide me with fine grained information of what had been tampered with.
    So I took it and extended it slightly.
    We hadn’t employed AOP at this stage and it wasn’t considered important enough to invest the time to do so.
    So it was a matter of copy past modify.

What’s the point in client side validation if the server has to do it again anyway?

Now there are arguments both ways for this.
My current take on this for the project in question was:
If you only have server side validation, the client side is less responsive and user friendly.
If you only have client side validation, it’s out of our control.
This also gives fuel to the argument of using JavaScript on the client and server side (with the likes of node.js).
So the same code can be used both sides without having to code the same validation in two different languages.
Personally I find writing validation code easier using JavaScript than C#.
This maybe just because I’ve been writing considerably more JavaScript than C# lately though.

The code

I drew a sequence diagram of how this should work, but it got lost in a move.
So I wasn’t keen on doing it again, as the code had already been done.
In saying that, the code has reasonably good documentation (I think).
Code is king, providing it has been written to be read.
If you notice any of the escaping isn’t quite making sense, it could be the blogging engine either doing what it’s meant to, or not doing what it’s meant to.
I’ve been over the code a few times, but I may have missed something.
Shout out if anything’s not clear.

First up, we’ll look at the custom exceptions as we’ll need those soon.

using System;

namespace Common.WcfHelpers.ErrorHandling.Exceptions
{
    public abstract class WcfException : Exception
    {
        /// <summary>
        /// In order to set the message for the client, set it here, or via the property directly in order to over ride default value.
        /// </summary>
        /// <param name="message">The message to be assigned to the Exception's Message.</param>
        /// <param name="innerException">The exception to be assigned to the Exception's InnerException.</param>
        /// <param name="messageForClient">The client friendly message. This parameter is optional, but should be set.</param>
        public WcfException(string message, Exception innerException = null, string messageForClient = null) : base(message, innerException)
        {
            MessageForClient = messageForClient;
        }

        /// <summary>
        /// This is the message that the service's client will see.
        /// Make sure it is set in the constructor. Or here.
        /// </summary>
	    public string MessageForClient
        {
            get { return string.IsNullOrEmpty(_messageForClient) ? "The MessageForClient property of WcfException was not set" : _messageForClient; }
            set { _messageForClient = value; }
        }
        private string _messageForClient;
    }
}

And the more specific SanitisationWcfException

using System;
using System.Configuration;

namespace Common.WcfHelpers.ErrorHandling.Exceptions
{
    /// <summary>
    /// Exception class that is used when the user input sanitisation fails, and the user needs to be informed.
    /// </summary>
    public class SanitisationWcfException : WcfException
    {
        private const string _defaultMessageForClient = "Answers were NOT saved. User input validation was unsuccessful.";
        public string UnsanitisedAnswer { get; private set; }

        /// <summary>
        /// In order to set the message for the client, set it here, or via the property directly in order to over ride default value.
        /// </summary>
        /// <param name="message">The message to be assigned to the Exception's Message.</param>
        /// <param name="innerException">The Exception to be assigned to the base class instance's inner exception. This parameter is optional.</param>
        /// <param name="messageForClient">The client friendly message. This parameter is optional, but should be set.</param>
        /// <param name="unsanitisedAnswer">The user input string before service side sanitisatioin is performed.</param>
        public SanitisationWcfException
        (
            string message,
            Exception innerException = null,
            string messageForClient = _defaultMessageForClient,
            string unsanitisedAnswer = null
        )
            : base(
                message,
                innerException,
                messageForClient + " If this continues to happen, please contact " + ConfigurationManager.AppSettings["SupportEmail"] + Environment.NewLine
                )
        {
            UnsanitisedAnswer = unsanitisedAnswer;
        }
    }
}

Now as we define whether our requirements are satisfied by way of executable requirements (unit tests(in their rawest form))
Lets write some executable specifications.

using NUnit.Framework;
using Common.Security.Sanitisation;

namespace Common.Security.Encoding.UnitTest
{
    [TestFixture]
    public class ExtensionsTest
    {

        private readonly string _inNeedOfEscaping = @"One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.";
        private readonly string _noNeedForEscaping = @"One x2F two amp three x27 four lt five quot six gt       .";

        [Test]
        public void SingleDecodeDoubleEncodedHtml_ShouldSingleDecodeDoubleEncodedHtml()
        {
            string doubleEncodedHtml = @"";               // between the ""'s we have a string of Html with double escaped values like &amp;#x27; user entered text &amp;#x2F.
            string singleEncodedHtmlShouldLookLike = @""; // between the ""'s we have a string of Html with single escaped values like ' user entered text &#x2F.
            // In the above, the bloging engine is escaping the sinlge escaped entity encoding, so all you'll see is the entity it self.
            // but it should look like the double encoded entity encodings without the first &amp->;


            string singleEncodedHtml = doubleEncodedHtml.SingleDecodeDoubleEncodedHtml();
            
            Assert.That(singleEncodedHtml, Is.EqualTo(singleEncodedHtmlShouldLookLike));
        }

        [Test]
        public void Extensions_CompliesWithWhitelist_ShouldNotComply()
        {
            Assert.That(_inNeedOfEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,]+$"), Is.False);
        }

        [Test]
        public void Extensions_CompliesWithWhitelist_ShouldComply()
        {
            Assert.That(_noNeedForEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,]+$"), Is.True);
            Assert.That(_inNeedOfEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,#/&'<"">]+$"), Is.True);
        }
    }
}

Now the code that satisfies the above executable specifications, and more.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

namespace Common.Security.Sanitisation
{
    /// <summary>
    /// Provides a series of extension methods that perform sanitisation.
    /// Escaping, unescaping, etc.
    /// Usually targeted at user input, to help defend against the likes of XSS and other injection attacks.
    /// </summary>
    public static class Extensions
    {

        private const int CharacterIndexNotFound = -1;

        /// <summary>
        /// Returns a new string in which all occurrences of a double escaped html character (that's an html entity immediatly prefixed with another html entity)
        /// in the current instance are replaced with the single escaped character.
        /// </summary>
        /// <param name="source">The target text used to strip one layer of Html entity encoding.</param>
        /// <returns>The singly escaped text.</returns>
        public static string SingleDecodeDoubleEncodedHtml(this string source)
        {
            return source.Replace("&amp;#x", "&#x");
        }
        /// <summary>
        /// Filter a text against a regular expression whitelist of specified characters.
        /// </summary>
        /// <param name="target">The text that is filtered using the whitelist.</param>
        /// <param name="alternativeTarget"></param>
        /// <param name="whiteList">Needs to be be assigned a valid whitelist, otherwise nothing gets through.</param>
        public static bool CompliesWithWhitelist(this string target, string alternativeTarget = "", string whiteList = "")
        {
            if (string.IsNullOrEmpty(target))
                target = alternativeTarget;
            
            return Regex.IsMatch(target, whiteList);
        }
        /// <summary>
        /// Takes a string and returns another with a single layer of Html entity encoding replaced with it's Html entity literals.
        /// </summary>
        /// <param name="encodedUserInput">The text to perform the opperation on.</param>
        /// <param name="numberOfEscapes">The number of Html entity encodings that were replaced.</param>
        /// <returns>The text that's had a single layer of Html entity encoding replaced with it's Html entity literals.</returns>
        public static string HtmlDecode(this string encodedUserInput, ref int numberOfEscapes)
        {
            const int NotFound = -1;

            if (string.IsNullOrEmpty(encodedUserInput))
                return string.Empty;

            StringWriter output = new StringWriter(CultureInfo.InvariantCulture);
            
            if (encodedUserInput.IndexOf('&') == NotFound)
            {
                output.Write(encodedUserInput);
            }
            else
            {
                int length = encodedUserInput.Length;
                for (int index1 = 0; index1 < length; ++index1)
                {
                    char ch1 = encodedUserInput[index1];
                    if (ch1 == 38)
                    {
                        int index2 = encodedUserInput.IndexOfAny(_htmlEntityEndingChars, index1 + 1);
                        if (index2 > 0 && encodedUserInput[index2] == 59)
                        {
                            string entity = encodedUserInput.Substring(index1 + 1, index2 - index1 - 1);
                            if (entity.Length > 1 && entity[0] == 35)
                            {
                                ushort result;
                                if (entity[1] == 120 || entity[1] == 88)
                                    ushort.TryParse(entity.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out result);
                                else
                                    ushort.TryParse(entity.Substring(1), NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite | NumberStyles.AllowLeadingSign, NumberFormatInfo.InvariantInfo, out result);
                                if (result != 0)
                                {
                                    ch1 = (char)result;
                                    numberOfEscapes++;
                                    index1 = index2;
                                }
                            }
                            else
                            {
                                index1 = index2;
                                char ch2 = HtmlEntities.Lookup(entity);
                                if ((int)ch2 != 0)
                                {
                                    ch1 = ch2;
                                    numberOfEscapes++;
                                }
                                else
                                {
                                    output.Write('&');
                                    output.Write(entity);
                                    output.Write(';');
                                    continue;
                                }
                            }
                        }
                    }
                    output.Write(ch1);
                }
            }
            string decodedHtml = output.ToString();
            output.Dispose();
            return decodedHtml;
        }
        /// <summary>
        /// Escapes all character entity references (double escaping where necessary).
        /// Why? The XmlTextReader that is setup in XmlDocument.LoadXml on the service considers the character entity references (&#xxxx;) to be the character they represent.
        /// All XML is converted to unicode on reading and any such entities are removed in favor of the unicode character they represent.
        /// </summary>
        /// <param name="unencodedUserInput">The string that needs to be escaped.</param>
        /// <param name="numberOfEscapes">The number of escapes applied.</param>
        /// <returns>The escaped text.</returns>
        public static unsafe string HtmlEncode(this string unencodedUserInput, ref int numberOfEscapes)
        {
            if (string.IsNullOrEmpty(unencodedUserInput))
                return string.Empty;

            StringWriter output = new StringWriter(CultureInfo.InvariantCulture);
            
            if (output == null)
                throw new ArgumentNullException("output");
            int num1 = IndexOfHtmlEncodingChars(unencodedUserInput);
            if (num1 == -1)
            {
                output.Write(unencodedUserInput);
            }
            else
            {
                int num2 = unencodedUserInput.Length - num1;
                fixed (char* chPtr1 = unencodedUserInput)
                {
                    char* chPtr2 = chPtr1;
                    while (num1-- > 0)
                        output.Write(*chPtr2++);
                    while (num2-- > 0)
                    {
                        char ch = *chPtr2++;
                        if (ch <= 62)
                        {
                            switch (ch)
                            {
                                case '"':
                                    output.Write(""");
                                    numberOfEscapes++;
                                    continue;
                                case '&':
                                    output.Write("&amp;");
                                    numberOfEscapes++;
                                    continue;
                                case '\'':
                                    output.Write("&amp;#x27;");
                                    numberOfEscapes = numberOfEscapes + 2;
                                    continue;
                                case '<':
                                    output.Write("<");
                                    numberOfEscapes++;
                                    continue;
                                case '>':
                                    output.Write(">");
                                    numberOfEscapes++;
                                    continue;
                                case '/':
                                    output.Write("&amp;#x2F;");
                                    numberOfEscapes = numberOfEscapes + 2;
                                    continue;
                                default:
                                    output.Write(ch);
                                    continue;
                            }
                        }
                        if (ch >= 160 && ch < 256)
                        {
                            output.Write("&#");
                            output.Write(((int)ch).ToString(NumberFormatInfo.InvariantInfo));
                            output.Write(';');
                            numberOfEscapes++;
                        }
                        else
                            output.Write(ch);
                    }
                }
            }
            string encodedHtml = output.ToString();
            output.Dispose();
            return encodedHtml;
        }

 

        private static unsafe int IndexOfHtmlEncodingChars(string searchString)
        {
            int num = searchString.Length;
            fixed (char* chPtr1 = searchString)
            {
                char* chPtr2 = (char*)((UIntPtr)chPtr1);
                for (; num > 0; --num)
                {
                    char ch = *chPtr2;
                    if (ch <= 62)
                    {
                        switch (ch)
                        {
                            case '"':
                            case '&':
                            case '\'':
                            case '<':
                            case '>':
                            case '/':
                                return searchString.Length - num;
                        }
                    }
                    else if (ch >= 160 && ch < 256)
                        return searchString.Length - num;
                    ++chPtr2;
                }
            }
            return CharacterIndexNotFound;
        }

        private static char[] _htmlEntityEndingChars = new char[2]
        {
            ';',
            '&'
        };
        private static class HtmlEntities
        {
            private static string[] _entitiesList = new string[253]
            {
                "\"-quot",
                "&-amp",
                "'-apos",
                "<-lt",
                ">-gt",
                " -nbsp",
                "¡-iexcl",
                "¢-cent",
                "£-pound",
                "¤-curren",
                "¥-yen",
                "¦-brvbar",
                "§-sect",
                "¨-uml",
                "©-copy",
                "ª-ordf",
                "«-laquo",
                "¬-not",
                "\x00AD-shy",
                "®-reg",
                "¯-macr",
                "°-deg",
                "±-plusmn",
                "\x00B2-sup2",
                "\x00B3-sup3",
                "´-acute",
                "µ-micro",
                "¶-para",
                "·-middot",
                "¸-cedil",
                "\x00B9-sup1",
                "º-ordm",
                "»-raquo",
                "\x00BC-frac14",
                "\x00BD-frac12",
                "\x00BE-frac34",
                "¿-iquest",
                "À-Agrave",
                "Á-Aacute",
                "Â-Acirc",
                "Ã-Atilde",
                "Ä-Auml",
                "Å-Aring",
                "Æ-AElig",
                "Ç-Ccedil",
                "È-Egrave",
                "É-Eacute",
                "Ê-Ecirc",
                "Ë-Euml",
                "Ì-Igrave",
                "Í-Iacute",
                "Î-Icirc",
                "Ï-Iuml",
                "Ð-ETH",
                "Ñ-Ntilde",
                "Ò-Ograve",
                "Ó-Oacute",
                "Ô-Ocirc",
                "Õ-Otilde",
                "Ö-Ouml",
                "×-times",
                "Ø-Oslash",
                "Ù-Ugrave",
                "Ú-Uacute",
                "Û-Ucirc",
                "Ü-Uuml",
                "Ý-Yacute",
                "Þ-THORN",
                "ß-szlig",
                "à-agrave",
                "á-aacute",
                "â-acirc",
                "ã-atilde",
                "ä-auml",
                "å-aring",
                "æ-aelig",
                "ç-ccedil",
                "è-egrave",
                "é-eacute",
                "ê-ecirc",
                "ë-euml",
                "ì-igrave",
                "í-iacute",
                "î-icirc",
                "ï-iuml",
                "ð-eth",
                "ñ-ntilde",
                "ò-ograve",
                "ó-oacute",
                "ô-ocirc",
                "õ-otilde",
                "ö-ouml",
                "÷-divide",
                "ø-oslash",
                "ù-ugrave",
                "ú-uacute",
                "û-ucirc",
                "ü-uuml",
                "ý-yacute",
                "þ-thorn",
                "ÿ-yuml",
                "Œ-OElig",
                "œ-oelig",
                "Š-Scaron",
                "š-scaron",
                "Ÿ-Yuml",
                "ƒ-fnof",
                "\x02C6-circ",
                "˜-tilde",
                "Α-Alpha",
                "Β-Beta",
                "Γ-Gamma",
                "Δ-Delta",
                "Ε-Epsilon",
                "Ζ-Zeta",
                "Η-Eta",
                "Θ-Theta",
                "Ι-Iota",
                "Κ-Kappa",
                "Λ-Lambda",
                "Μ-Mu",
                "Ν-Nu",
                "Ξ-Xi",
                "Ο-Omicron",
                "Π-Pi",
                "Ρ-Rho",
                "Σ-Sigma",
                "Τ-Tau",
                "Υ-Upsilon",
                "Φ-Phi",
                "Χ-Chi",
                "Ψ-Psi",
                "Ω-Omega",
                "α-alpha",
                "β-beta",
                "γ-gamma",
                "δ-delta",
                "ε-epsilon",
                "ζ-zeta",
                "η-eta",
                "θ-theta",
                "ι-iota",
                "κ-kappa",
                "λ-lambda",
                "μ-mu",
                "ν-nu",
                "ξ-xi",
                "ο-omicron",
                "π-pi",
                "ρ-rho",
                "ς-sigmaf",
                "σ-sigma",
                "τ-tau",
                "υ-upsilon",
                "φ-phi",
                "χ-chi",
                "ψ-psi",
                "ω-omega",
                "ϑ-thetasym",
                "ϒ-upsih",
                "ϖ-piv",
                " -ensp",
                " -emsp",
                " -thinsp",
                "\x200C-zwnj",
                "\x200D-zwj",
                "\x200E-lrm",
                "\x200F-rlm",
                "–-ndash",
                "—-mdash",
                "‘-lsquo",
                "’-rsquo",
                "‚-sbquo",
                "“-ldquo",
                "”-rdquo",
                "„-bdquo",
                "†-dagger",
                "‡-Dagger",
                "•-bull",
                "…-hellip",
                "‰-permil",
                "′-prime",
                "″-Prime",
                "‹-lsaquo",
                "›-rsaquo",
                "‾-oline",
                "⁄-frasl",
                "€-euro",
                "ℑ-image",
                "℘-weierp",
                "ℜ-real",
                "™-trade",
                "ℵ-alefsym",
                "←-larr",
                "↑-uarr",
                "→-rarr",
                "↓-darr",
                "↔-harr",
                "↵-crarr",
                "⇐-lArr",
                "⇑-uArr",
                "⇒-rArr",
                "⇓-dArr",
                "⇔-hArr",
                "∀-forall",
                "∂-part",
                "∃-exist",
                "∅-empty",
                "∇-nabla",
                "∈-isin",
                "∉-notin",
                "∋-ni",
                "∏-prod",
                "∑-sum",
                "−-minus",
                "∗-lowast",
                "√-radic",
                "∝-prop",
                "∞-infin",
                "∠-ang",
                "∧-and",
                "∨-or",
                "∩-cap",
                "∪-cup",
                "∫-int",
                "∴-there4",
                "∼-sim",
                "≅-cong",
                "≈-asymp",
                "≠-ne",
                "≡-equiv",
                "≤-le",
                "≥-ge",
                "⊂-sub",
                "⊃-sup",
                "⊄-nsub",
                "⊆-sube",
                "⊇-supe",
                "⊕-oplus",
                "⊗-otimes",
                "⊥-perp",
                "⋅-sdot",
                "⌈-lceil",
                "⌉-rceil",
                "⌊-lfloor",
                "⌋-rfloor",
                "〈-lang",
                "〉-rang",
                "◊-loz",
                "♠-spades",
                "♣-clubs",
                "♥-hearts",
                "♦-diams"
            };
            private static Dictionary<string, char> _lookupTable = GenerateLookupTable();

            private static Dictionary<string, char> GenerateLookupTable()
            {
                Dictionary<string, char> dictionary = new Dictionary<string, char>(StringComparer.Ordinal);
                foreach (string str in _entitiesList)
                    dictionary.Add(str.Substring(2), str[0]);
                return dictionary;
            }

            public static char Lookup(string entity)
            {
                char ch;
                _lookupTable.TryGetValue(entity, out ch);
                return ch;
            }
        }
    }
}

You may also notice that I’ve mocked the OperationContext.
Thanks to WCFMock, a mocking framework for WCF services.
I won’t include this code, but you can get it here.
I’ve used the popular NUnit test framework and RhinoMocks for the stubbing and mocking.
Both pulled into the solution using NuGet.
Most useful documentation for RhinoMocks:
http://ayende.com/Wiki/Rhino+Mocks+3.5.ashx
http://ayende.com/wiki/Rhino+Mocks.ashx

For this project I used NLog and wrapped it.
Now you start to get an idea of how to use the sanitisation.

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using NUnit.Framework;
using System.Configuration;
using Rhino.Mocks;
using Common.Wrapper.Log;
using MockedOperationContext = System.ServiceModel.Web.MockedOperationContext;
using Common.WcfHelpers.ErrorHandling.Exceptions;

namespace Sanitisation.UnitTest
{
    [TestFixture]
    public class SanitiseTest
    {
        private const string _myTestIpv4Address = "My.Test.Ipv4.Address";
        private readonly int _maxLengthHtmlEncodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlEncodedUserInput"]);
        private readonly int _maxLengthHtmlDecodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlDecodedUserInput"]);
        private readonly string _encodedUserInput_thatsMaxDecodedLength = @"One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.";
        private readonly string _decodedUserInput_thatsMaxLength = @"One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.";

        [Test]
        public void Sanitise_UserInput_WhenGivenNull_ShouldReturnEmptyString()
        {
            Assert.That(new Sanitise().UserInput(null), Is.EqualTo(string.Empty));
        }

        [Test]
        public void Sanitise_UserInput_WhenGivenEmptyString_ShouldReturnEmptyString()
        {
            Assert.That(new Sanitise().UserInput(string.Empty), Is.EqualTo(string.Empty));
        }

        [Test]
        public void Sanitise_UserInput_WhenGivenSanitisedString_ShouldReturnSanitisedString()
        {
            // Open the whitelist up in order to test the encoding without restriction.
            Assert.That(new Sanitise(whiteList: @"^[\w\s\.,#/&'<"">]+$").UserInput(_encodedUserInput_thatsMaxDecodedLength), Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength));
        }
        [Test]
        [ExpectedException(typeof(SanitisationWcfException))]
        public void Sanitise_UserInput_ShouldThrowExceptionIfEscapedInputToLong()
        {
            string fourThousandAndOneCharacters = "Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand character";
            string expectedError = "The un-modified string received from the client with the following IP address: " +
                   '"' + _myTestIpv4Address + "\" " +
                   "exceeded the allowed maximum length of an escaped Html user input string. " +
                   "The maximum length allowed is: " +
                   _maxLengthHtmlEncodedUserInput +
                   ". The length was: " +
                   (_maxLengthHtmlEncodedUserInput+1) + ".";

            using(new MockedOperationContext(StubbedOperationContext))
            {
                try
                {
                    new Sanitise().UserInput(fourThousandAndOneCharacters);
                }
                catch(SanitisationWcfException e)
                {
                    Assert.That(e.Message, Is.EqualTo(expectedError));
                    Assert.That(e.UnsanitisedAnswer, Is.EqualTo(fourThousandAndOneCharacters));
                    throw;
                }
            }
        }
        [Test]
        [ExpectedException(typeof(SanitisationWcfException))]
        public void Sanitise_UserInput_DecodedUserInputShouldThrowException_WhenMaxLengthHtmlDecodedUserInputIsExceeded()
        {
            char oneCharOverTheLimit = '.';
            string expectedError =
                           "The string received from the client with the following IP address: " +
                           "\"" + _myTestIpv4Address + "\" " +
                           "after Html decoding exceded the allowed maximum length of an un-escaped Html user input string." +
                           Environment.NewLine +
                           "The maximum length allowed is: " + _maxLengthHtmlDecodedUserInput + ". The length was: " +
                           (_decodedUserInput_thatsMaxLength + oneCharOverTheLimit).Length + oneCharOverTheLimit;

            using(new MockedOperationContext(StubbedOperationContext))
            {
                try
                {
                    new Sanitise().UserInput(_encodedUserInput_thatsMaxDecodedLength + oneCharOverTheLimit);
                }
                catch(SanitisationWcfException e)
                {
                    Assert.That(e.Message, Is.EqualTo(expectedError));
                    Assert.That(e.UnsanitisedAnswer, Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength + oneCharOverTheLimit));
                    throw;
                }
            }
        }
        [Test]
        public void Sanitise_UserInput_ShouldLogAndSendEmail_IfNumberOfDecodedHtmlEntitiesDoesNotMatchNumberOfEscapes()
        {
            string encodedUserInput_with6HtmlEntitiesNotEscaped = _encodedUserInput_thatsMaxDecodedLength.Replace("&amp;#x2F;", "/");
            string errorWeAreExpecting =
                "It appears as if someone has circumvented the client side Html entity encoding." + Environment.NewLine +
                "The requesting IP address was: " +
                "\"" + _myTestIpv4Address + "\" " +
                "The sanitised input we receive from the client was the following:" + Environment.NewLine +
                "\"" + encodedUserInput_with6HtmlEntitiesNotEscaped + "\"" + Environment.NewLine +
                "The same input after decoding and re-escaping on the server side was the following:" + Environment.NewLine +
                "\"" + _encodedUserInput_thatsMaxDecodedLength + "\"";
            string sanitised;
            // setup _logger
            ILogger logger = MockRepository.GenerateMock<ILogger>();
            logger.Expect(lgr => lgr.logError(errorWeAreExpecting));

            Sanitise sanitise = new Sanitise(@"^[\w\s\.,#/&'<"">]+$", logger);

            using (new MockedOperationContext(StubbedOperationContext))
            {
                // Open the whitelist up in order to test the encoding etc.
                sanitised = sanitise.UserInput(encodedUserInput_with6HtmlEntitiesNotEscaped);
            }

            Assert.That(sanitised, Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength));
            logger.VerifyAllExpectations();
        }        

        private static IOperationContext StubbedOperationContext
        {
            get
            {
                IOperationContext operationContext = MockRepository.GenerateStub<IOperationContext>();
                int port = 80;
                RemoteEndpointMessageProperty remoteEndpointMessageProperty = new RemoteEndpointMessageProperty(_myTestIpv4Address, port);
                operationContext.Stub(oc => oc.IncomingMessageProperties[RemoteEndpointMessageProperty.Name]).Return(remoteEndpointMessageProperty);
                return operationContext;
            }
        }
    }
}

Now the API code that we can use to do our sanitisation.

using System;
using System.Configuration;
// Todo : KC We need time to implement DI. Should be using something like ninject.extensions.wcf.
using OperationContext = System.ServiceModel.Web.MockedOperationContext;
using System.ServiceModel.Channels;
using Common.Security.Sanitisation;
using Common.WcfHelpers.ErrorHandling.Exceptions;
using Common.Wrapper.Log;

namespace Sanitisation
{

    public class Sanitise
    {
        private readonly string _whiteList;
        private readonly ILogger _logger;
        

        private string RequestingIpAddress
        {
            get
            {
                RemoteEndpointMessageProperty remoteEndpointMessageProperty = OperationContext.Current.IncomingMessageProperties[RemoteEndpointMessageProperty.Name] as RemoteEndpointMessageProperty;
                return ((remoteEndpointMessageProperty != null) ? remoteEndpointMessageProperty.Address : string.Empty);
            }
        }
        /// <summary>
        /// Provides server side escaping of Html entities, and runs the supplied whitelist character filter over the user input string.
        /// </summary>
        /// <param name="whiteList">Should be provided by DI from the ResourceFile.</param>
        /// <param name="logger">Should be provided by DI. Needs to be an asynchronous logger.</param>
        /// <example>
        /// The whitelist can be obtained from a ResourceFile like so...
        /// <code>
        /// private Resource _resource;
        /// _resource.GetString("WhiteList");
        /// </code>
        /// </example>
        public Sanitise(string whiteList = "", ILogger logger = null)
        {
            _whiteList = whiteList;
            _logger = logger ?? new Logger();
        }
        /// <summary>
        /// 1) Check field lengths.         Client side validation may have been negated.
        /// 2) Check against white list.	Client side validation may have been negated.
        /// 3) Check Html escaping.         Client side validation may have been negated.

        /// Generic Fail actions:	Drop the payload. No point in trying to massage and save, as it won't be what the user was expecting,
        ///                         Add full error to a WCFException Message and throw.
        ///                         WCF interception reads the WCFException.MessageForClient, and sends it to the user. 
        ///                         On return, log the WCFException's Message.
        ///                         
        /// Escape Fail actions:	Asynchronously Log and email full error to support.


        /// 1) BA confirmed 50 for text, and 400 for textarea.
        ///     As we don't know the field type, we'll have to go for 400."
        ///
        ///     First we need to check that we haven't been sent some huge string.
        ///     So we check that the string isn't longer than 400 * 10 = 4000.
        ///     10 is the length of our double escaped character references.
        ///     Or, we ask the business for a number."
        ///     If we fail here, perform Generic Fail actions and don't complete the following steps.
        /// 
        ///     Convert all Html Entity Encodings back to their equivalent characters, and count how many occurrences.
        ///
        ///     If the string is longer than 400, perform Generic Fail actions and don't complete the following steps.
        /// 
        /// 2) check all characters against the white list
        ///     If any don't match, perform Generic Fail actions and don't complete the following steps.
        /// 
        /// 3) re html escape (as we did in JavaScript), and count how many escapes.
        ///     If count is greater than the count of Html Entity Encodings back to their equivalent characters,
        ///     Perform Escape Fail actions. Return sanitised string.
        /// 
        ///     If we haven't returned, return sanitised string.
        
        
        /// Performs checking on the text passed in, to verify that client side escaping and whitelist validation has already been performed.
        /// Performs decoding, and re-encodes. Counts that the number of escapes was the same, otherwise we log and send email with the details to support.
        /// Throws exception if the client side validation failed to restrict the number of characters in the escaped string we received.
        ///     This needs to be intercepted at the service.
        ///     The exceptions default message for client needs to be passed back to the user.
        ///     On return, the interception needs to log the exception's message.
        /// </summary>
        /// <param name="sanitiseMe"></param>
        /// <returns></returns>
        public string UserInput(string sanitiseMe)
        {
            if (string.IsNullOrEmpty(sanitiseMe))
                return string.Empty;

            ThrowExceptionIfEscapedInputToLong(sanitiseMe);

            int numberOfDecodedHtmlEntities = 0;
            string decodedUserInput = HtmlDecodeUserInput(sanitiseMe, ref numberOfDecodedHtmlEntities);

            if(!decodedUserInput.CompliesWithWhitelist(whiteList: _whiteList))
            {
                string error = "The answer received from client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "had characters that failed to match the whitelist.";
                throw new SanitisationWcfException(error);
            }

            int numberOfEscapes = 0;
            string sanitisedUserInput = decodedUserInput.HtmlEncode(ref numberOfEscapes);

            if(numberOfEscapes != numberOfDecodedHtmlEntities)
            {
                AsyncLogAndEmail(sanitiseMe, sanitisedUserInput);
            }

            return sanitisedUserInput;
        }
        /// <note>
        /// Make sure the logger is setup to log asynchronously
        /// </note>
        private void AsyncLogAndEmail(string sanitiseMe, string sanitisedUserInput)
        {
            // no need for SanitisationException

            _logger.logError(
                "It appears as if someone has circumvented the client side Html entity encoding." + Environment.NewLine +
                "The requesting IP address was: " +
                "\"" + RequestingIpAddress + "\" " +
                "The sanitised input we receive from the client was the following:" + Environment.NewLine +
                "\"" + sanitiseMe + "\"" + Environment.NewLine +
                "The same input after decoding and re-escaping on the server side was the following:" + Environment.NewLine +
                "\"" + sanitisedUserInput + "\""
                );
        }

        /// <summary>
        /// This procedure may throw a SanitisationWcfException.
        /// If it does, ErrorHandlerBehaviorAttribute will need to pass the "messageForClient" back to the client from within the IErrorHandler.ProvideFault procedure.
        /// Once execution is returned, the IErrorHandler.HandleError procedure of ErrorHandlerBehaviorAttribute
        /// will continue to process the exception that was thrown in the way of logging sensitive info.
        /// </summary>
        /// <param name="toSanitise"></param>
        private void ThrowExceptionIfEscapedInputToLong(string toSanitise)
        {
            int maxLengthHtmlEncodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlEncodedUserInput"]);
            if (toSanitise.Length > maxLengthHtmlEncodedUserInput)
            {
                string error = "The un-modified string received from the client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "exceeded the allowed maximum length of an escaped Html user input string. " +
                    "The maximum length allowed is: " +
                    maxLengthHtmlEncodedUserInput +
                    ". The length was: " +
                    toSanitise.Length + ".";
                throw new SanitisationWcfException(error, unsanitisedAnswer: toSanitise);
            }
        }

        private string HtmlDecodeUserInput(string doubleEncodedUserInput, ref int numberOfDecodedHtmlEntities)
        {
            string decodedUserInput = doubleEncodedUserInput.HtmlDecode(ref numberOfDecodedHtmlEntities).HtmlDecode(ref numberOfDecodedHtmlEntities) ?? string.Empty;
            
            // if the decoded string is longer than MaxLengthHtmlDecodedUserInput throw
            int maxLengthHtmlDecodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlDecodedUserInput"]);
            if(decodedUserInput.Length > maxLengthHtmlDecodedUserInput)
            {
                throw new SanitisationWcfException(
                    "The string received from the client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "after Html decoding exceded the allowed maximum length of an un-escaped Html user input string." +
                    Environment.NewLine +
                    "The maximum length allowed is: " + maxLengthHtmlDecodedUserInput + ". The length was: " +
                    decodedUserInput.Length + ".",
                    unsanitisedAnswer: doubleEncodedUserInput
                    );
            }
            return decodedUserInput;
        }
    }
}

As you can see, there’s a lot more work in the server side sanitisation than the client side.