Microsoft | Binarymist

Archive for the ‘Microsoft’ Category

Exploring JavaScript Prototypes

June 28, 2014

Not to be confused with the GoF Prototype pattern that defines a lot more than the simple JavaScript prototype. Although the abstract concept of the prototype is the same.

My intention with this post is to arm our developers with enough information around JavaScript prototypes to know when they are the right tool for the job as opposed to other constructs when considering how to create polymorphic JavaScript that’s performant and easy to maintain. Often performant code and easy to maintain code are in conflict with each other. I.E. if you want code that’s fast, it’s often hard to read and if you want code that’s really easy to read, it “may” not be as fast as it could/should be. So we make trade-offs.

Make your code as readable as possible in as many places as possible. The more eyes that are going to be on it, generally the more readable it needs to be. Where performance really matters, we “may” have to carefully sacrifice some precious readability to achieve the essential performance required. This really needs measuring though, because often we think we’re writing fast code that either doesn’t matter or that just isn’t fast. So we should always favour readability, then profile your running application in an environment as close to production as possible. This removes the guess work, which we usually get wrong anyway. I’m currently working on a Node.js performance blog post in which I’ll attempt to address many things to do with performance. What I’m finding a lot of the time is that techniques that I’ve been told are essential for fast code are all to often incorrect. We must measure.

Some background

Before we do the deep dive thing, lets step back for a bit. Why do prototypes matter in JavaScript? What do prototypes do for us? Where do prototypes fit into the design philosophy of JavaScript?

What do JavaScript Prototypes do for us?

Removal of Code Duplication (DRY)

Excellent for reducing unnecessary duplication of members that will need garbage collecting

Performance

Prototypes also allow us to maximise economy of memory, thus reducing Garbage Collection (GC) activity, thus increasing performance. There are other ways to get this performance though. Prototypes which obtain re-use of the parent object are not always the best way to get the performance benefits we crave. You can see here under the “Cached Functions in the Module Pattern” section that using closure (although not mentioned) which is what modules leverage, also gives us the benefit of re-use, as the free variable in the outer scope is baked into the closure. Just check the jsperf for proof.

The Design Philosophy of JavaScript and Prototypes

Prototypal inheritance was implemented in JavaScript as a key technique to support the object oriented principle of polymorphism. Prototypal inheritance provides the flexibility of being able to choose what the more specific object is going to inherit, rather than in the classical paradigm where you’re forced to inherit all the base class’s baggage whether you want it or not.

Three obvious ways to achieve polymorphism:

Composition (creating an object that composes a contract to another object)(has-a relationship). Learn the pros and cons. Use when it makes sense
Prototypal inheritance (is-a relationship). Learn the pros and cons. Use when it makes sense
Monkey Patching courtesy of call, apply and bind
Classical inheritance (is-a relationship). Why would you? Please don’t try this ~~at home~~ in production 😉

Of course there are other ways and some languages have unique techniques to achieve polymorphism. like templates in C++, generics in C#, first-class polymorphism in Haskell, multimethods in Clojure, etc, etc.

Diving into the Implementation Details

Before we dive into Prototypes…

What does Composition look like?

There are many great examples of how composing our objects from other object interfaces whether they’re owned by the composing object (composition), or aggregated from independent objects (aggregation), provide us with the building blocks to create complex objects to look and behave the way we want them to. This generally provides us with plenty of flexibility to swap implementation at will, thus overcoming the tight coupling of classical inheritance.

Many of the Gang of Four (GoF) design patterns we know and love leverage composition and/or aggregation to help create polymorphic objects. There is a difference between aggregation and composition, but both concepts are often used loosely to just mean creating objects that contain other objects. Composition implies ownership, aggregation doesn’t have to. With composition, when the owning object is destroyed, so are the objects that are contained within the owner. This is not necessarily the case for aggregation.

An example: Each coffee shop is composed of it’s own unique culture. Each coffee shop has a different type of culture that it fosters and the unique culture is an aggregation of its people and their attributes. Now the people that aggregate the specific coffee shop culture can also be a part of other cultures that are completely separate to the coffee shops culture, they could even leave the current culture without destroying it, but the culture of the specific coffee shop can not be the same culture of another coffee shop. Every coffee shops culture is unique, even if only slightly.

Programmer	Show Pony

Following we have a coffeeShop that composes a culture. We use the Strategy pattern within the culture to aggregate the customers. The Visit function provides an interface to encapsulate the Concrete Strategy, which is passed as an argument to the Visit constructor and closed over by the describe method.

// Context component of Strategy pattern.
var Programmer = function () {
   this.casualVisit = {};
   this.businessVisit = {};
   // Add additional visit types.
};
// Context component of Strategy pattern.
var ShowPony = function () {
   this.casualVisit = {};
   this.businessVisit = {};
   // Add additional visit types.
};
// Add more persons to make a unique culture.

var customer = {
   setCasualVisitStrategy: function (casualVisit) {
      this.casualVisit = casualVisit;
   },
   setBusinessVisitStrategy: function (businessVisit) {
      this.businessVisit = businessVisit;
   },
   doCasualVisit: function () {
      console.log(this.casualVisit.describe());
   },
   doBusinessVisit: function () {
      console.log(this.businessVisit.describe());
   }
};

// Strategy component of Strategy pattern.
var Visit = function (description) {
   // description is closed over, so it's private. Check my last post on closures for more detail
   this.describe = function () {
      return description;
   };
};

var coffeeShop;

Programmer.prototype = customer;
ShowPony.prototype = customer;

coffeeShop = (function () {
   var culture = {};
   var flavourOfCulture = '';
   // Composes culture. The specific type of culture exists to this coffee shop alone.
   var whatWeWantExposed = {
      culture: {
         looksLike: function () {
            console.log(flavourOfCulture);

         }
      }
   };

   // Other properties ...
   (function createCulture() {
      var programmer = new Programmer();
      var showPony = new ShowPony();
      var i = 0;
      var propertyName;

      programmer.setCasualVisitStrategy(
         // Concrete Strategy component of Strategy pattern.
         new Visit('Programmer walks to coffee shop wearing jeans and T-shirt. Brings dog, Drinks macchiato.')
      );
      programmer.setBusinessVisitStrategy(
         // Concrete Strategy component of Strategy pattern.
         new Visit('Programmer brings software development team. Performs Sprint Planning. Drinks long macchiato.')
      );
      showPony.setCasualVisitStrategy(
         // Concrete Strategy component of Strategy pattern.
         new Visit('Show pony cycles to coffee shop in lycra pretending he\'s just done a hill ride. Struts past the ladies chatting them up. Orders Chai Latte.')
      );
      showPony.setBusinessVisitStrategy(
         // Concrete Strategy component of Strategy pattern.
         new Visit('Show pony meets business friends in suites. Pretends to work on his macbook pro. Drinks latte.')
      );

      culture.members = [programmer, showPony, /*lots more*/];

      for (i = 0; i < culture.members.length; i++) {
         for (propertyName in culture.members[i]) {
            if (culture.members[i].hasOwnProperty(propertyName)) {
               flavourOfCulture += culture.members[i][propertyName].describe() + '\n';
            }
         }
      }

   }());
   return whatWeWantExposed;
}());

coffeeShop.culture.looksLike();
// Programmer walks to coffee shop wearing jeans and T-shirt. Brings dog, Drinks macchiato.
// Programmer brings software development team. Performs Sprint Planning. Drinks long macchiato.
// Show pony cycles to coffee shop in lycra pretending he's just done a hill ride. Struts past the ladies chatting them up. Orders Chai Latte.
// Show pony meets business friends in suites. Pretends to work on his macbook pro. Drinks latte.

Now for Prototype

EcmaScript 5

In ES5 we’re a bit spoilt as we have a selection of methods on Object that help with prototypal inheritance.

Object.create takes an argument that’s an object and an optional properties object which is a EcmaScript 5 property descriptor like the second parameter of Object.defineProperties and returns a new object with the first argument passed as it’s prototype and the properties described in the property descriptor (if present) added to the returned object.

// The object we use as the prototype for hobbit.
var person = {
   personType: 'Unknown',
   backingOccupation: 'Unknown occupation',
   age: 'Unknown'
};

var hobbit = Object.create(person);

Object.defineProperties(person, {
   'typeOfPerson': {
      enumerable: true,
      value: function () {
         if(arguments.length === 0)
            return this.personType;
         else if(arguments.length === 1 && typeof arguments[0] === 'string')
            this.personType = arguments[0];
         else
            throw 'Number of arguments not supported. Pass 0 arguments to get. Pass 1 string argument to set.';
      }
   },
   'greeting': {
      enumerable: true,
      value: function () {
         console.log('Hi, I\'m a ' + this.typeOfPerson() + ' type of person.');
      }
   },
   'occupation': {
      enumerable: true,
      get: function () {return this.backingOccupation;},
      // Would need to add some parameter checking on the setter.
      set: function (value) {this.backingOccupation = value;}
   }
});

// Add another property to hobbit.
hobbit.fatAndHairyFeet = 'Yes indeed!';
console.log(hobbit.fatAndHairyFeet); // 'Yes indeed!'
// prototype is unaffected
console.log(person.fatAndHairyFeet); // undefined

console.log(hobbit.typeOfPerson()); // 'Unknown '
hobbit.typeOfPerson('short and hairy');
console.log(hobbit.typeOfPerson()); // 'short and hairy'
console.log(person.typeOfPerson()); // 'Unknown'

hobbit.greeting(); // 'Hi, I'm a short and hairy type of person.'

person.greeting(); // 'Hi, I'm a Unknown type of person.'

console.log(hobbit.age); // 'Unknown'
hobbit.age = 'young';
console.log(hobbit.age); // 'young'
console.log(person.age); // 'Unknown'

console.log(hobbit.occupation); // 'Unknown occupation'
hobbit.occupation = 'mushroom hunter';
console.log(hobbit.occupation); // 'mushroom hunter'
console.log(person.occupation); // 'Unknown occupation'

Object.getPrototypeOf

console.log(Object.getPrototypeOf(hobbit));
// Returns the following:
// { personType: 'Unknown',
//   backingOccupation: 'Unknown occupation',
//   age: 'Unknown',
//   typeOfPerson: [Function],
//   greeting: [Function],
//   occupation: [Getter/Setter] }

EcmaScript 3

One of the benefits of programming in ES 3, is that we have to do more work ourselves, thus we learn how some of the lower level language constructs actually work rather than just playing with syntactic sugar. Syntactic sugar is generally great for productivity, but I still think there is danger of running into problems when you don’t really understand what’s happening under the covers.

So lets check out what really goes on with….

Prototypal Inheritance

What is a Prototype?

All objects have a prototype, but not all objects reveal their prototype directly by a property called prototype. All prototypes are objects.

So, if all objects have a prototype and all prototypes are objects, we have an inheritance chain right? That’s right. See the debug image below.

All properties that you may want to add to an objects prototype are shared through inheritance by all objects sharing the prototype.

So, if all objects have a prototype, where is it stored? All objects in JavaScript have an internal property called [[Prototype]]. You won’t see this internal property. All prototypes are stored in this internal property. How this internal property is accessed is dependant on whether it’s object is an object (object literal or object returned from a constructor) or a function. I discuss how this works below. When you dereference an object in order to find a property, the engine will first look on the current object, then the prototype of the current object, then the prototype of the prototype object and so on up the prototype chain. It’s a good idea to try and keep your inheritance hierarchies as shallow as possible for performance reasons.

Prototypes in Functions

Every function object is created with a prototype property, whether it’s a constructor or not. The prototype property has a value which is a constructor property which has a value that’s actually the function. See the below example to help clear it up. ES3 and ES5 spec 13.2 say pretty much the same thing.

var MyConstructor = function () {};
console.log(MyConstructor.prototype.constructor === MyConstructor); // true

and to help with visualising, see the below example and debug. myObj and myObjLiteral are for the two code examples below the debug image.

var MyConstructor = function () {};
var myObj = new MyConstructor();
var myObjLiteral = {};

Up above in the composition example on line 40 and 41, you can see how we access the prototype of the constructor. We can also access the prototype of the object returned from the constructor like this:

var MyConstructor = function () {};
var myObj = new MyConstructor();
console.log(myObj.constructor.prototype === MyConstructor.prototype); // true

We can also do similar with an object literal. See below.

Prototypes in Objects that are Not Functions

Every object that is not a function is not created with a prototype property (All objects do have the hidden internal [[Prototype]] property though). Now sometimes you’ll see Object.prototype talked about. Even MDN make the matter a little confusing IMHO. In this case, the Object is the Object constructor function and as discussed above, all functions have the prototype property.

When we create object literals, the object we get is the same as if we ran the expression new Object(); (see ES3 and ES5 11.1.5)
So although we can access the prototype property of functions (that may or not be constructors), there is no such exposed prototype property directly on objects returned by constructors or on object literals.
There is however conveniently a constructor property directly on all objects returned by constructors and on object literals (as you can think of their construction procedure producing the same result). This looks similar to the above debug image:

var myObjLiteral = {};
            // ES3 ->                              // ES5 ->
console.log(myObjLiteral.constructor.prototype === Object.getPrototypeOf(myObjLiteral)); // true

I’ve purposely avoided discussing the likes of __proto__ as it’s not defined in EcmaScript and there’s no valid reason to use something that’s not standard.

Polyfilling to ES5

Now to get a couple of terms used in web development well defined before we start talking about them:

A shim is a library that brings a new API to an environment that doesn’t support it by using only what the older environment supports to support the new API.
A polyfill is some code in the form of a function, module, plugin, etc that provides the functionality of a later environment (ES5 for example) if it doesn’t exist for an older environment (ES3 for example). The polyfill often acts as a fallback. The programmer writes code targeting the newer environment as though the older environment doesn’t exist, but when the code is pulled into the older environment the polyfill kicks into action as the new language feature isn’t yet implemented natively.

If you’re supporting older browsers that don’t have full support for ES5, you can still use the ES5 additions so long as you provide ES5 polyfills. es5-shim is a good choice for this. Checkout the html5please ECMAScript 5 section for a little more detail. Also checkout Kangax’s ECMAScript 5 compatibility table to see which browsers currently support which ES5 language features. A good approach and one I like to take is to use a custom build of a library such as Lo-Dash to provide a layer of abstraction so I don’t need to care whether it’ll be in an ES5 or ES3 environment. Then for anything that the abstraction library doesn’t provide I’ll use a customised polyfill library such as es5-shim to fall back on. I prefer to use Lo-Dash over Underscore too, as I think Lo-Dash is starting to leave Underscore behind in terms of performance and features. I also like to use the likes of yepnope.js to conditionally load my polyfills based on whether they’re actually needed in the users browser. As there’s no point in loading them if we have browser support now is there?

Polyfilling Object.create as discussed above, to ES5

You could use something like the following that doesn’t accommodate an object of property descriptors. Or just go with the following next two choices which is what I do:

Use an abstraction like the lodash create method which takes an optional second argument object of properties and treats them the same way
Use a polyfill like this one.

if (typeof Object.create !== 'function') {
   (function () {
      var F = function () {};
      Object.create = function (proto) {
         if (arguments.length > 1) {
            throw Error('Second argument not supported');
         }
         if (proto === null) {
            throw Error('Cannot set a null [[Prototype]]');
         }
         if (typeof proto !== 'object') {
            throw TypeError('Argument must be an object');
         }
         F.prototype = proto;
         return new F();
      };
   })();
};

Polyfilling Object.getPrototypeOf as discussed above, to ES5

Use an abstraction like the lodash isPlainObject method (source here), or…
Use a polyfill like this one. Just keep in mind the gotcha.

EcmaScript 6

I got a bit excited when I saw an earlier proposed prototype-for (also seen with the name prototype-of) operator: <| . Additional example here. This would have provided a terse syntax for providing an object literal with an object to use as its prototype. It looks like it must have lost traction though as it was removed in the June 15, 2012 Draft.

There are a few extra methods in ES6 that deal with prototypes, but on trawling the EcmaScript 6 draft spec, nothing at this stage that really stands out as revolutionising the way I write JavaScript or being a mental effort/time saver for me. Of course I may have missed something. I’d like to hear from anyone that has seen something interesting to the contrary?

Yes we’re getting class‘s in ES6, but they are just an abstraction giving us a terse and declarative mechanism for doing what we already do with functions that we use as constructors, prototypes and the objects (or instances if you will) that are returned from our functions that we’ve chosen to act as constructors.

Architectural Ideas that Prototypes Help With

This is a common example that I often use for domain objects that are fairly hot that use one set of accessor properties added to the business objects prototype, as you can see on line 13 of my Hobbit module (Hobbit.js) below.

First a quick look at the tests/spec to drive the development. This is being run using mocha with the help of a Makefile in the root directory of my module under test.

Makefile

# The relevant section.
unit-test:
	@NODE_ENV=test ./node_modules/.bin/mocha \
		test/unit/*test.js test/unit/**/*test.js

Hobbit-test.js

var requireFrom = require('requirefrom');
var assert = require('assert');
var should = require('should');
var shire = requireFrom('shire/');

// Hardcode $NODE_ENV=test for debugging.
process.env.NODE_ENV='test';

describe('shire/Hobbit business object unit suite', function () {
   it('Should be able to instantiate a shire/Hobbit business object.', function (done) {
      // Uncomment below lines if you want to debug.
      //this.timeout(444000);
      //setTimeout(done, 444000);

      var Hobbit = shire('Hobbit');
      var hobbit = new Hobbit();

      // Properties should be declared but not initialised.
      // No good checking for undefined alone, as that would be true whether it was declared or not.

      hobbit.should.have.property('id');
      (hobbit.id === undefined).should.be.true;
      hobbit.should.have.property('typeOfPerson');
      (hobbit.typeOfPerson === undefined).should.be.true;
      hobbit.should.have.property('greeting');
      (hobbit.greeting === undefined).should.be.true;
      hobbit.should.have.property('occupation');
      (hobbit.occupation === undefined).should.be.true;
      hobbit.should.have.property('emailFrom');
      (hobbit.emailFrom === undefined).should.be.true;
      hobbit.should.have.property('name');
      (hobbit.name === undefined).should.be.true;      

      done();
   });

   it('Should be able to set and get all properties of a shire/Hobbit business object.', function (done){
      // Uncomment below lines if you want to debug.
      this.timeout(444000);
      setTimeout(done, 444000);

      // Arrange
      var Hobbit = shire('Hobbit');
      var hobbit = new Hobbit();      

      // Act
      hobbit.id = '32f4d01e-74dc-45e8-b3a8-9aa24840bc6a';
      hobbit.typeOfPerson = 'short and hairy';
      hobbit.greeting = {
         intro: 'Hi, I\'m a ',
         outro: ' type of person.'};
      hobbit.occupation = 'mushroom hunter';
      hobbit.emailFrom = 'Bilbo.Baggins@theshire.arn';
      hobbit.name = 'Bilbo Baggins';

      // Assert
      hobbit.id.should.equal('32f4d01e-74dc-45e8-b3a8-9aa24840bc6a');
      hobbit.typeOfPerson.should.equal('short and hairy');
      hobbit.greeting.should.equal('Hi, I\'m a short and hairy type of person.');
      hobbit.occupation.should.equal('mushroom hunter');
      hobbit.emailFrom.should.equal('Bilbo.Baggins@theshire.arn');
      hobbit.name.should.eql('Bilbo Baggins');

      done();
   });
});

Now the business object itself Hobbit.js
–
Now what’s happening here is that on instance creation of new Hobbit, the empty members object you see created on line 9 is the only instance data. All of the Hobbit‘s accessor properties are defined once per export of the Hobbit module which is assigned the constructor function object. So what we store on each instance are the values assigned in the Hobbit-test.js from lines 47 through 54. That’s just the strings. So very little space is used for each instance of the Hobbit function returned by invoking the Hobbit constructor that the Hobbit module exports.

// Could achieve a cleaner syntax with Object.create, but constructor functions are a little faster.
// As this will be hot code, it makes sense to favour performance in this case.
// Of course profiling may say it's not worth it, in which case this could be rewritten.
var Hobbit = (function () {
   function Hobbit (/*Optionally Construct with DTO and serializer*/) {
      // Todo: Implement pattern for enforcing new.
      Object.defineProperty (this, 'members', {
         value: {}
      });
   }

   (function definePublicAccessors (){
      Object.defineProperties(Hobbit.prototype, {
         id: {
            get: function () {return this.members.id;},
            set: function (newValue) {
               // Todo: Validation goes here.
               this.members.id = newValue;
            },
            configurable: false, enumerable: true
         },
         typeOfPerson: {
            get: function () {return this.members.typeOfPerson;},
            set: function (newValue) {
               // Todo: Validation goes here.
               this.members.typeOfPerson = newValue;
            },
            configurable: false, enumerable: true
         },
         greeting: {
            get: function () {
               return this.members.greeting === undefined ?
                  undefined :
               this.members.greeting.intro +
                  this.typeOfPerson +
                  this.members.greeting.outro;
            },
            set: function (newValue) {
               // Todo: Validation goes here.
               this.members.greeting = newValue;
            },
            configurable: false, enumerable: true
         },
         occupation: {
            get: function () {return this.members.occupation;},
            set: function (newValue) {
               // Todo: Validation goes here.
               this.members.occupation = newValue;
            },
            configurable: false, enumerable: true
         },
         emailFrom: {
            get: function () {return this.members.emailFrom;},
            set: function (newValue) {
               // Todo: Validation goes here.
               this.members.emailFrom = newValue;
            },
            configurable: false, enumerable: true
         },
         name: {
            get: function () {return this.members.name;},
            set: function (newValue) {
               // Todo: Validation goes here.
               this.members.name = newValue;
            },
            configurable: false, enumerable: true
         }
      });

   })();
   return Hobbit;
})();

// JSON.parse provides a hydrated hobbit from the DTO.
//    So you would call this to populate this DO from a DTO
// JSON.stringify provides the DTO from a hydrated hobbit

module.exports = Hobbit;

Now running the test

Flyweights using Prototypes

A couple of interesting examples of the Flyweight pattern implemented in JavaScript are by the GoF and Addy Osmani.

The GoF’s implementation of the FlyweightFactory makes extensive use of closure to store its flyweights and uses aggregation in order to create it’s ConcreteFlyweight from the Flyweight. It doesn’t use prototypes.

Addy Osmani has a free book “JavaScript Design Patterns” containing an example of the Flyweight pattern, which IMO is considerably simpler and more elegant. In saying that, the GoF want you to buy their product, so maybe they do a better job when you give them money. In this example closure is also used extensively, but it’s a good example of how to leverage prototypes to share your less specific behaviour.

Mixins using Prototypes

Again if you check out the last example of Mixins in Addy Osmani’s book, there is quite an elegant example.

We can even do multiple inheritance using mixins, by adding which ever properties we want from what ever objects we want to the target objects prototype.

This is a similar concept to the post I wrote on Monkey Patching.

Mixins support the Open/Closed principle, where objects should be able to have their behaviour modified without their source code being altered.

Keep in mind though, that you shouldn’t just expect all consumers to know you’ve added additional behaviour. So think this through before using.

Factory functions using Prototypes

Again a decent example of the Factory function pattern is implemented in the “JavaScript Design Patterns” book here.

There are many other areas you can get benefits from using prototypes in your code.

Prototypal Inheritance: Not Right for Every Job

Prototypes give us the power to share only the secrets of others that need to be shared. We have fine grained control. If you’re thinking of using inheritance be it classical or prototypal, ask yourself “Is the class/object I’m wanting to provide a parent for truly a more specific version of the proposed parent?”. This is the idea behind the Liskov Substitution Principle (LSP) and Design by Contract (DbC) which I posted on here. Don’t just inherit because it’s convenient In my “javascript object creation patterns” post I also discussed inheritance.

The general consensus is that composition should be favoured over inheritance. If it makes sense to compose once you’ve considered all options, then go for it, if not, look at inheritance. Why should composition be favoured over inheritance? Because when you compose your object from another contract of an object, your sub object (the object doing the composing) doesn’t inherit anything or need to know anything about the composed objects secrets. The object being composed has complete freedom as to how it minds it’s own business, so long as it provides a consistent contract for consumers. This gives us the much loved polymorphism we crave without the crazy tight coupling of classical inheritance (inherit everything, even your fathers drinking problem :-s).

I’m pretty much in agreement with this when we’re talking about classical inheritance. When it comes to prototypal inheritance, we have a lot more flexibility and control around how we use the object that we’re deriving from and exactly what we inherit from it. So we don’t suffer the same “all or nothing” buy in and tight coupling as we do with classical inheritance. We get to pick just the good parts from an object that we decide we want as our parent. The other thing to consider is the memory savings of inheriting from a prototype rather than achieving your polymorphic behaviour by way of composition, which has us creating the composed object each time we want another specific object.

So in JavaScript, we really are spoilt for choice when it comes to how we go about getting our fix of polymorphism.

When surveys are carried out on..

Why Software Projects Fail

the following are the most common causes:

Ambiguous Requirements
Poor Stakeholder Involvement
Unrealistic Expectations
Poor Management
Poor Staffing (not enough of the right skills)
Poor Teamwork
Forever Changing Requirements
Poor Leadership
Cultural & Ethical Misalignment
Inadequate Communication

You’ll notice that technical reasons are very low on the list of why projects fail. You can see the same point mentioned by many of our software greats, but when a project does fail due to technical reasons, it’s usually because the complexity got out of hand. So as developers when focusing on the art of creating good code, our primary concern should be to reduce complexity, thus enhance the ability to maintain the code going forward.

I think one of Edsger W. Dijkstra’s phrases sums it up nicely. “Simplicity is prerequisite for reliability”.

Stratification is a design principle that focuses on keeping the different layers in code autonomous, I.E. you should be able to work in one layer without having to go up or down adjacent layers in order to fully understand the current layer you’re working in. Its internals should be able to move independently of the adjacent layers without effecting them or being concerned that a change in it’s own implementation will affect other layers. Modules are an excellent design pattern used heavily to build medium to large JavaScript applications.

With composition, if your composing with contracts, this is exactly what you get.

References and interesting reads

I also discuss the prototype linkage briefly in my post on JavaScript object creation patterns.
Pitfalls of Constructor Functions
Exemplar Patterns
Exemplar theory
Taking another look at classical inheritance
Angus Croll on Prototypes
Yehuda Kats Understanding Prototypes in JavaScript

Tags:ecmascript, EcmaScript 3, EcmaScript 5, EcmaScript 6, ES3, ES5, es6, Garbage Collection, GC, inheritance, JavaScript, polymorphism, prototypal inheritance, prototype
Posted in Architecture, Garbage Collection, JavaScript, Scripting, Web | 2 Comments »

Automating Specification by Example for .NET Web Applications

February 22, 2014

If you or your organisation:

are/is constrained to running your .NET tests (unit, acceptance) on-site rather than in the cloud
would like some guidance on how to set-up Continuous Integration

read on.

Introduction

Purpose

Remember, an acceptance test system as a tool is only as good as the specification provided by it’s humans. The most important ingredients there-for is the relationships between the people creating the tests and the interactions performed by those people. Or as the Agile Manifesto states: Value “Individuals and interactions over processes and tools”. In order for an acceptance test system to be successful, the relationships of the Developers creating the increment and the interactions between them and the stake holders must be in good shape first. Once this is in order, you can take the next step and find some tools that will assist in creating working software that does what the stake holders want it to do.

It’s my intention that the following details will help you to create a system that automates “Specification by Example”.

The purpose of providing an automated Specification by Example Implementation, A.K.A Automated Acceptance Test System, is clearly explained here.

Do not fall into the trap of inverting the test triangle. Instead invest where it matters.

Scope

Create a system that can be triggered from

Every developers workstation
A build on the build machine, preferably from a best of bread build tool. TFS is not a best of bread build tool and if you want to get serious about Continuous Integration (CI), nightly builds, continuous deployment, I’d recommend not going down the path of TFS. Even Microsoft uses Git. Doesn’t that tell you something? Do you see TFS here? Last time I evaluated build tools, Jenkins previously named Hudson came out on top.

The system will include

An acceptance test framework that will run all the acceptance tests
A Unit test framework. UI tests need to be run in parallel on a collection of VM’s (See the section on supported browsers for why). There are three immediately obvious approaches we could take here.
1. We could try and rely on a unit test framework to distribute the tests. MSTest 2012 doesn’t provide the ability to run tests in parallel, but 2010 does. In order to have 2012 run tests in parallel, you can force it to use the 2012 test settings file. Only a maximum of 5 tests can be run concurrently though. Not a great option, considering it’s not going to be supported going forward.
2. My ParallelBrowser. If this link is not active and you’re interested in this, contact me.
3. PNUnit. An example of how this works is here under the “PNunit Framework for writing selenium test cases” heading. I wrote the ParallelBrowser before Selenium had good support for running the same tests on multiple supported browsers. Both my ParallelBrowser and this option are reasonable options, but I’d go for the latter now. This way someone else can maintain the parallel aspect. As unless people are interested in ParallelBrowser I won’t be doing any further work on it.
A Web User Interface Test Framework that will be driven by the acceptance test framework. Selenium in this case.
A set of tests that run Selenium tests. These will of course need to be thread-safe.
As per the Supported Browsers section, a collection of VM’s with our supported browsers installed.
1. Each with a standalone selenium server setup with a role of webdriver. Details further on.
A stand-alone selenium server setup with a role of hub

High Level Flow

Many organisations bound to .NET seem to be locked into using sub-standard tooling like TFS for their build. If you are in this predicament and can not break free, I’d suggest once all the unit tests, integration tests have run, then have the build kick off a psake script to:

Clean out the existing target web app
Deploy the newly built and tested web app
Drop the database
Create database by using latest DDL and DML scripts pulled from source control
Apply any specific configurations
Stop and start the target web server
Run the acceptance tests which will include any Web UI tests.

If it’s within your power to choose a real CI Tool to run in-house, there are a handful of very solid contenders. A good proportion of which are free and open source.

Audience

Who ever is setting up the system. Often a developer or two. It’s important to make sure more than one person knows how it all hangs together, otherwise you have a single point of failure.

Chosen Tools

Evaluation Criterion I used

Who is the creator? I favour teams rather than individuals, as individuals move on often leaving projects stranded?
Does it do what you need it to do?
Does it suite the way you and your team want to work?
Does it integrate well with all of your other chosen components? This is based on communicating with those that have used the offerings more so than using Proof Of Concepts (POC).
Works with the versions of dependencies you currently use.
Cost in money. Is it free? Are there catches once you get further down the road? Usually open source projects are marketed as is. No catches
Cost in time. Is the set-up painful? Customisation feedback? Upgrade feedback?
How well does it appear to be supported? What do the users say?
Documentation. Is there any / much? What is its quality?
Community. Does it have an active one? Are the users getting their questions answered satisfactorily? Why are the unhappy users unhappy (do they have valid reasons).
Release schedule. How often are releases being made? When was the last release?
Intuition. How does it feel. If you have experience in making these sorts of choices, lean on it. Believe it or not, this should probably be No. 1

The following tools have been my choice based on the above criterion.

Acceptance Test Framework

The following offerings are all free and open source.

If you’re not using User Stories and/or Test Conditions, the context/specification offerings provide greater flexibility than the xBehave style frameworks. As most Scrum teams use User Stories for their Product Backlog items and drive their acceptance tests with test conditions, xBehave offerings are a great choice. In saying that, there is probably no reason why both couldn’t be used where it makes sense to do so. In this section I’ve provided the results of evaluating the current xSpec and xBehave offerings for .NET ordered by best first for the categories.

xBehave (test conditions)

SpecFlow

Sourcecode: https://github.com/techtalk/SpecFlow/
Age: Over 4 years
Actively maintained: Yes
Large number of active committers
Community: Lively
Visual Studio Plug-in has been downloaded 70 times as many times as NBehave
Documentation: Excellent
Integrates well with Selenium (I’ve setup a couple of systems using SpecFlow and it’s been a joy to work with). The stake holders loved the visibility it provided too. I discussed it here in a recent presentation.

NBehave

Not a lot of activity
Only two committers

StoryQ

Only two coordinators
Well established framework

xSpec (context/specification)

Machine.Specification (MSpec)

Sourcecode: https://github.com/machine/machine.specifications
Age: Over 6 years.
Actively maintained: Yes
Active committers: enough
Machine.Fakes (NuGet packages available). Simplifies the usage of mocking libraries on top of MSpec, reducing a lot of the typical fake framework related clutter code in specifications. Wrapper API’s available to abstract the usage of mocking frameworks, allowing you to swap them without re-writing your test code.
Machine.Specifications.Mvc4 (NuGet package) is a set of extensions for testing ASP.NET MVC specific types
Documentation: Not great, but RSpec has plenty and the syntax is very similar.
Integrating with Selenium: See the github page for links.
How to set-up MSpec
Getting started with MSpec
How to MSpec

NSpec

Sourcecode: https://github.com/mattflo/NSpec/
Age: Under 2 years
Actively maintained: Yes
Active committers: enough
Documentation: Not great, but RSpec has plenty and the syntax is very similar.
Home page.

Web User Interface Test Framework

For me when I look at this category of tools for .NET, Selenium is always at the top and it just keeps getting better. If anyone has any questions around Selenium, feel free to contact me or leave a comment on this post. I can’t guarantee I’ll have the answer, but I’ll try. All the documentation can be found here. I would recommend installing the Selenium IDE for initially recording tests and be sure to check-out the IDE plug-ins. All the documentation you’ll need for the IDE is here. Once you get familiar with the code it generates, you will not use it much. I would recommend using the newer Web drivers rather than the selenium server by itself. The user group is very active and looks like a good place to ask questions also. Although I haven’t needed to as there is a huge amount of documentation that’s great.

The tools I would use are detailed here. Specifically we would be using

Selenium 2 (aka WebDriver)
The IDE for recording tests initially
Selenium Server which is used by WebDriver and RC (now considered legacy) now includes built-in grid capabilities.

Supported Browsers

What I’ve done in the past is have each of our supported versions from each supported browser vendor installed on a single VM. So each VM has all the vendors browsers installed, but just a single version obviously.

Mid Level Flow

These are the same points listed above under “High Level Flow“

1. Build Kicks off PSake Script

The choice to use PSake over the likes of NAant, Rake and the other build scripting languages is reasonably straight forward for me. PSake (PowerShell build scripting language) gives us access to the full .NET environment. NAnt with all it’s angle brackets, was never a very nice scripting language to use. Rake is excellent and a possible option if you have ruby installed. If you don’t, why install it if you have .NET? There are many resources for PowerShell on the inter-webs. The wiki for PSake is good.

In the case where you may have a TFS Build run, I would suggest once all the unit tests and integration tests have run, then the build kicks off a possibly pre-build and post-build psake script to perform the following operations. This is how you do this. Oh, before you try to actually run a PSake script, download and import the module, or install the NuGet package. So once you have your PSake scripts running, just start adding PowerShell scripts to do the following work. PSake is just syntactic sugar around PowerShell, so anything you can do with PS, you can do with PSake.

2. Clean out the existing target web application

Using your PSaki script, use the Web Deploy cmdlets. You will find everything you need here for it. You can also install the NuGet package.

3. Deploy the newly built and unit tested web application

As above, just use the Web Deploy cmdlets.

4. Drop the database

As above, just use the Web Deploy cmdlets.

5. Create database by using latest DDL and DML scripts pulled from source control

Database update via Application

Kind of related, but not specific to CI.

Depending on your needs, there are quite a few ways you could do this.

One way of doing this is to have your application utilise a library that determines which version of the database the application needs and be able to update the database accordingly. This library would use similar or the same upgrade scripts that we would use in this test process.

Your applications should create (if non existent) and update database on run. So all the DDL, DML code per database lives in a library. Each application that uses a specific database, references the databases DDL code library. Script all stored procedures, views, functions, triggers they’re recreated as part of a deployment scrip.

When the application is deployed, and the database created or updated, anything that must be there for the application to run out of the box should be part of the scripts, and of course versioned. This includes the part of our data that is constant or configuration data. Tables, stored procedures, views, functions and triggers. For the variable part of your data, you will need a synthetic data generation plan for testing.

Database Process for Versioning

Also related, but not specific to CI.

DBA, Devs, Product Owner and consultants must be aware of the process.

When any schema, constant data, configuration data, test data is updated… the (version controlled) scripts must also be updated, else the updates will get overwritten.

As part of the nightly build, if your supporting multiple versions of your application, you could also hydrate the collection of database versions, then run the appropriate upgrade scripts against each one, to verify the upgrades work. If any don’t, the build fails.

Create set of well defined processes that:

In most cases, looks after itself
Upgrades existing databases if they are not on the latest version, to the latest version
Creates databases for those applications that don’t have a database
Informs the user on deployment if the database is corrupt, or can not be upgraded
Outlines who is responsible for, and who may update the DDL and DML scripts for your projects
Clearly documents that any changes made to any databases by un-authorised personal will more than likely be overwritten.

A User Story for this might look something like the following:

As the team, we need to create a set of well defined processes that clearly outline what is required in regards to setting up the development teams database versioning, creation, upgrade systems and processes strategy for our organisations databases. So that all team personal are aware of the benefits and dangers of making changes to the databases, and understand the change process.

Possibly useful tools

1. DB Ghost
2. http://www.red-gate.com/products/sql-development/sql-source-control/index-2
3. http://www.sqlaccessories.com/SQL_Data_Examiner/

6. Apply any specific configurations

As above, just use the Web Deploy cmdlets.

7. Stop and start the target web server

As above, just use the Web Deploy cmdlets.

8. Run the acceptance tests which will include any Web UI tests

As above, just use the Web Deploy cmdlets.

Start each VM that hosts a set of browsers you want to use to farm your tests out to. From memory, you do not need to start each browser. There are of course many ways to do this. PS provides the following cmdlets Start-VM and Stop-VM. These would be my first options.
Start the selenium standalone server. All details found here. Or just work through the “Distributed Testing with Selenium Grid” chapter until you get to the “Creating and executing Selenium script in parallel with TestNG” heading, at which point switch to this documentation to replace TestNG with PNUnit.

If I’ve failed to explain anything in enough detail for you, drop me a message below and I’ll do my best to help 🙂

Tags:.NET, ATDD, BDD, CI, Continuous Integration, jenkins, Machine.Specification, MSpec, nightly build, NSpec, operational efficiencies, performance, PNUnit, power shell, PS, psake, selenium, SpecFlow, specification by example, tdd, xBehave, xSpec
Posted in Agile, C#, Frameworks, Microsoft, PowerShell, Projects, Scrum, Testing, Version Control, Virtualisation | 2 Comments »

Evaluation of .Net Mocking libraries

December 14, 2013

I’ve recently undertaken another round of evaluating .NET mocking (fake/substitute/dummy/stub/ or what ever you want to call them now) libraries. Interestingly the landscape has changed quite a bit since last time I went through this exercise, which was about two years ago. The outcome of the previous investigation is at the bottom of this post.

Evaluation criterion

Who is the creator. I’ve favoured teams rather than individuals, as individuals move on, then where does that leave the product? RhinoMocks is a prime example of this. It’s was an excellent library. maybe a new owner, maybe not.
Does it do what we need it to do?
Are there any integration problems with all of our other chosen components? Works with .Net versions the development team are using. Any other complaints around integration?
Cost in money. Is it free? Are there catches once you get further down the road? Usually open source projects are marketed as is. No catches
Cost in time. Is the set-up painful? Customisation feedback? Upgrade feedback?
How well does it appear to be supported? What do the users say?
Documentation. Is there any / much? What is it’s quality?
Community. Does it have an active one? Are the users getting their questions answered satisfactorily? Why are the unhappy users unhappy (do they have a valid reason).
Release schedule. How often are releases being made? When was the last release?

Following is the collection of libraries I looked at. Numbering from highest scorers to lowest. All have NuGet packages:

How the Playing Field Looks Today

NSubstitute (new style)

Free and open source.

Source code: https://github.com/nsubstitute/NSubstitute/

BDFL has 534 commits. Next highest is 30.

4.5 years old. Recent activity.

Stackoverflow 69 tagged questions

Has an active Google discussion group

Regular releases

Documentation looks very good.

Very easy to read, well thought out syntax.

FakeItEasy (new style)

Free and open source.

Source code: https://github.com/FakeItEasy/FakeItEasy/

Nice spread across contributors. No single point of failure.

Almost 4 years old.

Plenty of current activity. About 30% more than NSubstitute

Stackoverflow 85 tagged questions

Regular releases

Documentation looks OK.

Syntax looks OK.

JustMock

Not free and closed source.

If you happen to have a Telerik Devcraft bundle you’ll be entitled to one free JustMock license. Not much help if you want to use all the features across the team.

There is a light free version which has most/all of the features that most development teams would require.

It would have to be head and shoulders above the rest to warrant paying for it. Going on the feature set I don’t think it is, but I haven’t used it. Plus I have more confidence in the right open source offerings.

$US400 license per user.

Light edition is free, but I don’t see any reason why they couldn’t remove this offering or put a price tag on it.

NuGet package

Are we prepared to invest building code around this with the possibility of it becoming not free?

Lite vs full: http://www.telerik.com/freemocking.aspx#comparison

Doesn’t appear to be a lot of community around the free edition.

Moq

Free and open source.

Source code: https://github.com/Moq/moq4

Last release was 2013-11-18 previous to that it was 2.5 years ago.

Very small learning curve

Rhino Mocks

Free and open source.

Source code: https://github.com/hibernating-rhinos/rhino-mocks

Last activity: 3 years ago.

Has a new owner (MIke Meisinger), but I haven’t seen any new work yet.

There were also NMock and TypeMock which didn’t evaluate high enough this time or last time.

–

if it walks like a duck and quacks like a duck

How the Playing Field Looked Two Years Ago

Rhino Mocks

Free and open source.
Very full featured.
Easy enough to use.
logical and consistent syntax.
Most up to date documentation (best place to start)
somewhat out of date documentation, but more of it than the above link.
Community, Download, More code examples here.
Example of the old record/playback syntax as opposed to the new AAA syntax.
Keeping up to date on the progress of Rhino Mocks.
The most popular mocking framework two years ago.

Moq

Clean discoverable API design and lack of complicated record/playback model, which is nice.
Have used this, and haven’t had any issues I couldn’t get around.
Very easy to learn and use.

TypeMock

Commercial product (expensive, so not really viable).
Ability to mock anything including statics, privates and events on multiple languages.

NMock

Appears to be abandoned

I’ve just started using NSubstitute and have used Rhino Mocks, Moq and NMock previously.

Feel free to offer your experiences on the mocking libraries you have used and comparisons. I’d love to hear your experiences with these and other mocking libraries.

Tags:.NET, dummy, fake, FakeItEasy, faking, Free and open source, JustMock, mock, mocking, Moq, NMock, NSubstitute, Rhino Mocks, sbubbing, stub, stubbing, substitute
Posted in C#, Frameworks, Libraries | 2 Comments »

Up and Running with Sass (scss) and Less in Visual Studio

November 26, 2013

I recently evaluated the support for the top two CSS preprocessors (Sass and Less) for the environment my client team and myself are currently constrained to (Visual Studio 2012).

I setup Less and Sass and decided to go with Sass. Below I outline what the process was, briefly what each has to offer and why the decision went the Sass way. Both Sass and Less are super sets of CSS, so you can just rename your CSS file extension to .less if using Less or .scss (file extension for the newer Sass format (stands for Sassy CSS)) if you decide to use Sass and save the file. A CSS file will be automatically generated. Now you can just start changing the CSS to Sass or Less to start using all the new functionality you have at your disposal. There is no preprocessor lock-in. If I decided to change to Less or even back to CSS, you’d just do the same thing with the generated CSS file.

If you’re not up to speed on what a CSS preprocessor is or why you would want to use one on your web project, now’s a good time to find out before we carry on with the two setup procedures. Go and checkout Sass and Less.

Less

Less was originally written in Ruby, but later re-written in JavaScript.

Less setup is reasonably straight forward, but didn’t work straight away like the last time I used it. Turns out it was just an install order thing.
In Visual Studio…
Install “Web Essentials 2012” through TOOLS -> Extensions and Updates
Then
Install Web Developer Tools 2012.2. You may have them already installed. Install them again if you do.
Web Essentials actually provides us with a lot of other features I’d promote like JSHint (a better JSLint) and quite a few other goodies.

In Visual Studio TOOLS -> Options -> Web Essentials -> LESS
You’ll see something like this…

This is how I set up the options. I usually turn of the “Show preview window”, but it’s fine to have it on to start with, so you can see the .css getting generated side by side with the Less.There are other extensions that support Less also, but I think this was the easiest setup.

Sass (scss)

Written in Ruby.

I tried the SassyStudio extension. It has

syntax highlighting
region outlining
some intellisense support
CSS generation.

I also tried Mindscape Web Workbench which also has CoffeeScript and LESS support. It has what SassyStudio has plus

(by the look of it, better intellisense support)
warnings of syntax errors
warnings of unknown variables and mixins
go to variable or mixin definition
CSS file minification (pro edition only. You probably don’t need this as there are plenty of other ways to minify)
more customisation capabilities than SassyStudio.

The Setup

You may also need to Install Web Developer Tools 2012.2 as I did for the Less trial above. You may as well do this anyway to make sure you’re on the latest version.
In Visual Studio…
Install “Mindscape Web Workbench” through TOOLS -> Extensions and Updates.

Better Sass (or more correctly with the new format scss) extensions will keep appearing I’d say, so when they do it’d probably be worth looking at them. The only thing so far I don’t like about Mindscape Web Workbench is that it has a small “Go Pro” label at the bottom of the scss file. I don’t think it’ll bother anyone to much though.

Adding a scss file

Right click on the folder you want to add the new file to -> Add -> New Scss File… -> Save

Once you’ve created a new scss file or renamed a CSS file to have the .scss extension and saved it, for now we can commit both of these files to source control. You’ll need to setup a preprocessor on the build server in order to be able to not have the generated CSS files in source control.

Why I chose Scss over Less

Sass has similar functionalities to Less (nested rules, variables, mixins, functions, inheritance, operators), but it can be used with Compass and Susy. Compass provides a framework of functions and add-ons built on top of Sass. Compass automatically handles image spriting, writes cleaner code, provides page layout tools, resets and lots of other useful features. Susy is a responsive grid add-on for Compass. Mindscape Web Workbench also provides support for Compass . So if we want to take advantage of these sometime, they are available.

Sass (scss) Resources

Blog post “Beginner’s Guide”
There are also a couple of books at least on Sass and Compass
http://sass-lang.com/guide
Documentation

Tags:CSS, JSHint, less, preprocessor, Sass, scss, visual studio
Posted in CSS, Microsoft, Web | 2 Comments »

Software Engineer Interview Quick Question Set

May 11, 2013

Ice breakers

Tell us a little bit about yourself and what drives you?
Ask a question from their CV that is positive, ‘what was your greatest success in your current or last role’
What’s your ideal job?
Can you give us one thing you really enjoyed in your last job?
What about one thing that you didn’t enjoy as much?
How did you solve that?

Testing

How can you implement unit testing when there are dependencies between a business layer and a data layer, or the presentation layer and the business layer?
The development team is getting near release date. They start saying things like, we’re going to need a sprint to test. What would your reaction be?

Maintenance

What measures have you taken to make your software products more easily maintainable?
What is the most expensive part of the SDLC?
(hint: reading others code)

Design and architecture

Can you explain some design patterns, and where you have used them?

Scrum

Have you used scrum before? (If the answer is no, move on)
If you were taken on as a team member and the team was failing Sprint after Sprint. What would you do?
What would you do if you were part of a Scrum Team and your manager asked you to do a piece of work not in the Scrum Backlog?
(hint: manager needs to consult PO. Something has to be removed from Sprint backlog in order for something to be added)

Construction

When do you use an abstract class and when do you use an interface?
How do you make sure that your code is both safe and fast?
Can you describe the process you use for writing a piece of code, from requirements to delivery?

Software engineering questions

What are the benefits and drawbacks of Object Orientated Design?
(hint: polymorphism inheritance encapsulation)
What books have you read on software engineering that you thought were good?
Explain the terms YAGNI, DRY, SOLID?
(hint You Aint Gonna Need It. Build what you need as you need it, aggressively refactoring as you go along; don’t spend a lot of time planning for grandiose, unknown future scenarios. Good software can evolve into what it will ultimately become. Every piece of code is code we have to test. If the code is not needed, why are we spending time on it?)

Functional design questions

Which controls would you use when a user must select multiple items from a big list, in a minimal amount of space?
How would you design editing twenty fields for a list of 10 items? And editing 3 fields for a list of 1000 items?

Specific technical requirements

When, where and how do you optimize code?

Web questions

How would you mitigate SQL injection?
(hint: looking for multi layered sanitisation. parameterised SQL. Least privileged account for data access)
Have you used XSS and can you provide us an example?
What JavaScript libraries have you used?
What are some of the irritating limitations of CSS?
How would you remove the ASP.NET_SessionId cookie from a MVC controllers Response?
(hint: Response.Cookies["ASP.NET_SessionId"].Expires = DateTime.Now;)

JavaScript

How does JavaScript implement inheritance?
(hint: via Object’s prototype property)

Service Oriented

What are the 3 things a WCF end point must have, or what is the ABC of a WCF service?
(hint:
Address – where the WCF service is hosted.
Binding – that specifies the protocol and its myriad of options.
Contract – service contract defines what service operations are available to the client for consumption.
)

C# / .Net questions

What’s the difference between public, private, protected and internal modifiers?
What are the main differences between the .NET 2.0 and 4.0 garbage collector?
(hint: background GC was introduced)
Describe the different ways arguments can be passed in C#
(hint: pass val by val, pass val by ref, pass ref by val, pass ref by ref)
We have a Base class, we have a child class that inherits BaseClass. Does the child class inherit the base class’s private members?
(hint: this is normally good for a laugh)
Have you ever worked with a deadlock and how did it occur?
When should locks be used in concurrent programming?
(hint:
when synchronization cannot be performed in any other way. This is rare. With careful thought and planning, there is just about always a better way. There are many ways to synchronise without using locks. System.Threading.Interlocked class generally supported by the processor
)
What are some of your favourite .NET features?

Finally, this question is from Google; can you quickly tell us something that we don’t know anything about? It can be anything.

Tags:Interview Questions
Posted in Agile, Architecture, C#, Development Methodologies, JavaScript, Scrum, Uncategorized | 2 Comments »

Software Engineer Interview Process and Questions

April 27, 2013

A short time ago, I was tasked with finding the right software engineer/s for the organisation I was working for. I settled on a process, a set of background questions, a set of practical programming exercises and a set of verbal questions. Later on I cut the set of verbal questions down to a quicker set. In this post, I’ll be going over the process and the full set of verbal questions. In a subsequent post I’ll go over the quicker set.

The Process

We sent them an email with a series of questions.
Technical and non-technical.
They have two days to reply with answers.
The programming exercises are not covered here.
If they passed this…

We would get them in for an interview.
Technical and non-technical questions would be asked.
They would be put on the spot and asked to speak to the development team about a technical subject that they were familiar with.
The development team would quiz them on whatever comes to mind.
Once the candidate had left, the development team would collaborate on what they thought of the candidate and whether or not they would be a good fit for the team.
The team would take this feedback and discuss whether the candidate should be given a trial. Step 2 could be broken into two parts depending on how many questions and their intensity, you wanted to drill the candidate with.

The following set of tests will confirm whether the candidate satisfies the points we have asked for in the job description.

The non functional (soft) qualities listed on the Job add would need to be kept in mind during the interview events.

Qualities such as:

Quality focus
Passion
Personality
Commitment to the organisations needs
A genuine sense of excitement about the technologies we work with

Email test

Send Screening.pdf
Send InterviewQuestions.doc

Now with the following questions, with many of them there is not necessarily a right or wrong answer. Many of them are just to gauge how the candidate thinks and whether or not they hold the right set of values.

Ice breakers

Would you like to be the team leader or team member?
Tell me about a conflict at a previous job and how you resolved it.
(Summary personality item: Think to yourself, “If we hire this person, would I want to spend four hours driving in a car with them?”)

Design and architecture

What’s the difference between TDD and BDD and why do they matter?
What is Technical Debt. How do you deal with it once in it? How do you stay out of it?
How would you deal with a pair when reviewing their code, when they have not followed good design principles?
What would you do if a fellow team member reviewed your code and suggested you change something you had designed that followed good design principles, to something inferior?
Can you explain how the Composite pattern works and where you would use it?
Can you describe several class construction techniques?
What are two design patterns that are focused on class construction, and how do they work?
(hint: Builder, Factory Method).
How would you model the animal kingdom (with species and their behaviour) as a class system?
(hint GoF design pattern. Abstract Factory)
Can you name a number of non-functional (or quality) requirements?
What is your advice when a customer wants high performance, high usability and high security?
What is your advice when a customer wants high performance, Good design, Cheap?
(hint: pick 2)
What do low coupling and high cohesion mean? What does the principle of encapsulation mean to you?
Can you think of some concurrency patterns?
(hint: Asynchronous Results, Background Worker, Compare/Exchange pattern via Interlocked.CompareExchange)
How would you manage conflicts in a web application when different people are editing the same data?
Where would you use the Command pattern?
Do you know what a stateless business layer is? Where do long-running transactions fit into that picture?
(hint: if you have long-running transactions, you are going to have to manage state somehow. How would you do this?)
What kinds of diagrams have you used in designing parts of an architecture, or a technical design?
Can you name the different tiers and responsibilities in an N-tier architecture?
(hint: presentation, business, data)
Can you name different measures to guarantee correctness and robustness of data in an architecture?
(hint: for example transactions, thread synchronisation)
What does the acronym ACID stand for in relation to transactions?
(hint: atomicity, consistency, isolation, durability)
Can you name any differences between object-oriented design and component-based design?
(hint: objects vs services or documents)
How would you model user authorization, user profiles and permissions in a database?(hint: Membership API)

Scrum questions

Have you used Scrum before? (If the answer is no, not much point in asking the rest of these questions).
If you were taken on as a team member and the team was failing Sprint after Sprint. What would you do?
What are the Scrum events and the purpose of them?
(hint: Daily Scrum, Sprint Planning Meetings 1 & 2, Sprint Review and Sprint Retrospective)
What would you do if you were part of a Scrum Team and your manager asked you to do a piece of work not in the Scrum Backlog?
Who decides what Product Backlog Items should be pulled into a Sprint?
What is the DoD and what is it useful for?
Where and how do changing requirements fit into scrum?

Construction questions

How do you make sure that your code can handle different kinds of error situations?
(hint: TDD, BDD, testing…)
How do you make sure that your code is both safe and fast?
When would you use polymorphism and when would you use delegates?
When would you use a class with static members and when would you use a Singleton class?
Can you name examples of anticipating changing requirements in your code?
Can you describe the process you use for writing a piece of code, from requirements to delivery?
Explain DI / IoC. Are there any differences between the two? If so, what are they?
(hint: DI is one method of following the Dependency Inversion Principle (DIP) or IoC)

Software engineering skills

What is Object Oriented Design? What are the benefits and drawbacks?
(hint: polymorphism inheritance encapsulation)
What is the role of interfaces in design?
What books have you read on software engineering that you thought were good?
What are important aspects of GUI design?
What Object Relational Mapping tools have you used?
What are the differences between Model-View-Controller, Model-View-Presenter and Model-View-ViewModel
Can you draw MVC and MVP?
(hint: doted lines are pub/sub)

What is the difference between Mocks, Stubs, Fakes and Dummies?
(hint:
Mocks are objects pre-programmed with expectations which form a specification of the calls they are expected to receive. Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what’s programmed in for the test.
Stubs may also record information about calls, such as an email gateway stub that remembers the messages it ‘sent’, or maybe only how many messages it ‘sent’.
Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.)
Describe the process you would take in setting up CI for our company?
We’re going to design the new IMDB.
On the whiteboard, what would the table that holds the movies look like?
Every movie has actors, how would the Actors table look?
Actors star in many movies, any adjustments?
We need to track Characters also. Any adjustments to the schema?

What metrics, like cyclomatic complexity, do you think are important to track in code?

Functional design questions

What are metaphors used for in functional design? Can you name some successful examples?
(hint: Partial Function Application, Currying)
How can you reduce the user’s perception of waiting when some routines take a long time?
Which controls would you use when a user must select multiple items from a big list, in a minimal amount of space?
How would you design editing twenty fields for a list of 10 items? And editing 3 fields for a list of 1000 items?
Can you name some limitations of a web environment vs. a Windows environment?

Specific technical requirements

What software have you used for bug tracking and version control?
Which branching models have you used?
(hint: No Branches, Release, Maintenance, Feature, Team)
What have you used for unit testing, integration testing, UA testing, UI testing?
What build tools are you familiar with?
(hint: Nant, Make, Rake, PSake)

Web questions

Would you use a black list or white list? Why?
Can you explain XSS and how it works?
Can you explain CSRF? and how it works?
What is the difference between GET and POST in web forms? How do you decide which to use?
What do you know about HTTP.
(hint: Application Layer of OSI model (layer 7), stateless)
What are the HTTP methods sometimes called verbs?
(hint: there are 9 of them. HEAD, GET, POST, PUT, DELETE, TRACE, OPTIONS, CONNECT, PATCH)
How do you get the current users name from an MVC Controller?
(hint: The controller has a User property which is of type IPrinciple which has an Identity property of type IIdentity, which has a Name property)
What JavaScript libraries have you used?
What is the advantage of using CSS?
What are some of the irritating limitations of CSS?

JavaScript questions

How does JavaScript implement inheritance?
(hint: via Object’s prototype property)
What is the difference between "==" and "===", "!=" and "!=="?
(hint: If the two operands are of the same type and have the same value, then “===” produces true and “!==” produces false. The evil twins do the right thing when the operands are of the same type, but if they are of different types, they attempt to coerce the values. The rules by which they do that are complicated and unmemorable.
If you want to use "==", "!=" be sure you know how it works and test well.
By default use “===” and “!==“. )
These are some of the interesting cases:

'' == '0'          // false
0 == ''            // true
0 == '0'           // true
false == 'false'   // false
false == '0'       // true
false == undefined // false
false == null      // false
null == undefined  // true
' \t\r\n ' == 0    // true

On the whiteboard, could you show us how to create a function that takes an object and returns a child object?

if (typeof Object.create !== ‘function’) {
   Object.create = function (o) {
      var F = function () {};
      F.prototype = o;
      return new F();
   };
}
var child = Object.create(parent);

When is “this” bound to the global object?
(hint: When the function being invoked is not the property of an object)
With the following code, how does myObject.pleaseSetValue set myObject.value?

var myObject = {
	value: 0
};

myObject.setValue = function () {
	var that = this; // don’t show this

	var pleaseSetValue = function () {
		that.value = 10; // don’t show this
	};
	pleaseSetValue ();
}
myObject.setValue();
document.writeln(myObject.value); // 10

Service Oriented questions

Can you think of any Advantages and Disadvantages in using SOA over an object oriented n-tier model?
What’s the simplest way to make a service call from within a web page and how many lines could you do this in?
What scales better, per-call services or per-session and why?
(hint: maintaining service instances (maintaining state) in memory or any entities for that matter quickly blows out memory and other resources.)
What is REST’s primary objective?
How many ways can you create a WCF proxy?
(hint:
Add Service Reference via Visual Studio project
Using svcutil.exe
Create proxy on the fly with… new ChannelFactory<IMyContract>().CreateChannel();
)
What do you need to turn on on the service in order to create a proxy?
(hint: enable an HTTP-GET behaviour, or MEX endpoint)

C# / .Net questions

What’s the difference between public, private, protected and internal modifiers?
Which ones can be used together?
What’s the difference between static and non-static methods?
What’s the most obvious difference in IL with static constructors?
(hint: static method causes compiler to not mark type with beforefieldinit, thus giving lazy initialisation.)
How have you used Reflection?
What does the garbage collector clean up?
(hint: managed resources, not unmanaged resources. Such as files, streams and handles)
Why would you implement the the IDisposable interface?
(hint: clean up resources deterministically. Clean up unmanaged resources.)
Where should the Dispose function be called from?
(hint: the objects finalizer)
Where is an objects finalizer called from?
(hint: the GC)
If you call an objects Dispose method, what System method should you also make sure is called?
(hint: System.GC.SuppressFinalize)
Why should System.GC.SuppressFinalize be called?
(hint: finalization is expensive)
Are strings mutable or immutable?
(hint: immutable)
What’s the most significant difference between struct’s and class’s?
(hint: struct : value type, class : reference type)
What are the other differences between struct’s and class’s?
(hint: struct’s don’t support inheritance (all value types are sealed) or finalizers)
(hint: struct’s can have the same fields, methods, properties and operators)
(hint: struct’s can implement interfaces)
Where are reference types stored? Where are value types stored?
(hint:
bit of a trick question. Ref on the heap, val on the stack (generally)
The reference part of reference type local variables is stored on the stack.
Value type local variables also on the stack.
Content of reference type variables is stored on the heap.
Member variables are stored on the heap.
)
Where is the yield key word used?
(hint: within an iterator)
What are some well known interfaces in the .net library that iterators provide implementation for?
(hint: IEnumerable<T> )
Are static methods thread safe?
(hint: a new stack frame is created with every method call. All local variables are safe… so long as they are not reference types being passed to another thread or being passed to another thread by ref.)
What is the TPL used for?
(hint: a set of API’s in the System.Threading and System.Threading.Tasks namespaces simplifying the process of adding parallelism and concurrency to applications.)
What rules would you consider when choosing a lock object?
(hint: keep the scope as tight as possible (private), so other threads cannot change its value, thus causing the thread to block.
Declare as readonly, as its value should not be changed.
Must not be a value type.
If the lock keyword is used on a value type, the compiler will report an error.
If used with System.Threading.Monitor, an exception will occur at runtime, because Monitor.Exit receives a boxed copy of the original variable.
Never lock on “this”.)
Why would you declare a field as volatile?
(hint: So that the order of the operations performed on the variable are not optimised to a different order.)
Are reads and writes to a long (System.Int64) atomic? Are reads and writes to a int (System.Int32) atomic?
(hint: The runtime guarantees that a type whose size is no bigger than a native integer will not be read or written only partially. This is in the CLI spec and the C# 4.0 spec.)
Before invoking a delegate instance just before the null check is performed, What’s a good way to make sure no other threads can set your delegate to null between when the check occurs and when you invoke it?
(hint:
assign reference to heap allocated memory to stack allocated implements thread safety.
Assign your delegate instance to a second local delegate variable.
This ensures that if subscribers to your delegate instance are removed (by a different thread) between checking for null and firing the invocation, you won’t fire a NullReferenceException.)

void OnCheckChanged(EventArgs e) {
	// assign reference to heap allocated memory to
	// stack allocated implements thread safety

	// CheckChanged is a member declared as…  public event EventHandler CheckChanged;
	EventHandler threadSafeCheckChanged = CheckChanged;
	if (threadSafeCheckChanged != null)  {
		// fire the event off
		foreach(EventHandler handler in threadSafeCheckChanged.GetInvocationList()) {
			try {
				handler(this, e);
			}
			catch(Exception e) {
				// handling code
			}
		}
	}
}

What is a deadlock and how does one occur? Can you draw it on the white board?
(hint: two or more threads wait for each other to release a synchronization lock.
Example:
Thread A requests a lock on _sync1, and then later requests a lock on _sync2 before releasing the lock on _sync1.
At the same time,
Thread B requests a lock on _sync2, followed by a lock on _sync1, before releasing the lock on _sync2.
)
How many ways are there to implement an interface member, and what are they?
(hint: two. Implicit and explicit member implementation)
How do I declare an explicit interface member?
(hint: prefix the member name with the interface name)

public class MyClass : SomeBaseClass ,IListable, IComparable {
    // …
    public intCompareTo(object obj) {
        // …
    }

    #region IListable Members
    string[] Ilistable.ColumnValues {

        get {
            // …
            return values;
        }
    }
    #endregion
}

Write the above on a white board, then ask the following question. If I want to make a call to an explicit member implementation like the above, How do I do it?

string[] values;
    MyClass obj1, obj2;

    // ERROR:  Unable to call ColumnValues() directly on a contact
    // values = obj1.ColumnValues;

    // First cast to IListable.
    values = ((IListable)obj2).ColumnValues;

What is wrong with the following snippet?
(hint: possibility of race condition.
If two threads in the program both call GetNext simultaneously, two threads might be given the same number. The reason is that _curr++ compiles into three separate steps:
1. Read the current value from the shared _curr variable into a processor register.
2. Increment that register.
3. Write the register value back to the shared _curr variable.
Two threads executing this same sequence can both read the same value from _curr locally (say, 42), increment it (to, say, 43), and publish the same resulting value. GetNext thus returns the same number for both threads, breaking the algorithm. Although the simple statement _curr++ appears to be atomic, this couldn’t be further from the truth.)

// Each call to GetNext should hand out a new unique number
static class Counter {
    internal static int _curr = 0;
    internal static int GetNext() {
        return _curr++;
    }
}

What are some of your favourite .NET features?

Data structures

How would you implement the structure of the London underground in a computer’s memory?
(hint: how about a graph. The set of vertices would represent the stations. The edges connecting them would be the tracks)
How would you store the value of a colour in a database, as efficiently as possible?
(hint: assuming we are measuring efficiency in size and not retrieval or storage speed, and the colour is 16^6 (FFFFFF), store it as an int)
What is the difference between a queue and a stack?
What is the difference between storing data on the heap vs. on the stack?
What is the number 21 in binary format? And in hex?
(hint: 10101, 15)
What is the last thing you learned about data structures from a book, magazine or web site?
Can you name some different text file formats for storing unicode characters?
How would you store a vector in N dimensions in a datatable?

Algorithms

What type of language do you prefer for writing complex algorithms?
How do you find out if a number is a power of 2? And how do you know if it is an odd number?
How do you find the middle item in a linked list?
How would you change the format of all the phone numbers in 10,000 static html web pages?
Can you name an example of a recursive solution that you created?
Which is faster: finding an item in a hashtable or in a sorted list?
What is the last thing you learned about algorithms from a book, magazine or web site?
How would you write a function to reverse a string? And can you do that without a temporary string?
In an array with integers between 1 and 1,000,000 one value is in the array twice. How do you determine which one?
Do you know about the Traveling Salesman Problem?

Testing questions

It’s Monday and we’ve just finished Sprint Planning. How would you organize testing?
How do you verify that new changes have not broken existing features?
(hint: regression test)
What can you do reduce the chance that a customer finds things that he doesn’t like during acceptance testing?
Can you tell me something that you have learned about testing and quality assurance in the last year?
What sort of information would you not want to be revealed via Http responses or error messages?
(hint: Critical info about the likes of server name, version, installed program versions, etc)
What would you make sure you turned off on an app or web server before deployment?
(hint: directory listing?)

Maintenance questions

How do you find an error in a large file with code that you cannot step through?
How can you make sure that changes in code will not affect any other parts of the product?
How can you debug a system in a production environment, while it is being used?

Configuration management questions

Which items do you normally place under version control?
How would you manage changes to technical documentation, like the architecture of a product?

Project management

How many of the three variables scope, time and cost can be fixed by the customer?
Who should make estimates for the effort of a project? Who is allowed to set the deadline?
Which kind of diagrams do you use to track progress in a project?
What is the difference between an iteration and an increment?
Can you explain the practice of risk management? How should risks be managed?
What do you need to be able to determine if a project is on time and within budget?
(hint: Product Backlog burn-down)
How do you agree on scope and time with the customer, when the customer wants too much?

Candidate displays how they communicate / present to a group of people about a technical topic they are passionate and familiar about.

References I used

Douglas Crockford’s JavaScript The Good Parts
Mocks, Aren’t Stubs
Answering the 100 Interview Questions for Software Developers: Requirements

If any of these questions or answers are not clear, or you have other great ideas for questions, please leave comments.

Tags:Interview Questions
Posted in Agile, Architecture, C#, Design Patterns, Development Methodologies, JavaScript, MVC, Scrum, Uncategorized | 10 Comments »

A Decent Console for Windows

January 19, 2013

On *nix we’re kind of spoilt when it comes to the CLI experience.
The console I use most in a GUI environment is the great terminator.

No, not that one.
This one…

Multi tab, split screen, transparency, the works.
Then we’ve also got tmux (and a comparison between terminator and tmux).
Taking things further, we’ve got awesome

Well I’ve been looking for something similar for Windows for a while.
I’ve tried terminator on Cygwin, but it’s just not the same, plus it only supports the single shell.

Meet Console2

With PowerShell as the currently active tab.
It’s a stand alone executable and crucially it’s free.
Console2 is just that, a console or terminal that seems to be able to host any shell that’s thrown at it.
As you can see with the image above, I’ve setup Console2 to host the following shells:

Windows Command shell
The Visual Studio Command Prompt (which is just the Windows Command shell (with some paths and variables added?))
PowerShell
The node.js Read-Eval-Print-Loop (REPL)
VMwares vSphere PowerCLI
And of course the bash shell we all know and love.
Cygwin required

Although project activity looks minimal to non existent currently.

How I setup Console2

Once running, right click the console -> Edit -> Settings…

Setup the hot keys under the Hotkeys node to behave like the terminal I use on Linux (currently terminator).
- Select the specific command, put your cursor in the Hotkey text box, press your preferred key combination, press the Assign key.
- For opening new tabs I use Ctrl+Shift+T
- Change the Copy selection node to what it should be: Ctrl+C
- Change the Past to what it should be: Ctrl+V
Under the Console node, enter the directory to have each shell start in
Under the Appearance/More… node, I deselect the Show menu, Show toolbar and Show status bar
- I make sure the Window transparency is set to None, as it just distracts me being able to see stuff behind the surface I’m concentrating on.
  It looks cool to turn it on, but I personally find it harder to read the text when you’ve got to lots of text overlapping
- Under the Behavior node, I turn on Copy on select, as this is on in Linux by default
Now under the Tabs node is where we set up all of our shells.
- Click the Add button
- Change the name you want the shell tab to appear as in the Main tab under the Title text box
- Now for the Icons I just got images I wanted for them and opened them in GIMP and changed the size to 32×32 pixels and saved as .ico files to the same directory that the Console.exe runs from
- I Then select them here
- Under the Shell section I just copy the short cuts from the likes of the start menu and past them in there
- You can then override the default startup dir by specifying your path in the Startup dir text box
- You can also specify if you want the shell to run as a specific user. Administrator for example.
  When you run this shell, you’ll be prompted for the users credentials if it’s not you.

As I was working through the Console2 set up, I ran into another offering…

Meet ConEmu

The actively maintained ConEmu lives here.
I had a quick play with this and a flick through the documentation.
The simple tasks of setting up different shells as pre-sets seemed to evade me.
There seems to be a lot more configuration options too.
As I’d just set up the Console2 and it seemed to be doing everything I needed for now, I decided to call it quits with ConEmu.
I think it’s worth checking out though if you need more power than Console2.
Scott Hanselmans post on Conemu.

Tags:awesome, ConEmu, console, console2, Linux, shell, terminal, terminator, tmux
Posted in GNU/Linux, Microsoft, PowerShell, Uncategorized | 4 Comments »

Sanitising User Input from Browser. part 2

November 16, 2012

Untrusted data (data entered by a user), should always be treated as though it contains attack code.
This data should not be sent anywhere without taking the necessary steps to detect and neutralise the malicious code.
With applications becoming more interconnected, attacks being buried in user input and decoded and/or executed by a downstream interpreter is becoming all the more common.
Input validation, that’s restricting user input to allow only certain white listed characters and restricting field lengths are only two forms of defence.
Any decent attacker can get around client side validation, so you need to employ defence in depth.
validation and escaping also needs to be performed on the server side.

Leveraging existing libraries

Microsofts AntiXSS is not extensible,
it doesn’t allow the user to define their own whitelist.
It didn’t allow me to add behaviour to the routines.
I want to know how many instances of HTML encoded values there were.
There was certainly a lot of code in there, but I didn’t find it very useful.
The OWASP encoding project (Reform)(as mentioned in part 1 of this series).
This is quite a useful set of projects for different technologies.
System.Net.WebUtility from the System.Web.dll.
Now this did most of what I needed other than provide me with fine grained information of what had been tampered with.
So I took it and extended it slightly.
We hadn’t employed AOP at this stage and it wasn’t considered important enough to invest the time to do so.
So it was a matter of copy past modify.

What’s the point in client side validation if the server has to do it again anyway?

Now there are arguments both ways for this.
My current take on this for the project in question was:
If you only have server side validation, the client side is less responsive and user friendly.
If you only have client side validation, it’s out of our control.
This also gives fuel to the argument of using JavaScript on the client and server side (with the likes of node.js).
So the same code can be used both sides without having to code the same validation in two different languages.
Personally I find writing validation code easier using JavaScript than C#.
This maybe just because I’ve been writing considerably more JavaScript than C# lately though.

The code

I drew a sequence diagram of how this should work, but it got lost in a move.
So I wasn’t keen on doing it again, as the code had already been done.
In saying that, the code has reasonably good documentation (I think).
Code is king, providing it has been written to be read.
If you notice any of the escaping isn’t quite making sense, it could be the blogging engine either doing what it’s meant to, or not doing what it’s meant to.
I’ve been over the code a few times, but I may have missed something.
Shout out if anything’s not clear.

First up, we’ll look at the custom exceptions as we’ll need those soon.

using System;

namespace Common.WcfHelpers.ErrorHandling.Exceptions
{
    public abstract class WcfException : Exception
    {
        /// <summary>
        /// In order to set the message for the client, set it here, or via the property directly in order to over ride default value.
        /// </summary>
        /// <param name="message">The message to be assigned to the Exception's Message.</param>
        /// <param name="innerException">The exception to be assigned to the Exception's InnerException.</param>
        /// <param name="messageForClient">The client friendly message. This parameter is optional, but should be set.</param>
        public WcfException(string message, Exception innerException = null, string messageForClient = null) : base(message, innerException)
        {
            MessageForClient = messageForClient;
        }

        /// <summary>
        /// This is the message that the service's client will see.
        /// Make sure it is set in the constructor. Or here.
        /// </summary>
	    public string MessageForClient
        {
            get { return string.IsNullOrEmpty(_messageForClient) ? "The MessageForClient property of WcfException was not set" : _messageForClient; }
            set { _messageForClient = value; }
        }
        private string _messageForClient;
    }
}

And the more specific SanitisationWcfException

using System;
using System.Configuration;

namespace Common.WcfHelpers.ErrorHandling.Exceptions
{
    /// <summary>
    /// Exception class that is used when the user input sanitisation fails, and the user needs to be informed.
    /// </summary>
    public class SanitisationWcfException : WcfException
    {
        private const string _defaultMessageForClient = "Answers were NOT saved. User input validation was unsuccessful.";
        public string UnsanitisedAnswer { get; private set; }

        /// <summary>
        /// In order to set the message for the client, set it here, or via the property directly in order to over ride default value.
        /// </summary>
        /// <param name="message">The message to be assigned to the Exception's Message.</param>
        /// <param name="innerException">The Exception to be assigned to the base class instance's inner exception. This parameter is optional.</param>
        /// <param name="messageForClient">The client friendly message. This parameter is optional, but should be set.</param>
        /// <param name="unsanitisedAnswer">The user input string before service side sanitisatioin is performed.</param>
        public SanitisationWcfException
        (
            string message,
            Exception innerException = null,
            string messageForClient = _defaultMessageForClient,
            string unsanitisedAnswer = null
        )
            : base(
                message,
                innerException,
                messageForClient + " If this continues to happen, please contact " + ConfigurationManager.AppSettings["SupportEmail"] + Environment.NewLine
                )
        {
            UnsanitisedAnswer = unsanitisedAnswer;
        }
    }
}

Now as we define whether our requirements are satisfied by way of executable requirements (unit tests(in their rawest form))
Lets write some executable specifications.

using NUnit.Framework;
using Common.Security.Sanitisation;

namespace Common.Security.Encoding.UnitTest
{
    [TestFixture]
    public class ExtensionsTest
    {

        private readonly string _inNeedOfEscaping = @"One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.";
        private readonly string _noNeedForEscaping = @"One x2F two amp three x27 four lt five quot six gt       .";

        [Test]
        public void SingleDecodeDoubleEncodedHtml_ShouldSingleDecodeDoubleEncodedHtml()
        {
            string doubleEncodedHtml = @"";               // between the ""'s we have a string of Html with double escaped values like &amp;#x27; user entered text &amp;#x2F.
            string singleEncodedHtmlShouldLookLike = @""; // between the ""'s we have a string of Html with single escaped values like ' user entered text &#x2F.
            // In the above, the bloging engine is escaping the sinlge escaped entity encoding, so all you'll see is the entity it self.
            // but it should look like the double encoded entity encodings without the first &amp->;


            string singleEncodedHtml = doubleEncodedHtml.SingleDecodeDoubleEncodedHtml();
            
            Assert.That(singleEncodedHtml, Is.EqualTo(singleEncodedHtmlShouldLookLike));
        }

        [Test]
        public void Extensions_CompliesWithWhitelist_ShouldNotComply()
        {
            Assert.That(_inNeedOfEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,]+$"), Is.False);
        }

        [Test]
        public void Extensions_CompliesWithWhitelist_ShouldComply()
        {
            Assert.That(_noNeedForEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,]+$"), Is.True);
            Assert.That(_inNeedOfEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,#/&'<"">]+$"), Is.True);
        }
    }
}

Now the code that satisfies the above executable specifications, and more.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

namespace Common.Security.Sanitisation
{
    /// <summary>
    /// Provides a series of extension methods that perform sanitisation.
    /// Escaping, unescaping, etc.
    /// Usually targeted at user input, to help defend against the likes of XSS and other injection attacks.
    /// </summary>
    public static class Extensions
    {

        private const int CharacterIndexNotFound = -1;

        /// <summary>
        /// Returns a new string in which all occurrences of a double escaped html character (that's an html entity immediatly prefixed with another html entity)
        /// in the current instance are replaced with the single escaped character.
        /// </summary>
        /// <param name="source">The target text used to strip one layer of Html entity encoding.</param>
        /// <returns>The singly escaped text.</returns>
        public static string SingleDecodeDoubleEncodedHtml(this string source)
        {
            return source.Replace("&amp;#x", "&#x");
        }

        /// <summary>
        /// Filter a text against a regular expression whitelist of specified characters.
        /// </summary>
        /// <param name="target">The text that is filtered using the whitelist.</param>
        /// <param name="alternativeTarget"></param>
        /// <param name="whiteList">Needs to be be assigned a valid whitelist, otherwise nothing gets through.</param>
        public static bool CompliesWithWhitelist(this string target, string alternativeTarget = "", string whiteList = "")
        {
            if (string.IsNullOrEmpty(target))
                target = alternativeTarget;
            
            return Regex.IsMatch(target, whiteList);
        }

        /// <summary>
        /// Takes a string and returns another with a single layer of Html entity encoding replaced with it's Html entity literals.
        /// </summary>
        /// <param name="encodedUserInput">The text to perform the opperation on.</param>
        /// <param name="numberOfEscapes">The number of Html entity encodings that were replaced.</param>
        /// <returns>The text that's had a single layer of Html entity encoding replaced with it's Html entity literals.</returns>
        public static string HtmlDecode(this string encodedUserInput, ref int numberOfEscapes)
        {
            const int NotFound = -1;

            if (string.IsNullOrEmpty(encodedUserInput))
                return string.Empty;

            StringWriter output = new StringWriter(CultureInfo.InvariantCulture);
            
            if (encodedUserInput.IndexOf('&') == NotFound)
            {
                output.Write(encodedUserInput);
            }
            else
            {
                int length = encodedUserInput.Length;
                for (int index1 = 0; index1 < length; ++index1)
                {
                    char ch1 = encodedUserInput[index1];
                    if (ch1 == 38)
                    {
                        int index2 = encodedUserInput.IndexOfAny(_htmlEntityEndingChars, index1 + 1);
                        if (index2 > 0 && encodedUserInput[index2] == 59)
                        {
                            string entity = encodedUserInput.Substring(index1 + 1, index2 - index1 - 1);
                            if (entity.Length > 1 && entity[0] == 35)
                            {
                                ushort result;
                                if (entity[1] == 120 || entity[1] == 88)
                                    ushort.TryParse(entity.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out result);
                                else
                                    ushort.TryParse(entity.Substring(1), NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite | NumberStyles.AllowLeadingSign, NumberFormatInfo.InvariantInfo, out result);
                                if (result != 0)
                                {
                                    ch1 = (char)result;
                                    numberOfEscapes++;
                                    index1 = index2;
                                }
                            }
                            else
                            {
                                index1 = index2;
                                char ch2 = HtmlEntities.Lookup(entity);
                                if ((int)ch2 != 0)
                                {
                                    ch1 = ch2;
                                    numberOfEscapes++;
                                }
                                else
                                {
                                    output.Write('&');
                                    output.Write(entity);
                                    output.Write(';');
                                    continue;
                                }
                            }
                        }
                    }
                    output.Write(ch1);
                }
            }
            string decodedHtml = output.ToString();
            output.Dispose();
            return decodedHtml;
        }

        /// <summary>
        /// Escapes all character entity references (double escaping where necessary).
        /// Why? The XmlTextReader that is setup in XmlDocument.LoadXml on the service considers the character entity references (&#xxxx;) to be the character they represent.
        /// All XML is converted to unicode on reading and any such entities are removed in favor of the unicode character they represent.
        /// </summary>
        /// <param name="unencodedUserInput">The string that needs to be escaped.</param>
        /// <param name="numberOfEscapes">The number of escapes applied.</param>
        /// <returns>The escaped text.</returns>
        public static unsafe string HtmlEncode(this string unencodedUserInput, ref int numberOfEscapes)
        {
            if (string.IsNullOrEmpty(unencodedUserInput))
                return string.Empty;

            StringWriter output = new StringWriter(CultureInfo.InvariantCulture);
            
            if (output == null)
                throw new ArgumentNullException("output");
            int num1 = IndexOfHtmlEncodingChars(unencodedUserInput);
            if (num1 == -1)
            {
                output.Write(unencodedUserInput);
            }
            else
            {
                int num2 = unencodedUserInput.Length - num1;
                fixed (char* chPtr1 = unencodedUserInput)
                {
                    char* chPtr2 = chPtr1;
                    while (num1-- > 0)
                        output.Write(*chPtr2++);
                    while (num2-- > 0)
                    {
                        char ch = *chPtr2++;
                        if (ch <= 62)
                        {
                            switch (ch)
                            {
                                case '"':
                                    output.Write(""");
                                    numberOfEscapes++;
                                    continue;
                                case '&':
                                    output.Write("&amp;");
                                    numberOfEscapes++;
                                    continue;
                                case '\'':
                                    output.Write("&amp;#x27;");
                                    numberOfEscapes = numberOfEscapes + 2;
                                    continue;
                                case '<':
                                    output.Write("<");
                                    numberOfEscapes++;
                                    continue;
                                case '>':
                                    output.Write(">");
                                    numberOfEscapes++;
                                    continue;
                                case '/':
                                    output.Write("&amp;#x2F;");
                                    numberOfEscapes = numberOfEscapes + 2;
                                    continue;
                                default:
                                    output.Write(ch);
                                    continue;
                            }
                        }
                        if (ch >= 160 && ch < 256)
                        {
                            output.Write("&#");
                            output.Write(((int)ch).ToString(NumberFormatInfo.InvariantInfo));
                            output.Write(';');
                            numberOfEscapes++;
                        }
                        else
                            output.Write(ch);
                    }
                }
            }
            string encodedHtml = output.ToString();
            output.Dispose();
            return encodedHtml;
        }

        private static unsafe int IndexOfHtmlEncodingChars(string searchString)
        {
            int num = searchString.Length;
            fixed (char* chPtr1 = searchString)
            {
                char* chPtr2 = (char*)((UIntPtr)chPtr1);
                for (; num > 0; --num)
                {
                    char ch = *chPtr2;
                    if (ch <= 62)
                    {
                        switch (ch)
                        {
                            case '"':
                            case '&':
                            case '\'':
                            case '<':
                            case '>':
                            case '/':
                                return searchString.Length - num;
                        }
                    }
                    else if (ch >= 160 && ch < 256)
                        return searchString.Length - num;
                    ++chPtr2;
                }
            }
            return CharacterIndexNotFound;
        }

        private static char[] _htmlEntityEndingChars = new char[2]
        {
            ';',
            '&'
        };

        private static class HtmlEntities
        {
            private static string[] _entitiesList = new string[253]
            {
                "\"-quot",
                "&-amp",
                "'-apos",
                "<-lt",
                ">-gt",
                " -nbsp",
                "¡-iexcl",
                "¢-cent",
                "£-pound",
                "¤-curren",
                "¥-yen",
                "¦-brvbar",
                "§-sect",
                "¨-uml",
                "©-copy",
                "ª-ordf",
                "«-laquo",
                "¬-not",
                "\x00AD-shy",
                "®-reg",
                "¯-macr",
                "°-deg",
                "±-plusmn",
                "\x00B2-sup2",
                "\x00B3-sup3",
                "´-acute",
                "µ-micro",
                "¶-para",
                "·-middot",
                "¸-cedil",
                "\x00B9-sup1",
                "º-ordm",
                "»-raquo",
                "\x00BC-frac14",
                "\x00BD-frac12",
                "\x00BE-frac34",
                "¿-iquest",
                "À-Agrave",
                "Á-Aacute",
                "Â-Acirc",
                "Ã-Atilde",
                "Ä-Auml",
                "Å-Aring",
                "Æ-AElig",
                "Ç-Ccedil",
                "È-Egrave",
                "É-Eacute",
                "Ê-Ecirc",
                "Ë-Euml",
                "Ì-Igrave",
                "Í-Iacute",
                "Î-Icirc",
                "Ï-Iuml",
                "Ð-ETH",
                "Ñ-Ntilde",
                "Ò-Ograve",
                "Ó-Oacute",
                "Ô-Ocirc",
                "Õ-Otilde",
                "Ö-Ouml",
                "×-times",
                "Ø-Oslash",
                "Ù-Ugrave",
                "Ú-Uacute",
                "Û-Ucirc",
                "Ü-Uuml",
                "Ý-Yacute",
                "Þ-THORN",
                "ß-szlig",
                "à-agrave",
                "á-aacute",
                "â-acirc",
                "ã-atilde",
                "ä-auml",
                "å-aring",
                "æ-aelig",
                "ç-ccedil",
                "è-egrave",
                "é-eacute",
                "ê-ecirc",
                "ë-euml",
                "ì-igrave",
                "í-iacute",
                "î-icirc",
                "ï-iuml",
                "ð-eth",
                "ñ-ntilde",
                "ò-ograve",
                "ó-oacute",
                "ô-ocirc",
                "õ-otilde",
                "ö-ouml",
                "÷-divide",
                "ø-oslash",
                "ù-ugrave",
                "ú-uacute",
                "û-ucirc",
                "ü-uuml",
                "ý-yacute",
                "þ-thorn",
                "ÿ-yuml",
                "Œ-OElig",
                "œ-oelig",
                "Š-Scaron",
                "š-scaron",
                "Ÿ-Yuml",
                "ƒ-fnof",
                "\x02C6-circ",
                "˜-tilde",
                "Α-Alpha",
                "Β-Beta",
                "Γ-Gamma",
                "Δ-Delta",
                "Ε-Epsilon",
                "Ζ-Zeta",
                "Η-Eta",
                "Θ-Theta",
                "Ι-Iota",
                "Κ-Kappa",
                "Λ-Lambda",
                "Μ-Mu",
                "Ν-Nu",
                "Ξ-Xi",
                "Ο-Omicron",
                "Π-Pi",
                "Ρ-Rho",
                "Σ-Sigma",
                "Τ-Tau",
                "Υ-Upsilon",
                "Φ-Phi",
                "Χ-Chi",
                "Ψ-Psi",
                "Ω-Omega",
                "α-alpha",
                "β-beta",
                "γ-gamma",
                "δ-delta",
                "ε-epsilon",
                "ζ-zeta",
                "η-eta",
                "θ-theta",
                "ι-iota",
                "κ-kappa",
                "λ-lambda",
                "μ-mu",
                "ν-nu",
                "ξ-xi",
                "ο-omicron",
                "π-pi",
                "ρ-rho",
                "ς-sigmaf",
                "σ-sigma",
                "τ-tau",
                "υ-upsilon",
                "φ-phi",
                "χ-chi",
                "ψ-psi",
                "ω-omega",
                "ϑ-thetasym",
                "ϒ-upsih",
                "ϖ-piv",
                " -ensp",
                " -emsp",
                " -thinsp",
                "\x200C-zwnj",
                "\x200D-zwj",
                "\x200E-lrm",
                "\x200F-rlm",
                "–-ndash",
                "—-mdash",
                "‘-lsquo",
                "’-rsquo",
                "‚-sbquo",
                "“-ldquo",
                "”-rdquo",
                "„-bdquo",
                "†-dagger",
                "‡-Dagger",
                "•-bull",
                "…-hellip",
                "‰-permil",
                "′-prime",
                "″-Prime",
                "‹-lsaquo",
                "›-rsaquo",
                "‾-oline",
                "⁄-frasl",
                "€-euro",
                "ℑ-image",
                "℘-weierp",
                "ℜ-real",
                "™-trade",
                "ℵ-alefsym",
                "←-larr",
                "↑-uarr",
                "→-rarr",
                "↓-darr",
                "↔-harr",
                "↵-crarr",
                "⇐-lArr",
                "⇑-uArr",
                "⇒-rArr",
                "⇓-dArr",
                "⇔-hArr",
                "∀-forall",
                "∂-part",
                "∃-exist",
                "∅-empty",
                "∇-nabla",
                "∈-isin",
                "∉-notin",
                "∋-ni",
                "∏-prod",
                "∑-sum",
                "−-minus",
                "∗-lowast",
                "√-radic",
                "∝-prop",
                "∞-infin",
                "∠-ang",
                "∧-and",
                "∨-or",
                "∩-cap",
                "∪-cup",
                "∫-int",
                "∴-there4",
                "∼-sim",
                "≅-cong",
                "≈-asymp",
                "≠-ne",
                "≡-equiv",
                "≤-le",
                "≥-ge",
                "⊂-sub",
                "⊃-sup",
                "⊄-nsub",
                "⊆-sube",
                "⊇-supe",
                "⊕-oplus",
                "⊗-otimes",
                "⊥-perp",
                "⋅-sdot",
                "⌈-lceil",
                "⌉-rceil",
                "⌊-lfloor",
                "⌋-rfloor",
                "〈-lang",
                "〉-rang",
                "◊-loz",
                "♠-spades",
                "♣-clubs",
                "♥-hearts",
                "♦-diams"
            };

            private static Dictionary<string, char> _lookupTable = GenerateLookupTable();

            private static Dictionary<string, char> GenerateLookupTable()
            {
                Dictionary<string, char> dictionary = new Dictionary<string, char>(StringComparer.Ordinal);
                foreach (string str in _entitiesList)
                    dictionary.Add(str.Substring(2), str[0]);
                return dictionary;
            }

            public static char Lookup(string entity)
            {
                char ch;
                _lookupTable.TryGetValue(entity, out ch);
                return ch;
            }
        }
    }
}

You may also notice that I’ve mocked the OperationContext.
Thanks to WCFMock, a mocking framework for WCF services.
I won’t include this code, but you can get it here.
I’ve used the popular NUnit test framework and RhinoMocks for the stubbing and mocking.
Both pulled into the solution using NuGet.
Most useful documentation for RhinoMocks:
http://ayende.com/Wiki/Rhino+Mocks+3.5.ashx
http://ayende.com/wiki/Rhino+Mocks.ashx

For this project I used NLog and wrapped it.
Now you start to get an idea of how to use the sanitisation.

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using NUnit.Framework;
using System.Configuration;
using Rhino.Mocks;
using Common.Wrapper.Log;
using MockedOperationContext = System.ServiceModel.Web.MockedOperationContext;
using Common.WcfHelpers.ErrorHandling.Exceptions;

namespace Sanitisation.UnitTest
{
    [TestFixture]
    public class SanitiseTest
    {
        private const string _myTestIpv4Address = "My.Test.Ipv4.Address";
        private readonly int _maxLengthHtmlEncodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlEncodedUserInput"]);
        private readonly int _maxLengthHtmlDecodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlDecodedUserInput"]);
        private readonly string _encodedUserInput_thatsMaxDecodedLength = @"One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.";
        private readonly string _decodedUserInput_thatsMaxLength = @"One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.";

        [Test]
        public void Sanitise_UserInput_WhenGivenNull_ShouldReturnEmptyString()
        {
            Assert.That(new Sanitise().UserInput(null), Is.EqualTo(string.Empty));
        }

        [Test]
        public void Sanitise_UserInput_WhenGivenEmptyString_ShouldReturnEmptyString()
        {
            Assert.That(new Sanitise().UserInput(string.Empty), Is.EqualTo(string.Empty));
        }

        [Test]
        public void Sanitise_UserInput_WhenGivenSanitisedString_ShouldReturnSanitisedString()
        {
            // Open the whitelist up in order to test the encoding without restriction.
            Assert.That(new Sanitise(whiteList: @"^[\w\s\.,#/&'<"">]+$").UserInput(_encodedUserInput_thatsMaxDecodedLength), Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength));
        }

[Test]
[ExpectedException(typeof(SanitisationWcfException))]
public void Sanitise_UserInput_ShouldThrowExceptionIfEscapedInputToLong()
{
string fourThousandAndOneCharacters = "Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand character";
string expectedError = "The un-modified string received from the client with the following IP address: " +
'"' + _myTestIpv4Address + "\" " +
"exceeded the allowed maximum length of an escaped Html user input string. " +
"The maximum length allowed is: " +
_maxLengthHtmlEncodedUserInput +
". The length was: " +
(_maxLengthHtmlEncodedUserInput+1) + ".";

using(new MockedOperationContext(StubbedOperationContext))
{
try
{
new Sanitise().UserInput(fourThousandAndOneCharacters);
}
catch(SanitisationWcfException e)
{
Assert.That(e.Message, Is.EqualTo(expectedError));
Assert.That(e.UnsanitisedAnswer, Is.EqualTo(fourThousandAndOneCharacters));
throw;
}
}
}

        [Test]
        [ExpectedException(typeof(SanitisationWcfException))]
        public void Sanitise_UserInput_DecodedUserInputShouldThrowException_WhenMaxLengthHtmlDecodedUserInputIsExceeded()
        {
            char oneCharOverTheLimit = '.';
            string expectedError =
                           "The string received from the client with the following IP address: " +
                           "\"" + _myTestIpv4Address + "\" " +
                           "after Html decoding exceded the allowed maximum length of an un-escaped Html user input string." +
                           Environment.NewLine +
                           "The maximum length allowed is: " + _maxLengthHtmlDecodedUserInput + ". The length was: " +
                           (_decodedUserInput_thatsMaxLength + oneCharOverTheLimit).Length + oneCharOverTheLimit;

            using(new MockedOperationContext(StubbedOperationContext))
            {
                try
                {
                    new Sanitise().UserInput(_encodedUserInput_thatsMaxDecodedLength + oneCharOverTheLimit);
                }
                catch(SanitisationWcfException e)
                {
                    Assert.That(e.Message, Is.EqualTo(expectedError));
                    Assert.That(e.UnsanitisedAnswer, Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength + oneCharOverTheLimit));
                    throw;
                }
            }
        }

        [Test]
        public void Sanitise_UserInput_ShouldLogAndSendEmail_IfNumberOfDecodedHtmlEntitiesDoesNotMatchNumberOfEscapes()
        {
            string encodedUserInput_with6HtmlEntitiesNotEscaped = _encodedUserInput_thatsMaxDecodedLength.Replace("&amp;#x2F;", "/");
            string errorWeAreExpecting =
                "It appears as if someone has circumvented the client side Html entity encoding." + Environment.NewLine +
                "The requesting IP address was: " +
                "\"" + _myTestIpv4Address + "\" " +
                "The sanitised input we receive from the client was the following:" + Environment.NewLine +
                "\"" + encodedUserInput_with6HtmlEntitiesNotEscaped + "\"" + Environment.NewLine +
                "The same input after decoding and re-escaping on the server side was the following:" + Environment.NewLine +
                "\"" + _encodedUserInput_thatsMaxDecodedLength + "\"";
            string sanitised;
            // setup _logger
            ILogger logger = MockRepository.GenerateMock<ILogger>();
            logger.Expect(lgr => lgr.logError(errorWeAreExpecting));

            Sanitise sanitise = new Sanitise(@"^[\w\s\.,#/&'<"">]+$", logger);

            using (new MockedOperationContext(StubbedOperationContext))
            {
                // Open the whitelist up in order to test the encoding etc.
                sanitised = sanitise.UserInput(encodedUserInput_with6HtmlEntitiesNotEscaped);
            }

            Assert.That(sanitised, Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength));
            logger.VerifyAllExpectations();
        }        

        private static IOperationContext StubbedOperationContext
        {
            get
            {
                IOperationContext operationContext = MockRepository.GenerateStub<IOperationContext>();
                int port = 80;
                RemoteEndpointMessageProperty remoteEndpointMessageProperty = new RemoteEndpointMessageProperty(_myTestIpv4Address, port);
                operationContext.Stub(oc => oc.IncomingMessageProperties[RemoteEndpointMessageProperty.Name]).Return(remoteEndpointMessageProperty);
                return operationContext;
            }
        }
    }
}

Now the API code that we can use to do our sanitisation.

using System;
using System.Configuration;
// Todo : KC We need time to implement DI. Should be using something like ninject.extensions.wcf.
using OperationContext = System.ServiceModel.Web.MockedOperationContext;
using System.ServiceModel.Channels;
using Common.Security.Sanitisation;
using Common.WcfHelpers.ErrorHandling.Exceptions;
using Common.Wrapper.Log;

namespace Sanitisation
{

    public class Sanitise
    {
        private readonly string _whiteList;
        private readonly ILogger _logger;
        

        private string RequestingIpAddress
        {
            get
            {
                RemoteEndpointMessageProperty remoteEndpointMessageProperty = OperationContext.Current.IncomingMessageProperties[RemoteEndpointMessageProperty.Name] as RemoteEndpointMessageProperty;
                return ((remoteEndpointMessageProperty != null) ? remoteEndpointMessageProperty.Address : string.Empty);
            }
        }

        /// <summary>
        /// Provides server side escaping of Html entities, and runs the supplied whitelist character filter over the user input string.
        /// </summary>
        /// <param name="whiteList">Should be provided by DI from the ResourceFile.</param>
        /// <param name="logger">Should be provided by DI. Needs to be an asynchronous logger.</param>
        /// <example>
        /// The whitelist can be obtained from a ResourceFile like so...
        /// <code>
        /// private Resource _resource;
        /// _resource.GetString("WhiteList");
        /// </code>
        /// </example>
        public Sanitise(string whiteList = "", ILogger logger = null)
        {
            _whiteList = whiteList;
            _logger = logger ?? new Logger();
        }

        /// <summary>
        /// 1) Check field lengths.         Client side validation may have been negated.
        /// 2) Check against white list.	Client side validation may have been negated.
        /// 3) Check Html escaping.         Client side validation may have been negated.

        /// Generic Fail actions:	Drop the payload. No point in trying to massage and save, as it won't be what the user was expecting,
        ///                         Add full error to a WCFException Message and throw.
        ///                         WCF interception reads the WCFException.MessageForClient, and sends it to the user. 
        ///                         On return, log the WCFException's Message.
        ///                         
        /// Escape Fail actions:	Asynchronously Log and email full error to support.


        /// 1) BA confirmed 50 for text, and 400 for textarea.
        ///     As we don't know the field type, we'll have to go for 400."
        ///
        ///     First we need to check that we haven't been sent some huge string.
        ///     So we check that the string isn't longer than 400 * 10 = 4000.
        ///     10 is the length of our double escaped character references.
        ///     Or, we ask the business for a number."
        ///     If we fail here, perform Generic Fail actions and don't complete the following steps.
        /// 
        ///     Convert all Html Entity Encodings back to their equivalent characters, and count how many occurrences.
        ///
        ///     If the string is longer than 400, perform Generic Fail actions and don't complete the following steps.
        /// 
        /// 2) check all characters against the white list
        ///     If any don't match, perform Generic Fail actions and don't complete the following steps.
        /// 
        /// 3) re html escape (as we did in JavaScript), and count how many escapes.
        ///     If count is greater than the count of Html Entity Encodings back to their equivalent characters,
        ///     Perform Escape Fail actions. Return sanitised string.
        /// 
        ///     If we haven't returned, return sanitised string.
        
        
        /// Performs checking on the text passed in, to verify that client side escaping and whitelist validation has already been performed.
        /// Performs decoding, and re-encodes. Counts that the number of escapes was the same, otherwise we log and send email with the details to support.
        /// Throws exception if the client side validation failed to restrict the number of characters in the escaped string we received.
        ///     This needs to be intercepted at the service.
        ///     The exceptions default message for client needs to be passed back to the user.
        ///     On return, the interception needs to log the exception's message.
        /// </summary>
        /// <param name="sanitiseMe"></param>
        /// <returns></returns>
        public string UserInput(string sanitiseMe)
        {
            if (string.IsNullOrEmpty(sanitiseMe))
                return string.Empty;

            ThrowExceptionIfEscapedInputToLong(sanitiseMe);

            int numberOfDecodedHtmlEntities = 0;
            string decodedUserInput = HtmlDecodeUserInput(sanitiseMe, ref numberOfDecodedHtmlEntities);

            if(!decodedUserInput.CompliesWithWhitelist(whiteList: _whiteList))
            {
                string error = "The answer received from client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "had characters that failed to match the whitelist.";
                throw new SanitisationWcfException(error);
            }

            int numberOfEscapes = 0;
            string sanitisedUserInput = decodedUserInput.HtmlEncode(ref numberOfEscapes);

            if(numberOfEscapes != numberOfDecodedHtmlEntities)
            {
                AsyncLogAndEmail(sanitiseMe, sanitisedUserInput);
            }

            return sanitisedUserInput;
        }

        /// <note>
        /// Make sure the logger is setup to log asynchronously
        /// </note>
        private void AsyncLogAndEmail(string sanitiseMe, string sanitisedUserInput)
        {
            // no need for SanitisationException

            _logger.logError(
                "It appears as if someone has circumvented the client side Html entity encoding." + Environment.NewLine +
                "The requesting IP address was: " +
                "\"" + RequestingIpAddress + "\" " +
                "The sanitised input we receive from the client was the following:" + Environment.NewLine +
                "\"" + sanitiseMe + "\"" + Environment.NewLine +
                "The same input after decoding and re-escaping on the server side was the following:" + Environment.NewLine +
                "\"" + sanitisedUserInput + "\""
                );
        }

        /// <summary>
        /// This procedure may throw a SanitisationWcfException.
        /// If it does, ErrorHandlerBehaviorAttribute will need to pass the "messageForClient" back to the client from within the IErrorHandler.ProvideFault procedure.
        /// Once execution is returned, the IErrorHandler.HandleError procedure of ErrorHandlerBehaviorAttribute
        /// will continue to process the exception that was thrown in the way of logging sensitive info.
        /// </summary>
        /// <param name="toSanitise"></param>
        private void ThrowExceptionIfEscapedInputToLong(string toSanitise)
        {
            int maxLengthHtmlEncodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlEncodedUserInput"]);
            if (toSanitise.Length > maxLengthHtmlEncodedUserInput)
            {
                string error = "The un-modified string received from the client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "exceeded the allowed maximum length of an escaped Html user input string. " +
                    "The maximum length allowed is: " +
                    maxLengthHtmlEncodedUserInput +
                    ". The length was: " +
                    toSanitise.Length + ".";
                throw new SanitisationWcfException(error, unsanitisedAnswer: toSanitise);
            }
        }

        private string HtmlDecodeUserInput(string doubleEncodedUserInput, ref int numberOfDecodedHtmlEntities)
        {
            string decodedUserInput = doubleEncodedUserInput.HtmlDecode(ref numberOfDecodedHtmlEntities).HtmlDecode(ref numberOfDecodedHtmlEntities) ?? string.Empty;
            
            // if the decoded string is longer than MaxLengthHtmlDecodedUserInput throw
            int maxLengthHtmlDecodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlDecodedUserInput"]);
            if(decodedUserInput.Length > maxLengthHtmlDecodedUserInput)
            {
                throw new SanitisationWcfException(
                    "The string received from the client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "after Html decoding exceded the allowed maximum length of an un-escaped Html user input string." +
                    Environment.NewLine +
                    "The maximum length allowed is: " + maxLengthHtmlDecodedUserInput + ". The length was: " +
                    decodedUserInput.Length + ".",
                    unsanitisedAnswer: doubleEncodedUserInput
                    );
            }
            return decodedUserInput;
        }
    }
}

As you can see, there’s a lot more work in the server side sanitisation than the client side.

Tags:.NET, application security, bootstrap, C#, Information Security, Infosec, injection, Security, sql injection, Threat Modelling, Web Security, XSRF, XSS
Posted in Architecture, C#, JavaScript, Networking, Scripting, Security, Web | 2 Comments »

Sanitising User Input from Browser. part 1

November 4, 2012

I was working on a web based project recently where there was no security thought about when designing, developing it.
The following outlines my experience with retrofitting security.
It’s my hope that someone will find it useful for their own implementation.

We’ll be focussing on the client side in this post (part 1) and the server side in part 2.
We’ll also cover some preliminary discussion that will set the stage for this series.

The application consists of a WCF service delivering up content to some embedding code on any page in the browser.
The content is stored as Xml in the database and transformed into Html via Xslt.

The first activity I find useful is to go through the process of Threat Modelling the Application.
This process can be quite daunting for those new to it.
Here’s a couple of references I find quite useful to get started:

https://www.owasp.org/index.php/Application_Threat_Modeling

https://www.owasp.org/index.php/Threat_Risk_Modeling#Decompose_Application

Actually this ones not bad either.

There is no single right way to do this.
The more you read and experiment, the more equipped you will be.
The idea is to think like an attacker thinks.
This may be harder for some than others, but it is essential, to cover as many potential attack vectors as possible.
Remember, there is no secure system, just varying levels of insecurity.
It will always be much harder to discover the majority of security weaknesses in your application as the person or team creating/maintaining it,
than for the person attacking it.
The Threat Modelling topic is large and I’m not going to go into it here, other than to say, you need to go into it.

Threat Agents

Work out who your Threat Agents are likely to be.
Learn how to think like they do.
Learn what skills they have and learn the skills your self.
Sometimes the skills are very non technical.
For example walking through the door of your organisation in the weekend because the cleaners (or any one with access) forgot to lock up.
Or when the cleaners are there and the technical staff are not (which is just as easy).
It happens more often than we like to believe.

Defense in Depth

To attempt to mitigate attacks, we need to take a multi layered approach (often called defence in depth).

What made me decide to start with sanitising user input from the browser anyway?
Well according to the OWASP Top 10, Injection and Cross Site Scripting (XSS) are still the most popular techniques chosen to compromise web applications.
So it makes sense if your dealing with web apps, to target the most common techniques exploited.

Now, in regards to defence in depth when discussing web applications;
If the attacker gets past the first line of defence, there should be something stopping them at the next layer and so forth.
The aim is to stop the attack as soon as possible.
This is why we focus on the UI first, and later move our focus to the application server, then to the database.
Bear in mind though, that what ever we do on the client side, can be circumvented relatively easy.
Client side code is out of our control, so it’s best effort.
Because of this, we need to perform the following not only in the browser, but as much as possible on the server side as well.

Minimising the attack surface
Defining maximum field lengths (validation)
Determining a white list of allowable characters (validation)
Escaping untrusted data, especially where you know it’s going to endup in an execution context. Even where you don’t think this is likely, it’s still possible.
Using Stored Procedures / parameterised queries (not covered in this series).
Least Privilege.
Minimising the privileges assigned to every database account (not covered in this series).

Minimising the attack surface

input fields should only allow certain characters to be input.
Text input fields, textareas etc that are free form (anything is allowed) are very hard to constrain to a small white list.
input fields where ever possible should be constrained to well structured data,
like dates, social security numbers, zip codes, e-mail addresses, etc. then the developer should be able to define a very strong validation pattern, usually based on regular expressions, for validating such input. If the input field comes from a fixed set of options, like a drop down list or radio buttons, then the input needs to match exactly one of the values offered to the user in the first place.
As it was with the existing app I was working on, we had to allow just about everything in our free form text fields.
This will have to be re-designed in order to provide constrained input.

Defining maximum field lengths (validation)

This was currently being done (sometimes) in the Xml content for inputs where type="text".
Don’t worry about the inputType="single", it gets transformed.

<input id="2" inputType="single" type="text" size="10" maxlength="10" />

And if no maxlength specified in the Xml, we now specify a default of 50 in the xsl used to do the transformation.
This way we had the input where type="text" covered for the client side.
This would also have to be validated on the server side when the service received values from these inputs where type="text".

    <xsl:template match="input[@inputType='single']">
      <xsl:value-of select="@text" />
        <input name="a{@id}" type="text" id="a{@id}" class="textareaSingle">
          <xsl:attribute name="value">
            <xsl:choose>
              <xsl:when test="key('response', @id)">
                <xsl:value-of select="key('response', @id)" />
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of select="string(' ')" />
              </xsl:otherwise>
            </xsl:choose>
          </xsl:attribute>
          <xsl:attribute name="maxlength">
            <xsl:choose>
              <xsl:when test="@maxlength">
                <xsl:value-of select="@maxlength"/>
              </xsl:when>
              <xsl:otherwise>50</xsl:otherwise>
            </xsl:choose>
          </xsl:attribute>
        </input>
        <br/>
    </xsl:template>

For textareas we added maxlength validation as part of the white list validation.
See below for details.

Determining a white list of allowable characters (validation)

See bottom of this section for Update

Now this was quite an interesting exercise.
I needed to apply a white list to all characters being entered into the input fields.
A user can:

type the characters in
[ctrl]+[v] a clipboard full of characters in
right click -> Paste

To cover all these scenarios as elegantly as possible, was going to be a bit of a challenge.
I looked at a few JavaScript libraries including one or two JQuery plug-ins.
None of them covered all these scenarios effectively.
I wish they did, because the solution I wasn’t totally happy with, because it required polling.
In saying that, I measured performance, and even bringing the interval right down had negligible effect, and it covered all scenarios.

setupUserInputValidation = function () {

  var textAreaMaxLength = 400;
  var elementsToValidate;
  var whiteList = /[^A-Za-z_0-9\s.,]/g;

  var elementValue = {
    textarea: '',
    textareaChanged: function (obj) {
      var initialValue = obj.value;
      var replacedValue = initialValue.replace(whiteList, "").slice(0, textAreaMaxLength);
      if (replacedValue !== initialValue) {
        this.textarea = replacedValue;
        return true;
      }
      return false;
    },
    inputtext: '',
    inputtextChanged: function (obj) {
      var initialValue = obj.value;
      var replacedValue = initialValue.replace(whiteList, "");
      if (replacedValue !== initialValue) {
        this.inputtext = replacedValue;
        return true;
      }
      return false;
    }
  };

  elementsToValidate = {
    textareainputelements: (function () {
      var elements = $('#page' + currentPage).find('textarea');
      if (elements.length > 0) {
        return elements;
      }
      return 'no elements found';
    } ()),
    textInputElements: (function () {
      var elements = $('#page' + currentPage).find('input[type=text]');
      if (elements.length > 0) {
        return elements;
      }
      return 'no elements found';
    } ())
  };

  // store the intervals id in outer scope so we can clear the interval when we change pages.
  userInputValidationIntervalId = setInterval(function () {
    var element;

    // Iterate through each one and remove any characters not in the whitelist.
    // Iterate through each one and trim any that are longer than textAreaMaxLength.

    for (element in elementsToValidate) {
      if (elementsToValidate.hasOwnProperty(element)) {
        if (elementsToValidate[element] === 'no elements found')
          continue;

        $.each(elementsToValidate[element], function () {
          $(this).attr('value', function () {
            var name = $(this).prop('tagName').toLowerCase();
            name = name === 'input' ? name + $(this).prop('type') : name;
            if (elementValue[name + 'Changed'](this))
              this.value = elementValue[name];
          });
        });
      }
    }
  }, 300); // milliseconds
};

Each time we change page, we clear the interval and reset it for the new page.

clearInterval(userInputValidationIntervalId);

setupUserInputValidation();

Update 2013-06-02:

Now with HTML5 we have the pattern attribute on the input tag, which allows us to specify a regular expression that the text about to be received is checked against. We can also see it here amongst the new HTML5 attributes . If used, this can make our JavaScript white listing redundant, providing we don’t have textareas which W3C has neglected to include the new pattern attribute on. I’d love to know why?

Escaping untrusted data

Escaped data will still render in the browser properly.
Escaping simply lets the interpreter know that the data is not intended to be executed,
and thus prevents the attack.

Now what we do here is extend the String prototype with a function called htmlEscape.

if (typeof Function.prototype.method !== "function") {
  Function.prototype.method = function (name, func) {
    this.prototype[name] = func;
    return this;
  };
}

String.method('htmlEscape', function () {

  // Escape the following characters with HTML entity encoding to prevent switching into any execution context,
  // such as script, style, or event handlers.
  // Using hex entities is recommended in the spec.
  // In addition to the 5 characters significant in XML (&, <, >, ", '), the forward slash is included as it helps to end an HTML entity.
  var character = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    // Double escape character entity references.
    // Why?
    // The XmlTextReader that is setup in XmlDocument.LoadXml on the service considers the character entity references () to be the character they represent.
    // All XML is converted to unicode on reading and any such entities are removed in favor of the unicode character they represent.
    // So we double escape character entity references.
    // These now get read to the XmlDocument and saved to the database as double encoded Html entities.
    // Now when these values are pulled from the database and sent to the browser, it decodes the & and displays #x27; and/or #x2F.
    // This isn't what we want to see in the browser.
    "'": '&amp;#x27;',    // &apos; is not recommended
    '/': '&amp;#x2F;'     // forward slash is included as it helps end an HTML entity
  };

  return function () {
    return this.replace(/[&<>"'/]/g, function (c) {
      return character[c];
    });
  };
}());

This allows us to, well, html escape our strings.

element.value.htmlEscape();

In looking through here,
The only untrusted data we are capturing is going to be inserted into an Html element

tag by way of insertion into a textarea element,
or the attribute value of input elements where type="text".
I initially thought I’d have to:

Html escape the untrusted data which is only being captured from textarea elements.
Attribute escape the untrusted data which is being captured from the value attribute of input elements where type="text".

RULE #2 – Attribute Escape Before Inserting Untrusted Data into HTML Common Attributes of here,
mentions
“Properly quoted attributes can only be escaped with the corresponding quote.”
So I decided to test it.
Created a collection of injection attacks. None of which worked.
Turned out we only needed to Html escape for the untrusted data that was going to be inserted into the textarea element.
More on this in a bit.

Now in regards to the code comments in the above code around having to double escape character entity references;
Because we’re sending the strings to the browser, it’s easiest to single decode the double encoded Html on the service side only.
Now because we’re still focused on the client side sanitisation,
and we are going to shift our focus soon to making sure we cover the server side,
we know we’re going to have to create some sanitisation routines for our .NET service.
Because the routines are quite likely going to be static, and we’re pretty much just dealing with strings,
lets create an extensions class in a new project in a common library we’ve already got.
This will allow us to get the widest use out of our sanitisation routines.
It also allows us to wrap any existing libraries or parts of them that we want to get use of.

namespace My.Common.Security.Encoding
{
    /// <summary>
    /// Provides a series of extension methods that perform sanitisation.
    /// Escaping, unescaping, etc.
    /// Usually targeted at user input, to help defend against the likes of XSS attacks.
    /// </summary>
    public static class Extensions
    {
        /// <summary>
        /// Returns a new string in which all occurrences of a double escaped html character (that's an html entity immediatly prefixed with another html entity)
        /// in the current instance are replaced with the single escaped character.
        /// </summary>
        ///
        /// The new string.
        public static string SingleDecodeDoubleEncodedHtml(this string source)
        {
            return source.Replace("&amp;#x", "&#x");
        }
    }
}

Now when we run our xslt transformation on the service, we chain our new extension method on the end.
Which gives us back a single encoded string that the browser is happy to display as the decoded value.

return Transform().SingleDecodeDoubleEncodedHtml();

Now back to my findings from the test above.
Turns out that “Properly quoted attributes can only be escaped with the corresponding quote.” really is true.
I thought that if I entered something like the following into the attribute value of an input element where type="text",
then the first double quote would be interpreted as the corresponding quote,
and the end double quote would be interpreted as the end quote of the onmouseover attribute value.

 " onmouseover="alert(2)

What actually happens, is during the transform…

xslCompiledTransform.Transform(xmlNodeReader, args, writer, new XmlUrlResolver());

All the relevant double quotes are converted to the double quote Html entity ‘”‘ without the single quotes.

And all double quotes are being stored in the database as the character value.

Libraries and useful code

Microsoft Anti-Cross Site Scripting Library

OWASP Encoding Project
This is the Reform library. Supports Perl, Python, PHP, JavaScript, ASP, Java, .NET

Online escape tool supporting Html escape/unescape, Java, .NET, JavaScript

The characters that need escaping for inserting untrusted data into Html element content

JavaScript The Good Parts: pg 90 has a nice ‘entityify’ function

OWASP Enterprise Security API Used for JavaScript escaping (ESAPI4JS)

JQuery plugin

Changing encoding on html page

Cheat Sheets and Check Lists I found helpful

https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet

https://www.owasp.org/index.php/OWASP_Validation_Regex_Repository

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet

https://www.owasp.org/index.php/DOM_based_XSS_Prevention_Cheat_Sheet

https://www.owasp.org/index.php/OWASP_AJAX_Security_Guidelines

If any of this is unclear, let me know and I’ll do my best to clarify. Maybe you have suggestions of how this could have been improved? Let’s spark a discussion.

Tags:info sec, injection, Security, sql injection, threat modeling, XSRF, XSS
Posted in Architecture, C#, JavaScript, Networking, Scripting, Security, Web | 2 Comments »

C#.NET Coding Standards and Guidelines

August 12, 2012

This is the current set of coding standards and guidelines I use when I’m coding in the C#.NET language.
I thought it would be good to share so others could get use out of them also, and maybe start a discussion as to amendments / changes they see that could be useful?

Naming Conventions
Coding Style
Language Usage

Naming Conventions

Do not use Hungarian notation, I.E. a boolian variable may have the name MyBool, but shouldn’t be called bMyBool.
Do prefix member variables with the underscore ‘_’. Do not prefix member variables with “this”. also use camelCasing for member variables. The underscore is easy to see, is one key stroke.
Do prefix interfaces names with “I”
Do not prefix enums, classes, or delegates with any letter.

Key:

“c” = camelCase
“P” = PascalCase
“_” = Prefix with _Underscore
“x” = Not Applicable.

Identifier	Public	Protected	Internal	Private	Notes
Project File	P	x	x	x	Match Assembly and Root Namespace
Project Folder	P	x	x	x	Match Project File
Source File	P	x	x	x	Match contained class
Test Source File	P	x	x	x	Append the word Test if it contains tests
Image File	c	x	x	x
Other Files	P	x	x	x	Apply where possible
Namespace	P	x	x	x	Partial Project/Assembly match.(Also see the namespace section)
Solution File	P	x	x	x	CompanyNameSolutionDescription
Solution Folder	P	x	x	x	CompanyNameSolutionDescription (if multiple solutions in repository). Source (if single solution)
SpecFlow Feature File	P	x	x	x	USn_BriefUserStoryName where n is the user story number
oject Folder	P	x	x	x	Same as Project file
Class or Struct	P	P	P	P	Add suffix of subclass.
Interface	P	P	P	P	Prefix with a capital I.
Generic Class	P	P	P	P	Use T (type) or K (key) as Type identifier.
Method	P	P	P	P	Use a Verb or Verb-Object pair.
Test Method	P	x	x	x	MemberUnderTest_StateUnderTest_ExpectedBehavior . StateUnderBehavior can be leftout if not applicable.
Property	P	P	P	P	Do not prefix with Get or Set.
Field	P	P	P	_c	Only use Private fields.
Constant	P	P	P	_c
Static Field	P	P	P	_c	Only use Private fields.
Enum	P	P	P	P	Options are also PascalCase.
Delegate	P	P	P	P	See under Events, Delegates for naming Dot NET
Event	P	P	P	P	See under Events, Delegates for naming Dot NET
Inline Variable	x	x	x	c	Avoid single-character and enumerated names.
Parameter	x	x	x	c

Coding Style

Commenting

Comment Style

Block comments should usually be avoided

/* Line 1
* Line 2
* Line 3
*/

/* … */

Begin comment text with an upper case character. End comment text with a period.

If you have to comment your code, consider refactoring, so that it is easier to read.
Prefer not to use inline-comments to explain obvious code. Well written code is self-documenting.
Rather fix or clean up code now, than put a // Todo in.

You can access // Todo‘s in Visual Studio via

View menu -> Task List
The Tokens can be setup in Tools -> Options… -> Environment->Task List

or for ReSharper

ReSharper menu -> Tools -> To-do Items (or use the key shortcuts)

Use the following tokens:

Todo
Note
Bug
Not Implemented

XML Documentation

Always apply C# comment-blocks (///) to public, protected, and internal declarations.
Only use C# comment-blocks for documenting the API I.E the interface.
include <summary> comments. Include <param>, <return>, and <exception> comment
sections where applicable.
Include <see cref=””/> and <seeAlso cref=””/> where possible.

Always add CDATA tags to comments containing code and other embedded markup in order to avoid
encoding issues.
Example:

///
/// Add the following key to the appSettings” section of your config:
/// <code><![CDATA[
///   <configuration>
///     <appSettings>
///       <add key=”mySetting” value=”myValue”/>
///     </appSettings>
///   </configuration>
/// ]]></code>
///

File Organisation

Group internal class implementation by type in the following order:

Member variables.
Constructors & Finalizers.
Nested Enums, Structs, and Classes.
Properties
Methods

Sequence declarations within type groups based upon access modifier and visibility:

Public
Protected
Internal
Private

Do not use #region statements
Always match class name and file name where ever possible. Avoid including more than one class per file.

Formatting

Bracing

Place first brace of the block at the end of the line preceded with a space.
In languages like C, C++, C#, Java, it doesn’t matter where you put the first curly brace, it’s just personal preference or based on vote.
In languages like JavaScript, it does matter. I use quite a bit of JavaScript, so just find it easier to use the same convention. Although at work, we use the “opening brace on a new line convention, simply because it won the vote”.
Always use curly braces ({ and }) in conditional statements. Unless there is a very simple statement, like return bla.
Recursively indent all code blocks contained within braces.

Spacing

Use white space (CR/LF, Tabs, etc) liberally to separate and organize code.

Only declare related attribute declarations on a single line, otherwise stack each attribute as a separate declaration.

Example:

// Bad!
[Attrbute1, Attrbute2, Attrbute3]
public class MyClass {
   …
}

// Good!
[Attrbute1, RelatedAttribute2]
[Attrbute3]
[Attrbute4]
public class MyClass {
   …
}

Tabs and Indenting

Tab characters (x09) should not be used in code. All indentation should be done with 3 space characters.

Language Usage

Access Modifiers

Do not omit access modifiers.
Explicitly declare all identifiers with the appropriate access modifier instead of allowing the default.
Example:

// Bad!
Void WriteEvent(string message) {
   …
}

// Good!
private Void WriteEvent(string message) {
   …
}

	Both the above definitions are private.
	Prefer explicit to implicit.

Calling Routines

When calling a routine that takes a bool or a number.
Don’t pass litterals, as it’s unclair what they represent.
Instead create a variable with a meaningful name.

// calling MethodTakingExampleArgs
MethodTakingExampleArgs(true, 12);

// instead do the following

bool temperatureHasChanged = true;
int temperatureInCelcius = 12;

// calling MethodTakingExampleArgs
MethodTakingExampleArgs(temperatureHasChanged, temperatureInCelcius);

The intent becomes clearer, thus making for code that’s easier to read, thus we work faster.

If a routine call has its parameters spread over more than a single line due to being to long, place each parameter on its own line.
Also consider how many arguments are being passed, if it’s over 5, consider other ways to pass the information needed.

Class

Avoid putting multiple classes in a single file.

Events, Delegates

The delegate type should be prefixed with “Handler”.
The name of the procedure that does the work should be a verb.

public class MyDelegateExample {
   delegate void ChangeHandler();
   event ChangeHandler _change;

   private void OnChange() {
      if (_change != null)
         _change();
   }
}

	Rather than checking for null, you can add an empty delegate to your _change event
	so that you don’t have to check the event for null before you raise it.

The traditional null check followed by the next action is not atomic, so not thread safe. Discussed in more depth here.

public class MyDelegateExample {

   delegate void ChangeHandler();
   event ChangeHandler _change = delegate{};

   public void Attach(ChangeHandler update) {
      Change += update;
   }

   public void Detach(ChangeHandler update) {
      Change -= update;
   }

   private void OnChange() {
      _change();
   }
}

Exceptions

Do not use try/catch blocks for flow-control. Only use for exceptional cases.
Only catch exceptions that you can handle.
Never declare an empty catch block.
Avoid nesting a try/catch within a catch block.
Always catch the most derived exception via exception filters.
Order exception filters from most to least derived exception type.
Avoid re-throwing an exception. Allow it to bubble-up instead.
If re-throwing an exception, preserve the original call stack by omitting the exception argument from the throw statement.Example:
```
// Bad!
catch(Exception e) {
   Log(e);
   throw e;
}

// Good!
catch(Exception e) {
   Log(e);
   throw;
}
```
Only use the finally block to release resources from a try statement.

Always use validation to avoid exceptions.
Example:

// Bad!
try {
   conn.Close();
}
Catch(Exception ex) {
   // handle exception if already closed!
}

// Good!
if(conn.State != ConnectionState.Closed) {
   conn.Close();
}

Always set the innerException property on thrown exceptions so the exception chain & call stack are maintained.
Avoid defining custom exception classes if there is an existing Exception derived class available in the .NET library.
Always suffix exception class names with the word “Exception”.
Always add the SerializableAttribute to exception classes.

Always implement the standard “Exception Constructor Pattern”:

public MyCustomException ();
public MyCustomException (string message);
public MyCustomException (string message, Exception innerException);

Or better… if using .NET 4.0 or greater, use optional parameters.

Always implement the deserialization constructor:

protected MyCustomException(SerializationInfo info, treamingContext contxt);

Flow Control

Case Statements

Only use switch/case statements for simple operations with parallel conditional logic.
Prefer nested if/else over switch/case for short conditional sequences and complex conditions.
Prefer polymorphism over switch/case to encapsulate and delegate complex operations.
Don’t fall into the trap of writing procedural code in an OO language.

Conditions

Avoid evaluating Boolean conditions against true or false.
Example:

// Bad!
if(isValid == true) {
   …
}

// Good!
if(isValid) {
   …
}

Use braces {} as shown above in all situations but for the most simple.
If you have more than a single line statement in a conditional, surround it with braces.

Implicit typing using the var keyword

Some background on var:

The compiler simply takes the compile time
type of the initialization expression and makes the variable have that type too.
An example:

var stringVariable = "Hello, world."
stringVariable = 0;

The above code is invalid.

The var keyword should only be used with LINQ and Anonymous types.
Unless there’s a significant gain in code simplicity, use explicit typing.

It is recommended to use var only when it is necessary, that is, when the variable will be used to store an anonymous type or a collection of anonymous types.
See Microsofts reference on var.

Sometimes you actually want the code to break when a type is changed.

Consider a control system. Many elements have On() and Off() methods. there are many cases where there is no relationship between the types (i.e. no common base classes or interfaces), there is only the similarity that both have methods with those signatures.

Writting code:

var thing = SomeFactory.GetThing();  // Returns something that is safe to turn off...
thing.Off();

Then later a change is made to the Factory and that method now returns something completely different, which happens to have severe consequences if it is arbitrarily turned off having such a design is debatable for many reasons.

By using var, the previous code will compile without complaint. Even though the return type may have changed from ReadingLamp to LifeSupportSystem.

I believe that there are more times when there is the possibility of an “unintended side-effect” caused by a change in the type than there are times where the change in type has no bearing on the code that consumes it. As a result, I very rarely use var. Even when the return type is obvious (such as the LHS of a new), I find it easier to be consistent.

Namespace

CompanyName.SolutionDescription.AssemblyDescription
Never declare more than 1 namespace per file.
Append folder-name to namespace for source files within sub-folders.
Also see the Naming conventions table.
Place namespace “using” statements together at the top of file. Group .NET namespaces above custom namespaces.
Followed by grouping of external namespaces.
Followed by grouping of organisation namespaces.
Order namespace “using” statements alphabetically.

Variables and Types

Declare and preferably initialize local variables at the same point and as close to where you first use them.
Always choose the simplest data type, list, or object required.
Always use the built-in C# data type aliases, not the .NET common type system (CTS).
Example:

	short NOT System.Int16
	int NOT System.Int32
	long NOT System.Int64
	string NOT System.String

Declare one variable per line.
Only declare member variables as private. Use properties to provide access to them with public, protected, or internal access modifiers.
Prefer to use the as operator and check for null, rather than directly casting, and having to handle potential InvalidCastException.
```
object dataObject = LoadData();
DataSet ds = dataObject as DataSet;
if(ds != null) {
   …
}
```

Avoid boxing and unboxing value types.
Especially in loops, or where performance matters.Example:

int count = 1;
object refCount = count; // Implicitly boxed.
int newCount = (int)refCount; // Explicitly unboxed.

Strings

Use the “@” prefix for string literals instead of escaped strings.
Prefer String.Format() or StringBuilder over string concatenation.
StringBuilder performs many times faster (thousands in fact)
Never concatenate strings inside a loop. Remember, string’s are immutable. Each time you concatenate, a new instance of string is created.
Checking whether a string is empty?
String.Length == 0 or “” is faster than String.Empty, but… beware of null strings, if null when you perform a String.Length, you’ll get a NullReferenceException.
The safest technique is to use the static IsNullOrEmpty function on string.
Using “” does not create a new object. Due to string interning, it will be created either once per assembly or once per AppDomain.

Tags:coding standards
Posted in Architecture, C# | 24 Comments »