Archive for the ‘Networking’ Category

Up and Running with Express on Node.js … and friends

July 27, 2013

This is a result of a lot of trial and error, reading, notes taken, advice from more knowledgeable people than myself over a period of a few months in my spare time. This is the basis of a web site I’m writing for a new business endeavour.

Web Frameworks evaluated

  1. ExpressJS Version 3.1 I talked to quite a few people on the #Node.js IRC channel and the preference in most cases was Express. I took notes around the web frameworks, but as there were not that many good contenders, and I hadn’t thought about pushing this to a blog post at the time, I’ve pretty much just got a decision here.
  2. Geddy Version 0.6

MV* Frameworks evaluated

  1. CompoundJS (old name = RailwayJS) Version 1.1.2-7
  2. Locomotive Version 0.3.6. built on Express

At this stage I worked out that I don’t really need a server side MV* framework, as Express.js routes are near enough to controllers. My mind may change on  this further down the track, if and when it does, I’ll re-evaluate.

Templating Engines evaluated

  1. jade Version 0.28.2, but reasonably mature and stable. 2.5 years old. A handful of active contributors headed by Chuk Holoway. Plenty of support on the net. NPM: 4696 downloads in the last day, 54 739 downloads in the last week, 233 570 downloads in the last month (as of 2013-04-01). Documentation: Excellent. The default view engine when running the express binary without specifying the desired view engine. Discussion on LinkedIn. Discussed in the Learning Node book. Easy to read and intuitive. Encourages you down the path of keeping your logic out of the view. The documentation is found here and you can test it out here.
  2. handlebars Version 1.0.10 A handful of active contributors. NPM: 191 downloads in the last day, 15 657 downloads in the last week, 72 174 downloads in the last month (as of 2013-04-01). Documentation: Excellent: nettuts. Also discussed in Nicholas C. Zakas’s book under Chapter 5 “Loose Coupling of UI Layers”.
  3. EJS Most of the work done by the Chuk Holoway (BDFL). NPM: 258 downloads in the last day, 13 875 downloads in the last week, 56 962 downloads in the last month (as of 2013-04-01). Documentation: possibly a little lacking, but the ASP.NET syntax makes it kind of intuitive for developers from the ASP.NET world. Discussion on LinkedIn. Discussed in the “Learning Node” book by Shelley Powers. Plenty of support on the net. deoxxa from #Node.js mentioned: “if you’re generating literally anything other than all-html-all-the-time, you’re going to have a tough time getting the job done with something like jade or handlebars (though EJS can be a good contender there). For this reason, I ended up writing node-ginger a while back. I wouldn’t suggest using it in production at this stage, but it’s a good example of how you don’t need all the abstractions that some of the other libraries provide to achieve the same effects.”
  4. mu (Mustache template engine for Node.js) NPM: 0 downloads in the last day, 46 downloads in the last week, 161 downloads in the last month (as of 2013-04-01).
  5. hogan-express NPM: 1 downloads in the last day, 183 downloads in the last week, 692 downloads in the last month (as of 2013-04-01). Documentation: lacking

Middleware AKA filters

Connect

Details here https://npmjs.org/package/connect express.js shows that connect().use([takes a path defaulting to ‘/’ here], andACallbackHere) http://expressjs.com/api.html#app.use the body of andACallbackHere will only get executed if the request had the sub directory that matches the first parameter of connect().use

Styling extensions etc evaluated

  1. less (CSS3 extension and (preprocessor) compilation to CSS3) Version 1.4.0 Beta. A couple of solid committers plus many others. runs on both server-side and client-side. NPM: 269 downloads in the last day, 16 688 downloads in the last week, 74 992 downloads in the last month (as of 2013-04-01). Documentation: Excellent. Wiki. Introduction.
  2. stylus (CSS3 extension and (preprocessor) compilation to CSS3) Worked on since 2010-12. Written by the Chuk Holoway (BDFL) that created Express, Connect, Jade and many more. NPM: 282 downloads in the last day, 16 284 downloads in the last week, 74 500 downloads in the last month (as of 2013-04-01).
  3. sass (CSS3 extension and (preprocessor) compilation to CSS3) Version 3.2.7. Worked on since 2006-06. Still active. One solid committer with lots of other help. NPM: 12 downloads in the last day, 417 downloads in the last week, 1754 downloads in the last month (as of 2013-04-01). Documentation: Looks pretty good. Community looks strong: #sass on irc.freenode.net. forum. less, stylus, sass comparison on nettuts.
  • rework (processor) Version 0.13.2. Worked on since 2012-08. Written by the Chuk Holoway (BDFL) that created Express, Connect, Jade and many more. NPM: 77 downloads in the last week, 383 downloads in the last month (as of 2013-04-01). As explained and recommended by mikeal from #Node.js its basically a library for building something like stylus and less, but you can turn on the features you need and add them easily.  No new syntax to learn. Just CSS syntax, enables removal of prefixes and provides variables. Basically I think the idea is that rework is going to use the likes of less, stylus, sass, etc as plugins. So by using rework you get what you need (extensibility) and nothing more.

Responsive Design (CSS grid system for Responsive Web Design (RWD))

There are a good number of offerings here to help guide the designer in creating styles that work with the medium they are displayed on (leveraging media queries).

Keeping your Node.js server running

Development

During development nodemon works a treat. Automatically restarts node when any source file is changed and notifies you of the event. I install it locally:

$ npm install nodemon

Start your node app wrapped in nodemon:

$ nodemon [your node app]

Production

There are a few modules here that will keep your node process running and restart it if it dies or gets into a faulted state. forever seems to be one of the best options. forever usage. deoxxa’s jesus seems to be a reasonable option also, ningu from #Node.js is using it as forever was broken for a bit due to problems with lazy.

Reverse Proxy

I’ve been looking at reverse proxies to forward requests to different process’s on the same machine based on different domain names and cname prefixes. At this stage the picks have been node-http-proxy and NGinx. node-http-proxy looks perfect for what I’m trying to do. It’s always worth chatting to the hoards of developers on #Node.js for personal experience. If using Express, you’ll need to enable the ‘trust proxy’ setting.

Adding less-middleware

I decided to add less after I had created my project and structure with the express executable.
To do this, I needed to do the following:
Update my package.json in the projects root directory by adding the following line to the dependencies object.
“less-middleware”: “*”

Usually you’d specify the version, so that when you update in the future, npm will see that you want to stay on a particular version, this way npm won’t update a particular version and potentially break your app. By using the “*” npm will download the latest package. So now I just copy the version of the less-middleware and replace the “*”.

Run npm install from within your project root directory:

my-command-prompt npm install
npm WARN package.json my-apps-name@0.0.1 No README.md file found!
npm http GET https://registry.npmjs.org/less-middleware
npm http 200 https://registry.npmjs.org/less-middleware
npm http GET https://registry.npmjs.org/less-middleware/-/less-middleware-0.1.11.tgz
npm http 200 https://registry.npmjs.org/less-middleware/-/less-middleware-0.1.11.tgz
npm http GET https://registry.npmjs.org/less
npm http GET https://registry.npmjs.org/mkdirp
npm http 200 https://registry.npmjs.org/mkdirp
npm http 200 https://registry.npmjs.org/less
npm http GET https://registry.npmjs.org/less/-/less-1.3.3.tgz
npm http 200 https://registry.npmjs.org/less/-/less-1.3.3.tgz
npm http GET https://registry.npmjs.org/ycssmin
npm http 200 https://registry.npmjs.org/ycssmin
npm http GET https://registry.npmjs.org/ycssmin/-/ycssmin-1.0.1.tgz
npm http 200 https://registry.npmjs.org/ycssmin/-/ycssmin-1.0.1.tgz
less-middleware@0.1.11 node_modules/less-middleware
├── mkdirp@0.3.5
└── less@1.3.3 (ycssmin@1.0.1)

So you can see that less-middleware pulls in less as well.
Now you need to require your new middleware and tell express to use it.
Add the following to your app.js in your root directory.

var lessMiddleware = require('less-middleware');

and within your function that you pass to app.configure, add the following.

app.use(lessMiddleware({
   src : __dirname + "/public",
   // If you want a different location for your destination style sheets, uncomment the next two lines.
   // dest: __dirname + "/public/css",
   // prefix: "/css",
   // if you're using a different src/dest directory, you MUST include the prefix, which matches the dest public directory
   // force true recompiles on every request... not the best for production, but fine in debug while working through changes. Uncomment to activate.
   // force: true
   compress : true,
   // I'm also using the debug option...
   debug: true
}));

Now you can just rename your css files to .less and less will compile to css for you.
Generally you’ll want to exclude the compiled styles (.css) from your source control.

The middleware is made to watch for any requests for a .css file and check if there is a corresponding .less file. If there is a less file it checks to see if it has been modified. To prevent re-parsing when not needed, the .less file is only reprocessed when changes have been made or there isn’t a matching .css file.
less-middleware documentation

Bootstrap

Twitters Bootstap is also really helpful for getting up and running and comes with allot of helpful components and ideas to get you kick started.
Getting started.
Docs
.

Bootstrap-for-jade

As I decided to use the Node Jade templating engine, Bootstrap-for-Jade also came in useful for getting started with ideas and helping me work out how things could fit together. In saying that, I came across some problems.

ReferenceError: home.jade:23

body is not defined
    at eval (eval at <anonymous> (MySite/node_modules/jade/lib/jade.js:171:8), <anonymous>:238:64)
    at MySite/node_modules/jade/lib/jade.js:172:35
    at Object.exports.render (MySite/node_modules/jade/lib/jade.js:206:14)
    at View.exports.renderFile [as engine] (MySite/node_modules/jade/lib/jade.js:233:13)
    at View.render (MySite/node_modules/express/lib/view.js:75:8)
    at Function.app.render (MySite/node_modules/express/lib/application.js:506:10)
    at ServerResponse.res.render (MySite/node_modules/express/lib/response.js:756:7)
    at exports.home (MySite/routes/index.js:19:7)
    at callbacks (MySite/node_modules/express/lib/router/index.js:161:37)
    at param (MySite/node_modules/express/lib/router/index.js:135:11)
GET /home 500 22ms

I found a fix and submitted a pull request. Details here.

I may make a follow up post to this titled something like “Going Steady with Express on Node.js … and friends'”

Running Wireshark as non-root user

April 13, 2013

As part of my journey with Node.js I decided I wanted to see exactly what was happening on the wire. I decided to use Burp Suite as the Http proxy interceptor and Wireshark as the network sniffer (not an interceptor). Wireshark can’t alter the traffic, it can’t decrypt SSL traffic unless the encryption key can be provided and Wireshark is compiled against GnuTLS.

This post is targeted at getting Wireshark running on Linux. If you’re a windows user, you can check out the Windows notes here.

When you first install Wireshark and try to start capturing packets, you will probably notice the error “You didn’t specify an interface on which to capture packets.”

When you try to specify an interface from which to capture, you will probably notice the error “There are no interfaces on which a capture can be done.”

You can try running Wireshark as root: gksudo wireshark

Wireshark as root

This will work, but of course it’s not a good idea to run a comprehensive tool like Wireshark (over 1’500’000 lines of code) as root.

So what’s actually happening here?

We have dumpcap and we have wireshark. dumpcap is the executable responsible for the low level data capture of your network interface. wireshark uses dumpcap. Dumpcap needs to run as root, wireshark does not need to run as root because it has Privilege Separation.

If you look at the above suggested “better way” here, this will make a “little” more sense. In order for it to make quite a lot more sense, I’ll share what I’ve just learnt.

Wireshark has implemented Privilege Separation which means that the Wireshark GUI (or the tshark CLI) can run as a normal user while the dumpcap capture utility runs as root. Why can’t this just work out of the box? Well there is a discussion here on that. It doesn’t appear to be resolved yet. Personally I don’t think that anybody wanting to use wireshark should have to learn all these intricacies to “just use it”. As the speed of development gets faster, we just don’t have time to learn everything. Although on the other hand, a little understanding of what’s actually happening under the covers can help in more ways than one. Anyway, enough ranting.

How do we get this to all “just work”

from your console:

sudo dpkg-reconfigure wireshark-common

You’ll be prompted:

Configuring wireshark-common

Respond yes.

The wireshark group will be added

If the Linux Filesystem Capabilities are not present at the time of installing wireshark-common (Debian GNU/kFreeBSD, Debian GNU/Hurd), the installer will fall back to set the set-user-id bit to allow non-root users to capture packets. Custom built kernels may lack Linux Capabilities.

The help text also warns about a security risk which isn’t an issue because setuid isn’t used. Rather what actually happens is the following:

addgroup --quiet --system wireshark
chown root:wireshark /usr/bin/dumpcap
setcap cap_net_raw,cap_net_admin=eip /usr/bin/dumpcap

You will then have to manually add your user to the wireshark group.

sudo adduser kim wireshark # replacing kim with your user

or

usermod -a -G wireshark kim # replacing kim with your user

log out then back in again.

I wanted to make sure that what I thought was happening was actually happening. You’ll notice that if you run the following before and after the reconfigure:

ls -liah /usr/bin/dumpcap | less

You’ll see:

-rwxr-xr-x root root /usr/bin/dumpcap initially
-rwxr-xr-x root wireshark /usr/bin/dumpcap after

And a before and after of my users and groups I ran:

cat /etc/passwd | cut -d: -f1
cat /etc/group | cut -d: -f1

Alternatively to using the following as shown above, which gives us a nice abstraction (if that’s what you like):

sudo dpkg-reconfigure wireshark-common

We could just run the following:

addgroup wireshark
sudo chgrp wireshark /usr/bin/dumpcap
sudo chmod 750 /usr/bin/dumpcap
sudo setcap cap_net_raw,cap_net_admin+eip /usr/bin/dumpcap

The following will confirm the capabilities you just set.

getcap /usr/bin/dumpcap

What’s with the setcap?

For full details, run:

man setcap
man capabilities

setcap sets the capabilities of each specified filename to the capabilities specified (thank you man ;-))

For sniffing we need two of the capabilities listed in the capabilities man page.

  1. CAP_NET_ADMIN Perform various network-related operations (e.g., setting privileged socket options, enabling multicasting, interface configuration, modifying routing tables). This allows dumpcap to set interfaces to promiscuous mode.
  2. CAP_NET_RAW Use RAW and PACKET sockets. Gives dumpcap raw access to an interface.

For further details check out Jeremy Stretch’s explanation on Linux Filesystem Capabilities and using setcap. There’s also some more info covering the “eip” in point 2 here and the following section.

man capabilities | grep -A24 "File Capabilities"

Lets run Wireshark as our usual low privilege user

Now that you’ve done the above steps including the log off/on, you should be able to run wireshark as your usual user and configure your listening interfaces and start capturing packets.

Also before we forget… Ensure Wireshark works only from root and from a user in the “wireshark” group. You can add a temp user (command shown above).

Log in as them and try running wireshark. You should have the same issues as you had initially. Remove the tempuser:

userdel -r tempuser

Setup of Chromium, Burp Suite, Node.js to view HTTP on the wire

March 30, 2013

As part of my Node.js development I really wanted to see what was going over the wire from chromium-browser to my Node.js web apps.

I have node.js installed globaly, express installed locally, a very simple express server listening on port 3000

var express = require('express');
var app = express();

app.get('/', function (request, response) {
   response.send('Welcome to Express!');
});

app.listen(3000);

Burp Suite setup in my main menu. Added the command via System menu -> Preferences -> Main Menu

Burp Suite Command

The Command string looks like the following.

java -jar -Xmx1024m /WhereTheBurpSuiteLives/burpsuite_free_v1.5.jar

Setting up Burp Suite configuration details are found here. I’ve used Burp Suite before several times. Most notably to create my PowerOffUPSGuests library which I discuss here. In that usage I reverse engineered how the VMware vSphere client shuts down it’s guests and replicated the traffic in my library code. For a simple setup, it’s very easy to use. You can spend hours exploring Burps options and all the devious things you can use it for, but to get started it’s simple. Set it up to listen on localhost and port 3001 for this example.

Burp Suite Proxy Listeners

Run the web app

to start our express app from the directory where our above server is located, from a console, run:

node index.js

Where index.js is the name of the file that contains our JavaScript.

To test that our express server is active. We can browse to http://localhost:3000/ or we can curl it:

curl -i  http://localhost:3000/

Should give us something in return like:


HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html; charset=utf-8
Content-Length: 19
Date: Sun, 24 Mar 2013 07:53:38 GMT
Connection: keep-alive

Welcome to Express!

Now for the Proxy interception (Burp Suite)

Now that we’ve got end to end comms, lets test the interceptor.

Run burpsuite with the command I showed you above.

Fire the Http request at your web app via the proxy:

curl -i --proxy http://localhost:3001 http://localhost:3000/

Now you should see burps interceptor catch the request. On the Intercept tab, press the Forward button and curl should show a similar response to above.

Burp Suite Proxy Intercept

If you look at the History tab, you can select the message curl sent and also see the same Response that curl received.

Burp Suite Proxy History

Now you can also set Burp to intercept the server responses too. In fact Burp is extremely configurable. You can also pass the messages to different components of Burp to process how ever you see fit. As you can see in the above image looking at all the tabs that represent burp tools. These can be very useful for penetration testing your app as you develop it.

I wanted to be able to use chromium normally and also be able to open another window for browsing my express apps and viewing the HTTP via Burp Suite. This is actually quite simple. Again with your app running locally on port 3000 and burp listening on port 3001, run:

chromium-browser --temp-profile --proxy-server=localhost:3001

For more chromium options:

chromium-browser -help

Now you can just browse to your web app and have burp intercept your requests.

chromium proxied via burp

You may also want to ignore requests to your search provider, because as your typing in URL’s chromium will send searches when you pause. Under Proxy->Options tab you can do something like this:

Ignore Client Requests

Erasing data from your drives

March 17, 2013

Disclaimer

I take no responsibility for any damage caused by following any of the directions in this post. These tools and methods are destructive and likely to destroy your data or worse.

Deleting files from your drives does not remove them, it simply dereferences the memory. The data still exists. For further details, there is a good read here. This also covers some recovery tools.

Zero filling your disk/s

This is the process of setting all the bits on a drive to 0. Some say this is not the most secure way and that someone who knows what they’re doing can still in many cases recover the original data and that at least multiple passes of this technique are required. Others however disagree with this and say that a single pass is enough.
Thanks Miles for pointing this out and providing another view point.

dd

A cloning tool. AKA “data destroyer”.
To zero-fill: direct the output of the character file /dev/zero to the device you want zero-filled.

How?
Boot your machine from a live Linux disk that includes the dd programme. Most Linux distros will have dd included. I’ve done this using Knoppix as it loads reasonably fast.
From the shell terminal as root:

dd if=/dev/zero of=/dev/[device you want to wipe] bs=1M

/dev/zero, /dev/random and /dev/urandom are character special files. /dev/random and /dev/urandom are interfaces to the Linux kernel’s random number generator.

To find the device you want to wipe, run

fdisk

You’ll get something along these lines:

/dev/hda = primary master IDE
/dev/hdb = primary slave IDE
/dev/hdc = secondary master IDE
/dev/hdd = secondary slave IDE
/dev/sda = first SCSI hard drive
/dev/sdb = second SCSI hard drive

So for example if you want to zero your primary master:

sudo dd if=/dev/zero of=/dev/hda bs=1M

UBCD

AKA Ultimate Boot CD.
Once you’ve downloaded UBCD and have it written to your boot media and have your machine booted into it.
Press F2 to enter the Hard Drive tool section.
Press right arrow key to enter the diagnostic tools.
Select the most recent version of the diagnostic tool under the name of the manufacturer of your drive.

Applying patterns to the bits

A more effective approach to zero-filling, is to use bit flipping patterns in your wiping approach and perform multiple passes.

dd if=/dev/random of=/dev/[device you want to wipe] bs=1M

should be a little more effective.

Better still, run the following 3 – 7 times, as discussed here


dd if=/dev/random of=/dev/[device you want to wipe] && dd if=/dev/zero of=/dev/[device you want to wipe]

Wipe

I haven’t used this, but it looks good.

dban

Recommended by Stanford University’s Disk and Data Sanitisation Policy and Guidelines.
Stanford also lists a collection of other useful disk sanitisation tools.
Download the iso from https://sourceforge.net/projects/dban/
Burn the image to a CD / DVD or USB drive (using something like ISO to USB.
Set your BIOS to boot from which ever device has the ISO image.

Once dban loads you’ll be given options to proceed.
dban start options

I hit Enter to start in interactive mode.

In the next screen, you’ll be able to see that dban is using urandom as it’s Entropy. This must be /dev/urandom which will be used to set your bits on/off randomly rather than just zeroing or oneing (probably not a word ;-)).
This is considered a far better technique to make it forensically close to impossible to reconstruct the original contents of the disk.

NukeOptions

In this screen you can also select other options.
Method: allows you to use a selection of different techniques.
The current default is DoD Short.
Both DoD 5220.22-M Short and DoD 5220.22-M Standard are used by the American Department of Defense
DoD 5220.22-M Short performs 3 passes
DoD 5220.22-M Standard performs 7 passes

See here for the standards for data erasure

Once dban has performed the sanitisation, you’ll see a screen similar to the following with the details

FinishDetails

As always, feel free to offer corrections and comments on things I may have missed out that you think worth mentioning.

How to Increase Software Developer Productivity

March 2, 2013

Is your organisation:

  • Wanting to get more out of your Software Developers?
  • Wanting to increase RoI?
  • Spending too much money fixing bugs?
  • Development team not releasing business value fast enough?
  • Maybe your a software developer and you want to lift your game to the next level?

If any of these points are of concern to you… read on.

There are many things we can do to lift a software developers productivity and thus the total output of The Development Team. I’m going to address some quick and cheap wins, followed by items that may take a little longer to implement, but non the less, will in many cases provide even greater results.

What ever it takes to remove friction and empower your software developers to work with the least amount of interruptions, do it.
Allow them to create a space that they love working in. I know when I work from home my days are far more productive than when working for a company that insists on cramming as many workers around you into a small space as possible. Chitter chatter from behind, both sides and in front of you will not help one get their mind into a state of deep thought easily.

I have included thoughts from Nicholas C. Zakas post to re-iterate the common fallacies uttered by non-engineers.

  • I don’t understand why this is such a big deal. Isn’t it just a few lines of code? (Technically, everything is a few lines of code. That doesn’t make it easy or simple.)
  • {insert name here} says it can be done in a couple of days. (That’s because {insert name here} already has perfect knowledge of the solution. I don’t, I need to learn it first.)
  • What can we do to make this go faster? Do you need more engineers? (Throwing more engineers at a problem frequently makes it worse. The only way to get something built faster is to build a smaller thing.)

Screen real estate

When writing code, a software developers work requires a lot of time spent deep in thought. Holding multiple layers of complexity within immediately accessible memory.
One of the big wins I’ve found that helps with continuity, is maximising your screen real estate.
I’ve now moved up to 3 x 27″ 2560×1440 IPS flat panels. These are absolutely gorgeous to look at/work with.
Software development generally requires a large number of applications to be running at any one time.
For example in any average session for me, I generally have somewhere around 30 windows open.
The more screen real estate a developer has, the less he/she has to fossick around for what he/she needs and switch between them.
Also, the less brain cycles he/she has to spend locating that next running application, means the more cycles you have in order to do real work.
So, the less gap there is switching between say one code editor and another, the easier it is for a developer to keep the big picture in memory.
We’re looking at:

  1. physical screen size
  2. total pixel count

The greater real estate available (physical screen size and pixel count) the more information you can have instant access to, which means:

  • less waiting
  • less memory loss
  • less time spent rebuilding structures in your head
  • greater continuity

Which then gives your organisation and developers:

  • greater productivity
  • greater RoI

These screens are cheaper than many realise. I set these up 4 months ago. They continue to drop in price.

  1. FSM-270YG 27″ PC Monitor LED S-IPS WIDE 2560×1440 16:9 WQHD DVI-D $470.98 NZD
  2. [QH270-IPSMS] Achieva ShiMian HDMI DVI D-Sub 27″ LG LED 2560×1440 $565.05 NZD
  3. [QH270-IPSMS] Achieva ShiMian HDMI DVI D-Sub 27″ LG LED 2560×1440 $565.05 NZD

It’s just simply not worth not to upgrading to these types of panels.

korean monitors

In this setup, I’m running Linux Mint Maya. Besides the IPS panels, I’m using the following hardware.

  • Video card: 1 x Gigabyte GV-N650OC-2GI GTX 650 PCIE
  • PSU: 1200w Corsair AX1200 (Corsair AX means no more PSU troubles (7 yr warranty))
  • CPU: Intel Core i7 3820 3.60GHz (2011)
  • Mobo: Asus P9X79
  • HDD: 1TB Western Digital WD10EZEX Caviar Blue
  • RAM: Corsair 16GB (2x8GB) Vengeance Performance Memory Module DDR3 1600MHz

One of the ShiMian panels is using the VGA port on the video card as the FSM-270YG only supports DVI.
The other ShiMian and the FSM-270YG are hooked up to the 2 DVI-D (dual link) ports on the video card. The two panels feeding on the dual link are obviously a lot clearer than the panel feeding on the VGA. Also I can reduce the size of the text considerably giving me greater clarity while reading, while enabling me to fit a lot more information on the screens.

With this development box, I’m never left waiting for the machine to catchup with my thought process.
So don’t skimp on hardware. It just doesn’t make sense any way you look at it.

Machine Speed

The same goes for your machine speed. If you have to wait for your machine to do what you’ve commanded it to do and at the same time try and keep a complex application structure in your head, the likelihood of loosing part of that picture increases. Plus your brain has to work harder to hold the image in memory while your trying to maintain continuity of thought. Again using precious cycles for something that shouldn’t be required rather than on the essential work. When a developer looses part of this picture, they have to rebuild it again when the machine finishes executing the last command given. This is re-work that should not be necessary.

An interesting observation from Joel Spolsky:

“The longer it takes to task switch, the bigger the penalty you pay for multitasking.
OK, back to the more interesting topic of managing humans, not CPUs. The trick here is that when you manage programmers, specifically, task switches take a really, really, really long time. That’s because programming is the kind of task where you have to keep a lot of things in your head at once. The more things you remember at once, the more productive you are at programming. A programmer coding at full throttle is keeping zillions of things in their head at once: everything from names of variables, data structures, important APIs, the names of utility functions that they wrote and call a lot, even the name of the subdirectory where they store their source code. If you send that programmer to Crete for a three week vacation, they will forget it all. The human brain seems to move it out of short-term RAM and swaps it out onto a backup tape where it takes forever to retrieve.”

Many of my posts so far have been focused on productivity enhancements. Essentially increasing RoI. This list will continue to grow.

Coding Standards and Guidelines

Agreeing on a set of Coding Standards and Guidelines and policing them (generally by way of code reviews and check-in commit scripts) means software developers get to spend less time thinking about things that they don’t need to and get to throw more time at the real problems.

For example:

Better Tooling

Improving tool sets has huge gains in productivity. In most cases many of the best tools are free. Moving from the likes of non distributed source control systems to best of bread distributed.

There are many more that should be considered.

Wiki

Implementing an excellent Wiki that is easy to use. I’ve put a few wiki’s in place now and have used even more. My current pick of the bunch would have to be Atlassians Confluence. I’ve installed this on a local server and also migrated the instance to their cloud. There are varying plans and all very reasonably priced with excellent support. If the wiki you’re planning on using is not as intuitive as it could be, developers just wont use it. So don’t settle for anything less.

Improving Processes

Code Reviews

Also a very important step in all successful development teams and often a discipline that must be satisfied as part of Scrums Definition of Done (DoD). What this gives us is high quality designs and code, conforming to the coding standards. This reduces defects, duplicate code (DRY) and enforces easily readable code as the reviewer has to understand it. Saves a lot of money in re-work.

Cost of Change

Scott Amblers Cost of change curve

Definition of Done (DoD)

Get The Team together and decide on what it means to have each Product Backlog Item that’s pulled into the Sprint Done.
Here’s an example of a DoD that one of my previous Development Teams compiled:

Definition of Done

What does Done actually mean?

Come Sprint Review on the last day of the Sprint, everyone knows what it means to be done. There is no “well I thought it was Done because I’ve written the code for it, but it’s not tested yet”.

Continuous Integration (CI)

There are many tools and ways to implement CI. What does CI give you? Visibility of code quality, adherence to standards, reports on cyclomatic complexity, predictability and quite a number of other positive side effects. You’ll know as soon as the code fails to build and/or your fast running tests (unit tests) fail. This means The Development Team don’t keep writing code on top of faulty code, thus reducing technical debt by not having to undo changes on changes later down the track.
I’ve used a number of these tools and have carried out extensive research and evaluation spikes on a number of the most popular offerings. In order of preference, the following are my candidates.

  1. Jenkins (free and open source, with a great community)
  2. TeamCity
  3. Atlassian Bamboo

Release Plans

Make sure you have these. This will reduce confusion and provide a clear definition of the steps involved to get your software out the door. This will reduce the likelihood of screwing up a release and re-work being required. You’ll definitely need one of these for the next item.

Here’s an example of a release notes guideline I wrote for one of the previous companies I worked for.

release notes

Continuous Deployment

If using Scrum, The Scrum Team will be forecasting a potentially releasable Increment (the sum of all the Product Backlog items completed during a Sprint and all previous Sprints).
You may decide to actually release this. When you do, you can look at the possibility of automating this deployment. Thus reducing the workload of the release manager or who ever usually deploys (often The Development Team in a Scrum environment). This has the added benefit of consistency, predictability, reliability and of course happy customers. I’ve also been through this process of research and evaluation on the tools available and the techniques to implement.

Here’s a good podcast that got me started. I’ve got a collection of other resources if you need them and can offer you my experience in this process. Just leave a comment.

Implement Scrum (and not the Flaccid flavour)

I hope this goes without saying?
Implementing Scrum to provide ultimate visibility

Get maximum quality out of the least money spent

How to get the most out of your limited QA budget

Driving your designs with tests, thus creating maintainable code, thus reducing technical debt.

Hold Retrospectives

Scrum is big on continual inspection and adaption, self-organisation and fostering innovation. The military have another term for inspection and adaption. It’s called the OODA Loop.
The Retrospective is just one of the Scrum Events that enable The Scrum Team to continually inspect the way they are doing things and improve the way they develop and deliver business value.

Invest a little into your servant leaders

Empowering the servant leaders.

Context Switching

Don’t do it. This is a real killer.
This is hard. What you need to do is be aware of how much productivity is killed with each switch. Then do everything in your power to make sure your Development Team is sheltered from as much as possible. There are many ways to do this. For starters, you’re going to need as much visibility as possible into how much this is currently happening. track add-hock requests and any other types of interruptions that steel the developers concentration. In the last Scrum Team that I was Scrum Master of, The Development Team decided to include another metric to the burn down chart that was on the middle of the wall, clearly visible to all. Every time one of the developers was interrupted during a Sprint, they would record this time, the reason and who interrupted them, on the burn down chart. The Scrum Team would then address this during the Retrospective and empirically address why this happened and work out how to stop it happening every Sprint. Jeff Atwood has an informative post on why and how context-switching/multitasking kills productivity. Be sure to check it out.

As always, if anything I’ve mentioned isn’t completely clear, or you have any questions, please leave a comment 🙂

Establishing your SSH Server’s Key Fingerprint

February 16, 2013

When you connect to a remote host via SSH that you haven’t established a trust relationship with before,
you’re going to be told that the authenticity of the host your attempting to connect to can’t be established.

me@mybox ~ $ ssh me@10.1.1.40
The authenticity of host '10.1.1.40 (10.1.1.40)' can't be established.
RSA key fingerprint is 23:d9:43:34:9c:b3:23:da:94:cb:39:f8:6a:95:c6:bc.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no':

Do you type yes to continue without actually knowing that it is the host you think it is? Well, if you do, you should be more careful. The fingerprint that’s being put in front of you could be a Man In The Middle (MITM). You can query the target (from “it’s” shell of course) for the fingerprint of it’s key easily. On Debian you’ll find the keys in /etc/ssh/

On

ls /etc/ssh/

you should get a listing that reveals the private and public keys. Run the following command on the appropriate key to reveal it’s fingerprint. For example if SSH is using rsa:

ssh-keygen -lf ssh_host_rsa_key.pub

For example if SSH is using dsa:

ssh-keygen -lf ssh_host_dsa_key.pub

If you try the command on either the private or publick key you’ll be given the public key’s fingerprint, which is exactly what you need for verifying the authenticity from the client side.

Sometimes you may need to force the output of the fingerprint_hash algorithm as ssh-keygen may be displaying it in a different form than it’s shown when you try to SSH for the first time. The default when using ssh-keygen to show the key fingerprint is sha256, but in order to compare apples with apples you may need to specify md5 if that’s what’s being shown when you attempt to login. You would do that like the following:

ssh-keygen -lE md5 -f ssh_host_dsa_key.pub

Details on the man page for the options.

Do not connect remotely and then run the above command, as the machine you’re connected to is still untrusted. The command could be dishing you up any string replacement if it’s an attackers machine. You need to run the command on the physical box or get someone you trust (your network admin) to do this and hand you the fingerprint.

Now when you try to establish your SSH connection for the first time, you can check that the remote host is actually the host you think it is by comparing the output of one of the previous commands with what SSH on your client is telling you the remote hosts fingerprint is. If it’s different it’s time to start tracking down the origin of the host masquerading as the address your trying to hook up with.

Now, when you get the following message when attempting to SSH to your server, due to something or somebody changing the hosts key fingerprint:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
23:d9:43:34:9c:b3:23:da:94:cb:39:f8:6a:95:c6:bc.
Please contact your system administrator.
Add correct host key in /home/me/.ssh/known_hosts to get rid of this message.
Offending RSA key in /home/me/.ssh/known_hosts:6
  remove with: ssh-keygen -f "/home/me/.ssh/known_hosts" -R 10.1.1.40
RSA host key for 10.1.1.40 has changed and you have requested strict checking.
Host key verification failed.

The same applies. Check that the fingerprint is indeed the intended target hosts key fingerprint. If it is, run the specified command.

Painless git diff

February 2, 2013

I’ve been using Node.js quite a bit lately and decided it was time to start using Git for my projects.

I’m used to using Mercurial (Hg) for DVCS, but have only used it on Windows and a little on Linux via command line.

I was looking for a similar experience that Windows gave me for Hg (file explorer integration with tortoisehg), but for Linux. I had created a repository using tortoisehg. When I attempted to add files to the repository using tortoisehg or straight from the command line, I was getting a few errors. tortoisehg, nautilus integration is broken on my distro at the time of writing this too. So this encouraged me to invest a little more time in Git. I had done a bit of reading and listened to a few good podcasts on Git, so I felt it was a good time.
think like a git is also good for a read.

As I was creating repositories, dealing with remote repositories, cloning, setting up all the config files, adding, committing, pulling, pushing, viewing status and diffing. What I quickly came to realise, was that the Git commands were very extensive, made more sense to me than Hg, and there is a lot of good documentation around. In saying that, it’s been a while since I used hg from the command line and most of my work has been through the GUI tools.

One area I was struggling with was the diffing of files and directories on the command line. There are a couple of good ways to make this experience a lot more pleasurable.
I like using meld on Linux for my file and directory comparisons, so already had that installed.

git diff

Create a bash file in the /bin directory.
I called it git-meld, and it looks like the following:

#!/bin/bash
meld $2 $5

Turn the executable bit on, so it can be executed.

chmod git-meld +x

Now modify your ~/.gitconfig file

git config --global diff.external git-meld

To make sure your’ve added git-meld as the script that’ll run meld with the correct parameters:

cat .gitconfig

and you should see at least the following:

[diff]
external = git-meld

Now that should be all you need to get git to pop meld on diff.

git diff [options] <commit> <commit> [<path_to_file_to_compare>]

If you have a stack of files (rather than just one, as shown in my above example) that were changed between these commits, diff will pop each file open in meld. One at a time until you’ve finished with each one

meld

git difftool

git also comes with difftool. I found this really nice to use. There is no setting up for it. All you do is replace the diff command with difftool. Optionally you can specify the GUI diff tool you want to use, simply by appending -t [your_GUI_diff_tool] like this if you like using meld.

git difftool -t meld <commit> <commit> [<path_to_file_to_compare>]

If you do this without specifying the file you want to compare, you are prompted if you want to view each file, rather than how diff works by just opening every one.

Launchmeld

If you choose to leave the -t option out, difftool will give you the option of all the possible tools able to perform the diff (some of which may need installation).

multiple diff tools

So using difftool is a better diff IMHO. This is how git difftool behaves whether or not you set up ~/.gitconfig file with your prefered diff tool.

Sanitising User Input from Browser. part 2

November 16, 2012

Untrusted data (data entered by a user), should always be treated as though it contains attack code.
This data should not be sent anywhere without taking the necessary steps to detect and neutralise the malicious code.
With applications becoming more interconnected, attacks being buried in user input and decoded and/or executed by a downstream interpreter is becoming all the more common.
Input validation, that’s restricting user input to allow only certain white listed characters and restricting field lengths are only two forms of defence.
Any decent attacker can get around client side validation, so you need to employ defence in depth.
validation and escaping also needs to be performed on the server side.

Leveraging existing libraries

  1. Microsofts AntiXSS is not extensible,
    it doesn’t allow the user to define their own whitelist.
    It didn’t allow me to add behaviour to the routines.
    I want to know how many instances of HTML encoded values there were.
    There was certainly a lot of code in there, but I didn’t find it very useful.
  2. The OWASP encoding project (Reform)(as mentioned in part 1 of this series).
    This is quite a useful set of projects for different technologies.
  3. System.Net.WebUtility from the System.Web.dll.
    Now this did most of what I needed other than provide me with fine grained information of what had been tampered with.
    So I took it and extended it slightly.
    We hadn’t employed AOP at this stage and it wasn’t considered important enough to invest the time to do so.
    So it was a matter of copy past modify.

What’s the point in client side validation if the server has to do it again anyway?

Now there are arguments both ways for this.
My current take on this for the project in question was:
If you only have server side validation, the client side is less responsive and user friendly.
If you only have client side validation, it’s out of our control.
This also gives fuel to the argument of using JavaScript on the client and server side (with the likes of node.js).
So the same code can be used both sides without having to code the same validation in two different languages.
Personally I find writing validation code easier using JavaScript than C#.
This maybe just because I’ve been writing considerably more JavaScript than C# lately though.

The code

I drew a sequence diagram of how this should work, but it got lost in a move.
So I wasn’t keen on doing it again, as the code had already been done.
In saying that, the code has reasonably good documentation (I think).
Code is king, providing it has been written to be read.
If you notice any of the escaping isn’t quite making sense, it could be the blogging engine either doing what it’s meant to, or not doing what it’s meant to.
I’ve been over the code a few times, but I may have missed something.
Shout out if anything’s not clear.

First up, we’ll look at the custom exceptions as we’ll need those soon.

using System;

namespace Common.WcfHelpers.ErrorHandling.Exceptions
{
    public abstract class WcfException : Exception
    {
        /// <summary>
        /// In order to set the message for the client, set it here, or via the property directly in order to over ride default value.
        /// </summary>
        /// <param name="message">The message to be assigned to the Exception's Message.</param>
        /// <param name="innerException">The exception to be assigned to the Exception's InnerException.</param>
        /// <param name="messageForClient">The client friendly message. This parameter is optional, but should be set.</param>
        public WcfException(string message, Exception innerException = null, string messageForClient = null) : base(message, innerException)
        {
            MessageForClient = messageForClient;
        }

        /// <summary>
        /// This is the message that the service's client will see.
        /// Make sure it is set in the constructor. Or here.
        /// </summary>
	    public string MessageForClient
        {
            get { return string.IsNullOrEmpty(_messageForClient) ? "The MessageForClient property of WcfException was not set" : _messageForClient; }
            set { _messageForClient = value; }
        }
        private string _messageForClient;
    }
}

And the more specific SanitisationWcfException

using System;
using System.Configuration;

namespace Common.WcfHelpers.ErrorHandling.Exceptions
{
    /// <summary>
    /// Exception class that is used when the user input sanitisation fails, and the user needs to be informed.
    /// </summary>
    public class SanitisationWcfException : WcfException
    {
        private const string _defaultMessageForClient = "Answers were NOT saved. User input validation was unsuccessful.";
        public string UnsanitisedAnswer { get; private set; }

        /// <summary>
        /// In order to set the message for the client, set it here, or via the property directly in order to over ride default value.
        /// </summary>
        /// <param name="message">The message to be assigned to the Exception's Message.</param>
        /// <param name="innerException">The Exception to be assigned to the base class instance's inner exception. This parameter is optional.</param>
        /// <param name="messageForClient">The client friendly message. This parameter is optional, but should be set.</param>
        /// <param name="unsanitisedAnswer">The user input string before service side sanitisatioin is performed.</param>
        public SanitisationWcfException
        (
            string message,
            Exception innerException = null,
            string messageForClient = _defaultMessageForClient,
            string unsanitisedAnswer = null
        )
            : base(
                message,
                innerException,
                messageForClient + " If this continues to happen, please contact " + ConfigurationManager.AppSettings["SupportEmail"] + Environment.NewLine
                )
        {
            UnsanitisedAnswer = unsanitisedAnswer;
        }
    }
}

Now as we define whether our requirements are satisfied by way of executable requirements (unit tests(in their rawest form))
Lets write some executable specifications.

using NUnit.Framework;
using Common.Security.Sanitisation;

namespace Common.Security.Encoding.UnitTest
{
    [TestFixture]
    public class ExtensionsTest
    {

        private readonly string _inNeedOfEscaping = @"One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.";
        private readonly string _noNeedForEscaping = @"One x2F two amp three x27 four lt five quot six gt       .";

        [Test]
        public void SingleDecodeDoubleEncodedHtml_ShouldSingleDecodeDoubleEncodedHtml()
        {
            string doubleEncodedHtml = @"";               // between the ""'s we have a string of Html with double escaped values like &amp;#x27; user entered text &amp;#x2F.
            string singleEncodedHtmlShouldLookLike = @""; // between the ""'s we have a string of Html with single escaped values like ' user entered text &#x2F.
            // In the above, the bloging engine is escaping the sinlge escaped entity encoding, so all you'll see is the entity it self.
            // but it should look like the double encoded entity encodings without the first &amp->;


            string singleEncodedHtml = doubleEncodedHtml.SingleDecodeDoubleEncodedHtml();
            
            Assert.That(singleEncodedHtml, Is.EqualTo(singleEncodedHtmlShouldLookLike));
        }

        [Test]
        public void Extensions_CompliesWithWhitelist_ShouldNotComply()
        {
            Assert.That(_inNeedOfEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,]+$"), Is.False);
        }

        [Test]
        public void Extensions_CompliesWithWhitelist_ShouldComply()
        {
            Assert.That(_noNeedForEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,]+$"), Is.True);
            Assert.That(_inNeedOfEscaping.CompliesWithWhitelist(whiteList: @"^[\w\s\.,#/&'<"">]+$"), Is.True);
        }
    }
}

Now the code that satisfies the above executable specifications, and more.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

namespace Common.Security.Sanitisation
{
    /// <summary>
    /// Provides a series of extension methods that perform sanitisation.
    /// Escaping, unescaping, etc.
    /// Usually targeted at user input, to help defend against the likes of XSS and other injection attacks.
    /// </summary>
    public static class Extensions
    {

        private const int CharacterIndexNotFound = -1;

        /// <summary>
        /// Returns a new string in which all occurrences of a double escaped html character (that's an html entity immediatly prefixed with another html entity)
        /// in the current instance are replaced with the single escaped character.
        /// </summary>
        /// <param name="source">The target text used to strip one layer of Html entity encoding.</param>
        /// <returns>The singly escaped text.</returns>
        public static string SingleDecodeDoubleEncodedHtml(this string source)
        {
            return source.Replace("&amp;#x", "&#x");
        }
        /// <summary>
        /// Filter a text against a regular expression whitelist of specified characters.
        /// </summary>
        /// <param name="target">The text that is filtered using the whitelist.</param>
        /// <param name="alternativeTarget"></param>
        /// <param name="whiteList">Needs to be be assigned a valid whitelist, otherwise nothing gets through.</param>
        public static bool CompliesWithWhitelist(this string target, string alternativeTarget = "", string whiteList = "")
        {
            if (string.IsNullOrEmpty(target))
                target = alternativeTarget;
            
            return Regex.IsMatch(target, whiteList);
        }
        /// <summary>
        /// Takes a string and returns another with a single layer of Html entity encoding replaced with it's Html entity literals.
        /// </summary>
        /// <param name="encodedUserInput">The text to perform the opperation on.</param>
        /// <param name="numberOfEscapes">The number of Html entity encodings that were replaced.</param>
        /// <returns>The text that's had a single layer of Html entity encoding replaced with it's Html entity literals.</returns>
        public static string HtmlDecode(this string encodedUserInput, ref int numberOfEscapes)
        {
            const int NotFound = -1;

            if (string.IsNullOrEmpty(encodedUserInput))
                return string.Empty;

            StringWriter output = new StringWriter(CultureInfo.InvariantCulture);
            
            if (encodedUserInput.IndexOf('&') == NotFound)
            {
                output.Write(encodedUserInput);
            }
            else
            {
                int length = encodedUserInput.Length;
                for (int index1 = 0; index1 < length; ++index1)
                {
                    char ch1 = encodedUserInput[index1];
                    if (ch1 == 38)
                    {
                        int index2 = encodedUserInput.IndexOfAny(_htmlEntityEndingChars, index1 + 1);
                        if (index2 > 0 && encodedUserInput[index2] == 59)
                        {
                            string entity = encodedUserInput.Substring(index1 + 1, index2 - index1 - 1);
                            if (entity.Length > 1 && entity[0] == 35)
                            {
                                ushort result;
                                if (entity[1] == 120 || entity[1] == 88)
                                    ushort.TryParse(entity.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out result);
                                else
                                    ushort.TryParse(entity.Substring(1), NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite | NumberStyles.AllowLeadingSign, NumberFormatInfo.InvariantInfo, out result);
                                if (result != 0)
                                {
                                    ch1 = (char)result;
                                    numberOfEscapes++;
                                    index1 = index2;
                                }
                            }
                            else
                            {
                                index1 = index2;
                                char ch2 = HtmlEntities.Lookup(entity);
                                if ((int)ch2 != 0)
                                {
                                    ch1 = ch2;
                                    numberOfEscapes++;
                                }
                                else
                                {
                                    output.Write('&');
                                    output.Write(entity);
                                    output.Write(';');
                                    continue;
                                }
                            }
                        }
                    }
                    output.Write(ch1);
                }
            }
            string decodedHtml = output.ToString();
            output.Dispose();
            return decodedHtml;
        }
        /// <summary>
        /// Escapes all character entity references (double escaping where necessary).
        /// Why? The XmlTextReader that is setup in XmlDocument.LoadXml on the service considers the character entity references (&#xxxx;) to be the character they represent.
        /// All XML is converted to unicode on reading and any such entities are removed in favor of the unicode character they represent.
        /// </summary>
        /// <param name="unencodedUserInput">The string that needs to be escaped.</param>
        /// <param name="numberOfEscapes">The number of escapes applied.</param>
        /// <returns>The escaped text.</returns>
        public static unsafe string HtmlEncode(this string unencodedUserInput, ref int numberOfEscapes)
        {
            if (string.IsNullOrEmpty(unencodedUserInput))
                return string.Empty;

            StringWriter output = new StringWriter(CultureInfo.InvariantCulture);
            
            if (output == null)
                throw new ArgumentNullException("output");
            int num1 = IndexOfHtmlEncodingChars(unencodedUserInput);
            if (num1 == -1)
            {
                output.Write(unencodedUserInput);
            }
            else
            {
                int num2 = unencodedUserInput.Length - num1;
                fixed (char* chPtr1 = unencodedUserInput)
                {
                    char* chPtr2 = chPtr1;
                    while (num1-- > 0)
                        output.Write(*chPtr2++);
                    while (num2-- > 0)
                    {
                        char ch = *chPtr2++;
                        if (ch <= 62)
                        {
                            switch (ch)
                            {
                                case '"':
                                    output.Write(""");
                                    numberOfEscapes++;
                                    continue;
                                case '&':
                                    output.Write("&amp;");
                                    numberOfEscapes++;
                                    continue;
                                case '\'':
                                    output.Write("&amp;#x27;");
                                    numberOfEscapes = numberOfEscapes + 2;
                                    continue;
                                case '<':
                                    output.Write("<");
                                    numberOfEscapes++;
                                    continue;
                                case '>':
                                    output.Write(">");
                                    numberOfEscapes++;
                                    continue;
                                case '/':
                                    output.Write("&amp;#x2F;");
                                    numberOfEscapes = numberOfEscapes + 2;
                                    continue;
                                default:
                                    output.Write(ch);
                                    continue;
                            }
                        }
                        if (ch >= 160 && ch < 256)
                        {
                            output.Write("&#");
                            output.Write(((int)ch).ToString(NumberFormatInfo.InvariantInfo));
                            output.Write(';');
                            numberOfEscapes++;
                        }
                        else
                            output.Write(ch);
                    }
                }
            }
            string encodedHtml = output.ToString();
            output.Dispose();
            return encodedHtml;
        }

 

        private static unsafe int IndexOfHtmlEncodingChars(string searchString)
        {
            int num = searchString.Length;
            fixed (char* chPtr1 = searchString)
            {
                char* chPtr2 = (char*)((UIntPtr)chPtr1);
                for (; num > 0; --num)
                {
                    char ch = *chPtr2;
                    if (ch <= 62)
                    {
                        switch (ch)
                        {
                            case '"':
                            case '&':
                            case '\'':
                            case '<':
                            case '>':
                            case '/':
                                return searchString.Length - num;
                        }
                    }
                    else if (ch >= 160 && ch < 256)
                        return searchString.Length - num;
                    ++chPtr2;
                }
            }
            return CharacterIndexNotFound;
        }

        private static char[] _htmlEntityEndingChars = new char[2]
        {
            ';',
            '&'
        };
        private static class HtmlEntities
        {
            private static string[] _entitiesList = new string[253]
            {
                "\"-quot",
                "&-amp",
                "'-apos",
                "<-lt",
                ">-gt",
                " -nbsp",
                "¡-iexcl",
                "¢-cent",
                "£-pound",
                "¤-curren",
                "¥-yen",
                "¦-brvbar",
                "§-sect",
                "¨-uml",
                "©-copy",
                "ª-ordf",
                "«-laquo",
                "¬-not",
                "\x00AD-shy",
                "®-reg",
                "¯-macr",
                "°-deg",
                "±-plusmn",
                "\x00B2-sup2",
                "\x00B3-sup3",
                "´-acute",
                "µ-micro",
                "¶-para",
                "·-middot",
                "¸-cedil",
                "\x00B9-sup1",
                "º-ordm",
                "»-raquo",
                "\x00BC-frac14",
                "\x00BD-frac12",
                "\x00BE-frac34",
                "¿-iquest",
                "À-Agrave",
                "Á-Aacute",
                "Â-Acirc",
                "Ã-Atilde",
                "Ä-Auml",
                "Å-Aring",
                "Æ-AElig",
                "Ç-Ccedil",
                "È-Egrave",
                "É-Eacute",
                "Ê-Ecirc",
                "Ë-Euml",
                "Ì-Igrave",
                "Í-Iacute",
                "Î-Icirc",
                "Ï-Iuml",
                "Ð-ETH",
                "Ñ-Ntilde",
                "Ò-Ograve",
                "Ó-Oacute",
                "Ô-Ocirc",
                "Õ-Otilde",
                "Ö-Ouml",
                "×-times",
                "Ø-Oslash",
                "Ù-Ugrave",
                "Ú-Uacute",
                "Û-Ucirc",
                "Ü-Uuml",
                "Ý-Yacute",
                "Þ-THORN",
                "ß-szlig",
                "à-agrave",
                "á-aacute",
                "â-acirc",
                "ã-atilde",
                "ä-auml",
                "å-aring",
                "æ-aelig",
                "ç-ccedil",
                "è-egrave",
                "é-eacute",
                "ê-ecirc",
                "ë-euml",
                "ì-igrave",
                "í-iacute",
                "î-icirc",
                "ï-iuml",
                "ð-eth",
                "ñ-ntilde",
                "ò-ograve",
                "ó-oacute",
                "ô-ocirc",
                "õ-otilde",
                "ö-ouml",
                "÷-divide",
                "ø-oslash",
                "ù-ugrave",
                "ú-uacute",
                "û-ucirc",
                "ü-uuml",
                "ý-yacute",
                "þ-thorn",
                "ÿ-yuml",
                "Œ-OElig",
                "œ-oelig",
                "Š-Scaron",
                "š-scaron",
                "Ÿ-Yuml",
                "ƒ-fnof",
                "\x02C6-circ",
                "˜-tilde",
                "Α-Alpha",
                "Β-Beta",
                "Γ-Gamma",
                "Δ-Delta",
                "Ε-Epsilon",
                "Ζ-Zeta",
                "Η-Eta",
                "Θ-Theta",
                "Ι-Iota",
                "Κ-Kappa",
                "Λ-Lambda",
                "Μ-Mu",
                "Ν-Nu",
                "Ξ-Xi",
                "Ο-Omicron",
                "Π-Pi",
                "Ρ-Rho",
                "Σ-Sigma",
                "Τ-Tau",
                "Υ-Upsilon",
                "Φ-Phi",
                "Χ-Chi",
                "Ψ-Psi",
                "Ω-Omega",
                "α-alpha",
                "β-beta",
                "γ-gamma",
                "δ-delta",
                "ε-epsilon",
                "ζ-zeta",
                "η-eta",
                "θ-theta",
                "ι-iota",
                "κ-kappa",
                "λ-lambda",
                "μ-mu",
                "ν-nu",
                "ξ-xi",
                "ο-omicron",
                "π-pi",
                "ρ-rho",
                "ς-sigmaf",
                "σ-sigma",
                "τ-tau",
                "υ-upsilon",
                "φ-phi",
                "χ-chi",
                "ψ-psi",
                "ω-omega",
                "ϑ-thetasym",
                "ϒ-upsih",
                "ϖ-piv",
                " -ensp",
                " -emsp",
                " -thinsp",
                "\x200C-zwnj",
                "\x200D-zwj",
                "\x200E-lrm",
                "\x200F-rlm",
                "–-ndash",
                "—-mdash",
                "‘-lsquo",
                "’-rsquo",
                "‚-sbquo",
                "“-ldquo",
                "”-rdquo",
                "„-bdquo",
                "†-dagger",
                "‡-Dagger",
                "•-bull",
                "…-hellip",
                "‰-permil",
                "′-prime",
                "″-Prime",
                "‹-lsaquo",
                "›-rsaquo",
                "‾-oline",
                "⁄-frasl",
                "€-euro",
                "ℑ-image",
                "℘-weierp",
                "ℜ-real",
                "™-trade",
                "ℵ-alefsym",
                "←-larr",
                "↑-uarr",
                "→-rarr",
                "↓-darr",
                "↔-harr",
                "↵-crarr",
                "⇐-lArr",
                "⇑-uArr",
                "⇒-rArr",
                "⇓-dArr",
                "⇔-hArr",
                "∀-forall",
                "∂-part",
                "∃-exist",
                "∅-empty",
                "∇-nabla",
                "∈-isin",
                "∉-notin",
                "∋-ni",
                "∏-prod",
                "∑-sum",
                "−-minus",
                "∗-lowast",
                "√-radic",
                "∝-prop",
                "∞-infin",
                "∠-ang",
                "∧-and",
                "∨-or",
                "∩-cap",
                "∪-cup",
                "∫-int",
                "∴-there4",
                "∼-sim",
                "≅-cong",
                "≈-asymp",
                "≠-ne",
                "≡-equiv",
                "≤-le",
                "≥-ge",
                "⊂-sub",
                "⊃-sup",
                "⊄-nsub",
                "⊆-sube",
                "⊇-supe",
                "⊕-oplus",
                "⊗-otimes",
                "⊥-perp",
                "⋅-sdot",
                "⌈-lceil",
                "⌉-rceil",
                "⌊-lfloor",
                "⌋-rfloor",
                "〈-lang",
                "〉-rang",
                "◊-loz",
                "♠-spades",
                "♣-clubs",
                "♥-hearts",
                "♦-diams"
            };
            private static Dictionary<string, char> _lookupTable = GenerateLookupTable();

            private static Dictionary<string, char> GenerateLookupTable()
            {
                Dictionary<string, char> dictionary = new Dictionary<string, char>(StringComparer.Ordinal);
                foreach (string str in _entitiesList)
                    dictionary.Add(str.Substring(2), str[0]);
                return dictionary;
            }

            public static char Lookup(string entity)
            {
                char ch;
                _lookupTable.TryGetValue(entity, out ch);
                return ch;
            }
        }
    }
}

You may also notice that I’ve mocked the OperationContext.
Thanks to WCFMock, a mocking framework for WCF services.
I won’t include this code, but you can get it here.
I’ve used the popular NUnit test framework and RhinoMocks for the stubbing and mocking.
Both pulled into the solution using NuGet.
Most useful documentation for RhinoMocks:
http://ayende.com/Wiki/Rhino+Mocks+3.5.ashx
http://ayende.com/wiki/Rhino+Mocks.ashx

For this project I used NLog and wrapped it.
Now you start to get an idea of how to use the sanitisation.

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using NUnit.Framework;
using System.Configuration;
using Rhino.Mocks;
using Common.Wrapper.Log;
using MockedOperationContext = System.ServiceModel.Web.MockedOperationContext;
using Common.WcfHelpers.ErrorHandling.Exceptions;

namespace Sanitisation.UnitTest
{
    [TestFixture]
    public class SanitiseTest
    {
        private const string _myTestIpv4Address = "My.Test.Ipv4.Address";
        private readonly int _maxLengthHtmlEncodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlEncodedUserInput"]);
        private readonly int _maxLengthHtmlDecodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlDecodedUserInput"]);
        private readonly string _encodedUserInput_thatsMaxDecodedLength = @"One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.
One #x2F &amp;#x2F; two amp &amp; three #x27 &amp;#x27; four lt < five quot " six gt >.";
        private readonly string _decodedUserInput_thatsMaxLength = @"One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.
One #x2F / two amp & three #x27 ' four lt < five quot "" six gt >.";

        [Test]
        public void Sanitise_UserInput_WhenGivenNull_ShouldReturnEmptyString()
        {
            Assert.That(new Sanitise().UserInput(null), Is.EqualTo(string.Empty));
        }

        [Test]
        public void Sanitise_UserInput_WhenGivenEmptyString_ShouldReturnEmptyString()
        {
            Assert.That(new Sanitise().UserInput(string.Empty), Is.EqualTo(string.Empty));
        }

        [Test]
        public void Sanitise_UserInput_WhenGivenSanitisedString_ShouldReturnSanitisedString()
        {
            // Open the whitelist up in order to test the encoding without restriction.
            Assert.That(new Sanitise(whiteList: @"^[\w\s\.,#/&'<"">]+$").UserInput(_encodedUserInput_thatsMaxDecodedLength), Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength));
        }
        [Test]
        [ExpectedException(typeof(SanitisationWcfException))]
        public void Sanitise_UserInput_ShouldThrowExceptionIfEscapedInputToLong()
        {
            string fourThousandAndOneCharacters = "Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand characters. Four thousand character";
            string expectedError = "The un-modified string received from the client with the following IP address: " +
                   '"' + _myTestIpv4Address + "\" " +
                   "exceeded the allowed maximum length of an escaped Html user input string. " +
                   "The maximum length allowed is: " +
                   _maxLengthHtmlEncodedUserInput +
                   ". The length was: " +
                   (_maxLengthHtmlEncodedUserInput+1) + ".";

            using(new MockedOperationContext(StubbedOperationContext))
            {
                try
                {
                    new Sanitise().UserInput(fourThousandAndOneCharacters);
                }
                catch(SanitisationWcfException e)
                {
                    Assert.That(e.Message, Is.EqualTo(expectedError));
                    Assert.That(e.UnsanitisedAnswer, Is.EqualTo(fourThousandAndOneCharacters));
                    throw;
                }
            }
        }
        [Test]
        [ExpectedException(typeof(SanitisationWcfException))]
        public void Sanitise_UserInput_DecodedUserInputShouldThrowException_WhenMaxLengthHtmlDecodedUserInputIsExceeded()
        {
            char oneCharOverTheLimit = '.';
            string expectedError =
                           "The string received from the client with the following IP address: " +
                           "\"" + _myTestIpv4Address + "\" " +
                           "after Html decoding exceded the allowed maximum length of an un-escaped Html user input string." +
                           Environment.NewLine +
                           "The maximum length allowed is: " + _maxLengthHtmlDecodedUserInput + ". The length was: " +
                           (_decodedUserInput_thatsMaxLength + oneCharOverTheLimit).Length + oneCharOverTheLimit;

            using(new MockedOperationContext(StubbedOperationContext))
            {
                try
                {
                    new Sanitise().UserInput(_encodedUserInput_thatsMaxDecodedLength + oneCharOverTheLimit);
                }
                catch(SanitisationWcfException e)
                {
                    Assert.That(e.Message, Is.EqualTo(expectedError));
                    Assert.That(e.UnsanitisedAnswer, Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength + oneCharOverTheLimit));
                    throw;
                }
            }
        }
        [Test]
        public void Sanitise_UserInput_ShouldLogAndSendEmail_IfNumberOfDecodedHtmlEntitiesDoesNotMatchNumberOfEscapes()
        {
            string encodedUserInput_with6HtmlEntitiesNotEscaped = _encodedUserInput_thatsMaxDecodedLength.Replace("&amp;#x2F;", "/");
            string errorWeAreExpecting =
                "It appears as if someone has circumvented the client side Html entity encoding." + Environment.NewLine +
                "The requesting IP address was: " +
                "\"" + _myTestIpv4Address + "\" " +
                "The sanitised input we receive from the client was the following:" + Environment.NewLine +
                "\"" + encodedUserInput_with6HtmlEntitiesNotEscaped + "\"" + Environment.NewLine +
                "The same input after decoding and re-escaping on the server side was the following:" + Environment.NewLine +
                "\"" + _encodedUserInput_thatsMaxDecodedLength + "\"";
            string sanitised;
            // setup _logger
            ILogger logger = MockRepository.GenerateMock<ILogger>();
            logger.Expect(lgr => lgr.logError(errorWeAreExpecting));

            Sanitise sanitise = new Sanitise(@"^[\w\s\.,#/&'<"">]+$", logger);

            using (new MockedOperationContext(StubbedOperationContext))
            {
                // Open the whitelist up in order to test the encoding etc.
                sanitised = sanitise.UserInput(encodedUserInput_with6HtmlEntitiesNotEscaped);
            }

            Assert.That(sanitised, Is.EqualTo(_encodedUserInput_thatsMaxDecodedLength));
            logger.VerifyAllExpectations();
        }        

        private static IOperationContext StubbedOperationContext
        {
            get
            {
                IOperationContext operationContext = MockRepository.GenerateStub<IOperationContext>();
                int port = 80;
                RemoteEndpointMessageProperty remoteEndpointMessageProperty = new RemoteEndpointMessageProperty(_myTestIpv4Address, port);
                operationContext.Stub(oc => oc.IncomingMessageProperties[RemoteEndpointMessageProperty.Name]).Return(remoteEndpointMessageProperty);
                return operationContext;
            }
        }
    }
}

Now the API code that we can use to do our sanitisation.

using System;
using System.Configuration;
// Todo : KC We need time to implement DI. Should be using something like ninject.extensions.wcf.
using OperationContext = System.ServiceModel.Web.MockedOperationContext;
using System.ServiceModel.Channels;
using Common.Security.Sanitisation;
using Common.WcfHelpers.ErrorHandling.Exceptions;
using Common.Wrapper.Log;

namespace Sanitisation
{

    public class Sanitise
    {
        private readonly string _whiteList;
        private readonly ILogger _logger;
        

        private string RequestingIpAddress
        {
            get
            {
                RemoteEndpointMessageProperty remoteEndpointMessageProperty = OperationContext.Current.IncomingMessageProperties[RemoteEndpointMessageProperty.Name] as RemoteEndpointMessageProperty;
                return ((remoteEndpointMessageProperty != null) ? remoteEndpointMessageProperty.Address : string.Empty);
            }
        }
        /// <summary>
        /// Provides server side escaping of Html entities, and runs the supplied whitelist character filter over the user input string.
        /// </summary>
        /// <param name="whiteList">Should be provided by DI from the ResourceFile.</param>
        /// <param name="logger">Should be provided by DI. Needs to be an asynchronous logger.</param>
        /// <example>
        /// The whitelist can be obtained from a ResourceFile like so...
        /// <code>
        /// private Resource _resource;
        /// _resource.GetString("WhiteList");
        /// </code>
        /// </example>
        public Sanitise(string whiteList = "", ILogger logger = null)
        {
            _whiteList = whiteList;
            _logger = logger ?? new Logger();
        }
        /// <summary>
        /// 1) Check field lengths.         Client side validation may have been negated.
        /// 2) Check against white list.	Client side validation may have been negated.
        /// 3) Check Html escaping.         Client side validation may have been negated.

        /// Generic Fail actions:	Drop the payload. No point in trying to massage and save, as it won't be what the user was expecting,
        ///                         Add full error to a WCFException Message and throw.
        ///                         WCF interception reads the WCFException.MessageForClient, and sends it to the user. 
        ///                         On return, log the WCFException's Message.
        ///                         
        /// Escape Fail actions:	Asynchronously Log and email full error to support.


        /// 1) BA confirmed 50 for text, and 400 for textarea.
        ///     As we don't know the field type, we'll have to go for 400."
        ///
        ///     First we need to check that we haven't been sent some huge string.
        ///     So we check that the string isn't longer than 400 * 10 = 4000.
        ///     10 is the length of our double escaped character references.
        ///     Or, we ask the business for a number."
        ///     If we fail here, perform Generic Fail actions and don't complete the following steps.
        /// 
        ///     Convert all Html Entity Encodings back to their equivalent characters, and count how many occurrences.
        ///
        ///     If the string is longer than 400, perform Generic Fail actions and don't complete the following steps.
        /// 
        /// 2) check all characters against the white list
        ///     If any don't match, perform Generic Fail actions and don't complete the following steps.
        /// 
        /// 3) re html escape (as we did in JavaScript), and count how many escapes.
        ///     If count is greater than the count of Html Entity Encodings back to their equivalent characters,
        ///     Perform Escape Fail actions. Return sanitised string.
        /// 
        ///     If we haven't returned, return sanitised string.
        
        
        /// Performs checking on the text passed in, to verify that client side escaping and whitelist validation has already been performed.
        /// Performs decoding, and re-encodes. Counts that the number of escapes was the same, otherwise we log and send email with the details to support.
        /// Throws exception if the client side validation failed to restrict the number of characters in the escaped string we received.
        ///     This needs to be intercepted at the service.
        ///     The exceptions default message for client needs to be passed back to the user.
        ///     On return, the interception needs to log the exception's message.
        /// </summary>
        /// <param name="sanitiseMe"></param>
        /// <returns></returns>
        public string UserInput(string sanitiseMe)
        {
            if (string.IsNullOrEmpty(sanitiseMe))
                return string.Empty;

            ThrowExceptionIfEscapedInputToLong(sanitiseMe);

            int numberOfDecodedHtmlEntities = 0;
            string decodedUserInput = HtmlDecodeUserInput(sanitiseMe, ref numberOfDecodedHtmlEntities);

            if(!decodedUserInput.CompliesWithWhitelist(whiteList: _whiteList))
            {
                string error = "The answer received from client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "had characters that failed to match the whitelist.";
                throw new SanitisationWcfException(error);
            }

            int numberOfEscapes = 0;
            string sanitisedUserInput = decodedUserInput.HtmlEncode(ref numberOfEscapes);

            if(numberOfEscapes != numberOfDecodedHtmlEntities)
            {
                AsyncLogAndEmail(sanitiseMe, sanitisedUserInput);
            }

            return sanitisedUserInput;
        }
        /// <note>
        /// Make sure the logger is setup to log asynchronously
        /// </note>
        private void AsyncLogAndEmail(string sanitiseMe, string sanitisedUserInput)
        {
            // no need for SanitisationException

            _logger.logError(
                "It appears as if someone has circumvented the client side Html entity encoding." + Environment.NewLine +
                "The requesting IP address was: " +
                "\"" + RequestingIpAddress + "\" " +
                "The sanitised input we receive from the client was the following:" + Environment.NewLine +
                "\"" + sanitiseMe + "\"" + Environment.NewLine +
                "The same input after decoding and re-escaping on the server side was the following:" + Environment.NewLine +
                "\"" + sanitisedUserInput + "\""
                );
        }

        /// <summary>
        /// This procedure may throw a SanitisationWcfException.
        /// If it does, ErrorHandlerBehaviorAttribute will need to pass the "messageForClient" back to the client from within the IErrorHandler.ProvideFault procedure.
        /// Once execution is returned, the IErrorHandler.HandleError procedure of ErrorHandlerBehaviorAttribute
        /// will continue to process the exception that was thrown in the way of logging sensitive info.
        /// </summary>
        /// <param name="toSanitise"></param>
        private void ThrowExceptionIfEscapedInputToLong(string toSanitise)
        {
            int maxLengthHtmlEncodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlEncodedUserInput"]);
            if (toSanitise.Length > maxLengthHtmlEncodedUserInput)
            {
                string error = "The un-modified string received from the client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "exceeded the allowed maximum length of an escaped Html user input string. " +
                    "The maximum length allowed is: " +
                    maxLengthHtmlEncodedUserInput +
                    ". The length was: " +
                    toSanitise.Length + ".";
                throw new SanitisationWcfException(error, unsanitisedAnswer: toSanitise);
            }
        }

        private string HtmlDecodeUserInput(string doubleEncodedUserInput, ref int numberOfDecodedHtmlEntities)
        {
            string decodedUserInput = doubleEncodedUserInput.HtmlDecode(ref numberOfDecodedHtmlEntities).HtmlDecode(ref numberOfDecodedHtmlEntities) ?? string.Empty;
            
            // if the decoded string is longer than MaxLengthHtmlDecodedUserInput throw
            int maxLengthHtmlDecodedUserInput = int.Parse(ConfigurationManager.AppSettings["MaxLengthHtmlDecodedUserInput"]);
            if(decodedUserInput.Length > maxLengthHtmlDecodedUserInput)
            {
                throw new SanitisationWcfException(
                    "The string received from the client with the following IP address: " +
                    "\"" + RequestingIpAddress + "\" " +
                    "after Html decoding exceded the allowed maximum length of an un-escaped Html user input string." +
                    Environment.NewLine +
                    "The maximum length allowed is: " + maxLengthHtmlDecodedUserInput + ". The length was: " +
                    decodedUserInput.Length + ".",
                    unsanitisedAnswer: doubleEncodedUserInput
                    );
            }
            return decodedUserInput;
        }
    }
}

As you can see, there’s a lot more work in the server side sanitisation than the client side.

Sanitising User Input from Browser. part 1

November 4, 2012

I was working on a web based project recently where there was no security thought about when designing, developing it.
The following outlines my experience with retrofitting security.
It’s my hope that someone will find it useful for their own implementation.

We’ll be focussing on the client side in this post (part 1) and the server side in part 2.
We’ll also cover some preliminary discussion that will set the stage for this series.

The application consists of a WCF service delivering up content to some embedding code on any page in the browser.
The content is stored as Xml in the database and transformed into Html via Xslt.

The first activity I find useful is to go through the process of Threat Modelling the Application.
This process can be quite daunting for those new to it.
Here’s a couple of references I find quite useful to get started:

https://www.owasp.org/index.php/Application_Threat_Modeling

https://www.owasp.org/index.php/Threat_Risk_Modeling#Decompose_Application

Actually this ones not bad either.

There is no single right way to do this.
The more you read and experiment, the more equipped you will be.
The idea is to think like an attacker thinks.
This may be harder for some than others, but it is essential, to cover as many potential attack vectors as possible.
Remember, there is no secure system, just varying levels of insecurity.
It will always be much harder to discover the majority of security weaknesses in your application as the person or team creating/maintaining it,
than for the person attacking it.
The Threat Modelling topic is large and I’m not going to go into it here, other than to say, you need to go into it.

Threat Agents

Work out who your Threat Agents are likely to be.
Learn how to think like they do.
Learn what skills they have and learn the skills your self.
Sometimes the skills are very non technical.
For example walking through the door of your organisation in the weekend because the cleaners (or any one with access) forgot to lock up.
Or when the cleaners are there and the technical staff are not (which is just as easy).
It happens more often than we like to believe.

Defense in Depth

To attempt to mitigate attacks, we need to take a multi layered approach (often called defence in depth).

What made me decide to start with sanitising user input from the browser anyway?
Well according to the OWASP Top 10, Injection and Cross Site Scripting (XSS) are still the most popular techniques chosen to compromise web applications.
So it makes sense if your dealing with web apps, to target the most common techniques exploited.

Now, in regards to defence in depth when discussing web applications;
If the attacker gets past the first line of defence, there should be something stopping them at the next layer and so forth.
The aim is to stop the attack as soon as possible.
This is why we focus on the UI first, and later move our focus to the application server, then to the database.
Bear in mind though, that what ever we do on the client side, can be circumvented relatively easy.
Client side code is out of our control, so it’s best effort.
Because of this, we need to perform the following not only in the browser, but as much as possible on the server side as well.

  1. Minimising the attack surface
  2. Defining maximum field lengths (validation)
  3. Determining a white list of allowable characters (validation)
  4. Escaping untrusted data, especially where you know it’s going to endup in an execution context. Even where you don’t think this is likely, it’s still possible.
  5. Using Stored Procedures / parameterised queries (not covered in this series).
  6. Least Privilege.
    Minimising the privileges assigned to every database account (not covered in this series).

Minimising the attack surface

input fields should only allow certain characters to be input.
Text input fields, textareas etc that are free form (anything is allowed) are very hard to constrain to a small white list.
input fields where ever possible should be constrained to well structured data,
like dates, social security numbers, zip codes, e-mail addresses, etc. then the developer should be able to define a very strong validation pattern, usually based on regular expressions, for validating such input. If the input field comes from a fixed set of options, like a drop down list or radio buttons, then the input needs to match exactly one of the values offered to the user in the first place.
As it was with the existing app I was working on, we had to allow just about everything in our free form text fields.
This will have to be re-designed in order to provide constrained input.

Defining maximum field lengths (validation)

This was currently being done (sometimes) in the Xml content for inputs where type="text".
Don’t worry about the inputType="single", it gets transformed.

<input id="2" inputType="single" type="text" size="10" maxlength="10" />

And if no maxlength specified in the Xml, we now specify a default of 50 in the xsl used to do the transformation.
This way we had the input where type="text" covered for the client side.
This would also have to be validated on the server side when the service received values from these inputs where type="text".

    <xsl:template match="input[@inputType='single']">
      <xsl:value-of select="@text" />
        <input name="a{@id}" type="text" id="a{@id}" class="textareaSingle">
          <xsl:attribute name="value">
            <xsl:choose>
              <xsl:when test="key('response', @id)">
                <xsl:value-of select="key('response', @id)" />
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of select="string(' ')" />
              </xsl:otherwise>
            </xsl:choose>
          </xsl:attribute>
          <xsl:attribute name="maxlength">
            <xsl:choose>
              <xsl:when test="@maxlength">
                <xsl:value-of select="@maxlength"/>
              </xsl:when>
              <xsl:otherwise>50</xsl:otherwise>
            </xsl:choose>
          </xsl:attribute>
        </input>
        <br/>
    </xsl:template>

For textareas we added maxlength validation as part of the white list validation.
See below for details.

Determining a white list of allowable characters (validation)

See bottom of this section for Update

Now this was quite an interesting exercise.
I needed to apply a white list to all characters being entered into the input fields.
A user can:

  1. type the characters in
  2. [ctrl]+[v] a clipboard full of characters in
  3. right click -> Paste

To cover all these scenarios as elegantly as possible, was going to be a bit of a challenge.
I looked at a few JavaScript libraries including one or two JQuery plug-ins.
None of them covered all these scenarios effectively.
I wish they did, because the solution I wasn’t totally happy with, because it required polling.
In saying that, I measured performance, and even bringing the interval right down had negligible effect, and it covered all scenarios.

setupUserInputValidation = function () {

  var textAreaMaxLength = 400;
  var elementsToValidate;
  var whiteList = /[^A-Za-z_0-9\s.,]/g;

  var elementValue = {
    textarea: '',
    textareaChanged: function (obj) {
      var initialValue = obj.value;
      var replacedValue = initialValue.replace(whiteList, "").slice(0, textAreaMaxLength);
      if (replacedValue !== initialValue) {
        this.textarea = replacedValue;
        return true;
      }
      return false;
    },
    inputtext: '',
    inputtextChanged: function (obj) {
      var initialValue = obj.value;
      var replacedValue = initialValue.replace(whiteList, "");
      if (replacedValue !== initialValue) {
        this.inputtext = replacedValue;
        return true;
      }
      return false;
    }
  };

  elementsToValidate = {
    textareainputelements: (function () {
      var elements = $('#page' + currentPage).find('textarea');
      if (elements.length > 0) {
        return elements;
      }
      return 'no elements found';
    } ()),
    textInputElements: (function () {
      var elements = $('#page' + currentPage).find('input[type=text]');
      if (elements.length > 0) {
        return elements;
      }
      return 'no elements found';
    } ())
  };

  // store the intervals id in outer scope so we can clear the interval when we change pages.
  userInputValidationIntervalId = setInterval(function () {
    var element;

    // Iterate through each one and remove any characters not in the whitelist.
    // Iterate through each one and trim any that are longer than textAreaMaxLength.

    for (element in elementsToValidate) {
      if (elementsToValidate.hasOwnProperty(element)) {
        if (elementsToValidate[element] === 'no elements found')
          continue;

        $.each(elementsToValidate[element], function () {
          $(this).attr('value', function () {
            var name = $(this).prop('tagName').toLowerCase();
            name = name === 'input' ? name + $(this).prop('type') : name;
            if (elementValue[name + 'Changed'](this))
              this.value = elementValue[name];
          });
        });
      }
    }
  }, 300); // milliseconds
};

Each time we change page, we clear the interval and reset it for the new page.

clearInterval(userInputValidationIntervalId);

setupUserInputValidation();

Update 2013-06-02:

Now with HTML5 we have the pattern attribute on the input tag, which allows us to specify a regular expression that the text about to be received is checked against. We can also see it here amongst the new HTML5 attributes . If used, this can make our JavaScript white listing redundant, providing we don’t have textareas which W3C has neglected to include the new pattern attribute on. I’d love to know why?

Escaping untrusted data

Escaped data will still render in the browser properly.
Escaping simply lets the interpreter know that the data is not intended to be executed,
and thus prevents the attack.

Now what we do here is extend the String prototype with a function called htmlEscape.

if (typeof Function.prototype.method !== "function") {
  Function.prototype.method = function (name, func) {
    this.prototype[name] = func;
    return this;
  };
}

String.method('htmlEscape', function () {

  // Escape the following characters with HTML entity encoding to prevent switching into any execution context,
  // such as script, style, or event handlers.
  // Using hex entities is recommended in the spec.
  // In addition to the 5 characters significant in XML (&, <, >, ", '), the forward slash is included as it helps to end an HTML entity.
  var character = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    // Double escape character entity references.
    // Why?
    // The XmlTextReader that is setup in XmlDocument.LoadXml on the service considers the character entity references () to be the character they represent.
    // All XML is converted to unicode on reading and any such entities are removed in favor of the unicode character they represent.
    // So we double escape character entity references.
    // These now get read to the XmlDocument and saved to the database as double encoded Html entities.
    // Now when these values are pulled from the database and sent to the browser, it decodes the & and displays #x27; and/or #x2F.
    // This isn't what we want to see in the browser.
    "'": '&amp;#x27;',    // &apos; is not recommended
    '/': '&amp;#x2F;'     // forward slash is included as it helps end an HTML entity
  };

  return function () {
    return this.replace(/[&<>"'/]/g, function (c) {
      return character[c];
    });
  };
}());

This allows us to, well, html escape our strings.

element.value.htmlEscape();

In looking through here,
The only untrusted data we are capturing is going to be inserted into an Html element

tag by way of insertion into a textarea element,
or the attribute value of input elements where type="text".
I initially thought I’d have to:

  1. Html escape the untrusted data which is only being captured from textarea elements.
  2. Attribute escape the untrusted data which is being captured from the value attribute of input elements where type="text".

RULE #2 – Attribute Escape Before Inserting Untrusted Data into HTML Common Attributes of here,
mentions
“Properly quoted attributes can only be escaped with the corresponding quote.”
So I decided to test it.
Created a collection of injection attacks. None of which worked.
Turned out we only needed to Html escape for the untrusted data that was going to be inserted into the textarea element.
More on this in a bit.

Now in regards to the code comments in the above code around having to double escape character entity references;
Because we’re sending the strings to the browser, it’s easiest to single decode the double encoded Html on the service side only.
Now because we’re still focused on the client side sanitisation,
and we are going to shift our focus soon to making sure we cover the server side,
we know we’re going to have to create some sanitisation routines for our .NET service.
Because the routines are quite likely going to be static, and we’re pretty much just dealing with strings,
lets create an extensions class in a new project in a common library we’ve already got.
This will allow us to get the widest use out of our sanitisation routines.
It also allows us to wrap any existing libraries or parts of them that we want to get use of.

namespace My.Common.Security.Encoding
{
    /// <summary>
    /// Provides a series of extension methods that perform sanitisation.
    /// Escaping, unescaping, etc.
    /// Usually targeted at user input, to help defend against the likes of XSS attacks.
    /// </summary>
    public static class Extensions
    {
        /// <summary>
        /// Returns a new string in which all occurrences of a double escaped html character (that's an html entity immediatly prefixed with another html entity)
        /// in the current instance are replaced with the single escaped character.
        /// </summary>
        ///
        /// The new string.
        public static string SingleDecodeDoubleEncodedHtml(this string source)
        {
            return source.Replace("&amp;#x", "&#x");
        }
    }
}

Now when we run our xslt transformation on the service, we chain our new extension method on the end.
Which gives us back a single encoded string that the browser is happy to display as the decoded value.

return Transform().SingleDecodeDoubleEncodedHtml();

Now back to my findings from the test above.
Turns out that “Properly quoted attributes can only be escaped with the corresponding quote.” really is true.
I thought that if I entered something like the following into the attribute value of an input element where type="text",
then the first double quote would be interpreted as the corresponding quote,
and the end double quote would be interpreted as the end quote of the onmouseover attribute value.

 " onmouseover="alert(2)

What actually happens, is during the transform…

xslCompiledTransform.Transform(xmlNodeReader, args, writer, new XmlUrlResolver());

All the relevant double quotes are converted to the double quote Html entity ‘”‘ without the single quotes.

onmouseover

And all double quotes are being stored in the database as the character value.

Libraries and useful code

Microsoft Anti-Cross Site Scripting Library

OWASP Encoding Project
This is the Reform library. Supports Perl, Python, PHP, JavaScript, ASP, Java, .NET

Online escape tool supporting Html escape/unescape, Java, .NET, JavaScript

The characters that need escaping for inserting untrusted data into Html element content

JavaScript The Good Parts: pg 90 has a nice ‘entityify’ function

OWASP Enterprise Security API Used for JavaScript escaping (ESAPI4JS)

JQuery plugin

Changing encoding on html page

Cheat Sheets and Check Lists I found helpful

https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet

https://www.owasp.org/index.php/OWASP_Validation_Regex_Repository

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet

https://www.owasp.org/index.php/DOM_based_XSS_Prevention_Cheat_Sheet

https://www.owasp.org/index.php/OWASP_AJAX_Security_Guidelines

If any of this is unclear, let me know and I’ll do my best to clarify. Maybe you have suggestions of how this could have been improved? Let’s spark a discussion.

Data Centre in a Rack

June 9, 2012

I recently took the plunge to install some of my more used networking components into a server rack.
I’d been putting this off for a few years.
Most of these components have been projects of mine which I’ve already blogged on in various places on this blog.
The obvious places are the following

There are also many other topics I’ve blogged on that form part of the work gone into these components and set-up of.
Check them out.

There’s also a home made router in an old $30 desktop pc run from a CF card.

Small Data Centre

Home Rack Server

HPR Pod cast on a bunch of good tools useful for setting up and maintaining an Open Source Data Centre.
ep0366 :: The Open Source Data Center

Questions welcome.
I’m happy to provide directions and insights from my experience.