Keeping Your NodeJS Web App Running on Production Linux

June 27, 2015

All the following offerings that I’ve evaluated target different scenarios. I’ve listed the pros and cons for each of them and where I think they fit into a potential solution to monitor your web applications (I’m leaning toward NodeJS) and make sure they keep running. I’ve listed the goals I was looking to satisfy.

For me I have to have a good knowledge of the landscape before I commit to a decision and stand behind it. I like to know I’ve made the best decision based on all the facts that are publicly available. Therefore, as always, it’s my responsibility to make sure I’ve done my research in order to make an informed and ideally… best decision possible. I’m pretty sure my evaluation was un-biased, as I hadn’t used any of the offerings other than forever before.

I looked at quite a few more than what I’ve detailed below, but the following candidates I felt were worth spending some time on.

Keep in mind, that everyone’s requirements will be different, so rather than tell you which to use because I don’t know your situation, I’ve listed the attributes (positive, negative and neutral) that I think are worth considering when making this choice. After my evaluation I make some decisions and start the configuration.

Evaluation criterion

  1. Who is the creator. I favour teams rather than individuals, as individuals move on, then where does that leave the product?
  2. Does it do what we need it to do? Goals address this.
  3. Do I foresee any integration problems with other required components?
  4. Cost in money. Is it free? I usually gravitate toward free software. It’s usually an easier sell to clients and management. Are there catches once you get further down the road? Usually open source projects are marketed as is.
  5. Cost in time. Is the set-up painful?
  6. How well does it appear to be supported? What do the users say?
  7. Documentation. Is there any / much? What is it’s quality?
  8. Community. Does it have an active one? Are the users getting their questions answered satisfactorily? Why are the unhappy users unhappy (do they have a valid reason).
  9. Release schedule. How often are releases being made? When was the last release?
  10. Gut feeling, Intuition. How does it feel. If you have experience in making these sorts of choices, lean on it. Believe it or not, this should probably be No. 1.

The following tools have been my choice based on the above criterion.

Goals

  1. Application should start automatically on system boot
  2. Application should be re-started if it dies or becomes un-responsive
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy (Nginx, node-http-proxy, Tinyproxy, Squid, Varnish, etc)
    2. Clustering and providing load balancing for your single threaded application
    3. Visibility of application statistics.
  4. Enough documentation to feel comfortable consuming the offering
  5. The offering should be production ready. This means: mature with a security conscious architecture.

Sysvinit, Upstart, systemd & Runit

You’ll have one of these running on your Linux box.

These are system and service managers for Linux. Upstart and the later systemd were developed as replacements for the traditional init daemon (Sysvinit), which all depend on init. Init is an essential package that pulls in the default init system. In Debian, starting with Jessie, systemd is your default system and service manager.

There’s some quite helpful info on the differences between Sysvinit and systemd here.

systemd

As I have systemd installed out of the box on my test machine (Debian Jessie), I’ll be using this for my set-up.

Documentation

  1. Well written comparison with Upstart, systemd, Runit and even Supervisor.

Running the likes of the below commands will provide some good details on how these packages interact with each other:

aptitude show sysvinit
aptitude show systemd
# and any others you think of

These system and service managers all run as PID 1 and start the rest of the system. Your Linux system will more than likely be using one of these to start tasks and services during boot, stop them during shutdown and supervise them while the system is running. Ideally you’re going to want to use something higher level to look after your NodeJS app. See the following candidates…

forever

and it’s web UI. Can run any kind of script continuously (whether it is written in node.js or not). This wasn’t always the case though. It was originally targeted toward keeping NodeJS applications running.

Requires NPM to install globally. We already have a package manager on Debian and all other main-stream Linux distros. Installing NPM just adds more attack surface area. Unless it’s essential, I’d rather do without NPM on a production server where we’re actively working to reduce the installed package count and disable everything else we can. I could install forever on a development box and then copy to the production server, but it starts to turn the simplicity of a node module into something not as simple, which then makes offerings like Supervisor, Monit and Passenger look even more attractive.

NPM Details

Does it Meet our Goals?

  1. Not without an extra script. Crontab or similar
  2. Application will be re-started if it dies, but if it’s response times go up, there’s not much forever is going to do about it. It has no way of knowing.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: I don’t see a problem
    2. Integrate NodeJS’s core module cluster into your NodeJS application for load balancing
    3. Visibility of application statistics could be added later with the likes of Monit or something else, but if you used Monit, then there wouldn’t really be a need for forever as Monit does the little that forever does and is capable of so much more, but is not pushy on what to do and how to do it. All the behaviour is defined with quite a nice syntax in a config file or as many as you like.
  4. I think there is enough documentation to feel comfortable consuming it, as forever doesn’t do a lot, which doesn’t have to be a bad thing.
  5. The code it self is probably production ready, but I’ve heard quite a bit about stability issues. You’re also expected to have NPM installed (more attack surface) when we already have native package managers on the server(s).

Overall Thoughts

For me, I’m looking for a tool set that does a bit more. Forever doesn’t satisfy my requirements. There’s often a balancing act between not doing enough and doing too much.

PM2

PM2

Younger than forever, but seems to have quite a few more features and does actually look quite good. I’m not sure about production ready though?

As mentioned on the github page: “PM2 is a production process manager for Node.js applications with a built-in load balancer“. This “Sounds” and at the initial glance looks shiny. Very quickly you should realise there are a few security issues you need to be aware of though.

The word “production” is used but it requries NPM to install globally. We already have a package manager on Debian and all other main-stream Linux distros. Installing NPM just adds more attack surface area. Unless it’s essential and it shouldn’t be, I’d rather do without it on a production system. I could install PM2 on a development box and then copy to the production server, but it starts to turn the simplicity of a node module into something not as simple, which then makes offerings like Supervisor, Monit and Passenger look even more attractive.

At the time of writing this, it’s less than a year old and in nodejs land, that means it’s very much in the immature realm. Do you really want to use something that young on a production server? I’d personally advise against it.

Yes, it’s very popular currently. That doesn’t tell me it’s ready for production though. It tells me the marketing is working.

Is your production server ready for PM2? That phrase alone tells me the mind-set behind the project. I’d much sooner see it the other way around. Is PM2 ready for my production server? You’re going to need a staging server for this, unless you honestly want development tools installed on your production server (git, build-essential, NVM and an unstable version of node 0.11.14 (at time of writing)) and run test scripts on your production server? Not for me or my clients thanks.

If you’ve considered the above concerns and can justify adding the additional attack surface area, check out the features if you haven’t already.

Features that Stood Out

They’re also listed on the github repository. Just beware of some of the caveats. Like for the load balancing: “we recommend the use of node#0.11.15+ or io.js#1.0.2+. We do not support node#0.10.* cluster module anymore!” 0.11.15 is unstable, but hang-on, I thought PM2 was a “production” process manager? OK, so were happy to mix unstable in with something we label as production?

On top of NodeJS, PM2 will run the following scripts: bash, python, ruby, coffee, php, perl.

Start-up Script Generation

Although I’ve heard a few stories that this is fairly un-reliable at the time of writing this. Which doesn’t surprise me, as the project is very young.

Documentation

  1. Advanced Readme

Does it Meet our Goals?

  1. The feature exists, unsure of how reliable it is currently though?
  2. Application should be re-started if it dies shouldn’t be a problem. PM2 can also restart your application if it reaches a certain memory threshold. I haven’t seen anything around restarting based on response times or other application health issues.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: I don’t see a problem
    2. Clustering and load-balancing is integrated but immature.
    3. PM2 provides a small collection of viewable statistics. Personally I’d want more, but I don’t see any reason why you’d have to swap PM2 because of this.
  4. There is reasonable official documentation for the age of the project. The community supplied documentation will need to catch up a bit, although there is a bit of that too. After working through all of the offerings and edge-cases, I feel as I usually do with NodeJS projects. The documentation doesn’t cover all the edge-cases and the development itself misses edge cases. Hopefully with time it’ll get better though as the project does look promising.
  5. I haven’t seen much that would make me think PM2 is production ready. It’s not yet mature. I don’t agree with it’s architecture.

Overall Thoughts

For me, the architecture doesn’t seem to be heading in the right direction to be used on a production web server where less is better. I’d like to see this change. If it did, I think it could be a serious contender for this space.

 


The following are better suited to monitoring and managing your applications. Other than Passenger, they should all be in your repository, which means trivial installs and configurations.

Supervisor

Supervisord

Supervisor is a process manager with a lot of features and a higher level of abstraction than the likes of the above Sysvinit, upstart, systemd, Runit, etc so it still needs to be run by an init daemon in itself.

From the docs: “It shares some of the same goals of programs like launchd, daemontools, and runit. Unlike some of these programs, it is not meant to be run as a substitute for init as “process id 1”. Instead it is meant to be used to control processes related to a project or a customer, and is meant to start like any other program at boot time.” Supervisor monitors the state of processes. Where as a tool like Monit can perform so many more types of tests and take what ever actions you define.

It’s in the Debian repositories  (trivial install on Debian and derivatives).

Documentation

  1. Main web site
  2. There’s a good short comparison here.

Source

Does it Meet our Goals?

  1. Application should start automatically on system boot: Yip. That’s what Supervisor does well.
  2. Application will be re-started if it dies, or becomes un-responsive. It’s often difficult to get accurate up/down status on processes on UNIX. Pidfiles often lie. Supervisord starts processes as subprocesses, so it always knows the true up/down status of its children.If your application becomes unresponsive or can’t connect to it’s database or any other service/resource it needs to work as expected. To be able to monitor these events and respond accordingly your application can expose a health-check interface, like GET /healthcheck. If everything goes well it should return HTTP 200, if not then HTTP 5**In some cases the restart of the process will solve this issue. httpok is a Supervisor event listener which makes GET requests to the configured URL. If the check fails or times out, httpok will restart the process.To enable httpok the following lines have to be placed in supervisord.conf:
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: I don’t see a problem
    2. Integrate NodeJS’s core module cluster into your NodeJS application for load balancing. This would be completely separate to supervisor.
    3. Visibility of application statistics could be added later with the likes of Monit or something else. For me, Supervisor doesn’t do enough. Monit does. Plus if you need what Monit offers, then you have to have three packages to think about, or Something like Supervisor, which is not an init system, so it kind of sits in the middle of the ultimate stack. So my way of thinking is, use the init system you already have to do the low level lifting and then something small to take care of everything else on your server that the init system is not really designed for and Monit has done this job really well. Just keep in mind also. This is not based on any bias. I hadn’t used Monit before this exercise.
  4. Supervisor is a mature product. It’s been around since 2004 and is still actively developed. The official and community provided docs are good.
  5. Yes it’s production ready. It’s proven itself.

 

Overall Thoughts

The documentation is quite good, easy to read and understand. I felt that the config was quite intuitive also. I already had systemd installed out of the box and didn’t see much point in installing Supervisor as systemd appeared to do everything Supervisor could do, plus systemd is an init system (it sits at the bottom of the stack). In most scenarios you’re going to have a Sysvinit or replacement of (that runs with a PID of 1), so in many cases Supervisor although it’s quite nice is kind of redundant, and of course Ubuntu has Upstart.

Supervisor is better suited to running multiple scripts with the same runtime, for example a bunch of different client applications running on Node. This can be done with systemd and the others, but Supervisor is a better fit for this sort of thing.

 

Monit

monit

Is a utility for monitoring and managing daemons or similar programs. It’s mature, actively maintained, free, open source and licensed with GNU AGPL.

It’s in the debian repositories (trivial install on Debian and derivatives). The home page told me the binary was just under 500kB. The install however produced a different number:

After this operation, 765 kB of additional disk space will be used.

Monit provides an impressive feature set for such a small package.

Monit provides far more visibility into the state of your application and control than any of the offerings mentioned above. It’s also generic. It’ll manage and/or monitor anything you throw at it. It has the right level of abstraction. Often when you start working with a product you find it’s limitations and they stop you moving forward and you end up settling for imperfection or you swap the offering for something else providing you haven’t already invested to much effort into it. For me Monit hit the sweet spot and never seems to stop you in your tracks. There always seems to be an easy to relatively easy way to get any monitoring->take action sort of task done. What I also really like is that moving away from Monit should be relatively painless also. The time investment is small and some of it will be transferable in many cases. It’s just config from the control file.

Features that Stood Out

  • Ability to monitor files, directories, disks, processes, the system and other hosts.
  • Can perform emergency logrotates if a log file suddenly grows too large too fast
  • File Checksum TestingThis is good so long as the compromised server hasn’t also had the tool your using to perform your verification (md5sum or sha1sum) modified, which would be common. That’s why in cases like this, tools such as stealth can be a good choice.
  • Testing of other attributes like ownership and access permissions. These are good, but again can easily be modified.
  • Monitoring directories using time-stamp. Good idea, but don’t rely solely on this. time-stamps are easily modified with touch -r … providing you do it between Monit’s cycles and you don’t necessarily know when they are unless you have permissions to look at Monit’s control file.
  • Monitoring space of file-systems
  • Has a built-in lightweight HTTP(S) interface you can use to browse the Monit server and check the status of all monitored services. From the web-interface you can start, stop and restart processes and disable or enable monitoring of services. Monit provides fine grained control over who/what can access the web interface or whether it’s even active or not. Again an excellent feature that you can choose to use or not even have the extra attack surface.
  • There’s also an agregator (m/monit) that allows sys-admins to monitor and manage many hosts at a time. Also works well on mobile devices and is available at a one off cost (reasonable price) to monitor all hosts.
  • Once you install Monit you have to actively enable the http daemon in the monitrc in order to run the Monit cli and/or access the Monit http web UI. At first I thought “is this broken?” I couldn’t even run monit status (it’s a Monit command). ps told me Monit was running. Then I realised… it’s secure by default. You have to actually think about it in order to expose anything. It was this that confirmed Monit for me.
  • The Control File
  • Just like SSH, to protect the security of your control file and passwords the control file must have read-write permissions no more than 0700 (u=xrw,g=,o=); Monit will complain and exit otherwise.

Documentation

The following was the documentation I used in the same order and I found that the most helpful.

  1. Main web site
  2. Official Documentation
  3. Source and links to other documentation including a QUICK START guide of about 6 lines.
  4. Adding Monit to systemd
  5. Release notes

Does it Meet our Goals?

  1. Application can start automatically on system boot
  2. Monit has a plethora of different types of tests it can perform and then follow up with actions based on the outcomes. Http is but one of them.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: Yes, I don’t see any issues here
    2. Integrate NodeJS’s core module cluster into your NodeJS application for load balancing. Monit will still monitor, restart and do what ever else you tell it to do.
    3. Monit provides application statistics to look at if that’s what you want, but it also goes further and provides directives for you to declare behaviour based on conditions that Monit checks for.
  4. Plenty of official and community supplied documentation
  5. Yes it’s production ready. It’s proven itself. Some extra education around some of the points I raised above with some of the security features would be good. If you could trust the hosts hashing programme (and other commonly trojanised programmes like find, ls, etc) that Monit uses, perhaps because you were monitoring it from a stealth controller (which had already taken a known good copy and produced it’s own bench-mark hash) or similar then yes, you could use that feature of Monit with greater assurance that the results it was producing were in fact accurate. In saying that, you don’t have to use the feature, but it’s there if you want it, which I see as very positive so long as you understand what could go wrong and where.

 

Overall Thoughts

The accepted answer here is a pretty good mix and approach to using the right tools for each job. Monit has a lot of capabilities, none of which you must use, so it doesn’t get in your way, as many opinionated tools do and like to dictate how you do things and what you must use in order to do them. Monit allows you to leverage what ever you already have in your stack. You don’t have to install package managers or increase your attack surface other than [apt-get|aptitude] install monit It’s easy to configure and has lots of good documentation.

Passenger

Passenger

I’ve looked at Passenger before and it looked quite good then. It still does, with one main caveat. It’s trying to do to much. One can easily get lost in the official documentation (example of the Monit install (handfull of commands to cover all Linux distros one page) vs Passenger install (aprx 10 pages)).  “Passenger is a web server and application server, designed to be fast, robust and lightweight. It runs your web apps with the least amount of hassle by taking care of almost all administrative heavy lifting for you.” I’d like to see the actual weight rather than just a relative term “lightweight”. To me it doesn’t look light weight. The feeling I got when evaluating Passenger was similar to the feeling produced with my Ossec evaluation.

The learning curve is quite a bit steeper than all the previous offerings. Passenger has strong opinions that once you buy into could make it hard to use the tools you may want to swap in and out. I’m not seeing the UNIX Philosophy here.

If you look at the Phusion Passenger Philosophy we see some note-worthy comments. “We believe no good software has bad documentation“. If your software is 100% intuitive, the need for documentation should be minimal. Few software products are 100% intuitive, because we only have so much time to develop it. The comment around “the Unix way” is interesting also. At this stage I’m not sure they’ve done better. I’d like to spend some time with someone or some team that has Passenger in production in a diverse environment and see how things are working out.

Passenger isn’t in the Debian repositories, so you would need to add the apt repository.

Passenger is six years old at the time of writing this, but the NodeJS support is only just over a year old.

Features that Stood Didn’t really Stand Out

Sadly there weren’t many that stood out for me.

  • Handle more traffic looked similar to Monit resource testing but without the detail. If there’s something Monit can’t do well, it’ll say “Hay, use this other tool and I’ll help you configure it to suite the way you want to work. If you don’t like it, swap it out for something else” With Passenger it seems to integrate into everything rather than providing tools to communicate loosely. Essentially locking you into a way of doing something that hopefully you like. It also talks about “Uses all available CPU cores“. If you’re using Monit you can use the NodeJS cluster module to take care of that. Again leaving the best tool for the job to do what it does best.
  • Reduce maintenance
    • Keep your app running, even when it crashesPhusion Passenger supervises your application processes, restarting them when necessary. That way, your application will keep running, ensuring that your website stays up. Because this is automatic and builtin, you do not have to setup separate supervision systems like Monit, saving you time and effort.” but this is what Monit excels at and it’s a much easier set-up than Passenger. This sort of marketing doesn’t sit right with me.
    • Host multiple apps at once. Host multiple apps on a single server with minimal effort. ” If we’re talking NodeJS web apps, then they are their own server. They host themselves. In this case it looks like Passenger is trying to solve a problem that doesn’t exist?
  • Improve security
    • Privilege separationIf you host multiple apps on the same system, then you can easily run each app as a different Unix user, thereby separating privileges.“. The Monit documentation says this: “If Monit is run as the super user, you can optionally run the program as a different user and/or group.” and goes on to provide examples how it’s done. So again I don’t see anything new here. Other than the “Slow client protections” which has side affects, that’s it for security considerations with Passenger. From what I’ve seen Monit has more in the way of security related features.
  • What I saw happening here was a lot of stuff that I actually didn’t need. Your mileage may vary.

Offerings

Phusion Passenger is a commercial product that has enterprise, custom and open source (which is free and still has loads of features).

Documentation

The following was the documentation I used in the same order and I found that the most helpful.

  1. NodeJS tutorial (This got me started with how it could work with NodeJS)
  2. Main web site
  3. Documentation and support portal
  4. Design and Architecture
  5. User Guide Index
  6. Nginx specific User Guide
  7. Standalone User Guide
  8. Twitter, blog
  9. IRC: #passenger at irc.freenode.net. I was on there for several days. There was very little activity.

Source

Does it Meet our Goals?

  1. Application should start automatically on system boot. There is no doubt that Passenger goes way beyond this aim.
  2. Application should be re-started if it dies or becomes un-responsive. There is no doubt that Passenger goes way beyond this aim.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: Passenger provides Integrations into Nginx, Apache and stand-alone (provide your own proxy)
    2. Passenger scales up NodeJS processes and automatically load balances between them
    3. Passenger is advertised as offering easily viewable statistics.
  4. There is loads of official documentation. Not as much community contributed though, as it’s still young.
  5. From what I’ve seen so far, I’d say Passenger is production ready. I would like to see more around how security was baked into the architecture though before I committed to using it.

Overall Thoughts

I spent quite a while reading the documentation. I just think it’s doing to much. I prefer to have stronger single focused tools that do one job, do it well and play nicely with all the other kids in the sand pit. You pick the tool up and it’s just intuitive how to use it and you end up reading docs to confirm how you think it should work. For me, this is not how passenger is.

If you’re looking for something even more comprehensive, check out Zabbix. If you like to pay for your tools, check out Nagios if you haven’t already.


At this point it was fairly clear as to which components I’d be using and configuring to keep my NodeJS application monitored, alive and healthy along with any other scripts or processes. systemd and Monit. If you’re on Ubuntu, you’d probably use Upstart instead of systemd as it should already be your default init system. So going with the default for the init system should give you a quick start and provide plenty of power. Plus it’s well supported, reliable, feature rich and you can manage anything/everything you want without installing extra packages. For the next level up, I’d choose Monit. I’ve now used it in production and it’s taken care of everything above the init system. I feel it has a good level of abstraction, plenty of features, doesn’t get in the way and integrates nicely into your production OS.

Getting Started with Monit

So we’ve installed Monit with an apt-get install monit and we’re ready to start configuring it.

ps aux | grep -i monit

Will reveal that Monit is running:

/usr/bin/monit -c /etc/monit/monitrc

Now if you issue a sudo service monit restart, it won’t work as you can’t access the Monit CLI due to the httpd not running.

The first thing we need to do is make some changes to the control file (/etc/monit/monitrc in Debian). The control file has sensible defaults already. At this stage I don’t need a web UI accessible via localhost or any other hosts, but it still needs to be turned on and accessible by at least localhost. Here’s why:

Note that HTTP support is required for almost all Monit CLI interface operation, as CLI commands (such as “monit status”) are handled by communicating with the Monit background process via the the HTTP interface. So basically you should have this enable, though you can bind the HTTP interface to localhost only so Monit is not accessible from the outside.

In order to turn on the httpd, all you need in your control file for that is:

set httpd port 2812 and use address localhost # only accept connection from localhost
allow localhost # allow localhost to connect to the server and

If you want to receive alerts via email, then you’ll need to configure that. Then on reload you should get start and stop events (when you quit).

sudo monit reload

Now if you issue a curl localhost:2812 you should get the web UI’s response of a html page. Now you can start to play with the Monit CLI

Now to stop the Monit background process use:

monit quit

Oh, you can find all the arguments you can throw at Monit here, or just issue a:

monit -h # will list all options.

To check the control file for syntax errors:

sudo monit -t

Also keep an eye on your log file which is specified in the control file: set logfile /var/log/monit.log

Right. So what happens when Monit dies…

Keep Monit Alive

Now you’re going to want to make sure your monitoring tool that can be configured to take all sorts of actions never just stops running, leaving you flying blind. No noise from your servers means all good right? Not necessarily. Your monitoring tool just has to keep running. So lets make sure of that now.

When Monit is apt-get install‘ed on Debian it gets installed and configured to run as a daemon. This is defined in Monit’s init script.
Monit’s init script is copied to /etc/init.d/ and the run levels set-up for it. This means when ever a run level is entered the init script will be run taking either the single argument of stop (example: /etc/rc0.d/K01monit), or start (example: /etc/rc2.d/S17monit). Further details on run levels here.

systemd to the rescue

Monit is pretty stable, but if for some reason it dies, then it won’t be automatically restarted again.
This is where systemd comes in. systemd is installed out of the box on Debian Jessie on-wards. Ubuntu uses Upstart which is similar. Both SysV init and systemd can act as drop-in replacements for each other or even work along side of each other, which is the case in Debian Jessie. If you add a unit file which describes the properties of the process that you want to run, then issue some magic commands, the systemd unit file will take precedence over the init script (/etc/init.d/monit)

Before we get started, lets get some terminology established. The two concepts in systemd we need to know about are unit and target.

  1. A unit is a configuration file that describes the properties of the process that you’d like to run. There are many examples of these I can show you and I’ll point you in the direction soon. They should have a [Unit] directive at a minimum. The syntax of the unit files and the target files were derived from Microsoft Windows .ini files. Now I think the idea is that if you want to have a [Service] directive within your unit file, then you would append .service to the end of your unit file name.
  2. A target is a grouping mechanism that allows systemd to start up groups of processes at the same time. This happens at every boot as processes are started at different run levels.

Now in Debian there are two places that systemd looks for unit files… In order from lowest to highest precedence, they are as follows:

  1. /lib/systemd/system/ (prefix with /usr dir for archlinux) unit files provided by installed packages. Have a look in here for many existing examples of unit files.
  2. /etc/systemd/system/ unit files created by the system administrator

As mentioned above, systemd should be the first process started on your Linux server. systemd reads the different targets and runs the scripts within the specific target’s “target.wants” directory (which just contains a collection of symbolic links to the unit files). For example the target file we’ll be working with is the multi-user.target file (actually we don’t touch it, systemctl does that for us (as per the magic commands mentioned above)). Just as systemd has two locations in which it looks for unit files. I think this is probably the same for the target files, although there wasn’t any target files in the system administrator defined unit location but there were some target.wants files there.

systemd Monit Unit file

I found a template that Monit had already provided for a unit file in /usr/share/doc/monit/examples/monit.service. There’s also one for Upstart. Copy that to where the system administrator unit files should go and make the change so that systemd restarts Monit if it dies for what ever reason. Check the Restart= options on the systemd.service man page. The following is what my initial unit file looked like:

[Unit]
Description=Pro-active monitoring utility for unix systems
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/monit -I -c /etc/monit/monitrc
ExecStop=/usr/bin/monit -c /etc/monit/monitrc quit
ExecReload=/usr/bin/monit -c /etc/monit/monitrc reload
Restart=always

[Install]
WantedBy=multi-user.target

Now, some explanation. Most of this is pretty obvious. The After= directive just tells systemd to make sure the network.target file has been acted on first and of course network.target has After=network-pre.target which doesn’t have a lot in it. I’m not going to go into this now, as I don’t really care too much about it. It works. It means the network interfaces have to be up first. If you want to know how, why, check this documentation. Type=simple. Again check the systemd.service man page.
Now to have systemd control Monit, Monit must not run as a background process (the default). To do this, we can either add the set init statement to Monit’s control file or add the -I option when running systemd, which is exactly what we’ve done above. The WantedBy= is the target that this specific unit is part of.

Now we need to tell systemd to create the symlinks in multi-user.target.wants directory and other things. See the man page for more details about what enable actually does if you want them. You’ll also need to start the unit.

Now what I like to do here is:

systemctl status /etc/systemd/system/monit.service

Then compare this output once we enable the service:

● monit.service - Pro-active monitoring utility for unix systems
   Loaded: loaded (/etc/systemd/system/monit.service; disabled)
   Active: inactive (dead)
sudo systemctl enable /etc/systemd/system/monit.service
# systemd now knows about monit.service
systemctl status /etc/systemd/system/monit.service

Outputs:

● monit.service - Pro-active monitoring utility for unix systems
   Loaded: loaded (/etc/systemd/system/monit.service; enabled)
   Active: inactive (dead)

Now start the service:

sudo systemctl start monit.service # there's a stop and restart also.

Now you can check the status of your Monit service again. This shows terse runtime information about the units or PID you specify (monit.service in our case).

sudo systemctl status monit.service

By default this function will show you 10 lines of output. The number of lines can be controlled with the --lines= option

sudo systemctl --lines=20 status monit.service

Now try killing the Monit process. At the same time, you can watch the output of Monit in another terminal. tmux or screen is helpful for this:

sudo tail -f /var/log/monit.log
sudo kill -SIGTERM $(pidof monit)
# SIGTERM is a safe kill and is the default, so you don't actually need to specify it. Be patient, this may take a minute or two for the Monit process to terminate.

Or you can emulate a nastier termination with SIGKILL or even SEGV (which may kill monit faster).

Now when you run another status command you should see the PID has changed. This is because systemd has restarted Monit.

When you need to make modifications to the unit file, you’ll need to run the following command after save:

sudo systemctl daemon-reload

When you need to make modifications to the running services configuration file
/etc/monit/monitrc for example, you’ll need to run the following command after save:

sudo systemctl reload monit.service
# because systemd is now in control of Monit, rather than the before mentioned: sudo monit reload

 

Keep NodeJS Application Alive

Right, we know systemd is always going to be running. So lets use it to take care of the coarse grained service control. That is keeping your NodeJS application service alive.

Using systemd

systemd my-web-app.service Unit file

You’ll need to know where your NodeJS binary is. The following will provide the path:

which NodeJS

Now create a systemd unit file my-nodejs-app.service

[Unit]
Description=My amazing NodeJS application
After=network.target

[Service]
# systemctl start my-nodejs-app # to start the NodeJS script
ExecStart=[where nodejs binary lives] [where your app.js/index.js lives]
# systemctl stop my-nodejs-app # to stop the NodeJS script
# SIGTERM (15) - Termination signal. This is the default and safest way to kill process.
# SIGKILL (9) - Kill signal. Use SIGKILL as a last resort to kill process. This will not save data or cleaning kill the process.
ExecStop=/bin/kill -SIGTERM $MAINPID
# systemctl reload my-nodejs-app # to perform a zero-downtime restart.
# SIGHUP (1) - Hangup detected on controlling terminal or death of controlling process. Use SIGHUP to reload configuration files and open/close log files.
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=my-nodejs-app
User=my-nodejs-app
Group=my-nodejs-app # Not really needed unless it's different, as the default group of the user is chosen without this option. Self documenting though, so I like to have it present.
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target

Add the system user and group so systemd can actually run your service as the user you’ve specified.

sudo groupadd --system my-nodejs-app # this is not needed if you adduser like below...
getent group # to verify which groups exist.
sudo adduser --system --no-create-home --group my-nodejs-app # This will create a system group with the same name and ID of the user.
groups my-nodejs-app # to verify which groups the new user is in.

Now as we did above, go through the same procedure enable‘ing, start‘ing and verifying your new service.

Make sure you have your directory permissions set-up correctly and you should have a running NodeJS application that when it dies will be restarted automatically by systemd.

Don’t forget to backup all your new files and changes in case something happens to your server.

We’re done with systemd for now. Following are some useful resources I’ve used:

 

Using Monit

Now just configure your Monit control file. You can spend a lot of time here tweaking a lot more than just your NodeJS application. There are loads of examples around and the control file itself has lots of commented out examples also. You’ll find the following the most helpful:

There are a few things that had me stuck for a bit. By default Monit only sends alerts on change, not on every cycle if the condition stays the same, unless when you set-up your

set alert your-ame@your.domain

Append receive all alerts, so that it looks like this:

set alert your-ame@your.domain receive all alerts

There’s quite a few things you just work out as you go. The main part I used to health-check my NodeJS app was:

check host myhost with address 1.2.3.4
   start program = "/bin/systemctl start my-nodejs-app.service"
   stop program = "/bin/systemctl stop my-nodejs-app.service"
   if failed ping then alert
   if failed
      port 80 and
      protocol http and
      status = 200 # The default without status is failure if status code >= 400
      request /testdir with content = "some text on my web page" and
         then restart
   if 5 restarts within 5 cycles then alert

I carry on and check things like:

  1. cpu and memory usage
  2. load averages
  3. File system space on all the mount points
  4. Check SSH that it hasn’t been restarted by anything other than Monit (potentially swapping the binary or it’s config). Of course if an attacker kills Monit, systemd immediately restarts it and we get Monit alert(s). We also get real-time logging hopefully to an off-site syslog server. Ideally your off-site syslog server also has alerts set-up on particular log events. On top of that you should also have inactivity alerts set-up so that if your log files are not generating events that you expect, then you also receive alerts. Services like Dead Man’s Snitch or packages like Simple Event Correlator with Cron are good for this. On top of all that, if you have a file integrity checker that resides on another system that your host reveals no details of and you’ve got it configured to check all the right file check-sums, dates, permissions, etc, you’re removing a lot of low hanging fruit for someone wanting to compromise your system.
  5. Directory permissions, uid, gid and checksums. Of course you’re also going to have to make sure the tools that Monit uses to do these checks haven’t been modified.

 

If you find anything I haven’t explained clearly, or you need a hand with any of this just leave a comment. Cheers.

Advertisement

Evaluation of Host Intrusion Detection Systems (HIDS)

May 30, 2015

Followed up with a test deployment and drive.

The best time to install a HIDS is on a fresh install before you open the host up to the internet or even your LAN if it’s corporate. Of course if you don’t have that luxury, there are a bunch of tools that can help you determine if you’re already owned. Be sure to run one or more over your target system before your HIDS bench-marks it.

The reason I chose Stealth and OSSEC to take further into an evaluation was because they rose to the top of a preliminary evaluation I performed during a recent Debian Web server hardening process where I looked at several other HIDS as well.

OSSEC

ossec-hids

Source

ossec-hids on github

Who’s Behind Ossec

  • Many developers, contributors, managers, reviewers, translators (infact the OSSEC team looks almost as large as the Stealth user base)

Documentation

Lots of documentation. Not always the easiest to navigate. Lots of buzz on the inter-webs.
Several books.

Community / Communication

IRC channel #ossec at irc.freenode.org Although it’s not very active.

Components

  • Manager (sometimes called server): does most of the work monitoring the Agents. It stores the file integrity checking databases, the logs, events and system auditing entries, rules, decoders, major configuration options.
  • Agents: small collections of programs installed on the machines we are interested in monitoring. Agents collect information and forward it to the manager for analysis and correlation.

There are quite a few other ancillary components also.

Architecture

You can also go the agent-less route which may allow the Manager to perform file integrity checks using agent-less scripts. As with Stealth, you’ve still got the issue of needing to be root in order to read some of the files.

Agents can be installed on VMware ESX but from what I’ve read it’s quite a bit of work.

Features (What does it do)

  • File integrity checking
  • Rootkit detection
  • Real-time log file monitoring and analysis (you may already have something else doing this)
  • Intrusion Prevention System (IPS) features as well: blocking attacks in real-time
  • Alerts can go to a databases MySQL or PostgreSQL or other types of outputs
  • There is a PHP web UI that runs on Apache if you would rather look at pretty outputs vs log files.

What I like

To me, the ability to scan in real-time off-sets the fact that the agents need binaries installed. This hinders the attacker from covering their tracks.

Can be configured to scan systems in realtime based on inotify events.

Backed by a large company Trend Micro.

Options. Install options for starters. You’ve got the options of:

  • Agent-less installation as described above
  • Local installation: Used to secure and protect a single host
  • Agent installation: Used to secure and protect hosts while reporting back to a
    central OSSEC server
  • Server installation: Used to aggregate information

Can install a web UI on the manager, so you need Apache, PHP, MySQL.

What I don’t like

  • Unlike Stealth, The fact that something has to be installed on the agents
  • The packages are not in the standard repositories. The downloads, PGP keys and directions are here.
  • I think Ossec may be doing to much and if you don’t like the way it does one thing, you may be stuck with it. Personally I really like the idea of a tool doing one thing, doing it well and providing plenty of configuration options to change the way it does it’s one thing. This provides huge flexibility and minimises your dependency on a suite of tools and/or libraries
  • Information overload. There seems to be a lot to get your head around to get it set-up. There are a lot of install options documented (books, interwebs, official docs). It takes a bit to workout exactly the best procedure for your environment. Following are some of the resources I used:
    1. Some official documentation
    2. Perving in the repository
    3. User take one install
    4. User take two install
    5. User take three install

Stealth

Stealth file integrity checker

Source

sourceforge

Who’s Behind Stealth

Author: Frank B. Brokken. An admirable job for one person. Frank is not a fly-by-nighter though. Stealth was first presented to Congress in 2003. It’s still actively maintained and used by a few. It’s one of GNU/Linux’s dirty little secrets I think. It’s a great idea, makes a tricky job simple and does it in an elegant way.

Documentation

  • 3.00.00 (2014-08-29)
  • 2.11.03 (2013-06-18)
    • Once you install Stealth, all the documentation can be found by sudo updatedb && locate stealth. I most commonly used: HTML docs (/usr/share/doc/stealth-doc/manual/html/) and (/usr/share/doc/stealth-doc/manual/pdf/stealth.pdf) for easy searching across the HTML docs
    • man page (/usr/share/doc/stealth/stealthman.html)
    • Examples: (/usr/share/doc/stealth/examples/)
  • More covered in my demo set-up below

Binaries

  • 3.00.00 (2014-08-29) is in the repository for Debain Jessie
  • 2.11.03-2 (2013-06-18) This is as recent as you can go for Linux Mint Qiana (17) and Rebecca (17.1) within the repositories, unless you want to go out of band. There are quite a few dependencies ldd will show you:
    libbobcat.so.3 => /usr/lib/libbobcat.so.3 (0xb765d000)
    libpthread.so.0 => /lib/i386-linux-gnu/i686/cmov/libpthread.so.0 (0xb7641000)
    libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb754e000)
    libm.so.6 => /lib/i386-linux-gnu/i686/cmov/libm.so.6 (0xb7508000)
    libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb74eb000)
    libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xb7340000)
    libX11.so.6 => /usr/lib/i386-linux-gnu/libX11.so.6 (0xb71ee000)
    libcrypto.so.1.0.0 => /usr/lib/i386-linux-gnu/i686/cmov/libcrypto.so.1.0.0 (0xb7022000)
    libreadline.so.6 => /lib/i386-linux-gnu/libreadline.so.6 (0xb6fdc000)
    libmilter.so.1.0.1 => /usr/lib/i386-linux-gnu/libmilter.so.1.0.1 (0xb6fcb000)
    /lib/ld-linux.so.2 (0xb7748000)
    libxcb.so.1 => /usr/lib/i386-linux-gnu/libxcb.so.1 (0xb6fa5000)
    libdl.so.2 => /lib/i386-linux-gnu/i686/cmov/libdl.so.2 (0xb6fa0000)
    libtinfo.so.5 => /lib/i386-linux-gnu/libtinfo.so.5 (0xb6f7c000)
    libXau.so.6 => /usr/lib/i386-linux-gnu/libXau.so.6 (0xb6f78000)
    libXdmcp.so.6 => /usr/lib/i386-linux-gnu/libXdmcp.so.6 (0xb6f72000)
    
  • 2.10.01 (2012-10-04) is in the repository for Debian Wheezy

Community / Communication

No community really. I see it as one of the dirty little secretes that I’m surprised many diligent sysadmins haven’t jumped on. The author is happy to answer emails. The author doesn’t market.

Components

Controller

The computer initiating the scan.

Needs two kinds of outgoing services:

  1. ssh to reach the clients
  2. mail transport agent (MTA)(sendmail, postfix)

Considerations for the controller:

  1. No public access.
  2. All inbound services should be denied.
  3. Access only via its console.
  4. Physically secure location (one would think goes without saying, but you may be surprised).
  5. Sensitive information of the clients are stored on the controller.
  6. Password-less access to the clients for anyone who gains controller root access, unless ssh-cron is used, which appears to have been around since 2014-05.

Client

The computer/s being scanned. I don’t see any reason why a Stealth solution couldn’t be set-up to look after many clients.

Architecture

The controller stores one to many policy files. Each of which is specific to a single client and contains use directives and commands. It’s recommended policy to take copies of the client programmes such as the hashing programme sha1sum, find and others that are used extensively during the integrity scans and copy them to the controller to take bench-mark hashes. Subsequent runs will do the same to compare with the initial hashes stored.

Features (What does it do)

File integrity tests leaving virtually no sediments on the tested client.

Stealth subscribes to the “dark cockpit” approach. I.E. no mail is sent when no changes were detected. If you have a MTA, Stealth can be configured to send emails on changes it finds.

What I like

  • It’s simplicity. There is one package to install on the controller. Nothing to install on the client machines. Client just needs to have the controllers SSH public key. You will need a Mail Transfer Agent on your controller if you don’t already have one. My test machine (Linux Mint) didn’t have one.
  • Rather than just modifying the likes of sha1sum on the clients that Stealth uses to perform it’s integrity checks, Stealth would somehow have to be fooled into thinking that the changed hash of the sha1sum it’s just copied to the controller is the same as the previously recorded hash that it did the same with. If the previously recorded hash is removed or does not match the current hash, then Stealth will fire an alert off.
  • It’s in the Debian repositories. Which is a little surprising considering I couldn’t find any test suite results.
  • The whole idea behind it. Systems being monitored give little appearance that they’re being monitored, other than I think the presence of a single SSH login when Stealth first starts in the auth.log. This could actually be months ago, as the connection remains active for the life of Stealth. The login could be from a user doing anything on the client. It’s very discrete.
  • unpredictability of Stealth runs is offered through Stealth’s --random-interval and --repeat options. E.g., --repeat 60 --random-interval 30 results in new Stealth-runs on average every 75 seconds.
  • Subscribes to the Unix philosophy: “do one thing and do it well”
  • Stealth’s author is very approachable and open. After talking with Frank and suggesting some ideas to promote Stealth and it’s community, Frank started a discussion list.

What I don’t like

  • Lack of visible code reviews and testing.Yes it’s in Debian, but so was OpenSSL and Bash
  • One man band. Support provided via one person alone via email. Comparing with the likes of Ossec which has …
  • Lack of use cases. I don’t see anyone using/abusing it. Although Frank did send me some contacts of other people that are using it, so again, a very helpful author. Can’t find much on the interwebs. The documentation has clear signs that it’s been written and is targeting people already familiar with the tool. This is understandable as the author has been working on this project for nine years and could possibly be disconnected with what’s involved for someone completely new to the project to dive in and start using it. In saying that, that’s what I did and so far it worked out well.
  • This tells me that either very few are using it or it has no bugs and the install and configuration is stupidly straight forward or both.
  • Small user base. This is how many people are happy to reveal they are using Stealth.
  • Reading through the userguide, the following put me off a little: “preferably the public ssh-identity key of the controller should be placed in the client’s root .ssh/authorized_keys file, granting the controller root access to the client. Root access is normally needed to gain access to all directories and files of the client’s file system.” I never allow SSH root access to servers. So I’m not about to start. What’s worse, is that Stealth SSHing from server to client with key-pair can only do so automatically if the pass-phrase is blank. If it’s not blank then someone has to drive Stealth and the whole idea of Stealth (as far as I can tell) is to be an automatic file integrity checker.
    There are however some work-arounds to this. ssh-cron can run scheduled Stealth jobs, but needs to aquire the pass-phrase from the user once. It does this via ssh-askpass which is a X11 based pass-phrase input dialog. If your running ssh-cron and Stealth from a server (which would be a large number of potential users I would think) you won’t have X11. So if that’s the case ssh-cron is out of the question. At least that’s how I understand it from the man page and emails from Frank. Frank mentioned in an email: “Stealth itself could perfectly well be started `by hand’ setting up the connection using, e.g., an existing ssh private-key, which could thereafter completely be removed from the system running Stealth. The running Stealth process would then continue to use the established connection for as long as it’s running. This may be a very long time if the –daemon option is used.” I’ve used Monit to do any checks on the client that need root access. This way Stealth can run as a low privileged user.
  • In order to use an SSH key-pair with passphrase and have your controller resume scans on reboot, you’re going to need ssh-cron. Some distros like Mint and Ubuntu only have very old versions of libbobcat (3.19.01 (December 2013)) in their repositories. You could re-build if you fancy the dependency issues it may bring. Failing that, use Debian which is way more up to date, or just fire stealth off each time you reboot your controller machine manually and run it as a daemon with arguments such as --keep-alive (or --daemon if running stealth >= 4.00.00), --repeat. This will cause Stealth to keep running (sleeping) most of the time, then doing it’s checks at the intervals you define.

Outcomes

In making all of my considerations, I changed my mind quite a few times on which offerings were most suited to which environments. I think this is actually a good thing, as I think it means my evaluations were based on the real merits of each offering rather than any biases.

If you already have real-time logging to an off-site syslog server set-up and alerting. OSSEC would provide redundant features.

If you don’t already have real-time logging to an off-site syslog server, then OSSEC can help here.

The simplicity of Stealth, flatter learning curve and it’s over-all philosophy is what won me over.

Stealth Up and Running

I tested this out on a Mint installation.

Installed stealth and stealth-doc (2.11.03-2) via synaptic package manager. Then just did a locate for stealth to find the docs and other example files. The following are the files I used for documentation, how I used them and the tab order that made sense to me:

  1. The main documentation index: file:///usr/share/doc/stealth-doc/manual/html/stealth.html
  2. Chapter one introduction: file:///usr/share/doc/stealth-doc/manual/html/stealth01.html
  3. Chapter three to help build up a policy file: file:///usr/share/doc/stealth-doc/manual/html/stealth03.html
  4. Chapter five for running Stealth and building up policy file: file:///usr/share/doc/stealth-doc/manual/html/stealth05.html
  5. Chapter six for running Stealth: file:///usr/share/doc/stealth-doc/manual/html/stealth06.html
  6. Chapter seven for arguements to pass to Stealth: file:///usr/share/doc/stealth-doc/manual/html/stealth07.html
  7. Chapter eight for error messages: file:///usr/share/doc/stealth-doc/manual/html/stealth08.html
  8. The Man page: file:///usr/share/doc/stealth/stealthman.html
  9. Policy file examples: file:///usr/share/doc/stealth/examples/
  10. Useful scripts to use with Stealth: file:///usr/share/doc/stealth/scripts/usr/bin/
  11. All of the documentation in simple text format (good for searching across chapters for strings): file:///usr/share/doc/stealth-doc/manual/text/stealth.txt

Files I would need to copy and modify were:

  • /usr/share/doc/stealth/scripts/usr/bin/stealthcleanup.gz
  • /usr/share/doc/stealth/scripts/usr/bin/stealthcron.gz
  • /usr/share/doc/stealth/scripts/usr/bin/stealthmail.gz

Files I used for reference to build up a policy file:

  • /usr/share/doc/stealth/examples/demo.pol.gz
  • /usr/share/doc/stealth/examples/localhost.pol.gz
  • /usr/share/doc/stealth/examples/simple.pol.gz

As mentioned above, providing you have a working MTA, then Stealth will just do it’s thing when you run it. The next step is to schedule it’s runs. This can be also (as mentioned above) with a pseudo random interval.

Feel free to leave a comment if you need help setting Stealth up, as it did take a bit of fiddling, but does what it says it does on the box very well.

Web Server Log Management

April 25, 2015

As part of the ongoing work around preparing a Debian web server to host applications accessible from the WWW I performed some research, analysis, made decisions along the way and implemented a first stage logging strategy. I’ve done similar set-ups many times before, but thought it worth sharing my experience for all to learn something from it and/or provide input, recommendations, corrections to the process so we all get to improve.

The main system loggers I looked into

  • GNU syslogd which I don’t think is being developed anymore? Correct me if I’m wrong. Most Linux distributions no longer ship with this. Only supports UDP. It’s also a bit lacking in features. From what I gather is single-threaded. I didn’t spend long looking at this as there wasn’t much point. The following two offerings are the main players.
  • rsyslog: which ships with Debian and most other Linux distros now I believe. I like to do as little as possible and rsyslog fits this description for me. The documentation seems pretty good. Rainer Gerhards wrote rsyslog and his blog provides some good insights. Supports UDP, TCP. Can send over TLS. There is also the Reliable Event Logging Protocol (RELP) which Rainer created.
    rsyslog is great at gathering, transporting, storing log messages and includes some really neat functionality for dividing the logs. It’s not designed to alert on logs. That’s where the likes of Simple Event Correlator (SEC) comes in. Rainer discusses why TCP isn’t as reliable as many think here.
  • syslog-ng: I didn’t spend to long here, as I didn’t see any features that I needed that were better than the default of rsyslog. Can correlate log messages, both real-time and off-line. Supports reliable and encrypted transport using TCP and TLS. message filtering, sorting, pre-processing, log normalisation.

There are are few comparisons around. Most of the ones I’ve seen are a bit biased and often out of date.

Aims

  • Record events and have them securely transferred to another syslog server in real-time, or as close to it as possible, so that potential attackers don’t have time to modify them on the local system before they’re replicated to another location
  • Reliability (resilience / ability to recover connectivity)
  • Extensibility: ability to add more machines and be able to aggregate events from many sources on many machines
  • Receive notifications from the upstream syslog server of specific events. No HIDS is going to remove the need to reinstall your system if you are not notified in time and an attacker plants and activates their root-kit.
  • Receive notifications from the upstream syslog server of lack of events. The network is down for example.

Environmental Considerations

A couple of servers in the mix:

FreeNAS File Server

Recent versions can send their syslog events to a syslog server. With some work, it looks like FreeNAS can be setup to act as a syslog server.

pfSense Router

Can send log events, but only by UDP by the look of it.

Following are the two strategies that emerged. You can see by the detail that I went down the path of the first one initially. It was the path of least resistance / quickest to setup. I’m going to be moving away from papertrail toward strategy two. Mainly because I’ve had a few issues where messages have been getting lost that have been very hard to track down (I’ve spent over a week on it). As the sender, you have no insight into what papertrail is doing. The support team don’t provide a lot of insight into their service when you have to trouble-shoot things. They have been as helpful as they can be, but I’ve expressed concern around them being unable to trouble-shoot their own services.

Outcomes

Strategy One

Rsyslog, TCP, local queuing, TLS, papertrail for your syslog server (PT doesn’t support RELP, but say that’s because their clients haven’t seen any issues with reliability in using plain TCP over TLS with local queuing). My guess is they haven’t looked hard enough. I must be the first then. Beware!

As I was setting this up and watching both ends. We had an internet outage of just over an hour. At that stage we had very few events being generated, so it was trivial to verify both ends. I noticed that once the ISP’s router was back on-line and the events from the queue moved to papertrail, that there was in fact one missing.

Why did Rainer Gerhards create RELP if TCP with queues was good enough? That was a question that was playing on me for a while. In the end, it was obvious that TCP without RELP isn’t good enough.
At this stage it looks like the queues may loose messages. Rainer says things like “In rsyslog, every action runs on its own queue and each queue can be set to buffer data if the action is not ready. Of course, you must be able to detect that the action is not ready, which means the remote server is off-line. This can be detected with plain TCP syslog and RELP“, but it can be detected without RELP.

You can aggregate log files with rsyslog or by using papertrails remote_syslog daemon.

Alerting is available, including for inactivity of events.

Papertrails documentation is good and support is reasonable. Due to the huge amounts of traffic they have to deal with, they are unable to trouble-shoot any issues you may have. If you still want to go down the papertrail path, to get started, work through this which sets up your rsyslog to use UDP (specified in the /etc/rsyslog.conf by a single ampersand in front of the target syslog server). I want something more reliable than that, so I use two ampersands, which specifies TCP.

As we’re going to be sending our logs over the internet for now, we need TLS. Check papertrails CA server bundle for integrity:

curl https://papertrailapp.com/tools/papertrail-bundle.pem | md5sum

Should be: c75ce425e553e416bde4e412439e3d09

If all good throw the contents of that URL into a file called papertrail-bundle.pem.
Then scp the papertrail-bundle.pem into the web servers /etc dir. The command for that will depend on whether you’re already on the web server and you want to pull, or whether you’re somewhere else and want to push. Then make sure the ownership is correct on the pem file.

chown root:root papertrail-bundle.pem

install rsyslog-gnutls

apt-get install rsyslog-gnutls

Add the TLS config

$DefaultNetstreamDriverCAFile /etc/papertrail-bundle.pem # trust these CAs
$ActionSendStreamDriver gtls # use gtls netstream driver
$ActionSendStreamDriverMode 1 # require TLS
$ActionSendStreamDriverAuthMode x509/name # authenticate by host-name
$ActionSendStreamDriverPermittedPeer *.papertrailapp.com

to your /etc/rsyslog.conf. Create egress rule for your router to let traffic out to dest port 39871.

sudo service rsyslog restart

To generate a log message that uses your system syslogd config /etc/rsyslog.conf, run:

logger "hi"

should log “hi” to /var/log/messages and also to papertrail, but it wasn’t.

# Show a live update of the last 10 lines (by default) of /var/log/messages
sudo tail -f [-n <number of lines to tail>] /var/log/messages

OK, so lets run rsyslog in config checking mode:

/usr/sbin/rsyslogd -f /etc/rsyslog.conf -N1

Output all good looks like:

rsyslogd: version <the version number>, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: End of config validation run. Bye.

Trouble-shooting

  1. https://www.loggly.com/docs/troubleshooting-rsyslog/
  2. http://help.papertrailapp.com/
  3. http://help.papertrailapp.com/kb/configuration/troubleshooting-remote-syslog-reachability/
  4. /usr/sbin/rsyslogd -version will provide the installed version and supported features.

Which didn’t help a lot, as I don’t have telnet installed. I can’t ping from the DMZ as ICMP is not allowed out and I’m not going to install tcpdump or strace on a production server. The more you have running, the more surface area you have, the greater the opportunities to exploit.

So how do we tell if rsyslogd is actually running if it doesn’t appear to be doing anything useful?

pidof rsyslogd

or

/etc/init.d/rsyslog status

Showing which files rsyslogd has open can be useful:

lsof -p <rsyslogd pid>

or just combine the results of pidof rsyslogd

sudo lsof -p $(pidof rsyslogd)

To start with I had a line like:

rsyslogd 3426 root 8u IPv4 9636 0t0 TCP <web server IP>:<sending port>->logs2.papertrailapp.com:39871 (SYN_SENT)

Which obviously showed rsyslogd‘s SYN packets were not getting through. I’ve had some discussion with Troy from PT support around the reliability of plain TCP over TLS without RELP. I think if the server is business critical, then strategy two “maybe” the better option. Troy has assured me that they’ve never had any issues with logs being lost due to lack of reliability with out RELP. Troy also pointed me to their recommended local queue options. After adding the queue tweaks and a rsyslogd restart, it resulted in:

rsyslogd 3615 root 8u IPv4 9766 0t0 TCP <web server IP>:<sending port>->logs2.papertrailapp.com:39871 (ESTABLISHED)

I could now see events in the papertrail web UI in real-time.

Socket Statistics (ss)(the better netstat) should also show the established connection.

By default papertrail accepts TCP over TLS (TLS encryption check-box on, Plain text check-box off) and UDP. So if your TLS isn’t setup properly, your events won’t be accepted by papertrail. I later confirmed this to be true.

Check that our Logs are Commuting over TLS

Now without installing anything on the web server or router, or physically touching the server sending packets to papertrail or the router. Using a switch (ubiquitous) rather than a hub. No wire tap or multi-network interfaced computer. No switch monitoring port available on expensive enterprise grade switches (along with the much needed access). We’re basically down to two approaches I can think of and I really couldn’t be bothered getting up out of my chair.

  1. MAC flooding with the help of macof which is a utility from the dsniff suite. This essentially causes your switch to go into a “failopen mode” where it acts like a hub and broadcasts it’s packets to every port.

    MAC Flooding

    Or…
  2. Man in the Middle (MiTM) with some help from ARP spoofing or poisoning. I decided to choose the second option, as it’s a little more elegant.

    ARP Spoofing

On our MitM box, I set a static IP: address, netmask, gateway in /etc/network/interfaces and add domain, search and nameservers to the /etc/resolv.conf.

Follow that up with a service network-manager restart

On the web server run:

ifconfig -a

to get MAC: <MitM box MAC> On MitM box run the same command to get MAC: <web server MAC>
On web server run:

ip neighbour

to find MACs associated with IP’s (the local ARP table). Router was: <router MAC>.

myuser@webserver:~$ ip neighbour
<MitM box IP> dev eth0 lladdr <MitM box MAC> REACHABLE
<router IP> dev eth0 lladdr <router MAC> REACHABLE

Now you need to turn your MitM box into a router temporarily. On the MitM box run

cat /proc/sys/net/ipv4/ip_forward

You’ll see a ‘1’ if forwarding is on. If it’s not, throw a ‘1’ into the file:

echo 1 > /proc/sys/net/ipv4/ip_forward

and check again to make sure. Now on the MitM box run

arpspoof -t <web server IP> <router IP>

This will continue to notify <web server IP> that our (MitM box) MAC address belongs to <router IP>. Essentially… we (MitM box) are <router IP> to the <web server IP> box, but our IP address doesn’t change. Now on the web server you can see that it’s ARP table has been updated and because arpspoof keeps running, it keeps telling <web server IP> that our MitM box is the router.

myuser@webserver:~$ ip neighbour
<MitM box IP> dev eth0 lladdr <MitM box MAC> STALE
<router IP> dev eth0 lladdr <MitM box MAC> REACHABLE

Now on our MitM box, while our arpspoof continues to run, we start Wireshark listening on our eth0 interface or what ever interface your using, and you can see that all packets that the web server is sending, we are intercepting and forwarding (routing) on to the gateway.

Now Wireshark clearly showed that the data was encrypted. I commented out the five TLS config lines in the /etc/rsyslog.conf file -> saved -> restarted rsyslog -> turned on “Plain text” in papertrail and could now see the messages in clear text. Now when I turned off “Plain text” papertrail would no longer accept syslog events. Excellent!

One of the nice things about arpspoof is that it re-applies the original ARP’s once it’s done.

You can also tell arpspoof to poison the routers ARP table. This way any traffic going to the web server via the router, not originating from the web server will be routed through our MitM box also.

Don’t forget to revert the change to /proc/sys/net/ipv4/ip_forward.

Exporting Wireshark Capture

You can use the File->Save As… option here for a collection of output types, or the way I usually do it is:

  1. First completely expand all the frames you want visible in your capture file
  2. File->Export Packet Dissections->as “Plain Text” file…
  3. Check the “All packets” check-box
  4. Check the “Packet summary line” check-box
  5. Check the “Packet details:” check-box and the “As displayed”
  6. OK

Trouble-shooting messages that papertrail never shows

To run rsyslogd in debug

Check to see which arguments get passed into rsyslogd to run as a daemon in /etc/init.d/rsyslog and /etc/default/rsyslog. You’ll probably see a RSYSLOGD_OPTIONS="". There may be some arguments between the quotes.

sudo service rsyslog stop
sudo /usr/sbin/rsyslogd [your options here] -dn >> ~/rsyslog-debug.log

The debug log can be quite useful for trouble-shooting. Also keep your eye on the stderr as you can see if it’s writing anything out (most system start-up scripts throw this away).
Once you’ve finished collecting log:
ctrl+C

sudo service rsyslog start

To see if rsyslog is running

pidof rsyslogd
# or
/etc/init.d/rsyslog status
Turn on the impstats module

The stats it produces show when you run into errors with an output, and also the state of the queues.
You can also run impstats on the receiving machine if it’s in your control. Papertrail obviously is not.
Put the following into your rsyslog.conf file at the top and restart rsyslog:

# Turn on some internal counters to trouble-shoot missing messages
module(load="impstats"
interval="600"
severity="7"
log.syslog="off"

# need to turn log stream logging off
log.file="/var/log/rsyslog-stats.log")
# End turn on some internal counters to trouble-shoot missing messages

Now if you get an error like:

rsyslogd-2039: Could not open output pipe '/dev/xconsole': No such file or directory [try http://www.rsyslog.com/e/2039 ]

You can just change the /dev/xconsole to /dev/console
xconsole is still in the config file for legacy reasons, it should have been cleaned up by the package maintainers.

GnuTLS error in rsyslog-debug.log

By running rsyslogd manually in debug mode, I found an error when the message failed to send:

unexpected GnuTLS error -53 in nsd_gtls.c:1571

Standard Error when running rsyslogd manually produces:

GnuTLS error: Error in the push function

With some help from the GnuTLS mailing list:

That means that send() returned -1 for some reason.” You can enable more output by adding an environment variable GNUTLS_DEBUG_LEVEL=9 prior to running the application, and that should at least provide you with the errno. This didn’t actually provide any more detail to stderr. However, thanks to Rainer we do now have debug.gnutls parameter in the rsyslog code that if you specify this global variable in the rsyslog.conf and assign it a value between 0-10 you’ll have gnutls debug output going to rsyslog’s debug log.

Strategy Two

Rsyslog, TCP, local queuing, TLS, RELP, SEC, syslog server on local network. Notification for inactivity of events could be performed by cron and SEC?
LogAnalyzer also created by Rainer Gerhards (rsyslog author), but more work to setup than an on-line service you don’t have to setup. In saying that. You would have greater control and security which for me is the big win here.
Normalisation also looks like Rainer has his finger in this pie.

In theory Adding RELP to TCP with local queues is a step-up in terms of reliability. Others have said, the reliability of TCP over TLS with local queues is excellent anyway. I’ve yet to confirm it’s excellence. At the time of writing this post,I’m seriously considering moving toward RELP to help solve my reliability issues.

Additional Resource

gentoo rsyslog wiki

Keeping Your Linux Server/s In Time With Your Router

March 28, 2015

Your NTP Server

With this set-up, we’ve got one-to-many Linux servers in a network that all want to be synced with the same up-stream Network Time Protocol (NTP) server/s that your router (or what ever server you choose to be your NTP authority) uses.

On your router or what ever your NTP server host is, add the NTP server pools. Now how you do this really depends on what your using for your NTP server, so I’ll leave this part out of scope. There are many NTP pools you can choose from. Pick one or a collection that’s as close to you’re NTP server as possible.

If your NTP daemon is running on your router, you’ll need to decide and select which router interfaces you want the NTP daemon supplying time to. You almost certainly won’t want it on the WAN interface (unless you’re a pool member) if you have one on your router.

Make sure you restart your NTP daemon.

Your Client Machines

If you have ntpdate installed, /etc/default/ntpdate says to look at /etc/ntp.conf which doesn’t exist without ntp being installed. It looks like this:

# Set to "yes" to take the server list from /etc/ntp.conf, from package ntp,
# so you only have to keep it in one place.
NTPDATE_USE_NTP_CONF=yes

but you’ll see that it also has a default NTPSERVERS variable set which is overridden if you add your time server to /etc/ntp.conf. If you enter the following and ntpdate is installed:

dpkg-query -W -f='${Status} ${Version}\n' ntpdate

You’ll get output like:

install ok installed 1:4.2.6.p5+dfsg-3ubuntu2

Otherwise install it:

apt-get install ntp

The public NTP server/s can be added straight to the bottom of the /etc/ntp.conf file, but because we want to use our own NTP server, we add the IP address of our server that’s configured with our NTP pools to the bottom of the file.

server <IP address of your local NTP server here>

Now if your NTP daemon is running on your router, hopefully you have everything blocked on its interface/s by default and are using a white-list for egress filtering.

In which case you’ll need to add a firewall rule to each interface of the router that you want NTP served up on.

NTP talks over UDP and listens on port 123 by default.

After any configuration changes to your ntpd make sure you restart it. On most routers this is done via the web UI.

On the client (Linux) machines:

sudo service ntp restart

Now issuing the date command on your Linux machine will provide the current time, yes with seconds.

Trouble-shooting

The main two commands I use are:

sudo ntpq -c lpeer

Which should produce output like:

            remote                       refid         st t when poll reach delay offset jitter
===============================================================================================
*<server name>.<domain name> <upstream ntp ip address> 2  u  54   64   77   0.189 16.714 11.589

and the standard NTP query program followed by the as argument:

ntpq

Which will drop you at ntpq’s prompt:

ntpq> as

Which should produce output like:

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 15720  963a   yes   yes  none  sys.peer    sys_peer  3

Now in the first output, the * in front of the remote means the server is getting it’s time successfully from the upstream NTP server/s which needs to be the case in our scenario. Often you may also get a refid of .INIT. which is one of the “Kiss-o’-Death Codes” which means “The association has not yet synchronized for the first time”. See the NTP parameters. I’ve found that sometimes you just need to be patient here.

In the second output, if you get a condition of reject, it’s usually because your local ntp can’t access the NTP server you set-up. Check your firewall rules etc.

Now check all the times are in sync with the date command.

350Z Hi-fi Install

February 28, 2015

When we acquired the 2004 350Z, it had a low quality hi-fi installed in it. The FM radio broadcast band in Japan is 76-90 MHz Which misses a large portion of the New Zealand stations.

It’s been a few years since I performed a car hi-fi installation, so I lent on some friends. Specifically Matthew Fung which sold me the AVK6’s and the LRx 5.1MT.

Previous Hi-fi install

Now this install was in my VY Ute. This image shows the sub woofer enclosure in place under the deck right next to the fuel tank. The metal deck panel gets fixed over this and then the tray liner goes over top of that.

VY sub woofer enclosure in place

 

Unlike the 350Z install, this enclosure didn’t have to be pretty at all because it would never be seen.

VY sub woofer and enclosure

 

I learnt a lot from this install that I would take forward to future installs. Namely the 350Z install

350Z Hi-fi Install

Some of Matthew’s advice was to spend in the following fashion:

35% of your budget on speakers, 35% on amplifier, 20% head unit and 10% on sub woofer. This is good advice.

Required Components

Other than the 350Z

  • Head unit: Alpine CDE-148EBT From Quality Car Audio
  • DIN/DDIN kit: 99-7402 from Quality Car Audio
  • Power amplifier: Italian made Audison LRx 5.1MT from Matthew Fung
  • Front speakers: Audison AVK6 6.5’s from Matthew Fung
  • Rear speakers: Existing factory from Matthew Fung
  • Sub woofers: 2 x Alpine SWR-12D4 From Quality Car Audio. One of the reasons for two sub woofers was appearance.
  • Sub woofer box: Fabricated with some old 19mm MDF cover sheets that have been in my garage since I left the carpentry trade about 15 years ago.
  • Square drive (because they are far superior to posi) super screws
  • Sound deadening: Single layer of Soundstream Deathmat from HyperDrive. Once finished about 15kg will be applied to floors, both front and boot and front doors.
  • In-line fuse block: 80amp from JayCar
  • Battery terminals: from JayCar
  • Power cable: 8m of STINGER SPW14TR 4 AWG GAUGE from Ebay
  • Speaker cable: Stinger SHW512B 100 ft Roll of HPM 12 Gauge Matte Blue Car Speaker Wire run to all speakers from Ebay
  • RCA cables: 3 x Stinger SI8217 Audio RCA Interconnect Cable 2 Channels 8000 Series 17 ft from Ebay
  • Internet of course
  • Carbon fiber vinyl wrap
  • Front speaker mounts: Rubber drain pipe transitions
  • Approximately one week labour

Yes I was tempted to skimp on the likes of the power cable, speaker wire, RCA cables because the good ones are much dearer than the cheap ones. This is where you really get what you pay for. If you want to produce a great result, aim for the best you can get here. A few hundred dollars really makes a big difference. I’ve used cheaper parts here on previous installs and the overall difference is very noticeable. It also give you room to upgrade components in the future without having to run all your wires again which is where most of the install time gets sucked up.

Day One

Design and Construct Sub Woofer Enclosure

Generally speaking if you are running two of the exact same sub woofers in the same air space you can just combine the air spaces, but if you swap one of the sub woofers for one that behaves differently, then you are likely to run into issues where the sound waves interfere with each other. Also if you have a sub woofer enclosure that has an air space for each sub woofer you also get bracing for free from the divider. Bracing is all about making the enclosure more rigid so that the sound waves don’t cause the casing to vibrate/move with the waves more than it should. The more rigid you can make the enclosure the more you help the enclosure do what it’s designed to do… produce accurate low frequencies.

sub woofer enclosure

I chose to have one enclosure with two separate air spaces, thus providing a rigid casing. There are three distinct volume calculations here.

  1. One for the main air space
  2. One for the arch way (Looks like a ‘D’ rotated 90° counter clockwise) you can see cut in the rear of the enclosure
  3. The space behind the rear of the main enclosure. This had to be done to accommodate the cast aluminium frame of the SWR-12D4.

I also allowed for adjustment in air space with the two ends of the enclosure, as they could be moved in or out depending on whether I needed more or less air space. As you can see in the image above, the rear and the front of the enclosure have tapered cuts down toward the bottom of the enclosure. These tapered cuts were made once A) the box had been constructed, B) the ends had been adjusted and fixed in place and C) I was certain of the internal air space.

All joints are wood glued and sealed with an acrylic sealant. Solvent based sealants can play havoc with some of the sub woofer internals, thus causing premature failure. The screws are varying lengths depending on the angle of the join. All screw holes were pre-drilled into the initial MDF that needs to be retained with the diameter of the thread, so that the thread doesn’t grip onto the MDF being retained. Then the MDF that the screws actually bite into are pre-drilled with a bit the diameter of the shank (which is a little smaller than the thread). If you don’t do this second part, the MDF will just split which means very little grip and essentially a very weak joint.

All screws were countersunk with a… countersink bit. This is done so that when you cover the box in your chosen covering, it all appears flat. Two circular holes cut with jig saw. The holes that the speaker cable passes through are also sealed with acrylic sealant.

I also chose not to use terminals on the sub woofer enclosure to fix speaker wires to, but rather run the speaker wires straight from the power amplifier to the sub woofer, thus removing one of the couplings. The sub woofer enclosure can still be disconnected at the power amplifier and at the speaker. If you’re planning on swapping sub woofers in and out frequently,

As I’m a carpenter by trade (from aprx 1990 – 2000), this construction was trivial.

For those of you unfamiliar with the 350Z, this is what the target space looks like (plan view) with the dimensions:

350Z sub woofer space

 

You’ll also notice the cross section arrows. They should actually be facing the other direction. The A:A section is the below image in reverse. If I was to go full width with the enclosure, it wouldn’t be able to be installed as the rear hatch opening is considerably narrower than the enclosures target location. It ends up being just over 100mm gap on each side of the enclosure once installed. I just fill this space with plumbers foam pipe lagging which actually looks fine. There is also a small gap in front of the enclosure (between enclosure and rear speakers) and behind it (between enclosure and strut brace). I’ve just pushed 12mm black PEF rod down the gap and it again looks fine. One of the rules in carpentry is, if you can’t hide a transition, then emphasis it. That’s what we’ve done with these gaps. The enclosure doesn’t need to be fixed in place in case of head on collision, as it can’t move forward but can tilt forward a little, but as the roof is too low to allow it to move forward far, it’s safe. Just make sure those huge sub woofers are well fixed as they would be like canon balls in an accident.

Volume Calculations

The gross internal target volume for the sub woofer was 0.85 ft³. It’s range is 0.23 ft³ both ways. I ended up being just below 0.85 ft³. So I was happy.

  1. For the main air space, that’s the largest one. I just worked out the area of the quadrilateral (bottom, front, top, rear) then multiply by the depth. All in cm’s, then convert cm³ to ft³. As I’m lazy, the easiest way is to just throw your values into the calculator here or here. These dimensions are all internal and specified in mm unless stated otherwise.
    EnclosureQuadrilateral
  2. For the semi-circle (‘D’ rotated 90° counter clockwise) it’s just πr2/2 for the area then multiplied by the depth of 1.9cm
  3. The following was the calculation for the triangle behind the ‘D’. this is also used to mount the power amplifier. The outer panel here carries across two of these internal spaces. Hope that makes sense? You’ll see an image below anyway of how the power amplifier is mounted on the rear of the enclosure. I also had the very rear panel of the enclosure over-hang the internal dimension by about 50mm. This was so that I could mount some LED strip lighting if I wanted to. The Audison LRx 5.1MT has it’s logo light up which has for now satisfied my geek addiction for LED mood lighting.
    enclosure triangle volume calculation

 

SWR-12D4 Dimensions

SWR-12D4 dimensions

SWR-12D4 Design

SWR-12D4 design

 

Day Two

Just more of the same thing plus fitment and making small adjustments. I found that by draping a towel over the rear strut brace and just sliding the enclosure over it and tilting it forward it’d just slot into place nicely.

Day Three

Removal of Internals

Removed all the internal panels that I’d need in order to run cables and apply sound deadening. I numbered each piece I pulled off in order, so that it was easier to work out the re-fit order when the time came. I also mask any plugs or screws etc to the panel so they don’t get lost.

  • Remove center console
  • Remove panels to get to rear speakers and to run power cable and speaker wires. I also decided to make use of the empty space which is about 2 ft³ behind the drivers seat. For now this is still left open. I’m looking around wreckers for another cubby-hole like the one behind the passenger seat, as they are identical. This will allow me to claim back the space that we’ve lost due to the enclosure going into the limited boot space.
  • Removed door sill panels and kick panels. Both of which can be pulled off gently.
  • Remove seats. Also followed the sequence in this post to reset the windows going all the way up.

Once I was done with the full install and started the car, the Engine Management Light (EML) came on. This will continue to happen until you reset it. First thing to do though is find out what the fault is if there is one. Follow the sequence of events here. Diagnose the fault code based on the flashes detailed here and here. My code ended up being “P0000” (No Self Diagnostic Failure Indicated). That’s:

10 (0) Long Blink
10 (0) Short Blinks
10 (0) Short Blinks
10 (0) Short Blinks

So I reseted the ECU and I was done.

Apply Sound Deadening

The 350Z’s have a lot of road noise by default. If you want to keep that out so that it doesn’t interfere with your new sounds and keep your sounds inside the car it can be a good idea to apply this liberally. Plus it’s a good time to do this when you have most of your internal panels removed. I’ve found that it made quite a big difference. The sounds are barely audible outside the vehicle even when played at a modest level. When cranked, you can still feel the low frequencies outside more than you can hear them. Believe it or not, that includes the lower frequencies. I applied this a little higher than you can see in the below image. All the rattles you hear in a lot of vehicles with large sub woofer setups are gone. This means the low frequencies are not loosing energy in the panels but are instead going into the cabin space. Which is exactly where we want them.

Sound Deadening to floors

Day Four

Power from Battery to Power Amplifier

Run power cable from battery through firewall to boot of car where amplifier will sit. This link shows a left hand drive 350Z. For right hand drive Z’s, it’s pretty similar.

Run RCA Leads

from front console to where sub woofer enclosure will live. Careful to keep speaker and RCA leads as far away from power cables as possible to reduce interference.

Here you can see the power cable on the left of the drive-shaft tunnel (top of image) and the three RCA leads on the right of the drive-shaft tunnel (bottom of image) zip-tied together and passing through the body just behind where the carpet finishes. You can also see the RCA lead ends looped around the gear stick in the image waiting to be plugged into the new head unit once installed.

Also notice the fold-out work light far left of the image. This is cordless and has magnets on the rear of both LED panels and the middle shaft. The LED panels can be pivoted around the shaft. This is one of the most handy work lights I’ve ever used. It also has a torch light in the middle shaft. This can be run with A) one panel running, B) two panels running, C) just the torch light running. It’s rechargeable and seems to run for as long as I need it each day. I think it’s supposed to run for three hours with only one panel running, but it seemed to run for longer than that. $30 from bunnings.

RCA leads and power cable

Run Speaker Wires

All except front left door and sub woofers.

For the front speakers, the hardest part of just about the entire install is getting the 12 gauge speaker wire through the plastic door harness without joiners. This post provided the detail I needed. Previous car hi-fi installs have been similar in that this task has been the most frustrating. I started with the right door (drivers door) which is harder than the passenger door for several reasons.

  • There is next to no visibility from underneath the steering wheel looking at where the speaker wire must pass through the plastic harness
  • The plastic harness has a greater population of used pins that you have to be very careful not to hit with your knife or drill. I used both tools.

The other thing to keep in mind is that the plastic harness that clicks into place on the car body must go in top first from the inside, not bottom first. This cost me a bit of time (I bent the metal retainer out of shape because I put bottom in first) and I ended up coming back to the driver side after I’d done the passenger side successfully and learnt a few more tricks.

For the front speakers, I ran the speaker wires down the door sills right next to the plastic clips that hold the sill covers on. Then through the holes at the back of the sill and through into the boot near where the power amplifier would be mounted.

I also ran 12 gauge wire for the rear speakers. There are plenty of holes to thread them through and keep them well concealed. Keeping in mind that they need to stay as far away from power cables as possible.

 

Factory Speaker Wire Colours

I took note of factory speaker wire colours in case I needed them later. I didn’t, but here they are anyway for anyone needing them:

Front Left

  • Red with silver loops
  • Blue with silver loops

Front Right: Didn’t capture these

Rear Left

  • Light green with silver loops
  • Light brown with dark brown stripes

Rear Right

  • Light orange with silver loops
  • Black and pink stripes with silver loops

 

Day Five

Fit Tweeters

See the image on Day Six for how the tweeter looks mounted. The small plastic triangle panel that the tweeter is fixed to from memory needs to be pulled in toward the car where it meets the window pane and then slid down. Mine had an existing round grill that looked like a tweeter was mounted underneath, but all it was was a grill. I drilled a hole just big enough to pass the tweeters existing wire through it. At this stage I’ve used double sided sponge tape. The tweeters are fairly heavy so we’ll wait and see if the tape continues to hold them on. If it doesn’t I’ll have to resort to some glue. Currently it’s been about a week and they’re still staying put. I used some acetone to clean the surface of both the rear of the tweeter and the area that it was going to be fixed to. Just be careful with this though, as it will start to melt the plastic, so make sure you only do it where it’s not going to be seen.

As you can see in the image, no wires are visible and it’s a good position for the sound stage. I’ve got the crossovers set to the highest gain for the tweeters. I thought this may have been to much, but it seems perfect. The lower frequencies are surprisingly well handled by the Audison AVK6 6.5″ drivers.

Run Speaker Wire to Left Front Door

Similar to the right door.

 

Day Six

Apply Sound Deadening to Side Doors

A lot of road noise comes in the doors on the 350Z’s. Also with your hi-fi setup, you’d loose a lot of energy through your doors. Applying sound deadening liberally to the doors as well as the floors stops a lot of this. It’ll also kill any rattles you may have had otherwise.

Fit Front Speakers with Crossovers

There is a sunken space where I fixed my crossovers. I soldered and crimped terminals to the speaker wires that get screwed to the crossover. I soldered and applied heat-shrink sleeving to the tweeter wires / 12 gauge wire junction.

Audison AVK6 6.5

 

Below is a closer look at what the 6.5″ and tweeter mounts look like. I used a drainage transition rubber pip for the 6.5″ driver. These are great for mitigating vibration and for cutting to the exact right size.
Now you have to get the off-set right here. The speaker from rear of rim to back is 70mm. I cut my rubber mounts to 40mm. This allowed 30mm sitting inside the door panel. The window pane when it’s wound down goes behind the speaker, so you don’t have a lot of room there. I think you’ve got about 35mm, so I had about 5mm clearance. If it’s not enough, the glass is going to hit the rear of your speaker and could be disastrous. 40mm mounts were actually to wide. It was quite hard to fit the door panels on as they were touching the speaker rim. I’d suggest making your mounts anything over 35mm (don’t go less), target should be about 36/37mm. So as you can see, this has got to be fairly accurate else either your window will hit the rear of the speaker or your door panel will be up against the speaker rim.
There is an excellent step by step guide to most of this here.

TweeterAndDoorSpeaker

Fit Rear Speakers

I was using the existing factory speakers as they don’t really matter that much. Why don’t they matter much? Most of your sound stage should be coming from the front of you.

In my case there were no signs of which speaker terminal was positive or negative. The best way to work this out is to use a 9v battery and touch wires from the batteries + and – terminals to each of the speakers terminals. if the speaker pops out slightly when connected, it means you’ve connected the batteries + to the speakers +. If the speaker pops in slightly when connected, it means you’ve connected the batteries + to the speakers -. This way you can tell which terminal you should connect your designated positive speaker wire. This method doesn’t harm your speaker as the voltage is too low.

 

Day Seven

Install Head Unit

Most of the steps you would need are here.

Fit DIN/DDIN kit 99-7402 to the dash assembly. As the CDE-148EBT is a single DIN unit, we get some space back in the form of another compartment below the head unit.

The Alpine CDE-148EBT comes with a harness. I only needed to solder -> heat shrink on the following harness wires to the existing wires. The existing wires were all factory labelled with tags. This was obviously very helpful. I had already located a wiring guide for this operation that I would not now need. Strange thing was with the guide that the colours were incorrect for my Z.

  • Orange (head unit illumination)
  • Red (ignition)
  • Yellow (battery)
  • Black (ground) just replaced existing ground wire
  • Blue/white (remote turn on, supplies 12v to power amp to tell it to turn on when head unit is on) I missed this one and had to pull the console out again and connect it. I go through this in day eight.

Connect Aux in, USB extension lead, all six RCA’s.

There are a lot of wires behind the head unit and it can be a bit tricky getting everything back in. Be gentle and patient.

Re-fit Seats and Some Panels

Self explanatory.

Apply Sound Deadening

Around where sub woofer enclosure is going to live. I applied it to the sides also.

SoundDeadeningAroundEnclosure

Fit 80 amp In-line Fuse

I fitted the fuse block about 150mm from the battery. The end of the in-line fuse block is just visible in the below image.

80 amp inline fuse

Install Sub Woofer Box

EnclosureFromTop

Fit Sub Woofers

If you plan on swapping your subs in and out a bit then I’d advise using the likes of these inserts to screw into. Screw these into your MDF then screw into them:

EnclosureMDFScrewFixings

It was about here that I realised that I should have gone with the  SWR-12D2 (the 2 ohm version) sub woofers. As I hadn’t actually used the sub woofers yet, I’d be able to just exchange the 4 ohm subs with the 2 ohm subs. Quality Car Audio refused to provide an exchange for the still new subs. Both 4 ohm and 2 ohm models where exactly the same price.

Some reading on under-powering sub woofers:

Connected Power Cable to Amplifier

Ran Earth Cable to Bare Metal

Unscrewed a bracket that came through the left side of the car in the spare wheel well and fitted the terminal I had soldered and crimped onto my 4 gauge earth cable between the bracket and the body of the car after I sanded the paint off for good connection. Cranked that bolt up nice and tight.

Connected all Speaker Wires to Power Amplifier

Connected RCA Leads

Test Front and Rear Speakers

Re-install internal panels

 

Day Eight

Power Amplifier Not Auto Powering On

Take power amp in to be tested as it wouldn’t power on. The car power amplifiers I’d used in the past would turn on when they receive signal from the pre-outs. That’s signal that’s pre-amplified. Turns out a jumper, well a plastic jumper with a single 12v remote turn on from the head unit was needed on the bottom left pin of the power amplifier. So out with the head unit again and run the extra wire. That’s why you want to leave re-fitting of panels as late as possible.

Cleaned up

Charged Battery

Initial Tuning

Power Amplifier

Currently I have the Audison LRx 5.1MT gains set to (where 0 is 0 and 10 is max):

  • Fronts: 8
  • Rears: 3 (existing very crappy factory speakers)
  • mono Sub woofer channel: 8

Head unit

  • Turn off internal power amp as it’s not used and helps to reduce power supply interference/noise
  • The Alpine CDE-148EBT has a lot of options and features when it comes to tuning your sound. This will keep me busy for a long time I’d say.

Sub Woofer Configuration

  1. I first tried 2 ohm setup with one speaker as a benchmark because I knew that it was ideal
  2. I then tried parallel 1 ohm which I don’t think provided enough energy to the sub woofers
  3. Third configuration I tried was in series 4 ohm which worked quite well.

I’ve also noticed that with the sub woofers pointing forward so I can actually hear them may not have been the best design decision. Especially with two of them running in mono as it affects the sound stage a bit. it probably would have produced a slightly better result if they had of been pointing toward the rear of the car under the strut brace so that the bass could be felt more than heard. In saying that, it’s still early days and I have a lot of tuning to get the levels just right. Also as I didn’t want to remove the spare wheel which sits under the boot base in order to mount the power amplifier under there, one of the only places left would be to mount it on top of the sub woofer enclosure, which although it is rather a good looking piece of hardware. I think 12″ sub woofers look better.

EnclosureFromFront

A bit more reading on parallel verses series wiring of your sub woofers.

Mounted Power Amplifier to Rear of Sub Woofer Enclosure

Audison LRx 5.1MT

Overall Sound

I’m kind of surprised that I’ve managed to achieve such a truthful sound of my recordings. I wasn’t sure this was possible in a car. Think of studio monitors and that’s what this system reminds me of. All of a sudden many recordings sound bad and the good recordings sound outstanding and produce the emotion that high quality music (recording, mix, mastering) is renowned for. For example my personal recordings that I did a few years ago sound amazing as I had the help from one of New Zealand’s best sound engineers / producers (Ian McAllister).

I think also the fact that I didn’t take any short-cuts that I’m aware of? It’s all the small details that add up as well as using great components.

Ongoing

Hunting down boot carpet for reduced space. The factory carpet in the Z is not actually carpet but just felt. So I’m going to get some real carpet with the ‘Z’ embroidered into it. There are a few places that actually supply this.

 

GnuPG Key-Pair with Sub-Keys

January 31, 2015

There are quite a few other posts on this topic, but my set-up hasn’t been exactly the same as any I found, so I found myself using quite a few resources to achieve exactly what I wanted.

Synopsis

For my personal work, I mostly use GNU/Linux distributions. All of the following operations have been carried out on such platforms and should work on any Debian derivative.

The initial set-up was performed on a machine other than a laptop. Then I discuss the process I took to get my key pairs into a laptop environment.

All keys are created using the RSA cryptosystem.

I’m going to create a large (4096 bit) RSA key-pair as my master (often called primary) key and then create a smaller (2048 bit) key-pair for signing and then another (2048 bit) key-pair for encrypting/decrypting.

Most of the work is done on the command line.

If you haven’t already got gnupg installed (accessed by the gpg command), run the following command as root. More than likely it’s already installed by default though:

apt-get install gnupg

Run gpg from command line. If it’s the first time it’s been run it’ll produce output like the following. This initialises your .gnupg directory and configuration:

gpg: directory `/home/<you>/.gnupg' created
gpg: new configuration file `/home/<you>/.gnupg/gpg.conf' created
gpg: WARNING: options in `/home/<you>/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/home/<you>/.gnupg/secring.gpg' created
gpg: keyring `/home/<you>/.gnupg/pubring.gpg' created
gpg: Go ahead and type your message ...

Just press Ctrl+d to terminate gpg.

Use the sks key-server pool

This section is optional apart from the first three lines that need to be added to the ~/.gnupg/gpg.conf file. The step of using the pool over TLS can of course be done later.

Rather than rely on a single specific key-server and also over an encrypted channel by using the hkps protocol. If a single server is not functioning properly it’s automatically removed from the pool.

In order to use the hkps protocol (hkp over TLS):

sudo apt-get install gnupg-curl

Now you will have a ~/.gnupg/gpg.conf file you can add the following lines to the end of the config file (SHA-1 (the default) is no longer considered secure).

personal-digest-preferences SHA512
cert-digest-algo SHA512
default-preference-list SHA512 SHA384 SHA256 SHA224 AES256 AES192 AES CAST5 ZLIB BZIP2 ZIP Uncompressed
keyid-format 0xlong
with-fingerprint

There may be a keyserver and keyserver-options option in the ~/.gnupg/gpg.conf already. If so, modify it, if not, add it.

keyserver hkps://hkps.pool.sks-keyservers.net
keyserver-options ca-cert-file=/home/kim/.gnupg/sks-keyservers.netCA.pem

This assumes you downloaded the sks-keyservers.net CA certificate and put it in ~/.gnupg/ . You can of course put it anywhere, but the keyserver-options path will need to reflect your placement.

Verify the certificate’s fingerprint. Compare the fingerprint from the previous link with the output from the following command. It should be the same:

openssl x509 -in sks-keyservers.netCA.pem -fingerprint -noout

Anywhere below where the --keyserver option is specified, can be omitted if you’ve set-up the key-server pool.

Master Key-Pair Generation

This process will create a master key-pair that can be used for signing and a sub key-pair for encrypting/decrypting. We’re actually only going to use the master key-pair that’s created out of this process and we won’t use it for anything other than simply being a master, creating other key-pairs with it, signing other peoples keys etc. We won’t be using it for signing, encrypting/decrypting. We will create two additional sub-keys for this purpose in a bit.

This allows us to remove the master key from our computer and put it in a safe place (disconnected entirely from the network) that can’t be easily accessed. This means that if any of our computers are compromised, the attacker only gets access to our sub-keys which are the keys we use to actually do our day to day work of signing, encrypting outgoing messages and decrypting incoming messages.

On top of this they also need our pass phrase in order to compromise our identity. If in fact an attacker is able to compromise this as well, then we bring our master key out of hiding and can easily revoke the compromised sub key-pair(s) of which the public part is probably on a key-server or your blog or website. This way, when ever anyone gets your public sub-keys from one of the many key-servers or your blog or website, they will see that the public key(s) have been compromised and thus deprecated.

Now run:

gpg --gen-key

Output:

gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection?

I chose 1. That’s (1) RSA and RSA (default)

RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)

Now because this is the master and we’re not actually going to be using it for signing our own messages and encrypting/decrypting and in theory we’ll probably just keep extending it’s expiry date indefinitely, we make it 4096 bit. Why? Because hardware is getting faster all the time and at some stage 2048 bit keys will not be large enough for cryptographic security. Why would we keep extending the master key-pair expiry date? Because we’ve worked hard to acquire other peoples trust (signatures of it) and we don’t really want to go through all that again. That’s why I’ve decided to not actually use the master for day to day work and do everything in my power to make sure it’s never compromised. If somehow the master key-pair was compromised, then I’d still have a revocation certificate that I could use to revoke it. It’d just be a pain though. I go through the creation of the revocation certificate for the master key-pair below.

4096 # Use smaller for sub-keys, as we can replace them easily when it becomes easier to crack them.
Requested keysize is 4096 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0)

I chose 5y

Because I want my master key to expire eventually if it’s compromised along with the pass-phrase and somehow I lost the multiple copies of the master revocation certificate. If it never gets compromised, I’ll just keep extending the expiry date.

Key expires at Fri 06 Dec 2019 23:32:56 NZDT
Is this correct? (y/N)

I chose:

y
You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"

Real name:

Enter your real name:

Kim Carter
Email address:

Enter your email address:

First.Last@provider.com
Comment:

Here you can enter something like your website address or your on-line handle or what ever is useful for providing some more identification

lethalduck
You selected this USER-ID:
    "Kim Carter (lethalduck) <First.Last@provider.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?

Enter ‘O’ to continue:

O

Now you’re asked for a passphrase. Make this long and hard to guess. I don’t remember this myself. That’s why I use a password vault. To have unique credentials for everything.

You need a Passphrase to protect your secret key.

This is not my passphrase, but it’s a good example of one. Adding the extra characters that are all the same actually makes for a much harder to crack code. Oh, you’ll be prompted to enter this twice.

....................MW$]T&LP[=:[f/8=RQQ0M!++kMreX"....................

Now you’re asked to generate the entropy. This is done by interacting with the computer. keystrokes, mouse movements, storage media work. I find running my rsync scripts now is quite effective.

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

Not enough random bytes available.  Please do some other work to give
the OS a chance to collect more entropy! (Need 187 more bytes)

I added a pass phrase and waited for the entropy to be collected.
Once gpg has enough entropy, your key-pairs (master for signing, sub-key for encrypting/decrypting) will be created.

gpg: /home/kim/.gnupg/trustdb.gpg: trustdb created
gpg: key F90A5A4E marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2019-12-06
pub 4096R/F90A5A4E 2014-12-07 [expires: 2019-12-06]
Key fingerprint = D6B6 1E46 4DC9 A3E9 F450 F7F8 C9FA 6F23 F90A 5A4E
uid Kim Carter (lethalduck) <First.Last@provider.com>
sub 4096R/65CA12E5 2014-12-07 [expires: 2019-12-06]

Add photo to a uid

Now I wanted to add a photo to my master key-pair.
PGP specifies that the image be no grater than 120×144. GPG recommends it be 240×288. So I chose the smaller size and reduced the quality as much as possible. Could only get it down to 10kb before the image became unrecognisable.

gpg --edit-key F90A5A4E
# or safer...
gpg --edit-key '<your fingerprint>'
# Don't know your fingerprint?
gpg --list-keys
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
[ultimate] (1). Kim Carter (lethalduck) <First.Last@provider.com>

gpg>

To add a jpeg:

addphoto

gpg complained that my 10kb image was very large, so I ditched adding the photo.

Add a sub-key for signing

Now before we go any further I just want to make note of the prefixes and suffixes that you’ll often encounter with gpg commands.

Listing your keys with

gpg -K # list secret keys

or

gpg -k # list public keys

will show the following prefixes for your keys.

sec === (sec)ret key
ssb === (s)ecret (s)u(b)-key
pub === (pub)lic key
sub === public (sub)-key

Roles of the key-pair will be represented by the middle character below.

Constant Character Explanation
PUBKEY_USAGE_SIG S Key is good for signing
PUBKEY_USAGE_CERT C Key is good for certifying other signatures
PUBKEY_USAGE_ENC E Key is good for encryption
PUBKEY_USAGE_AUTH A Key is good for authentication

When we add sub-keys, they are bound to the master key. The master key is modified to reference the sub-keys

What we want to do is add a sub-key for signing so we can move the master key-pair off of the machine and into a safe place.
We also want to change the expiry date and reduce the size to 2048 of both the new signing sub-key and also create another sub-key for encryption with a shorter expiry date.

Create backup of your ~/.gnupg directory:

umask 077; tar -cf $HOME/gnupg-backup.tar -C $HOME .gnupg

To add a signing sub-key:

gpg --edit-key F90A5A4E
# or safer...
gpg --edit-key '<your fingerprint>'
# Don't know your fingerprint?
gpg --list-keys

Output:

gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
[ultimate] (1). Kim Carter (lethalduck) <First.Last@provider.com>

gpg>

Now we add the key

addkey
Key is protected.

You need a passphrase to unlock the secret key for
user: "Kim Carter (lethalduck) <First.Last@provider.com>"
4096-bit RSA key, ID F90A5A4E, created 2014-12-07

Please select what kind of key you want:
   (3) DSA (sign only)
   (4) RSA (sign only)
   (5) Elgamal (encrypt only)
   (6) RSA (encrypt only)
Your selection?

Now we want (4) RSA (sign only)

4

Output:

RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)

Choose 2048 because we can easily regenerate this key-pair or extend the expiry date and at this stage 2048 is secure enough.

2048

Output:

Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0)

I set this to 2y

Key expires at Wed 07 Dec 2016 01:21:11 NZDT
Is this correct? (y/N)

y

Really create? (y/N)

y

After this gpg collects more entropy. When it’s done it dumps you back to the gpg prompt

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

Not enough random bytes available.  Please do some other work to give
the OS a chance to collect more entropy! (Need 186 more bytes)
.......+++++
.+++++

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
sub  2048R/7A3122BD  created: 2014-12-07  expires: 2016-12-06  usage: S
[ultimate] (1). Kim Carter (lethalduck) <First.Last@provider.com>

gpg>

Now you can see from the ‘S’ suffix that we do now have a sub-key that’s “good for signing”

Same again but for encrypting

While still at the gpg prompt, run addkey again but choose option 6.

That’s (6) RSA (encrypt only)
Choose 2048 for the keysize.
Choose 2y (two years) for how long the key is valid for.

Eventually you’ll see:

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
sub  2048R/7A3122BD  created: 2014-12-07  expires: 2016-12-06  usage: S
sub  2048R/8FF9669C  created: 2014-12-07  expires: 2016-12-06  usage: E
[ultimate] (1). Kim Carter (lethalduck) <First.Last@provider.com>

gpg>

Now you can see from the ‘E’ suffix that we do now have a sub-key that’s “good for encryption”

To save the new keys before finishing with gpg, type save.

Create Revocation Certificate for Master Key

gpg --output F90A5A4E.gpg-revocation-certificate --gen-revoke F90A5A4E

Output:

sec  4096R/F90A5A4E 2014-12-07 Kim Carter (lethalduck) <First.Last@provider.com>

Create a revocation certificate for this key? (y/N)

Type y

Please select the reason for the revocation:
  0 = No reason specified
  1 = Key has been compromised
  2 = Key is superseded
  3 = Key is no longer used
  Q = Cancel
(Probably you want to select 1 here)
Your decision?

Type 1

Enter an optional description; end it with an empty line:
>

Enter anything you like here.

This revocation certificate was generated when the key was created.
>
Reason for revocation: Key has been compromised
This revocation certificate was generated when the key was created.
Is this okay? (y/N)

y

Output:

You need a passphrase to unlock the secret key for
user: "Kim Carter (lethalduck) <First.Last@provider.com>"
4096-bit RSA key, ID F90A5A4E, created 2014-12-07

ASCII armored output forced.
Revocation certificate created.

Please move it to a medium which you can hide away; if Mallory gets
access to this certificate he can use it to make your key unusable.
It is smart to print this certificate and store it away, just in case
your media become unreadable.  But have some caution:  The print system of
your machine might store the data and make it available to others!

Now store your master key-pair revocation certificate somewhere off of the network. Preferably in more than one place also.

Copy ~/.gnupg to an external device (/media/) for safe keeping before we remove the master key-pair from your computer.

Remove master key

Following are the commands to do this.

gpg --export-secret-subkeys F90A5A4E > /media/<your encrypted USB device>/subkeys
gpg --delete-secret-key F90A5A4E

Output:

gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

sec  4096R/F90A5A4E 2014-12-07 Kim Carter (lethalduck) <First.Last@provider.com>

Delete this key from the keyring? (y/N)

Type y

This is a secret key! - really delete? (y/N)

Type y

gpg --import /media/<your encrypted USB device>/subkeys

Output:

gpg: key F90A5A4E: secret key imported
gpg: key F90A5A4E: "Kim Carter (lethalduck) <First.Last@provider.com>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1
gpg:       secret keys read: 1
gpg:   secret keys imported: 1

Now check to make sure that the master key-pair is no longer on your computer but is on your USB device:

gpg -K

Output:

sec#  4096R/F90A5A4E 2014-12-07 [expires: 2019-12-06]
uid                  Kim Carter (lethalduck) <First.Last@provider.com>
ssb   4096R/65CA12E5 2014-12-07
ssb   2048R/7A3122BD 2014-12-07
ssb   2048R/8FF9669C 2014-12-07
gpg --home=/media/<your encrypted USB device>/.gnupg/ -K

Output:

sec   4096R/F90A5A4E 2014-12-07 [expires: 2019-12-06]
uid                  Kim Carter (lethalduck) <First.Last@provider.com>
ssb   4096R/65CA12E5 2014-12-07
ssb   2048R/7A3122BD 2014-12-07
ssb   2048R/8FF9669C 2014-12-07

You can see that the first command shows sec#. This means there is no master key-pair in your ~/.gnupg/ directory.

Upload your Public Keys to KeyServer

Remember if you used a key-server pool, anywhere the --keyserver option is specified, can be omitted.

I’ve chosen https://pgp.mit.edu/
You can choose any public keyserver. They all communicate with each other and sync updates at least daily. You can also send more than one public key by adding additional Ids after the –send-keys.

gpg --keyserver hkp://pgp.mit.edu/ --send-keys F90A5A4E

Output:

gpg: sending key F90A5A4E to hkp server pgp.mit.edu

Download public keys from KeyServer

gpg --keyserver hkp://pgp.mit.edu/ --recv-keys <key id to receive and merge signatures>

A safer way to do this is to not just trust every key from a key-server, but rather to verify the key belongs to who you think it belongs to before you download and trust it. Try one at a time and use the fingerprint rather than just the short Id.

gpg --keyserver hkp://pgp.mit.edu/ --recv-key '<fingerprint>'

The single quotes are mandatory around the fingerprint. Double quotes will also work.

Refreshing local Keys from Key-Server

gpg --refresh-keys

Set-up the Laptop with your key-pairs

Copy the contents of the desktops ~/.gnupg/ to the laptops ~/.gnupg/ . I just used the same USB drive for this, but made sure I didn’t mix this .gnupg/ up with the one that had the master key. Then delete the copy without the master key once copied to save any confusion. Also keep in mind that when you delete files from a flash drive they are not actually deleted. That’s why it’s important to use an encrypted USB drive. Also keep it in a very safe place, make a copy of it and keep that off site in a very safe place also.

Make sure you check the permissions of the ~/.gnupg files you just copied to the laptop so that they are the same as the files crated with the gpg command.

Adding another E-Mail Address

Now it’s easier if you do this here in the sequence, but I didn’t think about it until after I’d uploaded the public keys. If you do want to add another uid once you’ve moved the master key, copied your master key’less sub-keys to your laptop, it just means you’ve got to operate on the master key that you moved into /media//.gnupg/, then copy the contents of /media//.gnupg/ back to ~/.gnupg/ on both your desktop and laptop machines not forgetting to change file permissions again, remove master key from ~/.gnupg/ and upload the modified public keys again.

This is how you would add the additional uid:

gpg --home=/media/<your encrypted USB device>/.gnupg --edit-key F90A5A4E
# or safer...
gpg --home=/media/<your encrypted USB device>/.gnupg --edit-key '<your fingerprint>'
# Don't know your fingerprint?
gpg --list-keys

Output:

gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

gpg: DBG: locking for `/media/<your encrypted USB device>/.gnupg/trustdb.gpg.lock' done via O_EXCL
pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
sub  2048R/7A3122BD  created: 2014-12-07  expires: 2016-12-06  usage: S
sub  2048R/8FF9669C  created: 2014-12-07  expires: 2016-12-06  usage: E
[ultimate] (1). Kim Carter (lethalduck) <First.Last@provider.com>

gpg>

Add the extra uid now:

adduid

Output:

Real name:

Enter your real name:

Kim Carter

Output:

Email address:

Enter the additional email address you want:

kim.carter@owasp.org

Output:

Comment:

Add the web page that adds some proof of identity:

https://www.owasp.org/index.php/New_Zealand

Output:

You selected this USER-ID:
    "Kim Carter (https://www.owasp.org/index.php/New_Zealand) <kim.carter@owasp.org>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?

Type O

Output:

You need a passphrase to unlock the secret key for
user: "Kim Carter (lethalduck) <First.Last@provider.com>"
4096-bit RSA key, ID F90A5A4E, created 2014-12-07

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
sub  2048R/7A3122BD  created: 2014-12-07  expires: 2016-12-06  usage: S
sub  2048R/8FF9669C  created: 2014-12-07  expires: 2016-12-06  usage: E
[ultimate] (1)  Kim Carter (lethalduck) <First.Last@provider.com>
[ unknown] (2). Kim Carter (https://www.owasp.org/index.php/New_Zealand) <First.Last@owasp.org>

gpg>

Now we want the same trust level applied to the second uid as the existing:

trust

Output:

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
sub  2048R/7A3122BD  created: 2014-12-07  expires: 2016-12-06  usage: S
sub  2048R/8FF9669C  created: 2014-12-07  expires: 2016-12-06  usage: E
[ultimate] (1)  Kim Carter (lethalduck) <First.Last@provider.com>
[ unknown] (2). Kim Carter (https://www.owasp.org/index.php/New_Zealand) <First.Last@owasp.org>

Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

  1 = I don't know or won't say
  2 = I do NOT trust
  3 = I trust marginally
  4 = I trust fully
  5 = I trust ultimately
  m = back to the main menu

Your decision?

Type 5

Output:

Do you really want to set this key to ultimate trust? (y/N)

Type y

Output

pub  4096R/F90A5A4E  created: 2014-12-07  expires: 2019-12-06  usage: SC
                     trust: ultimate      validity: ultimate
sub  4096R/65CA12E5  created: 2014-12-07  expires: 2019-12-06  usage: E
sub  2048R/7A3122BD  created: 2014-12-07  expires: 2016-12-06  usage: S
sub  2048R/8FF9669C  created: 2014-12-07  expires: 2016-12-06  usage: E
[ultimate] (1)  Kim Carter (lethalduck) <First.Last@provider.com>
[ unknown] (2). Kim Carter (https://www.owasp.org/index.php/New_Zealand) <First.Last@owasp.org>

gpg>

Don’t worry that it still looks like it’s unknown. Once you save and try to edit again, you’ll see the change has been saved.

If you want to make the uid that you’ve tentatively just added your primary, select it:

uid 2

issue the command:

primary

and finally save:

save

Sign Someone Else’s Public Key

You’re going to have to download, import the persons key into your ~/.gnupg/pubring.gpg

If you’ve got a key-server pool configured, you won’t need the --keyserver option.

gpg --recv-key '<fingerprint of public key you want to import>'
gpg --home=/media/<your encrypted USB device>/.gnupg/ --primary-keyring=~/.gnupg/pubring.gpg --sign-key '<fingerprint of public key you want to sign>'

There will be some other output here. I wasn’t actually asked which trust level I wanted to provide, so I carried out the following edit.

gpg --edit-key '<fingerprint of public key you want to sign>'

Output:

gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

pub  4096R/<id of public key you just signed>  created: 2014-05-09  expires: never       usage: SC
                               trust: unknown       validity: unknown
sub  4096R/<a sub-key>  created: 2014-05-09  expires: never       usage: E
sub  4096R/<another sub-key>  created: 2014-05-09  expires: 2019-05-08  usage: S
[ unknown] (1). <key holders name> (4096 bit key generated 9/5/2014) <e-mail1@gmail.com>
[ unknown] (2)  <key holders name> (Their key) <e-mail@somethingelse.com>
[ unknown] (3)  <key holders name> (Their Yahoo) <e-mail@yahoo.com>
[ unknown] (4)  <key holders name> (Their Other Email Account) <e-mail@whatever.org>

gpg>

Issue the trust command:

trust

Output:

pub  4096R/<id of public key you just signed>  created: 2014-05-09  expires: never       usage: SC
                               trust: unknown       validity: unknown
sub  4096R/<a sub-key>  created: 2014-05-09  expires: never       usage: E
sub  4096R/<another sub-key>  created: 2014-05-09  expires: 2019-05-08  usage: S
[ unknown] (1). <key holders name> (4096 bit key generated 9/5/2014) <e-mail1@gmail.com>
[ unknown] (2)  <key holders name> (Their key) <e-mail@somethingelse.com>
[ unknown] (3)  <key holders name> (Their Yahoo) <e-mail@yahoo.com>
[ unknown] (4)  <key holders name> (Their Other Email Account) <e-mail@whatever.org>

Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

  1 = I don't know or won't say
  2 = I do NOT trust
  3 = I trust marginally
  4 = I trust fully
  5 = I trust ultimately
  m = back to the main menu

Your decision?
3

Output:

pub  4096R/<id of public key you just signed>  created: 2014-05-09  expires: never       usage: SC
                               trust: marginal      validity: unknown
sub  4096R/<a sub-key>  created: 2014-05-09  expires: never       usage: E
sub  4096R/<another sub-key>  created: 2014-05-09  expires: 2019-05-08  usage: S
[ unknown] (1). <key holders name> (4096 bit key generated 9/5/2014) <e-mail1@gmail.com>
[ unknown] (2)  <key holders name> (Their key) <e-mail@somethingelse.com>
[ unknown] (3)  <key holders name> (Their Yahoo) <e-mail@yahoo.com>
[ unknown] (4)  <key holders name> (Their Other Email Account) <e-mail@whatever.org>
Please note that the shown key validity is not necessarily correct
unless you restart the program.

gpg>

Email the Signed Public-Key

In order to send an email with the freshly signed public-key, attach the file generated with the following command, encrypt and send the email to the owner of the public key specified by their uid. Details on how to encrypt the e-mail are specific to the e-mail client you choose to use.

gpg --armor --output <long id of receivers public key>.signed-by.0xc9fa6f23f90a5a4e --export '<fingerprint of public key you just signed>'

Upload the Signed Public-Key to a Key Server

 

gpg --send-key '<fingerprint of public key you just signed>'

Output:

gpg: sending key <long id of receivers public key> to hkps server hkps.pool.sks-keyservers.net

Verify to make sure you’re public domain signing is good.

Import Your Public-Key Signed by Someone Else

At some stage you may need to import a copy of your public-key in the form of a file that someone else has added their signature to

gpg --import ./0xC9FA6F23F90A5A4E.signed-by-<someone else's long id>.asc

Then view your new signatures:

gpg --list-sigs 0xC9FA6F23F90A5A4E

Then upload them again with --send-key
and pull them down to your other machines with --refresh-keys. You’ll also need to --recv-key their keys so that your key recognises who the signatories are. Or… just simply copy over your ~/.gnupg/ directory. Make sure to check your permissions before and after the copy though. We don’t want anyone other than you being able to read these files. Especially the secring.gpg and any pem certs you have.

Browser based E-Mail

Two browser extensions that look OK are:

  1. Mailvelope for Firefox and Chrome (I’m using this)
    Getting set-up details
    Details of how this works here
  2. Mymail-Crypt for Gmail

Desktop based E-Mail

Thunderbird with enigmail

I also found that to send or reply to someone and encrypt, that I had to make a change in Thunderbird, as Thunderbird wrongly thinks I’m not trusting identities when I have specifically set trust levels. I’ve heard comments that if you set the trust level in gpg to “I trust ultimately” then Enigmail is happy to send your mail. I only trust myself ultimately so I found another way.
If you go into the Edit menu of Thunderbird -> Account Settings. For each email account in your gpg signature… OpenPGP Security -> Enigmail Preferences… -> Change “To send encrypted, accept” from Only trusted keys to “All usable keys”. Then when you get the final confirmation of sending encrypted email, you are asked to confirm the 8 digit ID. I just double check by

gpg --edit-key '<keyID that Thunderbird says it's using>'

Additional Resources I’ve Collected

Posts/articles, Documentation

Podcasts

Installation and Hardening of Debian Web Server

December 27, 2014

These are the steps I took to set-up and harden a Debian web server before being placed into a DMZ and undergoing additional hardening before opening the port from the WWW to it. Most of the steps below are fairly simple to do, and in doing so, remove a good portion of the low hanging fruit for nasty entities wanting to gain a foot-hold on your server->network.

Install and Set-up

Debian wheezy, currently stable (supported by the Debian security team for a year or so).

Creating ESXi 5.1 guest

First thing to do is to setup a virtual switch for the host under the Configuration tab. Now I had several quad port Gbit Ethernet adapters in this server. So I created a virtual switch and assigned a physical adapter to it. Now when you create your VM, you choose the VM Network assigned to the virtual switch you created. Provision your disks. Check the “Edit the virtual machine settings before completion” and Continue. You will now be able to modify your settings before you boot the VM. I chose 512MB of RAM at this stage which is far more than it actually needs. While I’m provisioning and hardening the Debian guest, I have the new virtual switch connected to the clients LAN.

ESX Network Configuration

Once we’re done, we can connect the virtual switch up to the new DMZ physical switch or strait into the router. Upload the debian .iso that you downloaded to the ESXi datastore. Then edit the VM settings and select the CD/DVD drive. Select the “Datastore ISO File” option and browse to the .iso file and select the “Connect at power on” option.

6_NewVMSelectIso

Kick the VM in the guts and flick to the VM’s Console tab.

OS Installation

Partitioning

Deleted all the current partitions and added the following. / was added to the start and the rest to the end, in the following order.
/, /var, /tmp, /opt, /usr, /home, swap.

Partitioning Disks

Now the sizes should be setup according to your needs. If you have plenty of RAM, make your swap small, if you have minimal RAM (barely (if) sufficient), you could double the RAM size for your swap. It’s usually a good idea to think about what mount options you want to use for your specific directories. This may shape how you setup your partitions. For example, you may want to have options nosuid,noexec on /var but you can’t because there are shell scripts in /var/lib/dpkg/info so you could setup four partitions. /var without nosuid,noexec and /var/tmp, /var/log, /var/account with nosuid,noexec. Look ahead to the Mounting of Partitions section for more info on this.
In saying this, you don’t need to partition as finely grained as you want options for. You can still mount directories on directories and alter the options at that point. This can be done in the /etc/fstab file and also ad-hoc (using the mount command) if you want to test options out.

You can think about changing /opt (static data) to mount read-only in the future as another security measure.

Continuing with the Install

When you’re asked for a mirror to pull packages from, if you have an apt-cacher[-ng] proxy somewhere on your network, this is the chance to make it work for you thus speeding up your updates and saving internet bandwidth. Enter the IP address and port and leave the rest as default. From the Software selection screen, select “Standard system utilities” and “SSH server”.

10_SoftwareSelection

When prompted to boot into your new system, we need to remove our installation media from the VMs settings. Under the Device Status settings for your VM (if you’re using ESXi), Uncheck “Connected” and “Connect at power on”. Make sure no other boot media are connected at power on. Now first thing we do is SSH into our new VM because it’s a right pain working through the VM hosts console. When you first try to SSH to it you’ll be shown the ECDSA key fingerprint to confirm that the machine you think you are SSHing to is in fact the machine you want to SSH to. Follow the directions here but change that command line slightly to the following:

ssh-keygen -lf ssh_host_ecdsa_key.pub

This will print the keys fingerprint from the actual machine. Compare that with what you were given from your remote machine. Make sure they match and accept and you should be in. Now I use terminator so I have a lovely CLI experience. Of course you can take things much further with Screen or Tmux if/when you have the need.

Next I tell apt about the apt-proxy-ng I want it to use to pull it’s packages from. This will have to be changed once the server is plugged into the DMZ. Create the file /etc/apt/apt.conf if it doesn’t already exist and add the following line:

Acquire::http::Proxy "http://[IP address of the machine hosting your apt cache]:[port that the cacher is listening on]";

Replace the apt proxy references in /etc/apt/sources.list with the internet mirror you want to use, so we contain all the proxy related config in one line in one file. This will allow the requests to be proxied and packages cached via the apt cache on your network when requests are made to the mirror of your choosing.

Update the list of packages then upgrade them with the following command line. If your using sudo, you’ll need to add that to each command:

apt-get update && apt-get upgrade # only run apt-get upgrade if apt-get update is successful (exits with a status of 0)


The steps you take to harden a server that will have many user accounts will be considerably different to this. Many of the steps I’ve gone through here will be insufficient for a server with many users.
The hardening process is not a one time procedure. It ends when you decommission the server. Be prepared to stay on top of your defenses. It’s much harder to defend against attacks than it is to exploit a vulnerability.

Passwords

After a quick look at this, I can in fact verify that we are shadowing our passwords out of the box. It may be worth looking at and modifying /etc/shadow . Consider changing the “maximum password age” and “password warning period”. Consult the man page for shadow for full details. Check that you’re happy with which encryption algorithms are currently being used. The files you’ll need to look at are: /etc/shadow and /etc/pam.d/common-password . The man pages you’ll probably need to read in conjunction with each other are the following:

  • shadow
  • pam.d
  • crypt 3
  • pam_unix

Out of the box crypt supports MD5, SHA-256, SHA-512 with a bit more work for blowfish via bcrypt. The default of SHA-512 enables salted passwords. How can you tell which algorithm you’re using, salt size etc? the crypt 3 man page explains it all.
So by default we’re using SHA-512 which is better than MD5 and the smaller SHA-256.

Now by default I didn’t have a “rounds” option in my /etc/pan.d/common-password module-arguments. Having a large iteration count (number of times the encryption algorithm is run (key stretching)) and an attacker not knowing what that number is, will slow down an attack. I’d suggest adding this and re creating your passwords. As your normal user run:

passwd

providing your existing password then your new one twice. You should now be able to see your password in the /etc/shadow file with the added rounds parameter

$6$rounds=[chosen number of rounds specified in /etc/pam.d/common-password]$[8 character salt]$0LxBZfnuDue7.n5<rest of string>

Check /var/log/auth.log
Reboot and check you can still log in as your normal user. If all good. Do the same with the root account.

Using bcrypt with slowpoke blowfish is a much slower algorithm, so it’s even better for password encryption, but more work to setup at this stage.

Some References

Consider setting a password for GRUB, especially if your server is directly on physical hardware. If it’s on a hypervisor, an attacker has another layer to go through before they can access the guests boot screen. If an attacker can access your VM through the hypervisors management app, you’re pretty well screwed anyway.

Disable Remote Root Logins

Review /etc/pam.d/login so we’re only permitting local root logins. By default this was setup that way.
Review /etc/security/access.conf . Make sure root logins are limited as much as possible. Un-comment rules that you want. I didn’t need to touch this.
Confirm which virtual consoles and text terminal devices you have by reviewing /etc/inittab then modify /etc/securetty by commenting out all the consoles you don’t need (all of them preferably). Or better just issue the following command to fill the file with nothing:

cat /dev/null > /etc/securetty

I back up this file before I do this.
Now test that you can’t log into any of the text terminals listed in /etc/inittab . Just try logging into the likes of your ESX/i vSphere guests console as root. You shouldn’t be able to now.

Make sure if your server is not physical hardware but a VM, then the hosts password is long and made up of a random mix of upper case, lower case, numbers and special characters.

Additional Resources

http://www.debian.org/doc/manuals/securing-debian-howto/ch4.en.html#s-restrict-console-login

SSH

My feeling after a lot of reading is that currently RSA with large keys (The default RSA size is 2048 bits) is a good option for key pair authentication. Personally I like to go for 4096, but with the current growth of processing power (following Moore’s law), 2048 should be good until about 2030. Update: I’m not so sure about the 2030 date for this now.

Create your key pair if you haven’t already and setup key pair authentication. Key-pair auth is more secure and allows you to log in without a password. Your pass-phrase should be stored in your keyring. You’ll just need to provide your local password once (each time you log into your local machine) when the keyring prompts for it. Of course your pass-phrase needs to be kept secret. If it’s compromised, it won’t matter how much you’ve invested into your hardening effort. To tighten security up considerably Make the necessary changes to your servers /etc/ssh/sshd_config file. Start with the changes I’ve listed here.
When you change things like setting up AllowUsers or any other potential changes that could lock you out of the server. It’s a good idea to be logged in via one shell when you exit another and test it. This way if you have locked yourself out, you’ll still be logged in on one shell to adjust the changes you’ve made. Unless you have a need for multiple users, lock it down to a single user. You can even lock it down to a single user from a specific host.
After a set of changes, issue the following restart command as root or sudo:

service ssh restart

You can check the status of the daemon with the following command:

service ssh status

Consider changing the port that SSH listens on. May slow down an attacker slightly. Consider whether it’s worth adding the extra characters to your SSH command. Consider keeping the port that sshd binds to below 1025 where only root can bind a process to.

We’ll need to tunnel SSH once the server is placed into the DMZ. I’ve discussed that in this post.

Additional Resources

Check SSH login attempts. As root or via sudo, type the following to see all failed login attempts:

cat /var/log/auth.log | grep 'sshd.*Invalid'

If you want to see successful logins, type the following:

cat /var/log/auth.log | grep 'sshd.*opened'

Consider installing and configuring denyhosts

Disable Boot Options

All the major hypervisors should provide a way to disable all boot options other than the device you will be booting from. VMware allows you to do this in vSphere Client.

Set BIOS passwords.

Lock Down the Mounting of Partitions

Getting started with your fstab.

Make a backup of your /etc/fstab before you make changes. I ended up needing this later. Read the man page for fstab and also the options section in the mount man page. The Linux File System Hierarchy (FSH) documentation is worth consulting also for directory usages.
Add the noexec mount option to /tmp but not /var because executable shell scripts such as pre, post and removal reside within /var/lib/dpkg/info .
You can also add the nodev nosuid options.
You can add the nodev option to /var, /usr, /opt, /home also.
You can also add the nosuid option to /home .
You can add ro to /usr

To add mount options nosuid,noexec to /var/tmp, /var/log, /var/account, we need to bind the target mount onto an existing directory. The following procedure details how to do this for /var/tmp. As usual, you can do all of this without a reboot. This way you can modify until your hearts content, then be confident that a reboot will not destroy anything or lock you out of your system.
Your /etc/fstab unmounted mounts can be tested like this

sudo mount -a

Then check the difference with

mount

mount options can be set up on a directory by directory basis for finer grained control. For example my /var mount in my /etc/fstab may look like this:

UUID=<block device ID goes here> /var ext4 defaults,nodev 0 2

Then add another line below that in your /etc/fstab that looks like this:

/var /var/tmp none nosuid,noexec,bind 0 2

The file system type above should be specified as none (as stated in the “The bind mounts” section of the mount man page http://man.he.net/man8/mount). The bind option binds the mount. There was a bug with the suidperl package in debian where setting nosuid created an insecurity. suidperl is no longer available in debian.

If you want this to take affect before a reboot, execute the following command:

sudo mount --bind /var/tmp /var/tmp

Then to pickup the new options from /etc/fstab:

sudo mount -o remount /var/tmp

For further details consult the remount option of the mount man page.

At any point you can check the options that you have your directories mounted as, by issuing the following command:

mount

You can test this by putting a script in /var and copying it to /var/tmp. Then try running each of them. Of course the executable bits should be on. You should only be able to run the one that is in the directory mounted without the noexec option. My file “kimsTest” looks like this:

#!/bin/sh
echo "Testing testing testing kim"

Then I…

myuser@myserver:/var$ ./kimsTest
Testing testing testing kim
myuser@myserver:/var$ ./tmp/kimsTest
-bash: ./tmp/kimsTest: Permission denied

You can set the same options on the other /var sub-directories (not /var/lib/dpkg/info).

Enable read-only / mount

There are some contradictions on /run/shm size allocation. Increase the size vs Don’t increase the size

Additional Resources

Work Around for Apt Executing Packages from /tmp

Disable Services we Don’t Need

RPC portmapper

dpkg-query -l '*portmap*'

portmap is not installed by default, so we don’t need to remove it.

Exim

dpkg-query -l '*exim*'

Exim4 is installed.
You can see from the netstat output below (in the “Remove Services” area) that exim4 is listening on localhost and it’s not publicly accessible. Nmap confirms this, but we don’t need it, so lets disable it. We should probably be using ss too.

When a run level is entered, init executes the target files that start with k with a single argument of stop, followed with the files that start with s with a single argument of start. So by renaming /etc/rc2.d/s15exim4 to /etc/rc2.d/k15exim4 you’re causing init to run the service with the stop argument when it moves to run level 2. Just out of interest sake, the scripts at the end of the links with the lower numbers are executed before scripts at the end of links with the higher two digit numbers. Now go ahead and check the directories for run levels 3-5 as well and do the same. You’ll notice that all the links in /etc/rc0.d (which are the links executed on system halt) start with ‘K’. Making sense?

Follow up with

sudo netstat -tlpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0: 0.0.0.0:* LISTEN 1910/sshd
tcp6 0 0 ::: :::* LISTEN 1910/sshd

And that’s all we should see.

Additional resources for the above

Disable Network Information Service (NIS). NIS lets several machines in a network share the same account information, such as the password file (Allows password sharing between machines). Originally known as Yellow Pages (YP). If you needed centralised authentication for multiple machines, you could set-up an LDAP server and configure PAM on your machines in order to contact the LDAP server for user authentication. We have no need for distributed authentication on our web server at this stage.

dpkg-query -l '*nis*'

Nis is not installed by default, so we don’t need to remove it.

Additional resources for the above

Remove Services

First thing I did here was run nmap from my laptop

nmap -p 0-65535 <serverImConfiguring>
PORT STATE SERVICE
23/tcp filtered telnet
111/tcp open rpcbind
/tcp open

Now because I’m using a non default port for SSH, nmap thinks some other service is listening. Although I’m sure if I was a bad guy and really wanted to find out what was listening on that port it’d be fairly straight forward.

To obtain a list of currently running servers (determined by LISTEN) on our web server. Not forgetting that man is your friend.

sudo netstat -tap | grep LISTEN

or

sudo netstat -tlp

I also like to add the ‘n’ option to see the ports. This output was created before I had disabled exim4 as detailed above.

tcp 0 0 *:sunrpc *:* LISTEN 1498/rpcbind
tcp 0 0 localhost:smtp *:* LISTEN 2311/exim4
tcp 0 0 *:57243 *.* LISTEN 1529/rpc.statd
tcp 0 0 *: *:* LISTEN 2247/sshd
tcp6 0 0 [::]:sunrpc [::]:* LISTEN 1498/rpcbind
tcp6 0 0 localhost:smtp [::]:* LISTEN 2311/exim4
tcp6 0 0 [::]:53309 [::]:* LISTEN 1529/rpc.statd
tcp6 0 0 [::]: [::]:* LISTEN 2247/sshd

Rpcbind

Here we see that sunrpc is listening on a port and was started by rpcbind with the PID of 1498.
Now Sun Remote Procedure Call is running on port 111 (also the portmapper port) netstat can tell you the port, confirmed with the nmap scan above. This is used by NFS and as we don’t need NFS as our server isn’t a file server, we can get rid of the rpcbind package.

dpkg-query -l '*rpc*'

Shows us that rpcbind is installed and gives us other details. Now if you’ve been following along with me and have made the /usr mount read only, some stuff will be left behind when we try to purge:

sudo apt-get purge rpcbind

Following are the outputs of interest:

The following packages will be REMOVED:
nfs-common* rpcbind*
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
Do you want to continue [Y/n]? y
Removing nfs-common ...
[ ok ] Stopping NFS common utilities: idmapd statd.
dpkg: error processing nfs-common (--purge):
cannot remove `/usr/share/man/man8/rpc.idmapd.8.gz': Read-only file system
Removing rpcbind ...
[ ok ] Stopping rpcbind daemon....
dpkg: error processing rpcbind (--purge):
cannot remove `/usr/share/doc/rpcbind/changelog.gz': Read-only file system
Errors were encountered while processing:
nfs-common
rpcbind
E: Sub-process /usr/bin/dpkg returned an error code (1)

Another

dpkg-query -l '*rpc*'

Will result in pH. That’s a desired action of (p)urge and a package status of (H)alf-installed.
Now the easiest thing to do here is rename your /etc/fstab to something else and rename the /etc/fstab you backed up before making changes to it back to /etc/fstab then because you know the fstab is good,

reboot

Then try the purge, dpkg-query and netstat commands again to make sure rpcbind is gone and of course no longer listening. I had to actually do the purge twice here as config files were left behind from the fist purge.

Also you can remove unused dependencies now after you get the following message:

The following packages were automatically installed and are no longer required:
libevent-2.0-5 libgssglue1 libnfsidmap2 libtirpc1
Use 'apt-get autoremove' to remove them.
The following packages will be REMOVED:
rpcbind*

sudo apt-get -s autoremove

Because I want to simulate what’s going to be removed because I”m paranoid and have made stupid mistakes with autoremove years ago and that pain has stuck with me. I autoremoved a meta-package which depended on many other packages. A subsequent autoremove for packages that had a sole dependency on the meta-package meant they would be removed. Yes it was a painful experience. /var/log/apt/history.log has your recent apt history. I used this to piece back together my system.

Then follow up with the real thing… Just remove the -s and run it again. Just remember, the less packages your system has the less code there is for an attacker to exploit.

Telnet

telnet installed:

dpkg-query -l '*telnet*'
sudo apt-get remove telnet

telnet gone:

dpkg-query -l '*telnet*'

Ftp

We’ve got scp, why would we want ftp?
ftp installed:

dpkg-query -l '*ftp*'
sudo apt-get remove ftp

ftp gone:

dpkg-query -l '*ftp*'

Don’t forget to swap your new fstab back and test that the mounts are mounted as you expect.

Secure Services

The following provide good guidance on securing what ever is left.

Scheduled Backups

Make sure all data and VM images are backed up routinely. Make sure you test that restoring your backups work. Backup system files and what ever else is important to you. There is a good selection of tools here to help. Also make sure you are backing up the entire VM if your machine is a virtual guest by export / import OVF files. I also like to backup all the VM files. Disk space is cheap. Is there such a thing as being too prepared for disaster? It’s just a matter of time before you’ll be calling on your backups.

Keep up to date

Consider whether it would make sense for you or your admin/s to set-up automatic updates and possibly upgrades. Start out the way you intend to go. Work out your strategy for keeping your system up to date and patched. There are many options here.

Logging, Alerting and Monitoring

From here on, I’ve made it less detailed and more about just getting you to think about things and ways in which you can improve your stance on security. Also if any of the offerings cost money to buy, I make note of it because this is the exception to my rule. Why? Because I prefer free software and especially when it’s Open Source FOSS.

Some of the following cross the “logging” boundaries, so in many cases it’s difficult to put them into categorical boxes.

Attackers like to try and cover their tracks by modifying information that’s distributed to the various log files. Make sure you know who has write access to these files and keep the list small. As a Sysadmin you need to read your log files often and familiarise yourself with them so you get used to what they should look like.

SWatch

Monitors “a” log file for each instance you run (or schedule), matches your defined patterns and acts. You can define different message types with different font styles. If you want to monitor a lot of log files, it’s going to be a bit messy.

Logcheck

Monitors system log files, emails anomalies to an administrator. Once installed it needs to be set-up to run periodically with cron. Not a bad we run down here. How to use and customise it. Man page and more docs here.

NewRelic

Is more of a performance monitoring tool than a security tool. It has free plans which are OK, It comes into it’s own in larger deployments. I’ve used this and it’s been useful for working out what was causing performance issues on the servers.

Advanced Web Statistics (AWStats)

Unlike NewRelic which is a Software as a Service (SaaS), AWStats is FOSS. It kind of fits a similar market space as NewRelic though, but also has Host Intrusion Prevention System (HIPS) features. Docs here.

Pingdom

Similar to NewRelic but not as feature rich. Update: Recently stumbled into Monit which is a better alternative. Free and open source. I’ve been writing about it here.

Multitail

Does what its name sounds like. Tails multiple log files at once. Provides realtime multi log file monitoring. Example here. Great for seeing strange happenings before an intruder has time to modify logs, if your watching them that is. Good for a single system if you’ve got a spare screen to throw on the wall.

PaperTrail

Targets a similar problem to MultiTail except that it collects logs from as many servers as you want and copies them off-site to PaperTrails service and aggregates them into a single easily searchable web interface. Allows you to set-up alerts on anything. Has a free plan, but you only get 100MB per month. The plans are reasonably cheap for the features it provides and can scale as you grow. I’ve used this and have found it to be excellent.

Logwatch

Monitors system logs. Not continuously, so they could be open to modification without you knowing, like SWatch and Logcheck from above. You can configure it to reduce the number of services that it analyses the logs of. It creates a report of what it finds based on your level of paranoia. It’s easy to set-up and get started though. Source and docs here.

Logrotate

Use logrotate to make sure your logs will be around long enough to examine them. Some usage examples here. Ships with Debian. It’s just a matter of applying any extra config.

Logstash

Targets a similar problem to logrotate, but goes a lot further in that it routes and has the ability to translate between protocols. Requires Java to be installed.

Fail2ban

Ban hosts that cause multiple authentication errors. or just email events. Of course you need to think about false positives here too. An attacker can spoof many IP addresses potentially causing them all to be banned, thus creating a DoS.

Rsyslog

Configure syslog to send copy of the most important data to a secure system. Mitigation for an attacker modifying the logs. See @ option in syslog.conf man page. Check the /etc/(r)syslog.conf file to determine where syslogd is logging various messages. Some important notes around syslog here, like locking down the users that can read and write to /var/log.

syslog-ng

Provides a lot more flexibility than just syslogd. Checkout the comprehensive feature-set.

Some Useful Commands

  • Checking who is currently logged in to your server and what they are doing with the who and w commands
  • Checking who has recently logged into your server with the last command
  • Checking which user has failed login attempts with the faillog command
  • Checking the most recent login of all users, or of a given user with the lastlog command. lastlog comes from the binary file /var/log/lastlog.

This, is a list of log files and their names/locations and purpose in life.

Host-based Intrusion Detection System (HIDS)

Tripwire

Is a HIDS that stores a good know state of vital system files of your choosing and can be set-up to notify an administrator upon change in the files. Tripwire stores cryptographic hashes (delta’s) in a database and compares them with the files it’s been configured to monitor changes on. Not a bad tutorial here. Most of what you’ll find with tripwire now are the commercial offerings.

RkHunter

A similar offering to Tripwire. It scans for rootkits, backdoors, checks on the network interfaces and local exploits by running tests such as:

  • MD5 hash changes
  • Files commonly created by root-kits
  • Wrong file permissions for binaries
  • Suspicious strings in kernel modules
  • Hidden files in system directories
  • Optionally scan within plain-text and binary files

Version 1.4.2 (24/02/2014) now checks ssh, sshd and telent (although you shouldn’t have telnet installed). This could be useful for mitigating non-root users running a modified sshd on a 1025-65535 port. You can run ad-hoc scans, then set them up to be run with cron. Debian Jessie has this release in it’s repository. Any Debian distro before Jessie is on 1.4.0-1 or earlier.

The latest version you can install for Linux Mint Qiana (17) and Rebecca (17.1) within the repositories is 1.4.0-3 (01/05/2012)

Change-log here.

Chkrootkit

It’s a good idea to run a couple of these types of scanners. Hopefully what one misses the other will not. Chkrootkit scans for many system programs, some of which are cron, crontab, date, echo, find, grep, su, ifconfig, init, login, ls, netstat, sshd, top and many more. All the usual targets for attackers to modify. You can specify if you don’t want them all scanned. Runs tests such as:

  • System binaries for rootkit modification
  • If the network interface is in promiscuous mode
  • lastlog deletions
  • wtmp and utmp deletions (logins, logouts)
  • Signs of LKM trojans
  • Quick and dirty strings replacement

Stealth

The idea of Stealth is to do a similar job as the above file integrity scanners, but to leave almost no sediments on the tested computer (called the client). A potential attacker therefore has no clue that Stealth is in fact scanning the integrity of its client files. Stealth is installed on a different machine (called the controller) and scans over SSH.

Ossec

Is a HIDS that also has some preventative features. This is a pretty comprehensive offering with a lot of great features.

Unhide

While not strictly a HIDS, this is quite a useful forensics tool for working with your system if you suspect it may have been compromised.

Unhide is a forensic tool to find hidden processes and TCP/UDP ports by rootkits / LKMs or by another hidden technique. Unhide runs in Unix/Linux and Windows Systems. It implements six main techniques.

  1. Compare /proc vs /bin/ps output
  2. Compare info gathered from /bin/ps with info gathered by walking thru the procfs. ONLY for unhide-linux version
  3. Compare info gathered from /bin/ps with info gathered from syscalls (syscall scanning)
  4. Full PIDs space ocupation (PIDs bruteforcing). ONLY for unhide-linux version
  5. Compare /bin/ps output vs /proc, procfs walking and syscall. ONLY for unhide-linux version. Reverse search, verify that all thread seen by ps are also seen in the kernel.
  6. Quick compare /proc, procfs walking and syscall vs /bin/ps output. ONLY for unhide-linux version. It’s about 20 times faster than tests 1+2+3 but maybe give more false positives.

It includes two utilities: unhide and unhide-tcp.

unhide-tcp identifies TCP/UDP ports that are listening but are not listed in /bin/netstat through brute forcing of all TCP/UDP ports available.

Can also be used by rkhunter in it’s daily scans. Unhide was number one in the top 10 toolswatch.org security tools pole

Web Application Firewalls (WAF’s)

which are just another part in the defense in depth model for web applications, get more specific in what they are trying to protect. They operate at the application layer, so they don’t have to deal with all the network traffic. They apply a set of rules to HTTP conversations. They can also be either Network or Host based and able to block attacks such as Cross Site Scripting (XSS), SQL injection.

ModSecurity

Is a mature and feature full WAF that is designed to work with such web servers as IIS, Apache2 and NGINX. Loads of documentation. They also look to be open to committers and challengers a-like. You can find the OWASP Core Rule Set (CRS) here to get you started which has the following:

  • HTTP Protocol Protection
  • Real-time Blacklist Lookups
  • HTTP Denial of Service Protections
  • Generic Web Attack Protection
  • Error Detection and Hiding

Or for about $500US a year you get the following rules:

  • Virtual Patching
  • IP Reputation
  • Web-based Malware Detection
  • Webshell/Backdoor Detection
  • Botnet Attack Detection
  • HTTP Denial of Service (DoS) Attack Detection
  • Anti-Virus Scanning of File Attachments

Fusker

for Node.js. Although doesn’t look like a lot is happening with this project currently. You could always fork it if you wanted to extend.

The state of the Node.js echosystem in terms of security is pretty poor, which is something I’d like to invest time into.

Fire-walling

This is one of the last things you should look at when hardening an internet facing or parameterless system. Why? Because each machine should be hard enough that it doesn’t need a firewall to cover it like a blanket with services underneath being soft and vulnerable. Rather all the services should be either un-exposed or patched and securely configured.

Most of the servers and workstations I’ve been responsible for over the last few years I’ve administered as though there was no firewall and they were open to the internet. Most networks are reasonably easy to penetrate, so we really need to think of the machines behind them as being open to the internet. This is what De-perimeterisation (the concept initialised by the Jericho Forum) is all about.

Some thoughts on firewall logging.

Keep your eye on nftables too, it’s looking good!

Additional Resources

Just keep in mind the above links are quite old. A lot of it’s still relevant though.

Machine Now Ready for DMZ

Confirm DMZ has

  • Network Intrusion Detection System (NIDS), Network Intrusion Prevention System (NIPS) installed and configured. Snort is a pretty good option for the IDS part, although with some work Snort can help with the Prevention also.
  • incoming access from your LAN or where ever you plan on administering it from
  • rules for outgoing and incoming access to/from LAN, WAN tightly filtered.

Additional Web Server Preparation

  • setup and configure soft web server
  • setup and configure caching proxy. Ex:
    • node-http-proxy
    • TinyProxy
    • Varnish
    • nginx
  • deploy application files
  • Hopefully you’ve been baking security into your web app right from the start. This is an essential part of defense in depth. Rather than having your application completely rely on other entities to protect it, it should also be standing up for itself and understanding when it’s under attack and actually fighting back.
  • set static IP address
  • double check that the only open ports on the web server are 80 and what ever you’ve chosen for SSH.
  • setup SSH tunnel
  • decide on and document VM backup strategy and set it up.

Machine Now In DMZ

Setup your CNAME or what ever type of DNS record you’re using.

Now remember, keeping any machine on (not just the internet, but any) a network requires constant consideration and effort in keeping the system as secure as possible.

Work through using the likes of harden and Lynis for your server and harden-surveillance for monitoring your network.

Consider combining “Port Scan Attack Detector” (psad) with fwsnort and Snort.

Hack your own server and find the holes before someone else does. If you’re not already familiar with the tricks of how systems on the internet get attacked read up on the “Attacks and Threats” Run OpenVAS, Run Web Vulnerability Scanners

From here on is in scope for other blog posts.

Journey To Self Hosting

November 29, 2014

I was recently tasked with working out the best options for hosting web applications and their data for a client. This was their foray into whether to throw all their stuff into the cloud or to build their own infrastructure to host everything on.

Hosting Options

There are a lot of options available now. Most of which are derivatives of either external cloud or internal (possibly cloud). All of which come with features and some price tags that need to be weighed up. I’ve been collecting resources of providers and their offerings (both cloud and in-house) for quite a while. So I didn’t have to go far to pull them together for comparison.

All sites and apps require a different amount of each resource type to be allocated to them. For example many web sites are still predominantly static, which require more network band-width than any other resource, some memory, a little processing power and provided they’re being cached on the server, not a lot else. These resources are very cheap.

If you’re running an e-commerce site, then you can potentially add more Disk I/O which is usually the first bottleneck, processing power and space for your data store. Add in redundancy, backups and administration of.
Fast disks (or lets just call it storage) are cheap. In fact most hardware is cheap.

Administration of redundancy, backups and staying on top of security starts to cost more. Although the “staying on top of security” will need to be done whether you’re on someone else’s hardware or on your own. It’s just that it’s a lot easier on your own because you’re in control and dictate the amount of visibility you have.

The Cloud

The Cloud

Pros

It’s out of your hands.
Indeed it is, in more ways than one. Your trust is going to have to be honoured here (or not). Yes you have SLA’s, but what guarantee do the SLA’s give you that the people working on your system and data are not having a bad day. Maybe they’ve broken up with their girlfriend, or what ever. It takes very little to miss something that could drastically compromise your system and or data.

VPS’s can be spun up quickly, but remember, good things take time. Everything has a cost. Things are quick and easy for a reason. There is a cost to this, think about what those (often hidden) costs are.

In some cases it can be cheaper, but you get what you pay for.

Cons

Your are trusting others with your data. Even others that you are not aware of. In many cases, hosting providers can be (and in many cases are) forced by governments and other agencies to give up your secrets. This is very common place now and you may not even know it’s happened.

Your provider may go out of business.

There is an inherent lack of security in all the cloud providers I’ve looked at and worked with. They will tell you they take security seriously, but when someone that understands security inspects how they do things, the situation often looks like Swiss cheese.

In-House Cloud

In-House Cloud

Pros

You are in control of your data and your application, providing you or “your” staff:

  • and/or external consultants are competent and haven’t made mistakes in setting up your infrastructure
  • Are patching all software/firmware involved
  • Are Fastidiously hardening your server/s (this is continuous. It doesn’t stop at the initial set-up)
  • Have set-up the routes and firewall rules correctly
  • Have the correct alerts set-up
  • Have implemented Intrusion Detection and Prevention Systems (IDS’s/IPS’s)
  • Have penetration tested the set-up and not just from a technical perspective. It’s often best to get pairs to do the reviews.

The list goes on. If you are at all in doubt, that’s where you consider the alternatives. In saying that, most hosting and cloud providers perform abysmally, despite their claims that your applications and data is safe with them.

It “can” cost less than entrusting your system and data to someone (or many someone’s) on the other side of the planet. Weigh up the costs. They will not always be what they appear at face value.

Hardware is very cheap.

Cons

Potential lack of in-house skills.

People with the right skills and attitudes are not cheap.

It may not be core business. You may not have the necessary capitols in-house to scope, architect, cost, set-up, administer. Potentially you could hire someone to do the initial work and the on going administration. The amount of on going administration will be partly determined by what your hosting. Generally speaking hosting company web sites, blogs etc, will require less work than systems with distributed components and redundancy.

Spinning up an instance to develop or prototype on, doesn’t have to be hard. In fact if you have some hardware, provisioning of VM images is usually quick and easy. There is actually a pro in this too… you decide how much security you want baked into these images and the processes taken to configure.

Consider download latencies from people you want to reach possibly in other countries.

In some cases it can be more expensive, but you get what you pay for.

Outcome

The decision for this client was made to self host. There will be a follow up post detailing some of the hardening process I took for one of their Debian web servers.

Procurement & Config of Sun Fire V240 & ALOM

October 25, 2014

This is the sequence of events I took to prepare a Sun Fire V240 for hosting pfSense which is a free and open source FreeBSD based enterprise grade routing solution for a client of mine.

Recently I was tasked with setting up a network with what I considered to be enterprise grade hardware and software as cheaply as possible. When I take on these sorts of tasks, security is forefront in my mind, so I often look toward components that are as open as possible and that don’t sport any known (to me at least) back-doors and are able to be easily upgraded and patched at little to no cost.

A requirement was clean shut-downs on power failure events at least for the critical servers.

Procured Kit

  1. APC Smart-UPS 5000 with batteries in good condition. Worth a little under $6k if you’re buying new. I wouldn’t buy new. If you shop around, these can be picked up at a fraction of that cost. From my experience the APC kit is some of the best UPS gear available.
    APC Smart-UPS 5000
  2. AP9630 UPS network management card $92 new. Most of the details around setting these UPS’s up I’ve already posted on. If you search my blog for “APC UPS” you’ll find it.
    APC AP9630
  3. Enterprise grade router/firewall:
    Sun Fire V240 (RISC architecture). 2 x UltraSparc-IIIi 1.5Ghz CPU. 4Gbit on-board Ethernet ports. Lights-out management port. 4GB RAM. 2U. Dual redundant PSU’s. 2 x 72GB Hot Swap 10k SCSI HDD’s. With rack mount rails. Currently going for around $1.5k on Ebay. Price paid: $160 incl shipping. I doubt you’d find anything of these specifications off the shelf for under a $1000. This is a lot of server for a very small amount of money.
    Sun Fire V240
  4. Firmware: pfSense. Free and open source.

Planning

As part of my planning I evaluated (again) whether or not free software routing solutions are actually up to the task of the enterprise. My research led me to believe some were… based on others that had already been down this route ( PTP 😉 ). Openness is a biggie for me. I like to know that eyes are on the software rather than it being closed up in a proprietary package.

I evaluated m0n0Wall, ipCop (Linux based), smoothwall and pfSense. pfSense had been used in quite a few large environments successfully. When I had made my decision on the firmware to use, I went through the hardware requirements and of course started looking for high quality second-hand gear.

For the router hardware I was going to need at lease 1GHz CPU as I wanted to run Snort as my IDS/IPS. PCI-X or PCI-e network adapters (which of course I didn’t need to worry about with the Sun Fire server). Snort needs 512MB RAM minimum. Preferably at least 1GB.

Gaining Access to the Sun Fire V240

Now I had no idea of how the previous owner had setup the configuration of the ALOM (Advanced Lights Out Management). In fact I hadn’t administered a Sun Fire server before at all. On page 11 of the Sun FireTM V210 and V240 Servers Getting Started Guide it states the following:

The system console is directed to ALOM by default and is configured to show server console information on startup.
ALOM enables you to monitor and control your server over either a serial
connection (using the SERIAL MGT port), or Ethernet connection (using the NET MGT port).
For information about configuring an Ethernet connection, refer to the Sun Advanced Lights Out Manager Software User’s Guide.” The NET MGT port can also be disabled and in my case it turned out it was, but I’ll get to that later. I didn’t have a spare DB-9 to RJ-45 adapter lying around to wire it up and connect to the SERIAL MGT port.

Sun Fire V240 rear

Telnet?

(but didn’t get that far)

Since I was going to go down the path of trying to connect to the ALOM console via the NET MGT Ethernet port, I thought telnet would probably be the path of least resistance.

Page 10 of the “Sun Advanced Lights Outs Manager Software User’s Guide” stated the following:

The 10-Mbyte Ethernet port enables you to access ALOM from within your company
network. You can connect to ALOM remotely using any standard Telnet client”. On the V240, the
ALOM Ethernet port is referred to as the NET MGT port.

Using a laptop with Kali Linux installed (because it has lots of great tools for network reconnaissance), Running

ethtool eth0

told me that my NIC supported:
10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full

Wireshark?

Tried connecting directly to the NET MGT port with wireshark running on my laptop. Didn’t get any packets from the device. At the time I thought it may have been because my laptop’s NIC was using 100baseT, but later on I found out that the NET MGT port was disabled.

Tried pinging my broadcast address ping -b 255.255.255.255 then checked my arp table arp -a. No results that looked like what I was looking for. Of course this strategy would have taken quite some time to complete… and in my case it would have yielded no results anyway.

NMap?

I started with the private IPv4 address spaces. Using Wi-fi on my Kali box, tried the 16 bit block:

nmap -sn 192.168.*.*

Got a false positive of a cable modem. How did I work out that it was a false positive?

nmap -A <falsePositiveIPOfCableModem> # Gave me the model and everything I needed to know about the device to rule it out.

Next up the 20 bit block

nmap -sn 172.16.0.0/12
Nmap done: 1048576 IP addresses (0 hosts up) scanned in 108670.97 seconds

In earlier releases of nmap the -sn switch was known as -sP

I decided I needed to try and speed up the scan, so I connected directly to the V240 NET MGT port with a Cat5 patch cable (ethtool told me my laptop’s NIC had MDI-X on (force crossover mode)) and made sure my network card supported 10baseT which the “Sun Advanced Lights Outs Manager Software User’s Guide” told me it needed for the NET MGT port. Turns out the NET MGT port didn’t support 10baseT. Details a bit further down.

Added a static IP address to the /etc/network/interfaces. Currently it looked like:

auto lo
iface lo inet loopback

auto wlan0
iface wlan0 inet dhcp

So I commented out the auto wlan0 and iface wlan0 inet dhcp and added the following:

auto eth0
iface eth0 inet static
address 10.1.1.6
netmask 255.255.255.0
broadcast 10.1.1.255
#gateway 10.1.1.1 # Make sure you don't add a gateway, as we're connecting directly to the V240

followed by:

service networking restart

then changed my /etc/NetworkManager/NetworkManager.conf
managed=true to be managed=false
So Network manager didn’t keep interfering with my interfaces.

I followed this with a

service network-manager restart

followed with ifconfig to make sure my network interface was using the correct IP address, netmask and broadcast. It wasn’t, so…

ifdown eth0
ifup eth0
ifconfig

Success, it now was.

Now to make sure my network card was communicating in a manner that the V240’s NET MGT port would understand.

Using ethtool

ethtool eth0

told me 10baseT was supported, but it also told me my current speed was 100Bb/s. So I tried changing the speed with

ethtool -s eth0 speed 10

and received Cannot advertise speed 10. So made the following temporary changes as they’ll be lost on reboot. Changed the duplex… Ran the following:

ethtool -s eth0 speed 10 duplex half

Now with a:

ethtool eth0

I got:

Speed: unknown!
Duplex: Unknown! (255)

So turned the auto negotiation off:

ethtool -s eth0 speed 10 duplex half autoneg off

Now with a:

ethtool eth0

I got:

Speed: 10Mb/s
Duplex: Half
Auto-negotiation: off
#and some other settings.

Some useful ethtool resources:

With these settings the NET MGT port didn’t have it’s green link led on. So I kept playing with the settings. Turns out it would only work with speed 100 duplex full contrary to page 10 of the “Sun Advanced Lights Out Manager Software User’s Guide”
These were the settings that gave me link:

Supported pause frame use: No #Don't think I fiddled with this.
Supports auto-negotiation: Yes
Advertised link modes: Not reported #Don't think I fiddled with this.
Advertised pause frame use: Symmetric #Don't think I fiddled with this.
Advertised auto-negotiation: No
Speed: 100Mb/s
Duplex: full
Port: Twisted Pair #Don't think I fiddled with this.
PHYAD: 1 #Don't think I fiddled with this.
Transceiver: internal #Don't think I fiddled with this.
Auto-negotiation: off
MDI-X: on
Supports Wake-on: g #Don't think I fiddled with this.
Wake-on: d #Don't think I fiddled with this.
Current message level: 0x000000ff (255)
drv prove link timer ifdown ifup rx_err tx_err
Link detected: yes

I was now confident that if the Sun Fire V240 NET MGT port was enabled, we’d find it’s IP address if it was using one from the private space. It was time to try the last and largest private address space. Oh, I also used wireshark to make sure nmap was doing what I expected on my laptop when I ran:

nmap -v -sn 10.0.0.0/8

I was a little confused to start with as nmap told me Scanning 4096 hosts I soon realised after checking the CIDR (Classless InterDomain Routing) and by the output nmap produced, that nmap was doing the scanning in chunks. As there was going to be a lot of results, I setup the output to files:

nmap -v -sn -oA 'scan-%Y-%m-%d_%H-%M 10.0.0.0/8

This produces the output in all three formats as discussed here.

SERIAL MGT Port?

This private address range was going to take a few days to scan, so I decided to have a poke at the SERIAL MGT port on the Sun Fire V240.

To use the SERIAL MGT port, a RJ-45 patch cable connected to a DB-9 adapter ($4.50 from globalpc) is required Unless you get the official Sun adaptor “530-3100-01”, or still have the one that came in the new box. So I splashed out and went with the $4.50 option. It cost me more in gas to get to the shop than buy the part. I Wired it up according to page 25 of the “Sun Fire V210 and V240 Servers Installation Guide“.

RJ-45 to DB-9 Adapter Crossovers
SERIAL MGT Port Adapter (DB-9) Pin
1 (RTS) 8 (CTS)
2 (DTR) 6 (DSR)
3 (TXD) 2 (RXD)
4 (Signal Ground) 5 (Signal Ground)
5 (Signal Ground) 5 (Signal Ground)
6 (RXD) 3 (TXD)
7 (DSR) 4 (DTR)
8 (CTS) 7 (RTS)

Red wire in with green.

RJ45-DB9 RJ45

Installed minicom and setserial and did pretty much the same as I did here. Plugged the console cable in and tried to establish a connection.

Then found that by default ALOM only communicates through the SERIAL MGT port at startup (of ALOM I thought), but it seems that at power on of the server also.

At the {1} ok prompt, I typed #. (that’s hash followed with dot) to escape from the system console sc>

I then entered the showsc command and found that the MGT NET port was disabled.
I then ran a

usershow

to see which user accounts existed and was prompted to set a password for the admin user.
When you connect to ALOM for the first time, you are automatically connected as the admin account.“.
So obviously the seller of the system reset ALOM.

SettingAdminPassword

Also audited the user accounts, and the details on the permission levels are here.

Ran the following script. A nice little dialog from Ramesh here (see step 4) too.

setupsc
  • Turned NET MGT port on
  • Changed the default if_connection from none to ssh
  • Answered no to email alerts (only for logged in users)
  • Yes to configure the network interfaces
  • No to DHCP
  • Entered the IP address for the NET MGT port
  • Entered the netmask for the NET MGT port
  • Entered the gateway for the Net Mgt port
  • Should powerstate memory be enabled [y]? y
  • Enabled power on sequencing

Then we need to restart the ALOM to apply the new settings.

resetsc -y

If you still have minicom running, it’ll show you what happens during the boot sequence and then present you with a login prompt.

Extra Resources

SSH

At this point I plugged the Ethernet cable from my test switch (10 Mbit/s capable) back into the NET MGT port of the Sun Fire V240 and tested that ALOM was responding on the IP address that I set the NET MGT port to.

ping <myNetMgtIP>

It was answering. So I attempted to SSH in on a different machine.

ssh admin@<myNetMgtIP>

I was presented with the hosts key fingerprint

The authenticity of host <myNetMgtIP> (<myNetMgtIP>)' can't be established.
RSA key fingerprint is <myExistingHostKeyInHex>.
Are you sure you want to continue connecting (yes/no)?

I wanted to know I was connecting to what I thought I was connecting to, so answered no.
Then in minicom I queried the hosts key fingerprint

ssh-keygen -l -t rsa

I was provided with the key fingerprint that matched what I was presented with when I attempted to SSH, so I new I was actually communicating with the server I thought I was.

I then regenerated the hosts key fingerprint

ssh-keygen -r -t rsa

and was provided with the new key. A restart of the SSH daemon is required to load the new host key.

sc> restartssh

Then SSH in. Confirm when prompted that the host key matches the newly provided key.

ssh admin@<myNetMgtIP>
The authenticity of host <myNetMgtIP> (<myNetMgtIP>)' can't be established.
RSA key fingerprint is <myNewHostKeyInHex>.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '<myNetMgtIP>' (RSA) to the list of known hosts.

Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.

Sun(tm) Advanced Lights Out Manager <versionHere> ()

Please login: admin
Please Enter password: *********

sc>

We’re in!

At any time for a list of commands, you can type help.

logout
Connection to <myNetMgtIP> closed.

We’re out!

Node.js Asynchronicity and Callback Nesting

July 26, 2014

Just a heads up before we get started on this… Chrome DevTools (v35) now has the ability to show the full call stack of asynchronous JavaScript callbacks. As of writing this, if you develop on Linux you’ll want the dev channel. Currently my Linux Mint 13 is 3 versions behind. So I had to update to the dev channel until I upgraded to the LTS 17 (Qiana).

All code samples can be found at GitHub.

Deep Callback Nesting

AKA callback hell, temple of doom, often the functions that are nested are anonymous and often they are implicit closures. When it comes to asynchronicity in JavaScript, callbacks are our bread and butter. In saying that, often the best way to use them is by abstracting them behind more elegant APIs.

Being aware of when new functions are created and when you need to make sure the memory being held by closure is released (dropped out of scope) can be important for code that’s hot otherwise you’re in danger of introducing subtle memory leaks.

What is it?

Passing functions as arguments to functions which return immediately. The function (callback) that’s passed as an argument will be run at some time in the future when the potentially time expensive operation is done. This callback by convention has it’s first parameter as the error on error, or as null on success of the expensive operation. In JavaScript we should never block on potentially time expensive operations such as I/O, network operations. We only have one thread in JavaScript, so we allow the JavaScript implementations to place our discrete operations on the event queue.

One other point I think that’s worth mentioning is that we should never call asynchronous callbacks synchronously unless of course we’re unit testing them, in which case we should be rarely calling them asynchronously. Always allow the JavaScript engine to put the callback into the event queue rather than calling it immediately, even if you already have the result to pass to the callback. By ensuring the callback executes on a subsequent turn of the event loop you are providing strict separation of the callback being allowed to change data that’s shared between itself (usually via closure) and the currently executing function. There are many ways to ensure the callback is run on a subsequent turn of the event loop. Using asynchronous API’s like setTimeout and setImmediate allow you to schedule your callback to run on a subsequent turn. The Promises/A+ specification (discussed below) for example specifies this.

The Test

var assert = require('assert');
var should = require('should');
var requireFrom = require('requirefrom');
var sUTDirectory = requireFrom('post/nodejsAsynchronicityAndCallbackNesting');
var nestedCoffee = sUTDirectory('nestedCoffee');

describe('nodejsAsynchronicityAndCallbackNesting post integration test suite', function (done) {
   // if you don't want to wait for the machine to heat up assign minutes: 2.
   var minutes = 32;
   this.timeout(60000 * minutes);
   it('Test the ugly nested callback coffee machine', function (done) {

      var result = function (error, state) {
         var stateOutCome;
         var expectedErrorOutCome = null;
         if(!error) {
            stateOutCome = 'The state of the ordered coffee is: ' + state.description;
            stateOutCome.should.equal('The state of the ordered coffee is: beautiful shot!');
         } else {
            assert.fail(error, expectedErrorOutCome, 'brew encountered an error. The following are the error details: ' + error.message);
         }
         done();
      };

      nestedCoffee().brew(result);
   });
});
lets test

The System Under Test

'use strict';

module.exports = function nestedCoffee() {

   // We don't do instant coffee ####################################

   var boilJug = function () {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
   };
   var addInstantCoffeePowder = function () {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Crappy instant coffee powder is being added.');
   };
   var addSugar = function () {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Sugar is being added.');
   };
   var addBoilingWater = function () {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Boiling water is being added.');
   };
   var stir = function () {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Coffee is being stirred. Hmm...');
   };

   // We only do real coffee ########################################

   var heatEspressoMachine = function (state, callback) {
      var error = undefined;
      var wrappedCallback = function () {
         console.log('Espresso machine heating cycle is done.');
         if(!error) {
            callback(error, state);
         } else
            console.log('wrappedCallback encountered an error. The following are the error details: ' + error);
      };
      // Flick switch, check water.
      console.log('Espresso machine has been turned on and is now heating.');
      // Mutate state.
      // If there is an error, wrap callback with our own error function

      // Even if you call setTimeout with a time of 0 ms, the callback you pass is placed on the event queue to be called on a subsequent turn of the event loop.
      // Also be aware that setTimeout has a minimum granularity of 4ms for timers nested more than 5 deep. For several reasons we prefer to use setImmediate if we don't want a 4ms minimum wait.
      // setImmediate will schedule your callbacks on the next turn of the event loop, but it goes about it in a smarter way. Read more about it here: https://developer.mozilla.org/en-US/docs/Web/API/Window.setImmediate
      // If you are using setImmediate and it's not available in the browser, use the polyfill: https://github.com/YuzuJS/setImmediate
      // For this, we need to wait for our huge hunk of copper to heat up, which takes a lot longer than a few milliseconds.
      setTimeout(
         // Once espresso machine is hot callback will be invoked on the next turn of the event loop...
         wrappedCallback, espressoMachineHeatTime.milliseconds
      );
   };
   var grindDoseTampBeans = function (state, callback) {
      // Perform long running action.
      console.log('We are now grinding, dosing, then tamping our dose.');
      // To save on writing large amounts of code, the callback would get passed to something that would run it at some point in the future.
      // We would then return immediately with the expectation that callback will be run in the future.
      callback(null, state);
   };
   var mountPortaFilter = function (state, callback) {
      // Perform long running action.
      console.log('Porta filter is now being mounted.');
      // To save on writing large amounts of code, the callback would get passed to something that would run it at some point in the future.
      // We would then return immediately with the expectation that callback will be run in the future.
      callback(null, state);
   };
   var positionCup = function (state, callback) {
      // Perform long running action.
      console.log('Placing cup under portafilter.');
      // To save on writing large amounts of code, the callback would get passed to something that would run it at some point in the future.
      // We would then return immediately with the expectation that callback will be run in the future.
      callback(null, state);
   };
   var preInfuse = function (state, callback) {
      // Perform long running action.
      console.log('10 second preinfuse now taking place.');
      // To save on writing large amounts of code, the callback would get passed to something that would run it at some point in the future.
      // We would then return immediately with the expectation that callback will be run in the future.
      callback(null, state);
   };
   var extract = function (state, callback) {
      // Perform long running action.
      console.log('Cranking leaver down and extracting pure goodness.');
      state.description = 'beautiful shot!';
      // To save on writing large amounts of code, the callback would get passed to something that would run it at some point in the future.
      // We would then return immediately with the expectation that callback will be run in the future.

      // Uncomment the below to test the error.
      //callback({message: 'Oh no, something has gone wrong!'})
      callback(null, state);
   };
   var espressoMachineHeatTime = {
      // if you don't want to wait for the machine to heat up assign minutes: 0.2.
      minutes: 30,
      get milliseconds() {
         return this.minutes * 60000;
      }
   };
   var state = {
      description: ''
      // Other properties
   };
   var brew = function (onCompletion) {
      // Some prep work here possibly.
      heatEspressoMachine(state, function (err, resultFromHeatEspressoMachine) {
         if(!err) {
            grindDoseTampBeans(state, function (err, resultFromGrindDoseTampBeans) {
               if(!err) {
                  mountPortaFilter(state, function (err, resultFromMountPortaFilter) {
                     if(!err) {
                        positionCup(state, function (err, resultFromPositionCup) {
                           if(!err) {
                              preInfuse(state, function (err, resultFromPreInfuse) {
                                 if(!err) {
                                    extract(state, function (err, resultFromExtract) {
                                       if(!err)
                                          onCompletion(null, state);
                                       else
                                          onCompletion(err, null);
                                    });
                                 } else
                                    onCompletion(err, null);
                              });
                           } else
                              onCompletion(err, null);
                        });
                     } else
                        onCompletion(err, null);
                  });
               } else
                  onCompletion(err, null);
            });
         } else
            onCompletion(err, null);
      });
   };
   return {
      // Publicise brew.
      brew: brew
   };
};

What’s wrong with it?

  1. It’s hard to read, reason about and maintain
  2. The debugging experience isn’t very informative
  3. It creates more garbage than adding your functions to a prototype
  4. Dangers of leaking memory due to retaining closure references
  5. Many more…

What’s right with it?

  • It’s asynchronous

Closures are one of the language features in JavaScript that they got right. There are often issues in how we use them though.   Be very careful of what you’re doing with closures. If you’ve got hot code, don’t create a new function every time you want to execute it.

Resources

  • Chapter 7 Concurrency of the Effective JavaScript book by David Herman

 

Alternative Approaches

Ranging from marginally good approaches to better approaches. Keeping in mind that all these techniques add value and some make more sense in some situations than others. They are all approaches for making the callback hell more manageable and often encapsulating it completely, so much so that the underlying workings are no longer just a bunch of callbacks but rather well thought out implementations offering up a consistent well recognised API. Try them all, get used to them all, then pick the one that suites your particular situation. The first two examples from here are blocking though, so I wouldn’t use them as they are, they are just an example of how to make some improvements.

Name your anonymous functions

  1. They’ll be easier to read and understand
  2. You’ll get a much better debugging experience, as stack traces will reference named functions rather than “anonymous function”
  3. If you want to know where the source of an exception was
  4. Reveals your intent without adding comments
  5. In itself will allow you to keep your nesting shallow
  6. A first step to creating more extensible code

We’ve made some improvements in the next two examples, but introduced blocking in the arrays prototypes forEach loop which we really don’t want to do.

Example of Anonymous Functions

var boilJug = function () {
   // Perform long running action
};
var addInstantCoffeePowder = function () {
   // Perform long running action
   console.log('Crappy instant coffee powder is being added.');
};
var addSugar = function () {
   // Perform long running action
   console.log('Sugar is being added.');
};
var addBoilingWater = function () {
   // Perform long running action
   console.log('Boiling water is being added.');
};
var stir = function () {
   // Perform long running action
   console.log('Coffee is being stirred. Hmm...');
};
var heatEspressoMachine = function () {
   // Flick switch, check water.
   console.log('Espresso machine is being turned on and is now heating.');
};
var grindDoseTampBeans = function () {
   // Perform long running action
   console.log('We are now grinding, dosing, then tamping our dose.');
};
var mountPortaFilter = function () {
   // Perform long running action
   console.log('Portafilter is now being mounted.');
};
var positionCup = function () {
   // Perform long running action
   console.log('Placing cup under portafilter.');
};
var preInfuse = function () {
   // Perform long running action
   console.log('10 second preinfuse now taking place.');
};
var extract = function () {
   // Perform long running action
   console.log('Cranking leaver down and extracting pure goodness.');
};

(function () {
   // Array.prototype.forEach executes your callback synchronously (that's right, it's blocking) for each element of the array.
   return [
      'heatEspressoMachine',
      'grindDoseTampBeans',
      'mountPortaFilter',
      'positionCup',
      'preInfuse',
      'extract',
   ].forEach(
      function (brewStep) {
         this[brewStep]();
      }
   );
}());

anonymous functions

Example of Named Functions

Now satisfies all the points above, providing the same output. Hopefully you’ll be able to see a few other issues I’ve addressed with this example. We’re also no longer clobbering the global scope. We can now also make any of the other types of coffee simply with an additional single line function call, so we’re removing duplication.

var BINARYMIST = (function (binMist) {
   binMist.coffee = {
      action: function (step) {

         return {
            boilJug: function () {
               // Perform long running action
            },
            addInstantCoffeePowder: function () {
               // Perform long running action
               console.log('Crappy instant coffee powder is being added.');
            },
            addSugar: function () {
               // Perform long running action
               console.log('Sugar is being added.');
            },
            addBoilingWater: function () {
               // Perform long running action
               console.log('Boiling water is being added.');
            },
            stir: function () {
               // Perform long running action
               console.log('Coffee is being stirred. Hmm...');
            },
            heatEspressoMachine: function () {
               // Flick switch, check water.
               console.log('Espresso machine is being turned on and is now heating.');
            },
            grindDoseTampBeans: function () {
               // Perform long running action
               console.log('We are now grinding, dosing, then tamping our dose.');
            },
            mountPortaFilter: function () {
               // Perform long running action
               console.log('Portafilter is now being mounted.');
            },
            positionCup: function () {
               // Perform long running action
               console.log('Placing cup under portafilter.');
            },
            preInfuse: function () {
               // Perform long running action
               console.log('10 second preinfuse now taking place.');
            },
            extract: function () {
               // Perform long running action
               console.log('Cranking leaver down and extracting pure goodness.');
            }
         }[step]();
      },
      coffeeType: function (type) {
         return {
            'cappuccino': {
               brewSteps: function () {
                  return [
                     // Lots of actions
                  ];
               }
            },
            'instant': {
               brewSteps: function () {
                  return [
                     'addInstantCoffeePowder',
                     'addSugar',
                     'addBoilingWater',
                     'stir'
                  ];
               }
            },
            'macchiato': {
               brewSteps: function () {
                  return [
                     // Lots of actions
                  ];
               }
            },
            'mocha': {
               brewSteps: function () {
                  return [
                     // Lots of actions
                  ];
               }
            },
            'short black': {
               brewSteps: function () {
                  return [
                     'heatEspressoMachine',
                     'grindDoseTampBeans',
                     'mountPortaFilter',
                     'positionCup',
                     'preInfuse',
                     'extract',
                  ];
               }
            }
         }[type];
      },
      'brew': function (requestedCoffeeType) {
         var that = this;
         var brewSteps = this.coffeeType(requestedCoffeeType).brewSteps();
         // Array.prototype.forEach executes your callback synchronously (that's right, it's blocking) for each element of the array.
         brewSteps.forEach(function runCoffeeMakingStep(brewStep) {
            that.action(brewStep);
         });
      }
   };
   return binMist;

} (BINARYMIST || {/*if BINARYMIST is falsy, create a new object and pass it*/}));

BINARYMIST.coffee.brew('short black');

named functions


Web Workers

I’ll address these in another post.

Create Modules

Everywhere.

Legacy Modules (Server or Client side)

AMD Modules using RequireJS

CommonJS type Modules in Node.js

In most of the examples I’ve created in this post I’ve exported the system under test (SUT) modules and then required them into the test. Node modules are very easy to create and consume. requireFrom is a great way to require your local modules without explicit directory traversal, thus removing the need to change your require statements when you move your files that are requiring your modules.

NPM Packages

Browserify

Here we get to consume npm packages in the browser.

Universal Module Definition (UMD)

ES6 Modules

That’s right, we’re getting modules as part of the specification (15.2). Check out this post by Axel Rauschmayer to get you started.


Recursion

I’m not going to go into this here, but recursion can be used as a light weight solution to provide some logic to determine when to run the next asynchronous piece of work. Item 64 “Use Recursion for Asynchronous Loops” of the Effective JavaScript book provides some great examples. Do your self a favour and get a copy of David Herman’s book. Oh, we’re also getting tail-call optimisation in ES6.

 


EventEmitter

Still creates more garbage unless your functions are on the prototype, but does provide asynchronicity. Now we can put our functions on the prototype, but then they’ll all be public and if they’re part of a process then we don’t want our coffee process spilling all it’s secretes about how it makes perfect coffee. In saying that, if our code is hot and we’ve profiled it and it’s a stand-out for using to much memory, we could refactor EventEmittedCoffee to have its function declarations added to EventEmittedCoffee.prototype and perhaps hidden another way, but I wouldn’t worry about it until it’s been proven to be using to much memory.

Events are used in the well known Ganf Of Four Observer (behavioural) pattern (which I discussed the C# implementation of here) and at a higher level the Enterprise Integration Publish/Subscribe pattern. The Observer pattern is used in quite a few other patterns also. The ones that spring to mind are Model View Presenter, Model View Controller. The pub/sub pattern is slightly different to the Observer in that it has a topic/event channel that sits between the publisher and the subscriber and it uses contractual messages to encapsulate and transmit it’s events.

Here’s an example of the EventEmitter …

The Test

var assert = require('assert');
var should = require('should');
var requireFrom = require('requirefrom');
var sUTDirectory = requireFrom('post/nodejsAsynchronicityAndCallbackNesting');
var eventEmittedCoffee = sUTDirectory('eventEmittedCoffee');

describe('nodejsAsynchronicityAndCallbackNesting post integration test suite', function () {
   // if you don't want to wait for the machine to heat up assign minutes: 2.
   var minutes = 32;
   this.timeout(60000 * minutes);
   it('Test the event emitted coffee machine', function (done) {

      function handleSuccess(state) {
         var stateOutCome = 'The state of the ordered coffee is: ' + state.description;
         stateOutCome.should.equal('The state of the ordered coffee is: beautiful shot!');
         done();
      }

      function handleFailure(error) {
         assert.fail(error, 'brew encountered an error. The following are the error details: ' + error.message);
         done();
      }

      // We could even assign multiple event handlers to the same event. We're not here, but we could.
      eventEmittedCoffee.on('successfulOrder', handleSuccess).on('failedOrder', handleFailure);

      eventEmittedCoffee.brew();
   });
});

The System Under Test

'use strict';

var events = require('events'); // Core node module.
var util = require('util'); // Core node module.

var eventEmittedCoffee;
var espressoMachineHeatTime = {
   // if you don't want to wait for the machine to heat up assign minutes: 0.2.
   minutes: 30,
   get milliseconds() {
      return this.minutes * 60000;
   }
};
var state = {
   description: '',
   // Other properties
   error: ''
};

function EventEmittedCoffee() {

   var eventEmittedCoffee = this;

   function heatEspressoMachine(state) {
      // No need for callbacks. We can emit a failedOrder event at any stage and any subscribers will be notified.

      function emitEspressoMachineHeated() {
         console.log('Espresso machine heating cycle is done.');
         eventEmittedCoffee.emit('espressoMachineHeated', state);
      }
      // Flick switch, check water.
      console.log('Espresso machine has been turned on and is now heating.');
      // Mutate state.
      setTimeout(
         // Once espresso machine is hot event will be emitted on the next turn of the event loop...
         emitEspressoMachineHeated, espressoMachineHeatTime.milliseconds
      );
   }

   function grindDoseTampBeans(state) {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('We are now grinding, dosing, then tamping our dose.');
      eventEmittedCoffee.emit('groundDosedTampedBeans', state);
   }

   function mountPortaFilter(state) {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Porta filter is now being mounted.');
      eventEmittedCoffee.emit('portaFilterMounted', state);
   }

   function positionCup(state) {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Placing cup under portafilter.');
      eventEmittedCoffee.emit('cupPositioned', state);
   }

   function preInfuse(state) {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('10 second preinfuse now taking place.');
      eventEmittedCoffee.emit('preInfused', state);
   }

   function extract(state) {
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      console.log('Cranking leaver down and extracting pure goodness.');
      state.description = 'beautiful shot!';
      eventEmittedCoffee.emit('successfulOrder', state);
      // If you want to fail the order, replace the above two lines with the below two lines.
      // state.error = 'Oh no! That extraction came out far to fast.'
      // this.emit('failedOrder', state);
   }

   eventEmittedCoffee.on('timeToHeatEspressoMachine', heatEspressoMachine).
   on('espressoMachineHeated', grindDoseTampBeans).
   on('groundDosedTampedBeans', mountPortaFilter).
   on('portaFilterMounted', positionCup).
   on('cupPositioned', preInfuse).
   on('preInfused', extract);
}

// Make sure util.inherits is before any prototype augmentations, as it seems it clobbers the prototype if it's the other way around.
util.inherits(EventEmittedCoffee, events.EventEmitter);

// Only public method.
EventEmittedCoffee.prototype.brew = function () {
   this.emit('timeToHeatEspressoMachine', state);
};

eventEmittedCoffee = new EventEmittedCoffee();

module.exports = eventEmittedCoffee;

With using raw callbacks, we have to pass them (functions) around. With events, we can have many interested parties request (subscribe) to be notified when something that our interested parties are interested in happens (the event). The Observer pattern promotes loose coupling, as the thing (publisher) wanting to inform interested parties of specific events has no knowledge of it’s subscribers, this is essentially what a service is.

Resources


Async.js

Provides a collection of methods on the async object that:

  1. take an array and perform certain actions on each element asynchronously
  2. take a collection of functions to execute in specific orders asynchronously, some based on different criteria. The likes of async.waterfall allow you to pass results of a previous function to the next. Don’t underestimate these. There are a bunch of very useful routines.
  3. are asynchronous utilities

Here’s an example…

The Test

var assert = require('assert');
var should = require('should');
var requireFrom = require('requirefrom');
var sUTDirectory = requireFrom('post/nodejsAsynchronicityAndCallbackNesting');
var asyncCoffee = sUTDirectory('asyncCoffee');

describe('nodejsAsynchronicityAndCallbackNesting post integration test suite', function () {
   // if you don't want to wait for the machine to heat up assign minutes: 2.
   var minutes = 32;
   this.timeout(60000 * minutes);
   it('Test the async coffee machine', function (done) {

      var result = function (error, resultsFromAllAsyncSeriesFunctions) {
         var stateOutCome;
         var expectedErrorOutCome = null;
         if(!error) {
            stateOutCome = 'The state of the ordered coffee is: '
               + resultsFromAllAsyncSeriesFunctions[resultsFromAllAsyncSeriesFunctions.length - 1].description;
            stateOutCome.should.equal('The state of the ordered coffee is: beautiful shot!');
         } else {
            assert.fail(
               error,
               expectedErrorOutCome,
               'brew encountered an error. The following are the error details. message: '
                  + error.message
                  + '. The finished state of the ordered coffee is: '
                  + resultsFromAllAsyncSeriesFunctions[resultsFromAllAsyncSeriesFunctions.length - 1].description
            );
         }
         done();
      };

      asyncCoffee().brew(result)
   });
});

The System Under Test

'use strict';

var async = require('async');
var espressoMachineHeatTime = {
   // if you don't want to wait for the machine to heat up assign minutes: 0.2.
   minutes: 30,
   get milliseconds() {
      return this.minutes * 60000;
   }
};
var state = {
   description: '',
   // Other properties
   error: null
};

module.exports = function asyncCoffee() {

   var brew = function (onCompletion) {
      async.series([
         function heatEspressoMachine(heatEspressoMachineDone) {
            // No need for callbacks. We can just pass an error to the async supplied callback at any stage and the onCompletion callback will be invoked with the error and the results immediately.

            function espressoMachineHeated() {
               console.log('Espresso machine heating cycle is done.');
               heatEspressoMachineDone(state.error);
            }
            // Flick switch, check water.
            console.log('Espresso machine has been turned on and is now heating.');
            // Mutate state.
            setTimeout(
               // Once espresso machine is hot, heatEspressoMachineDone will be invoked on the next turn of the event loop...
               espressoMachineHeated, espressoMachineHeatTime.milliseconds
            );
         },
         function grindDoseTampBeans(grindDoseTampBeansDone) {
            // Perform long running action, delegating async tasks passing callback and returning immediately.
            console.log('We are now grinding, dosing, then tamping our dose.');
            grindDoseTampBeansDone(state.error);
         },
         function mountPortaFilter(mountPortaFilterDone) {
            // Perform long running action, delegating async tasks passing callback and returning immediately.
            console.log('Porta filter is now being mounted.');
            mountPortaFilterDone(state.error);
         },
         function positionCup(positionCupDone) {
            // Perform long running action, delegating async tasks passing callback and returning immediately.
            console.log('Placing cup under portafilter.');
            positionCupDone(state.error);
         },
         function preInfuse(preInfuseDone) {
            // Perform long running action, delegating async tasks passing callback and returning immediately.
            console.log('10 second preinfuse now taking place.');
            preInfuseDone(state.error);
         },
         function extract(extractDone) {
            // Perform long running action, delegating async tasks passing callback and returning immediately.
            console.log('Cranking leaver down and extracting pure goodness.');
            // If you want to fail the order, uncomment the below line. May as well change the description too.
            // state.error = {message: 'Oh no! That extraction came out far to fast.'};
            state.description = 'beautiful shot!';
            extractDone(state.error, state);

         }
      ],
      onCompletion);
   };

   return {
      // Publicise brew.
      brew: brew
   };
};

Other Similar Useful libraries


Adding to Prototype

Check out my post on prototypes. If profiling reveals you’re spending to much memory or processing time creating the objects that contain the functions that are going to be used asynchronously you could add the functions to the objects prototype like we did with the public brew method of the EventEmitter example above.


Promises

The concepts of promises and futures which are quite similar, have been around a long time. their roots go back to 1976 and 1977 respectively. Often the terms are used interchangeably, but they are not the same thing. You can think of the language agnostic promise as a proxy for a value provided by an asynchronous actions eventual success or failure. a promise is something tangible, something you can pass around and interact with… all before or after it’s resolved or failed. The abstract concept of the future (discussed below) has a value that can be mutated once from pending to either fulfilled or rejected on fulfilment or rejection of the promise.

Promises provide a pattern that abstracts asynchronous operations in code thus making them easier to reason about. Promises which abstract callbacks can be passed around and the methods on them chained (AKA Promise pipelining). Removing temporary variables makes it more concise and clearer to readers that the extra assignments are an unnecessary step.

JavaScript Promises

A promise (Promises/A+ thenable) is an object or sometimes more specifically a function with a then (JavaScript specific) method.
A promise must only change it’s state once and can only change from either pending to fulfilled or pending to rejected.

Semantically a future is a read-only property.
A future can only have its value set (sometimes called resolved, fulfilled or bound) once by one or more (via promise pipelining) associated promises.
Futures are not discussed explicitly in the Promises/A+, although are discussed implicitly in the promise resolution procedure which takes a promise as the first argument and a value as the second argument.
The idea is that the promise (first argument) adopts the state of the second argument if the second argument is a thenable (a promise object with a then method). This procedure facilitates the concept of the “future”

We’re getting promises in ES6. That means JavaScript implementers are starting to include them as part of the language. Until we get there, we can use the likes of these libraries.

One of the first JavaScript promise drafts was the Promises/A (1) proposal. The next stage in defining a standardised form of promises for JavaScript was the Promises/A+ (2) specification which also has some good resources for those looking to use and implement promises based on the new spec. Just keep in mind though, that this has nothing to do with the EcmaScript specification, although it is/was a precursor.

Then we have Dominic Denicola’s promises-unwrapping repository (3) for those that want to stay ahead of the solidified ES6 draft spec (4). Dominic’s repo is slightly less specky and may be a little more convenient to read, but if you want gospel, just go for the ES6 draft spec sections 25.4 Promise Objects and 7.5 Operations on Promise Objects which is looking fairly solid now.

The 1, 2, 3 and 4 are the evolutionary path of promises in JavaScript.

Node Support

Although it was decided to drop promises from Node core, we’re getting them in ES6 anyway. V8 already supports the spec and we also have plenty of libraries to choose from.

Node on the other hand is lagging. Node stable 0.10.29 still looks to be using version 3.14.5.9 of V8 which still looks to be about 17 months from the beginnings of the first sign of native ES6 promises according to how I’m reading the V8 change log and the Node release notes.

So to get started using promises in your projects whether your programming server or client side, you can:

  1. Use one of the excellent Promises/A+ conformant libraries which will give you the flexibility of lots of features if that’s what you need, or
  2. Use the native browser promise API of which all ES6 methods on Promise work in Chrome (V8 -> Soon in Node), Firefox and Opera. Then polyfill using the likes of yepnope, or just check the existence of the methods you require and load them on an as needed basis. The cujojs or jakearchibald shims would be good starting points.

For my examples I’ve decided to use when.js for several reasons.

  • Currently in Node we have no native support. As stated above, this will be changing soon, so we’d be polyfilling everything.
  • It’s performance is the least worst of the Promises/A+ compliant libraries at this stage. Although don’t get to hung up on perf stats. In most cases they won’t matter in context of your module. If you’re concerned, profile your running code.
  • It wraps non Promises/A+ compliant promise look-a-likes like jQuery’s Deferred which will forever remain broken.
  • Is compliant with spec version 1.1

The following example continues with the coffee making procedure concept. Now we’ve taken this from raw callbacks to using the EventEmitter to using the Async library and finally to what I think is the best option for most of our asynchronous work, not only in Node but JavaScript anywhere. Promises. Now this is just one way to implement the example. There are many and probably many of which are even more elegant. Go forth explore and experiment.

The Test

var should = require('should');
var requireFrom = require('requirefrom');
var sUTDirectory = requireFrom('post/nodejsAsynchronicityAndCallbackNesting');
var promisedCoffee = sUTDirectory('promisedCoffee');

describe('nodejsAsynchronicityAndCallbackNesting post integration test suite', function () {
   // if you don't want to wait for the machine to heat up assign minutes: 2.
   var minutes = 32;
   this.timeout(60000 * minutes);
   it('Test the coffee machine of promises', function (done) {

      var numberOfSteps = 7;
      // We could use a then just as we've used the promises done method, but done is semantically the better choice. It makes a bigger noise about handling errors. Read the docs for more info.
      promisedCoffee().brew().done(
         function handleValue(valueOrErrorFromPromiseChain) {
            console.log(valueOrErrorFromPromiseChain);
            valueOrErrorFromPromiseChain.errors.should.have.length(0);
            valueOrErrorFromPromiseChain.stepResults.should.have.length(numberOfSteps);
            done();
         }
      );
   });

});

The System Under Test

'use strict';

var when = require('when');
var espressoMachineHeatTime = {
   // if you don't want to wait for the machine to heat up assign minutes: 0.2.
   minutes: 30,
   get milliseconds() {
      return this.minutes * 60000;
   }
};
var state = {
   description: '',
   // Other properties
   errors: [],
   stepResults: []
};
function CustomError(message) {
   this.message = message;
   // return false
   return false;
}

function heatEspressoMachine(resolve, reject) {
   state.stepResults.push('Espresso machine has been turned on and is now heating.');
   function espressoMachineHeated() {
      var result;
      // result will be wrapped in a new promise and provided as the parameter in the promises then methods first argument.
      result = 'Espresso machine heating cycle is done.';
      // result could also be assigned another promise
      resolve(result);
      // Or call the reject
      //reject(new Error('Something screwed up here')); // You'll know where it originated from. You'll get full stack trace.
   }
   // Flick switch, check water.
   console.log('Espresso machine has been turned on and is now heating.');
   // Mutate state.
   setTimeout(
      // Once espresso machine is hot, heatEspressoMachineDone will be invoked on the next turn of the event loop...
      espressoMachineHeated, espressoMachineHeatTime.milliseconds
   );
}

// The promise takes care of all the asynchronous stuff without a lot of thought required.
var promisedCoffee = when.promise(heatEspressoMachine).then(
   function fulfillGrindDoseTampBeans(result) {
      state.stepResults.push(result);
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      return 'We are now grinding, dosing, then tamping our dose.';
      // Or if something goes wrong:
      // throw new Error('Something screwed up here'); // You'll know where it originated from. You'll get full stack trace.
   },
   function rejectGrindDoseTampBeans(error) {
      // Deal with the error. Possibly augment some additional insight and re-throw.
      if(state.errors[state.errors.length -1] !== error.message)
         state.errors.push(error.message);
      throw new CustomError(error.message);
   }
).then(
   function fulfillMountPortaFilter(result) {
      state.stepResults.push(result);
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      return 'Porta filter is now being mounted.';
   },
   function rejectMountPortaFilter(error) {
      // Deal with the error. Possibly augment some additional insight and re-throw.
      if(state.errors[state.errors.length -1] !== error.message)
         state.errors.push(error.message);
      throw new Error(error.message);
   }
).then(
   function fulfillPositionCup(result) {
      state.stepResults.push(result);
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      return 'Placing cup under portafilter.';
   },
   function rejectPositionCup(error) {
      // Deal with the error. Possibly augment some additional insight and re-throw.
      if(state.errors[state.errors.length -1] !== error.message)
         state.errors.push(error.message);
      throw new CustomError(error.message);
   }
).then(
   function fulfillPreInfuse(result) {
      state.stepResults.push(result);
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      return '10 second preinfuse now taking place.';
   },
   function rejectPreInfuse(error) {
      // Deal with the error. Possibly augment some additional insight and re-throw.
      if(state.errors[state.errors.length -1] !== error.message)
         state.errors.push(error.message);
      throw new CustomError(error.message);
   }
).then(
   function fulfillExtract(result) {
      state.stepResults.push(result);
      state.description = 'beautiful shot!';
      state.stepResults.push('Cranking leaver down and extracting pure goodness.');
      // Perform long running action, delegating async tasks passing callback and returning immediately.
      return state;
   },
   function rejectExtract(error) {
      // Deal with the error. Possibly augment some additional insight and re-throw.
      if(state.errors[state.errors.length -1] !== error.message)
         state.errors.push(error.message);
      throw new CustomError(error.message);

   }
).catch(CustomError, function (e) {
      // Only deal with the error type that we know about.
      // All other errors will propagate to the next catch. whenjs also has a finally if you need it.
      // Todo: KimC. Do the dealing with e.
      e.newCustomErrorInformation = 'Ok, so we have now dealt with the error in our custom error handler.';
      return e;
   }
).catch(function (e) {
      // Handle other errors
      e.newUnknownErrorInformation = 'Hmm, we have an unknown error.';
      return e;
   }
);

function brew() {
   return promisedCoffee;
}

// when's promise.catch is only supposed to catch errors derived from the native Error (etc) functions.
// Although in my tests, my catch(CustomError func) wouldn't catch it. I'm assuming there's a bug as it kept giving me a TypeError instead.
// Looks like it came from within the library. So this was a little disappointing.
CustomError.prototype = Error;

module.exports = function promisedCoffee() {
   return {
      // Publicise brew.
      brew: brew
   };
};

Resources


Testing Asynchronous Code

All of the tests I demonstrated above have been integration tests. Usually I’d unit test the functions individually not worrying about the intrinsically asynchronous code, as most of it isn’t mine anyway, it’s C/O the EventEmitter, Async and other libraries and there is often no point in testing what the library maintainer already tests.

When you’re driving your development with tests, there should be little code testing the asynchronicity. Most of your code should be able to be tested synchronously. This is a big part of the reason why we drive our development with tests, to make sure your code is easy to test. Testing asynchronous code is a pain, so don’t do it much. Test your asynchronous code yes, but most of your business logic should be just functions that you join together asynchronously. When you’re unit testing, you should be testing units, not asynchronous code. When you’re concerned about testing your asynchronicity, that’s called integration testing. Which you should have a lot less of. I discuss the ratios here.

As of 1.18.0 Mocha now has baked in support for promises. For fluent style of testing promises we have Chai as Promised.

Resources


There are plenty of other resources around working with promises in JavaScript. For myself I found that I needed to actually work with them to solidify my understanding of them. With Chrome DevTool async option, we’ll soon have support for promise chaining.

Other Excellent Resources

And again all of the code samples can be found at GitHub.