Posts Tagged ‘Cloud’

Keeping Your NodeJS Web App Running on Production Linux

June 27, 2015

All the following offerings that I’ve evaluated target different scenarios. I’ve listed the pros and cons for each of them and where I think they fit into a potential solution to monitor your web applications (I’m leaning toward NodeJS) and make sure they keep running. I’ve listed the goals I was looking to satisfy.

For me I have to have a good knowledge of the landscape before I commit to a decision and stand behind it. I like to know I’ve made the best decision based on all the facts that are publicly available. Therefore, as always, it’s my responsibility to make sure I’ve done my research in order to make an informed and ideally… best decision possible. I’m pretty sure my evaluation was un-biased, as I hadn’t used any of the offerings other than forever before.

I looked at quite a few more than what I’ve detailed below, but the following candidates I felt were worth spending some time on.

Keep in mind, that everyone’s requirements will be different, so rather than tell you which to use because I don’t know your situation, I’ve listed the attributes (positive, negative and neutral) that I think are worth considering when making this choice. After my evaluation I make some decisions and start the configuration.

Evaluation criterion

  1. Who is the creator. I favour teams rather than individuals, as individuals move on, then where does that leave the product?
  2. Does it do what we need it to do? Goals address this.
  3. Do I foresee any integration problems with other required components?
  4. Cost in money. Is it free? I usually gravitate toward free software. It’s usually an easier sell to clients and management. Are there catches once you get further down the road? Usually open source projects are marketed as is.
  5. Cost in time. Is the set-up painful?
  6. How well does it appear to be supported? What do the users say?
  7. Documentation. Is there any / much? What is it’s quality?
  8. Community. Does it have an active one? Are the users getting their questions answered satisfactorily? Why are the unhappy users unhappy (do they have a valid reason).
  9. Release schedule. How often are releases being made? When was the last release?
  10. Gut feeling, Intuition. How does it feel. If you have experience in making these sorts of choices, lean on it. Believe it or not, this should probably be No. 1.

The following tools have been my choice based on the above criterion.

Goals

  1. Application should start automatically on system boot
  2. Application should be re-started if it dies or becomes un-responsive
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy (Nginx, node-http-proxy, Tinyproxy, Squid, Varnish, etc)
    2. Clustering and providing load balancing for your single threaded application
    3. Visibility of application statistics.
  4. Enough documentation to feel comfortable consuming the offering
  5. The offering should be production ready. This means: mature with a security conscious architecture.

Sysvinit, Upstart, systemd & Runit

You’ll have one of these running on your Linux box.

These are system and service managers for Linux. Upstart and the later systemd were developed as replacements for the traditional init daemon (Sysvinit), which all depend on init. Init is an essential package that pulls in the default init system. In Debian, starting with Jessie, systemd is your default system and service manager.

There’s some quite helpful info on the differences between Sysvinit and systemd here.

systemd

As I have systemd installed out of the box on my test machine (Debian Jessie), I’ll be using this for my set-up.

Documentation

  1. Well written comparison with Upstart, systemd, Runit and even Supervisor.

Running the likes of the below commands will provide some good details on how these packages interact with each other:

aptitude show sysvinit
aptitude show systemd
# and any others you think of

These system and service managers all run as PID 1 and start the rest of the system. Your Linux system will more than likely be using one of these to start tasks and services during boot, stop them during shutdown and supervise them while the system is running. Ideally you’re going to want to use something higher level to look after your NodeJS app. See the following candidates…

forever

and it’s web UI. Can run any kind of script continuously (whether it is written in node.js or not). This wasn’t always the case though. It was originally targeted toward keeping NodeJS applications running.

Requires NPM to install globally. We already have a package manager on Debian and all other main-stream Linux distros. Installing NPM just adds more attack surface area. Unless it’s essential, I’d rather do without NPM on a production server where we’re actively working to reduce the installed package count and disable everything else we can. I could install forever on a development box and then copy to the production server, but it starts to turn the simplicity of a node module into something not as simple, which then makes offerings like Supervisor, Monit and Passenger look even more attractive.

NPM Details

Does it Meet our Goals?

  1. Not without an extra script. Crontab or similar
  2. Application will be re-started if it dies, but if it’s response times go up, there’s not much forever is going to do about it. It has no way of knowing.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: I don’t see a problem
    2. Integrate NodeJS’s core module cluster into your NodeJS application for load balancing
    3. Visibility of application statistics could be added later with the likes of Monit or something else, but if you used Monit, then there wouldn’t really be a need for forever as Monit does the little that forever does and is capable of so much more, but is not pushy on what to do and how to do it. All the behaviour is defined with quite a nice syntax in a config file or as many as you like.
  4. I think there is enough documentation to feel comfortable consuming it, as forever doesn’t do a lot, which doesn’t have to be a bad thing.
  5. The code it self is probably production ready, but I’ve heard quite a bit about stability issues. You’re also expected to have NPM installed (more attack surface) when we already have native package managers on the server(s).

Overall Thoughts

For me, I’m looking for a tool set that does a bit more. Forever doesn’t satisfy my requirements. There’s often a balancing act between not doing enough and doing too much.

PM2

PM2

Younger than forever, but seems to have quite a few more features and does actually look quite good. I’m not sure about production ready though?

As mentioned on the github page: “PM2 is a production process manager for Node.js applications with a built-in load balancer“. This “Sounds” and at the initial glance looks shiny. Very quickly you should realise there are a few security issues you need to be aware of though.

The word “production” is used but it requries NPM to install globally. We already have a package manager on Debian and all other main-stream Linux distros. Installing NPM just adds more attack surface area. Unless it’s essential and it shouldn’t be, I’d rather do without it on a production system. I could install PM2 on a development box and then copy to the production server, but it starts to turn the simplicity of a node module into something not as simple, which then makes offerings like Supervisor, Monit and Passenger look even more attractive.

At the time of writing this, it’s less than a year old and in nodejs land, that means it’s very much in the immature realm. Do you really want to use something that young on a production server? I’d personally advise against it.

Yes, it’s very popular currently. That doesn’t tell me it’s ready for production though. It tells me the marketing is working.

Is your production server ready for PM2? That phrase alone tells me the mind-set behind the project. I’d much sooner see it the other way around. Is PM2 ready for my production server? You’re going to need a staging server for this, unless you honestly want development tools installed on your production server (git, build-essential, NVM and an unstable version of node 0.11.14 (at time of writing)) and run test scripts on your production server? Not for me or my clients thanks.

If you’ve considered the above concerns and can justify adding the additional attack surface area, check out the features if you haven’t already.

Features that Stood Out

They’re also listed on the github repository. Just beware of some of the caveats. Like for the load balancing: “we recommend the use of node#0.11.15+ or io.js#1.0.2+. We do not support node#0.10.* cluster module anymore!” 0.11.15 is unstable, but hang-on, I thought PM2 was a “production” process manager? OK, so were happy to mix unstable in with something we label as production?

On top of NodeJS, PM2 will run the following scripts: bash, python, ruby, coffee, php, perl.

Start-up Script Generation

Although I’ve heard a few stories that this is fairly un-reliable at the time of writing this. Which doesn’t surprise me, as the project is very young.

Documentation

  1. Advanced Readme

Does it Meet our Goals?

  1. The feature exists, unsure of how reliable it is currently though?
  2. Application should be re-started if it dies shouldn’t be a problem. PM2 can also restart your application if it reaches a certain memory threshold. I haven’t seen anything around restarting based on response times or other application health issues.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: I don’t see a problem
    2. Clustering and load-balancing is integrated but immature.
    3. PM2 provides a small collection of viewable statistics. Personally I’d want more, but I don’t see any reason why you’d have to swap PM2 because of this.
  4. There is reasonable official documentation for the age of the project. The community supplied documentation will need to catch up a bit, although there is a bit of that too. After working through all of the offerings and edge-cases, I feel as I usually do with NodeJS projects. The documentation doesn’t cover all the edge-cases and the development itself misses edge cases. Hopefully with time it’ll get better though as the project does look promising.
  5. I haven’t seen much that would make me think PM2 is production ready. It’s not yet mature. I don’t agree with it’s architecture.

Overall Thoughts

For me, the architecture doesn’t seem to be heading in the right direction to be used on a production web server where less is better. I’d like to see this change. If it did, I think it could be a serious contender for this space.

 


The following are better suited to monitoring and managing your applications. Other than Passenger, they should all be in your repository, which means trivial installs and configurations.

Supervisor

Supervisord

Supervisor is a process manager with a lot of features and a higher level of abstraction than the likes of the above Sysvinit, upstart, systemd, Runit, etc so it still needs to be run by an init daemon in itself.

From the docs: “It shares some of the same goals of programs like launchd, daemontools, and runit. Unlike some of these programs, it is not meant to be run as a substitute for init as “process id 1”. Instead it is meant to be used to control processes related to a project or a customer, and is meant to start like any other program at boot time.” Supervisor monitors the state of processes. Where as a tool like Monit can perform so many more types of tests and take what ever actions you define.

It’s in the Debian repositories  (trivial install on Debian and derivatives).

Documentation

  1. Main web site
  2. There’s a good short comparison here.

Source

Does it Meet our Goals?

  1. Application should start automatically on system boot: Yip. That’s what Supervisor does well.
  2. Application will be re-started if it dies, or becomes un-responsive. It’s often difficult to get accurate up/down status on processes on UNIX. Pidfiles often lie. Supervisord starts processes as subprocesses, so it always knows the true up/down status of its children.If your application becomes unresponsive or can’t connect to it’s database or any other service/resource it needs to work as expected. To be able to monitor these events and respond accordingly your application can expose a health-check interface, like GET /healthcheck. If everything goes well it should return HTTP 200, if not then HTTP 5**In some cases the restart of the process will solve this issue. httpok is a Supervisor event listener which makes GET requests to the configured URL. If the check fails or times out, httpok will restart the process.To enable httpok the following lines have to be placed in supervisord.conf:
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: I don’t see a problem
    2. Integrate NodeJS’s core module cluster into your NodeJS application for load balancing. This would be completely separate to supervisor.
    3. Visibility of application statistics could be added later with the likes of Monit or something else. For me, Supervisor doesn’t do enough. Monit does. Plus if you need what Monit offers, then you have to have three packages to think about, or Something like Supervisor, which is not an init system, so it kind of sits in the middle of the ultimate stack. So my way of thinking is, use the init system you already have to do the low level lifting and then something small to take care of everything else on your server that the init system is not really designed for and Monit has done this job really well. Just keep in mind also. This is not based on any bias. I hadn’t used Monit before this exercise.
  4. Supervisor is a mature product. It’s been around since 2004 and is still actively developed. The official and community provided docs are good.
  5. Yes it’s production ready. It’s proven itself.

 

Overall Thoughts

The documentation is quite good, easy to read and understand. I felt that the config was quite intuitive also. I already had systemd installed out of the box and didn’t see much point in installing Supervisor as systemd appeared to do everything Supervisor could do, plus systemd is an init system (it sits at the bottom of the stack). In most scenarios you’re going to have a Sysvinit or replacement of (that runs with a PID of 1), so in many cases Supervisor although it’s quite nice is kind of redundant, and of course Ubuntu has Upstart.

Supervisor is better suited to running multiple scripts with the same runtime, for example a bunch of different client applications running on Node. This can be done with systemd and the others, but Supervisor is a better fit for this sort of thing.

 

Monit

monit

Is a utility for monitoring and managing daemons or similar programs. It’s mature, actively maintained, free, open source and licensed with GNU AGPL.

It’s in the debian repositories (trivial install on Debian and derivatives). The home page told me the binary was just under 500kB. The install however produced a different number:

After this operation, 765 kB of additional disk space will be used.

Monit provides an impressive feature set for such a small package.

Monit provides far more visibility into the state of your application and control than any of the offerings mentioned above. It’s also generic. It’ll manage and/or monitor anything you throw at it. It has the right level of abstraction. Often when you start working with a product you find it’s limitations and they stop you moving forward and you end up settling for imperfection or you swap the offering for something else providing you haven’t already invested to much effort into it. For me Monit hit the sweet spot and never seems to stop you in your tracks. There always seems to be an easy to relatively easy way to get any monitoring->take action sort of task done. What I also really like is that moving away from Monit should be relatively painless also. The time investment is small and some of it will be transferable in many cases. It’s just config from the control file.

Features that Stood Out

  • Ability to monitor files, directories, disks, processes, the system and other hosts.
  • Can perform emergency logrotates if a log file suddenly grows too large too fast
  • File Checksum TestingThis is good so long as the compromised server hasn’t also had the tool your using to perform your verification (md5sum or sha1sum) modified, which would be common. That’s why in cases like this, tools such as stealth can be a good choice.
  • Testing of other attributes like ownership and access permissions. These are good, but again can easily be modified.
  • Monitoring directories using time-stamp. Good idea, but don’t rely solely on this. time-stamps are easily modified with touch -r … providing you do it between Monit’s cycles and you don’t necessarily know when they are unless you have permissions to look at Monit’s control file.
  • Monitoring space of file-systems
  • Has a built-in lightweight HTTP(S) interface you can use to browse the Monit server and check the status of all monitored services. From the web-interface you can start, stop and restart processes and disable or enable monitoring of services. Monit provides fine grained control over who/what can access the web interface or whether it’s even active or not. Again an excellent feature that you can choose to use or not even have the extra attack surface.
  • There’s also an agregator (m/monit) that allows sys-admins to monitor and manage many hosts at a time. Also works well on mobile devices and is available at a one off cost (reasonable price) to monitor all hosts.
  • Once you install Monit you have to actively enable the http daemon in the monitrc in order to run the Monit cli and/or access the Monit http web UI. At first I thought “is this broken?” I couldn’t even run monit status (it’s a Monit command). ps told me Monit was running. Then I realised… it’s secure by default. You have to actually think about it in order to expose anything. It was this that confirmed Monit for me.
  • The Control File
  • Just like SSH, to protect the security of your control file and passwords the control file must have read-write permissions no more than 0700 (u=xrw,g=,o=); Monit will complain and exit otherwise.

Documentation

The following was the documentation I used in the same order and I found that the most helpful.

  1. Main web site
  2. Official Documentation
  3. Source and links to other documentation including a QUICK START guide of about 6 lines.
  4. Adding Monit to systemd
  5. Release notes

Does it Meet our Goals?

  1. Application can start automatically on system boot
  2. Monit has a plethora of different types of tests it can perform and then follow up with actions based on the outcomes. Http is but one of them.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: Yes, I don’t see any issues here
    2. Integrate NodeJS’s core module cluster into your NodeJS application for load balancing. Monit will still monitor, restart and do what ever else you tell it to do.
    3. Monit provides application statistics to look at if that’s what you want, but it also goes further and provides directives for you to declare behaviour based on conditions that Monit checks for.
  4. Plenty of official and community supplied documentation
  5. Yes it’s production ready. It’s proven itself. Some extra education around some of the points I raised above with some of the security features would be good. If you could trust the hosts hashing programme (and other commonly trojanised programmes like find, ls, etc) that Monit uses, perhaps because you were monitoring it from a stealth controller (which had already taken a known good copy and produced it’s own bench-mark hash) or similar then yes, you could use that feature of Monit with greater assurance that the results it was producing were in fact accurate. In saying that, you don’t have to use the feature, but it’s there if you want it, which I see as very positive so long as you understand what could go wrong and where.

 

Overall Thoughts

The accepted answer here is a pretty good mix and approach to using the right tools for each job. Monit has a lot of capabilities, none of which you must use, so it doesn’t get in your way, as many opinionated tools do and like to dictate how you do things and what you must use in order to do them. Monit allows you to leverage what ever you already have in your stack. You don’t have to install package managers or increase your attack surface other than [apt-get|aptitude] install monit It’s easy to configure and has lots of good documentation.

Passenger

Passenger

I’ve looked at Passenger before and it looked quite good then. It still does, with one main caveat. It’s trying to do to much. One can easily get lost in the official documentation (example of the Monit install (handfull of commands to cover all Linux distros one page) vs Passenger install (aprx 10 pages)).  “Passenger is a web server and application server, designed to be fast, robust and lightweight. It runs your web apps with the least amount of hassle by taking care of almost all administrative heavy lifting for you.” I’d like to see the actual weight rather than just a relative term “lightweight”. To me it doesn’t look light weight. The feeling I got when evaluating Passenger was similar to the feeling produced with my Ossec evaluation.

The learning curve is quite a bit steeper than all the previous offerings. Passenger has strong opinions that once you buy into could make it hard to use the tools you may want to swap in and out. I’m not seeing the UNIX Philosophy here.

If you look at the Phusion Passenger Philosophy we see some note-worthy comments. “We believe no good software has bad documentation“. If your software is 100% intuitive, the need for documentation should be minimal. Few software products are 100% intuitive, because we only have so much time to develop it. The comment around “the Unix way” is interesting also. At this stage I’m not sure they’ve done better. I’d like to spend some time with someone or some team that has Passenger in production in a diverse environment and see how things are working out.

Passenger isn’t in the Debian repositories, so you would need to add the apt repository.

Passenger is six years old at the time of writing this, but the NodeJS support is only just over a year old.

Features that Stood Didn’t really Stand Out

Sadly there weren’t many that stood out for me.

  • Handle more traffic looked similar to Monit resource testing but without the detail. If there’s something Monit can’t do well, it’ll say “Hay, use this other tool and I’ll help you configure it to suite the way you want to work. If you don’t like it, swap it out for something else” With Passenger it seems to integrate into everything rather than providing tools to communicate loosely. Essentially locking you into a way of doing something that hopefully you like. It also talks about “Uses all available CPU cores“. If you’re using Monit you can use the NodeJS cluster module to take care of that. Again leaving the best tool for the job to do what it does best.
  • Reduce maintenance
    • Keep your app running, even when it crashesPhusion Passenger supervises your application processes, restarting them when necessary. That way, your application will keep running, ensuring that your website stays up. Because this is automatic and builtin, you do not have to setup separate supervision systems like Monit, saving you time and effort.” but this is what Monit excels at and it’s a much easier set-up than Passenger. This sort of marketing doesn’t sit right with me.
    • Host multiple apps at once. Host multiple apps on a single server with minimal effort. ” If we’re talking NodeJS web apps, then they are their own server. They host themselves. In this case it looks like Passenger is trying to solve a problem that doesn’t exist?
  • Improve security
    • Privilege separationIf you host multiple apps on the same system, then you can easily run each app as a different Unix user, thereby separating privileges.“. The Monit documentation says this: “If Monit is run as the super user, you can optionally run the program as a different user and/or group.” and goes on to provide examples how it’s done. So again I don’t see anything new here. Other than the “Slow client protections” which has side affects, that’s it for security considerations with Passenger. From what I’ve seen Monit has more in the way of security related features.
  • What I saw happening here was a lot of stuff that I actually didn’t need. Your mileage may vary.

Offerings

Phusion Passenger is a commercial product that has enterprise, custom and open source (which is free and still has loads of features).

Documentation

The following was the documentation I used in the same order and I found that the most helpful.

  1. NodeJS tutorial (This got me started with how it could work with NodeJS)
  2. Main web site
  3. Documentation and support portal
  4. Design and Architecture
  5. User Guide Index
  6. Nginx specific User Guide
  7. Standalone User Guide
  8. Twitter, blog
  9. IRC: #passenger at irc.freenode.net. I was on there for several days. There was very little activity.

Source

Does it Meet our Goals?

  1. Application should start automatically on system boot. There is no doubt that Passenger goes way beyond this aim.
  2. Application should be re-started if it dies or becomes un-responsive. There is no doubt that Passenger goes way beyond this aim.
  3. Ability to add the following later without having to swap the chosen offering:
    1. Reverse proxy: Passenger provides Integrations into Nginx, Apache and stand-alone (provide your own proxy)
    2. Passenger scales up NodeJS processes and automatically load balances between them
    3. Passenger is advertised as offering easily viewable statistics.
  4. There is loads of official documentation. Not as much community contributed though, as it’s still young.
  5. From what I’ve seen so far, I’d say Passenger is production ready. I would like to see more around how security was baked into the architecture though before I committed to using it.

Overall Thoughts

I spent quite a while reading the documentation. I just think it’s doing to much. I prefer to have stronger single focused tools that do one job, do it well and play nicely with all the other kids in the sand pit. You pick the tool up and it’s just intuitive how to use it and you end up reading docs to confirm how you think it should work. For me, this is not how passenger is.

If you’re looking for something even more comprehensive, check out Zabbix. If you like to pay for your tools, check out Nagios if you haven’t already.


At this point it was fairly clear as to which components I’d be using and configuring to keep my NodeJS application monitored, alive and healthy along with any other scripts or processes. systemd and Monit. If you’re on Ubuntu, you’d probably use Upstart instead of systemd as it should already be your default init system. So going with the default for the init system should give you a quick start and provide plenty of power. Plus it’s well supported, reliable, feature rich and you can manage anything/everything you want without installing extra packages. For the next level up, I’d choose Monit. I’ve now used it in production and it’s taken care of everything above the init system. I feel it has a good level of abstraction, plenty of features, doesn’t get in the way and integrates nicely into your production OS.

Getting Started with Monit

So we’ve installed Monit with an apt-get install monit and we’re ready to start configuring it.

ps aux | grep -i monit

Will reveal that Monit is running:

/usr/bin/monit -c /etc/monit/monitrc

Now if you issue a sudo service monit restart, it won’t work as you can’t access the Monit CLI due to the httpd not running.

The first thing we need to do is make some changes to the control file (/etc/monit/monitrc in Debian). The control file has sensible defaults already. At this stage I don’t need a web UI accessible via localhost or any other hosts, but it still needs to be turned on and accessible by at least localhost. Here’s why:

Note that HTTP support is required for almost all Monit CLI interface operation, as CLI commands (such as “monit status”) are handled by communicating with the Monit background process via the the HTTP interface. So basically you should have this enable, though you can bind the HTTP interface to localhost only so Monit is not accessible from the outside.

In order to turn on the httpd, all you need in your control file for that is:

set httpd port 2812 and use address localhost # only accept connection from localhost
allow localhost # allow localhost to connect to the server and

If you want to receive alerts via email, then you’ll need to configure that. Then on reload you should get start and stop events (when you quit).

sudo monit reload

Now if you issue a curl localhost:2812 you should get the web UI’s response of a html page. Now you can start to play with the Monit CLI

Now to stop the Monit background process use:

monit quit

Oh, you can find all the arguments you can throw at Monit here, or just issue a:

monit -h # will list all options.

To check the control file for syntax errors:

sudo monit -t

Also keep an eye on your log file which is specified in the control file: set logfile /var/log/monit.log

Right. So what happens when Monit dies…

Keep Monit Alive

Now you’re going to want to make sure your monitoring tool that can be configured to take all sorts of actions never just stops running, leaving you flying blind. No noise from your servers means all good right? Not necessarily. Your monitoring tool just has to keep running. So lets make sure of that now.

When Monit is apt-get install‘ed on Debian it gets installed and configured to run as a daemon. This is defined in Monit’s init script.
Monit’s init script is copied to /etc/init.d/ and the run levels set-up for it. This means when ever a run level is entered the init script will be run taking either the single argument of stop (example: /etc/rc0.d/K01monit), or start (example: /etc/rc2.d/S17monit). Further details on run levels here.

systemd to the rescue

Monit is pretty stable, but if for some reason it dies, then it won’t be automatically restarted again.
This is where systemd comes in. systemd is installed out of the box on Debian Jessie on-wards. Ubuntu uses Upstart which is similar. Both SysV init and systemd can act as drop-in replacements for each other or even work along side of each other, which is the case in Debian Jessie. If you add a unit file which describes the properties of the process that you want to run, then issue some magic commands, the systemd unit file will take precedence over the init script (/etc/init.d/monit)

Before we get started, lets get some terminology established. The two concepts in systemd we need to know about are unit and target.

  1. A unit is a configuration file that describes the properties of the process that you’d like to run. There are many examples of these I can show you and I’ll point you in the direction soon. They should have a [Unit] directive at a minimum. The syntax of the unit files and the target files were derived from Microsoft Windows .ini files. Now I think the idea is that if you want to have a [Service] directive within your unit file, then you would append .service to the end of your unit file name.
  2. A target is a grouping mechanism that allows systemd to start up groups of processes at the same time. This happens at every boot as processes are started at different run levels.

Now in Debian there are two places that systemd looks for unit files… In order from lowest to highest precedence, they are as follows:

  1. /lib/systemd/system/ (prefix with /usr dir for archlinux) unit files provided by installed packages. Have a look in here for many existing examples of unit files.
  2. /etc/systemd/system/ unit files created by the system administrator

As mentioned above, systemd should be the first process started on your Linux server. systemd reads the different targets and runs the scripts within the specific target’s “target.wants” directory (which just contains a collection of symbolic links to the unit files). For example the target file we’ll be working with is the multi-user.target file (actually we don’t touch it, systemctl does that for us (as per the magic commands mentioned above)). Just as systemd has two locations in which it looks for unit files. I think this is probably the same for the target files, although there wasn’t any target files in the system administrator defined unit location but there were some target.wants files there.

systemd Monit Unit file

I found a template that Monit had already provided for a unit file in /usr/share/doc/monit/examples/monit.service. There’s also one for Upstart. Copy that to where the system administrator unit files should go and make the change so that systemd restarts Monit if it dies for what ever reason. Check the Restart= options on the systemd.service man page. The following is what my initial unit file looked like:

[Unit]
Description=Pro-active monitoring utility for unix systems
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/monit -I -c /etc/monit/monitrc
ExecStop=/usr/bin/monit -c /etc/monit/monitrc quit
ExecReload=/usr/bin/monit -c /etc/monit/monitrc reload
Restart=always

[Install]
WantedBy=multi-user.target

Now, some explanation. Most of this is pretty obvious. The After= directive just tells systemd to make sure the network.target file has been acted on first and of course network.target has After=network-pre.target which doesn’t have a lot in it. I’m not going to go into this now, as I don’t really care too much about it. It works. It means the network interfaces have to be up first. If you want to know how, why, check this documentation. Type=simple. Again check the systemd.service man page.
Now to have systemd control Monit, Monit must not run as a background process (the default). To do this, we can either add the set init statement to Monit’s control file or add the -I option when running systemd, which is exactly what we’ve done above. The WantedBy= is the target that this specific unit is part of.

Now we need to tell systemd to create the symlinks in multi-user.target.wants directory and other things. See the man page for more details about what enable actually does if you want them. You’ll also need to start the unit.

Now what I like to do here is:

systemctl status /etc/systemd/system/monit.service

Then compare this output once we enable the service:

● monit.service - Pro-active monitoring utility for unix systems
   Loaded: loaded (/etc/systemd/system/monit.service; disabled)
   Active: inactive (dead)
sudo systemctl enable /etc/systemd/system/monit.service
# systemd now knows about monit.service
systemctl status /etc/systemd/system/monit.service

Outputs:

● monit.service - Pro-active monitoring utility for unix systems
   Loaded: loaded (/etc/systemd/system/monit.service; enabled)
   Active: inactive (dead)

Now start the service:

sudo systemctl start monit.service # there's a stop and restart also.

Now you can check the status of your Monit service again. This shows terse runtime information about the units or PID you specify (monit.service in our case).

sudo systemctl status monit.service

By default this function will show you 10 lines of output. The number of lines can be controlled with the --lines= option

sudo systemctl --lines=20 status monit.service

Now try killing the Monit process. At the same time, you can watch the output of Monit in another terminal. tmux or screen is helpful for this:

sudo tail -f /var/log/monit.log
sudo kill -SIGTERM $(pidof monit)
# SIGTERM is a safe kill and is the default, so you don't actually need to specify it. Be patient, this may take a minute or two for the Monit process to terminate.

Or you can emulate a nastier termination with SIGKILL or even SEGV (which may kill monit faster).

Now when you run another status command you should see the PID has changed. This is because systemd has restarted Monit.

When you need to make modifications to the unit file, you’ll need to run the following command after save:

sudo systemctl daemon-reload

When you need to make modifications to the running services configuration file
/etc/monit/monitrc for example, you’ll need to run the following command after save:

sudo systemctl reload monit.service
# because systemd is now in control of Monit, rather than the before mentioned: sudo monit reload

 

Keep NodeJS Application Alive

Right, we know systemd is always going to be running. So lets use it to take care of the coarse grained service control. That is keeping your NodeJS application service alive.

Using systemd

systemd my-web-app.service Unit file

You’ll need to know where your NodeJS binary is. The following will provide the path:

which NodeJS

Now create a systemd unit file my-nodejs-app.service

[Unit]
Description=My amazing NodeJS application
After=network.target

[Service]
# systemctl start my-nodejs-app # to start the NodeJS script
ExecStart=[where nodejs binary lives] [where your app.js/index.js lives]
# systemctl stop my-nodejs-app # to stop the NodeJS script
# SIGTERM (15) - Termination signal. This is the default and safest way to kill process.
# SIGKILL (9) - Kill signal. Use SIGKILL as a last resort to kill process. This will not save data or cleaning kill the process.
ExecStop=/bin/kill -SIGTERM $MAINPID
# systemctl reload my-nodejs-app # to perform a zero-downtime restart.
# SIGHUP (1) - Hangup detected on controlling terminal or death of controlling process. Use SIGHUP to reload configuration files and open/close log files.
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=my-nodejs-app
User=my-nodejs-app
Group=my-nodejs-app # Not really needed unless it's different, as the default group of the user is chosen without this option. Self documenting though, so I like to have it present.
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target

Add the system user and group so systemd can actually run your service as the user you’ve specified.

sudo groupadd --system my-nodejs-app # this is not needed if you adduser like below...
getent group # to verify which groups exist.
sudo adduser --system --no-create-home --group my-nodejs-app # This will create a system group with the same name and ID of the user.
groups my-nodejs-app # to verify which groups the new user is in.

Now as we did above, go through the same procedure enable‘ing, start‘ing and verifying your new service.

Make sure you have your directory permissions set-up correctly and you should have a running NodeJS application that when it dies will be restarted automatically by systemd.

Don’t forget to backup all your new files and changes in case something happens to your server.

We’re done with systemd for now. Following are some useful resources I’ve used:

 

Using Monit

Now just configure your Monit control file. You can spend a lot of time here tweaking a lot more than just your NodeJS application. There are loads of examples around and the control file itself has lots of commented out examples also. You’ll find the following the most helpful:

There are a few things that had me stuck for a bit. By default Monit only sends alerts on change, not on every cycle if the condition stays the same, unless when you set-up your

set alert your-ame@your.domain

Append receive all alerts, so that it looks like this:

set alert your-ame@your.domain receive all alerts

There’s quite a few things you just work out as you go. The main part I used to health-check my NodeJS app was:

check host myhost with address 1.2.3.4
   start program = "/bin/systemctl start my-nodejs-app.service"
   stop program = "/bin/systemctl stop my-nodejs-app.service"
   if failed ping then alert
   if failed
      port 80 and
      protocol http and
      status = 200 # The default without status is failure if status code >= 400
      request /testdir with content = "some text on my web page" and
         then restart
   if 5 restarts within 5 cycles then alert

I carry on and check things like:

  1. cpu and memory usage
  2. load averages
  3. File system space on all the mount points
  4. Check SSH that it hasn’t been restarted by anything other than Monit (potentially swapping the binary or it’s config). Of course if an attacker kills Monit, systemd immediately restarts it and we get Monit alert(s). We also get real-time logging hopefully to an off-site syslog server. Ideally your off-site syslog server also has alerts set-up on particular log events. On top of that you should also have inactivity alerts set-up so that if your log files are not generating events that you expect, then you also receive alerts. Services like Dead Man’s Snitch or packages like Simple Event Correlator with Cron are good for this. On top of all that, if you have a file integrity checker that resides on another system that your host reveals no details of and you’ve got it configured to check all the right file check-sums, dates, permissions, etc, you’re removing a lot of low hanging fruit for someone wanting to compromise your system.
  5. Directory permissions, uid, gid and checksums. Of course you’re also going to have to make sure the tools that Monit uses to do these checks haven’t been modified.

 

If you find anything I haven’t explained clearly, or you need a hand with any of this just leave a comment. Cheers.

Installation and Hardening of Debian Web Server

December 27, 2014

These are the steps I took to set-up and harden a Debian web server before being placed into a DMZ and undergoing additional hardening before opening the port from the WWW to it. Most of the steps below are fairly simple to do, and in doing so, remove a good portion of the low hanging fruit for nasty entities wanting to gain a foot-hold on your server->network.

Install and Set-up

Debian wheezy, currently stable (supported by the Debian security team for a year or so).

Creating ESXi 5.1 guest

First thing to do is to setup a virtual switch for the host under the Configuration tab. Now I had several quad port Gbit Ethernet adapters in this server. So I created a virtual switch and assigned a physical adapter to it. Now when you create your VM, you choose the VM Network assigned to the virtual switch you created. Provision your disks. Check the “Edit the virtual machine settings before completion” and Continue. You will now be able to modify your settings before you boot the VM. I chose 512MB of RAM at this stage which is far more than it actually needs. While I’m provisioning and hardening the Debian guest, I have the new virtual switch connected to the clients LAN.

ESX Network Configuration

Once we’re done, we can connect the virtual switch up to the new DMZ physical switch or strait into the router. Upload the debian .iso that you downloaded to the ESXi datastore. Then edit the VM settings and select the CD/DVD drive. Select the “Datastore ISO File” option and browse to the .iso file and select the “Connect at power on” option.

6_NewVMSelectIso

Kick the VM in the guts and flick to the VM’s Console tab.

OS Installation

Partitioning

Deleted all the current partitions and added the following. / was added to the start and the rest to the end, in the following order.
/, /var, /tmp, /opt, /usr, /home, swap.

Partitioning Disks

Now the sizes should be setup according to your needs. If you have plenty of RAM, make your swap small, if you have minimal RAM (barely (if) sufficient), you could double the RAM size for your swap. It’s usually a good idea to think about what mount options you want to use for your specific directories. This may shape how you setup your partitions. For example, you may want to have options nosuid,noexec on /var but you can’t because there are shell scripts in /var/lib/dpkg/info so you could setup four partitions. /var without nosuid,noexec and /var/tmp, /var/log, /var/account with nosuid,noexec. Look ahead to the Mounting of Partitions section for more info on this.
In saying this, you don’t need to partition as finely grained as you want options for. You can still mount directories on directories and alter the options at that point. This can be done in the /etc/fstab file and also ad-hoc (using the mount command) if you want to test options out.

You can think about changing /opt (static data) to mount read-only in the future as another security measure.

Continuing with the Install

When you’re asked for a mirror to pull packages from, if you have an apt-cacher[-ng] proxy somewhere on your network, this is the chance to make it work for you thus speeding up your updates and saving internet bandwidth. Enter the IP address and port and leave the rest as default. From the Software selection screen, select “Standard system utilities” and “SSH server”.

10_SoftwareSelection

When prompted to boot into your new system, we need to remove our installation media from the VMs settings. Under the Device Status settings for your VM (if you’re using ESXi), Uncheck “Connected” and “Connect at power on”. Make sure no other boot media are connected at power on. Now first thing we do is SSH into our new VM because it’s a right pain working through the VM hosts console. When you first try to SSH to it you’ll be shown the ECDSA key fingerprint to confirm that the machine you think you are SSHing to is in fact the machine you want to SSH to. Follow the directions here but change that command line slightly to the following:

ssh-keygen -lf ssh_host_ecdsa_key.pub

This will print the keys fingerprint from the actual machine. Compare that with what you were given from your remote machine. Make sure they match and accept and you should be in. Now I use terminator so I have a lovely CLI experience. Of course you can take things much further with Screen or Tmux if/when you have the need.

Next I tell apt about the apt-proxy-ng I want it to use to pull it’s packages from. This will have to be changed once the server is plugged into the DMZ. Create the file /etc/apt/apt.conf if it doesn’t already exist and add the following line:

Acquire::http::Proxy "http://[IP address of the machine hosting your apt cache]:[port that the cacher is listening on]";

Replace the apt proxy references in /etc/apt/sources.list with the internet mirror you want to use, so we contain all the proxy related config in one line in one file. This will allow the requests to be proxied and packages cached via the apt cache on your network when requests are made to the mirror of your choosing.

Update the list of packages then upgrade them with the following command line. If your using sudo, you’ll need to add that to each command:

apt-get update && apt-get upgrade # only run apt-get upgrade if apt-get update is successful (exits with a status of 0)


The steps you take to harden a server that will have many user accounts will be considerably different to this. Many of the steps I’ve gone through here will be insufficient for a server with many users.
The hardening process is not a one time procedure. It ends when you decommission the server. Be prepared to stay on top of your defenses. It’s much harder to defend against attacks than it is to exploit a vulnerability.

Passwords

After a quick look at this, I can in fact verify that we are shadowing our passwords out of the box. It may be worth looking at and modifying /etc/shadow . Consider changing the “maximum password age” and “password warning period”. Consult the man page for shadow for full details. Check that you’re happy with which encryption algorithms are currently being used. The files you’ll need to look at are: /etc/shadow and /etc/pam.d/common-password . The man pages you’ll probably need to read in conjunction with each other are the following:

  • shadow
  • pam.d
  • crypt 3
  • pam_unix

Out of the box crypt supports MD5, SHA-256, SHA-512 with a bit more work for blowfish via bcrypt. The default of SHA-512 enables salted passwords. How can you tell which algorithm you’re using, salt size etc? the crypt 3 man page explains it all.
So by default we’re using SHA-512 which is better than MD5 and the smaller SHA-256.

Now by default I didn’t have a “rounds” option in my /etc/pan.d/common-password module-arguments. Having a large iteration count (number of times the encryption algorithm is run (key stretching)) and an attacker not knowing what that number is, will slow down an attack. I’d suggest adding this and re creating your passwords. As your normal user run:

passwd

providing your existing password then your new one twice. You should now be able to see your password in the /etc/shadow file with the added rounds parameter

$6$rounds=[chosen number of rounds specified in /etc/pam.d/common-password]$[8 character salt]$0LxBZfnuDue7.n5<rest of string>

Check /var/log/auth.log
Reboot and check you can still log in as your normal user. If all good. Do the same with the root account.

Using bcrypt with slowpoke blowfish is a much slower algorithm, so it’s even better for password encryption, but more work to setup at this stage.

Some References

Consider setting a password for GRUB, especially if your server is directly on physical hardware. If it’s on a hypervisor, an attacker has another layer to go through before they can access the guests boot screen. If an attacker can access your VM through the hypervisors management app, you’re pretty well screwed anyway.

Disable Remote Root Logins

Review /etc/pam.d/login so we’re only permitting local root logins. By default this was setup that way.
Review /etc/security/access.conf . Make sure root logins are limited as much as possible. Un-comment rules that you want. I didn’t need to touch this.
Confirm which virtual consoles and text terminal devices you have by reviewing /etc/inittab then modify /etc/securetty by commenting out all the consoles you don’t need (all of them preferably). Or better just issue the following command to fill the file with nothing:

cat /dev/null > /etc/securetty

I back up this file before I do this.
Now test that you can’t log into any of the text terminals listed in /etc/inittab . Just try logging into the likes of your ESX/i vSphere guests console as root. You shouldn’t be able to now.

Make sure if your server is not physical hardware but a VM, then the hosts password is long and made up of a random mix of upper case, lower case, numbers and special characters.

Additional Resources

http://www.debian.org/doc/manuals/securing-debian-howto/ch4.en.html#s-restrict-console-login

SSH

My feeling after a lot of reading is that currently RSA with large keys (The default RSA size is 2048 bits) is a good option for key pair authentication. Personally I like to go for 4096, but with the current growth of processing power (following Moore’s law), 2048 should be good until about 2030. Update: I’m not so sure about the 2030 date for this now.

Create your key pair if you haven’t already and setup key pair authentication. Key-pair auth is more secure and allows you to log in without a password. Your pass-phrase should be stored in your keyring. You’ll just need to provide your local password once (each time you log into your local machine) when the keyring prompts for it. Of course your pass-phrase needs to be kept secret. If it’s compromised, it won’t matter how much you’ve invested into your hardening effort. To tighten security up considerably Make the necessary changes to your servers /etc/ssh/sshd_config file. Start with the changes I’ve listed here.
When you change things like setting up AllowUsers or any other potential changes that could lock you out of the server. It’s a good idea to be logged in via one shell when you exit another and test it. This way if you have locked yourself out, you’ll still be logged in on one shell to adjust the changes you’ve made. Unless you have a need for multiple users, lock it down to a single user. You can even lock it down to a single user from a specific host.
After a set of changes, issue the following restart command as root or sudo:

service ssh restart

You can check the status of the daemon with the following command:

service ssh status

Consider changing the port that SSH listens on. May slow down an attacker slightly. Consider whether it’s worth adding the extra characters to your SSH command. Consider keeping the port that sshd binds to below 1025 where only root can bind a process to.

We’ll need to tunnel SSH once the server is placed into the DMZ. I’ve discussed that in this post.

Additional Resources

Check SSH login attempts. As root or via sudo, type the following to see all failed login attempts:

cat /var/log/auth.log | grep 'sshd.*Invalid'

If you want to see successful logins, type the following:

cat /var/log/auth.log | grep 'sshd.*opened'

Consider installing and configuring denyhosts

Disable Boot Options

All the major hypervisors should provide a way to disable all boot options other than the device you will be booting from. VMware allows you to do this in vSphere Client.

Set BIOS passwords.

Lock Down the Mounting of Partitions

Getting started with your fstab.

Make a backup of your /etc/fstab before you make changes. I ended up needing this later. Read the man page for fstab and also the options section in the mount man page. The Linux File System Hierarchy (FSH) documentation is worth consulting also for directory usages.
Add the noexec mount option to /tmp but not /var because executable shell scripts such as pre, post and removal reside within /var/lib/dpkg/info .
You can also add the nodev nosuid options.
You can add the nodev option to /var, /usr, /opt, /home also.
You can also add the nosuid option to /home .
You can add ro to /usr

To add mount options nosuid,noexec to /var/tmp, /var/log, /var/account, we need to bind the target mount onto an existing directory. The following procedure details how to do this for /var/tmp. As usual, you can do all of this without a reboot. This way you can modify until your hearts content, then be confident that a reboot will not destroy anything or lock you out of your system.
Your /etc/fstab unmounted mounts can be tested like this

sudo mount -a

Then check the difference with

mount

mount options can be set up on a directory by directory basis for finer grained control. For example my /var mount in my /etc/fstab may look like this:

UUID=<block device ID goes here> /var ext4 defaults,nodev 0 2

Then add another line below that in your /etc/fstab that looks like this:

/var /var/tmp none nosuid,noexec,bind 0 2

The file system type above should be specified as none (as stated in the “The bind mounts” section of the mount man page http://man.he.net/man8/mount). The bind option binds the mount. There was a bug with the suidperl package in debian where setting nosuid created an insecurity. suidperl is no longer available in debian.

If you want this to take affect before a reboot, execute the following command:

sudo mount --bind /var/tmp /var/tmp

Then to pickup the new options from /etc/fstab:

sudo mount -o remount /var/tmp

For further details consult the remount option of the mount man page.

At any point you can check the options that you have your directories mounted as, by issuing the following command:

mount

You can test this by putting a script in /var and copying it to /var/tmp. Then try running each of them. Of course the executable bits should be on. You should only be able to run the one that is in the directory mounted without the noexec option. My file “kimsTest” looks like this:

#!/bin/sh
echo "Testing testing testing kim"

Then I…

myuser@myserver:/var$ ./kimsTest
Testing testing testing kim
myuser@myserver:/var$ ./tmp/kimsTest
-bash: ./tmp/kimsTest: Permission denied

You can set the same options on the other /var sub-directories (not /var/lib/dpkg/info).

Enable read-only / mount

There are some contradictions on /run/shm size allocation. Increase the size vs Don’t increase the size

Additional Resources

Work Around for Apt Executing Packages from /tmp

Disable Services we Don’t Need

RPC portmapper

dpkg-query -l '*portmap*'

portmap is not installed by default, so we don’t need to remove it.

Exim

dpkg-query -l '*exim*'

Exim4 is installed.
You can see from the netstat output below (in the “Remove Services” area) that exim4 is listening on localhost and it’s not publicly accessible. Nmap confirms this, but we don’t need it, so lets disable it. We should probably be using ss too.

When a run level is entered, init executes the target files that start with k with a single argument of stop, followed with the files that start with s with a single argument of start. So by renaming /etc/rc2.d/s15exim4 to /etc/rc2.d/k15exim4 you’re causing init to run the service with the stop argument when it moves to run level 2. Just out of interest sake, the scripts at the end of the links with the lower numbers are executed before scripts at the end of links with the higher two digit numbers. Now go ahead and check the directories for run levels 3-5 as well and do the same. You’ll notice that all the links in /etc/rc0.d (which are the links executed on system halt) start with ‘K’. Making sense?

Follow up with

sudo netstat -tlpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0: 0.0.0.0:* LISTEN 1910/sshd
tcp6 0 0 ::: :::* LISTEN 1910/sshd

And that’s all we should see.

Additional resources for the above

Disable Network Information Service (NIS). NIS lets several machines in a network share the same account information, such as the password file (Allows password sharing between machines). Originally known as Yellow Pages (YP). If you needed centralised authentication for multiple machines, you could set-up an LDAP server and configure PAM on your machines in order to contact the LDAP server for user authentication. We have no need for distributed authentication on our web server at this stage.

dpkg-query -l '*nis*'

Nis is not installed by default, so we don’t need to remove it.

Additional resources for the above

Remove Services

First thing I did here was run nmap from my laptop

nmap -p 0-65535 <serverImConfiguring>
PORT STATE SERVICE
23/tcp filtered telnet
111/tcp open rpcbind
/tcp open

Now because I’m using a non default port for SSH, nmap thinks some other service is listening. Although I’m sure if I was a bad guy and really wanted to find out what was listening on that port it’d be fairly straight forward.

To obtain a list of currently running servers (determined by LISTEN) on our web server. Not forgetting that man is your friend.

sudo netstat -tap | grep LISTEN

or

sudo netstat -tlp

I also like to add the ‘n’ option to see the ports. This output was created before I had disabled exim4 as detailed above.

tcp 0 0 *:sunrpc *:* LISTEN 1498/rpcbind
tcp 0 0 localhost:smtp *:* LISTEN 2311/exim4
tcp 0 0 *:57243 *.* LISTEN 1529/rpc.statd
tcp 0 0 *: *:* LISTEN 2247/sshd
tcp6 0 0 [::]:sunrpc [::]:* LISTEN 1498/rpcbind
tcp6 0 0 localhost:smtp [::]:* LISTEN 2311/exim4
tcp6 0 0 [::]:53309 [::]:* LISTEN 1529/rpc.statd
tcp6 0 0 [::]: [::]:* LISTEN 2247/sshd

Rpcbind

Here we see that sunrpc is listening on a port and was started by rpcbind with the PID of 1498.
Now Sun Remote Procedure Call is running on port 111 (also the portmapper port) netstat can tell you the port, confirmed with the nmap scan above. This is used by NFS and as we don’t need NFS as our server isn’t a file server, we can get rid of the rpcbind package.

dpkg-query -l '*rpc*'

Shows us that rpcbind is installed and gives us other details. Now if you’ve been following along with me and have made the /usr mount read only, some stuff will be left behind when we try to purge:

sudo apt-get purge rpcbind

Following are the outputs of interest:

The following packages will be REMOVED:
nfs-common* rpcbind*
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
Do you want to continue [Y/n]? y
Removing nfs-common ...
[ ok ] Stopping NFS common utilities: idmapd statd.
dpkg: error processing nfs-common (--purge):
cannot remove `/usr/share/man/man8/rpc.idmapd.8.gz': Read-only file system
Removing rpcbind ...
[ ok ] Stopping rpcbind daemon....
dpkg: error processing rpcbind (--purge):
cannot remove `/usr/share/doc/rpcbind/changelog.gz': Read-only file system
Errors were encountered while processing:
nfs-common
rpcbind
E: Sub-process /usr/bin/dpkg returned an error code (1)

Another

dpkg-query -l '*rpc*'

Will result in pH. That’s a desired action of (p)urge and a package status of (H)alf-installed.
Now the easiest thing to do here is rename your /etc/fstab to something else and rename the /etc/fstab you backed up before making changes to it back to /etc/fstab then because you know the fstab is good,

reboot

Then try the purge, dpkg-query and netstat commands again to make sure rpcbind is gone and of course no longer listening. I had to actually do the purge twice here as config files were left behind from the fist purge.

Also you can remove unused dependencies now after you get the following message:

The following packages were automatically installed and are no longer required:
libevent-2.0-5 libgssglue1 libnfsidmap2 libtirpc1
Use 'apt-get autoremove' to remove them.
The following packages will be REMOVED:
rpcbind*

sudo apt-get -s autoremove

Because I want to simulate what’s going to be removed because I”m paranoid and have made stupid mistakes with autoremove years ago and that pain has stuck with me. I autoremoved a meta-package which depended on many other packages. A subsequent autoremove for packages that had a sole dependency on the meta-package meant they would be removed. Yes it was a painful experience. /var/log/apt/history.log has your recent apt history. I used this to piece back together my system.

Then follow up with the real thing… Just remove the -s and run it again. Just remember, the less packages your system has the less code there is for an attacker to exploit.

Telnet

telnet installed:

dpkg-query -l '*telnet*'
sudo apt-get remove telnet

telnet gone:

dpkg-query -l '*telnet*'

Ftp

We’ve got scp, why would we want ftp?
ftp installed:

dpkg-query -l '*ftp*'
sudo apt-get remove ftp

ftp gone:

dpkg-query -l '*ftp*'

Don’t forget to swap your new fstab back and test that the mounts are mounted as you expect.

Secure Services

The following provide good guidance on securing what ever is left.

Scheduled Backups

Make sure all data and VM images are backed up routinely. Make sure you test that restoring your backups work. Backup system files and what ever else is important to you. There is a good selection of tools here to help. Also make sure you are backing up the entire VM if your machine is a virtual guest by export / import OVF files. I also like to backup all the VM files. Disk space is cheap. Is there such a thing as being too prepared for disaster? It’s just a matter of time before you’ll be calling on your backups.

Keep up to date

Consider whether it would make sense for you or your admin/s to set-up automatic updates and possibly upgrades. Start out the way you intend to go. Work out your strategy for keeping your system up to date and patched. There are many options here.

Logging, Alerting and Monitoring

From here on, I’ve made it less detailed and more about just getting you to think about things and ways in which you can improve your stance on security. Also if any of the offerings cost money to buy, I make note of it because this is the exception to my rule. Why? Because I prefer free software and especially when it’s Open Source FOSS.

Some of the following cross the “logging” boundaries, so in many cases it’s difficult to put them into categorical boxes.

Attackers like to try and cover their tracks by modifying information that’s distributed to the various log files. Make sure you know who has write access to these files and keep the list small. As a Sysadmin you need to read your log files often and familiarise yourself with them so you get used to what they should look like.

SWatch

Monitors “a” log file for each instance you run (or schedule), matches your defined patterns and acts. You can define different message types with different font styles. If you want to monitor a lot of log files, it’s going to be a bit messy.

Logcheck

Monitors system log files, emails anomalies to an administrator. Once installed it needs to be set-up to run periodically with cron. Not a bad we run down here. How to use and customise it. Man page and more docs here.

NewRelic

Is more of a performance monitoring tool than a security tool. It has free plans which are OK, It comes into it’s own in larger deployments. I’ve used this and it’s been useful for working out what was causing performance issues on the servers.

Advanced Web Statistics (AWStats)

Unlike NewRelic which is a Software as a Service (SaaS), AWStats is FOSS. It kind of fits a similar market space as NewRelic though, but also has Host Intrusion Prevention System (HIPS) features. Docs here.

Pingdom

Similar to NewRelic but not as feature rich. Update: Recently stumbled into Monit which is a better alternative. Free and open source. I’ve been writing about it here.

Multitail

Does what its name sounds like. Tails multiple log files at once. Provides realtime multi log file monitoring. Example here. Great for seeing strange happenings before an intruder has time to modify logs, if your watching them that is. Good for a single system if you’ve got a spare screen to throw on the wall.

PaperTrail

Targets a similar problem to MultiTail except that it collects logs from as many servers as you want and copies them off-site to PaperTrails service and aggregates them into a single easily searchable web interface. Allows you to set-up alerts on anything. Has a free plan, but you only get 100MB per month. The plans are reasonably cheap for the features it provides and can scale as you grow. I’ve used this and have found it to be excellent.

Logwatch

Monitors system logs. Not continuously, so they could be open to modification without you knowing, like SWatch and Logcheck from above. You can configure it to reduce the number of services that it analyses the logs of. It creates a report of what it finds based on your level of paranoia. It’s easy to set-up and get started though. Source and docs here.

Logrotate

Use logrotate to make sure your logs will be around long enough to examine them. Some usage examples here. Ships with Debian. It’s just a matter of applying any extra config.

Logstash

Targets a similar problem to logrotate, but goes a lot further in that it routes and has the ability to translate between protocols. Requires Java to be installed.

Fail2ban

Ban hosts that cause multiple authentication errors. or just email events. Of course you need to think about false positives here too. An attacker can spoof many IP addresses potentially causing them all to be banned, thus creating a DoS.

Rsyslog

Configure syslog to send copy of the most important data to a secure system. Mitigation for an attacker modifying the logs. See @ option in syslog.conf man page. Check the /etc/(r)syslog.conf file to determine where syslogd is logging various messages. Some important notes around syslog here, like locking down the users that can read and write to /var/log.

syslog-ng

Provides a lot more flexibility than just syslogd. Checkout the comprehensive feature-set.

Some Useful Commands

  • Checking who is currently logged in to your server and what they are doing with the who and w commands
  • Checking who has recently logged into your server with the last command
  • Checking which user has failed login attempts with the faillog command
  • Checking the most recent login of all users, or of a given user with the lastlog command. lastlog comes from the binary file /var/log/lastlog.

This, is a list of log files and their names/locations and purpose in life.

Host-based Intrusion Detection System (HIDS)

Tripwire

Is a HIDS that stores a good know state of vital system files of your choosing and can be set-up to notify an administrator upon change in the files. Tripwire stores cryptographic hashes (delta’s) in a database and compares them with the files it’s been configured to monitor changes on. Not a bad tutorial here. Most of what you’ll find with tripwire now are the commercial offerings.

RkHunter

A similar offering to Tripwire. It scans for rootkits, backdoors, checks on the network interfaces and local exploits by running tests such as:

  • MD5 hash changes
  • Files commonly created by root-kits
  • Wrong file permissions for binaries
  • Suspicious strings in kernel modules
  • Hidden files in system directories
  • Optionally scan within plain-text and binary files

Version 1.4.2 (24/02/2014) now checks ssh, sshd and telent (although you shouldn’t have telnet installed). This could be useful for mitigating non-root users running a modified sshd on a 1025-65535 port. You can run ad-hoc scans, then set them up to be run with cron. Debian Jessie has this release in it’s repository. Any Debian distro before Jessie is on 1.4.0-1 or earlier.

The latest version you can install for Linux Mint Qiana (17) and Rebecca (17.1) within the repositories is 1.4.0-3 (01/05/2012)

Change-log here.

Chkrootkit

It’s a good idea to run a couple of these types of scanners. Hopefully what one misses the other will not. Chkrootkit scans for many system programs, some of which are cron, crontab, date, echo, find, grep, su, ifconfig, init, login, ls, netstat, sshd, top and many more. All the usual targets for attackers to modify. You can specify if you don’t want them all scanned. Runs tests such as:

  • System binaries for rootkit modification
  • If the network interface is in promiscuous mode
  • lastlog deletions
  • wtmp and utmp deletions (logins, logouts)
  • Signs of LKM trojans
  • Quick and dirty strings replacement

Stealth

The idea of Stealth is to do a similar job as the above file integrity scanners, but to leave almost no sediments on the tested computer (called the client). A potential attacker therefore has no clue that Stealth is in fact scanning the integrity of its client files. Stealth is installed on a different machine (called the controller) and scans over SSH.

Ossec

Is a HIDS that also has some preventative features. This is a pretty comprehensive offering with a lot of great features.

Unhide

While not strictly a HIDS, this is quite a useful forensics tool for working with your system if you suspect it may have been compromised.

Unhide is a forensic tool to find hidden processes and TCP/UDP ports by rootkits / LKMs or by another hidden technique. Unhide runs in Unix/Linux and Windows Systems. It implements six main techniques.

  1. Compare /proc vs /bin/ps output
  2. Compare info gathered from /bin/ps with info gathered by walking thru the procfs. ONLY for unhide-linux version
  3. Compare info gathered from /bin/ps with info gathered from syscalls (syscall scanning)
  4. Full PIDs space ocupation (PIDs bruteforcing). ONLY for unhide-linux version
  5. Compare /bin/ps output vs /proc, procfs walking and syscall. ONLY for unhide-linux version. Reverse search, verify that all thread seen by ps are also seen in the kernel.
  6. Quick compare /proc, procfs walking and syscall vs /bin/ps output. ONLY for unhide-linux version. It’s about 20 times faster than tests 1+2+3 but maybe give more false positives.

It includes two utilities: unhide and unhide-tcp.

unhide-tcp identifies TCP/UDP ports that are listening but are not listed in /bin/netstat through brute forcing of all TCP/UDP ports available.

Can also be used by rkhunter in it’s daily scans. Unhide was number one in the top 10 toolswatch.org security tools pole

Web Application Firewalls (WAF’s)

which are just another part in the defense in depth model for web applications, get more specific in what they are trying to protect. They operate at the application layer, so they don’t have to deal with all the network traffic. They apply a set of rules to HTTP conversations. They can also be either Network or Host based and able to block attacks such as Cross Site Scripting (XSS), SQL injection.

ModSecurity

Is a mature and feature full WAF that is designed to work with such web servers as IIS, Apache2 and NGINX. Loads of documentation. They also look to be open to committers and challengers a-like. You can find the OWASP Core Rule Set (CRS) here to get you started which has the following:

  • HTTP Protocol Protection
  • Real-time Blacklist Lookups
  • HTTP Denial of Service Protections
  • Generic Web Attack Protection
  • Error Detection and Hiding

Or for about $500US a year you get the following rules:

  • Virtual Patching
  • IP Reputation
  • Web-based Malware Detection
  • Webshell/Backdoor Detection
  • Botnet Attack Detection
  • HTTP Denial of Service (DoS) Attack Detection
  • Anti-Virus Scanning of File Attachments

Fusker

for Node.js. Although doesn’t look like a lot is happening with this project currently. You could always fork it if you wanted to extend.

The state of the Node.js echosystem in terms of security is pretty poor, which is something I’d like to invest time into.

Fire-walling

This is one of the last things you should look at when hardening an internet facing or parameterless system. Why? Because each machine should be hard enough that it doesn’t need a firewall to cover it like a blanket with services underneath being soft and vulnerable. Rather all the services should be either un-exposed or patched and securely configured.

Most of the servers and workstations I’ve been responsible for over the last few years I’ve administered as though there was no firewall and they were open to the internet. Most networks are reasonably easy to penetrate, so we really need to think of the machines behind them as being open to the internet. This is what De-perimeterisation (the concept initialised by the Jericho Forum) is all about.

Some thoughts on firewall logging.

Keep your eye on nftables too, it’s looking good!

Additional Resources

Just keep in mind the above links are quite old. A lot of it’s still relevant though.

Machine Now Ready for DMZ

Confirm DMZ has

  • Network Intrusion Detection System (NIDS), Network Intrusion Prevention System (NIPS) installed and configured. Snort is a pretty good option for the IDS part, although with some work Snort can help with the Prevention also.
  • incoming access from your LAN or where ever you plan on administering it from
  • rules for outgoing and incoming access to/from LAN, WAN tightly filtered.

Additional Web Server Preparation

  • setup and configure soft web server
  • setup and configure caching proxy. Ex:
    • node-http-proxy
    • TinyProxy
    • Varnish
    • nginx
  • deploy application files
  • Hopefully you’ve been baking security into your web app right from the start. This is an essential part of defense in depth. Rather than having your application completely rely on other entities to protect it, it should also be standing up for itself and understanding when it’s under attack and actually fighting back.
  • set static IP address
  • double check that the only open ports on the web server are 80 and what ever you’ve chosen for SSH.
  • setup SSH tunnel
  • decide on and document VM backup strategy and set it up.

Machine Now In DMZ

Setup your CNAME or what ever type of DNS record you’re using.

Now remember, keeping any machine on (not just the internet, but any) a network requires constant consideration and effort in keeping the system as secure as possible.

Work through using the likes of harden and Lynis for your server and harden-surveillance for monitoring your network.

Consider combining “Port Scan Attack Detector” (psad) with fwsnort and Snort.

Hack your own server and find the holes before someone else does. If you’re not already familiar with the tricks of how systems on the internet get attacked read up on the “Attacks and Threats” Run OpenVAS, Run Web Vulnerability Scanners

From here on is in scope for other blog posts.

Journey To Self Hosting

November 29, 2014

I was recently tasked with working out the best options for hosting web applications and their data for a client. This was their foray into whether to throw all their stuff into the cloud or to build their own infrastructure to host everything on.

Hosting Options

There are a lot of options available now. Most of which are derivatives of either external cloud or internal (possibly cloud). All of which come with features and some price tags that need to be weighed up. I’ve been collecting resources of providers and their offerings (both cloud and in-house) for quite a while. So I didn’t have to go far to pull them together for comparison.

All sites and apps require a different amount of each resource type to be allocated to them. For example many web sites are still predominantly static, which require more network band-width than any other resource, some memory, a little processing power and provided they’re being cached on the server, not a lot else. These resources are very cheap.

If you’re running an e-commerce site, then you can potentially add more Disk I/O which is usually the first bottleneck, processing power and space for your data store. Add in redundancy, backups and administration of.
Fast disks (or lets just call it storage) are cheap. In fact most hardware is cheap.

Administration of redundancy, backups and staying on top of security starts to cost more. Although the “staying on top of security” will need to be done whether you’re on someone else’s hardware or on your own. It’s just that it’s a lot easier on your own because you’re in control and dictate the amount of visibility you have.

The Cloud

The Cloud

Pros

It’s out of your hands.
Indeed it is, in more ways than one. Your trust is going to have to be honoured here (or not). Yes you have SLA’s, but what guarantee do the SLA’s give you that the people working on your system and data are not having a bad day. Maybe they’ve broken up with their girlfriend, or what ever. It takes very little to miss something that could drastically compromise your system and or data.

VPS’s can be spun up quickly, but remember, good things take time. Everything has a cost. Things are quick and easy for a reason. There is a cost to this, think about what those (often hidden) costs are.

In some cases it can be cheaper, but you get what you pay for.

Cons

Your are trusting others with your data. Even others that you are not aware of. In many cases, hosting providers can be (and in many cases are) forced by governments and other agencies to give up your secrets. This is very common place now and you may not even know it’s happened.

Your provider may go out of business.

There is an inherent lack of security in all the cloud providers I’ve looked at and worked with. They will tell you they take security seriously, but when someone that understands security inspects how they do things, the situation often looks like Swiss cheese.

In-House Cloud

In-House Cloud

Pros

You are in control of your data and your application, providing you or “your” staff:

  • and/or external consultants are competent and haven’t made mistakes in setting up your infrastructure
  • Are patching all software/firmware involved
  • Are Fastidiously hardening your server/s (this is continuous. It doesn’t stop at the initial set-up)
  • Have set-up the routes and firewall rules correctly
  • Have the correct alerts set-up
  • Have implemented Intrusion Detection and Prevention Systems (IDS’s/IPS’s)
  • Have penetration tested the set-up and not just from a technical perspective. It’s often best to get pairs to do the reviews.

The list goes on. If you are at all in doubt, that’s where you consider the alternatives. In saying that, most hosting and cloud providers perform abysmally, despite their claims that your applications and data is safe with them.

It “can” cost less than entrusting your system and data to someone (or many someone’s) on the other side of the planet. Weigh up the costs. They will not always be what they appear at face value.

Hardware is very cheap.

Cons

Potential lack of in-house skills.

People with the right skills and attitudes are not cheap.

It may not be core business. You may not have the necessary capitols in-house to scope, architect, cost, set-up, administer. Potentially you could hire someone to do the initial work and the on going administration. The amount of on going administration will be partly determined by what your hosting. Generally speaking hosting company web sites, blogs etc, will require less work than systems with distributed components and redundancy.

Spinning up an instance to develop or prototype on, doesn’t have to be hard. In fact if you have some hardware, provisioning of VM images is usually quick and easy. There is actually a pro in this too… you decide how much security you want baked into these images and the processes taken to configure.

Consider download latencies from people you want to reach possibly in other countries.

In some cases it can be more expensive, but you get what you pay for.

Outcome

The decision for this client was made to self host. There will be a follow up post detailing some of the hardening process I took for one of their Debian web servers.