We have established a monthly maintenance window for rebooting our servers. Recently, we did a whole host of updates and needed to restart 11 Linux machines, running a mixture of CentOS and Ubuntu. Here are some notes on how to speed that process up. For purposes of this note, I’m assuming that you are running Ubuntu on your workstation.
First, of course, try to apply and test the updates outside of the maintenance window; most updates can be applied while the system is running and will cause no disruption to end users of your services. Usually, the only thing that requires a restart or a significant interruption in service is a kernel update.
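One way to tell whether a machine is actually waiting on a restart is to look for the flag file that Ubuntu creates when an installed package asks for a reboot. The sketch below assumes the usual Ubuntu location, /var/run/reboot-required; the function name is made up for this note, and CentOS does not use this file (there, the yum-utils tool needs-restarting is the usual place to look).

```shell
#!/bin/sh
# Hypothetical helper: succeed if this machine has flagged a pending reboot.
# Assumes the Ubuntu convention of a flag file at /var/run/reboot-required.
# The optional argument overrides the flag path, which is handy for testing.
reboot_pending() {
    flag="${1:-/var/run/reboot-required}"
    [ -f "$flag" ]
}
```

Because the function only sets an exit status, it can be used directly in a condition, for example: if reboot_pending; then echo "reboot required"; fi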
Next, make sure that you have key-based ssh logins working on your Linux workstation. This has been covered extensively all over the web; here’s a good short tutorial from the great folks at HowtoForge. By the way, if you are using the PuTTY ssh client on Windows, see this tutorial instead or this one.
At this point, you should be able to log in to every Linux server without entering a password. Of course, a password is still required to escalate privileges so that you can do something like reboot each server. There are solutions to this issue, but for now I still “sudo” and supply a password on each server.
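For reference, the key setup usually comes down to two commands: ssh-keygen to create a key pair and ssh-copy-id to install the public key on a server. The sketch below wraps them in a function; the function name, the ed25519 key type, and the key path are my own assumptions rather than anything from the tutorials mentioned above, so adjust them to whatever your servers accept.

```shell
#!/bin/sh
# Hypothetical helper: set up key-based login for one remote account.
# Usage: setup_key_login user@host
setup_key_login() {
    target="$1"
    # Assumed key location and type; older servers may need an RSA key instead.
    key="$HOME/.ssh/id_ed25519"
    # Create a key pair only if one does not exist yet.
    # ssh-keygen will prompt for a passphrase.
    [ -f "$key" ] || ssh-keygen -t ed25519 -f "$key"
    # Append the public key to the remote account's authorized_keys file.
    # This asks for the remote password one last time.
    ssh-copy-id -i "$key.pub" "$target"
}
```

You would run it once per server, for example: setup_key_login buffy@server1 (both names are placeholders).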
To further reduce the amount of typing that you have to do, create a file called “config” located within your home directory’s .ssh folder. The config file allows you to insert some shortcuts for commonly typed portions of the ssh command. Here is a paragraph from a sample config file:
Host firefly snapshot
In this case, assume that you are logged in as “jayne” on your local machine. Normally, you would need to type something like “ssh -p 9922 email@example.com” to connect to the remote host. With the above config file, you could type “ssh firefly” or “ssh snapshot” and accomplish the same goal. There are many more customization possibilities. More information is available with “man ssh_config” or in this note.
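To make the shape of such an entry concrete, here is a sketch that writes a complete, hypothetical one to a scratch file so you can look it over before merging it into your real config. Only the Host line and the port number come from the example above; the HostName and User values (and the function name) are placeholders that I made up.

```shell
#!/bin/sh
# Hypothetical helper: write a sample ssh client config entry to a scratch file.
# Usage: write_sample_ssh_config /path/to/scratch-file
write_sample_ssh_config() {
    out="$1"
    cat > "$out" <<'EOF'
# Both aliases on the Host line pick up the settings below.
# HostName and User are placeholders, not real values.
Host firefly snapshot
    HostName server.example.com
    User mal
    Port 9922
EOF
    # ssh is strict about config file permissions, so keep the file private.
    chmod 600 "$out"
}
```

Once the entry is in ~/.ssh/config, reasonably recent OpenSSH clients can show what an alias resolves to, without connecting, via “ssh -G firefly”.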
As part of the maintenance process, it is convenient to be able to execute the same command on multiple hosts; as above, you could use a sudossh script or, for commands that don’t require escalated privileges, you can use pssh or parallel-ssh. To install pssh on your Ubuntu workstation, issue this command:
sudo apt-get install pssh
This will give you access to “parallel-ssh”, among other things. It lets you check the uptime on a whole set of hosts like this:
parallel-ssh -h hosts.txt -l buffy -o /tmp/uptime uptime
This says to use the file called hosts.txt, which contains one host name per line, and to log in with the username “buffy”. Once logged in, it runs the command “uptime” on each remote host and saves the output in a set of files under the /tmp/uptime/ directory. Running this command gives you visual confirmation that your servers are up and leaves status information in the output directory. Obviously, this is limited somewhat by the inability to run commands that require escalated privileges, although there are ways around that, if you’re brave enough to fire off multiple escalated commands on a whole host of servers at one time.
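Reading a dozen small files by hand gets old quickly, so here is a sketch of a helper that prints everything in the output directory, one line per file. It assumes that parallel-ssh wrote one file per host into the directory, as described above; the function name is made up for this note.

```shell
#!/bin/sh
# Hypothetical helper: print the saved output of each host on one line.
# Usage: summarize_outputs /tmp/uptime
summarize_outputs() {
    dir="$1"
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        [ -f "$f" ] || continue
        # Label each line with the file name that parallel-ssh chose.
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

After the uptime run above, “summarize_outputs /tmp/uptime” gives a compact list that makes a machine which has not rebooted yet easy to spot.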
These notes are just a starting point. One of the nice problems that we now have, given easy access to virtual infrastructure, is that it takes almost no effort to spin up 10 or 20 or more servers. This requires us to develop new ways of managing those machines. Obviously, as the number of servers grows, we will begin to look at tools like Puppet.