- First things first
Probably one of the most frequent commands to be entered right after logging in is 'w'.
w not only tells you if there's somebody else logged in, but also the uptime and the load average.
To check the latest log-ins (say 10) we can do:
last -a -10
(Of course, to learn more about any of the commands mentioned here you can do man command or http://google.com/linux it).
- Processes
To get a quick idea of the services that a server is providing:
netstat -tlpnu
This will give us a list of the programs accepting connections (listening) and their ports.
To get a list of all connections (this gives us an idea of current traffic) we can add the 'a' option:
netstat -tapnu
To get a quick breakdown of the number of connections:
ss -s
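For a per-state breakdown (LISTEN, ESTABLISHED, TIME_WAIT and so on), the netstat output can be piped through a small filter. This is only a sketch: conn_states is a made-up helper name, and it assumes the state sits in the 6th column, as it does in the usual Linux netstat output.

```shell
# Count TCP connections per state from `netstat -tan` output on stdin.
# Assumption: the connection state is the 6th whitespace-separated column.
conn_states() {
    awk '$1 ~ /^tcp/ { count[$6]++ }
         END { for (s in count) print s, count[s] }' | sort
}
```

Usage would be something like: netstat -tan | conn_states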
If you have the network scanner nmap installed, then the list of open ports given above by netstat should coincide with the one by nmap:
nmap --open -p0-65535 localhost
It's very important from a security standpoint to expose publicly only the services/ports that we really need and stop all other services.
To get a list of all the running processes in a nice tree-view:
ps auxf |less
This gives us interesting information like the % memory each process is using and its status (under the STAT column).
Check for processes in the D state (uninterruptible sleep): such a process is waiting, usually on I/O, and that gives us a clue of a possible hard disk I/O bottleneck.
An alternative view of processes is given by the top utility, which also shows current CPU utilization. In 'top' the status column is 'S'.
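To pick out just the processes in the D state, a filter over the ps output is enough. The helper name d_state_procs is hypothetical, and the sketch assumes the column order pid, stat, comm requested in the usage line below.

```shell
# Print "PID COMMAND" for processes whose state starts with D
# (uninterruptible sleep), reading `ps -eo pid,stat,comm` output on stdin.
# Only the first word of the command name is printed.
d_state_procs() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}
```

Usage would be something like: ps -eo pid,stat,comm | d_state_procs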
We can gather more advanced information with the all-powerful lsof tool.
It tells us about the files opened by a process, a user etc. Since in Unix almost everything is treated like a file (sockets etc), lsof yields a lot of useful information: who's accessing what, what's holding a resource etc.
To check what files are involved with a process:
lsof -p pid (where 'pid' is the process number we get from 'ps', 'top' or 'netstat')
Files open by a user:
lsof -u username
Port info:
lsof -i :portnumber
- Memory usage
There are several tools to check for memory usage, the simplest way is probably:
free -m
Where values are in MB.
On older versions of free, ignore the first line: once Linux grabs memory for buffers and cache it won't release it until the memory is needed, even if it's not using it (that's kind of the idea). This means we really have to look at the second line '-/+ buffers/cache' to know how much memory is currently used and how much is free (available). Newer versions of free drop that line and show an 'available' column instead.
The 'swap' line is interesting too; if there's any value used that means that at some point Linux ran out of RAM and used the disk ('swapped') as memory and this slows down the system. Resorting to some swap from time to time to deal with spikes in activity can be OK, but using it all the time is not.
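The memory that is really available can also be read straight from /proc/meminfo. A minimal sketch, assuming a kernel recent enough (3.14 or later) to have the MemAvailable line; mem_available_mb is a made-up helper name.

```shell
# Print the estimated available memory in MB from a meminfo-style file.
# Assumption: the file has a "MemAvailable:" line with a value in kB.
# Prints nothing if that line is missing (older kernels).
mem_available_mb() {
    awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Called without arguments, mem_available_mb reads /proc/meminfo; a different file can be passed as the first argument.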
Another handy tool to check the memory and CPU utilization is vmstat.
If we want to run it every second for example we can do vmstat 1. To see the changes in place we can use the cool 'watch' tool:
watch -n1 vmstat
If a program stopped responding or there's general sluggishness, it's possible that we ran out of memory at some point.
To check the number of times we've run out of memory recently we can do:
grep -i kill /var/log/messages |wc -l
or:
dmesg |grep -i kill | wc -l
The whole list (without the line count '| wc -l') will give us details of what processes were killed by the out-of-memory (oom) killer.
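To see which programs the oom killer picked, the process names can be pulled out of those log lines. This is a sketch only: oom_victims is a hypothetical helper, and it assumes the kernel logs lines of the form "Killed process 1234 (name) ...", a wording that can differ between kernel versions.

```shell
# Print the name of every process reported as killed, one per line,
# reading kernel log lines on stdin.
# Assumption: lines look like "... Killed process <pid> (<name>) ...".
oom_victims() {
    sed -n 's/.*Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p'
}
```

Usage would be something like: dmesg | oom_victims | sort | uniq -c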
- Disk usage
To get disk space info we can use df or du:
df -h
The biggest (for example 5) directories (this command and the next may take a while):
du -mxS / | sort -n | tail -5
Or for the detailed files (credit Rimuhosting support):
du -a --max-depth=3 / | sort -n | awk '{ if($1 > 102400) print $1/1024 "MB" " " $2 }'
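Note that sometimes 'du' and 'df' can report different disk usage. This happens because a file that was deleted while a process still has it open keeps using space: 'df' still counts it but 'du' no longer sees it. Restarting the process that holds the file (or, not that you need to, a reboot) gets rid of this discrepancy.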
If you ran out of space you can delete some unneeded files (like logs perhaps), purge packages for programs you are not using or compress files with gzip.
For example to archive a directory into one file and also compress it with the tar utility:
tar cvfz dir.tar.gz dir
Here's a couple of tips to safely reclaim some disk space in an emergency situation.
- For Debian-like systems (like Ubuntu) with the 'apt' package management we can clear out the local repository of retrieved package files with:
apt-get clean; the equivalent for the yum updater is yum clean all (or yum clean packages to remove only the cached packages).
- The ext2 and ext3 file systems by default reserve 5% of the partition's capacity for root (to check if your filesystem is ext2 or ext3 see for example: df -T or mount).
We can free up that reserved space in case of emergency and leave just 1% with:
tune2fs -m1 /dev/hda1
Here /dev/hda1 is an example of a disk partition (use 'df' for example to get the name in your case).
We could also set the reserved % value ('m' option) to zero but perhaps that's not a very good idea; in case the disk becomes full we still want to be able to get in (as root) to do maintenance, that's the whole idea of the 5% reserve in the first place.
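Before changing anything it helps to know the current reserve. The sketch below works it out from the output of tune2fs -l, assuming it has the usual "Block count:" and "Reserved block count:" lines; reserved_pct is a made-up helper name.

```shell
# Print the percentage of blocks reserved for root, reading
# `tune2fs -l <device>` output on stdin.
# Assumption: the output has "Block count:" and "Reserved block count:" lines.
reserved_pct() {
    awk -F: '
        $1 == "Block count"          { total = $2 + 0 }
        $1 == "Reserved block count" { reserved = $2 + 0 }
        END { if (total > 0) printf "%.1f\n", 100 * reserved / total }'
}
```

Usage would be something like: tune2fs -l /dev/hda1 | reserved_pct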
- Logs
Logs are a sysadmin's best friend. Look at the logs in the /var/log directory, like /var/log/messages etc. To look at large files we can use 'tail' to see the last lines, like: tail -20 /var/log/secure.
If the file keeps growing fast we can watch the changes as they happen with tail -f file
Another option to see the end of a file is to use 'less' and then press the '>' key to go to the end; this way we can go back up.
Other navigation controls for 'less' are 'b' to scroll up, the space bar to scroll down and enter to move down a line. We can also look for keywords in 'less' by typing the forward slash / followed by the keyword.
Most logs print a timestamp; check with the date command what the server thinks the current time is.
Also very useful on systems where more than one user shares the same account is the history utility, which shows the previously entered commands for the current account.
For other users see the file: /home/username/.bash_history or similar.
I think it's a good idea to add a timestamp to the command history; this is done by setting the HISTTIMEFORMAT variable, for example:
HISTTIMEFORMAT="%d/%h - %H:%M:%S "
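Set like this, the variable only lasts for the current shell session. To keep it, the line can go into a bash startup file. A small sketch follows; the helper name enable_history_timestamps and the default target ~/.bashrc are assumptions, so adjust them to your setup.

```shell
# Append the HISTTIMEFORMAT setting to a bash startup file, unless a
# HISTTIMEFORMAT line is already there.
# Assumption: ~/.bashrc is the right startup file for the account.
enable_history_timestamps() {
    rc="${1:-$HOME/.bashrc}"
    grep -q '^HISTTIMEFORMAT=' "$rc" 2>/dev/null ||
        printf '%s\n' 'HISTTIMEFORMAT="%d/%h - %H:%M:%S "' >> "$rc"
}
```

Running it twice is harmless, since the second call finds the existing line and adds nothing.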
- Scheduled jobs
For problems that appear only at certain times, check which users have cron jobs with: ls /var/spool/cron/. Then, to see the cron jobs for a user (for example 'mailman'): crontab -u mailman -l
- Network Settings
Very basic commands are ifconfig and route.
A couple of important files are:
/etc/resolv.conf: this file lists the DNS servers. If the server cannot access a hostname or URL on the Internet, check if it's accessible by IP address; if that's the case, this indicates a problem with domain name resolution.
In this case try adding well-known, reliable DNS servers to the /etc/resolv.conf file (instead of the ones provided by your ISP), like the ones by OpenDNS (208.67.222.222, 208.67.220.220) or the easy to remember 4.2.2.1, 4.2.2.2.
/etc/hosts: in this file we can tie a hostname with an IP address, bypassing resolution through DNS servers.
A source of problems is when the entry 127.0.0.1 localhost is missing from this file (to test: doing a ping localhost would fail).
Many programs (database access, mail servers etc) rely on this mapping so they would fail because of this.
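A quick way to test for that entry without pinging is to look for it in the file itself. The helper has_localhost_entry below is a hypothetical name, and the sketch only checks the IPv4 line.

```shell
# Succeed (exit status 0) if the hosts file maps 127.0.0.1 to "localhost".
# Commented-out lines and trailing comments are ignored.
has_localhost_entry() {
    awk '$1 == "127.0.0.1" {
             for (i = 2; i <= NF; i++) {
                 if ($i ~ /^#/) break
                 if ($i == "localhost") found = 1
             }
         }
         END { exit !found }' "${1:-/etc/hosts}"
}
```

Usage would be something like: has_localhost_entry || echo "localhost entry missing"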
When reviewing a server's networking it's also important to check if we have firewall rules (redirections etc) with:
iptables -L; iptables -t nat -L
Note that the rules for the 'nat' table are not displayed with just 'iptables -L'.
- Software installed and their versions
Many popular programs can print their version, but the option varies: sometimes it's a small v, sometimes a big V and sometimes 'version'; sometimes with one dash and sometimes with two:
python -V
perl -v
httpd -v or apache2 -v
java -version
mysql --version
For Linux kernel and distro version:
uname -a
cat /proc/version
(There is lots of information under /proc; many commands just read from this pseudo file system, and it's a matter of knowing where to look and how to interpret the info).
To get a list of installed packages:
For rpm-based distros: rpm -qa
For dpkg-based distros (Debian): dpkg --get-selections
- Finding stuff
This is just one of my favorite Linux command combinations:
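Say you need to find all the instances of a keyword that can be in one or more files under a directory structure (say /etc). This can be done with the following (the -type f option skips directories, and -print0 together with -0 keeps file names with spaces in one piece):
find /etc -type f -print0 | xargs -0 grep keyword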