Note: If your service is managed by systemd, “systemctl status XXX” (or the older “service XXX status”) already gives you a lot of useful information.
- When was the process started, and how long has it been running? This helps us detect whether an unexpected or suspicious service restart has happened. In addition, a decent service always logs properly, and its logs can confirm our observation.
# Get elapsed time since start by pid (-w avoids matching longer pids that merely contain $pid)
ps -eo pid,comm,etime,user | grep -w "$pid"
# Sample output:
root@s1:~# ps -eo pid,comm,etime,user \
| grep 20513
20513 dockerd 8-00:58:30 root
# etime is formatted as [[dd-]hh:]mm:ss, so this means 8 days, 0 hours, 58 min and 30 sec
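If you want to compare the elapsed time against a threshold (for example, "did it restart in the last hour?"), it helps to turn etime into plain seconds. Below is a minimal sketch; etime_to_seconds is a hypothetical helper name, and it assumes the [[dd-]hh:]mm:ss layout shown in the sample output above.

```shell
# Convert a ps "etime" value ([[dd-]hh:]mm:ss) into seconds.
# etime_to_seconds is a hypothetical helper, not a standard tool.
etime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF                              # 2, 3 or 4 fields depending on uptime
        secs  = $n
        mins  = $(n-1)
        hours = (n >= 3) ? $(n-2) : 0       # hh is only present after one hour
        days  = (n >= 4) ? $(n-3) : 0       # dd is only present after one day
        print days * 86400 + hours * 3600 + mins * 60 + secs
    }'
}

# The dockerd sample above
etime_to_seconds 8-00:58:30
```

On a live system you could feed it directly with something like: etime_to_seconds "$(ps -o etime= -p "$pid" | tr -d ' ')".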
- Where is the log file? A very common question, especially from Dev or QA. A process usually logs continuously, so it holds open file descriptors (fd) for its log files. lsof can list every fd opened by the process, so you don’t need to ask anyone to find the answer!
# Find out log files by pid
lsof -P -n -p $pid | grep ".*log$"
# Sample output:
# root@s1/# lsof -p 40 | grep ".*log$"
# daemon .. /var/log/jenkins/jenkins.log
# daemon .. /var/log/jenkins/jenkins.log
# Check log files for errors/exceptions
grep -C 3 -iE "exception|error" "$logfile"
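The two steps above can be chained: find the log files a process holds open, then count the suspicious lines in each. This is only a sketch under a few assumptions: it reads /proc/<pid>/fd directly instead of calling lsof (Linux only, and you need permission to inspect the process), open_logs is a hypothetical helper name, and pid is set to the current shell purely so the snippet runs as-is.

```shell
# open_logs is a hypothetical helper: list files ending in "log"
# that the given pid currently holds open.
open_logs() {
    for fd in /proc/"$1"/fd/*; do
        # Each fd entry is a symlink to the real file
        readlink "$fd" 2>/dev/null
    done | grep 'log$' | sort -u
}

# Assumption: replace $$ with the pid you are investigating
pid=$$

# Print "<logfile>: <number of lines mentioning error/exception>"
open_logs "$pid" | while read -r logfile; do
    printf '%s: ' "$logfile"
    grep -ciE "exception|error" "$logfile" || true
done
```

A count that jumps between two runs is usually a good hint about where to look first.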
- How much CPU and memory does the process use? We certainly need to stay on top of any abnormal resource utilization[1]. Fortunately, almost all modern monitoring systems let us see the history, which is a big plus for troubleshooting.
# Check process resource utilization
top -p $pid
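top is interactive, which is awkward in scripts or when you want to paste the numbers into a ticket. A one-shot snapshot with ps is one possible alternative. The sketch below assumes a procps-style ps; proc_usage is a hypothetical helper name.

```shell
# proc_usage is a hypothetical helper: print pid, %CPU, %MEM, RSS (KiB)
# and command name for one process, without a header line.
# Each column gets its own -o with an empty header, which suppresses the header.
proc_usage() {
    ps -o pid= -o pcpu= -o pmem= -o rss= -o comm= -p "$1"
}

# Example: inspect the current shell
proc_usage "$$"
```

Keep in mind that the %CPU figure from ps is averaged over the process lifetime, so use top when you care about what is happening right now.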
- What command line started the process? People ask this question when they have to manage unfamiliar services. A more urgent case: the stupid service just mysteriously refuses to start. Wrong Java opts? A file permission issue? The process command line can give us some insight or hints.
# Find out process start command line
# (arguments in /proc are NUL-separated, so turn NULs into spaces)
tr '\0' ' ' < /proc/$pid/cmdline; echo
- What TCP ports is the process listening on? Nowadays the majority of services are web-based or microservices, so it helps to know which TCP ports the process is listening on.
# Check what ports are serving
lsof -P -n -p $pid | grep -i listen
# Check whether given port is listening (filter out established connections)
lsof -P -n -i tcp:$tcp_port -sTCP:LISTEN
- How many fds does the process have open? Too many open fds, say over 3000, is usually a bad sign. Possible causes: a poor design that makes the application inefficient at handling requests; an fd leak; or more requests than we expected.
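# Get total fd count opened by pid
# (lsof -p also prints a header line and non-fd entries such as cwd, txt and mem,
# so counting the entries in /proc is more accurate)
ls /proc/$pid/fd | wc -l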
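A raw count is easier to judge next to the limit the process is actually allowed. The sketch below prints both; it assumes Linux with /proc mounted and enough permission to read the target process, and fd_usage is a hypothetical helper name.

```shell
# fd_usage is a hypothetical helper: print "<open fds> <soft limit>" for a pid.
fd_usage() {
    pid=$1
    # Each entry in /proc/<pid>/fd is one open file descriptor
    open=$(ls "/proc/$pid/fd" | wc -l)
    # In /proc/<pid>/limits the "Max open files" row has the soft limit in column 4
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open $limit"
}

# Example: inspect the current shell
fd_usage "$$"
```

If the first number keeps creeping toward the second, an fd leak is a likely suspect.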