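Approaches to automatic process monitoring on Linux.

An approach built on the Linux cron scheduling service

How it works: use the crontab mechanism that Linux provides to check periodically whether the server process exists; if it is down, run the restart script.

Core code: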

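#!/bin/sh

host_dir=$HOME                                          # current user's home directory (unused below)
proc_name="/home/wkubuntu/named/sbin/named"             # full path of the process to watch
file_name="/mnt/bindmonitor.log"                        # log file
pid=0

proc_num()                                              # count matching processes
{
    # Print the count rather than returning it: an exit status wraps at 256.
    ps -ef | grep "$proc_name" | grep -v grep | wc -l
}

proc_id()                                               # look up the process ID(s)
{
    pid=`ps -ef | grep "$proc_name" | grep -v grep | awk '{print $2}'`
}

number=`proc_num`
if [ "$number" -eq 0 ]                                  # is the process missing?
then
    /home/wkubuntu/named/sbin/named -c /home/wkubuntu/named/etc/named.conf -n 1 &
                                                        # restart command; adjust for your service
    sleep 1                                             # give the daemon a moment to start
    proc_id                                             # get the new process ID
    echo "${pid}, `date`" >> "$file_name"               # log the new PID and the restart time
fi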
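
The script performs a single check each time it runs, so cron has to invoke it on a schedule. Below is a minimal sketch of registering it. It assumes the script is saved as /usr/local/bin/bindmonitor.sh (a hypothetical path, not from the original) and that a one-minute interval is acceptable; the helper names cron_entry and install_cron_entry are made up for this example.

```shell
#!/bin/sh
# Hypothetical location of the monitoring script; adjust to your setup.
MONITOR_SCRIPT="/usr/local/bin/bindmonitor.sh"

# Print a crontab entry that runs the given script once per minute.
cron_entry() {
    printf '%s %s >/dev/null 2>&1\n' '* * * * *' "$1"
}

# Add the entry to the current user's crontab unless the script is already listed.
install_cron_entry() {
    current=$(crontab -l 2>/dev/null)
    if printf '%s\n' "$current" | grep -F -q -- "$1"; then
        echo "already scheduled: $1"
        return 0
    fi
    {
        if [ -n "$current" ]; then printf '%s\n' "$current"; fi
        cron_entry "$1"
    } | crontab -
}

# Only show the entry here; call install_cron_entry "$MONITOR_SCRIPT" to install it.
cron_entry "$MONITOR_SCRIPT"
```

Because the script writes to /mnt/bindmonitor.log and restarts a system service, the entry probably belongs in the crontab of a user with sufficient privileges.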

A simple implementation of Linux process monitoring and automatic restart

Open-source tools

This section covers two tools: Monit and Supervisor.

Monit monitors and manages any service, and Supervisor is a nice tool for managing persistent scripts and commands without having to write init scripts for them.

  1. Monit

Monit is a good choice when you’re managing just a few machines, and don’t want to hassle with the complexity of something like Nagios or Chef. It works best as a single-host monitor, but it can also monitor remote services, which is useful when local services depend on them, such as database or file servers. The coolest feature is you can monitor any service, and you will see why in the configuration examples.

  2. Supervisor

Supervisor is a slick tool for managing scripts and commands that don’t have init scripts. It saves you from having to write your own, and it’s much easier to use than systemd.

On Debian/Ubuntu, Supervisor starts automatically after installation. Verify with ps:

ps ax|grep supervisord

Let’s take our Python hello world script from last week to practice with. Set it up in /etc/supervisor/conf.d/helloworld.conf:

[program:helloworld.py]
command=/bin/helloworld.py
autostart=true
autorestart=true
stderr_logfile=/var/log/hello/err.log
stdout_logfile=/var/log/hello/hello.log
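
Supervisor does not create the directories named in stdout_logfile and stderr_logfile, so they need to exist before the program is loaded. Here is a small sketch that reads the log paths from a program config and creates their parent directories; the helper name ensure_log_dirs is invented for this example.

```shell
#!/bin/sh
# ensure_log_dirs CONF: create the parent directory of every std*_logfile path in CONF.
ensure_log_dirs() {
    sed -n 's/^[[:space:]]*std[a-z]*_logfile[[:space:]]*=[[:space:]]*//p' "$1" |
    while IFS= read -r path; do
        mkdir -p "$(dirname "$path")"
    done
}

# Example (run as root): ensure_log_dirs /etc/supervisor/conf.d/helloworld.conf
```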

Now Supervisor needs to re-read the conf.d/ directory, and then apply the changes:

$ sudo supervisorctl reread
$ sudo supervisorctl update

Check your new logfiles to verify that it’s running:

$ sudo supervisorctl reread
helloworld.py: available
carla@studio:~$ sudo supervisorctl update
helloworld.py: added process group
carla@studio:~$ tail /var/log/hello/hello.log
Hello World!
Hello World!
Hello World!
Hello World!

3 Cool Linux Service Monitors

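Supervisor is a convenient Linux tool for starting and monitoring services. It consists of two commands, supervisord and supervisorctl: the background daemon and the command-line management client, respectively. To install both, run sudo apt-get install supervisor.

Using Supervisor on Ubuntu 18

Notes

  • Install with apt rather than pip (installation requires root privileges)
  • Create your own configuration file in the conf.d directory

Query commands

First, check whether Supervisor itself is running: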

ps aux | grep supervisor

Next, check the status of the processes it supervises:

supervisorctl
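
For scripted checks, the non-interactive form supervisorctl status NAME may be more convenient than the interactive prompt. The sketch below extracts the state column from that output. It assumes the usual "name STATE description" line layout, and the function names are invented for this example.

```shell
#!/bin/sh
# program_state: read `supervisorctl status NAME` output on stdin, print the state column.
program_state() {
    awk 'NR == 1 { print $2 }'
}

# is_program_running NAME: succeed only if supervisorctl reports NAME as RUNNING.
is_program_running() {
    state=$(supervisorctl status "$1" 2>/dev/null | program_state)
    [ "$state" = "RUNNING" ]
}

# Example: is_program_running helloworld.py && echo "helloworld.py is up"
```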

You can also configure it to start automatically at boot.
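
On a systemd-based release such as Ubuntu 18.04, this usually comes down to enabling the service unit. This is a sketch that assumes the apt package registers a unit named supervisor; confirm the name with systemctl list-unit-files before relying on it. The helper enable_at_boot is hypothetical.

```shell
#!/bin/sh
# Assumed unit name for the apt-installed package; verify on your system.
UNIT="supervisor"

# enable_at_boot UNIT: enable the unit unless systemd already reports it as enabled.
enable_at_boot() {
    if [ "$(systemctl is-enabled "$1" 2>/dev/null)" = "enabled" ]; then
        echo "$1 is already enabled"
    else
        sudo systemctl enable "$1"
    fi
}

# Example: enable_at_boot "$UNIT"
```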

Installing and using Supervisor for process supervision on Ubuntu 18.04, with automatic start at boot