Monitoring Linux Hosts with Nagios and NRPE
I. Introduction
1. About NRPE
NRPE is an add-on for Nagios that executes plugins on remote Linux/Unix hosts. With the NRPE daemon and the Nagios plugins installed on a remote server, the server can report its local state, such as CPU load, memory usage, and disk usage, to the Nagios monitoring platform. In this article the Nagios monitoring side is called the Nagios server, and the remote monitored host is called the Nagios client.
Nagios can monitor remote hosts in several ways, including SNMP, NRPE, SSH, and NSCA. This article covers monitoring remote Linux hosts through NRPE.
NRPE (Nagios Remote Plugin Executor) is a daemon that runs check commands on a remote server. It lets the Nagios server trigger check commands on the remote host through the agent installed there and returns the results to the server. Its overhead is much lower than that of SSH-based checks, and because a check needs no system account credentials on the remote host, it is also safer than the SSH-based approach.
2. How NRPE works
NRPE consists of two parts:
check_nrpe plugin: resides on the monitoring host
nrpe daemon: runs on the remote host, acting as the agent on the monitored side
Note: the nrpe daemon depends on the Nagios plugins (nagios-plugins); without them the daemon cannot perform any checks.
The workflow in detail
When Nagios needs to check a service or resource on a remote Linux host:
First: Nagios runs the check_nrpe plugin and tells it what to check;
Next: check_nrpe connects to the remote NRPE daemon over SSL;
Then: the NRPE daemon runs the corresponding Nagios plugin to perform the check;
Finally: the NRPE daemon returns the result to check_nrpe, which passes it to Nagios for processing.
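The result that travels back in the final step includes the plugin's exit status, which by the standard Nagios plugin convention encodes the state: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The sketch below only illustrates that mapping; the function name nagios_state is hypothetical and not part of Nagios or NRPE.

```shell
#!/bin/sh
# Map a Nagios plugin exit status to its state name.
# nagios_state is a hypothetical helper used only for illustration.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;   # values outside 0-3 are not defined by the convention
    esac
}

# Usage sketch on the Nagios server (replace <host> and <command>):
#   /usr/local/nagios/libexec/check_nrpe -H <host> -c <command>
#   nagios_state $?
```

Because check_nrpe exits with the same status the remote plugin returned, Nagios can treat the remote check as if the plugin had run locally.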
 
II. Installing nagios-plugins and NRPE on the monitored host
1. Add the nagios user
- [root@ClientNrpe ~]# useradd -s /sbin/nologin nagios
 
2. Install nagios-plugins, which NRPE depends on
- [root@ClientNrpe ~]# yum -y install gcc gcc-c++ make openssl openssl-devel
 - [root@ClientNrpe ~]# tar xf nagios-plugins-2.0.3.tar.gz
 - [root@ClientNrpe ~]# cd nagios-plugins-2.0.3
 - [root@ClientNrpe nagios-plugins-2.0.3]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
 - [root@ClientNrpe nagios-plugins-2.0.3]# make && make install
 - # Note: to monitor MySQL, add --with-mysql to the ./configure options
 
3. Install NRPE
- [root@ClientNrpe ~]# tar xf nrpe-2.15.tar.gz
 - [root@ClientNrpe ~]# cd nrpe-2.15
 - [root@ClientNrpe nrpe-2.15]# ./configure --with-nrpe-user=nagios \
 - > --with-nrpe-group=nagios \
 - > --with-nagios-user=nagios \
 - > --with-nagios-group=nagios \
 - > --enable-command-args \
 - > --enable-ssl
 - [root@ClientNrpe nrpe-2.15]# make all
 - [root@ClientNrpe nrpe-2.15]# make install-plugin
 - [root@ClientNrpe nrpe-2.15]# make install-daemon
 - [root@ClientNrpe nrpe-2.15]# make install-daemon-config
 
4. Configure NRPE
- [root@ClientNrpe ~]# grep -v '^#' /usr/local/nagios/etc/nrpe.cfg |sed '/^$/d'
 - log_facility=daemon
 - pid_file=/var/run/nrpe.pid
 - server_port=5666 # listening port
 - nrpe_user=nagios
 - nrpe_group=nagios
 - allowed_hosts=192.168.0.105 # allowed addresses, usually the Nagios server
 - dont_blame_nrpe=0
 - allow_bash_command_substitution=0
 - debug=0
 - command_timeout=60
 - connection_timeout=300
 - command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
 - command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
 - command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
 - command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
 - command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
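Each command[NAME]=... line in the listing above defines a name that the Nagios server can later request through check_nrpe. As a small sketch, the names can be extracted from the config file with sed; list_nrpe_commands is a hypothetical helper name, and the path in the usage comment assumes the install prefix used in this article.

```shell
#!/bin/sh
# List the command names defined in an NRPE config file.
# list_nrpe_commands is a hypothetical helper used only for illustration.
list_nrpe_commands() {
    # Match uncommented lines of the form command[NAME]=... and print NAME only.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage sketch on the monitored host:
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```

Lines that start with # are skipped because the pattern is anchored at the start of the line.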
 
5. Start NRPE
- # Start as a daemon
 - [root@ClientNrpe ~]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
 - [root@ClientNrpe ~]# netstat -tulpn | grep nrpe
 - tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 22597/nrpe
 - tcp 0 0 :::5666 :::* LISTEN 22597/nrpe
 
nrpe has two run modes, which give two ways to manage the service:
- -i # Run as a service under inetd or xinetd
 - -d # Run as a standalone daemon
 
You can write an init script for nrpe so that it runs in standalone mode:
- [root@ClientNrpe ~]# cat /etc/init.d/nrped
 - #!/bin/bash
 - # chkconfig: 2345 88 12
 - # description: NRPE DAEMON
 - NRPE=/usr/local/nagios/bin/nrpe
 - NRPECONF=/usr/local/nagios/etc/nrpe.cfg
 - case "$1" in
 - start)
 - echo -n "Starting NRPE daemon..."
 - $NRPE -c $NRPECONF -d
 - echo " done."
 - ;;
 - stop)
 - echo -n "Stopping NRPE daemon..."
 - pkill -u nagios nrpe
 - echo " done."
 - ;;
 - restart)
 - $0 stop
 - sleep 2
 - $0 start
 - ;;
 - *)
 - echo "Usage: $0 start|stop|restart"
 - ;;
 - esac
 - exit 0
 - [root@ClientNrpe ~]# chmod +x /etc/init.d/nrped
 - [root@ClientNrpe ~]# chkconfig --add nrped
 - [root@ClientNrpe ~]# chkconfig nrped on
 - [root@ClientNrpe ~]# service nrped start
 - Starting NRPE daemon... done.
 - [root@ClientNrpe ~]# netstat -tnlp
 - Active Internet connections (only servers)
 - Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
 - tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1031/sshd
 - tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1108/master
 - tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 22597/nrpe
 - tcp 0 0 :::22 :::* LISTEN 1031/sshd
 - tcp 0 0 ::1:25 :::* LISTEN 1108/master
 - tcp 0 0 :::5666 :::* LISTEN 22597/nrpe
 
III. Installing NRPE on the monitoring server
1. Install NRPE
- [root@Nagios ~]# tar xf nrpe-2.15.tar.gz
 - [root@Nagios ~]# cd nrpe-2.15
 - [root@Nagios nrpe-2.15]# ./configure \
 - > --with-nrpe-user=nagios \
 - > --with-nrpe-group=nagios \
 - > --with-nagios-user=nagios \
 - > --with-nagios-group=nagios \
 - > --enable-command-args \
 - > --enable-ssl
 - [root@Nagios nrpe-2.15]# make all
 - [root@Nagios nrpe-2.15]# make install-plugin
 - # After installation, the check_nrpe plugin is created under libexec in the Nagios install directory
 - [root@Nagios ~]# cd /usr/local/nagios/libexec/
 - [root@Nagios libexec]# ll -d check_nrpe
 - -rwxrwxr-x. 1 nagios nagios 76769 Sep 28 08:07 check_nrpe
 
2. Using check_nrpe
 
- [root@Nagios libexec]# ./check_nrpe -h
 - NRPE Plugin for Nagios
 - Copyright (c) 1999-2008 Ethan Galstad (nagios@nagios.org)
 - Version: 2.15
 - Last Modified: 09-06-2013
 - License: GPL v2 with exemptions (-l for more info)
 - SSL/TLS Available: Anonymous DH Mode, OpenSSL 0.9.6 or higher required
 - Usage: check_nrpe -H <host> [ -b <bindaddr> ] [-4] [-6] [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]
 - Options:
 - -n = Do no use SSL
 - -u = Make socket timeouts return an UNKNOWN state instead of CRITICAL
 - <host> = The address of the host running the NRPE daemon
 - <bindaddr> = bind to local address
 - -4 = user ipv4 only
 - -6 = user ipv6 only
 - [port] = The port on which the daemon is running (default=5666)
 - [timeout] = Number of seconds before connection times out (default=10)
 - [command] = The name of the command that the remote daemon should run
 - [arglist] = Optional arguments that should be passed to the command. Multiple
 - arguments should be separated by a space. If provided, this must be
 - the last option supplied on the command line.
 - Note:
 - This plugin requires that you have the NRPE daemon running on the remote host.
 - You must also have configured the daemon to associate a specific plugin command
 - with the [command] option you are specifying here. Upon receipt of the
 - [command] argument, the NRPE daemon will run the appropriate plugin command and
 - send the plugin output and return code back to *this* plugin. This allows you
 - to execute plugins on remote hosts and 'fake' the results to make Nagios think
 - the plugin is being run locally.
 
- check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]
 - [root@Nagios libexec]# ./check_nrpe -H 192.168.0.81
 - NRPE v2.15
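Once the version query succeeds, the individual commands defined in nrpe.cfg can be exercised by hand with -c before any Nagios configuration is written. The following is a minimal sketch, assuming check_nrpe lives at the path used in this article; run_remote_checks and the CHECK_NRPE variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Path to check_nrpe as installed in this article; override CHECK_NRPE for other layouts.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# run_remote_checks HOST COMMAND...   (hypothetical helper)
# Runs each named NRPE command against HOST and prints its exit status and output.
run_remote_checks() {
    host=$1
    shift
    for cmd in "$@"; do
        output=$("$CHECK_NRPE" -H "$host" -c "$cmd")
        status=$?
        printf '%s [exit %s]: %s\n' "$cmd" "$status" "$output"
    done
}

# Usage sketch on the Nagios server:
#   run_remote_checks 192.168.0.81 check_users check_load check_total_procs
```

An exit status other than 0 here points at the remote command definition or the connection, which narrows things down before the web interface is involved.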
 
3. Define the command
- [root@Nagios ~]# cd /usr/local/nagios/etc/objects/
 - [root@Nagios objects]# vim commands.cfg
 - # Append at the end of the file
 - define command{
 - command_name check_nrpe
 - command_line $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "$ARG1$"
 - }
 
4. Define the services
- [root@Nagios objects]# cp windows.cfg linhost.cfg
 - [root@Nagios objects]# grep -v '^#' linhost.cfg |sed '/^$/d'
 - define host{
 - use linux-server
 - host_name linhost
 - alias My Linux Server
 - address 192.168.0.81
 - }
 - define service{
 - use generic-service
 - host_name linhost
 - service_description CHECK USER
 - check_command check_nrpe!check_users
 - }
 - define service{
 - use generic-service
 - host_name linhost
 - service_description Load
 - check_command check_nrpe!check_load
 - }
 - define service{
 - use generic-service
 - host_name linhost
 - service_description SDA1
 - check_command check_nrpe!check_hda1
 - }
 - define service{
 - use generic-service
 - host_name linhost
 - service_description Zombie
 - check_command check_nrpe!check_zombie_procs
 - }
 - define service{
 - use generic-service
 - host_name linhost
 - service_description Total procs
 - check_command check_nrpe!check_total_procs
 - }
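The five service blocks above differ only in service_description and in the command name passed to check_nrpe. As a sketch, such blocks could be generated rather than typed by hand; emit_nrpe_service is a hypothetical helper, and the output file in the usage comment assumes the linhost.cfg path used in this article.

```shell
#!/bin/sh
# emit_nrpe_service HOST DESCRIPTION NRPE_COMMAND   (hypothetical helper)
# Prints one Nagios service definition that calls check_nrpe with NRPE_COMMAND.
emit_nrpe_service() {
    cat <<EOF
define service{
    use                 generic-service
    host_name           $1
    service_description $2
    check_command       check_nrpe!$3
}
EOF
}

# Usage sketch on the Nagios server:
#   emit_nrpe_service linhost "Load" check_load >> /usr/local/nagios/etc/objects/linhost.cfg
```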
 
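An important point: each command name used in a service definition on the Nagios server (the part after check_nrpe!) must match a command[...] entry defined in nrpe.cfg on the monitored host, as shown in the figure below.
5. Enable the defined commands and services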
- [root@Nagios ~]# vim /usr/local/nagios/etc/nagios.cfg
 - # Add this line
 - cfg_file=/usr/local/nagios/etc/objects/linhost.cfg
 
6. Check the configuration syntax
- [root@Nagios ~]# service nagios configtest
 - Nagios Core 4.0.7
 - Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
 - Copyright (c) 1999-2009 Ethan Galstad
 - Last Modified: 06-03-2014
 - License: GPL
 - Website: http://www.nagios.org
 - Reading configuration data...
 - Read main config file okay...
 - Read object config files okay...
 - Running pre-flight check on configuration data...
 - Checking objects...
 - Checked 20 services.
 - Checked 3 hosts.
 - Checked 2 host groups.
 - Checked 0 service groups.
 - Checked 1 contacts.
 - Checked 1 contact groups.
 - Checked 26 commands.
 - Checked 5 time periods.
 - Checked 0 host escalations.
 - Checked 0 service escalations.
 - Checking for circular paths...
 - Checked 3 hosts
 - Checked 0 service dependencies
 - Checked 0 host dependencies
 - Checked 5 timeperiods
 - Checking global event handlers...
 - Checking obsessive compulsive processor commands...
 - Checking misc settings...
 - Total Warnings: 0
 - Total Errors: 0
 - Things look okay - No serious problems were detected during the pre-flight check
 - Object precache file created:
 - /usr/local/nagios/var/objects.precache
 
7. Restart the nagios service
- [root@Nagios ~]# service nagios restart
 - Running configuration check...
 - Stopping nagios: done.
 - Starting nagios: done.
 
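8. Open the Nagios web interface
1) First click [Hosts] and check that the monitored host's status is UP
2) Then click [Services] and check that each monitored service's status is OK
Note: for the newly added host linhost, one service shows CRITICAL with the message "No such file or directory". The check_hda1 command in nrpe.cfg points at /dev/hda1, which evidently does not exist on this host, where the disk is /dev/sda1. The fix is as follows.
- ### On the monitored host: edit the NRPE config file and restart the NRPE service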
 - [root@ClientNrpe etc]# vim nrpe.cfg
 - command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
 - [root@ClientNrpe etc]# service nrped restart
 - ### On the monitoring server: edit linhost.cfg and restart the nagios and httpd services
 - [root@Nagios objects]# vim linhost.cfg
 - # Note: this was hda1 and is now changed to sda1
 - define service{
 - use generic-service
 - host_name linhost
 - service_description SDA1
 - check_command check_nrpe!check_sda1
 - }
 - [root@Nagios ~]# service nagios restart
 - Running configuration check...
 - Stopping nagios: done.
 - Starting nagios: done.
 - [root@Nagios ~]# service httpd restart
 - Stopping httpd: [ OK ]
 - Starting httpd: [ OK ]
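A rename like this has to be made on both machines, so the two files can easily drift out of sync. The sketch below lists every name referenced as check_nrpe!NAME in a service file that has no matching command[NAME]= line in an NRPE config. It assumes a copy of the client's nrpe.cfg is available on the machine where it runs; missing_nrpe_commands is a hypothetical helper name.

```shell
#!/bin/sh
# missing_nrpe_commands SERVICE_CFG NRPE_CFG   (hypothetical helper)
# Prints each command referenced as check_nrpe!NAME in SERVICE_CFG
# that has no matching command[NAME]= line in NRPE_CFG.
missing_nrpe_commands() {
    sed -n 's/^[[:space:]]*check_command[[:space:]]*check_nrpe!\([^[:space:]!]*\).*/\1/p' "$1" |
    sort -u |
    while read -r name; do
        grep -q "^command\[$name]=" "$2" || echo "$name"
    done
}

# Usage sketch (nrpe.cfg.copy is an assumed local copy of the client's file):
#   missing_nrpe_commands /usr/local/nagios/etc/objects/linhost.cfg ./nrpe.cfg.copy
```

Empty output means every referenced command is defined on the client side.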
 
Click [Services] again to refresh the page; the result is shown in the figure below: