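The host's check-host-alive uses check_interval 5, so the host is checked every five minutes. After the first non-OK result, the check interval drops to once every 10 seconds; after ten non-OK checks in a row the host enters the HARD state and a notification is sent.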
check_interval 5
max_check_attempts 10
notification_interval 25
nagios.log
2008-12-02.10:11:45 [1228183905] HOST ALERT: ssorc.tw;DOWN;SOFT;1;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:11:55 [1228183915] HOST ALERT: ssorc.tw;DOWN;SOFT;2;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:12:05 [1228183925] HOST ALERT: ssorc.tw;DOWN;SOFT;3;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:12:15 [1228183935] HOST ALERT: ssorc.tw;DOWN;SOFT;4;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:12:25 [1228183945] HOST ALERT: ssorc.tw;DOWN;SOFT;5;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:12:35 [1228183955] HOST ALERT: ssorc.tw;DOWN;SOFT;6;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:12:45 [1228183965] HOST ALERT: ssorc.tw;DOWN;SOFT;7;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:12:55 [1228183975] HOST ALERT: ssorc.tw;DOWN;SOFT;8;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:13:05 [1228183985] HOST ALERT: ssorc.tw;DOWN;SOFT;9;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:13:15 [1228183995] HOST ALERT: ssorc.tw;DOWN;HARD;10;CRITICAL – Plugin timed out after 10 seconds
2008-12-02.10:13:15 [1228183995] HOST NOTIFICATION: nagios-admin-email-cross;ssorc.tw;DOWN;host-notify-by-email;CRITICAL – Plugin timed out after 10 seconds
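As a rough cross-check of the log above, the time from the first SOFT failure to the HARD state can be estimated from max_check_attempts and the retry spacing. This is only a minimal sketch: the function name `seconds_until_hard` is made up for illustration, and the 10-second spacing is read off the log timestamps rather than taken from any Nagios setting.

```python
def seconds_until_hard(max_check_attempts: int, retry_spacing_s: int) -> int:
    """Seconds between the first failed check (SOFT 1) and the HARD state.

    Attempt 1 is the first failure itself, so only
    (max_check_attempts - 1) retries have to elapse after it.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_spacing_s


# Values from the host example: 10 attempts, about 10 s between retries
# (the spacing is an observation from the log, not a configured value).
elapsed = seconds_until_hard(max_check_attempts=10, retry_spacing_s=10)
print(elapsed)  # 90, matching 10:11:45 -> 10:13:15 in the log
```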
While the host is DOWN, any other monitored service on it, HTTP for example, is also CRITICAL. It only shows up in red; no notification is sent for it.
2008-12-02.10:13:15 [1228183995] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;HARD;1;CRITICAL – Socket timeout after 10 seconds
When the host comes back UP, only the Host UP notification is sent.
2008-12-02.10:32:25 [1228185145] HOST ALERT: ssorc.tw;UP;HARD;1;PING OK – Packet loss = 0%, RTA = 29.41 ms
2008-12-02.10:32:25 [1228185145] HOST NOTIFICATION: nagios-admin-email-cross;ssorc.tw;UP;host-notify-by-email;PING OK – Packet loss = 0%, RTA = 29.41 ms
2008-12-02.10:32:25 [1228185145] SERVICE ALERT: ssorc.tw;HTTP;OK;HARD;1;HTTP OK HTTP/1.1 200 OK – 69255 bytes in 2.491 seconds
As for check-host-alive and check_ping, they are the same thing; only the thresholds they judge by are different.
For service monitoring:
max_check_attempts 6
normal_check_interval 5
retry_check_interval 1
notification_interval 25
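The service is checked every five minutes. After the first non-OK result the interval changes to one minute. When six checks in a row are non-OK, the first notification is sent and the check interval goes back to five minutes. If the service is still non-OK 25 minutes later, the second notification is sent.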
Timeline (time since the previous check, then the result):
start: OK
+5 min: not OK, soft1
+1 min: not OK, soft2
+1 min: not OK, soft3
+1 min: not OK, soft4
+1 min: not OK, soft5
+1 min: not OK, soft6/Hard, Alert1
+5 min: Hard
+5 min: Hard
+5 min: Hard
+5 min: Hard
+5 min: Hard, Alert2 (25 min after Alert1)
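The schedule described above can also be sketched as a small simulation. This is a simplified model under stated assumptions, not how Nagios itself is implemented: the function `service_timeline` and the `total_minutes` parameter are hypothetical, the service is assumed to fail at minute 0 and never recover, and a repeat notification is assumed to go out on the first check at or after notification_interval minutes since the previous one.

```python
from typing import List, Optional, Tuple


def service_timeline(
    max_check_attempts: int = 6,
    normal_check_interval: int = 5,
    retry_check_interval: int = 1,
    notification_interval: int = 25,
    total_minutes: int = 40,
) -> List[Tuple[int, str, str]]:
    """Return (minute, state, event) for each check of a failing service.

    Minute 0 is the first non-OK check. All values are in minutes.
    """
    timeline: List[Tuple[int, str, str]] = []
    minute = 0
    attempt = 1
    last_notified: Optional[int] = None
    while minute <= total_minutes:
        hard = attempt >= max_check_attempts
        state = "HARD" if hard else f"SOFT {attempt}"
        event = ""
        # Notify on entering HARD, then again once the interval has passed.
        if hard and (last_notified is None
                     or minute - last_notified >= notification_interval):
            event = "notify"
            last_notified = minute
        timeline.append((minute, state, event))
        # Retry quickly while SOFT, use the normal interval once HARD.
        minute += normal_check_interval if hard else retry_check_interval
        attempt += 1
    return timeline


for minute, state, event in service_timeline():
    print(f"t+{minute:>2} min  {state:<6}  {event}")
```

With the values from this post, the sketch notifies at minute 5 (the sixth failed check) and again at minute 30, that is, 25 minutes later.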
Log
2008-12-02.11:33:35 [1228188815] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;1;CRITICAL – Socket timeout after 10 seconds
2008-12-02.11:34:45 [1228188885] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;2;CRITICAL – Socket timeout after 10 seconds
2008-12-02.11:35:55 [1228188955] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;3;CRITICAL – Socket timeout after 10 seconds
2008-12-02.11:36:55 [1228189015] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;4;CRITICAL – Socket timeout after 10 seconds
2008-12-02.11:38:05 [1228189085] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;5;CRITICAL – Socket timeout after 10 seconds
2008-12-02.11:38:55 [1228189135] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;HARD;6;CRITICAL – Socket timeout after 10 seconds
2008-12-02.11:38:55 [1228189135] SERVICE NOTIFICATION: nagios-admin-email-cross;ssorc.tw;HTTP;CRITICAL;notify-by-email;CRITICAL – Socket timeout after 10 seconds
2008-12-02.12:04:05 [1228190645] SERVICE NOTIFICATION: nagios-admin-email-cross;ssorc.tw;HTTP;CRITICAL;notify-by-email;CRITICAL – Socket timeout after 10 seconds
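To compare the log with the configured intervals, the epoch timestamps in square brackets can be subtracted from each other. The snippet below is a minimal sketch: the helper names `epochs` and `gaps` are made up for illustration, and the log lines are copied from above with the plugin output trimmed.

```python
import re
from typing import List

# Lines copied from the service log above (plugin output trimmed).
LOG = """\
2008-12-02.11:33:35 [1228188815] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;1
2008-12-02.11:34:45 [1228188885] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;2
2008-12-02.11:35:55 [1228188955] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;3
2008-12-02.11:36:55 [1228189015] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;4
2008-12-02.11:38:05 [1228189085] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;5
2008-12-02.11:38:55 [1228189135] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;HARD;6
2008-12-02.11:38:55 [1228189135] SERVICE NOTIFICATION: nagios-admin-email-cross;ssorc.tw;HTTP;CRITICAL;notify-by-email
2008-12-02.12:04:05 [1228190645] SERVICE NOTIFICATION: nagios-admin-email-cross;ssorc.tw;HTTP;CRITICAL;notify-by-email
"""

# Capture the epoch in brackets and the entry type that follows it.
PATTERN = re.compile(r"\[(\d+)\] (SERVICE ALERT|SERVICE NOTIFICATION):")


def epochs(log: str, kind: str) -> List[int]:
    """Epoch timestamps of all log entries of the given type."""
    return [int(m.group(1)) for m in PATTERN.finditer(log) if m.group(2) == kind]


def gaps(values: List[int]) -> List[int]:
    """Differences in seconds between consecutive timestamps."""
    return [b - a for a, b in zip(values, values[1:])]


alert_gaps = gaps(epochs(LOG, "SERVICE ALERT"))
notification_gaps = gaps(epochs(LOG, "SERVICE NOTIFICATION"))
print(alert_gaps)         # [70, 70, 60, 70, 50]
print(notification_gaps)  # [1510]
```

The retries land roughly one minute apart (retry_check_interval 1), and the two notifications are 1510 seconds apart, about 25 minutes, in line with notification_interval 25.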
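Update 2010-09-10: added a diagram of the host notification cycle: nagios.txt
In Nagios 3, the host definition has a retry_interval parameter that sets how often the host is checked while it is in the SOFT state. Nagios 2, as far as I can tell, only does 10 seconds!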