Description
- The service is not running and is in state "Active: active (exited)", yet the current version of the script exits with 0. A service that is really down is therefore not detected and is reported as active (exit 0):
root@emonpi(rw):nagios# ./libexec/check_service -o linux -s emonhub
Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
root@emonpi(rw):nagios# echo $?
0
>>> STATUS_MSG: Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
- The relevant variable assignments in the script, with their values in this run, are:
# Check the status of a service
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?
>>> SERVICETOOL: systemctl status emonhub | grep -i Active
>>> STATUS_MSG: Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
>>> EXIT_CODE: 0
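To illustrate why EXIT_CODE gives no information here, below is a minimal sketch, assuming the default shell behaviour (no pipefail). The exit status of a pipeline is that of its last command, so EXIT_CODE reflects grep and is 0 whenever an "Active" line is printed. fake_status is a hypothetical stand-in for systemctl, made to fail on purpose; it is not part of the script.

```shell
# fake_status is a hypothetical stand-in for "systemctl status emonhub".
# It prints an "Active:" line and then fails with exit code 3.
fake_status() {
    echo "Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago"
    return 3
}

# Same shape as SERVICETOOL in the script: tool | grep -i Active
STATUS_MSG=$(fake_status | grep -i Active)
EXIT_CODE=$?

# grep matched, so the pipeline status is 0 even though fake_status failed
echo "EXIT_CODE=$EXIT_CODE"
```

With this shape of SERVICETOOL, a check based on EXIT_CODE alone cannot tell a running service from an exited one, so the text of STATUS_MSG has to be inspected.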
- The bug is in the following line: when the exit code is trusted and EXIT_CODE is 0, it runs 'exit $OK' without checking whether the service is really down. Comment it out:
#[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK
- Then, inside the case statement (case $STATUS_MSG in), add these patterns:
*running*|*active*running*)
    echo "$STATUS_MSG"
    exit $OK
    ;;
*active*exited*)
    echo "$STATUS_MSG"
    exit $CRITICAL
    ;;
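The proposed matching can be tried in isolation with the sketch below. It makes several assumptions: the patterns are globs wrapped in *, the return codes are the standard Nagios plugin values, and classify_status is a hypothetical helper that returns the code instead of exiting. The "running" message is a made-up sample input, not real output.

```shell
# Standard Nagios plugin return codes (assumed to match the script's variables)
OK=0
CRITICAL=2
UNKNOWN=3

# classify_status is a hypothetical helper, not part of check_service.
# It maps a status message to a return code with the proposed patterns.
classify_status() {
    case $1 in
    *running*|*active*running*) return $OK ;;
    *active*exited*)            return $CRITICAL ;;
    *)                          return $UNKNOWN ;;
    esac
}

rc=0
classify_status "Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago" || rc=$?
echo "exited  -> $rc"

rc=0
classify_status "Active: active (running) since Wed 2019-03-27 08:50:07 CET" || rc=$?
echo "running -> $rc"
```

The order of the patterns matters only if a message could match both. Here the two states are disjoint, so "active (exited)" always falls through to the CRITICAL branch.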
- After applying the changes and running the command again, the service is now reported as down with CRITICAL status (exit 2):
root@emonpi(rw):nagios# ./libexec/check_service_test2 -o linux -s emonhub
Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
root@emonpi(rw):nagios# echo $?
2
If you agree with my solution, could you please release a new version with the proposed corrections, and add a comment in the header similar to the following: