I need to monitor some directories and track their weekly growth. The process I came up with is:

* A weekly cron job writes the output of the `du` command to a file
* A script imports this week's and last week's `du` files into sqlite tables
* sqlite computes the deltas and writes `deltas.csv`, which is emailed to me
0 3 * * 0 /bin/du -m /data/ > /home/USER/du_files/"du_$(/bin/date +\%Y\%m\%d)"
cd ~/du_files
TODAYS_FILE="du_$(/usr/bin/date +%Y%m%d)"
YESTERDAYS_FILE="du_$(/usr/bin/date --date="7 days ago" +%Y%m%d)"
/usr/bin/echo "create table old (oldsize integer, path varchar);" > delta.sql
/usr/bin/echo "create table new (newsize integer, path varchar);" >> delta.sql
/usr/bin/echo '.separator "\t" ' >> delta.sql
/usr/bin/echo ".import $TODAYS_FILE new" >> delta.sql
/usr/bin/echo ".import $YESTERDAYS_FILE old" >> delta.sql
/usr/bin/echo ".mode csv" >> delta.sql
/usr/bin/echo ".headers on" >> delta.sql
/usr/bin/echo ".out deltas.csv" >> delta.sql
/usr/bin/echo "select *,newsize-oldsize as delta_in_megabytes from old natural join new where oldsize<newsize order by delta_in_megabytes desc;" >> delta.sql
/usr/bin/sqlite3 < delta.sql
echo $YESTERDAYS_FILE|/usr/bin/mailx -a deltas.csv -s deltas.csv me@mywork.com
Resulting SQL
create table old (oldsize integer, path varchar);
create table new (newsize integer, path varchar);
.separator "\t"
.import du_20160821 new
.import du_20160814 old
.mode csv
.headers on
.out deltas.csv
select *,newsize-oldsize as delta_in_megabytes
from old natural join new where oldsize<newsize
order by delta_in_megabytes desc;
# TODO
### Can do now
* Bullet 1
* Bullet 2
### Near term
1. Numbered 1
1. Numbered 2
### Long term
DAILYFILE="/Users/norrist/Projects/todo/daily/$(/bin/date +%F).md"
DAILYPATH="/Users/norrist/Projects/todo/daily/"
LOCKFILE="/Users/norrist/Projects/todo/daily/LOCK"
TODOFILE="/Users/norrist/Projects/todo/todo.md"
if [ -f "$LOCKFILE" ]
then
echo "$LOCKFILE PRESENT - ABORTING"
read -n1 -p "Remove and Continue? [y,n]" doit
case $doit in
y|Y) echo "Continuing with $LOCKFILE PRESENT" ;;
*) exit 1 ;;
esac
else
echo "NO LOCKFILE"
touch "$LOCKFILE"
fi
if [ -f $DAILYFILE ]
then
echo "$DAILYFILE exists"
else
echo >> $DAILYFILE
echo "-----">> $DAILYFILE
echo "# $(/bin/date +%F)" >> $DAILYFILE
echo >> $DAILYFILE
echo "### Projects" >> $DAILYFILE
echo >> $DAILYFILE
echo "### Tickets" >> $DAILYFILE
echo >> $DAILYFILE
echo "### Walkups" >> $DAILYFILE
fi
/usr/local/bin/edit -w --new-window $DAILYFILE
/opt/local/bin/aspell -c $DAILYFILE
/opt/local/bin/aspell -c $TODOFILE
rm $LOCKFILE
rm $DAILYPATH/README.md
cat $TODOFILE >> $DAILYPATH/README.md
for f in $(ls -r $DAILYPATH/2*md)
do cat $f >> $DAILYPATH/README.md
echo >>$DAILYPATH/README.md
done
cd /Users/norrist/Projects/todo; /usr/bin/git add . \
&& /usr/bin/git commit -m "$(date)" \
&& /usr/bin/git push origin master
# 2016-08-02
-----
### Projects
### Tickets
### Walkups
Tune in for another episode.
Make sure you have console access to the Linux VM
Record the Network info for the running Linux VM. If not using DHCP, you will need to know the IP, netmask, default route (gateway), and a DNS server.
Download the OpenBSD installation ram disk to /boot
cd /boot
wget http://ftp5.usa.openbsd.org/pub/OpenBSD/6.0/amd64/bsd.rd
Reboot
Enter the grub command prompt by pressing c at the grub menu.
The grub2 prompt has tab completion, which can be helpful.
Type ls to see the available disks.
Load the OpenBSD installation ram disk and boot:
grub> set root=(hd0,msdos1)
grub> kopenbsd /bsd.rd
grub> boot
FreeBSD jails allow users to run multiple, isolated instances of FreeBSD on a single server. Iocage simplifies the management of FreeBSD Jails. Following this tutorial, the jails will be configured to bind to an IP address on the jail host's internal network, and the host OS will pass traffic from the external network to the jail.
The jails will be managed with Iocage. Iocage uses ZFS properties to store configuration data for each jail, so a ZFS file system is required.
These steps will:
PF is a full featured firewall and can do more than just pass traffic to an internal network. Refer to the PF documentation for additional configuration options.
sysrc cloned_interfaces+="lo1"
sysrc ifconfig_lo1="inet 192.0.2.1/24"
sysrc pf_enable="YES"
/etc/pf.conf
# Variables
# ext_if should be set to the hosts external NIC
ext_if = "vtnet0"
jail_if = "lo1"
jail_net = $jail_if:network
# NAT allows the jails to access the external network
nat on $ext_if from $jail_net to any -> ($ext_if)
# Redirect traffic on port 80 to the web server jail
# Add similar rules for additional jails
rdr pass on $ext_if inet proto tcp to port 80 -> 192.0.2.10
Reboot to activate the network changes
The best way to use ZFS on a VPS is to attach block storage as a new disk.
If block storage is not available, you can optionally use a file as the ZFS device.
sysrc zfs_enable="YES"
service zfs start
List the available disks.
If you are using a VPS, the block store will probably be the second disk.
geom disk list
Create a ZFS pool named jailstore.
zpool create jailstore /dev/vtbd1
Create the ZFS file.
dd if=/dev/zero of=/zfsfile bs=1M count=4096
Create a ZFS pool named jailstore.
zpool create jailstore /zfsfile
pkg install py36-iocage
Skip to "Using iocage"
Smaller servers may not have enough RAM to build iocage. If needed, create a swap file and reboot.
dd if=/dev/zero of=/swapfile bs=1M count=1024
echo 'swapfile="/swapfile"' >> /etc/rc.conf
reboot
pkg install subversion python36 git-lite libgit2 py36-pip
svn checkout https://svn.freebsd.org/base/releng/11.1 /usr/src
portsnap fetch
portsnap extract
cd /usr/ports/sysutils/iocage/
make install
iocage activate jailstore
iocage fetch
iocage create -n www ip4_addr="lo1|192.0.2.10/24" -r 11.1-RELEASE
iocage start www
iocage console www
Once you have a shell inside the jail, install and start Apache.
pkg install apache24
sysrc apache24_enable="yes"
service apache24 start
Port 80 on the jail will now be accessible on the host's IP address.
Additional jails can be installed using the example above with the iocage create command, but use a different IP address.

A basic overview of the VPN I use.
There are a few options for the Linux server: free tier cloud providers, or a VPS with free credits ($20-$100) for new accounts (I've gotten discount codes from podcasts).
The VPS requirements for running an OpenVPN server are pretty basic.
The OpenVPN installer is on GitHub. https://github.com/angristan/openvpn-install
On the server as root, run
git clone https://github.com/angristan/openvpn-install.git
./openvpn-install/openvpn-install.sh
---
- hosts: localhost
  tasks:
    - name: read subnet 10
      read_csv:
        path: 10.csv
        fieldnames: mac,ip,hostname
      register: subnet_10
    - name: read subnet 11
      read_csv:
        path: 11.csv
        fieldnames: mac,ip,hostname
      register: subnet_11
    - name: read static
      read_csv:
        path: static.csv
        fieldnames: hostname,ip
      register: static_ip
    - name: write dhcp file
      template:
        src: dhcpd.conf.j2
        dest: /etc/dhcpd.conf
        validate: dhcpd -nc %s
    - name: write local.lan zone file
      template:
        src: local.lan.zone.j2
        dest: /var/nsd/zones/master/local.lan
        owner: root
        group: _nsd
        validate: nsd-checkzone local.lan %s
    - name: nsd_conf
      copy:
        src: nsd.conf
        dest: /var/nsd/etc/nsd.conf
        owner: root
        group: _nsd
        validate: nsd-checkconf %s
    - name: restart nsd
      service:
        name: nsd
        state: restarted
    - name: restart dhcpd
      service:
        name: dhcpd
        state: restarted
    - name: restart unbound
      service:
        name: unbound
        state: restarted
b8:27:eb:8b:7a:6d,192.168.10.100,pi3a
b8:27:eb:ef:f2:d4,192.168.10.101,pi3b
28:10:7b:25:d5:60,192.168.10.79,ipcam3
28:10:7b:0c:fa:7b,192.168.10.80,ipcam1
f0:7d:68:0b:ca:56,192.168.10.81,ipcam2
tplink,192.168.10.2
gate,192.168.10.10
www,192.168.10.10
fox,192.168.10.17
option domain-name "local.lan";
option domain-name-servers 192.168.10.10;
subnet 192.168.10.0 netmask 255.255.255.0 {
option routers 192.168.10.10;
range 192.168.10.161 192.168.10.179;
{% for host in subnet_10.list %}
host static-client { hardware ethernet {{ host.mac }};fixed-address {{ host.ip }};} #{{ host.hostname }}
{% endfor %}
}
subnet 192.168.11.0 netmask 255.255.255.0 {
option routers 192.168.11.10;
range 192.168.11.72 192.168.11.127;
{% for host in subnet_11.list %}
host static-client { hardware ethernet {{ host.mac }};fixed-address {{ host.ip }};} #{{ host.hostname }}
{% endfor %}
}
host static-client { hardware ethernet b8:27:eb:de:2f:38;fixed-address 192.168.10.45;} #pi3a
host static-client { hardware ethernet 28:10:7b:25:d5:60;fixed-address 192.168.10.79;} #ipcam3
host static-client { hardware ethernet 28:10:7b:0c:fa:7b;fixed-address 192.168.10.80;} #ipcam1
$TTL 3600
local.lan. IN SOA a.root-servers.net. root. (
2016092901 ; Serial
3H ; refresh after 3 hours
1H ; retry after 1 hour
1W ; expire after 1 week
1D) ; minimum TTL of 1 day
IN NS gate.
IN MX 50 gate.local.lan.
local.lan. IN A 192.168.10.10
{% for host in static_ip.list%}
{{ host.hostname }} IN A {{ host.ip }}
{% endfor %}
{% for host in subnet_10.list%}
{{ host.hostname }} IN A {{ host.ip }}
{% endfor %}
{% for host in subnet_11.list%}
{{ host.hostname }} IN A {{ host.ip }}
{% endfor %}
pi3b IN A 192.168.10.101
pi3a IN A 192.168.10.45
ipcam3 IN A 192.168.10.79
ipcam1 IN A 192.168.10.80
ansible-playbook hostname-setup.yml
I noticed nagios on the requested topics page. I am far from being an expert with nagios, and there is a lot I do not know. I have a working knowledge of most of the basic nagios principles. So, hopefully, I can give a useful introduction and review some of the principles of nagios along the way.
nagios is a network monitoring tool. You define some things for nagios to check, and nagios will alert you if those checks fail.
Nagios has a web UI that is normally used to see the status of the checks. There are some basic administration tasks you can do from the web UI.
Nagios is primarily configured with text files. You have to edit the nagios config files for things like
NagiosXI is the commercial version of nagios. NagiosXI requires a paid license and includes support. NagiosXI has some extra features including wizards for adding hosts and easy cloning of hosts.
I have used NagiosXI, and personally don't find the extra features very useful. Probably the biggest reason to use NagiosXI is an enterprise environment that requires commercial support.
The community version of nagios is normally referred to as nagios core.
This episode will focus on nagios core.
I don't like the official nagios core documentation. A lot like man pages, it is a good reference, but can be hard to follow.
Maybe it is possible for someone to read the documentation and be able to install and configure nagios for the first time, but it took me a lot of trial and error to get a functional nagios server following the nagios documentation.
Outside of the official documentation, most of the nagios installation guides I found online recommend downloading and building nagios from the nagios site. My general policy is to use OS provided packages whenever possible. Normally, sticking to packages eases long-term maintenance.
You may not always get the latest feature release, but installation and updates are usually easier. I know not everyone will agree with me here and will want to build the latest version. Regardless of the install method, most of the nagios principles I go over will still apply.
I am making the assumption that most listeners will be most familiar with Debian/Ubuntu, so I will go over installing nagios on Ubuntu using the nagios packages from the Ubuntu repository.
Before I go over the installation, I'll talk a bit about some of the pieces that make up nagios. Nagios checks are for either hosts or services.
From the Nagios documentation
A host definition is used to define a physical server, workstation, device, etc. that resides on your network.
Also from the nagios documentation
A service definition is used to identify a "service" that runs on a host. The term "service" is used very loosely. It can mean an actual service that runs on the host (POP, SMTP, HTTP, etc.) or some other type of metric associated with the host
Normally, hosts are checked using ping. If the host responds to the ping within the specified time frame, the host is considered up. Once a host is defined and determined to be UP, you can optionally check services on that host.
Install the packages: apt install nagios4
One of the dependencies is the monitoring-plugins package. I'll talk more about monitoring-plugins when we dig in to the checks.
The primary UI for nagios is a cgi driven web app, usually served via apache. Following the nagios4 installation, the web UI isn't functional, so we need to make a few configuration changes.
The nagios config file for apache contains a directive that is not enabled by default. Enable 2 Apache modules:
a2enmod authz_groupfile
a2enmod auth_digest
systemctl restart apache2
In /etc/nagios4/cgi.cfg
change the line
'use_authentication=0'
to
'use_authentication=1'
In /etc/apache2/conf-enabled/nagios4-cgi.conf
change
Require all granted
to
Require valid-user
And if needed, remove the IP restriction by removing the line that starts with
Require ip
And finally, we need to add a nagios basic auth user. I normally use nagiosadmin, but it can be any username.
htdigest -c /etc/nagios4/htdigest.users Nagios4 nagiosadmin
Restart apache and nagios, and the nagios UI will be fully functional.
Nagios uses a collection of small standalone executables to perform the checks. Checks are either OK, Warning, or Critical, depending on the exit code of the check.
Exit Code | Status |
---|---|
0 | OK/UP |
1 | WARNING |
2 | CRITICAL |
The check commands are standalone applications that can be run independently from nagios. Running the checks from the shell is helpful to better understand how the nagios checks work. The location of the check commands can vary depending on how nagios was packaged. In this case, they are in /usr/lib/nagios/plugins
Looking at the names of the files can give you an idea of their purpose. For example, it should be obvious what check_http
and check_icmp
are for.
cd /usr/lib/nagios/plugins
$ ./check_icmp localhost
OK - localhost: rta 0.096ms, lost 0%|rta=0.096ms;200.000;500.000;0; pl=0%;40;80;; rtmax=0.218ms;;;; rtmin=0.064ms;;;;
$ ./check_http localhost
HTTP OK: HTTP/1.1 200 OK - 10977 bytes in 0.005 second response time |time=0.004558s;;;0.000000;10.000000 size=10977B;;;0
Most checks can be run with -h
to print usage help.
The checks can be in any language, as long as they are executable by the nagios server. Many are compiled C, but Perl and shell scripts are also common.
file check_icmp
check_icmp: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=46badf6e4322515a70d5553c8018a20e1e9b8206, for GNU/Linux 3.2.0, stripped
The primary nagios config file is /etc/nagios4/nagios.cfg
nagios.cfg has a directive that will load additional user generated files: cfg_dir=/etc/nagios4/conf.d
I like to put all my additions to nagios in this directory and use git for both version control and backup.
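A minimal sketch of that workflow. A scratch directory stands in for /etc/nagios4/conf.d here so the example is safe to run anywhere, and the example host definition is made up:

```shell
# Keep nagios config additions under git; CONF_DIR stands in
# for /etc/nagios4/conf.d so the sketch can run anywhere
CONF_DIR="$(mktemp -d)"
printf 'define host {\nhost_name example\nuse generic-host\n}\n' > "$CONF_DIR/example.cfg"
cd "$CONF_DIR"
git init -q .
git add .
# identity set inline so the demo works without a global git config
git -c user.name=nagios -c user.email=nagios@localhost commit -q -m "initial config snapshot"
git log --oneline
```

On a real server you would run the same commands once inside /etc/nagios4/conf.d, then commit after each change.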
Nagios doesn't run the check executables directly. The checks have to be explicitly defined as a command. Some predefined commands are in /etc/nagios4/objects/commands.cfg
Debian package monitoring-plugins-basic
contains several command definitions that are loaded by nagios.cfg via cfg_dir=/etc/nagios-plugins/config
Let's look in /etc/nagios-plugins/config at ping.cfg
for an example of how commands are defined.
# 'check-host-alive' command definition
define command{
command_name check-host-alive
command_line /usr/lib/nagios/plugins/check_ping -H '$HOSTADDRESS$' -w 5000,100% -c 5000,100% -p 1
}
Commands require command_name
and command_line.
The command_line is the path to the executable that will perform the check, plus optional arguments. Most checks require -H
for the host address to check. The check-host-alive command also contains arguments to set the critical and warning thresholds with -c
and -w.
The check_ping command is similar to the check-host-alive command, except it requires 2 arguments to set the critical and warning thresholds.
define command{
command_name check_ping
command_line /usr/lib/nagios/plugins/check_ping -H '$HOSTADDRESS$' -w '$ARG1$' -c '$ARG2$'
}
Hosts and services require a lot of reused variables. Object definitions normally use templates to avoid having to repetitively set the same variables on each host. Nagios normally ships with predefined templates for hosts and services that will work for most cases.
In Ubuntu, the templates are defined in /etc/nagios4/objects/templates.cfg
. Template definitions are the same as other object definitions, except they contain register 0,
which designates the object as a template. I'll show how the templates are used when I go over the host and service definitions.
By default, notifications are sent via email to nagios@localhost. The easiest way to get notifications is to configure the nagios server to forward emails to a monitored email address. Since many networks block sending email directly via SMTP, email forwarding may be challenging.
In a follow up episode I will cover setting up postfix to relay mail through a mail sending service and maybe some other methods for sending alerts
By default, nagios is set to monitor localhost. Having checks for the nagios server itself can be useful, but you probably want to add some additional servers.
Have a look at /etc/nagios4/objects/localhost.cfg
if you want to see how the checks for localhost are defined
We will use google.com as an example and create a file named google.cfg
and place it in the cfg_dir /etc/nagios4/conf.d
.
The files can be named anything that ends in .cfg
. My preference is one file per host that contains all the checks for that host. The content of google.cfg
is included near the end of the show notes.
First, we need to define the host. host_name
is the only field required to be set. The remaining requirements are met by using the generic-host
template.
We can add a service check to google.com using the same file. The easiest to add is an http check. host_name
, service_description
, and check_command
have to be set. The remaining requirements are met by using the generic-service
template.
nagios has to be reloaded to pick up the configuration changes. Prior to restarting nagios, you can verify the nagios configuration is valid by running: nagios4 -v /etc/nagios4/nagios.cfg
This will print a summary of the configuration. Any warnings or errors will be printed at the end.
Warnings are not fatal but should probably be looked at. Errors will keep nagios from restarting. If there are no errors, it is safe to restart nagios.
Check the nagios UI at http://SERVER_IP/nagios4 and you should see 2 hosts, localhost and google.com, as well as the service checks for the hosts.
Since I have already made the mistake of mentioning a follow up episode, I know I am now committed to making an additional episode. Next time I will try to cover some enhancements to nagios, including
Leave a comment If there are other aspects of nagios you would like me to try to cover. No promises, but I will do my best.
Thanks for listening and I will see you next time.
---
- hosts: nagios
  tasks:
    - name: install nagios
      apt:
        name:
          - nagios4
        update_cache: yes
    - name: Enable the Apache2 modules
      command: a2enmod "{{item}}"
      with_items:
        - authz_groupfile
        - auth_digest
    - name: modify nagios cgi config to require user
      replace:
        path: /etc/nagios4/cgi.cfg
        regexp: 'use_authentication=0'
        replace: 'use_authentication=1'
    - name: nagios require valid user
      replace:
        path: /etc/apache2/conf-enabled/nagios4-cgi.conf
        regexp: "Require all granted"
        replace: "Require valid-user"
    - name: remove IP restriction
      lineinfile:
        regexp: "Require ip"
        path: /etc/apache2/conf-enabled/nagios4-cgi.conf
        state: absent
    - name: move auth requirements out of File restrictions
      lineinfile:
        path: /etc/apache2/conf-enabled/nagios4-cgi.conf
        regexp: '^\s*<\/?Files'
        state: absent
    - name: nagios user
      copy:
        dest: /etc/nagios4/htdigest.users
        src: htdigest.users
    - name: restart apache
      service:
        name: apache2
        state: restarted
    - name: copy nagios configs
      copy:
        src: "{{item}}"
        dest: /etc/nagios4/conf.d
      with_items:
        - google.cfg
    - name: restart nagios
      service:
        name: nagios4
        state: restarted
define host {
host_name google.com
use generic-host
}
define service {
use generic-service
host_name google.com
service_description HTTP
check_command check_http
}
nagiosadmin:Nagios4:85043cf96c7f3eb0884f378a8df04e4c
I did not get any feedback on my first nagios episode, so I can only assume that I perfectly explained what nagios is. And my installation instructions were so good that no one had any questions. So I will move on to some additional nagios topics.
One thing I meant to talk about in the intro but forgot is why you might want to run nagios as a hobbyist.
Most of the benefits of nagios are not specific to nagios. There are plenty of other options for monitoring, and all of them are worth exploring.
I had planned on discussing how to set up postfix to send emails. But that is such a big topic I will have to skip it. I will instead talk about what I do to send email, and maybe you can do something similar.
Spammers have ruined the ability to directly send email. Most residential ISPs block port 25 outbound to prevent malware from sending email. Some Virtual hosting providers may not block sending mail, but many mail servers will not accept mail from VPS IP ranges.
There are a few ways to get around this problem. I use the email delivery service Sendgrid.
They do all the work of staying off the list of spammers, and most email providers trust mail sent via Sendgrid.
I won't go into the instructions for configuring postfix to relay outgoing mail via Sendgrid, but their documentation is easy to follow.
There are plenty of services like Sendgrid, and most have a free tier. So unless you are blasting out alerts, you probably will not have to pay. If you want to send alerts from nagios via email, I recommend finding an email sending service that works for you.
There are a few options (besides email) for getting alerts on your phone.
The easiest way to get alerts is probably the aNag
Android app. aNag connects to the nagios UI to get status updates. It can be configured to check in periodically and then generate notifications for failed checks.
One downside to aNag is the phone has to be able to connect to the nagios server. So, if nagios is on a private network, you will need a VPN when you are not on the same network.
If you decide to put nagios on a public network, be sure to configure apache to only use HTTPS.
certbot
makes this really easy.
Another option is to use a Push Notification service that can send notifications triggered by API calls.
I like to use pushover.net. You pay $5 when you download the pushover app from the app store, and then notifications are sent for free.
They offer a 30 day trial if you want to evaluate the service.
To use pushover, we will add a new contact to nagios. The command for the pushover contact is a script that calls the pushover API via curl.
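Under the hood, that script wraps a single HTTP POST to the Pushover messages API with your user key, app token, and the message text. A safe-to-run sketch (the keys and message below are placeholders, and the call is only printed, not sent):

```shell
# Sketch of the single API call behind a pushover notification script.
# USER_KEY and APP_KEY are placeholders for your own Pushover keys.
USER_KEY="your-user-key"
APP_KEY="your-app-key"
MESSAGE="PROBLEM Host web1 DOWN"
# Built as a string and echoed so the sketch is safe to run offline;
# on a live system, run the curl command itself.
CMD="curl -s --form-string token=$APP_KEY --form-string user=$USER_KEY --form-string message='$MESSAGE' https://api.pushover.net/1/messages.json"
echo "$CMD"
```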
Remember from the previous episode, nagios has a conf.d
directory and will load any files in that directory. So we will create a new file /etc/nagios4/conf.d/pushover.cfg
and restart nagios. The contents of the pushover file will be in the show notes.
To use pushover for specific checks, add the contact to that check. See the example in the show notes. Or, if you want to use pushover for everything, modify the definitions for the host and service templates to use pushover as a contact.
The script that calls the Pushover API is at https://github.com/jedda/OSX-Monitoring-Tools/blob/master/notify_by_pushover.sh
Save a copy of the script in the nagios plugins directory.
pushover.cfg
# 'notify-host-pushover' command definition
define command{
command_name notify-host-pushover
command_line $USER1$/notify_by_pushover.sh -u $CONTACTADDRESS1$ -a $CONTACTADDRESS2$ -c 'persistent' -w 'siren' -t "Nagios" -m "$NOTIFICATIONTYPE$ Host $HOSTNAME$ $HOSTSTATE$"
}
# 'notify-service-pushover' command definition
define command{
command_name notify-service-pushover
command_line $USER1$/notify_by_pushover.sh -u $CONTACTADDRESS1$ -a $CONTACTADDRESS2$ -c 'persistent' -w 'siren' -t "Nagios" -m "$HOSTNAME$ $SERVICEDESC$ : $SERVICESTATE$ Additional info: $SERVICEOUTPUT$"
}
define contact{
name generic-pushover
host_notifications_enabled 1
service_notifications_enabled 1
host_notification_period 24x7
service_notification_period 24x7
service_notification_options w,c,r
host_notification_options d,r
host_notification_commands notify-host-pushover
service_notification_commands notify-service-pushover
can_submit_commands 1
retain_status_information 1
retain_nonstatus_information 1
contact_name Pushover
address1 {{ pushover_user_key }}
address2 {{ pushover_app_key }}
}
One of the big advantages of nagios is the ability to write custom checks. In the previous episode, I mentioned that the status of the nagios checks are based on exit code.
Exit Code | status |
---|---|
0 | OK/UP |
1 | WARNING |
2 | CRITICAL |
So, to write a custom check, we need a script that will perform a check, and exit with an exit code based on the results of the check.
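As a sketch, here is a hypothetical custom check that grades the 1-minute load average. The thresholds are made up for illustration; nagios only cares about the status line and the exit code:

```shell
# Minimal custom-check skeleton: print a status line and return 0/1/2.
# The load thresholds (8 and 16) are illustrative, not recommendations.
check_load() {
    load=$(cut -d' ' -f1 /proc/loadavg)
    if awk -v l="$load" 'BEGIN{exit !(l < 8)}'; then
        echo "OK - load is $load"; return 0
    elif awk -v l="$load" 'BEGIN{exit !(l < 16)}'; then
        echo "WARNING - load is $load"; return 1
    else
        echo "CRITICAL - load is $load"; return 2
    fi
}
STATUS=0
check_load || STATUS=$?
echo "nagios would see exit code $STATUS"
```

Drop a script like this in the plugins directory, make it executable, and define a command for it, and it behaves like any stock check.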
I have a server where occasionally the syslog daemon stops running.
Instead of trying to figure out why syslog keeps crashing, I wrote a script to check that the log file is being updated. The script looks for the expected log file and tests that it has been modified in the last few minutes. The script will:
Since the server with the crashy syslog is not the same server running nagios, I need a way for nagios to execute the script on the remote server.
Nagios has a few ways to run check commands on remote servers. I prefer to use ssh, but there are some disadvantages to using ssh. Specifically the resources required to establish the ssh connection can be heavier than some of the other remote execution methods.
The check_by_ssh
plugin can be used to execute check commands on another system. Typically, ssh-key authentication is set up so the user that is running the nagios daemon can log in to the remote system without a password.
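The key setup can be sketched like this. A scratch directory stands in for the nagios user's home so the demo is harmless, and RemoteUser/RemoteHost are placeholders:

```shell
# Generate a passwordless key for the user the nagios daemon runs as.
# NAGIOS_HOME is a scratch stand-in; use the real ~nagios/.ssh in production.
NAGIOS_HOME="$(mktemp -d)"
mkdir -p "$NAGIOS_HOME/.ssh"
ssh-keygen -q -t ed25519 -N "" -f "$NAGIOS_HOME/.ssh/id_ed25519"
ls "$NAGIOS_HOME/.ssh"
# Then push the public key to the monitored host (placeholders):
# ssh-copy-id -i "$NAGIOS_HOME/.ssh/id_ed25519.pub" RemoteUser@RemoteHost
```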
You can try the command to make sure it is working.
cd /usr/lib/nagios/plugins
./check_by_ssh -H RemoteHost -u RemoteUser \
-C /path/to/remote/script/check_log_age.sh
The new command can be added to a file in the nagios conf.d directory
define command {
command_name check_syslog_age
command_line $USER1$/check_by_ssh -H '$HOSTADDRESS$' -u RemoteUser -C /remote/path/check_log_age.sh
}
After adding the command definition, check_syslog_age
can be added as a service check.
The Log Check script:
#!/usr/bin/bash
TODAY=$(date +%Y%m%d)
LOGPATH="/syslog"
TODAYSLOG="$TODAY.log"
if test `find "$LOGPATH/$TODAYSLOG" -mmin -1`
then
echo OK
exit 0
elif test `find "$LOGPATH/$TODAYSLOG" -mmin -10`
then
echo WARNING
exit 1
else
echo CRITICAL
exit 2
fi
SNMP can get complicated, and I have mixed feelings about using it. I am not going to go into the SNMP versions or the different authentication options for SNMP, but I will show a minimal setup that allows some performance data to be checked by nagios.
The SNMP authentication that I am demonstrating is only appropriate for isolated networks. If you plan to use snmp over a public network, I recommend looking into more secure versions of SNMP or tunnelling the check traffic via ssh or a VPN.
If you want to learn more about SNMP, I recommend "SNMP Mastery" by Michael W Lucas. https://www.tiltedwindmillpress.com/product/snmp-mastery/
First we need to configure the client to respond to SNMP requests. On Ubuntu: apt install snmpd
By default, snmpd listens on localhost. Replace the existing snmpd.conf with this example to set a read only community string and listen on all IP addresses.
And don't forget, I do not recommend this for a public network. Restart snmpd and open port 161 if there is a firewall enabled.
agentAddress udp:161,udp6:[::1]:161
rocommunity NEW_SECURE_PASSWORD disk /
The nagios plugin package installs several pre-defined snmp checks in /etc/nagios-plugins/config/snmp.cfg
Look through the file to get an idea of the checks that can be performed via SNMP.
Below is an example of a client configuration that uses SNMP. If you look at the command definitions, most of them have an option to accept arguments to modify how the check is done. The argument placeholders are represented by $ARG1$
In most cases, the arguments are optional.
This particular SNMP check for disk space requires an argument to complete the disk ID being checked.
When the service check is defined, the arguments are separated by !
You can also see in the example how you can override settings like contacts, max_check_attempts, and check_interval per service.
define host {
host_name ServerIP
use linux-server
}
define service {
use generic-service
host_name ServerIP
contacts Pushover
max_check_attempts 1
check_interval 1
service_description DISK
check_command snmp_disk!NEW_SECURE_PASSWORD!1!1 # first arg is disk number
# command in /etc/nagios-plugins/config/snmp.cfg
}
define service {
use generic-service
host_name ServerIP
contacts Pushover
service_description LOAD
check_command snmp_load!NEW_SECURE_PASSWORD
# command in /etc/nagios-plugins/config/snmp.cfg
}
define service {
use generic-service
host_name ServerIP
service_description Memory
check_command snmp_mem!NEW_SECURE_PASSWORD
# command in /etc/nagios-plugins/config/snmp.cfg
}
define service {
use generic-service
host_name ServerIP
service_description Swap
check_command snmp_swap!NEW_SECURE_PASSWORD
# command in /etc/nagios-plugins/config/snmp.cfg
}
Nagios has plugins that can check if there are system updates required.
The check plugin is installed on the remote server. The plugin for Debian based systems is nagios-plugins-contrib
or nagios-plugins-check-updates
for Red Hat based systems.
The command definitions are below. Since the plugins take longer to run, you will probably need to modify the nagios plugin timeout.
define command {
command_name check_yum
command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -t 120 -u root -C "/usr/lib64/nagios/plugins/check_updates -t120"
}
define command {
command_name check_apt
command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -t 120 -u nagios-ssh -C "/usr/lib/nagios/plugins/check_apt -t60"
}
That's probably all the nagios I can handle for now. Leave a comment if there are nagios topics you would like to hear about. Thanks for listening, and I will see you next time.
I have been listening to HPR since the beginning. I can't say I have listened to every episode, because I do skip a few. But I have listened to most of the HPR episodes.
One of my favorite correspondents was Mr Gadgets. If you want to look him up, he was Host ID 155. If you haven't been listening to HPR for long, you should go back and find some of Mr Gadgets' episodes. You will consider it time well spent.
Besides interesting topics, Mr Gadgets was a good storyteller. Even though his episodes were unedited, they still moved the listener along without interruption.
I can't do that. When I speak off the cuff, I end up using a lot of filler words, and I have to pause frequently to think about what I will say next. I can't just start talking and have an HPR episode fall out.
For the first few episodes I made, I tried just talking into a recorder. I kept stumbling over my words and repeating myself multiple times to get out what I was trying to say.
I ended up with about an hour of audio that I edited down to about a 15 minute HPR episode. It took me a couple of hours in audacity removing all the repeated sentences and filler words like UM and you know.
After I made a few episodes using the method where I would just talk for a long time and create a giant recording that had to be edited, I thought it would be easier to record in small chunks, so I could stop the recording and think about what I wanted to say between chunks of audio.
I looked for a way to record short audio segments and stitch them together later. I know this is possible with audio editors. I tried recording into audacity, stopping the recording to review my notes and think about what I wanted to say next.
Audacity worked, but I had a hard time working with it. It's not you, it's me.
I started thinking about what the ideal tool for me to record HPR episodes would be. I decided what I wanted was a tool designed to record audio in short segments while also presenting a script. I looked for an existing tool that did this, but I could not find anything. It didn't seem too hard to implement, so I took a stab at writing some Python.
I will have a copy of the script, solocast.py, in the show notes. I will go over how to use the script, and then talk a bit about some of the script code. The dependencies are pretty minimal: the script uses sox to record and process the audio, and the Python click module to process the command line arguments.
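The four commands described below map naturally onto a click command group. This is only a sketch of how such a CLI might be wired up, with placeholder bodies; it is not the actual solocast code.

```python
import click


@click.group()
def cli():
    """A solocast-style CLI: record a scripted episode in short segments."""


@cli.command()
def silence():
    """Generate noise profile."""
    click.echo("recording 5 seconds of silence...")


@cli.command()
def review():
    """Print segments."""
    click.echo("printing segments...")


@cli.command()
def record():
    """Record next unrecorded segment."""
    click.echo("recording next segment...")


@cli.command()
def combine():
    """Combine segments into a single audio file."""
    click.echo("combining segments...")


if __name__ == "__main__":
    cli()
```

With this structure, `./solocast.py --help` lists the commands with their docstrings, which is roughly the usage output shown below.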
When solocast starts, it looks for and loads a file named script.txt. solocast breaks the script into segments; the segments are split at empty lines.
The segments only have to contain as much information as you need. I like to have most of the words I plan to say written in the script. You may prefer only having bullet points, or even just the topic.
I like to keep the segments short. That was the whole point of the project, to record several short segments. For me, 1-2 minutes of audio is a good goal for segment length.
For the last few episodes I recorded, I wrote the solocast script in markdown, so the script could double as show notes.
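The split-at-empty-lines step is simple to picture in Python. This is my guess at the shape of that code, not the actual solocast implementation:

```python
def load_segments(script_text):
    """Split a script into segments at runs of empty lines.

    Each segment is one block of text the tool will prompt you to read.
    """
    segments = []
    current = []
    for line in script_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # An empty line closes the current segment.
            segments.append("\n".join(current))
            current = []
    if current:
        segments.append("\n".join(current))
    return segments
```

Because only blank lines matter, a markdown script with headings and bullets splits into segments just as well as plain prose.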
To use the script, run ./solocast.py followed by a command.
Commands:
combine Combine Segments into single audio file
record Record next unrecorded segment
review Print segments
silence Generate noise profile
The silence option records 5 seconds of audio used to generate a noise profile. When I first started working on this project, I was using a bad USB headset as a microphone, and therefore there was a lot of line noise in the recording. I figured out that sox could replicate my normal process of using Audacity to remove noise.
The noise removal in sox works by first generating a noise profile from a bit of silent audio. sox can then use the profile to subtract the noise from the audio. The sox noise removal process worked surprisingly well, and it is something I would consider a killer feature of sox.
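The two sox invocations involved are short enough to sketch as a Python wrapper around subprocess. The file names here are placeholders, and the 0.2 reduction amount is just an illustrative value, not necessarily what solocast uses:

```python
import subprocess


def noiseprof_cmd(silence_wav, profile="noise.prof"):
    # sox reads the silence recording and writes a noise profile;
    # "-n" is sox's null output device, since no audio file is produced.
    return ["sox", silence_wav, "-n", "noiseprof", profile]


def noisered_cmd(in_wav, out_wav, profile="noise.prof", amount="0.2"):
    # Apply the profile to subtract the profiled noise from a recording.
    return ["sox", in_wav, out_wav, "noisered", profile, amount]


def denoise(in_wav, out_wav):
    # Runs: sox in.wav out.wav noisered noise.prof 0.2
    subprocess.run(noisered_cmd(in_wav, out_wav), check=True)
```

The same two commands can of course be run by hand from a shell to experiment with the reduction amount.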
I wrote solocast.py to enforce the use of the noise profile. The noise profile can be generated manually by running solocast silence. If you try to record audio and a noise profile does not already exist, solocast will jump to the function in the code that records the profile.
Running solocast review will present the script segments like when recording, but without recording audio. This allows you to rehearse the script without recording anything. This step is not required, but hopefully useful.
solocast uses the first 40 characters of the content of each segment to determine the segment names. When you run solocast with the record option, the script looks for the first segment that does not have an associated recording. The text of that segment is printed, and you will be prompted to press Enter to start recording.
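One plausible way to turn the first 40 characters of a segment into a recording file name looks like this; the sanitizing details are my guess, not necessarily what solocast does:

```python
def segment_filename(segment_text, ext="wav"):
    """Name a recording after the first 40 characters of its segment."""
    head = segment_text[:40]
    # Keep the name filesystem-friendly: letters and digits survive,
    # everything else becomes an underscore.
    safe = "".join(c if c.isalnum() else "_" for c in head)
    return f"{safe}.{ext}"
```

Deriving the name from the segment text is what lets the combine step later sort the recordings back into script order without any extra bookkeeping.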
The script launches the rec command and records audio from the default sound input. Press CTRL-C when you are finished recording.
Near the top of the script is an option to set the recording file format. You can set it to anything that sox can recognize. I just use wav.
After recording a segment, you are prompted with options for it:
(p)lay
(a)ccept
(r)ecord again
(t)runcate
When you are done, run solocast record again to record the next segment. The files are saved in a directory named Recordings that is created when you first run solocast.
When you have recorded all the segments, the next step is to combine the recorded segments into one file. Since solocast used the script to generate the segment recording file names, it can also use the script to put the segments in order.
solocast combine does the obvious step of combining the segments, but it also applies the noise reduction using the silence profile. The resulting file is combined.wav.
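Concatenating the segments in script order is a single sox call, since sox joins its input files in argument order. A sketch with subprocess, assuming the ordered list of segment files is already known:

```python
import subprocess


def combine_cmd(segment_files, out_file="combined.wav"):
    # sox concatenates its inputs in argument order into one output file:
    # sox seg1.wav seg2.wav ... combined.wav
    return ["sox", *segment_files, out_file]


def combine(segment_files, out_file="combined.wav"):
    subprocess.run(combine_cmd(segment_files, out_file), check=True)
```

Noise reduction could then be applied to the combined file in one extra pass rather than per segment.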
The last few times I used solocast, I have opened the final combined file in Audacity to check the waveform for overall loudness, and I scan the waveform for spikes that may be clicks I want to edit out. Otherwise, the combined file is ready for submission.
Future plans:

* GitLab
* rewrite with the pysox library
* automate final review steps
* PyPI
* add optional recording tool, GUI??
* HPR Feedback episode