Every now and then, a host in your Zabbix system will become unavailable. This results in notifications like this:
Trigger: Zabbix agent on myhost is unreachable for 5 minutes
Trigger status: PROBLEM
And logs like this:
1461:20140808:074300.517 Zabbix agent item "vfs.fs.size[/home/deploy/sites/mc/shared/voip,free]" on host "myhost" failed: first network error, wait for 15 seconds
1466:20140808:074345.134 temporarily disabling Zabbix agent checks on host "myhost": host unavailable
What could be the cause? No, it’s not that the host could not be reached. That would be too easy.
Sooner or later, you will need to send a value to Zabbix from a script. zabbix_sender comes to the rescue:
zabbix_sender -z "server IP address" -p 10051 -s "host in zabbix" -k "item key" -o "item value"
But what happens if, for some reason, Zabbix won’t accept the value (say, because of a wrong item type)? It’ll just fail (non-zero exit code), without any error message. Well, let’s try to debug (notice the added “-vv”):
zabbix_sender -vv -z "server IP address" -p 10051 -s "host in zabbix" -k "item key" -o "item value"
Info from server: "Processed 1 Failed 1 Total 1 Seconds spent 0.003079"
sent: 1; skipped: 0; total: 1
Oh wow, that’s helpful. I already know it failed. I want to know why.
“– Is my server up?
– I don’t know.”
Right. Exactly what I need from a monitoring system.
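The reply does not say why a value was rejected, but a calling script can at least notice the rejection. Below is a minimal sketch, assuming the reply contains the word "Failed" followed by a count, as in the output above. The helper names failed_count and send_checked are made up for this example and are not part of Zabbix.

```shell
#!/bin/sh
# Extract the "Failed" counter from a zabbix_sender -vv reply read on stdin.
# Assumption: the reply contains "Failed" (any case of the first letter,
# optional colon) followed by a number, as in the sample output above.
failed_count() {
    sed -n 's/.*[Ff]ailed:\{0,1\} *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical wrapper: pass all arguments to zabbix_sender and print the
# full reply on stderr if the server rejected at least one value.
send_checked() {
    reply=$(zabbix_sender -vv "$@" 2>&1)
    status=$?
    failed=$(printf '%s\n' "$reply" | failed_count)
    if [ -n "$failed" ] && [ "$failed" -gt 0 ]; then
        echo "zabbix_sender: server rejected $failed value(s): $reply" >&2
        return 1
    fi
    # No counter found or nothing rejected: fall back to the exit status.
    return "$status"
}
```

It would be called with the same arguments as before, for example: send_checked -z "server IP address" -p 10051 -s "host in zabbix" -k "item key" -o "item value". This still won’t tell you why the value was refused, only that it was.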
The most used notification type in Zabbix (and likely in other systems) is email. So, suppose you’ve installed Zabbix server and want to enable the alerts. There’s only one minor issue: your SMTP server runs on a non-standard port. For whatever reason. This can’t be a problem, can it?
A Zabbix database can grow considerably over time, especially if you use the default templates, where many items are checked at an interval of 30 and aren’t even used in any triggers.
The biggest database I have had to manage so far was about 80 GB. When I tried to upgrade, it took a day or two to run all the scripts. Of course, having to back up 80 GB of SQL data does not add to my enjoyment either.
So, here it goes: a few scripts, some copied from another person’s repo, some written by myself: github link
With that, the database shrank from 80 GB to 15 GB, quite an improvement. And the upgrade finished in a few hours.
NOTE: due to the way databases work (MySQL in particular), running these scripts won’t reduce the Zabbix db size if it’s already bloated. You will have to dump and reload the db after that. What the scripts do is keep its size more or less constant if you run them regularly.
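To illustrate the general idea of regular pruning (this is not the code from the linked repo), here is a rough sketch that prints DELETE statements for old history rows. It assumes the standard Zabbix history tables with an integer "clock" column holding Unix time. The function name prune_sql and the retention period are made up for the example.

```shell
#!/bin/sh
# Print DELETE statements that drop history rows older than N days.
# Assumptions: standard Zabbix history tables, each with a "clock" column
# in Unix time. The optional second argument overrides "now" (for testing).
# On a large table you would likely want to delete in smaller batches.
prune_sql() {
    days=$1
    now=${2:-$(date +%s)}
    cutoff=$((now - days * 86400))
    for table in history history_uint history_str history_text history_log; do
        echo "DELETE FROM $table WHERE clock < $cutoff;"
    done
}
```

Something like prune_sql 30 | mysql zabbix (assuming the database is named zabbix) could then be run from cron. Review the printed statements and try them on a copy of the database first.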
Every item in Zabbix is configured to store its history and trends for its own individual period. The process that cleans up outdated data is the Housekeeper. If you monitor at least a hundred hosts, you probably know about it already, from having been bitten by its performance. It housekeeps and housekeeps and housekeeps, wasting CPU cycles and increasing the entropy of the Universe.