Zabbix housekeeper woes

Every item in Zabbix is configured to store its history and trends for some individual period. The process that cleans outdated data is Housekeeper. Although if you monitor at least a hundred of hosts, you probably know about it already. From been bitten by its performance. It housekeeps and housekeeps and housekeeps, wasting CPU cycles and increasing the entropy of the Universe.

Really, it’s not that much of a task – look up item’s storage intervals, delete everything too old, repeat. Ideally, a user doesn’t even need to know about its existence. Why is it that google search yields 15000 results for “zabbix housekeeper”, then? It’s beyond my understanding. Housekeeper is a certified resource hog. An utter and complete failure of the developers.

Fortunately, it’s possible to disable it. But then we need to clean up old data somehow. Mostly people suggest database partitioning at this point. See Zabbix wiki for an example. If you run a really large environment (say, a thousand hosts), this might be your only choice. Looks a bit scary? Well, if you have just a few dozen or a few hundred hosts, probably you’ll be able to get away with a few SQL DROP queries. Of course, choosing that way, you lose the ability to control individual item history retain intervals. It’s worth it.

So let’s say we want to keep items history, alerts, acknowledges, events for a week and keep the trends for 3 months. Here’s how to do it in Postgresql:

delete FROM alerts where age(to_timestamp(alerts.clock)) > interval '7 days';
delete FROM acknowledges where age(to_timestamp(acknowledges.clock)) > interval '7 days';
delete FROM events where age(to_timestamp(events.clock)) > interval '7 days';
delete FROM history where age(to_timestamp(history.clock)) > interval '7 days';
delete FROM history_uint where age(to_timestamp(history_uint.clock)) > interval '7 days';
delete FROM history_str  where age(to_timestamp(history_str.clock)) > interval '7 days';
delete FROM history_text where age(to_timestamp(history_text.clock)) > interval '7 days';
delete FROM history_log where age(to_timestamp(history_log.clock)) > interval '7 days';
delete FROM trends where age(to_timestamp(trends.clock)) > interval '90 days';
delete FROM trends_uint where age(to_timestamp(trends_uint.clock)) > interval '90 days';

That’s it. Be happy and keep your Housekeeper disabled.

Update 2014.05.20: proceed to the github repo for Mysql version and other goodies.

Comments