One concept I had to adjust to was deciding on the resolution of the data I wanted to collect. By resolution I mean how detailed the data is. Take CPU usage as an example. If I look at the last 1 hour versus the last 30 days, I will see the data at different levels of detail. My 1 hour window can show me every single data point (one every 5 sec). Every peak and dip is exactly what it was. Each pixel is 5 sec and I have 720 data points (60 min at 12 points a min).
When we look at the 30 days, we cannot fit that much detail on the screen at once. My 30 day window will show data points at 60 minute ticks. If the CPU slowly goes up and down, that's not an issue. But that is not often the case. CPU usage is very spiky, going up and down very quickly at times. If you only take a reading every 60 min, it's anyone's guess whether you read a spike or a dip.
In this case I would record the min, max, and average values for each 60 minute tick. When you chart that, the max will be the spikes to show you how hard it gets pushed. The min will show you if it ever gets to idle.
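To see why a single hourly reading can mislead, here is a small Python sketch. It is not rrdtool itself, and the CPU trace is invented for illustration (5% idle with a one-sample 100% spike every 500 samples). It compares one reading per hour against the min, max, and average of each hour.

```python
STEP_SECONDS = 5
SAMPLES_PER_HOUR = 3600 // STEP_SECONDS  # 720 samples in one hour


def cpu_sample(i):
    """Invented trace: idle at 5%, with a 100% spike on every 500th sample."""
    return 100.0 if i % 500 == 0 else 5.0


# Three hours of 5 second samples.
samples = [cpu_sample(i) for i in range(SAMPLES_PER_HOUR * 3)]


def consolidate(values, size):
    """Return (min, max, average) for each consecutive bucket of `size` values."""
    result = []
    for start in range(0, len(values), size):
        bucket = values[start:start + size]
        result.append((min(bucket), max(bucket), sum(bucket) / len(bucket)))
    return result


# One (min, max, avg) tuple per hour.
hourly = consolidate(samples, SAMPLES_PER_HOUR)

# One raw reading per hour, taken at the last sample of each hour.
point_reads = samples[SAMPLES_PER_HOUR - 1::SAMPLES_PER_HOUR]

print("single hourly readings:", point_reads)
for hour, (low, high, avg) in enumerate(hourly):
    print(f"hour {hour}: min={low} max={high} avg={avg:.2f}")
```

With this trace, every single hourly reading lands on an idle sample, so the spikes vanish completely, while the max of each hour still reports 100.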
When you create an rrdtool database, you define all those options up front. You define a data source and indicate how often its value will be recorded. Then you define one or more round robin archives (RRAs) that keep that data at different resolutions. You still just record the CPU every 5 sec, and rrdtool keeps track of those data windows and resolutions for you.
We can define a database that gets a value every 5 sec for our CPU dataset. We can define a detailed resolution covering 1 hour and a second one covering 30 days. The first RRA records a value every tick for 720 ticks. Each tick is 5 sec, so that's 1 hour. The RRA would be "RRA:AVERAGE:0.5:1:720". The second RRA averages 720 values (1 hour of 5 sec ticks) into each row and keeps 720 rows, which is 720 hours or 30 days. That RRA is "RRA:AVERAGE:0.5:720:720".
Let us pick some new numbers. What if we want a 30 sec average for 2 days? 30 sec is 6 values of data. 2 days is 5760 intervals of 30 sec. So the RRA is "RRA:AVERAGE:0.5:6:5760".
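The arithmetic behind those three RRA strings can be sketched in Python. The `rra` helper below is a hypothetical function written for this post, not part of rrdtool. It only does the division: resolution over step gives the number of values per row, and retention over resolution gives the number of rows.

```python
HOUR = 3600
DAY = 24 * HOUR


def rra(step_seconds, resolution_seconds, retention_seconds,
        cf="AVERAGE", xff=0.5):
    """Build an RRA definition string (hypothetical helper for illustration).

    step_seconds:       how often a value is recorded (the -s option)
    resolution_seconds: time span covered by one stored row
    retention_seconds:  total time span the archive should cover
    """
    if resolution_seconds % step_seconds:
        raise ValueError("resolution must be a whole multiple of the step")
    if retention_seconds % resolution_seconds:
        raise ValueError("retention must be a whole multiple of the resolution")
    steps_per_row = resolution_seconds // step_seconds
    rows = retention_seconds // resolution_seconds
    return f"RRA:{cf}:{xff}:{steps_per_row}:{rows}"


print(rra(5, 5, HOUR))        # every 5 sec value, kept for 1 hour
print(rra(5, HOUR, 30 * DAY)) # hourly averages, kept for 30 days
print(rra(5, 30, 2 * DAY))    # 30 sec averages, kept for 2 days
```

The three calls reproduce the strings worked out by hand above, which is a handy check when you start picking your own numbers.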
So our final rrdtool database is this:
rrdtool create temp.rrd -s 5 \
    DS:cpu:GAUGE:30:0:100 \
    RRA:AVERAGE:0.5:1:720 \
    RRA:AVERAGE:0.5:720:720 \
    RRA:AVERAGE:0.5:6:5760
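The database above only keeps averages. To also keep the min and max mentioned earlier, each consolidated archive needs its own MIN and MAX RRA next to the AVERAGE one. Here is a rough Python sketch that assembles such a command line. It only prints the command and does not run it, and the choice to skip MIN and MAX for the 1-step archive is mine (with one value per row, all three would store the same number).

```python
import shlex

# (values per row, rows) for each archive, taken from the text above.
ARCHIVES = [(1, 720), (720, 720), (6, 5760)]
CONSOLIDATIONS = ("AVERAGE", "MIN", "MAX")

args = ["rrdtool", "create", "temp.rrd", "-s", "5", "DS:cpu:GAUGE:30:0:100"]

for steps, rows in ARCHIVES:
    # With one value per row there is nothing to consolidate, so AVERAGE
    # alone is enough for the raw archive.
    functions = ("AVERAGE",) if steps == 1 else CONSOLIDATIONS
    for cf in functions:
        args.append(f"RRA:{cf}:0.5:{steps}:{rows}")

# Print a copy-and-paste friendly command line; nothing is executed here.
print(shlex.join(args))
```

When you graph from a database built this way, you can draw the MAX line for the spikes and the MIN line for the idle floor alongside the average.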
What I want you to do is read over this and then return to the examples on the rrdtool website. Once this information clicks, come back here and read it again. I know the RRA code may be over your head if this is your intro, but when you return it will be a very solid example.
Some problems you just can't search on. Here are some I wish were more searchable and this blog is my attempt to make that happen.
Saturday, June 27, 2009
Wednesday, June 24, 2009
Intro to rrdtool
I recently discovered a simple tool that has lots of power behind it. rrdtool is a round robin database that stores time-dependent values and easily graphs them. It is a database where you insert values at consistent intervals and the query results are in the form of a graph. It is a round robin database because it only saves a set number of values and overwrites the oldest one every time a new one arrives.
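If the round robin idea is new to you, this tiny Python sketch shows the behavior with a fixed-length `collections.deque`. It is only an analogy for the overwrite-the-oldest rule, not how rrdtool stores data on disk.

```python
from collections import deque

# An "archive" with room for exactly 5 values.
archive = deque(maxlen=5)

# Insert 8 readings. Once the archive is full, each new value
# pushes the oldest one out.
for reading in range(1, 9):
    archive.append(reading)

print(list(archive))  # only the 5 most recent readings remain
print(len(archive))   # the size never goes past 5
```

After eight inserts, the archive holds readings 4 through 8 and the first three are gone.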
Now that I have pointed out what it is, let us talk about what we can do with it. The first thing that jumps out (and what it was designed for) is performance monitoring. You can set up a task to save the CPU, network, disk, and RAM activity to this file every 5 sec and then generate charts to display it. It works for anything you can get a counter on: computer temperature, event log errors, ping times, terminal server connections, and anything else you can think of. It also works for things like the daily temperature, number of visitors to your office, spam messages, or even your daily bank balance.
This is great for performance monitoring because the database automatically discards old data. To put it another way, it only keeps the data for as long as you think it's important. Once you define how much you want to store and at what resolution, the database is then set in size. It never grows or shrinks.
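Because the number of rows is fixed when the database is created, you can estimate the data size up front. The Python sketch below uses three example archives of 720, 720, and 5760 rows for a single CPU counter. It assumes each stored value takes 8 bytes (a double) and ignores the file header and bookkeeping, so treat the result as a ballpark figure, not an exact file size.

```python
BYTES_PER_VALUE = 8  # assumption: one 8-byte double per stored value

# Rows kept by each example archive, for one data source.
ARCHIVE_ROWS = {
    "5 sec values for 1 hour": 720,
    "1 hour averages for 30 days": 720,
    "30 sec averages for 2 days": 5760,
}

total_rows = sum(ARCHIVE_ROWS.values())
approx_bytes = total_rows * BYTES_PER_VALUE

print("rows:", total_rows)
print("approx data bytes:", approx_bytes)
```

Those numbers stay the same whether the database has been collecting for a day or a year, since new values only replace old ones.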