Metrictank comes with a set of tools to import Whisper data into the chunk format used by Metrictank. The usage of the tools is documented on their respective pages (reader/writer); this document focuses specifically on how the import procedure for Whisper files works.
The importer utility walks recursively through the whisper directory, looking for files ending in `.wsp`. If a name filter is defined, it additionally filters the discovered files by the given pattern. It then generates the metric name by removing the prefix defined via the `-whisper-directory` parameter from the file's path, removing the `.wsp` suffix, and replacing all `/` characters with `.`.
For example, if a file is located at `/opt/graphite/storage/my/metric/name.wsp` and the value given to `-whisper-directory` is `/opt/graphite/storage`, then the resulting metric name would be `my.metric.name`.
Additionally there is an optional parameter called `-name-prefix` which allows the caller to prefix all metric names with a given string; the prefix needs to include the trailing `.` if one is wanted. In the above example, if the value of `-name-prefix` is `my.prefix.`, then the resulting metric name would be `my.prefix.my.metric.name`.
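For illustration, a minimal Go sketch of this naming logic; the function and its parameters are hypothetical, not Metrictank's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// metricName derives a metric name from a whisper file path, mirroring the
// procedure above: strip the -whisper-directory prefix and the .wsp suffix,
// then turn path separators into dots and prepend the optional -name-prefix.
func metricName(file, whisperDir, namePrefix string) string {
	name := strings.TrimPrefix(file, whisperDir)
	name = strings.TrimSuffix(name, ".wsp")
	name = strings.Trim(name, "/")
	name = strings.ReplaceAll(name, "/", ".")
	return namePrefix + name
}

func main() {
	fmt.Println(metricName("/opt/graphite/storage/my/metric/name.wsp", "/opt/graphite/storage", "my.prefix."))
	// prints: my.prefix.my.metric.name
}
```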
The whisper importer reader requires the user to provide the path to the storage schemas used at the import destination (not the ones of the source Graphite installation); it then reads and parses those schemas. During the import it iterates over all the whisper files that need to be imported and processes each of them as follows:
- Reads all the headers and points
- Generates the MetricData; the interval gets set to the raw interval of the destination schema
- Iterates over each archive in the destination schema (raw + all rollups); for each destination archive it uses the schema conversion logic described in the next section to generate the archive's points
- Generates chunks from the points which have been generated by the schema conversion logic
- Chunks are then sent to the importer-writer, which writes them into the store used by the destination cluster
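The overall flow of these steps can be outlined roughly as the Go sketch below; all types and helper functions here are illustrative stand-ins, not Metrictank's actual API:

```go
package main

import "fmt"

type point struct {
	ts  uint32
	val float64
}

type archiveDef struct{ intervalSecs, retentionSecs int }

type chunk struct{ points []point }

// Stand-in steps: a real implementation would read the whisper headers and
// points, run the schema conversion per destination archive, encode chunks
// and send them to the importer-writer.
func readWhisperPoints(path string) []point             { return []point{{10, 1}, {20, 2}} }
func convertArchive(in []point, dst archiveDef) []point { return in }
func encodeChunks(pts []point) []chunk                  { return []chunk{{pts}} }
func sendToWriter(name string, cs []chunk)              { fmt.Println(name, len(cs), "chunk(s)") }

func main() {
	destSchema := []archiveDef{{1, 86400}, {600, 31536000}} // raw archive + one rollup
	points := readWhisperPoints("/opt/graphite/storage/my/metric/name.wsp")
	for _, dst := range destSchema {
		sendToWriter("my.metric.name", encodeChunks(convertArchive(points, dst)))
	}
}
```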
Note that this conversion allows converting from one storage schema to another, but it does not support changes to the storage aggregations. When the whisper importer reader reads a whisper file, it looks at the aggregation method in the whisper header and uses that aggregation method for all the schema conversions it applies. If a user imported data into a Metrictank whose storage-aggregations.conf assigns a different aggregation function to a metric than was used in the whisper file, the resulting situation would be the same as if another Metrictank with a different storage-aggregations.conf had persisted its rollups in that store: the data simply wouldn't be found in the store.
Metrictank comes with schema conversion logic which can be used to convert data from one storage schema into another one. This is currently only used by the Whisper importer, but may in the future also be used to import from other types of data sources.
The conversion happens on a destination-archive-by-destination-archive basis. In this context the term "archive" refers to one section between the `,` separators of a storage schema, e.g. the schema `1s:1d,10min:1y` has two archives. Each destination archive gets generated separately; in some cases the same input archives may be used to generate multiple destination archives, and in others different input archives may be used. Multiple input archives may also be combined into a single destination archive, if this makes sense given the input schema and the destination schema.
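As an illustration of this terminology, here is a hypothetical sketch that splits a schema string into its archives; the unit handling is simplified and this is not Metrictank's actual parser:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// archive is one "interval:retention" section of a storage schema.
type archive struct {
	intervalSecs  int
	retentionSecs int
}

// seconds parses a graphite-style duration such as "10s", "10min" or "1y".
// The unit table below is simplified and purely illustrative.
func seconds(s string) (int, error) {
	units := []struct {
		suffix string
		mult   int
	}{{"min", 60}, {"s", 1}, {"h", 3600}, {"d", 86400}, {"w", 604800}, {"y", 31536000}}
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.Atoi(strings.TrimSuffix(s, u.suffix))
			return n * u.mult, err
		}
	}
	return strconv.Atoi(s) // bare number of seconds
}

func parseSchema(schema string) ([]archive, error) {
	var archives []archive
	for _, part := range strings.Split(schema, ",") {
		iv, ret := "", ""
		if i := strings.Index(part, ":"); i >= 0 {
			iv, ret = part[:i], part[i+1:]
		}
		is, err := seconds(iv)
		if err != nil {
			return nil, err
		}
		rs, err := seconds(ret)
		if err != nil {
			return nil, err
		}
		archives = append(archives, archive{is, rs})
	}
	return archives, nil
}

func main() {
	a, _ := parseSchema("1s:1d,10min:1y")
	fmt.Println(a) // [{1 86400} {600 31536000}]
}
```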
This is how the schema conversion logic generates a destination archive:
- In the set of source archives which it has received as input, it finds the archive with the shortest retention which still contains enough data to satisfy the retention of the destination archive
- In the same set of source archives, it finds the archive with the longest retention that still has a resolution higher than or equal to that of the destination archive
- At this point it knows that, to generate the destination archive's points, it will have to use the range of input archives determined in the previous two steps, including the ones between them; all other input archives can be ignored (see the selection sketch after this list). That's because it is trying to satisfy the retention period of the destination archive while at the same time achieving the highest possible resolution, up to the resolution of the destination archive. For example, if the destination archive is `10s:180d` and the input archives are `10s:30d,10min:180d`, then it uses both input archives to achieve `10s` resolution for the first `30d` and `10min` resolution for the remaining `150d` of the output. If there were another input archive with the schema `1h:5y`, it could be ignored for the purpose of generating a `10s:180d` archive.
- It iterates over all the input archives chosen in the previous steps, sorted by resolution in decreasing order (starting with the largest interval). It uses all the points within the time range of the destination archive's retention period as input to generate the points of the destination archive. If the interval of an input archive is larger than the interval of the destination archive, it "fake-deaggregates" the points based on the aggregation method the input archive was aggregated with. If the interval of an input archive is smaller than the interval of the destination archive, it aggregates the points using the aggregation method the input archive was aggregated with.
- Additionally, if the aggregation method used is "average" and the destination archive is not the first/raw archive of the destination schema, then it generates two output archives, one aggregated by "count" and the other by "sum", because that is what Metrictank expects.
- If the aggregation method of the input archive is "sum", then the conversion generates an archive with aggregation method "sum" plus one with aggregation method "cnt", so that the metric can also be queried by average if the user wants to do so.
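A sketch of the selection in the first three steps, under the assumption that the source archives are sorted from highest to lowest resolution (as in a whisper header); all names here are illustrative:

```go
package main

import "fmt"

type archive struct{ intervalSecs, retentionSecs int }

// selectInputs picks the range of source archives needed to build dst.
// sources are assumed sorted by interval ascending (highest resolution first).
func selectInputs(sources []archive, dst archive) []archive {
	// end of range: the archive with the shortest retention that still
	// satisfies the destination retention (fall back to the last one)
	last := len(sources) - 1
	for i, a := range sources {
		if a.retentionSecs >= dst.retentionSecs {
			last = i
			break
		}
	}
	// start of range: the archive with the longest retention whose
	// resolution is still higher than or equal to the destination's
	first := 0
	for i, a := range sources {
		if a.intervalSecs <= dst.intervalSecs {
			first = i
		}
	}
	if first > last { // a single archive already satisfies both criteria
		first = last
	}
	return sources[first : last+1]
}

func main() {
	day := 86400
	sources := []archive{{10, 30 * day}, {600, 180 * day}, {3600, 5 * 365 * day}}
	dst := archive{10, 180 * day}
	// uses 10s:30d and 10min:180d, ignores 1h:5y (as in the example above)
	fmt.Println(selectInputs(sources, dst))
}
```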
The following sections give some examples to illustrate what the above procedure produces when it generates the destination archives.
In the above described process it can happen that some of the used input archives have a lower interval (higher resolution) than the destination archive. In this case the conversion process uses the same aggregation logic that Metrictank uses internally to generate rollups, in order to decrease the resolution (increase the interval) of the input archive(s) to make it match the destination archive.
It can also happen that some of the used input archives have a higher interval (lower resolution) than the destination archive. In this case the conversion logic "undoes" the aggregation as well as possible; it is obviously not possible to do this perfectly, because some information gets lost during aggregation. How this works depends on the aggregation method.
For `sum`, it takes each value and divides it by the ratio of `input interval / output interval`, then repeats that value `ratio` times. Additionally it creates an archive for the `count` aggregation method, so that the resulting data can also be viewed as `average` by Metrictank if wanted, because when an `average` rollup is requested, Metrictank uses the `sum` and `count` rollups.
For example, when converting `2` datapoints from a `5s` interval archive which has been aggregated with `sum` into a `1s` interval archive, the point values would look like this:
```
Input:
Sum: [40 65]

Output:
Sum: [8 8 8 8 8 13 13 13 13 13]
Cnt: [1 1 1 1 1 1 1 1 1 1]
```
This is because the ratio between the `5s` interval and the `1s` interval is `5`, so `40/5=8` and `65/5=13`.
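A minimal sketch of this fake-deaggregation for `sum` (the function name is made up for illustration):

```go
package main

import "fmt"

// deaggregateSum spreads each sum evenly across ratio output points and
// emits a parallel count series of all 1s, ratio being
// input interval / output interval.
func deaggregateSum(sums []float64, ratio int) (sum, cnt []float64) {
	for _, v := range sums {
		for i := 0; i < ratio; i++ {
			sum = append(sum, v/float64(ratio))
			cnt = append(cnt, 1)
		}
	}
	return sum, cnt
}

func main() {
	s, c := deaggregateSum([]float64{40, 65}, 5)
	fmt.Println(s) // [8 8 8 8 8 13 13 13 13 13]
	fmt.Println(c) // [1 1 1 1 1 1 1 1 1 1]
}
```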
When decreasing the interval of an archive with the aggregation method `average`, the behavior differs depending on whether the destination archive is the first/raw archive or a rollup, because Metrictank treats them differently. When reading the raw archive of a metric with the aggregation method `average`, it directly accesses the raw archive's data, which has not been aggregated yet; when it reads rolled-up data, however, it uses two archives as input, one aggregated by `sum` and the other by `count`.
When the schema conversion logic needs to decrease the interval (increase the resolution) of an archive with the aggregation method `average` and the destination archive is the raw/first archive, it simply repeats the present datapoints to fill in the gaps.
For example, when converting `2` datapoints from a `5s` interval archive into a `1s` interval archive, it would look like this:
```
Input:
Avg: [1 2]

Output:
Avg: [1 1 1 1 1 2 2 2 2 2]
```
However, if the destination archive for the same input is a rollup, i.e. not the first archive of the schema, then it needs to generate a `sum` and a `count` aggregation, because this is what Metrictank expects. Additionally it tries to ensure that the sums and counts are correct, as if they had been generated from the first/raw archive. To do that it multiplies each input point by the ratio of `input interval / output interval` and duplicates the result `ratio` times; it also sets all the counts to `ratio`. This way the resulting averages stay correct, and the counts and sums have values close to what they would really be if the raw metric used as input had had a value at every point, without any gaps.
```
Input:
Avg: [1 2]

Output:
Sum: [5 5 5 5 5 10 10 10 10 10]
Cnt: [5 5 5 5 5 5 5 5 5 5]
```
When the data is viewed in Metrictank, it divides the sums by the counts, so the averages stay correct.
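A sketch of this conversion, using a hypothetical helper that produces the two rollup series:

```go
package main

import "fmt"

// avgToRollup converts average-aggregated points into the sum and count
// series a rollup archive needs: each sum is avg*ratio repeated ratio
// times, each count is ratio, so sum/cnt reproduces the original average.
func avgToRollup(avgs []float64, ratio int) (sum, cnt []float64) {
	for _, v := range avgs {
		for i := 0; i < ratio; i++ {
			sum = append(sum, v*float64(ratio))
			cnt = append(cnt, float64(ratio))
		}
	}
	return sum, cnt
}

func main() {
	s, c := avgToRollup([]float64{1, 2}, 5)
	fmt.Println(s) // [5 5 5 5 5 10 10 10 10 10]
	fmt.Println(c) // [5 5 5 5 5 5 5 5 5 5]
	// sum/cnt at every point reproduces the original averages: 5/5=1, 10/5=2
}
```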
The cases of the `last`, `min` and `max` aggregation methods are simpler. When decreasing the interval, the point values simply get repeated to fill all the gaps. So for example, when converting the following input datapoints, which have been aggregated with `last`, from a `2s` interval to a `1s` interval, it would look like this:
```
Input:
Lst: [1 2]

Output:
Lst: [1 1 2 2]
```
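This repetition, which also matches the raw-archive case for `average` shown earlier, can be sketched as follows; the function name is illustrative:

```go
package main

import "fmt"

// repeatPoints repeats every value ratio times, ratio being
// input interval / output interval.
func repeatPoints(in []float64, ratio int) []float64 {
	out := make([]float64, 0, len(in)*ratio)
	for _, v := range in {
		for i := 0; i < ratio; i++ {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(repeatPoints([]float64{1, 2}, 2)) // [1 1 2 2]
}
```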
In some cases multiple input archives need to be combined into one destination archive. This is because on the one hand we are trying to achieve the highest possible resolution, up to the resolution of the destination archive, while on the other hand we are also trying to fill the whole retention period of the destination archive. The following example shows what happens when two input archives have to be combined into a single destination archive to achieve both of the aforementioned goals.
Aggregation method: `sum`
Destination archive: `1s:30s`
Input archives: `1s:5s,5s:30s`

Input point values:

```
Archive 1:
Sum: [1 2 3 4 5]

Archive 2:
Sum: [15 40 65 90 115 140]
```

Output point values:

```
Archive 1:
Sum: [1 2 3 4 5 8 8 8 8 8 13 13 13 13 13 18 18 18 18 18 23 23 23 23 23 28 28 28 28 28]
Cnt: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
```
Note that the ratio between the input interval and the destination interval is `5`, so `40/5=8`, `65/5=13`, `90/5=18`, `115/5=23`, `140/5=28`. At the same time, for the first `5` points it used the higher-resolution archive to achieve maximum accuracy.