Commit 485e0bb

jsuereth, reyang, and aabmass authored
Metric Exemplars SDK Specification (open-telemetry#1828)
* First cut at exemplar spec.
* Update exemplar specification.
* Remove the word parent
  Co-authored-by: Reiley Yang <[email protected]>
* Update specification/metrics/sdk.md
  Co-authored-by: Reiley Yang <[email protected]>
* Update specification/metrics/sdk.md
  Co-authored-by: Reiley Yang <[email protected]>
* Update specification/metrics/sdk.md
  Co-authored-by: Reiley Yang <[email protected]>
* Update specification/metrics/sdk.md
  Co-authored-by: Reiley Yang <[email protected]>
* Update specification/metrics/sdk.md
  Co-authored-by: Aaron Abbott <[email protected]>
* Fixes from review.
* Clarify a confusing paragraph.

Co-authored-by: Reiley Yang <[email protected]>
Co-authored-by: Aaron Abbott <[email protected]>
1 parent a304311 commit 485e0bb

1 file changed

+130
-0
lines changed

specification/metrics/sdk.md

+130
@@ -134,6 +134,9 @@ are the inputs:
  applies to [synchronous Instruments](./api.md#synchronous-instrument).
* The `aggregation` (optional) to be used. If not provided, a default
  aggregation will be applied by the SDK. The default aggregation is a TODO.
* The `exemplar_reservoir` (optional) to use for storing exemplars.
  This should be a factory or callback similar to aggregation which allows
  different reservoirs to be chosen by the aggregation.

The SDK SHOULD use the following logic to determine how to process Measurements
made with an Instrument:
@@ -411,6 +414,118 @@ active span](../trace/api.md#context-interaction)).
+------------------+
```

## Exemplars

An [Exemplar](./datamodel.md#exemplars) is a recorded measurement that exposes
the following pieces of information:

- The `value` that was recorded.
- The `time` the measurement was seen.
- The set of [Attributes](../common/common.md#attributes) associated with the measurement not already included in a metric data point.
- The associated [trace id and span id](../trace/api.md#retrieving-the-traceid-and-spanid) of the active [Span within Context](../trace/api.md#determining-the-parent-span-from-a-context) of the measurement.
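
The four pieces of information listed above could be modeled as a small immutable record. The following is only an illustrative sketch: the class layout, the field names (`time_unix_nano`, `filtered_attributes`), and the use of integers for trace and span ids are assumptions made for this example, not something this specification mandates.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional, Union

AttributeValue = Union[str, bool, int, float]


@dataclass(frozen=True)
class Exemplar:
    """Hypothetical in-memory shape of an exemplar (field names are assumptions)."""

    # The value that was recorded.
    value: Union[int, float]
    # The time the measurement was seen, here as nanoseconds since the Unix epoch.
    time_unix_nano: int
    # Attributes of the measurement that are NOT already on the metric data point.
    filtered_attributes: Mapping[str, AttributeValue] = field(default_factory=dict)
    # Trace and span ids of the active span, if there was one.
    trace_id: Optional[int] = None
    span_id: Optional[int] = None


example = Exemplar(
    value=12.5,
    time_unix_nano=1_700_000_000_000_000_000,
    filtered_attributes={"user.id": "u-42"},
    trace_id=0x1234,
    span_id=0x56,
)
```

The trace and span ids are optional in this sketch because a measurement may be recorded outside of any active span.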

A Metric SDK MUST provide a mechanism to sample `Exemplar`s from measurements.

A Metric SDK MUST allow `Exemplar` sampling to be disabled. In this case, the SDK SHOULD NOT incur overhead related to exemplar sampling.

A Metric SDK MUST, by default, sample `Exemplar`s only from measurements within the context of a sampled trace.

A Metric SDK MUST allow exemplar sampling to leverage the configuration of a metric aggregator.
For example, Exemplar sampling of histograms should be able to leverage bucket boundaries.

A Metric SDK SHOULD provide extensible hooks for Exemplar sampling, specifically:

- `ExemplarFilter`: filters which measurements can become exemplars.
- `ExemplarReservoir`: determines how to store exemplars.

### Exemplar Filter

The `ExemplarFilter` interface MUST provide a method to determine if a
measurement should be sampled.

This interface SHOULD have access to:

- The value of the measurement.
- The complete set of `Attributes` of the measurement.
- The `Context` of the measurement.
- The timestamp of the measurement.

See [Defaults and Configuration](#defaults-and-configuration) for built-in
filters.
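
As a rough illustration of the filter hook described above, the sketch below defines a single-method interface and three filters that mirror the known `OTEL_METRICS_EXEMPLAR_FILTER` values. The method name `should_sample`, the class names, and the simplified `Context` stand-in (which models only a "sampled" flag) are all hypothetical; a real SDK would use its own `Context` type and naming.

```python
from dataclasses import dataclass
from typing import Mapping, Protocol, Union

Number = Union[int, float]


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for the SDK Context; only the sampled flag is modeled."""

    span_is_sampled: bool = False


class ExemplarFilter(Protocol):
    """Assumed shape of the filter hook: one method deciding eligibility."""

    def should_sample(
        self,
        value: Number,
        attributes: Mapping[str, object],
        context: Context,
        timestamp: int,
    ) -> bool: ...


class NoneFilter:
    """Mirrors "NONE": no measurement is eligible."""

    def should_sample(self, value, attributes, context, timestamp) -> bool:
        return False


class AllFilter:
    """Mirrors "ALL": every measurement is eligible."""

    def should_sample(self, value, attributes, context, timestamp) -> bool:
        return True


class WithSampledTraceFilter:
    """Mirrors "WITH_SAMPLED_TRACE": eligible only inside a sampled span."""

    def should_sample(self, value, attributes, context, timestamp) -> bool:
        return context.span_is_sampled
```

A filter that always returns `False` also gives an SDK an easy place to short-circuit before any reservoir work is done, which is one way to keep overhead low when exemplar sampling is disabled.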

### Exemplar Reservoir

The `ExemplarReservoir` interface MUST provide a method to offer measurements
to the reservoir and another to collect accumulated Exemplars.

The "offer" method SHOULD accept measurements, including:

- value
- `Attributes` (complete set)
- `Context`
- timestamp

The "offer" method SHOULD have the ability to pull associated trace and span
information without needing to record full context. In other words, current
span context and baggage can be inspected at this point.

The "offer" method does not need to store all measurements it is given and
MAY further sample beyond the `ExemplarFilter`.

The "collect" method MUST return accumulated `Exemplar`s.
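
The "offer" and "collect" methods could take roughly the following shape. This is a sketch under several assumptions: the `SpanContext` and `Exemplar` types are simplified stand-ins, `LastValueReservoir` is a made-up trivial reservoir (not one of the built-in ones), and resetting the reservoir on collect is a choice of this example rather than a stated requirement.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical stand-in for the span information reachable from Context."""

    trace_id: int
    span_id: int


@dataclass(frozen=True)
class Exemplar:
    value: float
    timestamp: int
    attributes: Mapping[str, object]
    trace_id: Optional[int]
    span_id: Optional[int]


class ExemplarReservoir(ABC):
    @abstractmethod
    def offer(
        self,
        value: float,
        attributes: Mapping[str, object],
        context: Optional[SpanContext],
        timestamp: int,
    ) -> None:
        """Offer a measurement; the reservoir may or may not keep it."""

    @abstractmethod
    def collect(self) -> List[Exemplar]:
        """Return the exemplars accumulated so far."""


class LastValueReservoir(ExemplarReservoir):
    """Toy reservoir that keeps only the most recently offered measurement."""

    def __init__(self) -> None:
        self._latest: Optional[Exemplar] = None

    def offer(self, value, attributes, context, timestamp) -> None:
        # Only the trace and span ids are copied out; the full context is not stored.
        self._latest = Exemplar(
            value=value,
            timestamp=timestamp,
            attributes=dict(attributes),
            trace_id=context.trace_id if context else None,
            span_id=context.span_id if context else None,
        )

    def collect(self) -> List[Exemplar]:
        # Assumption: collecting drains the reservoir.
        collected = [] if self._latest is None else [self._latest]
        self._latest = None
        return collected
```

Note that this toy reservoir allocates a new `Exemplar` on every offer; a production implementation would likely reuse preallocated storage instead.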

`Exemplar`s MUST retain any attributes available in the measurement that
are not preserved by aggregation or view configuration. Specifically, at a
minimum, joining together attributes on an `Exemplar` with those available
on its associated metric data point should result in the full set of attributes
from the original sample measurement.
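
A small worked example may make the attribute requirement concrete. The helper name `exemplar_attributes` and the attribute keys below are hypothetical; the sketch assumes a view that keeps only `http.method` and `http.route` on the data point.

```python
from typing import Dict, Mapping, Set


def exemplar_attributes(
    measurement_attributes: Mapping[str, object],
    data_point_keys: Set[str],
) -> Dict[str, object]:
    """Keep only the attributes that the metric data point did not preserve."""
    return {
        key: value
        for key, value in measurement_attributes.items()
        if key not in data_point_keys
    }


# Full attribute set on the original measurement.
measurement = {"http.method": "GET", "http.route": "/users", "user.id": "u-42"}
# Keys preserved on the data point by the (assumed) view configuration.
point_keys = {"http.method", "http.route"}

point_attributes = {key: measurement[key] for key in point_keys}
filtered = exemplar_attributes(measurement, point_keys)

# Joining the two sets recovers the attributes of the original measurement.
rejoined = {**point_attributes, **filtered}
```

Here `filtered` contains only `user.id`, and `rejoined` equals the original `measurement` attributes.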

The `ExemplarReservoir` SHOULD avoid allocations when sampling exemplars.

### Exemplar Defaults

The SDK will come with two types of built-in exemplar reservoirs:

1. SimpleFixedSizeExemplarReservoir
2. AlignedHistogramBucketExemplarReservoir

By default, fixed-size histogram aggregators will use
`AlignedHistogramBucketExemplarReservoir` and all other aggregators will use
`SimpleFixedSizeExemplarReservoir`.

*SimpleFixedSizeExemplarReservoir*
This Exemplar reservoir MAY take a configuration parameter for the size of
the reservoir pool. The reservoir will accept measurements using an equivalent of
the [naive reservoir sampling algorithm](https://en.wikipedia.org/wiki/Reservoir_sampling):

```
bucket = random_integer(0, num_measurements_seen)
if bucket < num_buckets then
  reservoir[bucket] = measurement
end
```
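
The pseudocode for the simple fixed-size reservoir can be turned into a runnable sketch along these lines. It stores bare values rather than full exemplars to keep the example short. Two points are assumptions of this sketch, since the pseudocode leaves them open: the measurement counter is incremented before drawing, and the random index is drawn from the half-open range `[0, num_measurements_seen)`. The class name, the injectable `rng` parameter, and the reset on collect are likewise illustrative choices.

```python
import random
from typing import List, Optional


class SimpleFixedSizeReservoir:
    """Sketch of a fixed-size reservoir following the naive sampling pseudocode."""

    def __init__(self, size: int, rng: Optional[random.Random] = None) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        # Slots are preallocated so that offer() does not grow any container.
        self._slots: List[Optional[float]] = [None] * size
        self._seen = 0
        self._rng = rng if rng is not None else random.Random()

    def offer(self, measurement: float) -> None:
        # Assumption: count the current measurement first, then draw an index
        # uniformly from [0, seen). Early measurements always land in a slot;
        # later ones are kept with probability size / seen.
        self._seen += 1
        bucket = self._rng.randrange(self._seen)
        if bucket < len(self._slots):
            self._slots[bucket] = measurement

    def collect(self) -> List[float]:
        # Assumption: collecting returns the filled slots and resets the state.
        kept = [m for m in self._slots if m is not None]
        self._slots = [None] * len(self._slots)
        self._seen = 0
        return kept
```

Because the index is drawn at random from the very first measurement, some slots can stay empty even after many offers; the number of exemplars returned is therefore at most, not exactly, the configured size.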

*AlignedHistogramBucketExemplarReservoir*
This Exemplar reservoir MUST take a configuration parameter that is the
configuration of a Histogram. This implementation MUST keep the last seen
measurement that falls within a histogram bucket. The reservoir will accept
measurements using the equivalent of the following naive algorithm:

```
bucket = find_histogram_bucket(measurement)
if bucket < num_buckets then
  reservoir[bucket] = measurement
end

def find_histogram_bucket(measurement):
  for boundary, idx in bucket_boundaries do
    if measurement <= boundary then
      return idx
    end
  end
  return bucket_boundaries.length
```
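
For comparison, the bucket-aligned behaviour can be sketched in runnable form as follows. The class name is hypothetical and bare values are stored instead of full exemplars. The linear scan from the pseudocode is replaced here by a binary search via `bisect_left`, which returns the first index whose boundary is greater than or equal to the value, and therefore assumes that the boundaries are sorted in ascending order.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


class AlignedHistogramBucketReservoir:
    """Sketch: keeps the last seen measurement for each histogram bucket."""

    def __init__(self, bucket_boundaries: Sequence[float]) -> None:
        # Assumption: boundaries are sorted ascending, as bisect requires.
        self._boundaries = list(bucket_boundaries)
        # N boundaries define N + 1 buckets; the last one is the overflow bucket.
        self._slots: List[Optional[float]] = [None] * (len(self._boundaries) + 1)

    def find_histogram_bucket(self, value: float) -> int:
        # First index i with value <= boundaries[i], or len(boundaries) if none.
        return bisect_left(self._boundaries, value)

    def offer(self, value: float) -> None:
        # A later measurement in the same bucket overwrites the earlier one.
        self._slots[self.find_histogram_bucket(value)] = value

    def collect(self) -> List[Optional[float]]:
        # One entry per bucket; None marks a bucket that saw no measurement.
        return list(self._slots)


reservoir = AlignedHistogramBucketReservoir([0, 5, 10])
for v in [1, 3, 5, 7, 100]:
    reservoir.offer(v)
per_bucket = reservoir.collect()
```

With boundaries `[0, 5, 10]`, the values 1, 3, and 5 all fall into the second bucket (the last one, 5, is kept), 7 falls into the third, and 100 falls into the overflow bucket, so `per_bucket` is `[None, 5, 7, 100]`.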

## MetricExporter

`MetricExporter` defines the interface that protocol-specific exporters MUST
@@ -534,3 +649,18 @@ they want to make the shutdown timeout configurable.
Pull Metric Exporter reacts to the metrics scrapers and reports the data
passively. This pattern has been widely adopted by
[Prometheus](https://prometheus.io/).

## Defaults and Configuration

The SDK MUST provide the following configuration parameters for Exemplar
sampling:

| Name | Description | Default | Notes |
|------|-------------|---------|-------|
| `OTEL_METRICS_EXEMPLAR_FILTER` | Filter for which measurements can become Exemplars. | `"WITH_SAMPLED_TRACE"` | |

Known values for `OTEL_METRICS_EXEMPLAR_FILTER` are:

- `"NONE"`: No measurements are eligible for exemplar sampling.
- `"ALL"`: All measurements are eligible for exemplar sampling.
- `"WITH_SAMPLED_TRACE"`: Only measurements with a sampled parent span in context are eligible for exemplar sampling.
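
An SDK might resolve this setting from the environment roughly as shown below. The function name and the `environ` parameter are hypothetical, and two behaviours are assumptions of this sketch rather than requirements stated here: values are matched case-insensitively, and unknown or empty values fall back to the default.

```python
import os
from typing import Mapping

ENV_VAR = "OTEL_METRICS_EXEMPLAR_FILTER"
KNOWN_FILTERS = ("NONE", "ALL", "WITH_SAMPLED_TRACE")
DEFAULT_FILTER = "WITH_SAMPLED_TRACE"


def exemplar_filter_from_env(environ: Mapping[str, str] = os.environ) -> str:
    """Return the configured exemplar filter name, or the default."""
    # Assumption: surrounding whitespace and letter case are not significant.
    raw = environ.get(ENV_VAR, "").strip().upper()
    if raw in KNOWN_FILTERS:
        return raw
    # Assumption: unset, empty, or unrecognized values use the default filter.
    return DEFAULT_FILTER
```

Passing the environment in as a mapping keeps the function easy to exercise in tests, for example `exemplar_filter_from_env({"OTEL_METRICS_EXEMPLAR_FILTER": "ALL"})`.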
