This repository was archived by the owner on Aug 23, 2023. It is now read-only.

Metrictank startup

The full startup procedure has many details, but here we cover the main steps if they affect:

  • performance/resource usage characteristics
  • cluster status
  • API availability
  • diagnostics

| Phase | Description | Effect on CPU / RAM |
| --- | --- | --- |
| load config | load/validate config | no |
| setup diagnostics | set up logging, profiling, proftrigger | no |
| log startup | logs "Metrictank starting" message | no |
| start sending stats | starts connecting and writing to graphite endpoint | no |
| create Store | create keyspace, tables, write queues, etc | minor RAM increase ~ queue size |
| create Input(s) | open connections (kafka) or listening sockets (carbon) | no |
| start cluster | starts gossip, joins cluster | no |
| create Index | creates instance and starts write queues | minor RAM increase ~ queue size |
| start API server | opens listening socket and starts handling requests in not-ready mode | no |
| init Index | creates session, keyspace, tables, write queues, etc and loads in-memory index from persisted data | reasonable RAM and CPU increase |
| create cluster notifier | optional: connects to Kafka, starts backfilling persistence messages and waits until done or timeout | if backfilling: above-normal CPU, normal RAM usage |
| start input plugin(s) | starts backfill (kafka) or listening (carbon) and maintains priority based on input lag | if backfilling: above-normal CPU and RAM usage |
| mark ready state | immediately (primary) / after warmup (secondary) (details) | no |
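
The "start API server" and "mark ready state" phases mean the listening socket is open well before the node can serve data. A minimal sketch of that pattern in Go is shown below. It is not Metrictank's actual code: the `ready` flag, the `readyGate` middleware, the `/render` path and the 503 response are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready is a hypothetical flag, flipped once startup (index load, backfill) is done.
var ready atomic.Bool

// readyGate rejects requests with 503 until the node has been marked ready.
func readyGate(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !ready.Load() {
			http.Error(w, "node not ready", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one GET through the handler and returns the response code.
func statusFor(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/render", nil))
	return rec.Code
}

// demo returns the status codes seen before and after the node is marked ready.
func demo() (before, after int) {
	ready.Store(false)
	h := readyGate(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	before = statusFor(h)
	ready.Store(true) // corresponds to the "mark ready state" phase
	after = statusFor(h)
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before, after)
}
```

A load balancer or orchestrator health check can use the same signal to keep traffic away from a node that is still backfilling.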

We recommend provisioning a cluster such that it can backfill a 7-hour backlog in half an hour or less. This means:

  • The CPU increase during the kafka backfilling is very significant: typically a 14x CPU increase compared to normal usage, consistent with consuming 7 hours of data in half an hour.
  • The RAM usage during the input data backfilling is typically about 1.5x to 2x normal, though the cluster.gc-percent-not-ready setting lets you trade CPU for memory usage during startup.
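
The CPU-for-memory trade-off mentioned above can be illustrated with Go's garbage collector target, which is set with `debug.SetGCPercent` from the standard library. The sketch below is an assumption about the general mechanism, not Metrictank's implementation: `gcPercentFor` and the values 100 and 50 are hypothetical.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// gcPercentFor picks the GC target for the current node state.
// A lower percentage makes the collector run more often: less RAM, more CPU.
func gcPercentFor(ready bool, gcPercent, gcPercentNotReady int) int {
	if ready {
		return gcPercent
	}
	return gcPercentNotReady
}

func main() {
	// Hypothetical values: 100 when ready, 50 while starting up.
	const normal, notReady = 100, 50

	// While not ready (e.g. backfilling), collect more aggressively.
	prev := debug.SetGCPercent(gcPercentFor(false, normal, notReady))
	fmt.Println("previous GC percent:", prev)

	// ... startup and backfill would happen here ...

	// Once ready, return to the normal target.
	debug.SetGCPercent(gcPercentFor(true, normal, notReady))
}
```

Setting the not-ready value lower than the normal one reduces peak memory during backfill, at the cost of pushing the already elevated CPU usage higher.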

Backfilling will go as fast as it can until it reaches a bottleneck (kafka brokers, cpu constraints, etc), so your numbers may vary.
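
To estimate numbers for your own cluster, the provisioning rule above reduces to simple arithmetic. The sketch below uses hypothetical helper names and deliberately ignores new data that arrives while the backfill is running, so treat the results as rough lower bounds.

```go
package main

import "fmt"

// requiredSpeedup returns how many times faster than real time the backlog
// must be consumed in order to clear it within the given window.
func requiredSpeedup(backlogHours, windowHours float64) float64 {
	return backlogHours / windowHours
}

// backfillHours estimates how long a backlog takes at a given speed-up,
// for example when a bottleneck caps consumption below the target rate.
func backfillHours(backlogHours, speedup float64) float64 {
	return backlogHours / speedup
}

func main() {
	// The recommendation: a 7-hour backlog in half an hour.
	fmt.Println(requiredSpeedup(7, 0.5)) // 14

	// If a bottleneck limits consumption to 7x real time instead.
	fmt.Println(backfillHours(7, 7)) // 1
}
```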

This is true for v0.11.0, but may need revising later.