
ref(spans): Add spans buffer v2 #85856

Merged
merged 65 commits into master from spans-buffer-consumer-v2 on Mar 14, 2025
Conversation

untitaker (Member) commented Feb 25, 2025

The current process-spans consumer assumes that each span has a segment
ID. In the new world we need to construct segments and correlate spans
purely based on their parent-child relationship + timeouts.

Build a new redis-based spans buffer, patch it into the existing consumer behind a
CLI flag.

See https://github.com/getsentry/streaming-planning/issues/18

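The segment-construction idea described above (correlate spans purely by parent-child relationship, flush on timeout) can be sketched in plain Python. This is a hypothetical in-memory model for illustration only -- the actual buffer keeps this state in Redis, and all names here are made up:

```python
import json
import time
from collections import defaultdict

# In-memory sketch of segment assembly: spans are grouped under their
# top-most known ancestor, and a segment is flushed once no new span
# for it has arrived within `timeout` seconds.
class SegmentBuffer:
    def __init__(self, timeout=60.0):
        self.timeout = timeout
        self.segments = defaultdict(list)   # root span_id -> list of payloads
        self.last_seen = {}                 # root span_id -> last arrival time
        self.redirects = {}                 # child span_id -> ancestor span_id

    def _resolve_root(self, span_id):
        # Follow redirect keys until we hit the top-most known parent.
        while span_id in self.redirects:
            span_id = self.redirects[span_id]
        return span_id

    def add(self, payload, now=None):
        span = json.loads(payload)
        now = time.monotonic() if now is None else now
        parent = span.get("parent_span_id")
        root = self._resolve_root(parent) if parent else span["span_id"]
        if parent:
            self.redirects[span["span_id"]] = root
        self.segments[root].append(payload)
        self.last_seen[root] = now

    def flush(self, now=None):
        # Return segments whose idle timeout has expired.
        now = time.monotonic() if now is None else now
        done = [k for k, t in self.last_seen.items() if now - t >= self.timeout]
        out = {}
        for k in done:
            out[k] = self.segments.pop(k)
            del self.last_seen[k]
        return out
```

Note the absence of any segment ID: grouping relies only on `parent_span_id` plus the idle timeout, which is the core contrast with the old consumer.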
@untitaker untitaker requested review from a team as code owners February 25, 2025 15:52
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Feb 25, 2025
codecov bot commented Feb 25, 2025

Codecov Report

Attention: Patch coverage is 98.88579% with 4 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines                      | Patch % | Lines
src/sentry/spans/consumers/process/flusher.py | 92.85%  | 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #85856       +/-   ##
===========================================
+ Coverage   33.16%   87.74%   +54.57%     
===========================================
  Files        8314     9834     +1520     
  Lines      463026   556594    +93568     
  Branches    21939    21939               
===========================================
+ Hits       153582   488373   +334791     
+ Misses     309013    67790   -241223     
  Partials      431      431               

mcannizz (Member) left a comment

Here's some high level feedback; mostly questions I thought of as I read the code.

@untitaker untitaker marked this pull request as draft February 25, 2025 19:47
untitaker (Member, Author) commented
hey @mcannizz, this wasn't supposed to be reviewable yet; the next step for us is to test it in the sandbox, but there's a lot of stuff to clean up in the code. I'll partially address your comments though.

evanh (Member) left a comment

Some other questions:

This code doesn't seem to deal with the fact that a parent might arrive after a child. In that case the child won't be consolidated into the parent set in Redis. Is that something that will be addressed later/not at all?

@untitaker untitaker removed request for a team February 27, 2025 16:46
untitaker (Member, Author) commented
> This code doesn't seem to deal with the fact that a parent might arrive after a child. In that case the child won't be consolidated into the parent set in Redis. Is that something that will be addressed later/not at all?

You're right -- there are a few other edge cases that are not properly handled. Right now we're mostly concerned with performance, and I don't think those cases will change our perf profile.
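For reference, the parent-after-child case can be handled with redirect keys: if children arrive first, they get bucketed under their (not-yet-seen) parent's ID, and when the parent finally arrives, its orphan bucket is folded into the real root. A hypothetical in-memory sketch (the real buffer would do this in Redis):

```python
# Sketch of late-parent consolidation. `buckets` maps the top-most known
# span ID to the span IDs collected under it; `redirects` maps a span to
# its known ancestor so lookups can chase up to the current root.
def add_span(buckets, redirects, span):
    def resolve(sid):
        while sid in redirects:
            sid = redirects[sid]
        return sid

    parent = span.get("parent_span_id")
    root = resolve(parent) if parent else span["span_id"]
    buckets.setdefault(root, []).append(span["span_id"])
    if parent:
        redirects[span["span_id"]] = root
    # Consolidation step: children that arrived before this span were
    # bucketed under its own ID; fold that orphan bucket into the root.
    orphan_key = span["span_id"]
    if orphan_key != root and orphan_key in buckets:
        buckets[root].extend(buckets.pop(orphan_key))
        redirects[orphan_key] = root
```

This only illustrates the bookkeeping; doing it atomically under concurrent consumers is the hard part the Lua script would have to solve.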

parent_span_id=val.get("parent_span_id"),
project_id=val["project_id"],
payload=payload.value,
# TODO: validate, this logic may not be complete.
Member

Indeed; however, we can follow up with a dedicated PR for this.

  • We shouldn't use span.op or access sentry_tags here; instead, check the span's kind (Relay PR).
  • We should call out that, after flushing, the buffer (or the flusher?) checks whether the parent span is in a different project; otherwise it is treated the same as is_remote being True.

untitaker (Member, Author)

Ahh, I couldn't find any code in relay master that showed how to use those attributes -- didn't know this was still WIP.


payload_nums.append(len(payloads))
for payload in payloads:
span_id = rapidjson.loads(payload)["span_id"]
Member

Would it be possible to avoid this load by getting the span_id from somewhere else?

untitaker (Member, Author)

We don't store individual span IDs in Redis, only payloads, so when we flush we eventually have to get the ID out of the payload one way or another.
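If the full deserialization ever shows up in profiles, one hedged alternative is a targeted scan of the raw bytes with a fallback to a full parse. This assumes `span_id` is a top-level hex-string field; a nested object containing its own `"span_id"` key earlier in the payload would defeat the shortcut, so treat this as a sketch, not a drop-in:

```python
import json
import re

# Pull span_id out of the raw payload without deserializing the whole
# document. Falls back to a full parse if the pattern doesn't match.
_SPAN_ID_RE = re.compile(rb'"span_id"\s*:\s*"([0-9a-fA-F]+)"')

def extract_span_id(payload: bytes) -> str:
    m = _SPAN_ID_RE.search(payload)
    if m:
        return m.group(1).decode("ascii")
    return json.loads(payload)["span_id"]
```

Whether the regex actually beats rapidjson on typical span payloads would need to be measured before adopting it.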

# redis-cluster-py sentrywide. this probably leaves a bit of
# perf on the table as we send the full lua sourcecode with every span.
p.eval(
add_buffer_script.script,
Member

We are highly likely to find spans of the same trace in a batch. This means we could pre-process them in memory and minimize the numbers of mutations in Redis. For example, we should build partial trees in memory, then insert redirect keys for the top-most parent known at the current time.
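The pre-processing idea suggested here -- collapse a consumer batch into per-trace groups in memory so each trace costs one Redis round trip rather than one per span -- might look like this (span shapes are hypothetical):

```python
from collections import defaultdict

# Group a consumer batch by trace_id before touching Redis, so writes
# can be issued once per trace instead of once per span.
def group_batch(spans):
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span["trace_id"]].append(span)
    return dict(by_trace)
```

Building the partial trees and redirect keys per group would then happen on top of this grouping; the sketch only covers the batching step itself.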

untitaker (Member, Author)

I generally don't believe that reducing the amount of work done in Redis will pay off if we pay the cost of the same CPU-bound work executed in Python instead. So while Redis is a scaling bottleneck, I do believe it's more cost-efficient to run things like tree construction within Redis, and scaling out Redis to the degree we need is actually not a problem.

Of course I say this without having tested it rigorously, but I don't think we should spend time on that. If this consumer were Rust, I would've been all-in on the local buffer approach.
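On the "full Lua source with every span" point from the snippet above: redis-py's `register_script` helper addresses exactly this by loading the script once and invoking it by its SHA-1 thereafter (EVALSHA, with an automatic fallback if the script cache was flushed). A server-free sketch of the caching principle -- `ScriptCache` is a stand-in, not a real redis-py class:

```python
import hashlib

# Minimal stand-in for the SCRIPT LOAD / EVALSHA pattern: the full source
# is transferred once; subsequent calls send only the 40-byte SHA-1 digest.
class ScriptCache:
    def __init__(self):
        self.loaded = {}          # sha -> source
        self.bytes_sent = 0

    def eval(self, source):
        sha = hashlib.sha1(source.encode()).hexdigest()
        if sha not in self.loaded:
            self.loaded[sha] = source    # SCRIPT LOAD: ship source once
            self.bytes_sent += len(source)
        else:
            self.bytes_sent += len(sha)  # EVALSHA: ship only the digest
        return sha
```

With a real client this would roughly be `add_buffer = client.register_script(add_buffer_script.script)` followed by `add_buffer(keys=..., args=...)` per batch; whether that works through Sentry's redis-cluster wrapper is the open question the original comment alludes to.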

@untitaker untitaker merged commit 1a8e073 into master Mar 14, 2025
49 checks passed
@untitaker untitaker deleted the spans-buffer-consumer-v2 branch March 14, 2025 15:53
Labels
Scope: Backend Automatically applied to PRs that change backend components