-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stack trace collection support #4269
Conversation
Datadog ReportBranch report: ❌ 9 Failed (0 Known Flaky), 21946 Passed, 1476 Skipped, 5m 27.74s Total Time ❌ Failed Tests (9)
|
BenchmarksBenchmark execution time: 2025-03-03 17:52:52 Comparing candidate commit d68d25b in PR branch Found 0 performance improvements and 1 performance regressions! Performance is the same for 30 metrics, 2 unstable metrics. scenario:profiler - gvl benchmark samples
|
f79044e
to
77e9969
Compare
5d75fc3
to
255e242
Compare
@@ -35,7 +35,7 @@ def self.subscribe(engine, context) | |||
next unless result.match? | |||
|
|||
yield result | |||
throw(:block, true) unless result.actions.empty? | |||
throw(:block, true) if result.actions.include?('block') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we changing this everywhere?
Datadog ReportBranch report: ✅ 0 Failed, 22135 Passed, 1476 Skipped, 5m 2.11s Total Time |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## vpellan/meta-struct #4269 +/- ##
=======================================================
- Coverage 97.71% 97.69% -0.02%
=======================================================
Files 1370 1377 +7
Lines 83106 83583 +477
Branches 4227 4254 +27
=======================================================
+ Hits 81209 81659 +450
- Misses 1897 1924 +27 ☔ View full report in Codecov by Sentry. |
5dfecd6
to
7aa25ec
Compare
b3dc5c2
to
0836e11
Compare
a771a6d
to
65dd37b
Compare
7aa25ec
to
c49806e
Compare
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
c49806e
to
9f10909
Compare
spec/datadog/appsec/actions_handler/stack_trace/collector_spec.rb
Outdated
Show resolved
Hide resolved
65dd37b
to
630c10d
Compare
365dd2f
to
8bae22c
Compare
8bae22c
to
8aa9b65
Compare
- Make metastruct extends Forwardable and forward methods to @Metastruct -add self.empty for when creating an empty metastruct (used in SpanOperation
- Extract Thread:Backtrace:Location to hash convertion - Add Noop class for nil metastruct
8aa9b65
to
094cc5e
Compare
|
||
private | ||
|
||
def generate_stack_trace: (Integer max_depth, Float top_percent) -> (Hash[Symbol, String | Integer | Array[Hash[Symbol, String | Integer | nil]] | nil ]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the return a missing type alias? And with type you can have it recursive
module AppSec | ||
module ActionsHandler | ||
class StackTraceInMetastruct | ||
type stack_trace = Hash[Symbol, String | nil | Array[Datadog::AppSec::ActionsHandler::StackTraceCollection::stack_frame]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be opposite order
type stack_trace = Hash[Symbol, String | nil | Array[Datadog::AppSec::ActionsHandler::StackTraceCollection::stack_frame]] | |
type stack_trace = Hash[nil, Symbol, String | Array[Datadog::AppSec::ActionsHandler::StackTraceCollection::stack_frame]] |
locations = [] | ||
1000.times do | ||
locations << self.locations.first | ||
end | ||
locations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will that work?
locations = [] | |
1000.times do | |
locations << self.locations.first | |
end | |
locations | |
1000.times.map { self.locations.first } |
subject(:set_stack_trace_max_depth_top_percent) do | ||
settings.appsec.stack_trace.max_depth_top_percent = stack_trace_max_depth_top_percent | ||
end | ||
|
||
context 'when given a value' do | ||
let(:stack_trace_max_depth_top_percent) { 50 } | ||
|
||
before { set_stack_trace_max_depth_top_percent } | ||
|
||
it { expect(settings.appsec.stack_trace.max_depth_top_percent).to eq(50) } | ||
end | ||
|
||
context 'when given a negative value' do | ||
let(:stack_trace_max_depth_top_percent) { -1 } | ||
|
||
before { set_stack_trace_max_depth_top_percent } | ||
|
||
it { expect(settings.appsec.stack_trace.max_depth_top_percent).to eq(0) } | ||
end | ||
|
||
context 'when given a value higher than 100' do | ||
let(:stack_trace_max_depth_top_percent) { 101 } | ||
|
||
before { set_stack_trace_max_depth_top_percent } | ||
|
||
it { expect(settings.appsec.stack_trace.max_depth_top_percent).to eq(100) } | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all the examples could be simplified to examples like
context 'when given a value' do
it 'sets max value to 0' do
settings.appsec.stack_trace.max_depth_top_percent = -1
expect(settings.appsec.stack_trace.max_depth_top_percent).to eq(0)
end
end
end | ||
|
||
context 'when stack trace is enabled and context contains only span' do | ||
let(:context) { OpenStruct.new(span: span_op) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use verified double i.e (instance_double
) here
let(:exploit_category) { Datadog::AppSec::Ext::EXPLOIT_PREVENTION_EVENT_CATEGORY } | ||
|
||
before do | ||
Datadog.configure do |c| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI if you go this way - you will need to clean the settings afterwards (see other tests where we do that)
ThreadBacktraceHelper.locations_inside_nested_blocks.map.with_index do |location, index| | ||
{ | ||
id: index, | ||
text: location.to_s.encode('UTF-8'), | ||
file: (location.absolute_path || location.path)&.encode('UTF-8'), | ||
line: location.lineno, | ||
function: location.label&.encode('UTF-8') | ||
} | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH i thought that helper suppose to make things easier. I think here it's either missing something or else ...
@@ -97,4 +101,124 @@ | |||
end | |||
end | |||
end | |||
|
|||
describe '.generate_stack' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH this test suite look overly complicated and violates certain test practices and it would be easier to go over in 1:1 instead of chatting here.
stack = context.trace.nil? ? span_stack : trace_stack | ||
stack.push({ language: 'ruby', id: utf8_stack_id, frames: stack_frames }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stack = context.trace.nil? ? span_stack : trace_stack | |
stack.push({ language: 'ruby', id: utf8_stack_id, frames: stack_frames }) | |
(context.trace || context.span).push(language: 'ruby', id: stack_id, frames: stack_frames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not work as we are using the push method defined in StackTraceInMetastruct
return if config.max_collect != 0 && span_stack.count + trace_stack.count >= config.max_collect | ||
|
||
# Generate stacktrace | ||
utf8_stack_id = action_params['stack_id'].encode('UTF-8') if action_params['stack_id'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable will be nil
if no params[stack_id]
is given, right? I think the whole method will not work without stack_id, hence we always have it or of not - we should do nothing and report a telemetry or generate warning or so.
Because of that that line must be adjusted to
utf8_stack_id = action_params['stack_id'].encode('UTF-8') if action_params['stack_id'] | |
stack_id = action_params['stack_id'].encode('UTF-8') |
P.S it doesn't make sense to mention UTF-8 in the variable name because it's not important
span_stack = ActionsHandler::StackTraceInMetastruct.create(context.span&.metastruct) | ||
trace_stack = ActionsHandler::StackTraceInMetastruct.create(context.trace&.metastruct) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span_stack = ActionsHandler::StackTraceInMetastruct.create(context.span&.metastruct) | |
trace_stack = ActionsHandler::StackTraceInMetastruct.create(context.trace&.metastruct) | |
span_stack = StackTraceInMetastruct.create(context.span&.metastruct) | |
trace_stack = StackTraceInMetastruct.create(context.trace&.metastruct) |
module_function | ||
|
||
def collect(max_depth:, top_percent:) | ||
locations = (caller_locations || []).reject { |location| location.to_s.include?('lib/datadog') } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be converted into mutating to avoid allocation
locations = (caller_locations || []).reject { |location| location.to_s.include?('lib/datadog') } | |
locations = (caller_locations || []) | |
locations.reject! { |location| location.to_s.include?('lib/datadog') } |
locations.map.with_index do |location, index| | ||
{ | ||
id: index, | ||
text: location.to_s.encode('UTF-8'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a second time we convert location into the string, I think it makes sense to do it once at the beginning of the collection
return | ||
end | ||
|
||
config = Datadog.configuration.appsec.stack_trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be closer to the guard-clause on line 39
config = Datadog.configuration.appsec.stack_trace | ||
|
||
# Check that the sum of stack_trace count in trace and entry_span does not exceed configuration | ||
span_stack = ActionsHandler::StackTraceInMetastruct.create(context.span&.metastruct) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we calling span metastruct a span_stack
? Should we rather call it span_metastruct
? Same one line below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because StackTraceInMetastruct is specifically applying changes to '_dd.stack' field of metastruct. We give a metastruct as an input but we cannot do any changes to it other than on the stack traces
# Check that the sum of stack_trace count in trace and entry_span does not exceed configuration | ||
span_stack = ActionsHandler::StackTraceInMetastruct.create(context.span&.metastruct) | ||
trace_stack = ActionsHandler::StackTraceInMetastruct.create(context.trace&.metastruct) | ||
return if config.max_collect != 0 && span_stack.count + trace_stack.count >= config.max_collect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to avoid stack trace collection if the stack is longer than max_collect
, or do we want to truncate the stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max_collect is the maximum number of stack traces that we want in our trace, not the maximum number of frames in each stack trace (which is defined by max_depth). So if we already have the maximum number of stack traces in metastruct, we want to avoid collection.
top_limit = (max_depth * top_percent / 100.0).round | ||
bottom_limit = locations.size - (max_depth - top_limit) | ||
|
||
locations.slice!(top_limit...bottom_limit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is top_limit
lower than bottom_limit
?
What are we trying to do here? Get the top n items from the array? Do we really need to calculate both limits, or can we just slice n items from one end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, if max_depth is 32 and top_percent is 75%, and we have more than 32 stack frames, we want to keep the 24th first stack traces, the 8th last, and drop everything in the middle. So what this method do is calculate the limits of the frames that we should drop, so we can just do a slice on the locations with that range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to RFC, we have to take 2 slices - one from the top of the stack trace, and one from the bottom:
In the original RFC, some restrictions were placed on the layout of stack traces, namely:
Stack traces must contain a maximum of 32 frames.
When a stack trace is truncated, 24 frames should be from the bottom of the stack and 8 from the top.
The number of top and bottom frames on truncated stack traces can be considered a flexible requirement and it’s up to the library implementor to decide the most useful number of frames from each side of the stack for their language. In addition, since the user can configure the maximum number of frames through the use of DD_APPSEC_MAX_STACK_TRACE_DEPTH, the number of top and bottom frames for a truncated stack trace of arbitrary size must be computed dynamically based on the configuration value, with the simplest approach being a proportional one.
Similarly, the default maximum number of frames might not be particularly useful in languages with notoriously deep stacks, for that reason, the requirement must be updated to ensure that each library can provide meaningful stack traces.
Given the above, the requirements can be amended as follows:
Stack traces must be limited to a maximum of 32 frames, or a more suitable higher value for the given language.
When a stack trace is truncated, libraries must collect some frames from both the top and bottom of the stack. As a recommendation, libraries may collect 75% of frames from the bottom of the stack and 25% from the top.
I think this is not what we are doing here?
…ck_trace_collection - Fix specs
top_limit = (max_depth * top_percent / 100.0).round | ||
bottom_limit = locations.size - (max_depth - top_limit) | ||
|
||
locations.slice!(top_limit...bottom_limit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to RFC, we have to take 2 slices - one from the top of the stack trace, and one from the bottom:
In the original RFC, some restrictions were placed on the layout of stack traces, namely:
Stack traces must contain a maximum of 32 frames.
When a stack trace is truncated, 24 frames should be from the bottom of the stack and 8 from the top.
The number of top and bottom frames on truncated stack traces can be considered a flexible requirement and it’s up to the library implementor to decide the most useful number of frames from each side of the stack for their language. In addition, since the user can configure the maximum number of frames through the use of DD_APPSEC_MAX_STACK_TRACE_DEPTH, the number of top and bottom frames for a truncated stack trace of arbitrary size must be computed dynamically based on the configuration value, with the simplest approach being a proportional one.
Similarly, the default maximum number of frames might not be particularly useful in languages with notoriously deep stacks, for that reason, the requirement must be updated to ensure that each library can provide meaningful stack traces.
Given the above, the requirements can be amended as follows:
Stack traces must be limited to a maximum of 32 frames, or a more suitable higher value for the given language.
When a stack trace is truncated, libraries must collect some frames from both the top and bottom of the stack. As a recommendation, libraries may collect 75% of frames from the bottom of the stack and 25% from the top.
I think this is not what we are doing here?
text: text, | ||
file: location.absolute_path || location.path, | ||
line: location.lineno, | ||
function: location.label |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC specifies separate class_name
and function
fields, and #label
is containing both class name and function name. We have to parse it
1fe22fc
to
2e8289c
Compare
The slice! method will remove the part in the middle and locations will then contains the top part and the bottom part only. |
sorry, my bad, you are right |
d580c5e
to
a70a07e
Compare
What does this PR do?
This PR adds stack trace collection support using meta_struct
Motivation:
Stack trace collection is required for exploit prevention.
Change log entry
Yes. Fix stack trace not being with RASP SQLi signals
Additional Notes:
How to test the change?