Skip to content

Add stack trace collection support #4269

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 18 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Steepfile
Original file line number Diff line number Diff line change
@@ -157,6 +157,7 @@ target :datadog do
library 'zlib'
library 'time'
library 'pp'
library 'forwardable'

# Load all dependency signatures from the `vendor/rbs` directory
repo_path 'vendor/rbs'
30 changes: 29 additions & 1 deletion lib/datadog/appsec/actions_handler.rb
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
# frozen_string_literal: true

require_relative 'actions_handler/stack_trace_in_metastruct'
require_relative 'actions_handler/stack_trace_collection'

module Datadog
module AppSec
# this module encapsulates functions for handling actions that libddawf returns
@@ -19,7 +22,32 @@ def interrupt_execution(action_params)
throw(Datadog::AppSec::Ext::INTERRUPT, action_params)
end

def generate_stack(_action_params); end
def generate_stack(action_params)
return unless Datadog.configuration.appsec.stack_trace.enabled

context = AppSec.active_context
if context.nil? || context.trace.nil? && context.span.nil?
Datadog.logger.debug { 'Cannot find trace or service entry span to add stack trace' }
return
end

# Check that the sum of stack_trace count in trace and entry_span does not exceed configuration
span_stack = StackTraceInMetastruct.create(context.span&.metastruct)
trace_stack = StackTraceInMetastruct.create(context.trace&.metastruct)
config = Datadog.configuration.appsec.stack_trace
return if config.max_collect != 0 && span_stack.count + trace_stack.count >= config.max_collect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to avoid stack trace collection if the stack is longer than max_collect, or do we want to truncate the stack?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_collect is the maximum number of stack traces that we want in our trace, not the maximum number of frames in each stack trace (which is defined by max_depth). So if we already have the maximum number of stack traces in metastruct, we want to avoid collection.


# Generate stacktrace
stack_id = action_params['stack_id'].encode('UTF-8')
stack_frames = StackTraceCollection.collect(
max_depth: config.max_depth,
top_percent: config.max_depth_top_percent
)

# Add newly created stacktrace to metastruct
stack = context.trace.nil? ? span_stack : trace_stack
stack.push({ language: 'ruby', id: stack_id, frames: stack_frames })
end

def generate_schema(_action_params); end
end
49 changes: 49 additions & 0 deletions lib/datadog/appsec/actions_handler/stack_trace_collection.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# frozen_string_literal: true

module Datadog
module AppSec
module ActionsHandler
# Module that collects the stack trace into formatted hash
module StackTraceCollection
module_function

def collect(max_depth:, top_percent:)
locations = StackTraceCollection.filter_map_datadog_locations(caller_locations || [])

return [] if locations.empty?
return StackTraceCollection.convert(locations) if max_depth.zero? || locations.size <= max_depth

top_limit = (max_depth * top_percent / 100.0).round
bottom_limit = locations.size - (max_depth - top_limit)

locations.slice!(top_limit...bottom_limit)
Comment on lines +16 to +19
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is top_limit lower than bottom_limit?

What are we trying to do here? Get the top n items from the array? Do we really need to calculate both limits, or can we just slice n items from one end?

Copy link
Contributor Author

@vpellan vpellan Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, if max_depth is 32 and top_percent is 75%, and we have more than 32 stack frames, we want to keep the 24th first stack traces, the 8th last, and drop everything in the middle. So what this method do is calculate the limits of the frames that we should drop, so we can just do a slice on the locations with that range

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to RFC, we have to take 2 slices - one from the top of the stack trace, and one from the bottom:

In the original RFC, some restrictions were placed on the layout of stack traces, namely:
Stack traces must contain a maximum of 32 frames.
When a stack trace is truncated, 24 frames should be from the bottom of the stack and 8 from the top.

The number of top and bottom frames on truncated stack traces can be considered a flexible requirement and it’s up to the library implementor to decide the most useful number of frames from each side of the stack for their language. In addition, since the user can configure the maximum number of frames through the use of DD_APPSEC_MAX_STACK_TRACE_DEPTH, the number of top and bottom frames for a truncated stack trace of arbitrary size must be computed dynamically based on the configuration value, with the simplest approach being a proportional one. 

Similarly, the default maximum number of frames might not be particularly useful in languages with notoriously deep stacks, for that reason, the requirement must be updated to ensure that each library can provide meaningful stack traces.

Given the above, the requirements can be amended as follows:
Stack traces must be limited to a maximum of 32 frames, or a more suitable higher value for the given language.
When a stack trace is truncated, libraries must collect some frames from both the top and bottom of the stack. As a recommendation, libraries may collect 75% of frames from the bottom of the stack and 25% from the top.

I think this is not what we are doing here?

StackTraceCollection.convert(locations)
end

def filter_map_datadog_locations(locations)
locations.each_with_object([]) do |location, result|
text = location.to_s
next if text.include?('lib/datadog')

result << {
text: text,
file: location.absolute_path || location.path,
line: location.lineno,
function: location.label
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC specifies separate class_name and function fields, and #label is containing both class name and function name. We have to parse it

}
end
end

def convert(locations)
locations.each_with_index do |location, index|
location[:id] = index
# Strings can be frozen so we need to copy them
location[:text] = location[:text].encode('UTF-8')
location[:file] = location[:file]&.encode('UTF-8')
location[:function] = location[:function]&.encode('UTF-8')
end
end
end
end
end
end
39 changes: 39 additions & 0 deletions lib/datadog/appsec/actions_handler/stack_trace_in_metastruct.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# frozen_string_literal: true

module Datadog
module AppSec
module ActionsHandler
# Object that holds a metastruct, and modify the exploit group stack traces
class StackTraceInMetastruct
# Implementation with empty metastruct
class Noop
def count
0
end

def push(_)
nil
end
end

def self.create(metastruct)
metastruct.nil? ? Noop.new : new(metastruct)
end

def initialize(metastruct)
@metastruct = metastruct
end

def count
@metastruct.dig(AppSec::Ext::TAG_STACK_TRACE, AppSec::Ext::EXPLOIT_PREVENTION_EVENT_CATEGORY)&.size || 0
end

def push(stack_trace)
@metastruct[AppSec::Ext::TAG_STACK_TRACE] ||= {}
@metastruct[AppSec::Ext::TAG_STACK_TRACE][AppSec::Ext::EXPLOIT_PREVENTION_EVENT_CATEGORY] ||= []
@metastruct[AppSec::Ext::TAG_STACK_TRACE][AppSec::Ext::EXPLOIT_PREVENTION_EVENT_CATEGORY] << stack_trace
end
end
end
end
end
49 changes: 49 additions & 0 deletions lib/datadog/appsec/configuration/settings.rb
Original file line number Diff line number Diff line change
@@ -267,6 +267,55 @@ def self.add_settings!(base)
o.default false
end
end

settings :stack_trace do
option :enabled do |o|
o.type :bool
o.env 'DD_APPSEC_STACK_TRACE_ENABLED'
o.default true
end

# The maximum number of stack frames to collect for each stack trace.
# If the number of frames in a stack trace exceeds this value,
# max_depth / 4 frames will be collected from the top, and max_depth * 3 / 4 from the bottom.
option :max_depth do |o|
o.type :int
o.env 'DD_APPSEC_MAX_STACK_TRACE_DEPTH'
o.default 32
# 0 means no limit
o.setter do |value|
value = 0 if value.negative?
value
end
end

# The percentage that decides the number of top stack frame to collect
# for each stack trace if there is more stack frames than max_depth.
# number_of_top_frames = max_depth * max_depth_top_percent / 100
# Default is 75
option :max_depth_top_percent do |o|
o.type :float
o.env 'DD_APPSEC_MAX_STACK_TRACE_DEPTH_TOP_PERCENT'
o.default 75
o.setter do |value|
value = 100 if value > 100
value = 0 if value < 0
value
end
end

# The maximum number of stack traces to collect for each exploit prevention event.
option :max_collect do |o|
o.type :int
o.env 'DD_APPSEC_MAX_STACK_TRACES'
o.default 2
# 0 means no limit
o.setter do |value|
value = 0 if value < 0
value
end
end
end
end
end
end
8 changes: 6 additions & 2 deletions lib/datadog/appsec/context.rb
Original file line number Diff line number Diff line change
@@ -62,8 +62,12 @@ def extract_schema
def export_metrics
return if @span.nil?

Metrics::Exporter.export_waf_metrics(@metrics.waf, @span)
Metrics::Exporter.export_rasp_metrics(@metrics.rasp, @span)
# This does not caused a steep error previously because
# @span was wrongly defined as a SpanOperation that cannot be nil in context.rbs.
# Even though we check that @span is not nil, steep consideres that the thread can pause after that check,
# and another thread change it to nil. This does not happen in our case, which is why steep:ignore has been added.
Metrics::Exporter.export_waf_metrics(@metrics.waf, @span) # steep:ignore ArgumentTypeMismatch
Metrics::Exporter.export_rasp_metrics(@metrics.rasp, @span) # steep:ignore ArgumentTypeMismatch
end

def finalize
2 changes: 2 additions & 0 deletions lib/datadog/appsec/ext.rb
Original file line number Diff line number Diff line change
@@ -10,12 +10,14 @@ module Ext
INTERRUPT = :datadog_appsec_interrupt
CONTEXT_KEY = 'datadog.appsec.context'
ACTIVE_CONTEXT_KEY = :datadog_appsec_active_context
EXPLOIT_PREVENTION_EVENT_CATEGORY = 'exploit'

TAG_APPSEC_ENABLED = '_dd.appsec.enabled'
TAG_APM_ENABLED = '_dd.apm.enabled'
TAG_DISTRIBUTED_APPSEC_EVENT = '_dd.p.appsec'

TELEMETRY_METRICS_NAMESPACE = 'appsec'
TAG_STACK_TRACE = '_dd.stack'
end
end
end
1 change: 1 addition & 0 deletions lib/datadog/tracing/metadata.rb
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@

require_relative 'metadata/analytics'
require_relative 'metadata/tagging'
require_relative 'metadata/metastruct'
require_relative 'metadata/errors'

module Datadog
64 changes: 64 additions & 0 deletions lib/datadog/tracing/metadata/metastruct.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# frozen_string_literal: true

require 'forwardable'

module Datadog
module Tracing
module Metadata
# Adds complex structures tagging behavior through metastruct
class Metastruct
extend Forwardable

MERGER = proc do |_, v1, v2|
if v1.is_a?(Hash) && v2.is_a?(Hash)
v1.merge(v2, &MERGER)
elsif v1.is_a?(Array) && v2.is_a?(Array)
v1.concat(v2)
elsif v2.nil?
v1
else
v2
end
end

def self.empty
new({})
end

def initialize(metastruct)
@metastruct = metastruct
end

# Deep merge two metastructs
# If the types are not both Arrays or Hashes, the second one will overwrite the first one
#
# Example with same types:
# metastruct = { a: { b: [1, 2] } }
# second = { a: { b: [3, 4], c: 5 } }
# result = { a: { b: [1, 2, 3, 4], c: 5 } }
#
# Example with different types:
# metastruct = { a: { b: 1 } }
# second = { a: { b: [2, 3] } }
# result = { a: { b: [2, 3] } }
def deep_merge!(second)
@metastruct.merge!(second.to_h, &MERGER)
end

def_delegators :@metastruct, :[], :[]=, :dig, :to_h

def pretty_print(q)
q.seplist @metastruct.each do |key, value|
q.text "#{key} => #{value}\n"
end
end

def to_msgpack(packer = nil)
packer ||= MessagePack::Packer.new

packer.write(@metastruct.transform_values(&:to_msgpack))
end
end
end
end
end
10 changes: 9 additions & 1 deletion lib/datadog/tracing/span.rb
Original file line number Diff line number Diff line change
@@ -21,6 +21,7 @@ class Span
:end_time,
:id,
:meta,
:metastruct,
:metrics,
:name,
:parent_id,
@@ -53,6 +54,7 @@ def initialize(
end_time: nil,
id: nil,
meta: nil,
metastruct: {},
metrics: nil,
parent_id: 0,
resource: name,
@@ -75,6 +77,7 @@ def initialize(
@trace_id = trace_id || Tracing::Utils.next_id

@meta = meta || {}
@metastruct = Tracing::Metadata::Metastruct.new(metastruct)
@metrics = metrics || {}
@status = status || 0

@@ -144,6 +147,7 @@ def to_hash
error: @status,
meta: @meta,
metrics: @metrics,
meta_struct: @metastruct.to_h,
name: @name,
parent_id: @parent_id,
resource: @resource,
@@ -185,12 +189,16 @@ def pretty_print(q)
q.text "#{key} => #{value}"
end
end
q.group(2, 'Metrics: [', ']') do
q.group(2, 'Metrics: [', "]\n") do
q.breakable
q.seplist @metrics.each do |key, value|
q.text "#{key} => #{value}"
end
end
q.group(2, 'Metastruct: ') do
q.breakable
q.pp metastruct
end
end
end

Loading