[profiling] Reduce copying and allocation in exporter #926

Merged
danielsn merged 22 commits into main from dsn/exporter_avoid_copy
Mar 21, 2025

danielsn
Contributor

What does this PR do?

Passes the EncodedProfile as a Rust object, rather than forcing it through the C-FFI straw.

Motivation

Previously, we had to allocate a new Vec and copy the pprof bytes into it, even though they were already in memory. This change saves that allocation and copy (which can be several MB).
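
A minimal sketch of the difference, not the actual libdatadog code: `EncodedProfile`, `export_copying`, and `export_owned` are simplified, hypothetical stand-ins. The point it illustrates is that a borrowed byte slice forces a fresh allocation to obtain an owned body, while an owned Rust value can hand over its existing buffer.

```rust
/// Hypothetical, simplified stand-in for the encoded profile; the real
/// type in libdatadog carries more fields.
pub struct EncodedProfile {
    pub buffer: Vec<u8>,
}

/// Old shape (illustrative): the bytes arrive as a borrowed slice, as they
/// would across a C FFI boundary, so an owned body needs a new allocation
/// and a full copy.
pub fn export_copying(bytes: &[u8]) -> Vec<u8> {
    bytes.to_vec()
}

/// New shape (illustrative): the profile is passed by value, so the
/// existing buffer is moved into the request body without copying.
pub fn export_owned(profile: EncodedProfile) -> Vec<u8> {
    profile.buffer
}

fn main() {
    let profile = EncodedProfile { buffer: vec![0u8; 4 * 1024 * 1024] };
    let original_ptr = profile.buffer.as_ptr();

    // Copying produces a second heap allocation.
    let copied = export_copying(&profile.buffer);
    assert_ne!(copied.as_ptr(), original_ptr);

    // Moving reuses the original heap allocation.
    let moved = export_owned(profile);
    assert_eq!(moved.as_ptr(), original_ptr);
}
```

The pointer comparison is only there to make the move visible; moving a `Vec` transfers its pointer, length, and capacity and leaves the heap data where it is.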

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Existing tests

@github-actions github-actions bot added the profiling label (Relates to the profiling* modules.) Mar 13, 2025
pr-commenter bot commented Mar 13, 2025

Benchmarks

Comparison

Benchmark execution time: 2025-03-20 21:39:17

Comparing candidate commit 26375b3 in PR branch dsn/exporter_avoid_copy with baseline commit 9f66ffa in branch main.

Found 2 performance improvements and 0 performance regressions! Performance is the same for 50 metrics, 2 unstable metrics.

scenario:credit_card/is_card_number/ 378282246310005

  • 🟩 execution_time [-8.503µs; -8.386µs] or [-9.967%; -9.830%]
  • 🟩 throughput [+1278766.712op/s; +1296775.414op/s] or [+10.909%; +11.063%]

Candidate

Candidate benchmark details

Group 1

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
concentrator/add_spans_to_concentrator execution_time 5.985ms 5.996ms ± 0.007ms 5.995ms ± 0.003ms 5.998ms 6.005ms 6.011ms 6.052ms 0.95% 4.392 30.506 0.12% 0.000ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
concentrator/add_spans_to_concentrator execution_time [5.995ms; 5.997ms] or [-0.016%; +0.016%] None None None

Group 2

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
ip_address/quantize_peer_ip_address_benchmark execution_time 4.928µs 5.002µs ± 0.032µs 4.997µs ± 0.023µs 5.030µs 5.055µs 5.060µs 5.062µs 1.30% 0.051 -0.684 0.64% 0.002µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark execution_time [4.998µs; 5.007µs] or [-0.089%; +0.089%] None None None

Group 3

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
write only interface execution_time 1.176µs 3.196µs ± 1.424µs 2.999µs ± 0.022µs 3.019µs 3.635µs 13.849µs 14.873µs 395.87% 7.403 55.726 44.45% 0.101µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
write only interface execution_time [2.999µs; 3.394µs] or [-6.176%; +6.176%] None None None
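
As a reading aid for the statistics columns: the reported `sem` and `95% CI mean` values appear consistent with the usual normal-approximation formulas, sem = sd / √n and CI = mean ± 1.96 · sem. That relationship is inferred from the numbers in this report, not taken from the benchmark tool's documentation. A small sketch using the `write only interface` row (mean 3.196µs, sd 1.424µs, sample_size 200):

```rust
/// Standard error of the mean: sd / sqrt(n).
fn sem(sd: f64, n: usize) -> f64 {
    sd / (n as f64).sqrt()
}

/// Normal-approximation 95% confidence interval for the mean.
fn ci95(mean: f64, sd: f64, n: usize) -> (f64, f64) {
    let half_width = 1.96 * sem(sd, n);
    (mean - half_width, mean + half_width)
}

fn main() {
    // Values copied from the "write only interface" row, in microseconds.
    let (mean, sd, n) = (3.196, 1.424, 200);
    let (lo, hi) = ci95(mean, sd, n);
    println!("sem = {:.3}µs", sem(sd, n));
    println!("95% CI = [{:.3}µs; {:.3}µs]", lo, hi);
    println!("relative half-width = {:.3}%", 100.0 * (hi - mean) / mean);
}
```

Because the inputs are the rounded table values, the results agree with the reported `0.101µs`, `[2.999µs; 3.394µs]`, and `±6.176%` only to within rounding.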

Group 4

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching string interning on wordpress profile execution_time 147.962µs 148.882µs ± 0.338µs 148.844µs ± 0.142µs 148.986µs 149.405µs 149.632µs 152.012µs 2.13% 4.086 35.685 0.23% 0.024µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching string interning on wordpress profile execution_time [148.836µs; 148.929µs] or [-0.031%; +0.031%] None None None

Group 5

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
tags/replace_trace_tags execution_time 2.316µs 2.373µs ± 0.018µs 2.371µs ± 0.009µs 2.385µs 2.406µs 2.415µs 2.440µs 2.90% -0.073 1.808 0.77% 0.001µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
tags/replace_trace_tags execution_time [2.371µs; 2.376µs] or [-0.107%; +0.107%] None None None

Group 6

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
redis/obfuscate_redis_string execution_time 31.572µs 32.541µs ± 1.319µs 31.734µs ± 0.107µs 34.083µs 34.975µs 35.021µs 35.461µs 11.75% 0.964 -0.940 4.04% 0.093µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
redis/obfuscate_redis_string execution_time [32.358µs; 32.724µs] or [-0.562%; +0.562%] None None None

Group 7

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_trace/test_trace execution_time 245.757ns 254.542ns ± 13.544ns 248.487ns ± 2.283ns 255.869ns 286.087ns 301.373ns 304.701ns 22.62% 2.216 4.059 5.31% 0.958ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_trace/test_trace execution_time [252.665ns; 256.419ns] or [-0.737%; +0.737%] None None None

Group 8

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sql/obfuscate_sql_string execution_time 66.527µs 66.769µs ± 0.208µs 66.752µs ± 0.071µs 66.822µs 66.910µs 67.024µs 69.093µs 3.51% 7.820 80.022 0.31% 0.015µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sql/obfuscate_sql_string execution_time [66.740µs; 66.798µs] or [-0.043%; +0.043%] None None None

Group 9

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time 208.647µs 209.073µs ± 0.166µs 209.066µs ± 0.103µs 209.172µs 209.347µs 209.501µs 209.615µs 0.26% 0.200 0.291 0.08% 0.012µs 1 200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput 4770657.910op/s 4783010.372op/s ± 3805.878op/s 4783186.462op/s ± 2345.128op/s 4785424.236op/s 4789309.144op/s 4791158.252op/s 4792791.003op/s 0.20% -0.194 0.286 0.08% 269.116op/s 1 200
normalization/normalize_name/normalize_name/bad-name execution_time 18.238µs 18.316µs ± 0.044µs 18.311µs ± 0.034µs 18.348µs 18.391µs 18.425µs 18.437µs 0.69% 0.289 -0.428 0.24% 0.003µs 1 200
normalization/normalize_name/normalize_name/bad-name throughput 54239797.335op/s 54598660.134op/s ± 131326.850op/s 54612783.909op/s ± 100305.927op/s 54705808.604op/s 54822066.004op/s 54830884.780op/s 54831836.747op/s 0.40% -0.278 -0.439 0.24% 9286.211op/s 1 200
normalization/normalize_name/normalize_name/good execution_time 10.664µs 10.724µs ± 0.037µs 10.717µs ± 0.025µs 10.745µs 10.792µs 10.817µs 10.837µs 1.11% 0.606 -0.273 0.34% 0.003µs 1 200
normalization/normalize_name/normalize_name/good throughput 92277639.282op/s 93249169.746op/s ± 318646.608op/s 93305743.555op/s ± 222329.620op/s 93499148.267op/s 93692262.151op/s 93719108.965op/s 93775337.375op/s 0.50% -0.592 -0.298 0.34% 22531.718op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time [209.050µs; 209.097µs] or [-0.011%; +0.011%] None None None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput [4782482.914op/s; 4783537.831op/s] or [-0.011%; +0.011%] None None None
normalization/normalize_name/normalize_name/bad-name execution_time [18.309µs; 18.322µs] or [-0.033%; +0.033%] None None None
normalization/normalize_name/normalize_name/bad-name throughput [54580459.496op/s; 54616860.773op/s] or [-0.033%; +0.033%] None None None
normalization/normalize_name/normalize_name/good execution_time [10.719µs; 10.729µs] or [-0.047%; +0.047%] None None None
normalization/normalize_name/normalize_name/good throughput [93205008.391op/s; 93293331.102op/s] or [-0.047%; +0.047%] None None None

Group 10

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
two way interface execution_time 17.672µs 26.043µs ± 11.053µs 17.926µs ± 0.152µs 35.392µs 44.553µs 46.006µs 89.580µs 399.72% 1.654 5.205 42.34% 0.782µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
two way interface execution_time [24.511µs; 27.575µs] or [-5.882%; +5.882%] None None None

Group 11

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching deserializing traces from msgpack to their internal representation execution_time 55.426ms 55.682ms ± 0.205ms 55.619ms ± 0.041ms 55.670ms 56.111ms 56.403ms 56.946ms 2.39% 2.972 10.344 0.37% 0.014ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching deserializing traces from msgpack to their internal representation execution_time [55.653ms; 55.710ms] or [-0.051%; +0.051%] None None None

Group 12

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
credit_card/is_card_number/ execution_time 3.896µs 3.915µs ± 0.003µs 3.915µs ± 0.001µs 3.916µs 3.920µs 3.922µs 3.924µs 0.23% -1.098 8.803 0.07% 0.000µs 1 200
credit_card/is_card_number/ throughput 254842074.217op/s 255430328.959op/s ± 185419.287op/s 255432080.616op/s ± 93068.055op/s 255517055.775op/s 255732683.506op/s 255783816.716op/s 256652704.291op/s 0.48% 1.119 8.934 0.07% 13111.124op/s 1 200
credit_card/is_card_number/ 3782-8224-6310-005 execution_time 81.677µs 82.084µs ± 0.243µs 82.023µs ± 0.112µs 82.185µs 82.474µs 82.867µs 83.458µs 1.75% 2.204 8.263 0.29% 0.017µs 1 200
credit_card/is_card_number/ 3782-8224-6310-005 throughput 11982004.208op/s 12182734.907op/s ± 35776.132op/s 12191687.098op/s ± 16561.487op/s 12206405.314op/s 12219563.521op/s 12227742.205op/s 12243385.661op/s 0.42% -2.157 7.928 0.29% 2529.755op/s 1 200
credit_card/is_card_number/ 378282246310005 execution_time 76.401µs 76.868µs ± 0.304µs 76.851µs ± 0.244µs 77.045µs 77.470µs 77.601µs 77.817µs 1.26% 0.648 -0.128 0.39% 0.021µs 1 200
credit_card/is_card_number/ 378282246310005 throughput 12850709.094op/s 13009473.587op/s ± 51272.062op/s 13012199.546op/s ± 41474.271op/s 13055418.358op/s 13075232.177op/s 13087228.227op/s 13088811.684op/s 0.59% -0.631 -0.165 0.39% 3625.482op/s 1 200
credit_card/is_card_number/37828224631 execution_time 3.898µs 3.915µs ± 0.002µs 3.915µs ± 0.001µs 3.916µs 3.918µs 3.919µs 3.922µs 0.19% -1.614 9.660 0.06% 0.000µs 1 200
credit_card/is_card_number/37828224631 throughput 254957552.312op/s 255445511.973op/s ± 161141.118op/s 255440814.364op/s ± 86910.395op/s 255526101.992op/s 255725616.497op/s 255800045.091op/s 256537901.040op/s 0.43% 1.631 9.785 0.06% 11394.398op/s 1 200
credit_card/is_card_number/378282246310005 execution_time 71.875µs 72.316µs ± 0.324µs 72.308µs ± 0.268µs 72.483µs 72.888µs 73.199µs 73.280µs 1.34% 0.729 -0.189 0.45% 0.023µs 1 200
credit_card/is_card_number/378282246310005 throughput 13646218.544op/s 13828408.682op/s ± 61752.208op/s 13829735.265op/s ± 51262.218op/s 13885453.252op/s 13900940.816op/s 13909286.392op/s 13913068.545op/s 0.60% -0.711 -0.230 0.45% 4366.540op/s 1 200
credit_card/is_card_number/37828224631000521389798 execution_time 51.895µs 52.152µs ± 0.106µs 52.153µs ± 0.064µs 52.216µs 52.304µs 52.429µs 52.693µs 1.04% 0.592 2.834 0.20% 0.008µs 1 200
credit_card/is_card_number/37828224631000521389798 throughput 18977764.682op/s 19174970.782op/s ± 39019.066op/s 19174322.706op/s ± 23364.478op/s 19197551.365op/s 19242247.124op/s 19256357.563op/s 19269700.433op/s 0.50% -0.565 2.724 0.20% 2759.065op/s 1 200
credit_card/is_card_number/x371413321323331 execution_time 6.435µs 6.454µs ± 0.030µs 6.444µs ± 0.003µs 6.447µs 6.519µs 6.581µs 6.667µs 3.47% 3.584 15.788 0.47% 0.002µs 1 200
credit_card/is_card_number/x371413321323331 throughput 149990211.521op/s 154949445.423op/s ± 717342.522op/s 155191086.895op/s ± 70664.372op/s 155257909.583op/s 155332001.499op/s 155379422.050op/s 155401570.466op/s 0.14% -3.517 15.076 0.46% 50723.776op/s 1 200
credit_card/is_card_number_no_luhn/ execution_time 3.891µs 3.914µs ± 0.003µs 3.915µs ± 0.001µs 3.916µs 3.918µs 3.919µs 3.920µs 0.14% -3.016 21.600 0.07% 0.000µs 1 200
credit_card/is_card_number_no_luhn/ throughput 255083958.064op/s 255469179.431op/s ± 188225.157op/s 255447480.006op/s ± 93363.803op/s 255550068.647op/s 255788515.409op/s 255888701.595op/s 257022989.726op/s 0.62% 3.048 21.939 0.07% 13309.528op/s 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time 63.977µs 64.234µs ± 0.080µs 64.235µs ± 0.048µs 64.281µs 64.354µs 64.438µs 64.595µs 0.56% 0.220 2.158 0.12% 0.006µs 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput 15481178.856op/s 15568127.913op/s ± 19475.599op/s 15567752.957op/s ± 11673.214op/s 15579848.259op/s 15599967.865op/s 15613659.198op/s 15630545.211op/s 0.40% -0.205 2.127 0.12% 1377.133op/s 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time 58.164µs 58.297µs ± 0.080µs 58.287µs ± 0.027µs 58.309µs 58.456µs 58.647µs 58.703µs 0.71% 2.581 8.688 0.14% 0.006µs 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 throughput 17034811.727op/s 17153441.763op/s ± 23593.578op/s 17156453.490op/s ± 7970.451op/s 17165239.532op/s 17178895.126op/s 17187117.104op/s 17192797.937op/s 0.21% -2.564 8.598 0.14% 1668.318op/s 1 200
credit_card/is_card_number_no_luhn/37828224631 execution_time 3.893µs 3.914µs ± 0.003µs 3.914µs ± 0.002µs 3.916µs 3.919µs 3.920µs 3.937µs 0.57% 0.197 16.037 0.08% 0.000µs 1 200
credit_card/is_card_number_no_luhn/37828224631 throughput 254020869.893op/s 255469706.883op/s ± 217591.842op/s 255467213.424op/s ± 112689.404op/s 255579480.702op/s 255756382.463op/s 255847611.798op/s 256850537.727op/s 0.54% -0.151 15.996 0.08% 15386.067op/s 1 200
credit_card/is_card_number_no_luhn/378282246310005 execution_time 54.564µs 54.775µs ± 0.236µs 54.666µs ± 0.033µs 54.799µs 55.281µs 55.620µs 55.724µs 1.93% 1.939 3.104 0.43% 0.017µs 1 200
credit_card/is_card_number_no_luhn/378282246310005 throughput 17945682.244op/s 18256984.257op/s ± 77903.946op/s 18292843.840op/s ± 11013.385op/s 18300177.512op/s 18309152.930op/s 18324603.054op/s 18327206.738op/s 0.19% -1.922 3.011 0.43% 5508.641op/s 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time 51.887µs 52.138µs ± 0.095µs 52.135µs ± 0.075µs 52.214µs 52.279µs 52.338µs 52.374µs 0.46% -0.065 -0.442 0.18% 0.007µs 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput 19093338.881op/s 19180000.114op/s ± 34858.407op/s 19180821.098op/s ± 27717.924op/s 19206731.001op/s 19235091.448op/s 19257099.438op/s 19272744.976op/s 0.48% 0.074 -0.439 0.18% 2464.862op/s 1 200
credit_card/is_card_number_no_luhn/x371413321323331 execution_time 6.430µs 6.446µs ± 0.018µs 6.443µs ± 0.003µs 6.446µs 6.466µs 6.545µs 6.583µs 2.17% 5.277 32.541 0.28% 0.001µs 1 200
credit_card/is_card_number_no_luhn/x371413321323331 throughput 151895891.554op/s 155124672.373op/s ± 432439.255op/s 155198918.945op/s ± 74166.085op/s 155272194.648op/s 155403049.618op/s 155467812.127op/s 155530169.097op/s 0.21% -5.221 31.922 0.28% 30578.073op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
credit_card/is_card_number/ execution_time [3.915µs; 3.915µs] or [-0.010%; +0.010%] None None None
credit_card/is_card_number/ throughput [255404631.629op/s; 255456026.289op/s] or [-0.010%; +0.010%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 execution_time [82.050µs; 82.118µs] or [-0.041%; +0.041%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 throughput [12177776.679op/s; 12187693.135op/s] or [-0.041%; +0.041%] None None None
credit_card/is_card_number/ 378282246310005 execution_time [76.826µs; 76.910µs] or [-0.055%; +0.055%] None None None
credit_card/is_card_number/ 378282246310005 throughput [13002367.772op/s; 13016579.402op/s] or [-0.055%; +0.055%] None None None
credit_card/is_card_number/37828224631 execution_time [3.914µs; 3.915µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number/37828224631 throughput [255423179.363op/s; 255467844.582op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number/378282246310005 execution_time [72.271µs; 72.361µs] or [-0.062%; +0.062%] None None None
credit_card/is_card_number/378282246310005 throughput [13819850.420op/s; 13836966.944op/s] or [-0.062%; +0.062%] None None None
credit_card/is_card_number/37828224631000521389798 execution_time [52.137µs; 52.166µs] or [-0.028%; +0.028%] None None None
credit_card/is_card_number/37828224631000521389798 throughput [19169563.115op/s; 19180378.449op/s] or [-0.028%; +0.028%] None None None
credit_card/is_card_number/x371413321323331 execution_time [6.450µs; 6.458µs] or [-0.065%; +0.065%] None None None
credit_card/is_card_number/x371413321323331 throughput [154850028.648op/s; 155048862.197op/s] or [-0.064%; +0.064%] None None None
credit_card/is_card_number_no_luhn/ execution_time [3.914µs; 3.915µs] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/ throughput [255443093.234op/s; 255495265.627op/s] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time [64.223µs; 64.245µs] or [-0.017%; +0.017%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput [15565428.782op/s; 15570827.043op/s] or [-0.017%; +0.017%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time [58.286µs; 58.309µs] or [-0.019%; +0.019%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 throughput [17150171.920op/s; 17156711.606op/s] or [-0.019%; +0.019%] None None None
credit_card/is_card_number_no_luhn/37828224631 execution_time [3.914µs; 3.915µs] or [-0.012%; +0.012%] None None None
credit_card/is_card_number_no_luhn/37828224631 throughput [255439550.747op/s; 255499863.020op/s] or [-0.012%; +0.012%] None None None
credit_card/is_card_number_no_luhn/378282246310005 execution_time [54.742µs; 54.807µs] or [-0.060%; +0.060%] None None None
credit_card/is_card_number_no_luhn/378282246310005 throughput [18246187.519op/s; 18267780.994op/s] or [-0.059%; +0.059%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time [52.125µs; 52.151µs] or [-0.025%; +0.025%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput [19175169.074op/s; 19184831.154op/s] or [-0.025%; +0.025%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 execution_time [6.444µs; 6.449µs] or [-0.039%; +0.039%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 throughput [155064740.451op/s; 155184604.294op/s] or [-0.039%; +0.039%] None None None

Group 13

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 26375b3 1742506058 dsn/exporter_avoid_copy
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time 504.356µs 505.659µs ± 0.412µs 505.656µs ± 0.246µs 505.933µs 506.289µs 506.568µs 506.821µs 0.23% -0.217 0.328 0.08% 0.029µs 1 200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput 1973081.685op/s 1977617.783op/s ± 1610.399op/s 1977627.865op/s ± 962.648op/s 1978502.091op/s 1980342.396op/s 1981503.522op/s 1982728.234op/s 0.26% 0.223 0.332 0.08% 113.872op/s 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time 452.535µs 453.367µs ± 0.363µs 453.373µs ± 0.231µs 453.574µs 454.000µs 454.288µs 454.533µs 0.26% 0.324 0.101 0.08% 0.026µs 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput 2200060.282op/s 2205721.895op/s ± 1766.552op/s 2205689.477op/s ± 1124.331op/s 2206935.759op/s 2208527.743op/s 2209099.299op/s 2209774.318op/s 0.19% -0.319 0.095 0.08% 124.914op/s 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time 174.434µs 176.291µs ± 0.414µs 176.369µs ± 0.229µs 176.571µs 176.802µs 176.933µs 177.028µs 0.37% -1.252 2.499 0.23% 0.029µs 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput 5648827.806op/s 5672454.769op/s ± 13360.958op/s 5669918.896op/s ± 7350.703op/s 5678672.326op/s 5696446.048op/s 5712132.545op/s 5732811.284op/s 1.11% 1.273 2.588 0.23% 944.762op/s 1 200
normalization/normalize_service/normalize_service/[empty string] execution_time 37.603µs 37.715µs ± 0.051µs 37.712µs ± 0.036µs 37.750µs 37.800µs 37.850µs 37.878µs 0.44% 0.434 0.162 0.13% 0.004µs 1 200
normalization/normalize_service/normalize_service/[empty string] throughput 26400496.452op/s 26514506.701op/s ± 35643.692op/s 26516987.523op/s ± 25005.847op/s 26540363.349op/s 26564455.294op/s 26585105.761op/s 26593766.761op/s 0.29% -0.426 0.151 0.13% 2520.390op/s 1 200
normalization/normalize_service/normalize_service/test_ASCII execution_time 48.225µs 48.335µs ± 0.053µs 48.324µs ± 0.024µs 48.358µs 48.414µs 48.507µs 48.720µs 0.82% 2.623 14.822 0.11% 0.004µs 1 200
normalization/normalize_service/normalize_service/test_ASCII throughput 20525336.662op/s 20688911.928op/s ± 22492.995op/s 20693519.801op/s ± 10318.403op/s 20702239.977op/s 20716496.024op/s 20727560.946op/s 20736313.503op/s 0.21% -2.590 14.539 0.11% 1590.495op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time [505.602µs; 505.716µs] or [-0.011%; +0.011%] None None None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput [1977394.597op/s; 1977840.969op/s] or [-0.011%; +0.011%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time [453.316µs; 453.417µs] or [-0.011%; +0.011%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput [2205477.068op/s; 2205966.722op/s] or [-0.011%; +0.011%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time [176.234µs; 176.349µs] or [-0.033%; +0.033%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput [5670603.069op/s; 5674306.469op/s] or [-0.033%; +0.033%] None None None
normalization/normalize_service/normalize_service/[empty string] execution_time [37.708µs; 37.722µs] or [-0.019%; +0.019%] None None None
normalization/normalize_service/normalize_service/[empty string] throughput [26509566.828op/s; 26519446.574op/s] or [-0.019%; +0.019%] None None None
normalization/normalize_service/normalize_service/test_ASCII execution_time [48.328µs; 48.342µs] or [-0.015%; +0.015%] None None None
normalization/normalize_service/normalize_service/test_ASCII throughput [20685794.615op/s; 20692029.241op/s] or [-0.015%; +0.015%] None None None

Baseline

Omitted due to size.

@danielsn danielsn force-pushed the dsn/exporter_avoid_copy branch from 8945950 to 0a70e3e Compare March 13, 2025 21:52
@danielsn danielsn changed the title DRAFT [profiling] Reduce copying and allocation in exporter [profiling] Reduce copying and allocation in exporter Mar 13, 2025
@danielsn danielsn force-pushed the dsn/exporter_avoid_copy branch from 0a70e3e to dfa6eb3 Compare March 13, 2025 21:53
@danielsn danielsn force-pushed the dsn/exporter_avoid_copy branch from dfa6eb3 to 606d179 Compare March 13, 2025 21:54
@danielsn danielsn marked this pull request as ready for review March 13, 2025 21:54
@danielsn danielsn requested review from a team as code owners March 13, 2025 21:54
codecov-commenter commented Mar 13, 2025

Codecov Report

Attention: Patch coverage is 76.38889% with 68 lines in your changes missing coverage. Please review.

Project coverage is 72.98%. Comparing base (fa39a68) to head (26375b3).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #926      +/-   ##
==========================================
+ Coverage   72.91%   72.98%   +0.07%     
==========================================
  Files         334      334              
  Lines       51002    50915      -87     
==========================================
- Hits        37186    37160      -26     
+ Misses      13816    13755      -61     
| Components | Coverage Δ |
|---|---|
| crashtracker | 42.85% <ø> (+0.02%) ⬆️ |
| crashtracker-ffi | 6.25% <ø> (ø) |
| datadog-alloc | 98.73% <ø> (ø) |
| data-pipeline | 91.96% <ø> (ø) |
| data-pipeline-ffi | 90.29% <ø> (ø) |
| ddcommon | 82.95% <84.09%> (+1.57%) ⬆️ |
| ddcommon-ffi | 70.13% <61.11%> (+4.02%) ⬆️ |
| ddtelemetry | 61.87% <ø> (ø) |
| ddtelemetry-ffi | 22.46% <ø> (ø) |
| dogstatsd | 89.60% <ø> (ø) |
| dogstatsd-client | 82.57% <ø> (ø) |
| ipc | 82.51% <ø> (+0.10%) ⬆️ |
| profiling | 81.86% <77.40%> (-0.02%) ⬇️ |
| profiling-ffi | 69.90% <72.31%> (-0.83%) ⬇️ |
| serverless | 0.00% <ø> (ø) |
| sidecar | 42.07% <ø> (ø) |
| sidecar-ffi | 9.85% <ø> (ø) |
| spawn-worker | 54.37% <ø> (ø) |
| tinybytes | 91.59% <ø> (ø) |
| trace-mini-agent | 74.66% <ø> (ø) |
| trace-normalization | 98.24% <ø> (ø) |
| trace-obfuscation | 96.00% <ø> (ø) |
| trace-protobuf | 78.13% <ø> (ø) |
| trace-utils | 92.91% <ø> (ø) |

Member

@ivoanjo ivoanjo left a comment

Gave it a pass!

Overall, I like this PR less for the copying/allocation reduction and more because I think it's very useful to have a direct link between the profile and the exporter. In the past, for instance, we've had to expose the ProfiledEndpointsStats because it was not encoded in the pprof. With this change, we can trivially report more things (metrics? other info?) that also don't get encoded in the pprof.
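
To illustrate the point about a direct link, here is a hedged sketch with hypothetical types (`EndpointStats`, `EncodedProfile`, `Request`, and `build_request` are simplified stand-ins, not the libdatadog API). When the exporter receives the whole profile object, data that is not in the pprof bytes can travel with it instead of needing a separate parameter for each new field.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for per-endpoint statistics that are not part of
/// the pprof bytes.
#[derive(Default)]
pub struct EndpointStats {
    pub hits: HashMap<String, u64>,
}

/// Hypothetical stand-in for the encoded profile: the pprof bytes plus
/// side-channel data that travels with them.
pub struct EncodedProfile {
    pub buffer: Vec<u8>,
    pub endpoint_stats: EndpointStats,
}

/// Illustrative request type; a real exporter would build an HTTP request.
pub struct Request {
    pub body: Vec<u8>,
    pub endpoint_hits_total: u64,
}

/// The exporter reads the extra data straight from the profile it is given.
pub fn build_request(profile: EncodedProfile) -> Request {
    let endpoint_hits_total: u64 = profile.endpoint_stats.hits.values().sum();
    Request { body: profile.buffer, endpoint_hits_total }
}

fn main() {
    let mut stats = EndpointStats::default();
    stats.hits.insert("GET /users".to_string(), 3);
    stats.hits.insert("POST /orders".to_string(), 2);

    let profile = EncodedProfile { buffer: vec![1, 2, 3], endpoint_stats: stats };
    let request = build_request(profile);
    assert_eq!(request.body, vec![1, 2, 3]);
    assert_eq!(request.endpoint_hits_total, 5);
}
```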

@danielsn danielsn requested a review from a team as a code owner March 14, 2025 19:08
Contributor

r1viollet commented Mar 14, 2025

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.a | 78.40 MB | 78.42 MB | +.02% (+22.07 KB) 🔍 |
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.so | 7.80 MB | 7.80 MB | +0% (+336 B) 👌 |
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.so.debug | 24.55 MB | 24.56 MB | +.05% (+12.57 KB) 🔍 |

aarch64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so | 7.73 MB | 7.73 MB | +0% (+384 B) 👌 |
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a | 72.73 MB | 72.75 MB | +.02% (+20.04 KB) 🔍 |
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so.debug | 23.13 MB | 23.14 MB | +.05% (+12.16 KB) 🔍 |

i686-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /i686-alpine-linux-musl/lib/libdatadog_profiling.a | 67.68 MB | 67.70 MB | +.03% (+22.33 KB) 🔍 |
| /i686-alpine-linux-musl/lib/libdatadog_profiling.so | 8.25 MB | 8.25 MB | +0% (+56 B) 👌 |
| /i686-alpine-linux-musl/lib/libdatadog_profiling.so.debug | 23.70 MB | 23.71 MB | +.05% (+13.25 KB) 🔍 |

i686-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /i686-unknown-linux-gnu/lib/libdatadog_profiling.a | 68.56 MB | 68.58 MB | +.03% (+21.83 KB) 🔍 |
| /i686-unknown-linux-gnu/lib/libdatadog_profiling.so | 8.13 MB | 8.13 MB | +.05% (+4.42 KB) 🔍 |
| /i686-unknown-linux-gnu/lib/libdatadog_profiling.so.debug | 21.32 MB | 21.33 MB | +.06% (+13.31 KB) 🔍 |

libdatadog-x64-windows

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll | 17.17 MB | 17.19 MB | +.10% (+18.50 KB) 🔍 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib | 54.81 KB | 55.10 KB | +.51% (+290 B) 🔍 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb | 118.47 MB | 118.69 MB | +.19% (+232.00 KB) 🔍 |
| /libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib | 707.83 MB | 713.13 MB | +.74% (+5.30 MB) 🔍 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll | 5.05 MB | 5.05 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib | 54.81 KB | 55.10 KB | +.51% (+290 B) 🔍 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb | 16.29 MB | 16.30 MB | +.04% (+8.00 KB) 🔍 |
| /libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib | 26.96 MB | 26.96 MB | +.01% (+5.04 KB) 🔍 |

libdatadog-x86-windows

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll | 14.57 MB | 14.58 MB | +.08% (+12.50 KB) 🔍 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib | 55.66 KB | 55.94 KB | +.51% (+294 B) 🔍 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb | 120.51 MB | 120.73 MB | +.18% (+224.00 KB) 🔍 |
| /libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib | 699.50 MB | 704.68 MB | +.74% (+5.17 MB) 🔍 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll | 3.84 MB | 3.84 MB | +.03% (+1.50 KB) 🔍 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib | 55.66 KB | 55.94 KB | +.51% (+294 B) 🔍 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb | 16.96 MB | 16.96 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib | 24.95 MB | 24.95 MB | +.02% (+5.79 KB) 🔍 |

x86_64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.a | 67.68 MB | 67.70 MB | +.03% (+22.33 KB) 🔍 |
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.so | 8.25 MB | 8.25 MB | +0% (+56 B) 👌 |
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.so.debug | 23.70 MB | 23.71 MB | +.05% (+13.25 KB) 🔍 |

x86_64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a | 68.56 MB | 68.58 MB | +.03% (+21.83 KB) 🔍 |
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so | 8.13 MB | 8.13 MB | +.05% (+4.42 KB) 🔍 |
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so.debug | 21.32 MB | 21.33 MB | +.06% (+13.31 KB) 🔍 |

Member

@ivoanjo ivoanjo left a comment

Left a few notes!

Member

@ivoanjo ivoanjo left a comment

Nice work! I've marked it as Request changes because of the invalid-report-without-start/end thingy, but after clearing that this is basically good to go.

@danielsn danielsn requested a review from ivoanjo March 20, 2025 20:36
@danielsn danielsn dismissed ivoanjo’s stale review March 20, 2025 20:37

Addressed the issue

@danielsn danielsn merged commit 51d4c2b into main Mar 21, 2025
35 checks passed
@danielsn danielsn deleted the dsn/exporter_avoid_copy branch March 21, 2025 00:52
ivoanjo added a commit that referenced this pull request Mar 21, 2025
…ation

**What does this PR do?**

In #926 we removed the
`start_time` argument from
* `ddog_prof_Profile_new`
* `ddog_prof_Profile_with_string_storage`
* `ddog_prof_Profile_reset`

The intention was that having it as an argument in
`ddog_prof_Profile_serialize` was enough, and anyway almost
everyone was passing in `null`s in the APIs above.

When suggesting this in that PR, I missed that the `start_time`
passed to serialize was the `start_time` for the next profile, not
the one being serialized.

In this PR I'm changing that behavior: the `start_time` argument
to serialize now controls the time for the profile being
serialized, allowing the profiling library to have exact control
over this value.

I've also removed the duration.

[As can be seen in this github
search](https://github.com/search?q=org%3ADataDog+ddog_prof_Profile_serialize&type=code)
this is not expected to impact anyone: everyone's passing
`NULL` for `start_time` and `duration` when calling serialize already.
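
A hedged sketch of the two behaviours described above, using a hypothetical, simplified `Profile` type rather than the real libdatadog API. It assumes that serializing also resets the profile, and the method names `serialize_old` and `serialize_new` exist only for this illustration:

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical, simplified profile: only the fields needed to show the
/// timing semantics.
pub struct Profile {
    start_time: SystemTime,
    samples: Vec<u64>,
}

/// Hypothetical result of serialization.
pub struct Encoded {
    pub start: SystemTime,
    pub sample_count: usize,
}

impl Profile {
    pub fn new() -> Self {
        Profile { start_time: SystemTime::now(), samples: Vec::new() }
    }

    pub fn add_sample(&mut self, value: u64) {
        self.samples.push(value);
    }

    /// Behaviour described as the old one: the argument becomes the start
    /// of the *next* profile; the serialized one keeps its internal start.
    pub fn serialize_old(&mut self, start_time: Option<SystemTime>) -> Encoded {
        let encoded = Encoded { start: self.start_time, sample_count: self.samples.len() };
        self.samples.clear();
        self.start_time = start_time.unwrap_or_else(SystemTime::now);
        encoded
    }

    /// Behaviour described as the new one: the argument, when given,
    /// overrides the start of the profile *being serialized*.
    pub fn serialize_new(&mut self, start_time: Option<SystemTime>) -> Encoded {
        let encoded = Encoded {
            start: start_time.unwrap_or(self.start_time),
            sample_count: self.samples.len(),
        };
        self.samples.clear();
        self.start_time = SystemTime::now();
        encoded
    }
}

fn main() {
    // An arbitrary caller-chosen start time in the past.
    let wanted = SystemTime::UNIX_EPOCH + Duration::from_secs(1_700_000_000);

    let mut old = Profile::new();
    old.add_sample(1);
    // Old semantics: the serialized profile ignores the caller's value.
    assert_ne!(old.serialize_old(Some(wanted)).start, wanted);

    let mut new = Profile::new();
    new.add_sample(1);
    // New semantics: the serialized profile uses the caller's value.
    assert_eq!(new.serialize_new(Some(wanted)).start, wanted);
}
```

Passing `None` in either variant mirrors the `NULL` that existing callers already pass, so those callers keep the internally tracked start time.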

**Motivation:**

Allow Ruby profiler to set the start_time of profiles.

**Additional Notes:**

N/A

**How to test the change?**

I've tested this with my experimental libdatadog 17 branch for Ruby.
ivoanjo added a commit that referenced this pull request Mar 27, 2025
…ation (#963)

Labels
common profiling Relates to the profiling* modules.