Commit 5156796

Add max_delimiters_per_line config option
1 parent 5ce491f commit 5156796

10 files changed, +101 -6 lines changed

.phpstorm.meta.php

+1
@@ -31,6 +31,7 @@
     'html_input',
     'allow_unsafe_links',
     'max_nesting_level',
+    'max_delimiters_per_line',
     'renderer',
     'renderer/block_separator',
     'renderer/inner_separator',

CHANGELOG.md

+6-1
@@ -8,9 +8,14 @@ Updates should follow the [Keep a CHANGELOG](https://keepachangelog.com/) princi
 
 ### Added
 
+- Added `max_delimiters_per_line` config option to prevent denial of service attacks when parsing malicious input
+- Added `table/max_autocompleted_cells` config option to prevent denial of service attacks when parsing large tables
+- The `AttributesExtension` now supports attributes without values (#985, #986)
+- The `AutolinkExtension` exposes two new configuration options to override the default behavior (#969, #987):
+    - `autolink/allowed_protocols` - an array of protocols to allow autolinking for
+    - `autolink/default_protocol` - the default protocol to use when none is specified
 - Added `RegexHelper::isWhitespace()` method to check if a given character is an ASCII whitespace character
 - Added `CacheableDelimiterProcessorInterface` to ensure linear complexity for dynamic delimiter processing
-- Added `table/max_autocompleted_cells` config option to prevent denial of service attacks when parsing large tables
 - Added `Bracket` delimiter type to optimize bracket parsing
 
 ### Changed

docs/2.5/configuration.md

+2
@@ -27,6 +27,7 @@ $config = [
     'html_input' => 'escape',
     'allow_unsafe_links' => false,
     'max_nesting_level' => PHP_INT_MAX,
+    'max_delimiters_per_line' => PHP_INT_MAX,
     'slug_normalizer' => [
         'max_length' => 255,
     ],
@@ -73,6 +74,7 @@ Here's a list of the core configuration options available:
     - `escape` - Escape all HTML
 - `allow_unsafe_links` - Remove risky link and image URLs by setting this to `false` (default: `true`)
 - `max_nesting_level` - The maximum nesting level for blocks (default: `PHP_INT_MAX`). Setting this to a positive integer can help protect against long parse times and/or segfaults if blocks are too deeply-nested.
+- `max_delimiters_per_line` - The maximum number of delimiters (e.g. `*` or `_`) allowed in a single line (default: `PHP_INT_MAX`). Setting this to a positive integer can help protect against long parse times and/or segfaults if lines are too long.
 - `slug_normalizer` - Array of options for configuring how URL-safe slugs are created; see [the slug normalizer docs](/2.5/customization/slug-normalizer/#configuration) for more details
     - `instance` - An alternative normalizer to use (defaults to the included `SlugNormalizer`)
     - `max_length` - Limits the size of generated slugs (defaults to 255 characters)

docs/2.5/security.md

+21-1
@@ -11,7 +11,8 @@ In order to be fully compliant with the CommonMark spec, certain security settin
 
 - `html_input`: How to handle raw HTML
 - `allow_unsafe_links`: Whether unsafe links are permitted
-- `max_nesting_level`: Protected against long render times or segfaults
+- `max_nesting_level`: Protect against long render times or segfaults
+- `max_delimiters_per_line`: Protect against long parse times or rendering segfaults
 
 Further information about each option can be found below.
 
@@ -88,6 +89,25 @@ echo $converter->convert($markdown);
 
 See the [configuration](/2.5/configuration/) section for more information.
 
+## Max Delimiters Per Line
+
+Similarly to the maximum nesting level, **no maximum number of delimiters per line is enforced by default.** Delimiters can be nested (like `*a **b** c*`) or un-nested (like `*a* *b* *c*`) - in either case, having too many in a single line can result in long parse times. We therefore have a separate option to limit the number of delimiters per line.
+
+If you need to parse untrusted input, consider setting a reasonable `max_delimiters_per_line` (perhaps 100-1000) depending on your needs. Once this level is hit, any subsequent delimiters on that line will be rendered as plain text.
+
+### Example - Prevent too many delimiters
+
+```php
+use League\CommonMark\CommonMarkConverter;
+
+$markdown = '*a* **b *c **d** c* b**'; // 8 delimiters (* and **)
+
+$converter = new CommonMarkConverter(['max_delimiters_per_line' => 6]);
+echo $converter->convert($markdown);
+
+// <p><em>a</em> **b *c <strong>d</strong> c* b**</p>
+```
+
 ## Additional Filtering
 
 Although this library does offer these security features out-of-the-box, some users may opt to also run the HTML output through additional filtering layers (like HTMLPurifier). If you do this, make sure you **thoroughly** test your additional post-processing steps and configure them to work properly with the types of HTML elements and attributes that converted Markdown might produce, otherwise, you may end up with weird behavior like missing images, broken links, mismatched HTML tags, etc.

docs/2.6/upgrading.md

+7
@@ -7,6 +7,13 @@ redirect_from: /upgrading/
 
 # Upgrading from 2.5 to 2.6
 
+## `max_delimiters_per_line` Configuration Option
+
+The `max_delimiters_per_line` configuration option was added in 2.6 to help protect against malicious input that could
+cause excessive memory usage or denial of service attacks. It defaults to `PHP_INT_MAX` (no limit) for backwards
+compatibility, which is safe when parsing trusted input. However, if you're parsing untrusted input from users, you
+should probably set this to a reasonable value (somewhere between `100` and `1000`) to protect against malicious inputs.
+
 ## Custom Delimiter Processors
 
 If you're implementing a custom delimiter processor, and `getDelimiterUse()` has more logic than just a

src/Delimiter/DelimiterStack.php

+10-1
@@ -39,8 +39,13 @@ final class DelimiterStack
      */
     private $missingIndexCache;
 
-    public function __construct()
+
+    private int $remainingDelimiters = 0;
+
+    public function __construct(int $maximumStackSize = PHP_INT_MAX)
     {
+        $this->remainingDelimiters = $maximumStackSize;
+
         if (\PHP_VERSION_ID >= 80000) {
             /** @psalm-suppress PropertyTypeCoercion */
             $this->missingIndexCache = new \WeakMap(); // @phpstan-ignore-line
@@ -51,6 +56,10 @@ public function __construct()
 
     public function push(DelimiterInterface $newDelimiter): void
     {
+        if ($this->remainingDelimiters-- <= 0) {
+            return;
+        }
+
         $newDelimiter->setPrevious($this->top);
 
         if ($this->top !== null) {
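
The budget check added to `DelimiterStack::push()` can be sketched in isolation. The class below is a hypothetical stand-in, not part of league/commonmark: `BudgetedStack` and its `all()` method are invented for illustration, and it stores plain strings instead of `DelimiterInterface` objects. It only shows how the post-decrement comparison silently drops every push once the budget is spent.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for DelimiterStack: only the budget logic is modelled.
final class BudgetedStack
{
    /** @var list<string> */
    private array $items = [];

    private int $remaining;

    public function __construct(int $maximumSize = PHP_INT_MAX)
    {
        $this->remaining = $maximumSize;
    }

    public function push(string $item): void
    {
        // Post-decrement: compare the current budget, then consume one unit.
        if ($this->remaining-- <= 0) {
            return; // Budget exhausted: the item is silently dropped.
        }

        $this->items[] = $item;
    }

    /** @return list<string> */
    public function all(): array
    {
        return $this->items;
    }
}

$stack = new BudgetedStack(2);
foreach (['*', '*', '**', '**'] as $delimiter) {
    $stack->push($delimiter);
}

echo \implode(' ', $stack->all()), "\n"; // prints "* *"
```

With a budget of `0` nothing is ever stored. With the default `PHP_INT_MAX` the check is effectively a no-op, which matches the "no limit" default described in the docs changes in this commit.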

src/Environment/Environment.php

+1
@@ -432,6 +432,7 @@ public static function createDefaultConfiguration(): Configuration
             'html_input' => Expect::anyOf(HtmlFilter::STRIP, HtmlFilter::ALLOW, HtmlFilter::ESCAPE)->default(HtmlFilter::ALLOW),
             'allow_unsafe_links' => Expect::bool(true),
             'max_nesting_level' => Expect::type('int')->default(PHP_INT_MAX),
+            'max_delimiters_per_line' => Expect::type('int')->default(PHP_INT_MAX),
             'renderer' => Expect::structure([
                 'block_separator' => Expect::string("\n"),
                 'inner_separator' => Expect::string("\n"),

src/Parser/InlineParserContext.php

+2-2
@@ -42,12 +42,12 @@ final class InlineParserContext
      */
     private array $matches;
 
-    public function __construct(Cursor $contents, AbstractBlock $container, ReferenceMapInterface $referenceMap)
+    public function __construct(Cursor $contents, AbstractBlock $container, ReferenceMapInterface $referenceMap, int $maxDelimitersPerLine = PHP_INT_MAX)
     {
         $this->referenceMap = $referenceMap;
         $this->container = $container;
         $this->cursor = $contents;
-        $this->delimiterStack = new DelimiterStack();
+        $this->delimiterStack = new DelimiterStack($maxDelimitersPerLine);
     }
 
     public function getContainer(): AbstractBlock

src/Parser/InlineParserEngine.php

+1-1
@@ -59,7 +59,7 @@ public function parse(string $contents, AbstractBlock $block): void
         $contents = \trim($contents);
         $cursor = new Cursor($contents);
 
-        $inlineParserContext = new InlineParserContext($cursor, $block, $this->referenceMap);
+        $inlineParserContext = new InlineParserContext($cursor, $block, $this->referenceMap, $this->environment->getConfiguration()->get('max_delimiters_per_line'));
 
         // Have all parsers look at the line to determine what they might want to parse and what positions they exist at
         foreach ($this->matchParsers($contents) as $matchPosition => $parsers) {
@@ -0,0 +1,50 @@
+<?php
+
+declare(strict_types=1);
+
+/*
+ * This file is part of the league/commonmark package.
+ *
+ * (c) Colin O'Dell <[email protected]>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+namespace League\CommonMark\Tests\Functional;
+
+use League\CommonMark\CommonMarkConverter;
+use PHPUnit\Framework\TestCase;
+
+final class MaxDelimitersPerLineTest extends TestCase
+{
+    /**
+     * @dataProvider provideTestCases
+     */
+    public function testIt(string $input, int $maxDelimsPerLine, string $expectedOutput): void
+    {
+        $converter = new CommonMarkConverter(['max_delimiters_per_line' => $maxDelimsPerLine]);
+
+        $this->assertEquals($expectedOutput, \trim($converter->convert($input)->getContent()));
+    }
+
+    /**
+     * @return iterable<array<mixed>>
+     */
+    public function provideTestCases(): iterable
+    {
+        yield ['*a* **b *c* b**', 6, '<p><em>a</em> <strong>b <em>c</em> b</strong></p>'];
+
+        yield ['*a* **b *c **d** c* b**', 0, '<p>*a* **b *c **d** c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 1, '<p>*a* **b *c **d** c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 2, '<p><em>a</em> **b *c **d** c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 3, '<p><em>a</em> **b *c **d** c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 4, '<p><em>a</em> **b *c **d** c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 5, '<p><em>a</em> **b *c **d** c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 6, '<p><em>a</em> **b *c <strong>d</strong> c* b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 7, '<p><em>a</em> **b <em>c <strong>d</strong> c</em> b**</p>'];
+        yield ['*a* **b *c **d** c* b**', 8, '<p><em>a</em> <strong>b <em>c <strong>d</strong> c</em> b</strong></p>'];
+        yield ['*a* **b *c **d** c* b**', 9, '<p><em>a</em> <strong>b <em>c <strong>d</strong> c</em> b</strong></p>'];
+        yield ['*a* **b *c **d** c* b**', 100, '<p><em>a</em> <strong>b <em>c <strong>d</strong> c</em> b</strong></p>'];
+    }
+}
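
The test cases show that limits 2 through 5 all produce the same output, and that each further step from 6 to 8 unlocks one more emphasis pair. A rough way to see why is the toy model below. It is deliberately simplified and is not the library's algorithm: the hypothetical `countMatchedPairs()` helper ignores CommonMark's flanking rules, keeps only the first `$limit` delimiter runs, and pairs a run with an identical run on top of a stack.

```php
<?php

declare(strict_types=1);

/**
 * Simplified model (not the library's algorithm): keep only the first $limit
 * delimiter runs, then pair each run with an identical run on top of a stack.
 *
 * @param list<string> $runs Delimiter runs in source order
 */
function countMatchedPairs(array $runs, int $limit): int
{
    $kept  = \array_slice($runs, 0, \max(0, $limit));
    $stack = [];
    $pairs = 0;

    foreach ($kept as $run) {
        if ($stack !== [] && \end($stack) === $run) {
            \array_pop($stack); // This run closes the opener on top of the stack.
            $pairs++;
        } else {
            $stack[] = $run; // Treat this run as a potential opener.
        }
    }

    return $pairs;
}

// Delimiter runs of '*a* **b *c **d** c* b**', in order of appearance.
$runs = ['*', '*', '**', '*', '**', '**', '*', '**'];

foreach ([0, 2, 5, 6, 7, 8] as $limit) {
    echo $limit, ' => ', countMatchedPairs($runs, $limit), "\n";
}
```

Under this model the pair counts are 0, 1, 1, 2, 3 and 4 for limits 0, 2, 5, 6, 7 and 8. Runs three to five are all openers whose closers fall beyond the budget, so nothing new matches until the sixth run is admitted.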
