Commit c7b75ef

version bump to v1.17.0
1 parent e8e8ffe commit c7b75ef

2 files changed: +67 -12 lines changed

CHANGELOG.md

+66-11
@@ -4,14 +4,14 @@ Nokogiri follows [Semantic Versioning](https://semver.org/), please see the [REA
44

55
---
66

7-
## v1.next / unreleased
7+
## v1.17.0 / 2024-12-08
88

99
### Dependencies
1010

1111
* [CRuby] Vendored libxml2 is updated to [v2.13.5](https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.13.5). @flavorjones
1212
* [CRuby] Vendored libxslt is updated to [v1.1.42](https://gitlab.gnome.org/GNOME/libxslt/-/releases/v1.1.42). @flavorjones
1313
* [CRuby] Minimum supported version of libxml2 raised to v2.9.2 (released 2014-10-16) from v2.6.21. [#3232, #3287] @flavorjones
14-
* [JRuby] Minimum supported versino of Java raised to 8 (released 2014-03-18) from 7. [#3134] @flavorjones
14+
* [JRuby] Minimum supported version of Java raised to 8 (released 2014-03-18) from 7. [#3134] @flavorjones
1515
* [CRuby] Update to rake-compiler-dock v1.5.1 for building precompiled native gems. [#3216] @flavorjones
1616

1717

@@ -30,25 +30,36 @@ If your application relies on the SAX parsers, and in particular if you're SAX-p
3030

3131
Document fragment parsing has been improved, particularly with respect to handling malformed fragments or fragments with implicit namespace prefixes. Namespace reconciliation still isn't where we want it to be, but it's an improvement.
3232

33-
HTML5 fragment parsing now allows the context node to be specified as a keyword argument to the `HTML5::DocumentFragment.parse` and `.new` methods, which in particular should allow for more flexible sanitization and support for the [draft HTML Sanitizer API](https://wicg.github.io/sanitizer-api/) in downstream libraries.
33+
HTML5 fragment parsing now allows the context node to be specified as a `context:` keyword argument to the `HTML5::DocumentFragment.parse` and `.new` methods, which should allow for more flexible sanitization and future support for the [draft HTML Sanitizer API](https://wicg.github.io/sanitizer-api/) in downstream libraries.
3434

3535

3636
#### Error handling
3737

38-
In scenarios where multiple errors could be reported by the underlying parser, the errors will be aggregated into a single `Nokogiri::XML::SyntaxError` that is raised. Previously only the final error reported by libxml2 was raised which was often misleading if it was only a warning and not the fatal error.
38+
In scenarios where multiple errors could be reported by the underlying parser, the errors will be aggregated into a single `Nokogiri::XML::SyntaxError` that is raised. Previously only the final error reported by libxml2 was raised (which was often misleading if it was only a warning and not the fatal error).
3939

4040

4141
#### Schema validation
4242

4343
We've resolved many long-standing bugs in the various schema classes, validation methods, and their error reporting. Behavior is now consistent across schema types and input types, as well as parser backends (Xerces and libxml2).
4444

4545

46+
#### Keyword arguments
47+
48+
The following methods now accept keyword arguments in addition to positional arguments, and use `...` parameter forwarding when possible:
49+
`HTML4()`, `HTML4.fragment`, `HTML4.parse`, `HTML4::Document.parse`, `HTML4::DocumentFragment#initialize`, `HTML4::DocumentFragment.parse`, `HTML5()`, `HTML5.fragment`, `HTML5.parse`, `HTML5::Document.parse`, `HTML5::Document.read_io`, `HTML5::Document.read_memory`, `HTML5::DocumentFragment#initialize`, `HTML5::DocumentFragment.parse`, `XML()`, `XML.fragment`, `XML.parse`, `XML::Document.parse`, `XML::DocumentFragment#initialize`, `XML::DocumentFragment.parse`, `XML::Node#canonicalize`, `XML::Node.parse`, `XML::Reader()`, `XML::RelaxNG()`, `XML::RelaxNG.new`, `XML::RelaxNG.read_memory`, `XML::SAX::PushParser#initialize`, `XML::Schema()`, `XML::Schema.new`, `XML::Schema.read_memory`, and `XSLT()`.
50+
51+
Special thanks to those contributors who participated in the RubyConf 2024 Hack Day to work on #3323 to help modernize Nokogiri by adding keyword arguments and using parameter forwarding in many methods, and expanding some of the documentation! We intend to continue adding keyword argument support to more methods. #3323 #3324 #3326 #3327 #3329 #3330 #3332 #3333 #3334 #3335 #3336 #3342 #3355 #3356 @infews @matiasow @MattJones @mononoken @openbl @flavorjones
52+
53+
4654
### Added
4755

4856
* Introduce support for a new SAX callback `XML::SAX::Document#reference`, which is called to report some parsed XML entities when `XML::SAX::ParserContext#replace_entities` is set to the default value `false`. This is necessary functionality for some applications that were previously relying on incorrect entity error reporting which has been fixed (see below). For more information, read the docs for `Nokogiri::XML::SAX::Document`. [#1926] @flavorjones
4957
* `XML::SAX::Parser#parse_memory` and `#parse_file` now accept an optional `encoding` argument. When not provided, the parser will fall back to the encoding passed to the initializer, and then fall back to autodetection. [#3288] @flavorjones
5058
* `XML::SAX::ParserContext.memory` now accepts an optional `encoding` argument. When not provided, the encoding will be autodetected. [#3288] @flavorjones
51-
* New attributes `XML::DocumentFragment#parse_options` and `HTML4::DocumentFragment#parse_options` contain the options used to parse the document fragment. @flavorjones
59+
* New readonly attributes `XML::DocumentFragment#parse_options` and `HTML4::DocumentFragment#parse_options` return the options used to parse the document fragment. @flavorjones
60+
* New method `XML::Reader.new` is the primary constructor to which `XML::Reader()` forwards. Both methods now take `url:`, `encoding:`, and `options:` kwargs in addition to the previous calling convention of passing positional parameters. #3326 @infews @flavorjones
61+
* [CRuby] The HTML5 parse methods accept a `:parse_noscript_content_as_text` keyword argument which will emulate the parsing behavior of a browser which has scripting enabled. [#3178, #3231] @stevecheckoway
62+
* [CRuby] `HTML5::DocumentFragment.parse` and `.new` accept a `:context` keyword argument that is the parse context node or element name. Previously this could only be passed in as a positional argument to `.new` and not at all to `.parse`. @flavorjones
5263
* [CRuby] `Nokogiri::HTML5::Builder` is similar to `HTML4::Builder` but returns an `HTML5::Document`. [#3119] @flavorjones
5364
* [CRuby] Attributes in an HTML5 document can be serialized individually, something that has always been supported by the HTML4 serializer. [#3125, #3127] @flavorjones
5465
* [CRuby] Introduce a compile-time option, `--disable-xml2-legacy`, to remove from libxml2 its dependencies on `zlib` and `liblzma` and disable implicit `HTTP` network requests. These all remain enabled by default, and are present in the precompiled native gems. This option is a precursor for removing these libraries in a future major release, but may be interesting for the security-minded who do not need features like automatic decompression and would like to remove these dependencies. You can read more and give feedback on these plans in #3168. [#3247] @flavorjones
@@ -57,17 +68,16 @@ We've resolved many long-standing bugs in the various schema classes, validation
5768

5869
### Improved
5970

71+
* Documentation has been improved for `XML::RelaxNG`, `XML::Schema`, `XML::Reader`, `HTML5`, `HTML5::Document`, `HTML5::DocumentFragment`, `HTML4::Document`, `HTML4::DocumentFragment`, `XML`, `XML::Document`, `XML::DocumentFragment`. #3355 @flavorjones
6072
* Documentation has been improved for `CSS.xpath_for`. [#3224] @flavorjones
6173
* Documentation for the SAX parsing classes has been greatly improved, including encoding overrides and the complex entity-handling behavior. [#3265] @flavorjones
6274
* `XML::Schema#read_memory` and `XML::RelaxNG#read_memory` are now Ruby methods that call `#from_document`. Previously these were native functions, but they were buggy on both CRuby and JRuby (but worse on JRuby) and so this is now useful, comparable in performance, and simpler code that is easier to maintain. [#2113, #2115] @flavorjones
6375
* `XML::SAX::ParserContext.io`'s `encoding` argument is now optional, and can now be an `Encoding` or an encoding name. When not provided will default to autodetecting the encoding. [#3288] @flavorjones
64-
* [CRuby] When compiling packaged libraries from source, allow users' `AR` and `LD` environment variables to set the archiver and linker commands, respectively. This augments the existing `CC` environment variable to set the compiler command. [#3165] @ziggythehamster
65-
* [CRuby] The HTML5 parse methods accept a `:parse_noscript_content_as_text` keyword argument which will emulate the parsing behavior of a browser which has scripting enabled. [#3178, #3231] @stevecheckoway
66-
* [CRuby] `HTML5::DocumentFragment.parse` and `.new` accept a `:context` keyword argument that is the parse context node or element name. Previously this could only be passed in as a positional argument to `.new` and not at all to `.parse`. @flavorjones
6776
* [CRuby] The update to libxml v2.13 improves "in context" fragment parsing recovery. We removed our hacky workaround for recovery that led to silently-degraded functionality when parsing fragments with parse errors. Specifically, malformed XML fragments that used implicit namespace prefixes will now "link up" to the namespaces in the parent document or node, where previously they did not. [#2092] @flavorjones
6877
* [CRuby] When multiple errors could be detected by the parser and there's no obvious document to save them in (for example, when parsing a document with the recovery parse option turned off), the libxml2 errors are aggregated into a single `Nokogiri::XML::SyntaxError`. Previously, only the last error recorded by libxml2 was raised, which might be misleading if it's merely a warning and not the fatal error preventing the operation. [#2562] @flavorjones
6978
* [CRuby] The SAX parser context and handler implementation has been simplified and now takes advantage of some of libxml2's default SAX handlers for entities and DTD management. [#3265] @flavorjones
70-
* [CRuby] When building from source on MacOS, environment variables AR and RANLIB are now respected when set instead of being overridden to /usr/bin/{ar,ranlib} (which is still the default). [#3338] @joshheinrichs-shopify
79+
* [CRuby] When compiling packaged libraries from source, allow users' `AR` and `LD` environment variables to set the archiver and linker commands, respectively. This augments the existing `CC` environment variable to set the compiler command. [#3165] @ziggythehamster
80+
* [CRuby] When building from source on MacOS, environment variables `AR` and `RANLIB` are now respected when set instead of being overridden to /usr/bin/{ar,ranlib} (which is still the default). [#3338] @joshheinrichs-shopify
7181

7282

7383
### Fixed
@@ -80,7 +90,7 @@ We've resolved many long-standing bugs in the various schema classes, validation
8090
* [CRuby] libgumbo (the HTML5 parser) treats reaching max-depth as EOF. This addresses a class of issues when the parser is interrupted in this way. [#3121] @stevecheckoway
8191
* [CRuby] Update node GC lifecycle to avoid a potential memory leak with fragments in libxml 2.13.0 caused by changes in `xmlAddChild`. [#3156] @flavorjones
8292
* [CRuby] libgumbo correctly prints nonstandard element names in error messages. [#3219] @stevecheckoway
83-
* [CRuby] SAX parsing no longer registers errors when encountering external entity references. [#1926] @flavorjones
93+
* [CRuby] External entity references no longer cause the SAX parser to register errors. [#1926] @flavorjones
8494
* [JRuby] Fixed entity reference serialization, which rendered both the reference and the replacement text. Incredibly nobody noticed this bug for over a decade. [#3272] @flavorjones
8595
* [JRuby] Fixed some bugs in how `Node#attributes` handles attributes with namespaces. [#2677, #2679] @flavorjones
8696
* [JRuby] Fix `Schema#validate` to only return the most recent Document's errors. Previously, if multiple documents were validated, this method returned the accumulated errors of all previous documents. [#1282] @flavorjones
@@ -93,7 +103,7 @@ We've resolved many long-standing bugs in the various schema classes, validation
93103

94104
### Changed
95105

96-
* [CRuby] `Nokogiri::XML::CData.new` no longer accepts `nil` as the content argument, making `CData` behave like other character data classes (like `Comment` and `Text`). This change was necessitated by behavioral changes in the upcoming libxml 2.13.0 release. If you wish to create an empty CDATA node, pass an empty string. [#3156] @flavorjones
106+
* [CRuby] `Nokogiri::XML::CData.new` no longer accepts `nil` as the content argument, making `CData` behave like other character data classes (like `Comment` and `Text`). This change was necessitated by behavioral changes in libxml2 v2.13.0. If you wish to create an empty CDATA node, pass an empty string. [#3156] @flavorjones
97107
* Internals:
98108
* The internal `CSS::XPathVisitor` class now accepts the xpath prefix and the context namespaces as constructor arguments. The `prefix:` and `ns:` keyword arguments to `CSS.xpath_for` cannot be specified if the `visitor:` keyword argument is also used. `CSS::XPathVisitor` now exposes `#builtins`, `#doctype`, `#prefix`, and `#namespaces` attributes. [#3225] @flavorjones
99109
* The internal CSS selector cache has been extracted into a distinct class, `CSS::SelectorCache`. Previously it was part of the `CSS::Parser` class. [#3226] @flavorjones
@@ -107,6 +117,51 @@ We've resolved many long-standing bugs in the various schema classes, validation
107117
* Passing libxml2 encoding IDs to `SAX::ParserContext` methods is now deprecated and will generate a warning. The use of `SAX::Parser::ENCODINGS` is also deprecated. Use `Encoding` objects or encoding names instead.
108118

109119

120+
### Thank you!
121+
122+
The following people and organizations were kind enough to sponsor @flavorjones or the Nokogiri project during the development of v1.17.0:
123+
124+
* via Github sponsors
125+
* renuo @renuo
126+
* Ajaya Agrawalla @ajaya
127+
* Rob Stringer @Mycobee
128+
* Better Stack Community @betterstack-community
129+
* Prowly @prowlycom
130+
* Maxime Gauthier @biximilien
131+
* Harry Lascelles @hlascelles
132+
* Evil Martians @evilmartians
133+
* Typesense @typesense
134+
* YOSHIDA Katsuhiko @kyoshidajp
135+
* Quan Nguyen @qu8n
136+
* Sentry @getsentry
137+
* Codecov @codecov
138+
* Frank Groeneveld @frenkel
139+
* Hiroshi SHIBATA @hsbt
140+
* Nando Vieira @fnando
141+
* Orien Madgwick @orien
142+
* Avo @avo-hq
143+
* Zoran Pesic @zokioki
144+
* @zzak
145+
* Graham Watts @GingerGraham
146+
* Nandang Permana Kusuma @nandangpk
147+
* Mr. Henry @mrhenry
148+
* Götz Görisch @GoetzGoerisch
149+
* Andrew Nesbitt @andrew
150+
* via Thanks.dev
151+
* Sentry @getsentry
152+
* Codecov @codecov
153+
* Keygen @keygen-sh
154+
* Keith Bauson @kwbauson
155+
* Nicco Kunzmann @niccokunzmann
156+
* timhaynes @timhaynes
157+
* via Open Collective
158+
* Airbnb @airbnb
159+
* Nemo @captn3m0
160+
* Velocity Labs @velocity-labs
161+
162+
We'd also like to thank @github who donate a ton of compute time for our CI pipelines!
163+
164+
110165
## v1.16.8 / 2024-12-02
111166

112167
### Fixed
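
Review note on the "Keyword arguments" entries in the CHANGELOG.md diff above: the changelog says the listed methods accept keyword arguments in addition to positional ones and use `...` parameter forwarding where possible. The sketch below shows that general Ruby pattern only. `SketchFragment` and its parameters are hypothetical stand-ins, not Nokogiri's classes or signatures, and the way the keyword defaults fall back to the positional values is an assumption about one reasonable way to keep both calling conventions working.

```ruby
# Hypothetical stand-in class for illustration; not part of Nokogiri.
class SketchFragment
  attr_reader :source, :context, :options

  # Accepts the older positional form (source, context, options) as well as
  # the keyword form (source, context:, options:). Each keyword defaults to
  # the value of its positional counterpart, so either style can be used.
  def initialize(source, context_ = nil, options_ = 0, context: context_, options: options_)
    @source = source
    @context = context
    @options = options
  end

  # `...` forwards all positional, keyword, and block arguments unchanged,
  # so this method never has to repeat the constructor's signature.
  def self.parse(...)
    new(...)
  end
end

positional = SketchFragment.parse("<p>hi</p>", "body")
keyword    = SketchFragment.parse("<p>hi</p>", context: "body")
```

Both calls end up with the same `context`, which is the compatibility property the changelog entry describes: existing positional callers keep working while new code can use keywords.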

lib/nokogiri/version/constant.rb

+1-1
@@ -2,5 +2,5 @@
22

33
module Nokogiri
44
# The version of Nokogiri you are using
5-
VERSION = "1.17.0.dev"
5+
VERSION = "1.17.0"
66
end
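
Review note on the `VERSION` change above: dropping the `.dev` suffix is what turns the version string from a prerelease into a final release as far as RubyGems is concerned. A minimal sketch using `Gem::Version` from RubyGems (the variable names are illustrative):

```ruby
require "rubygems"

dev   = Gem::Version.new("1.17.0.dev") # value before this commit
final = Gem::Version.new("1.17.0")     # value after this commit

# A version containing a letter segment is treated as a prerelease.
dev.prerelease?   # => true
final.prerelease? # => false

# Prereleases sort before the corresponding final release.
dev < final       # => true
```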
