Bugfixes
(JRuby) Fix out of memory bug when certain invalid documents are parsed.
(JRuby) Fix regression of billion-laughs vulnerability. #586
This release was based on v1.5.10 and 1.6.0.rc1, and contains changes mentioned in both.
Deprecations
Remove pre 1.9 monitoring from Travis.
This release was based on v1.5.9, and so does not contain any fixes mentioned in the notes for v1.5.10.
Notes
mini_portile is now a runtime dependency
Ruby 1.9.2 and higher now required
Features
(MRI) Source code for libxml 2.8.0 and libxslt 1.1.26 is packaged with the gem. These libraries are compiled at gem install time unless the environment variable NOKOGIRI_USE_SYSTEM_LIBRARIES is set. VERSION_INFO (also `nokogiri -v`) exposes whether libxml was compiled from packaged source, or the system library was used.
(Windows) libxml upgraded to 2.8.0
Deprecations
Support for Ruby 1.8.7 and prior has been dropped
Bugfixes
(JRuby) Fix out of memory bug when certain invalid documents are parsed.
(JRuby) Fix regression of billion-laughs vulnerability. #586
Bugfixes
(JRuby) Fix “null document” error when parsing an empty IO in JRuby 1.7.3. #883
(JRuby) Fix schema validation when XSD has DOCTYPE set to DTD. #861 (Thanks, Patrick Cheng!)
(MRI) Fix segfault when there is no default subelement for an HTML node. #917
Notes
Use rb_ary_entry instead of RARRAY_PTR (you know, for Rubinius). #877 (Thanks, Dirkjan Bussink!)
Fix TypeError when running tests. #900 (Thanks, Cédric Boutillier!)
Bugfixes
Ensure that prefixed attributes are properly namespaced when reparented. #869
Fix for inconsistent namespaced attribute access for SVG nested in HTML. #861
(MRI) Fixed a memory leak in fragment parsing if nodes are not all subsequently reparented. #856
Bugfixes
(JRuby) Fix EmptyStackException thrown by elements with xlink:href attributes and no base_uri #534, #805. (Thanks, Patrick Quinn and Brian Hoffman!)
Fixes duplicate attributes issue introduced in 1.5.7. #865
Allow use of a prefixed namespace on a root node using Nokogiri::XML::Builder #868
Features
Windows support for Ruby 2.0.
Bugfixes
SAX::Parser.parse_io throws an error when used with lower case encoding. #828
(JRuby) Java Nokogiri is finally green (passes all tests) under 1.8 and 1.9 mode. High five everyone. #798, #705
(JRuby) Nokogiri::XML::Reader broken (as a pull parser) on JRuby: it reads the whole XML document. #831
(JRuby) JRuby hangs parsing “&”. #837
(JRuby) JRuby NPE parsing an invalid XML instruction. #838
(JRuby) Node#content= incompatibility. #839
(JRuby) to_xhtml doesn’t print the last slash for self-closing tags in JRuby. #834
(JRuby) Adding an EntityReference after a Text node mangles the entity in JRuby. #835
(JRuby) JRuby version inconsistency: nil for empty attributes. #818
CSS queries for classes (e.g., “.foo”) now treat all whitespace identically. #854
Namespace behavior cleaned up and made consistent between JRuby and MRI. #846, #801 (Thanks, Michael Klein!)
(MRI) SAX parser handles empty processing instructions. #845
Features
Improved performance of XML::Document#collect_namespaces. #761 (Thanks, Juergen Mangler!)
New callback SAX::Document#processing_instruction (Thanks, Kitaiti Makoto!)
Node#native_content= allows setting unescaped node content. #768
XPath lookup with namespaces supports symbol keys. #729 (Thanks, Ben Langfeld.)
XML::Node#[]= stringifies values. #729 (Thanks, Ben Langfeld.)
bin/nokogiri will process a document from $stdin
bin/nokogiri -e will execute a program from the command line
(JRuby) bin/nokogiri --version will print the Xerces and NekoHTML versions.
Bugfixes
Nokogiri now detects XSLT transform errors. #731 (Thanks, Justin Fitzsimmons!)
Don’t throw an Error when trying to replace top-level text node in DocumentFragment. #775
Raise an ArgumentError if an invalid encoding is passed to the SAX parser. #756 (Thanks, Bradley Schaefer!)
Prefixed element inconsistency between CRuby and JRuby. #712
(JRuby) space prior to xml preamble causes nokogiri to fail parsing. (fixed along with #748) #790
(JRuby) Fixed Nokogiri::XML::Node#content inconsistency between Java and C. #794, #797
(JRuby) raises INVALID_CHARACTER_ERR exception when EntityReference name starts with ‘#’. #719
(JRuby) doesn’t coerce namespaces out of strings on a direct subclass of Node. #715
(JRuby) Node#content now renders newlines properly. #737 (Thanks, Piotr Szmielew!)
(JRuby) Unknown namespaces are ignored when the recover option is used. #748
(JRuby) XPath queries for namespaces should not throw exceptions when called twice in a row. #764
(JRuby) More consistent (with libxml2) whitespace formatting when emitting XML. #771
(JRuby) namespaced attributes broken when appending raw xml to builder. #770
(JRuby) Nokogiri::XML::Document#wrap raises undefined method ‘length’ for nil:NilClass when trying to << to a node. #781
(JRuby) Fixed “bad file descriptor” bug when closing open file descriptors. #495
(JRuby) JRuby/CRuby incompatibility for attribute decorators. #785
(JRuby) Issues parsing valid XML with no internal subset in the DTD. #547, #811
(JRuby) Issues parsing valid node content when it contains colons. #728
(JRuby) Correctly parse the doc type of html documents. #733
(JRuby) Include dtd in the xml output when a builder is used with create_internal_subset. #751
(JRuby) Builder requires textwrappers for valid UTF-8 in JRuby, not in MRI. #784
Features
Much-improved support for JRuby in 1.9 mode! Yay!
Bugfixes
Regression in JRuby Nokogiri add_previous_sibling (1.5.0 -> 1.5.1) #691 (Thanks, John Shahid!)
JRuby unable to create HTML doc if URL arg provided #674 (Thanks, John Shahid!)
JRuby raises NullPointerException when given HTML document is nil or empty string. #699
JRuby 1.9 error, uncaught throw ‘encoding_found’, has been fixed. #673
Invalid encoding returned in JRuby with US-ASCII. #583
XmlSaxPushParser raises IndexOutOfBoundsException when over 512 characters are given. #567, #615
When xpath evaluation returns empty NodeSet, decorating NodeSet’s base document raises exception. #514
JRuby raises exception when xpath with namespace is specified. pull request #681 (Thanks, Piotr Szmielew)
JRuby renders nodes without their namespace when subclassing Node. #695
JRuby raises NAMESPACE_ERR (org.w3c.dom.DOMException) while instantiating RDF::RDFXML::Writer. #683
JRuby is not able to use namespaces in xpath. #493
JRuby’s Entity resolving should be consistent with C-Nokogiri #704, #647, #703
Features
The “nokogiri” script now has more verbose output when passed the `--rng` option. #675 (Thanks, Dan Radez!)
Build support on hardened Debian systems that use `-Werror=format-security`. #680.
Better build support for systems with pkg-config. #584
Better build support for systems with multiple iconv installations.
Bugfixes
Segmentation fault when creating a comment node for a DocumentFragment. #677, #678.
Treat ‘.’ as xpath in at() and search(). #690
(MRI, Security) Default parse options for XML documents were changed to not make network connections during document parsing, to avoid XXE vulnerability. #693
To re-enable this behavior, the configuration method `nononet` may be called, like this:
Nokogiri::XML::Document.parse(xml) { |config| config.nononet }
Insert your own joke about double-negatives here.
Features
Support for “prefixless” CSS selectors ~, > and + like jQuery supports. #621, #623. (Thanks, David Lee!)
Attempting to improve installation on homebrew 0.9 (with regards to iconv). Isn’t package management convenient?
Bugfixes
Custom xpath functions with empty nodeset arguments cause a segfault. #634.
Nokogiri::XML::Node#css now works for XML documents with default namespaces when the rule contains attribute selector without namespace.
Fixed marshalling bugs around how arguments are passed to (and returned from) XSLT custom xpath functions. #640.
Nokogiri::XML::Reader#outer_xml is broken in JRuby #617
Nokogiri::XML::Attribute on JRuby returns a nil namespace #647
Nokogiri::XML::Node#namespace= cannot set a namespace without a prefix on JRuby #648
(JRuby) 1.9 mode causes deadlock while running rake #571
HTML::Document#meta_encoding does not raise exception on docs with malformed content-type. #655
Fixing segfault related to unsupported encodings in in-context parsing on 1.8.7. #643
(JRuby) Concurrency issue in XPath parsing. #682
Repackaging of 1.5.1 with a gemspec that is compatible with older Rubies. #631, #632.
Features
XML::Builder#comment allows creation of comment nodes.
CSS searches now support namespaced attributes. #593
Java integration feature is added. Now, XML::Document.wrap and XML::Document#to_java methods are available.
RelaxNG validator support in the `nokogiri` cli utility. #591 (thanks, Dan Radez!)
Bugfixes
Fix many memory leaks and segfault opportunities. Thanks, Tim Elliott!
extconf searches homebrew paths if homebrew is installed.
Inconsistent behavior of Nokogiri 1.5.0 Java #620
Inheriting from Nokogiri::XML::Node on JRuby (1.6.4/5) fails #560
XML::Attr nodes are not allowed to be added as node children, so an exception is raised. #558
No longer defensively “pickle” adjacent text nodes on Node#add_next_sibling and Node#add_previous_sibling calls. #595.
Java version inconsistency: it returns nil for empty attributes #589
to_xhtml incorrectly generates <p /></p> when tag is empty #557
Document#add_child now accepts a Node, NodeSet, DocumentFragment, or String. #546.
Document#create_element now recognizes namespaces containing non-word characters (like “SOAP-ENV”). This is mostly relevant to users of Builder, which calls Document#create_element for nearly everything. #531.
File encoding broken in 1.5.0 / jruby / windows #529
Java version does not return namespace defs as attrs for ::HTML #542
Bad file descriptor with Nokogiri 1.5.0 #495
remove_namespace! doesn’t work in pure java version #492
The Nokogiri Java native build throws a null pointer exception when ActiveSupport’s .blank? method is called directly on a parsed object. #489
1.5.0 Not using correct character encoding #488
Raw XML string in XML Builder broken on JRuby #486
Nokogiri 1.5.0 XML generation broken on JRuby #484
Do not allow multiple root nodes. #550
Fixes for custom XPath functions. #605, #606 (thanks, Juan Wajnerman!)
Node#to_xml does not override :save_with if it is provided. #505
Node#set is a private method (JRuby). #564 (thanks, Nick Sieger!)
C14n cleanup and Node#canonicalize (thanks, Ivan Pirlik!) #563
Notes
See changelog from 1.4.7
Features
Extracted sets of Node::SaveOptions into Node::SaveOptions::DEFAULT_XML, DEFAULT_HTML, and DEFAULT_XHTML (refactor)
Bugfixes
default output of XML on JRuby is no longer formatted due to inconsistent whitespace handling. #415
(JRuby) making empty NodeSets with null `nodes` member safe to operate on. #443
Fix a bug in advanced encoding detection that leads to partially duplicated document when parsing an HTML file with unknown encoding.
Add support for <meta charset=“…”>.
Notes
JRuby performance tuning
See changelog from 1.4.4
Bugfixes
Node#inner_text no longer returns nil. (JRuby) #264
Notes
See changelog from 1.4.3
Notes
JRuby support is provided by a new pure-java backend.
Deprecations
Ruby 1.8.6 is deprecated. Nokogiri will install, but official support has ended.
LibXML 2.6.16 and earlier are deprecated. Nokogiri will refuse to install.
FFI support is removed.
Bugfixes
Fix a bug in advanced encoding detection that leads to partially duplicated document when parsing an HTML file with unknown encoding. Thanks, Timothy Elliott (@ender672)! #478
Notes
This version is functionally identical to 1.4.5.
Ruby 1.8.6 support has been restored.