Nokogiri::XML::Document is the main entry point for dealing with XML documents. The Document is created by parsing an XML document. See ::parse for more information on parsing.
For searching a Document, see Nokogiri::XML::Node#css and Nokogiri::XML::Node#xpath.
The pattern used to recognize namespace declarations (see NCNAME_RE) ignores Unicode characters. See www.w3.org/TR/REC-xml-names/#ns-decl for more details.
A list of Nokogiri::XML::SyntaxError found when parsing a document
Parse an XML file.
string_or_io may be a String, or any object that responds to read and close, such as an IO or StringIO.
url (optional) is the URI where this document is located.
encoding (optional) is the encoding that should be used when processing the document.
options (optional) is a configuration object that sets options during parsing, such as Nokogiri::XML::ParseOptions::RECOVER. See Nokogiri::XML::ParseOptions for more information.
block (optional) is passed a configuration object on which parse options may be set.
When parsing untrusted documents, it's recommended that the nonet option be used, as shown in this example code:
Nokogiri::XML::Document.parse(xml_string) { |config| config.nonet }
Nokogiri.XML() is a convenience method which will call this method.
# File lib/nokogiri/xml/document.rb, line 43
def self.parse string_or_io, url = nil, encoding = nil, options = ParseOptions::DEFAULT_XML, &block
  options = Nokogiri::XML::ParseOptions.new(options) if Fixnum === options
  # Give the options to the user
  yield options if block_given?
  return new if !options.strict? && empty_doc?(string_or_io)
  doc = if string_or_io.respond_to?(:read)
    url ||= string_or_io.respond_to?(:path) ? string_or_io.path : nil
    read_io(string_or_io, url, encoding, options.to_i)
  else
    # read_memory pukes on empty docs
    read_memory(string_or_io, url, encoding, options.to_i)
  end
  # do xinclude processing
  doc.do_xinclude(options) if options.xinclude?
  return doc
end
JRuby
Wraps Java's org.w3c.dom.Document and returns a Nokogiri::XML::Document.
# File lib/nokogiri/xml/document.rb, line 250
def self.wrap document
  raise "JRuby only method" unless Nokogiri.jruby?
  return wrapJavaDocument(document)
end
# File lib/nokogiri/xml/document.rb, line 235
def add_child node_or_tags
  raise "Document already has a root node" if root && root.name != 'nokogiri_text_wrapper'
  node_or_tags = coerce(node_or_tags)
  if node_or_tags.is_a?(XML::NodeSet)
    raise "Document cannot have multiple root nodes" if node_or_tags.size > 1
    super(node_or_tags.first)
  else
    super
  end
end
Recursively get all namespaces from this node and its subtree and return them as a hash.
For example, given this document:
<root xmlns:foo="bar"> <bar xmlns:hello="world" /> </root>
This method will return:
{ 'xmlns:foo' => 'bar', 'xmlns:hello' => 'world' }
WARNING: this method will clobber duplicate names in the keys. For example, given this document:
<root xmlns:foo="bar"> <bar xmlns:foo="baz" /> </root>
The hash returned will look like this: { 'xmlns:foo' => 'bar' }
Non-prefixed default namespaces (as in “xmlns=”) are not included in the hash.
Note that this method does an xpath lookup for nodes with namespaces, and as a result the order may be dependent on the implementation of the underlying XML library.
# File lib/nokogiri/xml/document.rb, line 160
def collect_namespaces
  xpath("//namespace::*").inject({}) do |hash, ns|
    hash[["xmlns",ns.prefix].compact.join(":")] = ns.href if ns.prefix != "xml"
    hash
  end
end
Create a Comment Node containing string.
# File lib/nokogiri/xml/document.rb, line 116
def create_comment string, &block
  Nokogiri::XML::Comment.new self, string.to_s, &block
end
Create an element with name, optionally setting its content and attributes.
doc.create_element "div" # <div></div>
doc.create_element "div", :class => "container" # <div class='container'></div>
doc.create_element "div", "contents" # <div>contents</div>
doc.create_element "div", "contents", :class => "container" # <div class='container'>contents</div>
doc.create_element("div") { |node| node['class'] = "container" } # <div class='container'></div>
# File lib/nokogiri/xml/document.rb, line 81
def create_element name, *args, &block
  elm = Nokogiri::XML::Element.new(name, self, &block)
  args.each do |arg|
    case arg
    when Hash
      arg.each { |k,v|
        key = k.to_s
        if key =~ NCNAME_RE
          ns_name = key.split(":", 2)[1]
          elm.add_namespace_definition ns_name, v
        else
          elm[k.to_s] = v.to_s
        end
      }
    else
      elm.content = arg
    end
  end
  if ns = elm.namespace_definitions.find { |n| n.prefix.nil? or n.prefix == '' }
    elm.namespace = ns
  end
  elm
end
Apply any decorators to node
# File lib/nokogiri/xml/document.rb, line 208
def decorate node
  return unless @decorators
  @decorators.each { |klass,list|
    next unless node.is_a?(klass)
    list.each { |moodule| node.extend(moodule) }
  }
end
Get the list of decorators given key
# File lib/nokogiri/xml/document.rb, line 168
def decorators key
  @decorators ||= Hash.new
  @decorators[key] ||= []
end
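To make the interplay of decorators and decorate easier to follow, here is a self-contained sketch of the same registry pattern in plain Ruby. MiniDocument, FakeNode and Shouting are hypothetical names invented for this illustration; they are stand-ins and not part of Nokogiri.

```ruby
# MiniDocument is a hypothetical stand-in that mirrors the two methods
# listed above; it is not Nokogiri's Document class.
class MiniDocument
  # Return (and lazily create) the list of decorator modules for key.
  def decorators(key)
    @decorators ||= {}
    @decorators[key] ||= []
  end

  # Extend node with every module registered for a class it belongs to.
  def decorate(node)
    return unless @decorators
    @decorators.each do |klass, list|
      next unless node.is_a?(klass)
      list.each { |mod| node.extend(mod) }
    end
  end
end

# Hypothetical node class for the demonstration.
class FakeNode
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

# Hypothetical decorator module.
module Shouting
  def shout
    text.upcase
  end
end

doc = MiniDocument.new
doc.decorators(FakeNode) << Shouting

# Only objects that are instances of the registered class are extended.
node = FakeNode.new("hello")
doc.decorate(node)
shouted = node.shout

other = Object.new
doc.decorate(other)
other_decorated = other.respond_to?(:shout)
```

Keying the registry by class lets one document carry different decorator lists for different kinds of nodes, and extending individual objects leaves the classes themselves untouched.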
A reference to self
# File lib/nokogiri/xml/document.rb, line 126
def document
  self
end
Create a Nokogiri::XML::DocumentFragment from tags. Returns an empty fragment if tags is nil.
# File lib/nokogiri/xml/document.rb, line 227
def fragment tags = nil
  DocumentFragment.new(self, tags, self.root)
end
The name of this document. Always returns “document”
# File lib/nokogiri/xml/document.rb, line 121
def name
  'document'
end
Get the hash of namespaces on the root Nokogiri::XML::Node
# File lib/nokogiri/xml/document.rb, line 220
def namespaces
  root ? root.namespaces : {}
end
Explore a document with shortcut methods. See Nokogiri::Slop for details.
Note that any nodes that have been instantiated before slop! is called will not be decorated with sloppy behavior. So, if you're in irb, the preferred idiom is:
irb> doc = Nokogiri::Slop my_markup
and not
irb> doc = Nokogiri::HTML my_markup
... followed by irb's implicit inspect (and therefore instantiation of every node) ...
irb> doc.slop!
... which does absolutely nothing.
# File lib/nokogiri/xml/document.rb, line 197
def slop!
  unless decorators(XML::Node).include? Nokogiri::Decorators::Slop
    decorators(XML::Node) << Nokogiri::Decorators::Slop
    decorate!
  end
  self
end
JRuby
Returns the Java org.w3c.dom.Document for this Document.
# File lib/nokogiri/xml/document.rb, line 258
def to_java
  raise "JRuby only method" unless Nokogiri.jruby?
  return toJavaDocument()
end