URI support for Ruby
Author: Akira Yamada <akira@ruby-lang.org>
Documentation: Akira Yamada <akira@ruby-lang.org>, Dmitry V. Sabanin <sdmitry@lrn.ru>
License: Copyright © 2001 akira yamada <akira@ruby-lang.org>. You can redistribute it and/or modify it under the same term as Ruby.
Revision: $Id: uri.rb 25189 2009-10-02 12:04:37Z akr $
See URI for documentation
Author: Akira Yamada <akira@ruby-lang.org>
Revision: $Id: common.rb 31799 2011-05-29 22:49:36Z yugui $
License: You can redistribute it and/or modify it under the same term as Ruby.
Decodes URL-encoded form data from the given str.
This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays. Internally it uses URI.decode_www_form_component.
The charset hack is not supported yet because the mapping from a given charset to Ruby's encoding is not yet clear. See also www.w3.org/TR/html5/syntax.html#character-encodings-0
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
  ary = URI.decode_www_form("a=1&a=2&b=3")
  p ary                   #=> [['a', '1'], ['a', '2'], ['b', '3']]
  p ary.assoc('a').last   #=> '1'
  p ary.assoc('b').last   #=> '3'
  p ary.rassoc('2').first #=> 'a'
  p Hash[ary]             #=> {"a"=>"2", "b"=>"3"}
See URI.decode_www_form_component, URI.encode_www_form
# File uri/common.rb, line 836
def self.decode_www_form(str, enc=Encoding::UTF_8)
  return [] if str.empty?
  unless /\A#{WFKV_}*=#{WFKV_}*(?:[;&]#{WFKV_}*=#{WFKV_}*)*\z/ =~ str
    raise ArgumentError, "invalid data of application/x-www-form-urlencoded (#{str})"
  end
  ary = []
  $&.scan(/([^=;&]+)=([^;&]*)/) do
    ary << [decode_www_form_component($1, enc), decode_www_form_component($2, enc)]
  end
  ary
end
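Because the result is an array of pairs, repeated keys are kept in their original order. The following is a minimal sketch of one way to collect the values per key; group_form_pairs is a hypothetical helper name used only for illustration and is not part of URI.

```ruby
require "uri"

# Hypothetical helper (not part of URI): collects the values of repeated
# keys into arrays, keeping the order in which they appeared.
def group_form_pairs(str)
  URI.decode_www_form(str).each_with_object({}) do |(key, value), grouped|
    (grouped[key] ||= []) << value
  end
end

grouped = group_form_pairs("a=1&a=2&b=3")
# grouped["a"] is ["1", "2"] and grouped["b"] is ["3"]
```

Unlike Hash[ary], which keeps only the last value of each key, this keeps every value.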
Decodes the given str of URL-encoded form data.
This decodes + to SP (space).
See URI.encode_www_form_component, URI.decode_www_form
# File uri/common.rb, line 761
def self.decode_www_form_component(str, enc=Encoding::UTF_8)
  if TBLDECWWWCOMP_.empty?
    tbl = {}
    256.times do |i|
      h, l = i>>4, i&15
      tbl['%%%X%X' % [h, l]] = i.chr
      tbl['%%%x%X' % [h, l]] = i.chr
      tbl['%%%X%x' % [h, l]] = i.chr
      tbl['%%%x%x' % [h, l]] = i.chr
    end
    tbl['+'] = ' '
    begin
      TBLDECWWWCOMP_.replace(tbl)
      TBLDECWWWCOMP_.freeze
    rescue
    end
  end
  raise ArgumentError, "invalid %-encoding (#{str})" unless /\A(?:%\h\h|[^%]+)*\z/ =~ str
  str.gsub(/\+|%\h\h/, TBLDECWWWCOMP_).force_encoding(enc)
end
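As a short illustration of the behaviour described above, the sketch below decodes a component containing both + and a %XX escape, and shows the ArgumentError raised for a malformed escape. The input strings are made up for this example.

```ruby
require "uri"

# "+" becomes a space and "%21" becomes "!".
decoded = URI.decode_www_form_component("Ruby+URI%21")

# A "%" that is not followed by two hex digits is rejected.
error_message = nil
begin
  URI.decode_www_form_component("100%")
rescue ArgumentError => e
  error_message = e.message
end
```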
Generates URL-encoded form data from the given enum.
This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.
Internally it uses URI.encode_www_form_component(str).
This does not convert the encodings of the given items, so convert them before calling this method if you want to send data in an encoding other than the original one, or in mixed encodings. (Strings encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)
This does not handle files. To send a file, use multipart/form-data.
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.encode_www_form_component, URI.decode_www_form
# File uri/common.rb, line 799
def self.encode_www_form(enum)
  str = nil
  enum.each do |k,v|
    if str
      str << '&'
    else
      str = nil.to_s
    end
    str << encode_www_form_component(k)
    str << '='
    str << encode_www_form_component(v)
  end
  str
end
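A minimal usage sketch, assuming made-up field names: any Enumerable that yields key-value pairs works, so both an array of pairs and a Hash are accepted, and the output can be fed back to URI.decode_www_form.

```ruby
require "uri"

# An array of pairs keeps its order and may repeat keys.
from_pairs = URI.encode_www_form([["q", "ruby lang"], ["lang", "en"]])

# A Hash yields |key, value| as well; non-String values go through to_s.
from_hash = URI.encode_www_form("q" => "ruby", "page" => 2)

# Decoding the generated string gives back the original pairs.
round_trip = URI.decode_www_form(from_pairs)
```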
Encodes the given str as URL-encoded form data.
This does not convert *, -, ., 0-9, A-Z, _, a-z; it converts SP (space) to + and converts all other characters to %XX.
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.decode_www_form_component, URI.encode_www_form
# File uri/common.rb, line 732
def self.encode_www_form_component(str)
  if TBLENCWWWCOMP_.empty?
    tbl = {}
    256.times do |i|
      tbl[i.chr] = '%%%02X' % i
    end
    tbl[' '] = '+'
    begin
      TBLENCWWWCOMP_.replace(tbl)
      TBLENCWWWCOMP_.freeze
    rescue
    end
  end
  str = str.to_s
  if HTML5ASCIIINCOMPAT.include?(str.encoding)
    str = str.encode(Encoding::UTF_8)
  else
    str = str.dup
  end
  str.force_encoding(Encoding::ASCII_8BIT)
  str.gsub!(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_)
  str.force_encoding(Encoding::US_ASCII)
end
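The three rules above (characters left alone, space to +, everything else to %XX) can be checked with a few made-up inputs. This is only an illustrative sketch; the strings are not from the library.

```ruby
require "uri"

# Characters in the safe set pass through; the space becomes "+".
safe_chars = URI.encode_www_form_component("a b*c-d.e_f")

# "+" and "=" are outside the safe set, so they are percent-encoded.
reserved = URI.encode_www_form_component("1+1=2")

# Non-ASCII characters are encoded byte by byte (UTF-8 here).
non_ascii = URI.encode_www_form_component("caf\u00E9")
```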
URI::extract(str[, schemes][,&blk])
str: String to extract URIs from.
schemes: Limit URI matching to specific schemes.
Extracts URIs from a string. If a block is given, iterates through all matched URIs. Returns nil if a block is given, otherwise an array of the matches.
  require "uri"
  URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
  # => ["http://foo.example.org/bla", "mailto:test@example.com"]
# File uri/common.rb, line 680
def self.extract(str, schemes = nil, &block)
  DEFAULT_PARSER.extract(str, schemes, &block)
end
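The schemes argument and the block form are not covered by the example above. The sketch below, reusing the same sample text, shows both; note that later Ruby releases mark URI.extract as obsolete, so treat this as an illustration of the API documented here rather than a recommendation.

```ruby
require "uri"

text = "text here http://foo.example.org/bla and here " \
       "mailto:test@example.com and here also."

# Restrict matching to the http scheme.
http_only = URI.extract(text, ["http"])

# With a block, each match is yielded and the method returns nil.
seen = []
block_result = URI.extract(text) { |uri| seen << uri }
```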
URI::join(str[, str, ...])
str: String(s) to work with
Joins URIs.
  require 'uri'
  p URI.join("http://localhost/","main.rbx")
  # => #<URI::HTTP:0x2022ac02 URL:http://localhost/main.rbx>
# File uri/common.rb, line 652
def self.join(*str)
  DEFAULT_PARSER.join(*str)
end
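Joining resolves each later string as a reference relative to the result so far. The sketch below uses made-up example.com URLs to show how a trailing slash on the base, a ".." segment, and an absolute path each affect the result.

```ruby
require "uri"

# Base ends with "/", so the reference is resolved inside that directory.
inside_dir = URI.join("http://example.com/a/b/", "c").to_s

# Base has no trailing slash, so its last segment is replaced.
sibling = URI.join("http://example.com/a/b", "c").to_s

# ".." steps up one directory level.
parent = URI.join("http://example.com/a/b/", "../c").to_s

# A reference starting with "/" replaces the whole path.
absolute_path = URI.join("http://example.com/a/b/", "/c").to_s
```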
URI::parse(uri_str)
uri_str: String with URI.
Creates an instance of one of URI's subclasses from the string.
Raises URI::InvalidURIError if the given URI is not valid.
  require 'uri'
  uri = URI.parse("http://www.ruby-lang.org/")
  p uri        # => #<URI::HTTP:0x202281be URL:http://www.ruby-lang.org/>
  p uri.scheme # => "http"
  p uri.host   # => "www.ruby-lang.org"
# File uri/common.rb, line 627
def self.parse(uri)
  DEFAULT_PARSER.parse(uri)
end
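A slightly fuller sketch of parsing, assuming a made-up URL with a port, query, and fragment, plus a hypothetical safe_parse helper (not part of URI) that turns URI::InvalidURIError into nil.

```ruby
require "uri"

uri = URI.parse("http://www.ruby-lang.org:8080/en/?q=1#top")
components = [uri.scheme, uri.host, uri.port, uri.path, uri.query, uri.fragment]

# Hypothetical helper: returns nil instead of raising for an invalid URI.
def safe_parse(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

valid = safe_parse("http://www.ruby-lang.org/")
invalid = safe_parse("http://exa mple.com") # the space makes it invalid
```

Note that port is returned as an Integer, and that an HTTP URI without an explicit port reports the scheme's default.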
URI::regexp([match_schemes])
match_schemes: Array of schemes. If given, the resulting regexp matches URIs whose scheme is one of the match_schemes.
Returns a Regexp object that matches URI-like strings. The Regexp object returned by this method includes an arbitrary number of capture groups (parentheses). Never rely on their number.
  require 'uri'

  # extract first URI from html_string
  html_string.slice(URI.regexp)

  # remove ftp URIs
  html_string.gsub(URI.regexp(['ftp']), '')

  # You should not rely on the number of parentheses
  html_string.scan(URI.regexp) do |*matches|
    p $&
  end
# File uri/common.rb, line 715
def self.regexp(schemes = nil)
  DEFAULT_PARSER.make_regexp(schemes)
end
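A self-contained variant of the scan pattern, using a made-up sample string. It reads the whole match from $& instead of the capture groups, which is the approach the warning above suggests. Later Ruby releases mark URI.regexp as obsolete, so this is only a sketch of the API as documented here.

```ruby
require "uri"

text = "see http://a.example/x and ftp://b.example/y for details"

# Collect whole matches via $&; the yielded capture groups are ignored.
http_uris = []
text.scan(URI.regexp(["http"])) { http_uris << $& }
```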
# File uri/common.rb, line 540
def self.scheme_list
  @@schemes
end
URI::split(uri)
uri: String with URI.
Splits the string into the following parts and returns an array with the result:
* Scheme
* Userinfo
* Host
* Port
* Registry
* Path
* Opaque
* Query
* Fragment
  require 'uri'
  p URI.split("http://www.ruby-lang.org/")
  # => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
# File uri/common.rb, line 592
def self.split(uri)
  DEFAULT_PARSER.split(uri)
end
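Since the result is positional, it can be convenient to pair each element with its name from the list above. The sketch below does that with a made-up URL; PART_NAMES is a hypothetical constant introduced for this example, not something URI defines.

```ruby
require "uri"

# Hypothetical constant: names for the nine positions, in the order
# listed above.
PART_NAMES = %i[scheme userinfo host port registry path opaque query fragment]

parts = URI.split("http://www.ruby-lang.org/en/?q=1#top")
named = PART_NAMES.zip(parts).to_h
# named[:host] is "www.ruby-lang.org"; parts absent from the URI are nil
```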