Characters in a URL other than the English alphanumeric characters and
- _ . ~ should be encoded as % plus a two-digit hexadecimal
representation, and any single-byte character can be so encoded.
(Multi-byte characters are encoded byte-by-byte.) The standard refers
to this as percent-encoding. In addition,
! $ & ' ( ) * + , ; = : / ? @ # [ ] are reserved characters, and should
be encoded unless used in their reserved sense, which is scheme
specific. The default in URLencode is to leave them alone, which is
appropriate for file:// URLs, but probably not for http:// ones.
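
The rules above can be illustrated with a short sketch. It assumes the
`reserved` argument of `URLencode` (default `FALSE`) is what controls the
treatment of reserved characters; the input strings and variable names are
made up for illustration.

```r
# Unreserved characters (alphanumerics and - _ . ~) pass through;
# anything else, such as a space, becomes % plus two hex digits.
enc_space <- URLencode("a b-c_d.e~f")
enc_space
# "a%20b-c_d.e~f"

# Default (reserved = FALSE): reserved characters such as : / ? =
# are left alone, so the URL structure is preserved.
enc_default <- URLencode("http://example.org/a b?q=x y")
enc_default
# "http://example.org/a%20b?q=x%20y"

# reserved = TRUE: reserved characters are percent-encoded too,
# which suits a string destined for a single URL component.
enc_reserved <- URLencode("a/b?c=d", reserved = TRUE)
enc_reserved
# "a%2Fb%3Fc%3Dd"

# A multi-byte character is encoded byte-by-byte, so a two-byte
# UTF-8 character yields two %xx groups.
enc_multibyte <- URLencode("\u00e9")
enc_multibyte
```

Encoding with `reserved = TRUE` is usually what is wanted for a query value
or path segment on its own, while the default is the safer choice for a
complete URL whose separators must survive.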

An apparently already-encoded URL is one containing %xx for two
hexadecimal digits.
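
A minimal sketch of what this means in practice follows. The regular
expression is only an illustration of the "%xx" test described above, not
necessarily the exact check used internally, and the `repeated` argument of
`URLencode` is assumed to be what overrides the check. The example strings
are hypothetical.

```r
already <- "a%20b c"   # contains %20, so it looks already encoded
plain   <- "100% sure" # the % is not followed by two hex digits

# Illustrative version of the "%xx" test (an assumption, see above).
looks_encoded <- grepl("%[[:xdigit:]]{2}", c(already, plain))
looks_encoded
# TRUE FALSE

# By default an apparently already-encoded string is returned unchanged,
# even though it still contains a raw space.
enc_once <- URLencode(already)
enc_once
# "a%20b c"

# repeated = TRUE forces encoding anyway, so the existing % is escaped.
enc_again <- URLencode(already, repeated = TRUE)
enc_again
# "a%2520b%20c"

# A string that does not look encoded is processed as usual.
enc_plain <- URLencode(plain)
enc_plain
# "100%25%20sure"
```

Because a single `%xx` sequence anywhere in the input suppresses encoding of
the whole string, partially encoded input is worth decoding with `URLdecode`
first and then encoding once.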