[VFS] URI normalization

Philippe Poulard
Should VFS normalize URIs before parsing a file name?

--- URI normalization ---

URI references require encoding and escaping of certain characters. The
disallowed characters include all non-ASCII characters, plus the
excluded characters listed in Section 2.4 of [RFC 2396], except for the
number sign (#) and percent sign (%) characters and the square bracket
characters re-allowed in [RFC 2732].
The set of excluded US-ASCII characters is:
  [00-20]    [22] [3C] [3E] [5C] [5E] [60] [7B-7D] [7F]
   C0  SPACE   "    <    >    \    ^    `   { | }   DEL

Escaping disallowed characters is performed as follows:
1. Each disallowed character is converted to UTF-8 [RFC 2279] as one or
more bytes.
2. Any octets corresponding to a disallowed character are escaped with
the URI escaping mechanism (that is, converted to %HH, where HH is the
hexadecimal notation of the octet value). If escaping must be performed,
uppercase hexadecimal characters should be used.
3. The original character is replaced by the resulting character sequence.
Note that this normalization process is idempotent: repeated
normalization does not change a normalized URI reference.
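
The three steps above can be sketched in Java roughly as follows. This is only a minimal illustration of the quoted rules, not VFS code: the class and method names (UriNormalizer, isDisallowed, normalize) are made up for the example, and the sample file name is just an illustrative input. Because "%" is not in the disallowed set, existing escapes are left alone, which is what makes the process idempotent.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of VFS: applies the escaping steps quoted above.
final class UriNormalizer {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Disallowed: all non-ASCII, [00-20], [7F], and " < > \ ^ ` { | }
    static boolean isDisallowed(int cp) {
        if (cp <= 0x20 || cp >= 0x7F) {
            return true;
        }
        switch (cp) {
            case '"': case '<': case '>': case '\\': case '^':
            case '`': case '{': case '|': case '}':
                return true;
            default:
                return false; // includes '#', '%', '[' and ']'
        }
    }

    static String normalize(String ref) {
        StringBuilder out = new StringBuilder(ref.length());
        for (int i = 0; i < ref.length(); ) {
            int cp = ref.codePointAt(i);
            i += Character.charCount(cp);
            if (!isDisallowed(cp)) {
                out.appendCodePoint(cp);
                continue;
            }
            // Step 1: convert the character to UTF-8 bytes.
            byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            // Steps 2 and 3: replace it with %HH per byte, uppercase hex.
            for (byte b : utf8) {
                out.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String once = normalize("\u00E0 la p\u00EAche.xml");
        System.out.println(once);                          // %C3%A0%20la%20p%C3%AAche.xml
        System.out.println(once.equals(normalize(once)));  // true: already normalized
    }
}
```

Note that an unpaired surrogate in the input would be turned into "?" by the UTF-8 conversion and then kept as-is; the sketch does not try to handle that case.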

--
Cordialement,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: [VFS] URI normalization

Mario Ivankovits
Hi!
> Should VFS normalize URIs before parsing a file name?
Do you have a problem with the current behaviour?
VFS does not encode special characters other than "%", sometimes "?"
(URL-based file systems) and sometimes "!" (layered/zip file systems).

Before any parsing, the filename is DEcoded. VFS needs a consistent view
of the filename, whether it was passed in encoded or decoded.
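
As a rough illustration of "minimum encoding" plus decode-before-parse, here is a hypothetical Java sketch. It is not the actual VFS implementation: the class MinimalEncoding, its methods, and the reserved set "%?" are assumptions chosen only to mirror the description above. Only the reserved ASCII characters are escaped, so a name like "à la pêche.xml" stays readable.

```java
// Hypothetical sketch, not VFS code: escape only a small reserved ASCII set.
final class MinimalEncoding {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Escapes only the ASCII characters listed in 'reserved'; everything else is kept.
    static String encode(String name, String reserved) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c < 0x80 && reserved.indexOf(c) >= 0) {
                out.append('%').append(HEX[c >> 4]).append(HEX[c & 0xF]);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Decodes %HH escapes for ASCII values only; anything above 0x7F would need a charset.
    static String decode(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '%') {
                out.append(c);
                continue;
            }
            if (i + 2 >= name.length()) {
                throw new IllegalArgumentException("truncated escape at index " + i);
            }
            int hi = Character.digit(name.charAt(i + 1), 16);
            int lo = Character.digit(name.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("invalid escape at index " + i);
            }
            int value = (hi << 4) | lo;
            if (value > 0x7F) {
                throw new IllegalArgumentException("non-ASCII escape needs a charset, index " + i);
            }
            out.append((char) value);
            i += 2; // skip the two hex digits just consumed
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String encoded = encode("100%?.txt", "%?");
        System.out.println(encoded);          // 100%25%3F.txt
        System.out.println(decode(encoded));  // 100%?.txt
    }
}
```

The decoder deliberately rejects escapes above 0x7F: with minimal encoding they never need to be produced, so no charset has to be known.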

If I encode/normalize it, every visible representation of the filename
looks a little bit strange.

Even if the VFS filename looks like a URI, I think we can still treat
it simply as a "VFS filename": human readable, with minimal encoding.
The filesystem implementation is responsible for encoding it as needed
(e.g. taking the session charset into account).

Of course, you could reverse everything said above ... but what would be the advantage?

The BIG disadvantage is having to deal with things like charsets in VFS core.
If the filename is encoded, we have to know which charset was used.
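
A small Java illustration of the charset problem: the same escaped bytes decode to different names depending on the charset assumed. The class name CharsetAmbiguity and the sample string are made up for the example. java.net.URLDecoder is used only as a convenient standard-library decoder; it also turns "+" into a space, so it is not a general URI decoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: one escaped name, two possible readings.
final class CharsetAmbiguity {

    // Decodes %HH escapes, interpreting the resulting bytes in the given charset.
    static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String escaped = "p%C3%AAche.xml"; // bytes C3 AA: U+00EA encoded as UTF-8

        String asUtf8 = decode(escaped, "UTF-8");        // "p" + U+00EA + "che.xml"
        String asLatin1 = decode(escaped, "ISO-8859-1"); // "p" + U+00C3 U+00AA + "che.xml"

        System.out.println(asUtf8.length());         // 9
        System.out.println(asLatin1.length());       // 10
        System.out.println(asUtf8.equals(asLatin1)); // false
    }
}
```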


Ciao,
Mario


Re: [VFS] URI normalization

Philippe Poulard
Mario Ivankovits wrote:
> Hi!
>
>> Should VFS normalize URIs before parsing a file name?
>
> Do you have a problem with the current behaviour?

not yet ;)

I'm dealing with XML resources, and some XML standards require the use
of normalized URIs.

As a French person, I'll try VFS with a file named "à la pêche.xml" and
tell you if I run into any problems.

--
Cordialement,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------

Re: [VFS] URI normalization

Mario Ivankovits
Hi Philippe!
>> Do you have a problem with the current behaviour?
> not yet ;)
>
> As a French person, I'll try VFS with a file named "à la pêche.xml"
> and tell you if I run into any problems.
I am looking forward to hearing from you. :-)
A warning in advance: charsets, encodings, printers and file transfers
are a pain in our business ;-) ...

Ciao,
Mario

