Googling whether commas are allowed by the spec turns up no agreement: some say they are reserved, some say they are allowed. My vote would be not to use them.
2.2. Reserved Characters
Many URI include components consisting of or delimited by, certain special characters. These characters are called "reserved", since their usage within the URI component is limited to their reserved purpose. If the data for a URI component would conflict with the reserved purpose, then the conflicting data must be escaped before forming the URI.
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
The "reserved" syntax class above refers to those characters that are allowed within a URI, but which may not be allowed within a particular component of the generic URI syntax
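To make the escaping rule in that quote concrete, here is a minimal Python sketch. The function name `escape_reserved` is made up for illustration; it only handles the reserved set listed above, not every character that needs escaping in a real URI.

```python
# Reserved set exactly as listed in the quoted grammar line.
RESERVED = set(";/?:@&=+$,")


def escape_reserved(data: str) -> str:
    """Percent-encode every reserved character found in component data."""
    return "".join(
        f"%{ord(ch):02X}" if ch in RESERVED else ch
        for ch in data
    )


print(escape_reserved("obligation,-opportunity"))  # obligation%2C-opportunity
```

So a comma that is data rather than a delimiter becomes %2C.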
Short answer: you can use commas, but not as part of the "address", only as a delimiter (between parameters, for example).
Long answer: read on!
When searching, go straight to the source...
RFC 1738 (from 1994), which defines exactly what a URL is (the link goes right to the page), allowed commas...
No corresponding graphic US-ASCII:
URLs are written only with the graphic printable characters of the
US-ASCII coded character set. The octets 80-FF hexadecimal are not
used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
control characters; these must be encoded.
Unsafe:
Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
Therefore commas were fine back then: they are not in the unsafe list, and the RFC 1738 grammar puts "," in its "extra" class, next to the "safe" punctuation.
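A rough way to check a page name against the RFC 1738 passage quoted above, sketched in Python. The helper name `unsafe_chars` is hypothetical, and the check is meant for raw text before any encoding (otherwise the "%" of an existing escape would be flagged too).

```python
# Characters the quoted RFC 1738 passage calls unsafe.
UNSAFE = set(' <>"#%{}|\\^~[]`')


def unsafe_chars(text: str) -> list[str]:
    """Return the unsafe or non-graphic characters in text, in order, without repeats."""
    found = []
    for ch in text:
        # Octets 00-1F, 7F and 80-FF have no graphic US-ASCII representation.
        is_control_or_non_ascii = ord(ch) <= 0x1F or ord(ch) >= 0x7F
        if (ch in UNSAFE or is_control_or_non_ascii) and ch not in found:
            found.append(ch)
    return found


print(unsafe_chars("obligation,-opportunity-or-self-preservation"))  # []
print(unsafe_chars("a b#c"))  # [' ', '#']
```

The comma passes this check, which matches the 1994 reading.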
BUT
An update to the standard, RFC 3986 (Jan 2005), states:
URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component. If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.
Therefore commas are allowed as *delimiters* within URLs (separating parameters after a #, for instance), but a comma that is part of the URL body itself should be percent-encoded as %2C. Interesting reading, really, and I would like to point out that the standard Umbraco conversion of page name into URL FAILS this test miserably :)
And that is why you can (and cannot) use commas in your URLs.
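The delimiter-versus-data distinction can be sketched with Python's standard `urllib.parse` module. The sample values are invented for illustration; only `quote` and `unquote` are real library calls.

```python
from urllib.parse import quote, unquote

# Comma as data inside a path segment: encode it (safe="" exempts no reserved character).
segment = quote("obligation, opportunity or self-preservation", safe="")
print(segment)  # obligation%2C%20opportunity%20or%20self-preservation

# Comma as a delimiter between values: encode each value, then join with a literal comma.
values = ["red", "dark,blue"]
joined = ",".join(quote(v, safe="") for v in values)
print(joined)  # red,dark%2Cblue

# Splitting on the literal comma and decoding recovers the original values.
decoded = [unquote(part) for part in joined.split(",")]
print(decoded)  # ['red', 'dark,blue']
```

Because the data comma travels as %2C, the literal comma stays unambiguous as a separator.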
umbracoUseDirectoryUrls allows commas in the URL...
For instance, I have this URL:
http://www.domain.com/jp/briefings/briefings/obligation,-opportunity-or-self-preservation
Dekker
@Dekker
Don't ask for thumbs-up; if your post is good, it will get them :-)
On topic: you can add the comma to the urlReplacing section in the config\umbracoSettings.config file...
In the end you'll need to add quite a few characters to that section of umbracoSettings.config if you don't want nasty URL-encoded links (all the French accents such as é, à, ç, è, ...)
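To show the idea of such a replacement table outside of Umbraco, here is a small Python sketch. It is not Umbraco's code or config syntax: the table contents, the lowercasing step, and the name `make_url_name` are all assumptions made for illustration.

```python
import re

# Hypothetical replacement table; these pairs are illustrative, not Umbraco's defaults.
REPLACEMENTS = {" ": "-", ",": "", "é": "e", "è": "e", "à": "a", "ç": "c"}


def make_url_name(page_name: str) -> str:
    """Apply the replacement table character by character, then collapse repeated dashes."""
    replaced = "".join(REPLACEMENTS.get(ch, ch) for ch in page_name.lower())
    return re.sub(r"-{2,}", "-", replaced)


print(make_url_name("Obligation, opportunity or self-preservation"))
# obligation-opportunity-or-self-preservation
print(make_url_name("Café à la carte"))
# cafe-a-la-carte
```

Every character you don't list passes through unchanged, which is why the table grows quickly once accented names show up.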
Kind regards,
Rik
Yes, that's too bad, isn't it?
Umbraco should give you the option to define your own 'GenerateNiceUrl' method, something like this: