Sunday, April 03, 2011

Sharing a #Tamil #Wikipedia URL

When someone sends you the URL for the Tamil Wikipedia Village pump, it will likely look like one of these:
The first option comes from copying and pasting the URL directly. The second option is achieved by appending the name of the article after . The third option is courtesy of some clever JavaScript and a domain specially acquired for the purpose.

In a modern browser, the URL is displayed like the second option; sadly, it reverts to gobbledygook when it is copied and pasted. There must be a reason that was once valid; the question is whether that reason still holds.

1 comment:

brion said...

I believe it's for compatibility. The browser doesn't know whether the eventual target of the cut-n-paste can deal with Unicode IRIs (the much more attractive version with literal characters) or will need traditional encoded ASCII URLs, so following the principle of least surprise the fully-compatible ASCII URL is used for cut-n-paste.
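
To make the two forms concrete, here is a minimal Python sketch of how a Unicode title turns into the encoded ASCII form. The base URL is a placeholder rather than an address from the post, the helper name `to_ascii_url` is hypothetical, and the sample title is simply the word "Tamil" written in Tamil script.

```python
from urllib.parse import quote

# Placeholder base, assumed for illustration only.
BASE = "https://example.org/wiki/"

def to_ascii_url(title: str) -> str:
    """Percent-encode the UTF-8 bytes of a title into an ASCII-only URL."""
    # safe="" means every non-unreserved character is escaped.
    return BASE + quote(title, safe="")

# The word "Tamil" in Tamil script, written with escapes to keep this file ASCII.
title = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"
ascii_url = to_ascii_url(title)
print(ascii_url)
```

Each of the five Tamil code points takes three bytes in UTF-8, and each byte becomes a three-character `%XX` escape, which is why a short title balloons into a long ASCII string.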

Smart software (including wikis, blog posting systems, etc.) should be able to detect the encoded UTF-8 and, as the browsers themselves do, transform it back into the pretty Unicode IRI form when showing it again, but it's not done as widely as it could be.
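
The detection step could look roughly like the following Python sketch. It assumes that "detecting encoded UTF-8" means "the percent-escapes decode cleanly as UTF-8"; the function name `to_pretty_iri` and the example.org address are hypothetical, not taken from any wiki or blog software.

```python
from urllib.parse import unquote

def to_pretty_iri(url: str) -> str:
    """Return the readable Unicode form of a percent-encoded URL.

    If the escapes are not valid UTF-8, the URL is returned unchanged.
    """
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being replaced.
        return unquote(url, errors="strict")
    except UnicodeDecodeError:
        return url

encoded = "https://example.org/wiki/%E0%AE%A4%E0%AE%AE%E0%AE%BF%E0%AE%B4%E0%AF%8D"
print(to_pretty_iri(encoded))
```

This is for display only: it also decodes escapes such as `%2F` or `%20`, so the pretty form should not be treated as a working link. Software following this approach would keep the encoded URL as the link target and show the decoded text as the label.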