A wonderful project aims to create a #dictionary of technology and social media terminology. It is a necessary project because it will help people use the Internet in Arabic.
One thing that is quite telling is that it aims to add this terminology to classical Arabic and that it refers to "the many different Arabic dialects".
When Muslims learn the Quran, they learn a language as it was spoken in the days of the prophet Muhammed. The Arabic language itself has evolved in the meantime in many distinctly different ways. As such it can be compared to Latin, which brought us Italian, French and Spanish.
The question is to what extent a dictionary of modern words will be adopted and become part of what is, after all, a classical language. It is nevertheless a wonderful enterprise; it may even standardise the targeted terminology across all the Arabic languages.
Thanks,
GerardM
Monday, April 30, 2012
Saturday, April 28, 2012
Is #agriculture must-have #Wikipedia Zero content for #Africa?
FarmAfripedia is a project about the best farming practices in Africa. Great information that needs to find its way to African farmers. The project is using the same collaboration model as Wikipedia.
The project got the attention of the FAO in one of its newsletters. One of the issues with content for Africa is getting it to the people who have a use for it. They do not all speak English, and increasingly they have a mobile phone.
When there is a demand for agricultural information, and when people spend time creating projects like FarmAfripedia, it would be great to have much of this information within the Wikimedia Foundation, particularly in Wikipedia. An important consideration is that many Africans may then receive this information free of charge thanks to Wikipedia Zero.
Providing agricultural knowledge in an encyclopaedic setting in one of the native languages of Africa is an ambition that can easily be realised. People with such an ambition will find that it is not that hard to get a Wikipedia in a new language approved. As there is not that much information available in native languages, such an initiative has great potential.
Thanks,
GerardM
#Wikidata may support what #Wikipedia does not know
The first actual Wikidata project is starting to become functional. It is interwiki links on steroids: just one database for all the interwiki links, with an API, an application programming interface, to acquire the relevant data. At this time it works alongside the old interwiki system and it gets most of its updates that way.
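To give an impression of what that API amounts to in practice, here is a minimal sketch that fetches the language links for a single article through the standard MediaWiki web API; the article title is only an example and the third-party Python requests library is assumed.

```python
# Sketch: fetch the interwiki (language) links for one article through the
# MediaWiki web API, using the existing "langlinks" query module.
import requests

API = "https://en.wikipedia.org/w/api.php"

params = {
    "action": "query",
    "prop": "langlinks",
    "titles": "Mali",      # example article
    "lllimit": "max",      # return as many language links as allowed
    "format": "json",
}

response = requests.get(API, params=params)
response.raise_for_status()

for page in response.json()["query"]["pages"].values():
    for link in page.get("langlinks", []):
        # "lang" is the language code, "*" holds the title in that language
        print(link["lang"], link["*"])
```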
Work is under way on an editing interface. When that is fully functional and Wikidata can cope with the amount of data Wikipedia will ask it to serve, it will replace the existing interwikis, and good riddance.
Wikipedia is not the only project that makes use of interwiki links. Wiktionary is another. There are fewer Wiktionaries than Wikipedias, but languages are treated differently there; the English Wiktionary alone supports entries in some 450 languages.
Now consider what happens when the links from Wiktionary and Wikipedia are joined and the translations for concepts in Wiktionary become available in Wikidata ...
Over 150 languages would be added. That is exciting enough. More useful still will be all the translations into languages that we already support. When a search term is entered, it can be found as it is known to Wikidata and shown in red for easy editing. As the concept found is associated with existing Wikipedia articles, an article can be presented in another language.
How cool is that?
Thanks,
GerardM
Friday, April 27, 2012
Duh; a three-letter challenge to #Google #Android
Many a #Wikipedia is written in a language that is known only by its three-letter ISO 639-3 code. None of these languages has a two-letter equivalent like English (en) or Dutch (nl). These languages have their own localisation, and this includes the localisation of the Wikipedia mobile application.
Many of us want to use this localisation on our Android phone. This is not that straightforward; you have to install an app that allows you to use this user interface instead of the Android user interface itself.
There is a problem. Android is not ready to support languages that are only known by their three-letter code. It is probably an oversight, possibly because all the "big" languages have two-letter equivalents. However, there is no language bigger than the language you grew up with.
NB: duh is the code for Dungra Bhil, a language from India.
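A tiny illustration of the point: the "big" languages have both a two-letter ISO 639-1 code and a three-letter ISO 639-3 code, while a language like Dungra Bhil only has the three-letter one. The mapping below is a hand-picked sample, not a complete table.

```python
# Sample of ISO 639-1 (two-letter) and ISO 639-3 (three-letter) codes;
# None means there simply is no two-letter code for that language.
LANGUAGES = {
    "English":     ("en", "eng"),
    "Dutch":       ("nl", "nld"),
    "Dungra Bhil": (None, "duh"),
}

for name, (iso1, iso3) in LANGUAGES.items():
    if iso1 is None:
        print(f"{name}: only identifiable by its three-letter code '{iso3}'")
    else:
        print(f"{name}: '{iso1}' or '{iso3}'")
```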
Thanks,
GerardM
Thursday, April 26, 2012
#Wikisource is not a data repository
The cost of access to proprietary data sources has only gone up. The budgets of libraries have not kept pace, and relevant sources that used to be available to students for study and research are often no longer available. It is no longer an issue only for libraries in "other" countries.
Many universities, and even countries like the Netherlands, mandate that scientific publications be made available as Open Access. Typically, publications and data become available under a free license for the whole world to use. One key side effect is that real science is helped precisely because the data and publications are freely available.
Recently, the library of Harvard University made millions of its library records available under a CC-0 license. These records contain bibliographic information about books, videos, audio recordings, images, manuscripts, maps, and more, and are in the MARC21 format.
Data like this is useful when you can query it, and it remains useful as long as it is maintained. This data is maintained by Harvard University, and it is maintained because it is key to the functioning of its library. Harvard University is unlikely to be the only library in need of such data, and once many organisations work together maintaining a universal database about books, videos, audio recordings, images, manuscripts, maps, and more, the data becomes more complete, authoritative and useful.
Having such data in Wikisource, as has been suggested, is not a good idea for several reasons. Wikisource is used for storing text, not data. A rich resource like this needs continuous and reliable maintenance to be useful, and all of this is available from Harvard. The catalog records are available for bulk download from Harvard and for programmatic access by software applications via APIs at the Digital Public Library of America (DPLA).
When data like this is useful to the Wikimedia Foundation's projects, the first order of business would be to study those APIs, perhaps implement clients for them, and only consider storing such data at the Wikimedia Foundation for editing when there is a clear benefit. One benefit could be integrating data from other sources, with the subsequent need for de-duplication. Then again, it is unlikely that librarians are waiting for the Wikimedia movement to get into this act and, realistically, the WMF will need an operational Wikidata before it can unleash its communities on what is just one of the many, many important data resources that are available under a free license.
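For anyone who prefers to explore the bulk download before thinking about APIs, something along these lines could be a starting point; it assumes the third-party pymarc library, and the file name is a placeholder rather than the actual name of the Harvard dump.

```python
# Sketch: read bibliographic records from a MARC21 bulk download.
# Requires "pip install pymarc"; the file name below is made up.
from pymarc import MARCReader

with open("harvard_records.mrc", "rb") as handle:
    for record in MARCReader(handle):
        if record is None:          # skip records pymarc could not parse
            continue
        for field in record.get_fields("245"):   # MARC field 245: title statement
            print(field.value())
```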
Thanks,
GerardM
Wednesday, April 25, 2012
Supporting #Asturian in the #CLDR II
It is that time again when people can support their language and enter data about it in the CLDR. Last year, collecting the necessary data for Asturian did not happen within the set amount of time.
This year another attempt will be made to find and enter the data for Asturian. That is good news. I hope to learn about more languages and locales that will be entered.
Thanks,
GerardM
Tuesday, April 24, 2012
#MediaWiki global preferences
At the #Wikimedia Foundation, there are several teams of developers. They all do their best for the Wikimedia projects, and sometimes they hit a blocker because the functionality they need is out of scope for them; out of scope because it is something that is in scope for another team.
For the WMF Localisation Team, having global preferences is a big deal. However, for this team it is out of scope. Consider for instance this scenario:
The user sets the language of his user interface, and when he goes to another wiki, he will know what the software expects of him because the language of the user interface stays the same.
Another nice scenario:
The user indicates what languages he knows in his Babel information, and he will by default see interwiki links in the languages he knows.
Scenarios like these make sense on any multilingual project like Wikipedia or Wiktionary. When we start to make use of what we know about a user, we can be more helpful. The interfaces we build can be less cluttered. More importantly, we treat people like people and ask things only once.
Thanks,
GerardM
Monday, April 23, 2012
Fallback in #MediaWiki #localisation
At translatewiki.net, people localise in many languages. The objective of all this work is that people understand what the software wants from them.
Often we do not have a localisation for a particular message. For such a situation, we provide the message from a "fallback" language. The selection of a fallback language is very simple; it should be a language that people are likely to know. When one or several fallback languages do not provide the message, ultimately it is English that we serve to our readers and editors.
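A minimal sketch, not the actual MediaWiki implementation, of what resolving a message through such a fallback chain amounts to; the chain and the translated strings are purely illustrative.

```python
# Illustrative message store: Ukrainian has no "preview" message yet,
# so the fallback chain (uk -> ru -> en) supplies the Russian one.
MESSAGES = {
    "en": {"save": "Save page", "preview": "Show preview"},
    "ru": {"save": "Сохранить страницу", "preview": "Предварительный просмотр"},
    "uk": {"save": "Зберегти сторінку"},
}

FALLBACKS = {"uk": ["ru"], "ru": []}

def resolve(lang, key):
    # Try the requested language, then its fallbacks, then English.
    for candidate in [lang] + FALLBACKS.get(lang, []) + ["en"]:
        if key in MESSAGES.get(candidate, {}):
            return MESSAGES[candidate][key]
    return f"<{key}>"   # last resort: show the message key itself

print(resolve("uk", "preview"))   # prints the Russian fallback message
```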
Sometimes using a fallback language has its issues:
Having "fall back" languages is a good thing, we cannot do without them as long as we do not have full localisations for all the languages we support.
Thanks,
Gerard
Often we do not have a localisation for a particular message. For such a situation, we provide a message from a "fall back" language. The selection of a fall back language is very simple; it should be a language that people are likely to know. When a fall back language or several fall back languages do not provide a service, ultimately it will be English what we serve to our readers and editors.
Sometimes using a fall back language has its issues:
- The difference between Russian and Ukrainian messages is sometimes mainly in the spelling. Language purists hate this. In such instances, our suggestion is to fix it by completing the localisation; that way there is no need for a fallback language.
- For Spanish there is also a more formal Spanish. A localisation is only needed where there is a difference. The need to continuously identify the messages that are the same is what makes these localisations difficult.
Having fallback languages is a good thing; we cannot do without them as long as we do not have full localisations for all the languages we support.
Thanks,
Gerard
Interesting perk for working with #Wikipedia
By Thelmadatter
When people ask me why I devote so much of my free time to Wikipedia, I tell them that while I don't get paid for what I do, I get some amazing perks! I experienced one of these perks during my recent trip to the Costa Chica region on Mexico’s Pacific coast.
Mexico’s first Wikipedia GLAM cooperation is with the Museo de Arte Popular (MAP) in Mexico City, which started in May 2011. MAP has invited me and other Wikipedians to cover various museum events for Wikipedia. One of these was the sixth anniversary celebration of the institution in March of 2012, where I had the chance to meet a number of interesting artisans, including Juana Santa Ana. Juana does public relations for a group of Amuzgo indigenous weavers called Liaa’ Ljaa,’ which is in Xochistlahuaca, Guerrero state, near the Pacific coast. Xochistlahuaca is the largest Amuzgo community in Mexico where most people still wear traditional dress and speak the Amuzgo language. The most notable garment is the women’s huipil, a long tunic worn over a sleeveless dress. You can see both hand woven huipils with complicated designs as well as those made from commercial fabric.
Juana invited my husband (AlejandroLinaresGarcia) and me to visit her town and see the work that they do. Juana could not accompany us, but she gave us the name of her brother Ireneo, who runs the Xochistlahuaca Community Museum. We spent almost two days in the town, getting not only a personal tour of the town and museum and a chance to take photos of some of the town’s weavers, but also meals prepared by Juana’s and Ireneo’s mother (who speaks only Amuzgo). While the top perk was the opportunity to take the photos and show everyone what we can do in Wikipedia, I have to admit the food was a really, really close second!
Finished articles for Wikipedia from that trip so far include the Costa Chica of Guerrero (the region where most Amuzgo live) and Amuzgo textiles. Other articles that will follow include one about the Amuzgo people and a decent expansion of Xochistlahuaca. The Amuzgo people I met were amazing. Neither Juana nor Ireneo was familiar with Wikipedia, as most of the older generations from Xochistlahuaca are not familiar with the Web. However, they are open to new ideas and to ways to promote their amazing textiles to the world.
#MediaWiki #Localisation update process now runs with #GIT
The #Wikimedia Foundation embarked on the migration from SVN to GIT, and it proved quite an adventure. It was unclear how changes in the MediaWiki software and its extensions would end up in translatewiki.net. It was unclear how changes in the localisation would end up in GIT and, from there, on the production wikis.
We are now happy to report that the pipeline, from new and changed messages in the MediaWiki software and its extensions to new and changed localisations in all the WMF projects, is working again. Messages that have been localised are now automatically moved into GIT, and from there the LocalisationUpdate process makes them available on all the WMF wikis.
If you have the LocalisationUpdate process running on your own wiki, it is likely that you need to update the software to reflect the change from SVN to GIT. When the latest version of LocalisationUpdate does not work for you, please create a bug in Bugzilla and ask for support.
Thanks,
GerardM
Translating:Waymarked Trails
When hiking, cycling, mountain biking or skating is your passion, you are always up for another great route. The Waymarked Trails application may provide you with the inspiration to go where you have never gone before.
The map shows sign-posted cycling routes around the world. It is based on data from the OpenStreetMap (OSM) project. OSM is a freely editable world map in which anybody can participate. That means that the maps are by no means complete, but it also means that you can contribute by adding new routes and correcting mistakes in existing ones. To find out more about OpenStreetMap, see the Beginner's guide.
The Waymarked Trails application recently found localisation support at translatewiki.net. As these things go, with 48 messages it is a small project, and consequently the software has already been localised into several languages.
Thanks,
GerardM
Friday, April 20, 2012
Do you speak #copyright? A challenge for the GLAM community
Peter Weis is a friend of mine who has been doing GLAM-related work for quite some time now. He is really good at restoring images and he loves to learn about copyright and licenses. The vagaries of copyright and licenses have been one of our favourite subjects. As Peter will attend the Open GLAM workshop in Berlin, I asked him to blog about it.
This is his first contribution.
Thanks,
GerardM
Wikimedia Commons only accepts material with a free licence that is valid in the United States, the country of origin and the location of the uploader. For most people, including many Wikimedians, a statement like this could be Greek to them. Learning Greek takes a long-term commitment - a process that is helped when you can talk to native speakers who help you understand the intricacies of that language. Just like learning Greek, learning about copyright is difficult.
If you want to learn a new language you can do that in school, at university, at an adult education center or in language courses via the Internet. But where do you go when you want to learn about copyright? Learning about copyright is not about understanding every exception; it is about getting a working knowledge of what is relevant for the content you are interested in. Usually, there is no easy chart or table that tells you what you need to know. That's where the native speakers of copyright can help you. The native speakers of copyright are usually copyright lawyers, for example within the legal departments of GLAM institutions.
Partnering with GLAM institutions usually involves media files, articles and such. Integrating a legal workshop into a GLAM cooperation can help tear down common misconceptions on both sides: while the legal department of a GLAM is well versed in national copyright law, Wikimedians are well versed in the guidelines and policies of projects like Wikipedia and Wikimedia Commons. Conducting a legal workshop can help tailor a content donation to the framework of Wikimedia’s projects.
On April 20th the Open GLAM workshop will be hosted in Berlin. The workshop aims to explore the legal questions surrounding cultural cooperation within the GLAM sector. It will feature successful open data initiatives, and there will be an open data licensing clinic with lawyers and legal experts. They will address issues and questions about common licensing frameworks. It will be a workshop for law aficionados and people who care about copyright. The outcome will probably be relevant for the GLAM Wiki movement. I’m going to report back to you with my personal highlights of the event.
Regards,
Peter Weis
Thursday, April 19, 2012
The #Batak #script gets its #font
Thank you #Wikimedia Foundation
Once the existence of a script is recognised in ISO 15924, the characters of that script can be defined in Unicode. Once they are defined in Unicode, a font can be created for the script. When this font is freely licensed, everybody may use it. Existing documents and literature can be transcribed, and texts that were transcribed in something other than Unicode can be converted.
With documents and literature transcribed or converted, they can become available for research, and when they can be easily found, it becomes possible to understand more of the history of the Batak people from their own point of view.
A free font makes it possible to display the Batak characters. As you can understand, with the development of a first Unicode font for the Batak script, there is not much to display yet. Documents and literature have to be transcribed, and typically a keyboard is used for this. A standard keyboard does not map to Batak characters; a keyboard mapping is what makes this possible.
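In essence, a keyboard mapping is a small lookup from the keys people can type to the code points they actually want. A toy sketch of the idea follows; the Batak block really does start at U+1BC0, but the pairing of Latin keys to Batak letters shown here is purely illustrative, and a real layout would be designed with speakers of the languages.

```python
# Toy input-method sketch: replace typed Latin keys with Batak code points
# (Unicode block U+1BC0..U+1BFF).  The single mapping below is illustrative.
import unicodedata

KEY_MAP = {
    "a": "\u1BC0",   # U+1BC0, the first letter of the Batak block
}

def to_batak(typed):
    return "".join(KEY_MAP.get(ch, ch) for ch in typed)

for ch in to_batak("a"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
```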
The Wikimedia Foundation makes it possible to have this font. A grant has been given for a GLAM project about Batak documents in Dutch museums and archives. Scans of Batak documents will be uploaded to Commons, and the transcriptions will have to go to a Wikisource that supports both the WebFonts and the Narayam extensions.
Now that we can start building a font, we are getting ready to reach out to people who have documents in one of the Batak languages and to people who know these languages. As people like myself transcribe documents, they are the ones who have to identify the precise Batak language.
As you may understand, a project like this is many-faceted. It will be an adventure to learn how many facets there are. Watch this space for future updates.
Thanks,
GerardM
Monday, April 16, 2012
#NPOV and information warfare
#Wikipedia has the neutral point of view (NPOV) as one of its core principles. Wikipedia finds its public all over the world, and consequently the sources that are used to establish facts come from all over the world.
Information warfare is practised by many opposing parties, companies and countries. One of its aspects is the "spreading of propaganda or disinformation to demoralize or manipulate the enemy and the public, undermining the quality of opposing force information and denial of information-collection opportunities to opposing forces."
When information provided by sources is tainted by premeditated lies, and when this is an established fact, the question of a neutral point of view is no longer about different viewpoints but about different fabrications. Wikipedia frowns upon the use of some sources. Maybe even more sources need to be considered for what they are: conduits of fabrications rather than sources of facts.
Thanks,
GerardM
PS: inspired, as so often, by Bruce Schneier.
Sunday, April 15, 2012
#Scans, #transcriptions and #copyright
Wonderful news. This week the University of Oxford and the Vatican announced a plan to collaborate on digitising 1.5 million pages of rare and ancient texts, most dating from the 16th century or earlier. All this will become available online, and most likely a copyright will be claimed on these scans.
Sadly, some countries accept that "sweat of the brow" is enough reason to prevent pictures or scans of works that are obviously in the public domain from becoming available as public domain. The usability of scans is, however, limited. It is only once the texts in such scans have been transcribed that you can easily read and research them.
Both scanning and transcription are a lot of work. Both involve a lot of sweat of the brow. Where scanning precious books and documents requires specialists and equipment, transcription takes people willing to type the texts that they see in the scan.
When rights are reserved on scans using the "sweat of the brow" argument, the texts in those scans cannot be claimed in the same way. These books are all in the public domain, and transcribing them exactly from these scans serves one purpose really well: it establishes the provenance between the original sources represented in the scans and the texts people refer to.
Having the original text exactly transcribed is one way of dealing with problematic translations and the resulting problematic interpretations. Having such transcriptions in Wikisource is well worth it.
Thanks,
GerardM
Supporting #Asturian in the #CLDR
Finding the data to support a language in the CLDR can be a struggle. The core requirements are only a few so how hard can it be...
- (04) Exemplar sets: main, auxiliary, index, punctuation. [main/xxx.xml]
- (02) Orientation (bidi writing systems only) [main/xxx.xml]
- (01) Plural rules [supplemental/plurals.xml]
- (01) Default content script and region (normally: the country with the largest population using that language, and the normal script for that). [supplemental/supplementalMetadata.xml]
- (N) Verify the country data ( i.e. which territories in which the language is spoken enough to create a locale ) [supplemental/supplementalData.xml]
- *(N) Romanization table (non-Latin writing systems only) [spreadsheet, we'll translate into transforms/xxx-en.xml]
When you read this text indicating what the initial requirements are, it becomes quite obvious why this process has such a bad reputation.
- It is not clear, nor relevant to the contributor, where the provided data ends up in an XML format.
- Orientation is very much an aspect of a script, not of a language or of a locale.
- In the survey tool, it is Esperanto that proves that a language may not fit into a locale anyway.
- Romanisation is stated as a requirement. It is, however, not obvious at all that every script or language has ever been romanised in a standardised way, nor why this might keep a language out of the standard.
In the past there has been an attempt to provide information for the Asturian language to the CLDR. The good news is that there is documentation on why it failed. The problem was that when you establish data about a language, you need to be certain. Four Unicode characters ('Ḥ ḥ Ḷ ḷ') are used for writing the Asturian language properly, and the literature on the subject was not consistent. This issue was resolved, but it took more time than was available in the CLDR time box.
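For reference, the four characters in question are ordinary precomposed Unicode code points; a quick check with Python's unicodedata module shows them:

```python
# The four characters needed to write Asturian properly, with their
# code points and official Unicode character names.
import unicodedata

for ch in "ḤḥḶḷ":
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")

# Expected output:
# U+1E24  Ḥ  LATIN CAPITAL LETTER H WITH DOT BELOW
# U+1E25  ḥ  LATIN SMALL LETTER H WITH DOT BELOW
# U+1E36  Ḷ  LATIN CAPITAL LETTER L WITH DOT BELOW
# U+1E37  ḷ  LATIN SMALL LETTER L WITH DOT BELOW
```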
The Asturian example proves that getting data ready for a standard takes time. The practice of closing a request because the data was not provided within a set amount of time is what stopped people dead in their tracks. We can only hope that the Asturians will find what it takes to get support for their language within this year's time box.
Thanks,
GerardM
Thursday, April 12, 2012
What should get into #Wikibooks or #Wikisource
When the number of existing copies of an important book like "Nazism and euthanasia of the mentally ill in Germany", or "Die Tötung Geisteskranker", written by Alice Ricciardi-von Platen, is fewer than twenty, it is important that such a book is transcribed and becomes widely available.
This book was the first written on the murder of the mentally ill in the "Third Reich". It is based on her personal experiences and, more importantly, on her observations at the Nuremberg Doctors' Trial.
Mrs Ricciardi has died, and one of her family members promised to seek approval for the book to be made available under a free license. Sadly, we are still waiting.
It is, however, books and sources like this that are of an importance that goes beyond the abstract niceties of copyright. Mrs Ricciardi did not get rich because of this book; there is no monetary value in the copyright.
It is, however, important that our generation and later generations do not have the excuse that "they did not know". Wikibooks and Wikisource are in a great position to play a role here. In the meantime, I will ask my friend if he has an update about the copyright of the book by Mrs Ricciardi.
Thanks,
GerardM
#CLDR is not used solely for locales
#Unicode is best known for its architecture for the digital representation of the characters of scripts. The Unicode consortium also hosts the "common locale data repository" project, also known as the CLDR. In this repository you can find how languages are used in different areas. You will find, for instance, what currency symbol is used. There is also information on the direction a language is written in and the way numbers and dates are written.
Many applications rely on this data. Several open source word processors use it to enable a language for editing. While it is great that this data is used, it is problematic, because not all languages are spoken, let alone tied to locales.
One great example is Ancient Greek; it is not a living language, but it is taught in schools all over the world. Students do their homework in it, and what they write is certainly not modern Greek. From a technical perspective, it is only correct when the metadata of such documents indicates that it is Ancient Greek.
When Ancient Greek and other extinct languages are supported in word processors, surviving texts can be written with modern tools. These transcribed texts will by default have correct metadata, and it will be easier to find them when they are placed on the Internet.
For this to happen, either the CLDR embraces the fact that it is used to enable languages in word processors, or the word processors that currently use the CLDR allow for alternate sources of primary data.
Thanks,
GerardM
Wednesday, April 11, 2012
#Wikidata and what it can do for #Wikipedia
In an opinion piece, Mark Graham fears for the many different opinions that exist in the many different Wikipedia articles on the same subject. However, many Wikipedias will do really well when they have high-quality info-boxes, provided by Wikidata, available to them.
The article on Mali on the Zulu Wikipedia contains the info-box I used as an illustration and the text: "iMali izwe eAfrika". It is a stub, and I am REALLY pleased with the activity on the Zulu Wikipedia. They are working hard copying info-boxes and creating stubs, but I would much prefer it if all that were needed was to translate the labels, and maybe the values as well, and have tons of stubs in next to no time. Once well-maintained data becomes available, the value of even minimal articles like this will go up a lot.
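To make the idea concrete, here is a hypothetical sketch of the split between the facts, which Wikidata would maintain once for everybody, and the labels, which each Wikipedia only has to translate; the figures and the Zulu label translations are illustrative, not sourced.

```python
# Hypothetical sketch: one set of facts about Mali, rendered with
# per-language labels.  Values and Zulu translations are illustrative only.
facts = {
    "capital": "Bamako",
    "population": 14_500_000,
    "currency": "West African CFA franc",
}

labels = {
    "en": {"capital": "Capital", "population": "Population", "currency": "Currency"},
    "zu": {"capital": "Inhlokodolobha", "population": "Inani labantu", "currency": "Imali"},
}

def render_infobox(lang):
    # The facts stay the same; only the labels change per language.
    for key, value in facts.items():
        print(f"{labels[lang][key]}: {value}")

render_infobox("zu")   # same facts, Zulu labels
```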
When people start to write the articles that go with the data, the articles will have to fit the data. There will be fights over the data, but I am sure that in the final analysis the facts will provide their own neutral point of view. This will not remove the differences in approach in the articles themselves, but it will provide a common baseline based on facts.
Thanks,
GerardM
Tuesday, April 10, 2012
Sisters are doing it for themselves
#Wikipedia has sister projects. They are mighty fine sisters too; they have their own purpose, their own communities and their own public. It is easy to argue that Wikipedia on its own will never provide people with the sum of all knowledge, as Wikipedia restricts itself to encyclopaedic content.
As the siblings of Wikipedia are not getting the attention they deserve, there is now a "Sister Projects Committee" that will seek a remedy.
On the talk page, many issues and opportunities are mentioned. One of them is close to my heart: coordinating Wiktionary and OmegaWiki. OmegaWiki was always ready to be adopted by the Wikimedia Foundation; this never happened. Now that Wikidata is being prepared for development, it will be interesting to learn whether the new project will be capable of surpassing OmegaWiki in its support for multiple languages. When it does, it will be interesting to see if Wikidata is able and willing to go where OmegaWiki wants to go and share the journey.
As there are more sister projects, all with their own opportunities, it will help when they get more of the attention Wikipedians take for granted.
Thanks,
GerardM
#logo for #Wikidata
Wikidata has the potential to become a very influential project. Logos for the project are being considered, and if you are interested in making the selection even more challenging, you can still add yours to this list of proposed logos.
Thanks,
GerardM
#Wikitravel for #Wikimedia?
A core group of Wikitravel editors is interested in forking the Wikitravel project and running it as a Wikimedia Foundation project.
Given that the content of Wikitravel is available under a free license, this is certainly possible. It is being discussed on Meta, and the subject is also discussed on a mailing list. Wikitravel is a website that is owned by a company, and it fills a niche by providing information related to travelling.
When you read up on the deliberations, you will find that some people question whether travel information is educational, and that some people consider if and how a "neutral point of view" applies. You will find that people do not stop being Wikipedians when they consider other projects.
There are great arguments to be made for the WMF to have its own travel wiki, and several other projects besides. There are great arguments why the WMF should not have more projects. It is all about the ability to consider the needs of other projects and the will to act upon such considerations. The WMF is supremely placed to make a success of a project like Wikitravel; the question is whether it wants to.
Thanks,
GerardM
Monday, April 09, 2012
Crazy #Font
A font for a crazy occasion is readily available for the Latin script. It is great when you can be this expressive, but even when you have the creative ability to design such characters, you still need the technical ability to turn them into characters in a font.
Ttfautohint is a program that has found the funding needed to go the extra mile. It provides a 99% automated hinting process for web fonts, acting as a platform for finely hand-tuned hinting. This is important to make sure that the characters of a font scale well and are usable on many platforms.
What makes it so cool is that once the new functionality is available, it will become even easier to create fonts. Now consider what crazy fonts will look like in the Malayalam or the Tamil script...
Thanks,
Gerard
Friday, April 06, 2012
The #world described in #Wikipedia
Sometimes there is a tool that is just great. Mark Graham and the Oxford Internet Institute, together with Gavin Baily at Trace Media, have created the Mapping Wikipedia project. It maps the world with the articles that cover it and shows really well what areas are of interest in the diverse projects.
Below are the screenshots for the languages that were analysed. It is possible to drill down and find the articles themselves.
Have fun!
Thanks,
GerardM
Map for Arabic Wikipedia articles
Map for English Wikipedia articles
Map for Egyptian Arabic Wikipedia articles
Map for French Wikipedia articles
Map for Farsi Wikipedia articles
Map for Hebrew Wikipedia articles
Map for Swahili Wikipedia articles
Wednesday, April 04, 2012
Reading the wiki of "other" #Wikimedia chapters
With the announcement of the Chapter Council Steering Committee, which will prepare the way for 26 chapters to join a Wikimedia Chapters Association, the need to communicate will become even more obvious for the chapters that are involved. Obviously, chapters primarily exist for the people who live in the area or the country that is the remit of that chapter. The "language of the land" will therefore take centre stage on their wiki.
It will help a lot when the essential pages of such a wiki are translated so that visitors from abroad can navigate them. They will want to know the basic information, the monthly reports... Once published, these are all fairly static, and consequently they are prime candidates for translation.
When the websites used by the chapters are wikis, it is easy to implement the Translate extension. Once a visitor selects a language in the preferences, translated pages will show up in that language when the "mylanguage" special page is used in the wiki link. The syntax is like this:
[[Special:Mylanguage/Article name|Article name]]
If you refer to pages that are translated in this way, reading the content becomes more convenient.
Thanks,
Gerard
Tuesday, April 03, 2012
#Chennai hackathon IV
Several of the Chennai hackathon projects were interesting because of the different approaches they took to known problems. A hackathon is a perfect place to experiment; there are many people around, and they often have their own ideas about how things should be done.
Wikiquotes via SMS
You send the name of a person and in reply you get quotes from that person. The idea makes sense, but the problem is that the quotes in Wikiquote are not available in a structured way. When quotes are individually tagged, an application that sends SMS messages straight from the source becomes possible. The approach of this project was to copy quotes into a database and use that instead.
When the Wikiquote communities start using templates for each quote, it will become possible to use Wikiquote itself.
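A sketch of what that could look like; the {{quote|...}} template assumed here is hypothetical, but the API call used to fetch the page wikitext is the standard MediaWiki one, and the third-party requests library is assumed.

```python
# Sketch: if every quote were wrapped in a (hypothetical) {{quote|...}}
# template, an SMS gateway could read Wikiquote directly instead of
# copying quotes into its own database.
import re
import requests

API = "https://en.wikiquote.org/w/api.php"

def quotes_for(person):
    params = {"action": "parse", "page": person, "prop": "wikitext", "format": "json"}
    data = requests.get(API, params=params).json()
    wikitext = data.get("parse", {}).get("wikitext", {}).get("*", "")
    # Pick out every use of the hypothetical per-quote template.
    return re.findall(r"\{\{quote\|(.*?)\}\}", wikitext, flags=re.DOTALL)

for quote in quotes_for("Albert Einstein")[:3]:
    print(quote)
```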
Translation of Gadgets/UserScripts to tawiki
There is no doubt that the translation or localisation of software makes a hell of a difference to its usability. The standard approach is to localise at translatewiki.net, but gadgets and user scripts are not yet supported in this way.
It is wonderful to learn that the ProvIt and TwoColumn gadgets are now available in Tamil. The challenge that remains is how to make them available in other languages and how to maintain them when the code needs changing.
Lightweight offline Wiki reader
Even though there are several offline readers like OkaWix and Kiwix, it was the considered opinion that there is room for another one: a reader that can be used when there is only a little storage space available. The Qvido project was revived at this hackathon and is now available with "build" instructions.
Open source projects exist as long as there is an interest in developing them and as long as there are people who benefit from them. Being able to host MediaWiki content offline as widely as possible is certainly reason enough for this project to get attention.
Program to help record pronunciations for words in tawikt
When you learn a foreign language, it is really hard to get the pronunciation of words right. A program that allows the recording of some 500 words in half an hour is really useful. However, getting the recordings uploaded to Commons is equally important, because this can be an enormous time-sink, and this is one part of the puzzle that still needs solving.
WikiPronouncer
Another approach to recording pronunciations was to do it on an Android phone. Developing an app in a day is too much of a challenge. What these two projects make clear is that pronunciations are a feature that is very much in demand.
As sound files are used extensively on other Wiktionaries, it does make sense to learn the best practices from elsewhere. The least they need is a Tamil makeover, eh, a localisation.
Thanks,
Gerard
Labels: hackathon, India, off line, pronunciation, SMS, Wikiquote, Wiktionary