Monday, December 31, 2007
Happy New Year
One of my hopes is for SignWriting to do well in the new year. I hope that many people will learn to read and to write their sign language. I wish for all of us that the best is still ahead of us... Happy New Year :)
Thanks,
GerardM
Linguistic tolerance
There are problems with this policy. As I wrote elsewhere, a language may be dead. A dead language is a language that is no longer actively used: no new terminology is being coined and nobody uses it actively. Good examples are Hittite (hit) and Akkadian (akk). In my opinion they can have a Wikisource, but a Wikipedia is problematic because you cannot write in these languages for a modern public without changing the language into something completely different. It does not even make sense to have a MediaWiki localisation for such languages.
There are more problems. What to do with languages where, from within the culture, it is prohibited to write the language down? What to do with languages that have few speakers? What to do with languages where few people are truly literate in their language? What to do with constructed languages?
The biggest issue with all these issues is one of competency. Who is competent to judge that a language is truly dead? At what level are there sufficient people in a community to support a language for a WMF project? How do we judge the quality of our projects and, as importantly, who is to judge? Also, does the WMF have a responsibility for less resourced languages?
Brianna blogged about the Volapük wikipedia. For her and for many others, Volapük became an issue because they had the audacity to create enough articles to be noticed. People like Brianna feel offended because it upsets the notion of what Wikipedia is. Brianna introduces the notion of a "language ego" but I am sure she will agree that every non dead language deserves its place under the sun and only the people that communicate in a language are the ones that can realise a WMF project. The good news is there is plenty of sun and it is not expensive to have another language.
When people equate artificial languages with languages without merit, they have a problem. Many languages have started out in this way. One of the more interesting examples is Italian; it was standardised by Dante and used as a lingua franca until the unification of Italy, when it became an official language. Another example is Sardinian, where a constructed, merged linguistic entity that has not been recognised by the ISO-639-3 registrar is recognised in Italian law.
When you compare Volapük with Klingon, the biggest difference is that Volapük allows you to express all modern subjects.
For me the issue of the Volapük Wikipedia is a non-issue. I know three people who speak Volapük; not all of them are involved in this project. Given the competency of the people that are, there are no issues in getting the information in Wikipedia right. Wikipedia has always allowed for a project to evolve and insisted on the independence of communities. I am sure that in the end both Wikipedia and the Volapük Wikipedia will emerge stronger from all this.
Thanks,
GerardM
Sranang Tongo
What is becoming a comfortable routine is requesting the great people at BetaWiki to support another language. People that know the language can now start with the most used messages. I wish the proposers of this new project well, and I hope to hear from them when they consider themselves ready for the big time :)
Thanks,
GerardM
Saturday, December 29, 2007
Farmer; tactical technology
MediaWiki can be found in an NGO in a box. This is an initiative of Tactical Tech, an organisation that aims to "demystify technology for non profits". The big selling points for MediaWiki are the many people that know MediaWiki through Wikipedia and the many languages that are supported by it.
Recently Farmer was welcomed as an extension in BetaWiki, and the developers active at BetaWiki have been working hard at improving the messaging of Farmer. Farmer will make the maintenance of MediaWiki less challenging. With its messages translated in more and more languages, MediaWiki becomes more and more tactical technology.
Thanks,
GerardM
Friday, December 28, 2007
Firefox 3 beta
What I like:
- When I click on an Arabic text it will cleanly select the whole word.
- URLs are shown much more cleanly
- Firefox is still a great program; it seems more stable and responsive
What I do not like:
- The presentation of the browsing history
- There is a bug that has a tab go to the beginning of the page
Gerard
Monday, December 24, 2007
Localisation fast and furious
You will see a lot of red, not good. This is a typical situation of the "cup being half full" as the list includes more languages. With a new visualisation you cannot compare. So you do not notice the many recently added extensions. You do not notice the many recently added languages. You do not notice that there would have been more red for many languages. :)
If you want to help MediaWiki, help us improve the MediaWiki localisation for your language at BetaWiki.
Thanks,
GerardM
Friday, December 21, 2007
BetaWiki exports .po files
The most important thing about this format is that there are many tools that support it for off line translation. Many translators will not work on-line. For many languages we do not need to accommodate off line translation when there are sufficient people willing to maintain the localisation. However, when you look at the statistics, you will find that there are many languages not supported or poorly supported in MediaWiki. Hindi is a good example. The Wikipedia is well localised, but for Hindi only 3.99% of the system messages are translated. Hindi is spoken by 180,000,000 people...
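To make the format and the percentage concrete, here is a minimal sketch of how the translated share of a .po file could be counted. The sample messages are hypothetical and not taken from the BetaWiki export, and the parser is deliberately simplified: it assumes single-line msgid/msgstr pairs and ignores multi-line strings, plural forms and fuzzy flags.

```python
import re

# Hypothetical sample in .po syntax; a real export would be far longer.
SAMPLE_PO = '''
msgid ""
msgstr ""

msgid "Search"
msgstr "खोज"

msgid "Edit"
msgstr ""

msgid "History"
msgstr "इतिहास"
'''

# Matches one single-line msgid immediately followed by its msgstr.
ENTRY = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)


def translation_stats(po_text):
    """Return (translated, total), skipping the header entry (empty msgid)."""
    pairs = [(mid, mstr) for mid, mstr in ENTRY.findall(po_text) if mid]
    translated = sum(1 for _, mstr in pairs if mstr)
    return translated, len(pairs)


translated, total = translation_stats(SAMPLE_PO)
print(f"{translated} of {total} messages translated "
      f"({100 * translated / total:.2f}%)")
```

An empty msgstr is what an untranslated message looks like in this format, which is why off line tools can show a translator exactly what is still missing.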
For Hindi .po files will not be the solution. Collaboration between Indian Wikimedians and the BetaWiki administrators will be a better solution. There are also languages like Neapolitan where it helps to localise and then have the localisation proof read. When the number of collaborators for a language is small, it is typically easy and safe to work off line. You do not have to wait for loading and saving, you can combine it with a translation memory and make it efficient.
Nikerabbit has now created a .po importer. What he is looking for are translations to test this new functionality...
Thanks,
GerardM
Thursday, December 20, 2007
Wednesday, December 19, 2007
WOSI, a really cool Open Source/Standards project
The objective of the school is to provide an environment for their students that will give them a real feel for what it is like to work in the ICT business and teach them Open Source and Open Standards. Students that are part of this project will experience that a project does not start from scratch, there is always something to build upon, there are always conflicting requirements, there is always the need to ensure interoperability because the use of Open Standards is a precondition.
The WOSI project is in its second year and it is growing in size. Students from other disciplines are getting involved as there is overlap with other specialties like communications and marketing. More housing corporations (woningbouwcorporaties) are interested in the project, as well as other schools. The great thing is that as Open Source and Open Standards are key to this curriculum, professionals who know how to apply these notions in real world scenarios will enter the job market. This is likely to prove the biggest boon of this really cool project.
Thanks,
GerardM
Tuesday, December 18, 2007
Interoperability is important
I am subscribed to the DBpedia mailing list and today I read about errors in Wikipedia that had to do with Wigan and Manchester City. Errors were found and the gentleman wrote that he can and will make the necessary updates. His question is when will the DBpedia reflect the changes.
When the data of Wikipedia is analysed with tools, and when the results are found to be of value, it adds relevance to what enables this collaboration. It typically relies on the availability of dumps. When the data is analysed, a new work emerges. When it has a completely different format, it is possible to mesh it with other data sources. This in turn will help establish the validity of the Wikipedia data and will allow for the extension of the data.
When multiple data sources are meshed, the issues of copyright and licensing rear their ugly heads. You can create static and dynamic meshes. In a dynamic mesh you can build the mesh depending on what the person has access to. In a static mesh you can only include the data that is still available to the least privileged person who will get access to the data.
The consequence is that many people and organisations will mesh sources, manipulate data, publish and not indicate what all the sources are. They will not do that because they do not want to be bound by all kinds of licenses and because they do not want to be hassled.
This DBpedia example shows that the presentation of facts is important. It demonstrates that interoperability will result in a better Wikipedia. It is important for Wikipedia to be as open and engaging as it can be. Frankly, when people analyse our data in a similar way to DBpedia, the result is a new work; it should not be considered derivative. Best practice is to publish sources and this, more than the viral nature of a license like the GFDL or CC-by-sa, will drive collaboration and give Wikipedia more relevance.
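The difference between the two kinds of mesh can be sketched in a few lines. This is only an illustration under assumptions of my own: access is modelled as a single numeric level, and the source names and facts are hypothetical placeholders, not real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    min_level: int  # lowest access level that may see this source (assumed model)
    facts: dict


def dynamic_mesh(sources, user_level):
    """Build the mesh for one person: merge every source that person may see."""
    merged = {}
    for source in sources:
        if user_level >= source.min_level:
            merged.update(source.facts)
    return merged


def static_mesh(sources, audience_levels):
    """Build one mesh for everybody: only what the least privileged may see."""
    return dynamic_mesh(sources, min(audience_levels))


# Hypothetical placeholder sources.
SOURCES = [
    Source("open encyclopedia", 0, {"club": "placeholder value"}),
    Source("licensed database", 2, {"attendance": "placeholder value"}),
]

print(dynamic_mesh(SOURCES, user_level=2))          # both sources
print(static_mesh(SOURCES, audience_levels=[0, 2]))  # open source only
```

The static mesh is simply the dynamic mesh computed for the least privileged reader, which is why it can never contain more than what that reader is allowed to see.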
Thanks,
GerardM
Sunday, December 16, 2007
Localisation of MediaWiki
The Localisation statistics show the languages that have a central localisation and the percentage of the messages that have been done. It clearly shows that the MediaWiki localisation leaves a lot to be desired; for some 144 languages fewer than half of the messages have been localised. At this moment there are 235 languages known to MediaWiki. When you compare this to the 253 languages that have a Wikipedia and add the languages that are starting in the Incubator, you get a clear picture of how much effort is needed to better support the readers of MediaWiki projects.
When you look at the statistics, the glass is half full, and it is filling. On average five languages are introduced every month and more than 500 messages are translated every day. The languages that are in the Incubator are doing well; Seeltersk for instance has done an astonishing 99.3%.
One of the latest innovations in BetaWiki is the core top 500 messages. They are the most important messages, and with these translated, MediaWiki is usable for a language. BetaWiki has a dedicated team of people that make MediaWiki, and as a consequence MediaWiki projects, usable for many people. With your help, we can improve the localisation even further. One message at a time will slowly but surely provide proper support for all the languages MediaWiki supports.
Thanks,
GerardM
Friday, December 14, 2007
It is perfect after all
This makes me perfectly happy. Now I know how the water flows from the Oostvaardersplassen into the "Wilgenbos" and I know that there will be a lot of work that needs doing. But when Staatsbosbeheer, as it does, states that fish will be able to swim in and out .. really great news.
Thanks,
GerardM
Wednesday, December 12, 2007
Vindication of a kind
I visited the Oostvaardersplassen this weekend, and I learned that a small dyke will be removed, leading to a more natural distribution of water and a more dynamic water level. This will have a huge impact on the fish stock; the current population of mainly mature carp will make room for many more smaller fish. This will allow many small herons and other fish eaters to find their niche.
The one remaining question for me is whether fish will be able to freely migrate in and out of the nature reserve. It would be grand if this is the case. As only one dyke is mentioned, I do expect it to be great but not "perfect".
Thanks,
GerardM
Monday, December 10, 2007
Burglary
After all the excitement, I find it hard to go back to sleep.. Anyway, this is real life drama. Not dramatic, but it keeps me from sleeping.
NB the word of the day is íshokkí.
Thanks,
GerardM
Wednesday, December 05, 2007
Shameless plug ...
As I absolutely approve of great projects doing great things, I am happy to shamelessly plug wikiHow and I wish it and all its language versions great editors and a great audience.
Thanks,
Gerard
Monday, November 26, 2007
Wiktionary upset
Not taking things for granted is a healthy attitude. There is an inherent bias against the French language Wiktionary in the Alexa numbers. However, I am impressed by the numbers quoted.
All the bigger Wiktionary projects have used bots to build up their content. I can imagine that a healthy rivalry will make the numbers go even higher, this would benefit the users of Wiktionary because I trust the Wiktionary communities to watch the quality of the content :)
Thanks,
GerardM
Wednesday, November 21, 2007
Pride in a language
Many people feel strongly about their culture, their language. What I find special are the people that go the extra mile to promote their language. It is therefore that I am grateful when I notice languages like Spanish, Georgian, Breton and now Eastern Yiddish having a champion that makes a difference.
It is especially interesting to see how, with an ever increasing amount of terminology, the information becomes rich. Rich both for the people who are interested in learning the language and also for the people that want to learn other languages starting from these languages.
I am grateful when people find in OmegaWiki a tool that helps to document and service their language. It is an imperfect tool but its redeeming quality is that it is getting better as we go along.
Thanks,
GerardM
Tuesday, November 20, 2007
Thank you Mycom
I got home and it was still not working. So I played with my configuration and I could not get it to work. So I went again to my computer supplier Mycom and we fiddled with all kinds of values. For whatever reason we got it to work but we were not able to pinpoint what the issue was. I think that it may have to do with an upgrade of the Skype software that I did the other day but I am not sure. In the end, the friendly service I got was the highlight of the day :)
Thanks,
GerardM
Friday, November 16, 2007
69 page document in SignWriting
The reason for me to mention it is that it indicates that SignWriting is stepping over a threshold. It is enabling American Sign Language to be a literary language. This is another step closer to the realisation of a Wikipedia for ASL.
Thanks,
GerardM
Tuvin
On the Tyawiki there is a MediaWiki installation that aims to create a repository about Tyva. With the localisation of Tuvin in MediaWiki, they will be able to do a much better job.
I welcome this first localisation effort I am aware of that is driven from outside the Wikimedia Foundation. It demonstrates how MediaWiki is getting recognition for the outstanding software it is.
Thanks,
GerardM
Thursday, November 15, 2007
Flags for languages
Languages are spoken on both sides of a border. Languages are spoken by people who do not recognise countries or flags. Languages are separate from nationhood. Consequently it is in my honest opinion wrong to associate languages with flags. There are no obvious symbols for languages and for many languages they would share the same flag. OmegaWiki is not likely to ever associate a language with a flag.
Thanks,
GerardM
Statistics
When you compare the OmegaWiki statistics with the Wiktionary stats they are doing really great. With a daily rank of 1,601 it is time for the Wikimedia Foundation to demonstrate that they have a valuable resource in Wiktionary :)
Thanks,
GerardM
Tuesday, November 13, 2007
Political science
Effectively the right honourable gentleman is looking for a reduction in the cost of science by increasing the involvement of even more people to select and manage scientific projects. Effectively this will lead to less money for doing science. Effectively it will not be the scientists who select the projects that are considered to be of scientific value.
The dismay I feel is because it is likely to lead to yet more "politically correct" science. Science that has more to do with what the expedient results should be and not with scientific relevance. When you consider the huge amounts of administrative and other overhead, it is a wonder that scientific research is still practised.
Thanks,
GerardM
Sunday, November 11, 2007
When is a project alive
Today, an anonymous person replied to this blog entry. The suggestion is made to cooperate with the SWAD Europe group. They have a website, a blog but it all stopped in 2004. So I am wondering about all these projects, all this effort that just stops. Projects that may be valuable and given that people promote it in 2007, may still be alive. For me there is no way of knowing.
I have an idea how I would use semantic data in OmegaWiki. What I am not so sure about is how semantic web applications would use OmegaWiki data. In essence OmegaWiki is multi-lingual and exporting it in anything but a machine readable version only, would strip what I think is valuable in OmegaWiki.
Collaborating with for instance a SWAD Europe group makes sense. People can suggest cooperation; it should however be a two way street. Just pointing out that there are others does not help me much.
Thanks,
GerardM
Wednesday, November 07, 2007
The World Language Documentation blog
It is with pleasure that I inform you that the World Language Documentation Centre has a blog. As the WLDC is ambitious in what it wants to achieve, and as many of these objectives will take time, it is great that there is a blog where the board members of the WLDC can publish about what they find relevant.
I hope that the many members of the WLDC board will find the time to blog because this will help you appreciate the amazing qualities that you find in these people. As I have the privilege to be on this board as well, some of the subjects that I have written about in the past will now be covered on the WLDC blog.
I hope you will find the WLDC blog interesting enough to follow it in your RSS reader. :)
Thanks,
GerardM
Sunday, November 04, 2007
Stuttering
With great surprise I learned that stuttering also occurs in sign languages. I learned this from a mailing list that deals with sign languages and SignWriting. The implications are quite profound. It means that stuttering is not necessarily a speech disorder and consequently when it is not, speech therapy does not work.
This similarity in the problems between signed and spoken languages indicates that the format of communication is incidental. The same mechanisms are at play and therefore one is as good as the other. To me this seems obvious, yet many people rate their own method of communication as superior. The spoken language is superior for communicating with me, as I am dumb when it comes to signed languages...
Thanks
Gerard
Saturday, November 03, 2007
Who is Frank Thompson, and why include them in Wikipedia
For me it is interesting to see how Wikipedia deals with what is considered relevant. To me these people are irrelevant; both misters Thompson are no longer in office. But they are deemed to be noteworthy enough to link to where an article might be.
In a similar way there are articles about pop stars who had a single hit in 1962, and there are articles about wide receivers that only played one season. There is a lot of information that is of no importance and that is fine.
What astounds me is that when an article is written in the German and the English Wikipedia about Kotava, a constructed language, it is speedily deleted. When it is then indicated that this language is en route to be recognised in the ISO-639-3 code, the comment is speedily deleted as well. The article that was deleted was more than a stub, it cited sources and I did not write it.
I would love to understand why a mayor of Yarra, a 1962 pop star or a 1956 wide receiver are "relevant" and a language like Kotava is not.
Thanks,
GerardM
Tuesday, October 30, 2007
Kotava - another constructed language
At this moment Kotava is not eligible for a Wikipedia, and it will not be enabled for editing in OmegaWiki. It does not have an ISO-639-3 code yet. What is special is that there are clear indications that the process for a code is under way. The code is likely to be "avk".
At OmegaWiki there is a Kotava enthusiast who has started a lot of the preparations for another language. It will be given once the code is official. For a Wikipedia, they may ask for a Kotava Wikipedia. With the ISO process under way, the language committee does not have to do anything until the code is granted.
It is great that the language committee has reserved the right to do nothing..
Thanks,
GerardM
WCN, the network - an unsung hero
GREAT :)
Thanks,
GerardM
Sunday, October 28, 2007
Google docs - published presentation
Saturday, October 27, 2007
WCN
I had the privilege to give a presentation, and as I had to do some serious travelling to be there, I considered to what extent I could reduce what I took with me. I decided that with the latest Google application I did not need to bring anything. I could rely on there being a network, as Kim brought his Merakis, which are still on the old functional software, and finding one functional laptop should not be a problem either.
To prove that it was indeed the Google presentation tool, I selected one of the available backgrounds. It is different. What I need in a presentation is basic. I need to show some texts and some screen dumps. I think there is some need to polish the handling of the screen dumps.
One of the nice things is that the presentation can be made available. So have a look and let me know what you think.
Thanks,
GerardM
Monday, October 22, 2007
Breton
There is an organisation that is actively promoting the Breton language. I am really happy to inform you that they have taken the trouble to localise the system messages of OmegaWiki and have started to add translations in Breton. I have sent a file with many of the languages that are in the ISO 639-1. This combination will localise most of OmegaWiki in Breton.
The argument that proved convincing is that you get all the information we have. By steadily increasing the translations available in Breton, the experience will improve.
Thanks,
GerardM
Saturday, October 20, 2007
Import, export
The export creates a txt file with the columns separated by tabs. The columns are the number identifying the DefinedMeaning, and combinations of the Expressions and Definitions. The reason why we start with this export is that it is still much quicker to translate in a spreadsheet than it is to translate on the web. These experiments are done in a test environment and we hope to bring it live soon. I can already send you a file when you are interested :)
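As an illustration of how such a tab-separated file can be handled outside a spreadsheet, here is a small sketch that lists the DefinedMeanings still lacking a translation. The column names, identifiers and sample rows are assumptions made for this example; the actual export may order and name its columns differently.

```python
import csv
import io

# Hypothetical export: column names and identifiers are illustrative only.
SAMPLE_EXPORT = (
    "dm_id\texpression_en\tdefinition_en\texpression_br\tdefinition_br\n"
    "1001\twater\tA clear liquid.\tdour\t\n"
    "1002\tlanguage\tA system of communication.\t\t\n"
)


def missing_translations(export_text, column):
    """Return the DefinedMeaning ids whose given column is still empty."""
    reader = csv.DictReader(
        io.StringIO(export_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [row["dm_id"] for row in reader if not row[column].strip()]


print(missing_translations(SAMPLE_EXPORT, "expression_br"))
print(missing_translations(SAMPLE_EXPORT, "definition_br"))
```

Keeping the DefinedMeaning number in the first column is what lets a translated file be matched back to the right entries later on.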
When we start to import, it is likely that languages can make quite a jump in the statistics. I hope it will encourage people to help us by providing translations particularly for the less resourced languages.
Thanks,
GerardM
Friday, October 19, 2007
Dbpedia
It does a great job and it is different from that other project that deals with structured information, Semantic MediaWiki, in that it does operate by data mining information from Wikipedia while Semantic MediaWiki is an integral part of a MediaWiki project.
The great thing about dbpedia is that it explicitly encourages interlinking. With interlinked data, data that can be found in another resource becomes available, limited only by the quality of the interface.
You might ask how OmegaWiki fits into all this. Dbpedia's information is in English while OmegaWiki allows for the representation of information in any language. Both OmegaWiki and dbpedia link to Wikipedia articles and consequently where the two share a link, the information can be mashed together.
In Wikiprotein there is a large amount of medical information available. This information does link to external databases. Given that the medical articles relate to external databases, there is an opportunity to link the data to Wikipedia articles. With Wikipedia articles linked in this way, more specialists will find their way to the Wikipedia medical articles and this in turn will make more enriched information available.
The thing to consider now is how OmegaWiki can benefit from dbpedia. One of the issues is the difference in license. Then again, dbpedia provides its algorithms and consequently the result is not necessarily under the license that dbpedia posts.
Thanks,
GerardM
Friday, October 12, 2007
Domain names in other scripts
From a localisation point of view, this is one of the ultimate challenges. Consider: the .org has two components, the dot (.) and the org. The org needs an equivalent in all scripts. This top level domain or TLD is just one of many. ISO 15924 defines many scripts, and a particular combination that is auspicious in one language can be the equivalent of wtf in another. Choosing these codes is not trivial. There are not only TLDs but also ccTLDs or country code top level domains.
We have agreed that ويكيبيدي is Arabic for Wikipedia. Wikipedia is very much an international movement. We have agreed that Wikipedia is to be used for the Latin script in our domain name. The question is whether ويكيبيدي will be accepted to represent Wikipedia in the Arabic script and, when there are multiple ways of writing Wikipedia, whether we need to register all these domains.
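For the part of a name below the top level, one existing mechanism is IDNA, which maps a label in another script to an ASCII form starting with "xn--". This is not something the text above describes; the sketch below only assumes Python's built-in "idna" codec (which follows the older IDNA 2003 rules) and uses the .org ending purely as an example.

```python
ARABIC_LABEL = "ويكيبيدي"  # the Arabic spelling quoted above
DOMAIN = ARABIC_LABEL + ".org"  # example ending, chosen only for illustration


def to_ascii_domain(name):
    """Encode each label of a domain name into its ASCII-compatible form."""
    return name.encode("idna")


encoded = to_ascii_domain(DOMAIN)
print(encoded)                 # ASCII bytes, first label starts with xn--
print(encoded.decode("idna"))  # decodes back to the original spelling
```

Every different spelling produces a different ASCII label, so each way of writing Wikipedia in a script would indeed be a separate name to register.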
To make it even more confusing, what will the rules be when it comes to domain squatting. I can imagine that a brand is only registered for one script and not necessarily for another. I wonder what the position of the WMF will be; I am sure that it has not been considered yet.
ICANN is courageous; they are now experimenting with the technical issues and this will show that it can be done. I expect that the next part will be a proposal on how all the top level domain names are to be "transscripted". Then it will become interesting, because from that moment onwards the Internet will be truly global in its reach and no longer centered on one language or script.
Thanks,
GerardM
Saturday, October 06, 2007
A picture paints a thousand words
I think the Georgian script is pretty :)
Thanks,
GerardM
Now with grammatical gender
Thanks,
GerardM
OmegaWiki vs Semantic MediaWiki
Both OW and SMW are extensions to MediaWiki, so at first glance it seems like a reasonable suggestion. The two extensions however do completely different things.
Semantic MediaWiki will shine when it becomes part of a project like the English Wikipedia; when key data that is in the article is marked, it will provide a great improvement in making these facts available. SMW even provides a really rich environment to query the information. It is absolutely great and it is absolutely mono-lingual.
OmegaWiki is at this moment very much a stand alone application. It does not derive data from anything, it is great at presenting the same data in many languages. This means that when we know that the information exists, we will show it in "your" language.
SMW can export and when OW can import, we have the best of both worlds; that is to say we have the best of both worlds when we can link the resources to each other. As OmegaWiki is not encyclopaedic and does not want to be, it is our stated intention to link to Wikipedia articles. As the SMW is tightly linked to the Wikipedia articles, this may be just the trick.
The suggestion that OmegaWiki would only have the semantic information that is in Wikipedia is incorrect. In Wikiprotein we already have a really rich set of annotations of proteins. This is the kind of information that is not encyclopaedic. It enables scientists to maintain information on "their" proteins. The language of the science of proteins is English; however, many proteins are known in other languages. It is rich that all this can integrate as it brings diverse information together.
Both OmegaWiki and Semantic MediaWiki have their own, different strengths. Within Open Progress we use Semantic MediaWiki for our internal wiki. It works absolutely fabulously. I love both MediaWiki extensions!
Thanks,
GerardM
Thursday, October 04, 2007
Changes in the user interface
Well, some things seem like miracles and we can have them in five minutes. The problem with the collations is that at this moment it has to be done by hand. This means that it does not scale. With the "borders on" example however, we have a great showcase WHY we need to be able to sort these texts and also why they should be Expressions in OmegaWiki like all the other information that we hold.
Really, I could not be happier with the progress that started to happen. :)
Thanks,
GerardM
Wednesday, October 03, 2007
Is it a Wiki ?
The best argument why OmegaWiki is a wiki is that people can add or change little items one at a time; they are not compelled to do everything in one go. As this is acceptable, our data has to be correct but does not necessarily need to be complete.
With the new terminological support, we can now indicate that a language is a language; all kinds of additional information can be added once you have stated that it is a language. Have a look at French for instance: the "incoming relations" and the annotations provide a lot of information. At the moment of writing it is not yet clear that French is an official language of France, for instance.
With the expansion of existing classes and of the class attributes, information will become more available and integrated. The best bit is that as the OmegaWiki specific user interface can be translated in many languages, people will be challenged to ensure that the right terminology is used in translation.
With the new functionality OmegaWiki became much more wiki. It is there for everyone to see, and everyone is cordially invited to have a look, create a user, add some Babel templates and have a go at it.
Thanks,
Gerard
Monday, September 24, 2007
Self promotions on BBC-News
Recently there has been an increase in the number of video fragments. At the same time I find that I am less likely to watch them. The reason: every fragment is now preceded by a promotion, and it is annoying and distracting to the point where I do not bother any more.
Thanks,
GerardM
Sunday, September 16, 2007
Zuckertüte
The word Zuckertüte is a good example of a word where I do not expect translations in any other language as it is a real German tradition. This does not mean that there cannot be translations of the definition.
I hope that the grandchildren of this little lady have a great day at school tomorrow :)
Thanks,
GerardM
Friday, September 14, 2007
Shtooka
Now they have surpassed themselves. There is now the Shtooka explorer. It allows you to listen to the many, many pronunciations in several languages they have recorded. It is an absolutely gorgeous application that really shows off an already great project :)
Thanks,
Gerard
Thursday, September 13, 2007
Social networks
LinkedIn, Plaxo and ecademy are at a considerable disadvantage because in order to get the full functionality, you have to spend money. It then becomes relevant to understand the potential benefits for these networks. All three provide basic functionality that is for free and as there is no real overlap yet, I maintain a presence there.
Facebook is what I have been looking at lately. What it has right is that it tries to connect people, the groups they belong to, the causes they champion and the organisations they are associated with. When you combine it with the potential to program extra functionality for Facebook, you get a more compelling package than its competition.
It is however lacking in other ways. For one, its security is minimal. I do not mind telling the world that I am on Facebook, but I prefer that only friends and friends of friends can see who my friends are. I would happily leave Plaxo behind if Facebook had the same security for sharing personal and work information.
The one thing all the social networks have in common is that they are proprietary. To make their functionality useful to me, I have to trust them with my information. I do however not know if the implementation of their security can be trusted. Were they to use something like A-Select I would feel more comfortable because it would allow enough eyeballs to vouch for the authentication process. Even though it would be a great step forward, it would be better if all the software were Open / Free software. Authentication is one aspect; the authorisation within the application itself can still make my data insecure.
Who to trust, why to trust ...
Thanks,
GerardM
Tuesday, September 11, 2007
Hehe
Even though there are many references at the back of the article, it is marked as not citing references or sources. I do agree however that this article would benefit a lot from wikifying and the creation of supporting articles. It is a good example of the amount of work that needs doing to make the English Wikipedia relevant as a resource for Africa.
Thanks,
GerardM
Monday, September 10, 2007
Sassarese and Sardinian
The Italian government has officially recognised the Sardinian language or the "Limba Sarda Comune". This is in essence a constructed language as it tries to make one language out of the four "dialects". One of the effects has been that some people prevent others from writing in one of the four languages on the sc.wikipedia.
The language committee of the Wikimedia Foundation has a request to approve a new language; one of the Sardinian languages, Sassarese with ISO code sdc.
There are two problems to deal with:
- The "Limba Sarda Comune" is not recognised as a language
- The proponents of the "Limba Sarda Comune" reserve the sc.wikipedia for their language
Given that the language committee has as one of its rules that political arguments are not accepted, there are a few conclusions that we should make.
- Sassarese can have a conditional approval
- We urge the proponents of the Limba Sarda Comune to ask for the recognition of this newly constructed language from ISO.
Thanks,
GerardM
Saturday, September 08, 2007
Wikizine but more relevant WalterBE
People who know Walter know him as a soft-spoken can-do person. He has done much of the organisational work on the Dutch Wikipedia, organising elections, being involved in OTRS from the start. He is the press contact for Belgium ... As a steward he did much good, he is a member of the communication committee ...
Walter has indicated that he has grown away from the community and as such his motivation for Wikipedia and Wikimedia stuff has gone downhill. He does not feel that he can properly represent the WMF and the community, he is disappointed in the lack of cooperation around Wikizine ... What has kept him from stopping so far is his sense of responsibility to the Wikizine readers. Wikizine has been a labour of love for Walter; there has been little input from the community to inform Walter about the latest news, and nobody except for proofreaders has shared the burden of this well received periodical.
Well, I can only be sad that Walter finds his commitments a burden. I do hope that he will know and remember how much he is appreciated for the work that he does and has done. Really, to me Walter is one of the most important Wikimedians.
Thanks,
GerardM
Friday, September 07, 2007
My friend Bèrto 'd Sèra
Thanks,
GerardM
Kamusi, The Internet Living Swahili Dictionary has been taken offline.
It is sad that such a sterling effort is endangered by what seems to me a minor issue. It is sad because Kamusi's many users are now without their Swahili dictionary.
I contacted Martin Benjamin, the editor of Kamusi, and I learned that the World Language Documentation Centre is willing to help out with the hosting of Kamusi. Martin and the WLDC are looking for the best way forward; maybe another university can take over where Yale has dropped the ball, maybe Yale will reconsider ...
When I know how to get to the new Kamusi website, I will let you know.
Thanks,
GerardM
Monday, September 03, 2007
Cool application
Thanks,
GerardM
Saturday, September 01, 2007
A computer that works for Luna and Marco
Their mother (Luna and Marco are five years old) also has a computer. Her computer only works reliably in the hot Italian summer with an external fan pointing at the computer. The computer is raised a bit from the desk to improve ventilation even further.
In all the talk about computers something simple like the environment is hardly mentioned. PCs and laptops are meant to work in an office environment. In an office environment it is expected that the temperature is regulated.
The OLPC is made for kids and it will work in a hot environment. Luna and Marco would love to have one. It is sad that their mother has to use a computer that cannot take the heat, and a computer that is not as cool.
Thanks,
GerardM
Thursday, August 30, 2007
Babel
Today I came across a really nice website called VirtualSecrets. It provides you with a tool that allows you to translate English into Assyrian/Babylonian, Sumerian and Egyptian. It is a really nice tool and I get the idea that it should be good because museums use this tool as well.
Soldiers of many nations are currently in Mesopotamia, and they are bound to bring souvenirs home. Some of these will be original clay tablets. Many of these tablets will be useless without the context they came from. But given a machine translator, it would be possible to have a text translated from photos.
Photos of the clay tablets, kept in an archive, would make a digital collection. The texts can be translated and, with an ever increasing amount of material in such a repository, individual tablets that are currently out of context may fall into place after all.
Thanks,
GerardM
Wednesday, August 29, 2007
Music and languages, the sound of it
Not knowing the language, you listen to the sound. The language in the music that I heard from Berto, Piedmontese, is so different from Neapolitan. The Neapolitan sounds more like Spanish and Piedmontese is more like what I associate with Tirol. Knowing the geography and some history it makes some sense.
At this moment I really like I Musicatoria. Have a listen, I think you will like it :)
Thanks,
GerardM
Sunday, August 26, 2007
Language support in applications
One of the problems for people who want to use one of these "other" languages is that they cannot simply state that they are writing in their own language. The Unicode enabled applications should have a list of all the languages that are supported in Unicode and thereby provide the most basic level of support. This way people are enabled to enrich their document with the right metadata for their language.
Many people would expect that all the official languages of the world are supported in Unicode. Sadly, this is not the case. Brianna informed me that several official Indian languages are not fully supported in Unicode.
In order for applications to know what languages they can support on this most basic level, there is a need for a public database that keeps this information up to date. Yes, it would be great if it also includes a link to a font that is needed as well.
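To make the idea a bit more concrete, here is a minimal sketch in Python of what an application could do with such a database. Everything in it is an assumption for illustration: the record layout, the names `LanguageRecord`, `REGISTRY` and `can_tag_document`, and the entries themselves (the code `qaa` comes from the range ISO 639 reserves for local use and stands in for a language whose script is not yet in Unicode). A real database would of course be maintained elsewhere and fetched by the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageRecord:
    code: str                       # ISO 639 code of the language
    name: str                       # human readable name
    script: str                     # script the language is written in
    script_in_unicode: bool         # is that script encoded in Unicode?
    font_url: Optional[str] = None  # link to a usable font, if one is known


# Illustrative entries only; a real registry would be kept up to date
# in a public database, not hard-coded in the application.
REGISTRY = {
    record.code: record
    for record in [
        LanguageRecord("nld", "Dutch", "Latin", True),
        LanguageRecord("sdc", "Sassarese", "Latin", True),
        # Hypothetical entry: "qaa" is a local-use code, not a real language.
        LanguageRecord("qaa", "Example language", "Example script", False),
    ]
}


def can_tag_document(code: str) -> bool:
    """Return True when a document can be marked as written in this language.

    This is only the most basic level of support: the language is known
    and its script can be represented in Unicode.
    """
    record = REGISTRY.get(code)
    return record is not None and record.script_in_unicode
```

With a lookup like this, an application could offer every language for which `can_tag_document` is true in its "language of this document" list, long before any localisation exists.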
Thanks,
GerardM
Tuesday, August 21, 2007
French sign language in Togo
Given the information on Togo, it is the only sign language known. There are no other sign languages that have been recognised and, if this is the case, French Sign Language is certainly as good as any other. However, I cannot help but wonder if there are no other sign languages. It seems odd to me that deaf people do not have their own sign language.
When there is a native sign language, it may be that because of the French Sign Language being taught at school this language is marginalised. When however the local language is strong, it may be that this French language is isolated ...
Really, languages are a really interesting subject. :)
Thanks,
GerardM
Sunday, August 19, 2007
Inspiration for working on content in OmegaWiki
At Wikimania I showed that we provide real-time semantic support for Wikipedia. The information that it uses is content that exists in all the databases of OmegaWiki. This means that when you look in one of the articles in this dump of the English language Wikipedia, you will find many concepts that do not yet exist. When you add these concepts, the Semantic Support will only become better once we have the process of updating the new terminology implemented.
Another way it is stimulating is that the existing concepts exist mostly in the UMLS database. In order to start providing Semantic Support for other languages, we need translations. This is done by creating a DefinedMeaning in the Community Database and linking it to the UMLS database.
It turns out that we already have a number of languages that are really doing well considering that we have mapped almost 4% of the records of the Community Database.
Thanks,
GerardM
Wednesday, August 15, 2007
micropledge.com
Erik proposed and probably pledged some money for a project called "RSS extension for namespaces with smart quality filtering". Twenty dollars have been pledged (US / Canadian / Taiwan it does not say..) and two people bothered to comment on the project .. they like it :) . I however would not pledge money to it as it is completely unclear how much money is needed for this job.
Micropledge is a young project and it deserves a chance. So I asked the CTO of Open Progress to come up with a budget that would allow an expensive developer to do the job. Some money would also be set aside for the necessary overhead. The point is not that Open Progress wants to do this job, the point is that there will be at least one Micropledge project that has a target amount associated with it.
In many projects like Rentacoder, you can post a project and developers can bid for the project. I sincerely hope that a good developer will want to do the job and will do it for less than the amount we will post. Bidding for projects is however not something that Micropledge caters for at this moment in time.
When in the end it turns out that enough money has been pledged, Open Progress will do the job. For us it is an experiment as well. Will posting a realistic amount of money get more pledges, will we in the end be asked to do the job ??
Thanks,
GerardM
Saturday, August 11, 2007
Court Rules: Novell owns the UNIX and UnixWare copyrights! Novell has right to waive!
Thanks,
GerardM
Friday, August 03, 2007
Equipment for an Internet traveller on the cheap
Your PC does not pick up a network, so you get out your Meraki router and with the extra long distance antenna you have you find a Fon network. An Ethernet cable is welcome; it allows you to reconfigure the Meraki if need be. When you are in luck, the hotel provides you with WIFI, but with your own router you can connect to your preconfigured network, connect a bit more securely to your own network and move from there.
Before you leave the hotel, you check again the position of the hotel on your GPS system, you take your directions and move to your destination. With some luck you have the GPS location of your destination and off you go. The good news of the GPS is that even when you cannot read the street signs, you now have a tool that tells you if you are getting closer to or further away from your destination; it beats getting a cab to get to your destination.
Typically you do not bring a T-shirt to sleep in, when you are really lucky you have a new clean one for every night of your stay.. T-shirt, proof of: “been there, done that”.
Thanks,
GerardM
Wednesday, July 25, 2007
WiktionaryDev
Superb is the possibility to indicate that two labels indicate content in the same language, this brings the content of Greek and Greek (modern) for instance under the same heading. Having indexes for each language is really powerful; it allows people that are interested to work on one specific language.
The WiktionaryDev functionality builds upon the standards that were adopted in the en.Wiktionary. It is absolutely fabulous that the hard work of standardising Wiktionary results in all this new functionality.. :)
Thanks,
GerardM
Tuesday, July 24, 2007
Licenses are often not even a nice pain
You would create a spell checker under the GPL, but it is incompatible with the GFDL, the CC-by-sa ... You would create a machine translation engine under the GPL ...
The best reason for selecting a Free or Open license for software is because you want to ensure that the freedoms will remain available to the people who receive the programs downstream. In essence it is a defensive measure using copyright law.
Facts are different from programs. You cannot copyright facts. You can copyright collections of facts. Large collections of facts are available under a Free / Open license and these are incompatible with other Free / Open licenses. This means that you either take these collections and just use them, or you get into long discussions about licenses. Either option has its issues.
When you get into discussions about licenses, you have to indicate that the license does not liberate the facts for use in another setting. People get really grumpy, even upset, when they are told that their favourite Open or Free license is the issue.
The worst thing happens when you are asked to cooperate on a project. A project that has obvious merits. You are asked to help out because you know the subject matter. You are asked to help and comply with their license. You are asked to collaborate for free but you are not permitted to use your own work. When you then tell these people that you do not want to collaborate because their Free / Open license is an issue, you first get stunned silence and disbelief and then you get the same old religious arguments why their license is best. To me licenses are only great if you believe in the copyright system. I believe the copyright system is evil.
In OmegaWiki, we make our community data available under a combined license, CC-by and GFDL. In this way we reach out to both the Free and the Open communities. The people that use our data downstream can pick either license or, when their license is more restrictive, they can even relicense our data. Our data will remain Free / Open. People can come back to us and improve and append our data and from OmegaWiki it will be available to the people who use the data downstream.
We are happy to cooperate with anyone. We are happy to collaborate on any database of facts but we have to insist that we work in our community environment. Facts need to be liberated and be available to all. This notion I learned from the people I met in the Open Access world.
Thanks,
GerardM
Monday, July 23, 2007
Happy news
Congratulations to all the people who are involved and make it happen :)
Thanks,
GerardM
Sunday, July 22, 2007
Harry Potter and the Deathly Hallows
Thanks,
GerardM
Friday, July 20, 2007
Ishi could have spoken Coos
So let us consider this situation. Ishi spoke Coos, his language is now extinct and he used the technology of the day and left a recording of his message. We do not speak the language any more and we have problems with the technology less than a hundred years later. This wax cylinder is owned by the Phoebe Hearst Museum of Anthropology and we might be lucky; there could be a complete set of annotations including a translation of Ishi's words. When we are lucky, we can listen to the Ishi recording 96 years later and have some understanding through the possible annotation; a window is opened on our past.
If Ishi had spoken his message in English, we would consider it easier for us to understand. The message would still be from a different culture; it would still require the same annotations for us to understand it properly. Now Bill, who was also living in Oregon, had an equally profound message. Suppose Ishi and Bill knew each other; Ishi's background would be Coos, and Bill's background would be Welsh. Our ability to understand their true message would depend on our understanding of that time. Without sufficient understanding of the culture, the profound message of either Bill or Ishi will not reach us. It would be an artefact of a museum, an artefact to be studied.
For Bill and Ishi it might have been of great significance that their profound message was recorded. For Ishi it would be natural to communicate in Coos; it was his language and it is not unlikely that it was only recorded and annotated because it was Coos. Bill's message was not recorded; his English and his message were not considered of similar significance.
Much of what is said and done, is done only for the present moment. My message is written in English because it is the best way for me to convey my message. Many of the messages of Sabine are in Neapolitan when she reaches out to that particular audience. My message is written on Blogger; I do not spend much thought considering its format. If at all, it may be saved for posterity thanks to the effort of the Internet Archive. When people cannot read or understand it in 100 years' time, I do not really mind as they are not my intended audience.
Much of what we do on our Wikis is for our current audience. Our content is transient, its shelf life is limited. We aim to bring information to our public and we do this now. We provide Free content and when a wealth of content is available in a format like Flash, we should imho provide it because we aim to provide the best possible service now. With the continued development of Gnash, I feel reasonably safe that a future generation will still be able to experience some of what our day and age is about. The stuff that I really enjoyed.. well that is another story.. some people try to preserve it..
Thanks,
GerardM
Thursday, July 19, 2007
Ishi meets IRENE
IRENE is a method that instead of a needle uses a camera to find all the little grooves in the track. With this information it is able to emulate what a needle would find and play the music. There are many priceless recordings and it will be great when they are digitised. This will ensure that they are less likely to get lost.
Obviously, given the age of this material, it is all public domain.
Thanks,
GerardM
Wednesday, July 18, 2007
Today's word of the day: hypoxia
The sad thing about hypoxia is that it is preventable. Hypoxia was a phenomenon that happened in the Wadden Sea as a result of the pollution that came in from the Rhine. As a result of cleaning up this river, the hypoxic areas or dead zones have diminished, and the areas where seagrass is growing are on the rise again. The same could happen with the Mississippi and the Gulf of Mexico. The main thing required is the prevention of nitrate and phosphate getting into the waterways. It is well known how this can be done; it just takes the political will to make this happen.
My interest in this subject can be understood from the fact that I wrote most of the articles about the freshwater fish of the Benelux on the Dutch Wikipedia.
Thanks,
GerardM
Sunday, July 15, 2007
Copyfraud
Organisations that deal in copyfraud are legally fraudsters. While reading this paper, one question that comes to mind is: how can industries that themselves fail to respect the law on such a massive scale expect their customers to respect the law?
The paper describes the many ways in which copyfraud prevents people from using material that is in the public domain. There have been many threads on the WMF mailing lists about this subject and it is quite clear that our projects would benefit enormously from a strengthened public domain.
This paper does address the issue of how the public domain can be strengthened; it mentions, among other things, that courts have denied copyright enforcement to those with dirty hands because of the assertion of copyright on public domain material.
Industry protects its copyright through organisations that represent them. With an industry massively breaking the law by claiming copyright where it is not theirs to claim, the moral footing of their representatives is undermined. There are rulings in which the court denied copyright enforcement to copyright owners with unclean hands. Many industries have engaged in the implementation of digital rights management. These implementations do not take into consideration the fact that copyrights expire. Consequently I would argue that these implementations are broken by design and that they are not a legally correct implementation of copyright restrictions. I think it could also be argued that, given the massive copyfraud perpetrated by the industry, copyright enforcement by organisations representing the whole of an industry should not be allowed because of the fraudulent behaviour of these industries.
Thanks,
GerardM
Saturday, July 14, 2007
Deletions of pictures
I have had two instances now of people insisting on deleting this logo because it is not Free. It will probably be deleted from the Dutch Wikipedia where the information about the permission is documented and it will after this be deleted from the English Wikipedia because the reference for the permission will no longer be readable.
There are several issues that I want to raise. Commons is, as far as I am concerned, not as good as it could be because it has no way to deal with the restricted use of logos of organisations that allow their logo to be used within the limitations they have to insist on because it is part of a trademark.
Permissions given on one project are often referred to on another. This is not considered when the licenses of pictures are evaluated. This is understandable because there is so much. Much of the older material does not have all the templates and doodahs because these did not exist at the time. The consequence is that much is lost because of this insistence on compliance with later policies.
For the photos I made, I do not care too much if they are kept on projects or not. Everybody can make a photo of a building, an animal ... The problem is with material that is of benefit to our projects that we do not consider because it requires licenses that are by necessity restrictive. Restrictive in a way that even the organisation that puts the restriction on cannot help.
One additional benefit for accepting a license for logos and stuff would be that as a result our own logos would no longer have a "status aparte" on Commons.
Thanks,
GerardM
Friday, July 13, 2007
Proposal: localisation sure but enable languages first !!
At this stage, the support of a language is very much an all or nothing affair. There is a localisation or there is nothing. This is not how it needs to be. When a language is known to exist, the lowest level of support for that language is the acknowledgement that this language exists. This is currently not done, and I think it is a missed opportunity.
The first thing to consider is what languages and linguistic entities exist and how you support them. This is a surprisingly complex question. Languages are recognised in the ISO 639 standard. There are several versions of the standard and not all languages have a script that is supported in Unicode. Even when a script is supported in Unicode, it does not mean that an associated font is available for a language. The consequence of these two points is that a subset is needed on a computer. On the other hand, the currently recognised versions of ISO 639 do not recognise orthographies, dialects or other entities that make a difference to how documents are to be supported.
This is not an issue that the organisations that develop and localise software want to tackle. For them this is a distraction. Deciding what linguistic entities can be supported is something that is best addressed by one organisation that exists to deal with issues like these. The World Language Documentation Centre (WLDC) is that organisation. Through its association with Geolang and because of its board of experts in many of the relevant fields, it is already in a prime position to do the research that goes into the development of ISO 639-6.
With the WLDC and Geolang able to provide researched and verified information about linguistic entities that can be safely supported, it is then up to the applications to at least acknowledge the existence and allow a user to create content in that language. As more information becomes available, spell checkers can be added specific to that linguistic entity. In this way slowly but surely the functionality grows without the need to first localise the application.
In a way this is a solution for a "chicken and egg" problem. This problem is solved when you think of it in an evolutionary way. First there was the egg, the support of the language, and then the chicken evolved, the localisation of the application.
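The evolutionary steps described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `Support` and `support_level` and the three sets passed in are hypothetical, and the list of known linguistic entities is assumed to come from an organisation like the WLDC.

```python
from enum import IntEnum


class Support(IntEnum):
    """Levels of language support, from nothing to a full localisation."""
    NONE = 0           # the application does not know the language exists
    ACKNOWLEDGED = 1   # the user can state that content is in this language
    SPELL_CHECKED = 2  # a spell checker for the language is available
    LOCALISED = 3      # the application itself has been localised


def support_level(code, known, spell_checkers, localisations):
    """Return the highest level of support available for a language code.

    `known` is the set of verified language codes; `spell_checkers` and
    `localisations` are the sets of codes for which those resources exist.
    """
    if code not in known:
        return Support.NONE
    if code in localisations:
        return Support.LOCALISED
    if code in spell_checkers:
        return Support.SPELL_CHECKED
    return Support.ACKNOWLEDGED
```

The point of the ordering is that a language does not have to wait for the last step to get the first one: as soon as a code is in `known`, the application can at least acknowledge it.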
Thanks,
GerardM
Thursday, July 12, 2007
Paypal
Credit card payments cost money, paypal payments cost money. I would expect that credit card payments are more expensive and it seems customary that you pay your paypal bill with a credit card as well. I have the option to pay a paypal bill directly with my checking account. Intuitively this seems to be the option that is less expensive. So I have opted for this.
The question I am left with is how Paypal compares with transferring money using traditional banking methods. As Paypal has existed for some time now, I am sure it will have had an effect on the banks. Paypal benefits from ubiquitous Internet and highly automated procedures; these benefits are available to banks as well...
One of these days I will know the answer to these questions.
Thanks,
GerardM
PS Yes, I am going to Taiwan :)
Wednesday, July 11, 2007
Finally, Ariana may stay
When Ariana got her nursing diploma she was not allowed to work; she was a refugee. After many years she finally got her permit to stay in the Netherlands. Many of the people that know Ariana have been appalled by the, in our eyes unjust, policies of the Dutch government. My mother was not the only one willing to provide her with a sanctuary and sabotage these loathed policies that deal with refugees.
I am happy that Ariana can stay and now has more of a future. It is a shame that she has had to suffer all these years of uncertainty and doubt that have added to an already troubled past.
Thanks,
GerardM
Sunday, July 08, 2007
RTFM, read the fine manual
There is a mailing list and last week there were two things that really got my attention. The first one said that not all the symbols that make up the total character set are to be used for one language. There are for instance characters specific to the Italian and the Ethiopian sign language. This led to the observation that there is a need to identify the characters that are used for a particular language. This is similar to for instance the Latin script, where only so many characters are used for a language.
In the same way I appreciate how this latest amazing story unfolds: there is a lot of documentation on how to use the SignWriting characters... People do not really read it. This is of course to be expected but it brings these wonderful moments where people find that there is more to it. That, like for any other language, the basic tenets have to be really understood. That you can go find a character, a movement, and it will be readable but it might prove not to be the best one. What makes it so nice to read is the wonder and delight these people express that such great documentation exists.
I am impressed with SignWriting and I believe that it would be good if serious funding found its way towards the further development of SignWriting. To sum up some of the reasons why: kids who learn to write their first language first have an easier time learning the dominant written language that surrounds them, and it gives the deaf people of the whole world the opportunity to express themselves in their own language. It allows for a better preservation of so many cultures that do not have anything but video.
Thanks,
GerardM
Saturday, July 07, 2007
Another BBC journalist comes home
I am moved. I know several Iranians; they are wonderful people. I have been to concerts of Persian music; it is enchanting. I am saddened because with this continued breakdown of the free exchange of ideas and the deterioration of the freedom of the press, this wonderful country, these wonderful people may become painted even more as an enemy of our culture.
Frances writes: "The Islamic system of government has deliberately erased much of what was Persian culture and it is only by looking hard that you can catch glimpses of the past." It is self-righteous politicians that try to make the world in their own image. It is self-righteous politicians that do not allow themselves and their own actions to be judged in the same way as they judge others.
Thanks,
GerardM