I thought I’d start my blogging year by sharing just three things that I found deeply interesting last year, and that resonate with me still, several months later. This is the first of three posts (the other two will be posted before the end of January*).
The first was a presentation given by Asanka Wasala, a PhD student and localization researcher, at the Multilingual Web Workshop in Limerick in September. Entitled ‘A Micro Crowdsourcing Architecture to Localize Web Content for Less-Resourced Languages’, the presentation grabbed my attention with the first two slides.
Take a look at them. The first is a tag cloud of the most common languages you’ll find on the web. Note the extraordinary place taken up by English. Now look at the second slide: that’s a tag cloud of the smaller languages on the web. Notice how many of them there are. In fact, many of those languages are not only under-represented on the web, they’re barely there at all.
In his presentation, Wasala focuses on Sinhala, a language spoken by more than 15 million people in Sri Lanka. But the Sinhalese are not only hard-pressed to find content on the web in their native language, they’re also unable to use tools like Google Translate, which does not translate Sinhala (at least, not yet, though making content accessible is clearly Google’s mission). He goes on to talk about a crowd-sourced localization initiative that would allow Sinhalese and other users to translate sites on the fly, and have those translations sit in a translation memory for others to benefit from. (I encourage you to watch his 13-minute video if you want to know more.)
Several things struck me about his presentation. The first has nothing to do with language and everything to do with data visualization. The overwhelming dominance of English on the web was not new to me – I even referred to it in my own presentation on localization issues in content strategy. But somehow, looking at those same numbers in a table, and then viewing that same information as a tag cloud – well, the impact is just so much greater. These slides – a very simple application of data visualization on a familiar subject – brought that home for me.
More fundamentally, though, this presentation drove home another point: content is indeed the ‘last mile’ of the digital divide. We may be catching up on connectivity (indeed, IBM claims it’ll be resolved in five years), but if substantial groups of people cannot access information because they cannot understand it, can we truly say we’ve closed the gap?
So as we spend 2012 and beyond thinking of how to make content engaging, sticky, authoritative, killer, relevant, useful, compelling and what have you, let’s remember that for many people in the world, content simply needs to be one thing: accessible. Hats off to the linguists, computer scientists, researchers, and organizations working to make this happen.
*I just posted Highlight N° 2, and it’s February 6th. Sigh…sorry folks, but client work comes first, and client work is sometimes unpredictable.