Falsehoods programmers believe about languages
-
I can’t believe nobody has done this list yet. I mean, there is one about names, one about time and many others on other topics, but not one about languages yet (except one honorable mention that comes close). So, here’s my attempt to list all the misconceptions and prejudices I’ve come across in the course of my long and illustrious career in software localisation and language technology. Enjoy – and send me your own ones!
Icons that are based on English puns and wordplay are easily understood by speakers of other languages.
This reminded me of one of those Top Gear "drive across a foreign country in weird vehicles" specials where Jeremy Clarkson needed to borrow a cable to jump-start his car, and laboriously mimed out jumping for "jump", and walking a dog for "lead", to a perplexed local. Richard Hammond was cracking up but finally managed to point out what a fool Clarkson was being.
Geolocation is an accurate way to predict the user’s language.
And as an addendum to this, in 2025 nobody should be using Windows' "Non-latin/-unicode character set" setting to guess the user's preferred language. That's a pre-WinXP kludge. I'm specifically looking at you, Intel integrated graphics software writers, but you have plenty of company, don't worry.
-
@TrickDacy @rimu but use them much more restrictively. As an example in Thai, "yes" is "chai", but it is used only in a few situations, like when a question ends with "chai mai" (yes followed by a word forming a polar question).
In interfaces you usually can't put this on yes/no buttons. Instead, one button is usually a verb like "khao" ("come/go in") and the other is the same word prefixed with mai ("not", different tone from the other "mai" I mentioned).
Chinese is similar but I don't know it as well.
@TrickDacy @rimu another example is Irish, which I've heard cited as the reason Irish English also leans on constructs like "it is/isn't" in place of yes/no.
Another European example is Finnish which has yes but not no. You want me to go on?
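To make the button point above concrete, here is a minimal sketch in Python. It assumes a UI where every confirmation dialog carries its own message keys for the two buttons instead of sharing a global "yes"/"no" pair. The `Confirmation` class, the message keys and the English catalog are all made up for illustration, not taken from any real toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confirmation:
    """A dialog that names its own button labels (hypothetical structure)."""
    question_id: str  # message key for the question text
    accept_id: str    # message key for the affirmative verb
    decline_id: str   # message key for the negated verb


# Hypothetical dialog: each key gets translated on its own, so a language
# that answers by repeating the verb is not forced into generic yes/no.
ENTER_ROOM = Confirmation(
    question_id="room.enter.question",
    accept_id="room.enter.accept",
    decline_id="room.enter.decline",
)

# Illustrative English catalog only; other languages would ship their own.
CATALOG_EN = {
    "room.enter.question": "Enter the room?",
    "room.enter.accept": "Enter",
    "room.enter.decline": "Don't enter",
}


def render(dialog: Confirmation, catalog: dict[str, str]) -> tuple[str, str, str]:
    """Look up the question and both button labels in one catalog."""
    return (
        catalog[dialog.question_id],
        catalog[dialog.accept_id],
        catalog[dialog.decline_id],
    )


print(render(ENTER_ROOM, CATALOG_EN))
```

The trade-off is more strings to translate, but translators never have to find a word their language may not have.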
-
It's infuriating when an app or a site throws a shitty translation of itself in my native language at me. Most of the time they must not even check its quality, it's notorious. Just let me use English by default.
Now there you have one more pet peeve of mine, language preference settings that pretend that all language versions are equal, that shitty translations aren't abundant. If the original text is in a language that I can read then I want the original, not some shitty translation. It shouldn't be that hard!
-
Segmenting a text into sentences is as easy as splitting on end-of-sentence punctuation.
Is there a language this actually isn't true for? It seems oddly specific like a lot of the others and I don't think I know of one that does this. Except maybe some wack ass conlangs of course.
Even in English this isn't true. For example, dots can appear inside a sentence for multiple reasons (a decimal number, an abbreviation, a quotation, an ellipsis, etc.), and splitting on them would cut one sentence into several pieces.
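A quick Python sketch of that failure, assuming the most naive splitter imaginable (cut on any run of `.`, `!` or `?`). The sample text and the `naive_split` helper are made up for illustration.

```python
import re


def naive_split(text: str) -> list[str]:
    """Split on every run of '.', '!' or '?' and drop empty pieces."""
    pieces = re.split(r"[.!?]+", text)
    return [p.strip() for p in pieces if p.strip()]


# Two real sentences, containing an abbreviation, a decimal and an ellipsis.
sample = "Dr. Smith paid 3.50 dollars. He left... and never came back."

for piece in naive_split(sample):
    print(repr(piece))
# 'Dr'
# 'Smith paid 3'
# '50 dollars'
# 'He left'
# 'and never came back'
```

Two sentences come out as five fragments, and this is English, which is supposed to be the easy case.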
-
Geolocation is an accurate way to predict the user’s language.
Now that's a pet peeve of mine, a bizarre belief surprisingly often held by people, who must be oblivious to the existence of tourism.
It would be a useful way to predict it possibly, but presumably the author meant if you have support for localization, you also provide an obvious and easy means of changing the language.
More importantly, you should be using the language an existing user has already used in the past.
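Roughly what that priority order could look like in Python. This is only a sketch: `choose_language` and its parameters are hypothetical names, and it assumes the browser's preferences were already parsed into a list of language tags.

```python
from typing import Optional, Sequence


def choose_language(
    saved: Optional[str],
    browser_langs: Sequence[str],
    supported: Sequence[str],
    default: str = "en",
) -> str:
    """Pick a UI language without ever consulting geolocation."""
    # 1. A language the user explicitly chose before always wins.
    if saved in supported:
        return saved
    # 2. Otherwise take the first browser-preferred language we support.
    for lang in browser_langs:
        if lang in supported:
            return lang
    # 3. Only then fall back to the default.
    return default


print(choose_language("en", ["de"], ["en", "de", "fr"]))        # en
print(choose_language(None, ["de", "en"], ["en", "de", "fr"]))  # de
print(choose_language(None, ["hu"], ["en", "de"]))              # en
```

Whatever comes out should still be easy to override from a visible language switcher.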
-
This. When I was in Mexico on my honeymoon, Google kept redirecting me to their .mx version of Google, despite my inability to read Spanish.
And I always want the English version instead of the German version, despite me being German. Literally only Google fucks that up. Every other site, even the small local German uni website or the canteen's meal site, respects my browser's setting. Google does not, and serves me German.
-
Geolocation is an accurate way to predict the user’s language.
This makes me so angry. It really really really really really does.
Despite setting everything to English I still get my receipts in French. And all because my IP is behind CG-NAT and maps to the capital, which is marked as French-speaking.
What is so hard about letting me decide? The absolute fucking arrogance of thinking you as a company know better than me in which language I would like to be served.
Eat a dick Microsoft.
-
Most of these just seem like basic educational issues except this one imo:
Every language has words for yes and no.
I want to see more than like 1 or 2 counterexamples. I'm pretty interested in linguistics on an amateur level. Don't believe I've heard of that one before now.
My wife is Vietnamese, so I have a basic grasp of it, but they don't really have a word for yes.
The verb itself is used to answer the question.
Want something to drink? Drink.
Want to go to the park? Go.
They have a word for no, but as you can probably ascertain, it's only for the negative.
-
Arabic doesn't have a word for "yes". I don't think most Semitic languages do either [Classical Hebrew does not, but Modern Hebrew does; however, the word used in Modern Hebrew is the word for "thusly", which is now a particle]. In fact you can see that Proto-Indo-European didn't have a word for yes: Greek is ναι, but the Romance languages are si (I am pretty sure French oui is actually derived from the same root as Spanish and Italian. Could be wrong) and if my memory is correct (and it may not be) Classical Latin didn't have a word for yes. And the Germanic words yes/ja have a similar origin. I can't speak to the other IE languages unfortunately.
I know there are also language families that don't have a single word for no, but use a negation mood on the verb. I unfortunately can't give you an example of this. But it should be fun to look up!
Arabic doesn't have a word for "yes".
(Yes) in Arabic is نعم, pronounced (Na'am) or (Naعm), and this is the word most people use in all Arab countries. The Arabic dialect word for (Yes) is ايوة or ايوا, pronounced (Aywa), and it is also used by all Arabs.
-
Most of these just seem like basic educational issues except this one imo:
Every language has words for yes and no.
I want to see more than like 1 or 2 counterexamples. I'm pretty interested in linguistics on an amateur level. Don't believe I've heard of that one before now.
Scottish Gaelic doesn't have 'yes' or 'no' - you answer with the positive or negative form of the verb used in the question.
http://www.gaidhliggachlatha.com/blog-mios-na-gaidhlig/how-to-say-yes-and-no-in-scottish-gaelic
-
Oh of course, I didn't think about punctuation occurring in the middle of a sentence. Duh, thanks.
@2xsaiko @TehPers there are other examples too. E.g. Thai has no spaces between words, only between phrases/sentences. However, the spaces between phrases involve style choices, similar to the comma in English and many other Latin-script writing systems. Also, Thai may have spaces around abbreviations and special characters.
I'm quite familiar with Thai so that's close at hand but I guess it's the same in a lot of other writing systems based on Brahmic scripts.
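A tiny Python illustration of what that means for code that tokenizes on whitespace. The Thai greeting is two words (the greeting plus a politeness particle) written without a space. The English string is only a rough stand-in for comparison.

```python
# "sawatdi" + "khrap": two Thai words, written with no space between them.
thai = "สวัสดีครับ"
english = "hello sir"

print(len(thai.split()))     # 1
print(len(english.split()))  # 2
```

Getting real word boundaries out of text like this needs dictionary-based or statistical segmentation, not a smarter whitespace regex.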
-
Confidently incorrect.
Jesus. I mistyped that horrendously to make a point about !0
I read my comment just now waking up and went wtf
-
There are perfect guidelines on preparing translatable strings in the GNU gettext documentation.
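For a taste of what such guidelines push you towards (whole sentences in one string, named placeholders, plural forms chosen by the catalog rather than by an `if`), here is a small sketch using Python's standard `gettext` module. `NullTranslations` is used so it runs without any catalog installed, and the `files_deleted` helper and its messages are made up for illustration.

```python
import gettext

# NullTranslations hands back the source strings unchanged. A real catalog
# loaded via gettext.translation() would return translated forms instead.
t = gettext.NullTranslations()


def files_deleted(count: int) -> str:
    """Build the message from one whole sentence per plural form."""
    template = t.ngettext(
        "%(count)d file was deleted.",
        "%(count)d files were deleted.",
        count,
    )
    # The named placeholder lets translators move the number around freely.
    return template % {"count": count}


print(files_deleted(1))  # 1 file was deleted.
print(files_deleted(3))  # 3 files were deleted.
```

With a real catalog the number of plural forms comes from the target language, so languages with more than two forms work without touching this code.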
-
Confidently incorrect.
Until you use exit codes, which flips the logic.
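For anyone who hasn't run into this: a process exit status of 0 means success, while 0 is falsy in most languages. A small Python sketch, assuming child Python processes that simply exit with a chosen status (the `succeeded` helper is a made-up name):

```python
import subprocess
import sys

# A child process that exits with status 0 (success).
ok = subprocess.run([sys.executable, "-c", "raise SystemExit(0)"])
# A child process that exits with status 3 (failure).
bad = subprocess.run([sys.executable, "-c", "raise SystemExit(3)"])

print(ok.returncode, bool(ok.returncode))    # 0 False
print(bad.returncode, bool(bad.returncode))  # 3 True


def succeeded(result: subprocess.CompletedProcess) -> bool:
    """Compare against 0 explicitly instead of relying on truthiness."""
    return result.returncode == 0


print(succeeded(ok), succeeded(bad))  # True False
```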
-
Another couple missing:
- every language uses gendered nouns/verbs/adjectives/pronouns/etc
- no language uses gendered nouns/verbs/adjectives/pronouns/etc
- pronouns referring to people are always gendered
- pronouns are always singular (1) or plural (2+)
A fun language to learn regarding these is Hawaiian, where the language uses a-class and o-class rather than masculine and feminine, and which you use is largely based on how much control you have over it.
Persian is a perfectly gender-neutral language. Some westerners apparently struggle with understanding how we speak our language. We also don't have an equivalent to "the".
-
It would be a useful way to predict it possibly, but presumably the author meant if you have support for localization, you also provide an obvious and easy means of changing the language.
More importantly, you should be using the language an existing user has already used in the past.
It's not even that: there are multiple languages spoken in the same region. Webpages should just use the language the browser tells them to use.
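What the browser "tells" a site is the `Accept-Language` request header, a comma-separated list of language tags with optional `q` weights. Below is a minimal Python sketch of reading it. `parse_accept_language` is a hypothetical helper, and the sketch skips details a production parser would handle, such as the `*` wildcard and matching `en-GB` against a site that only offers `en`.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("en-GB,en;q=0.9,de;q=0.8"))
# ['en-GB', 'en', 'de']
```

The first tag in the result that the site actually supports is the one to serve.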
-
My wife is Vietnamese, so I have a basic grasp of it, but they don't really have a word for yes.
The verb itself is used to answer the question.
Want something to drink? Drink.
Want to go to the park? Go.
They have a word for no, but as you can probably ascertain, it's only for the negative.
Can they answer “not no”?
-
Most of these just seem like basic educational issues except this one imo:
Every language has words for yes and no.
I want to see more than like 1 or 2 counterexamples. I'm pretty interested in linguistics on an amateur level. Don't believe I've heard of that one before now.
I think Latin doesn't really have words for yes and no.
-
Geolocation is an accurate way to predict the user’s language.
Now that's a pet peeve of mine, a bizarre belief surprisingly often held by people, who must be oblivious to the existence of tourism.
yup I too remember getting YouTube ads in Hungarian when I was there as a tourist - despite not understanding Hungarian at all and watching videos only in other languages, they really ought to know that
-
Segmenting a text into sentences is as easy as splitting on end-of-sentence punctuation.
Is there a language this actually isn't true for? It seems oddly specific like a lot of the others and I don't think I know of one that does this. Except maybe some wack ass conlangs of course.
There are languages that don't have the concept of "punctuation" at all.