
The Country Similarity Index attempts to quantify how similar countries are to each other relative to other countries. The index is a statistically based way to measure this. 20% of the index is based on culture. 25% of a country’s demographic score (5% of the overall Country Similarity Index score) is allocated for languages commonly spoken in the country. The following is an explanation of how the language scores were calculated:
Native Language
Language Family:
A language family is a group of languages that descended from a common ancestral language. By far, the most widely spoken language family is Indo-European. It includes most European languages, as well as Persian, Hindustani, and Bengali. The Niger-Congo language family is dominant in Sub-Saharan Africa. Another major language family is Afroasiatic. It includes Arabic, Somali, and Hebrew. While it is disputed whether Mongolian and Turkic languages actually came from a common ancestral language, they are considered part of the same Altaic language family for the purposes of this study, since they share many similarities.
Language Branch:
Even though languages may have a common ancestral language, they can still be extremely different. For instance, German and Bengali are both Indo-European languages, but have very little else in common. German is much more closely related to Dutch, English, and the Scandinavian languages. Together these form the Germanic language branch. For this study, the branches were determined by elinguistics.net. Languages on the same branch shared a common ancestor no more than 4,000 years ago.
Language
Although the most spoken native language is Mandarin Chinese, it is not widely used outside of China and Taiwan. Singapore and Brunei are the only other countries where more than 10% of the population speaks it natively. The Spanish language is more widespread. A majority of the people in at least 20 countries speak Spanish natively. The only other languages that reach a majority in at least 10 countries are English and Arabic. French is the official language of many countries, but the people in a majority of those countries do not speak it natively. Half credit was given to languages that are extremely similar and only diverged in the last few centuries, even though they are not considered to be the same language (as determined by elinguistics.net). Some examples: Danish and Norwegian, Afrikaans and Dutch, Kazakh and Kyrgyz. Other languages, like Romanian and Moldovan, have two different names but are essentially the same language and are mutually intelligible.
The CIA World Factbook and elinguistics were the sources of the information:
https://www.indexmundi.com/factbook/fields/languages
http://www.elinguistics.net/Compare_Languages.aspx
Common Language
In many countries, the language people speak natively is the same as the official language. However, there are also many countries where the language that people speak at home is different from the official language of the country. This category focuses on lingua francas in countries where people generally do not speak the official language at home. To qualify as a country’s common language, a language needed to satisfy at least two of the following five criteria:
Official language
Official status is a strong indicator that a language may be a lingua franca of a country, but there are many exceptions. For instance, although Romansh is one of the official languages of Switzerland, it is rarely used in government or education. Furthermore, less than 1% of the population speaks it.
Working language of the government
The language that the government uses to conduct business is always one of the lingua francas of a country. For example, although few people in Sudan can speak English, it is one of the government’s working languages, along with Arabic. Not all official languages are the working languages of governments.
Language of instruction in public education
Most of the time, the language of instruction in schools is one of the official languages. There are some exceptions, however. In Ethiopia, the medium of instruction is English for high school and college, even though Amharic is the official language and very few people can speak English. Using a language as the medium of instruction is not the same as requiring students to study it as a foreign language.
Lingua Franca commonly used between different ethnicities
In Namibia, Afrikaans is neither the official language nor taught in public schools. When Namibia became independent from South Africa, the government deliberately chose to become officially monolingual in English, but Afrikaans is still widely used between people of different ethnicities in the country.
Over 50% of the people in the country can speak the language
In Djibouti, Somali is neither an official language nor taught in schools; however, about 60% of the people there speak it.
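The two-of-five rule described above can be sketched in a few lines of Python. This is only an illustration: the class and field names are invented here, the Romansh values restate what the text says about Switzerland (with the inter-ethnic flag assumed to be false and the speaker share a placeholder for "less than 1%"), and the second profile is entirely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LanguageProfile:
    """One language in one country (all field names are illustrative)."""
    official: bool                   # official language of the country
    government_working: bool         # working language of the government
    school_instruction: bool         # language of instruction in public education
    interethnic_lingua_franca: bool  # commonly used between different ethnicities
    share_able_to_speak: float       # fraction of the population, 0.0 to 1.0


def criteria_met(p: LanguageProfile) -> int:
    """Count how many of the five criteria the language satisfies."""
    return sum([
        p.official,
        p.government_working,
        p.school_instruction,
        p.interethnic_lingua_franca,
        p.share_able_to_speak > 0.5,  # "over 50% of the people"
    ])


def is_common_language(p: LanguageProfile, minimum: int = 2) -> bool:
    """A language qualifies when it meets at least `minimum` criteria."""
    return criteria_met(p) >= minimum


# Romansh in Switzerland: official, but rarely used in government or education.
romansh = LanguageProfile(True, False, False, False, 0.005)

# Hypothetical language: not official, but used in schools and widely spoken.
hypothetical = LanguageProfile(False, False, True, False, 0.6)

print(is_common_language(romansh))       # only one criterion met
print(is_common_language(hypothetical))  # two criteria met
```

Under these assumptions Romansh meets only one criterion and does not qualify, while the hypothetical language meets two and does.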
In addition, the lingua franca can drastically change based on different regions within a country. For instance, in Canada, although French is one of the official languages, it is only truly the lingua franca in Quebec. Since Quebec is about 20% of the Canadian population, Canada is weighted as 80% English and 20% French in the lingua franca category. Another similar example is Cyprus, where Greek dominates but Turkish is mostly spoken in Northern Cyprus.
If the common languages of two countries are different but belong to the same language branch, the pair received half credit. For example, Portuguese, Spanish, French, Italian, and Romanian are fairly similar languages since they are all part of the Romance branch of the Indo-European family.
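As a rough sketch of the half-credit rule, the function below returns full credit for identical languages, half credit for different languages on the same branch, and nothing otherwise. The lookup table and function name are assumptions made for illustration; the table only contains languages whose branch is named in this article, whereas the study itself relied on the eLinguistics tree.

```python
# Branch assignments mentioned in the article (illustrative subset only).
BRANCH = {
    "Portuguese": "Romance",
    "Spanish": "Romance",
    "French": "Romance",
    "Italian": "Romance",
    "Romanian": "Romance",
    "German": "Germanic",
    "Dutch": "Germanic",
    "English": "Germanic",
}


def pair_credit(lang_a: str, lang_b: str, branch: dict = BRANCH) -> float:
    """Credit for one pair of languages: 1.0 same, 0.5 same branch, else 0.0."""
    if lang_a == lang_b:
        return 1.0
    branch_a = branch.get(lang_a)
    branch_b = branch.get(lang_b)
    # Languages missing from the table never earn branch credit.
    if branch_a is not None and branch_a == branch_b:
        return 0.5
    return 0.0


print(pair_credit("Spanish", "Spanish"))     # 1.0
print(pair_credit("Spanish", "Portuguese"))  # 0.5
print(pair_credit("Spanish", "German"))      # 0.0
```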
Wikipedia and eLinguistics were the main sources of the data:
https://en.wikipedia.org/wiki/List_of_lingua_francas
http://www.elinguistics.net/Language_Evolutionary_Tree.html
Calculation Method
The languages and religions of two different countries are compared using the following method:
Example 1: In Estonia, 68.5% of people speak Estonian natively, while 29.6% speak Russian natively. Therefore, Estonia gets 0.7 Estonian points and 0.3 Russian points. In Lithuania, 82% of people speak Lithuanian natively, 8% speak Russian natively, and 5.6% speak Polish natively. Therefore, Lithuania gets 0.8 Lithuanian points, 0.1 Russian points, and 0.1 Polish points. When language in Estonia and Lithuania is compared, they get credit for just 0.1 out of 1.0 points, since so few people in Lithuania speak Estonian or Russian natively.
Example 2: Indonesia is 87% Muslim and 10% Christian, so it gets 0.9 Muslim points and 0.1 Christian points. Ethiopia is 67% Christian and 31% Muslim, so it gets 0.7 Christian points and 0.3 Muslim points. When religion in Indonesia and Ethiopia is compared, they receive credit for 0.3 points for Muslims and 0.1 points for Christians, adding up to a total of 0.4 out of 1.0 points.
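Both examples are consistent with a simple procedure: round each share to the nearest tenth of a point, then, for every category the two countries have in common, credit the smaller of the two values. The Python sketch below reproduces the two examples under that reading. The function names and the choice of half-up rounding are assumptions, not something the study states, and the sketch leaves out the half-credit rules for related languages.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_points(percentages: dict) -> dict:
    """Convert percentage shares to points rounded to one decimal place."""
    return {
        name: (Decimal(str(pct)) / 100).quantize(
            Decimal("0.1"), rounding=ROUND_HALF_UP  # assumed rounding mode
        )
        for name, pct in percentages.items()
    }


def overlap(points_a: dict, points_b: dict) -> Decimal:
    """Sum the smaller value for every category both countries share."""
    shared = points_a.keys() & points_b.keys()
    return sum((min(points_a[k], points_b[k]) for k in shared), Decimal("0"))


# Example 1: native languages
estonia = to_points({"Estonian": 68.5, "Russian": 29.6})
lithuania = to_points({"Lithuanian": 82, "Russian": 8, "Polish": 5.6})
print(overlap(estonia, lithuania))   # 0.1

# Example 2: religion
indonesia = to_points({"Muslim": 87, "Christian": 10})
ethiopia = to_points({"Christian": 67, "Muslim": 31})
print(overlap(indonesia, ethiopia))  # 0.4
```

Decimal arithmetic is used here so that sums such as 0.3 + 0.1 print exactly as 0.4 rather than picking up binary floating-point noise.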