The Country Food Similarity Index is an attempt to quantify how similar the food in one country is to the food in another, based on data from the Food and Agriculture Organization of the United Nations. By comparing the weight of raw food ingredients used in each country, it assigns a score that reflects the degree of similarity between the two countries.
The Index will be revised to better reflect which countries actually have the most similar food. The first iteration put too much weight on the ingredients that make up the largest share of each diet. Furthermore, the equally weighted food group method ignored the fact that some countries eat far less meat or fruit than others, which skewed the results.
The revised Country Food Similarity Index will fix these flaws by applying a square root transformation to the individual ingredients in the unweighted data. In most cases, this gives more logical results. The articles about each country will be updated periodically to reflect the new and improved method.
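As a rough illustration of why a square root transformation helps, the sketch below compares each ingredient's share of a diet before and after the transformation. The function name `sqrt_shares` and the supply figures are hypothetical and are not taken from FAOSTAT or from the Index itself; the Index's actual comparison formula is not described here, so only the dampening effect is shown.

```python
import math

def sqrt_shares(weights):
    """Return each ingredient's share of the diet after a square root transformation."""
    transformed = {name: math.sqrt(w) for name, w in weights.items()}
    total = sum(transformed.values())
    return {name: value / total for name, value in transformed.items()}

# Hypothetical per-capita supply figures (kg per year), for illustration only.
diet = {"wheat": 100.0, "tomatoes": 25.0, "cucumbers": 4.0}

raw_total = sum(diet.values())
raw_shares = {name: w / raw_total for name, w in diet.items()}
new_shares = sqrt_shares(diet)

for name in diet:
    print(f"{name}: raw share {raw_shares[name]:.3f}, sqrt share {new_shares[name]:.3f}")
```

In this made-up diet, wheat accounts for roughly 78% of the raw weight but only about 59% after the transformation, so smaller ingredients have more influence on any comparison built on the transformed values.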
For reference, the first attempt’s score was composed of three components: an unweighted comparison method that accounts for one-third of the total index, and two weighted comparison methods that represent the remaining two-thirds. The weighting of Method 2 is based on a distribution similar to that of the food pyramid, with 40% of the index assigned to grains and starchy roots, 30% to fruits and vegetables, 20% to meat and dairy, and 10% to sweets and oils. Method 3 weights four different food groups equally. No drinks, such as milk, beer, or orange juice, are included in the score.
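The way the three components of the first iteration combine can be sketched as follows. This is a minimal sketch under stated assumptions: the per-group similarity scores, the 0 to 100 scale, and all function names are hypothetical; it assumes Method 3 uses the same four food groups as Method 2, and that the two weighted methods each contribute one-third (the text only says they make up two-thirds together).

```python
# Method 2 weights, as described in the text (food-pyramid-like distribution).
FOOD_PYRAMID_WEIGHTS = {
    "grains_and_starchy_roots": 0.40,
    "fruits_and_vegetables": 0.30,
    "meat_and_dairy": 0.20,
    "sweets_and_oils": 0.10,
}
# Method 3 weights: four food groups weighted equally.
# Assumption: these are the same four groups used by Method 2.
EQUAL_WEIGHTS = {group: 0.25 for group in FOOD_PYRAMID_WEIGHTS}

def weighted_score(group_scores, weights):
    """Combine per-group similarity scores using the given group weights."""
    return sum(weights[group] * group_scores[group] for group in weights)

def first_iteration_index(unweighted_score, group_scores):
    """Average the unweighted method and the two weighted methods."""
    method_2 = weighted_score(group_scores, FOOD_PYRAMID_WEIGHTS)
    method_3 = weighted_score(group_scores, EQUAL_WEIGHTS)
    # Assumption: each of the three methods counts for one-third of the index.
    return (unweighted_score + method_2 + method_3) / 3

# Hypothetical similarity scores between two countries, for illustration only.
example_groups = {
    "grains_and_starchy_roots": 80.0,
    "fruits_and_vegetables": 60.0,
    "meat_and_dairy": 40.0,
    "sweets_and_oils": 20.0,
}
index = first_iteration_index(70.0, example_groups)
print(round(index, 2))
```

With these made-up inputs, Method 2 gives 60 and Method 3 gives 50, which shows how the same group scores can produce noticeably different results depending on the weighting.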
One other change was made to the Index. Some of the FAOSTAT data appears to be slightly inconsistent, as some countries did not report certain food ingredients. For example, the Netherlands did not report data on cucumbers, despite cucumbers being a relatively common ingredient in Dutch cuisine. However, two of the countries most similar to the Netherlands, Germany and Belgium, did report data. To fix this inconsistency, the lower of the two numbers was assumed for the Netherlands.
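The gap-filling rule described above could look something like the sketch below. The function name `fill_missing` and the cucumber figures are hypothetical and are not real FAOSTAT values; how the Index selects the similar countries is not specified here, so they are simply passed in.

```python
def fill_missing(own_value, similar_country_values):
    """Return the country's own value if reported; otherwise the lowest
    value reported by its most similar countries (None if none reported)."""
    if own_value is not None:
        return own_value
    reported = [v for v in similar_country_values if v is not None]
    return min(reported) if reported else None

# Hypothetical cucumber supply figures (kg per capita per year), illustration only.
germany_cucumbers = 6.5
belgium_cucumbers = 4.2
netherlands_cucumbers = None  # not reported

estimate = fill_missing(netherlands_cucumbers, [germany_cucumbers, belgium_cucumbers])
print(estimate)
```

Taking the lower figure is the more conservative choice: it acknowledges that the ingredient is eaten without overstating how much.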