State Similarity Index

Have you ever wondered how similar or different two states are? The State Similarity Index attempts to quantify how similar American states are to each other relative to other states. For example, one might intuitively know that West Virginia is more like Alabama than it is like Alaska, but less like Alabama than it is like Kentucky. The index is a statistics-based way to measure this. It gives equal weight to five major aspects of states: their demographics, culture, politics, infrastructure, and geography. The research combines more than 1,000 different data points. Within each aspect, the data was balanced roughly evenly between measures of quantity or percentage and measures of quality or type. The lists below lay out the criteria for the index in more depth:

Overall Rubric


Individuals & their personal characteristics

  • Race: Whites, Latinos, Blacks, Asians, Pacific Islanders, Native Americans
  • Ancestry: National Origin
  • Religion: Religious Denominations
  • Age: Average Age, Children, Elderly
  • Marriage: Married, Divorced, Single
  • Appearance: Height, Weight, Gender
  • Income: Average Income, Poverty, Millionaires
  • Education: High School Graduates, College Graduates, Advanced Degrees


Society & its common activities

  • Religion: Belief and Religious Denomination
  • History: Native Americans, Colonization, Civil War, Segregation
  • Speech: Accents and Native Languages
  • Entertainment: Sports, Exercise, Music, Television
  • Violence: Guns, Hunting, Murder, Incarceration
  • Health: Tobacco, Alcohol, Drugs, Suicide
  • Behavior: Personality, Individualism, Charity, Tolerance
  • Organizations: Union Membership, Military Enlistment, College Conferences, University Standards


Government & its adopted policies

  • Legislative: Senators, Congressmen
  • Executive: Presidential Elections, Gubernatorial Elections, Primaries
  • Judicial: State Court, Court of Appeals, Judicial Selection, Sentencing
  • Vices: Smoking, Marijuana, Alcohol, Gambling
  • Sex: Consent, Incest, Abortion, Surrogacy
  • Budget: Taxation, Expenditures
  • Protections: Rights, Immigration, Minimum Wage, Unionization
  • Procedures: Gun Control, Voting Laws, Education Laws, Traffic Laws


Technology & essential equipment

  • Energy: Use, Electrical Grid, Sources
  • Transportation: Mass Transit, Airplanes, Railways, Roadways
  • Water: Use, Content, Distribution
  • Construction: Homes, Skyscrapers, Age
  • Industry: Mining, Manufacturing
  • Logistics: Railroad Tonnage, Port Tonnage, Pipelines, Semi-Trucks
  • Communication: Radio, Telephone, Television, Internet
  • Emergency: Doctors, Firefighters, Police, Military


Environment & its physical features

  • Climate: Temperature, Precipitation
  • Land Cover: Forest, Farms, Pastures, Barren Lands
  • Vegetation Type: Crops, Habitats, Floristic Regions
  • Location: Time Zone, Latitude
  • Adjacency: Hydrologic Region, Geologic Region
  • Slope: Mountains, Elevation
  • Water: Freshwater, Coastal Water
  • Anthropization: Population, Built-Up Area

A full distance matrix of all states was completed, revealing similarities and differences between individual US states.
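
As a rough illustration of how such a distance matrix could be assembled, here is a minimal Python sketch. It assumes each state is described by five lists of values already scaled to the range 0 to 1, one list per aspect, and that the distance within an aspect is the mean absolute difference. The function names, the scaling, and the distance measure are all assumptions made for illustration; the article does not specify the formula actually used. The only detail taken from the article is that the five aspects are weighted equally.

```python
from itertools import combinations

# The five aspects named in the article, each given equal weight.
ASPECTS = ("demographics", "culture", "politics", "infrastructure", "geography")


def aspect_distance(values_a, values_b):
    """Mean absolute difference between two equal-length lists scaled to [0, 1]."""
    return sum(abs(x - y) for x, y in zip(values_a, values_b)) / len(values_a)


def state_distance(profile_a, profile_b):
    """Average the five aspect distances so that each aspect counts equally."""
    total = sum(aspect_distance(profile_a[k], profile_b[k]) for k in ASPECTS)
    return total / len(ASPECTS)


def distance_matrix(profiles):
    """Return a nested dict holding the distance between every pair of states."""
    matrix = {name: {name: 0.0} for name in profiles}
    for a, b in combinations(profiles, 2):
        d = state_distance(profiles[a], profiles[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Placeholder profiles with made-up values; these are not real state data.
profiles = {
    "State A": {k: [0.0, 0.0] for k in ASPECTS},
    "State B": {k: [0.5, 0.5] for k in ASPECTS},
    "State C": {k: [1.0, 1.0] for k in ASPECTS},
}

matrix = distance_matrix(profiles)
print(matrix["State A"]["State B"])  # 0.5
print(matrix["State A"]["State C"])  # 1.0
```

In this toy example a smaller number means two states are more alike, so State A is closer to State B than to State C. Averaging the aspect distances, rather than pooling every statistic together, keeps an aspect with many statistics from outweighing one with few.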

The data from the index was then used to generate an accurate map of the regions of the United States.


  • If a characteristic was nearly universal, it was not included in the calculation. For example, Louisiana is the only state with a judicial system that combines common law with civil law. This characteristic was therefore not used, since it would make no difference in the vast majority of comparisons.
  • Similarly, if a characteristic is almost never shared between states, it was not included in the calculation either.
  • Some characteristics are difficult to quantify, or statistics for them are hard to find. For example, architectural style would have been included in the index if relevant statistics had been readily available. However, since over 100 individual statistics were used, adding more would not significantly change the overall score for a state. In addition, many statistics correlate highly with one another.
  • In response to feedback, the index has been slightly revised since it was first created. This is version 3.0. Race, religion, history, language, and climate have all been given a heavier weight than in the first iteration.
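
The first two notes above describe a screening rule: drop a characteristic when nearly every pair of states shares it, or when almost no pair does. One way this could be checked is sketched below in Python. The helper names and the 5% and 95% cutoffs are hypothetical choices made for illustration, not thresholds taken from the index itself.

```python
from collections import Counter


def shared_pair_fraction(values):
    """Fraction of state pairs that have the same value for one characteristic."""
    n = len(values)
    total_pairs = n * (n - 1) // 2
    same_pairs = sum(c * (c - 1) // 2 for c in Counter(values).values())
    return same_pairs / total_pairs


def is_informative(values, low=0.05, high=0.95):
    """Keep a characteristic only if it is neither nearly universal nor nearly unique.

    The low and high cutoffs are assumed values for this sketch.
    """
    return low <= shared_pair_fraction(values) <= high


# One state with a mixed legal system and 49 without: nearly universal, so dropped.
legal_system = ["common law"] * 49 + ["mixed"]
print(is_informative(legal_system))  # False

# A characteristic whose value is different in every state: never shared, so dropped.
print(is_informative(list(range(50))))  # False

# A hypothetical characteristic that splits the states evenly is kept.
print(is_informative(["x"] * 25 + ["y"] * 25))  # True
```

With 50 states there are 1,225 possible pairs. In the legal-system case 1,176 of them (96%) share the same value, which is why the characteristic would change almost no comparison.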

More specific information about the statistics used in this index and how they are quantified will be reviewed in later posts.