Geo 485 Final Project
Mapping Foreign Born Population and Language Spoken at Home
Skip all the gibberish, let me see the maps!
Introduction
For my final project in GEO485 (Cartographic and Geographic Visualization), I chose to create maps showing the spatial patterns of spoken language and foreign born population in New York state. This page discusses the methods and data used, and presents the maps themselves.
Data
All data used in the maps is from the US Census Bureau's 2000 census. In particular, the data comes from the SF3 (Summary File 3) tables, which summarize responses from a 1-in-6 sample weighted to represent the entire population. American FactFinder was used to select a smaller, more manageable subset of the available data.
Language data was separated into native and foreign born speakers, and further broken down by English language ability using the Census Bureau classifications. The universe for language data is all people 5 years of age and over. Immigration data (represented by citizens who are foreign born, i.e. not born inside the United States) was broken down by country of origin, which was then grouped into supra-regions (e.g. Western Europe) and finally into continents. The universe for foreign born data is all foreign born citizens of the US.
All data required processing to extract the exact values to be mapped.
- Click here to download the immigration data I used. Includes raw and processed data.
- Click here to download the language data I used. Includes raw and processed data.
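The roll-up from country of origin to supra-region and then to continent can be sketched as follows. This is only an illustration of the kind of processing described above: the lookup table, the region names, and the counts are all hypothetical, and the function name `roll_up` is my own, not part of any Census tool.

```python
from collections import defaultdict

# Hypothetical lookup: country -> (supra-region, continent).
# The groupings here are illustrative, not the exact Census Bureau lists.
REGION_LOOKUP = {
    "Germany": ("Western Europe", "Europe"),
    "France": ("Western Europe", "Europe"),
    "Poland": ("Eastern Europe", "Europe"),
    "India": ("South Central Asia", "Asia"),
    "Mexico": ("Central America", "Americas"),
}


def roll_up(counts_by_country, level):
    """Sum country counts into supra-regions (level=0) or continents (level=1)."""
    totals = defaultdict(int)
    for country, count in counts_by_country.items():
        totals[REGION_LOOKUP[country][level]] += count
    return dict(totals)


# Made-up foreign born counts for a single county.
county = {"Germany": 120, "France": 30, "Poland": 50, "India": 200, "Mexico": 75}

by_region = roll_up(county, 0)
by_continent = roll_up(county, 1)
```

Keeping the lookup in one table means the same raw counts can be summarized at either level without re-entering any data.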
Classification
All maps use classified data, with the exception of the Total Number of Speakers/Foreign Born maps. For the choropleth maps, the classes are derived directly from the Census Bureau classes, while the classes for the Percentage of Population maps are calculated using the Jenks natural breaks method. The total population maps do not require any classification, although it would be possible to classify the data using the Census Bureau classes to produce a multivariate map.
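For readers unfamiliar with Jenks natural breaks, the idea is to choose class boundaries that minimize the variation within each class. The sketch below is a plain dynamic-programming version of that idea (often called Fisher-Jenks); it is not necessarily what the mapping software does internally, and the function name and sample percentages are my own assumptions.

```python
def jenks_breaks(values, n_classes):
    """Return the upper bound of each class for an optimal natural-breaks split."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the sum of squared deviations of any slice in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(data):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[k][j]: lowest total SSD when data[:j] is split into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the split table to recover each class's upper bound.
    breaks = []
    j = n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = split[k][j]
    return breaks[::-1]


# Made-up county percentages, for illustration only.
sample_percentages = [1.1, 1.4, 2.0, 2.2, 6.5, 7.0, 7.3, 21.0, 24.5]
class_upper_bounds = jenks_breaks(sample_percentages, 3)
```

The running time grows with the square of the number of values, which is fine for a few dozen counties but would be slow for very large data sets.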
Unfortunately, the classes used for Area of Origin and Language Family do not have a direct 1:1 mapping. In particular, the Asian language family does not map to the Asian Area of Origin. The Asian area of origin includes countries such as India, Pakistan, Turkey, and others where Indo-European languages are spoken. Thus, although a county may have a primarily Asian area of origin, an Indo-European language may be predominantly spoken there. A similar problem exists with the Spanish language, which can be spoken by people from two different areas of origin: Europe and the Americas (i.e. South and Central America). Additionally, Spanish is itself an Indo-European language which, due to its large number of speakers, has been given its own category. Thus the Indo-European class is essentially any language other than Spanish that is spoken in Europe and parts of Asia.
These classification problems could be avoided by further breaking down either the geographic regions or the language families into their component members. Unfortunately, doing so would result in a very complex, difficult-to-interpret map with a huge number of classes. Breaking up continents into their supra-regions would result in 16 classes, far too many for a readable map.
Symbology
When choosing the symbology for the maps, the first choice I made was the color scheme. Because spoken language and (in particular) immigration can be politically touchy subjects, I decided to avoid colors that are associated with "danger" or that could be considered propaganda-ish, such as red or yellow. Instead I used cooler pastel colors such as blue, green and beige. Additionally, despite the classification issues mentioned above, the color schemes for the Area of Origin and Language Family maps were chosen to roughly match each area of origin with the language family spoken there. Graduated symbols were used for the percentage of population maps because they showed the changes in values much more clearly than proportional symbols. Dot density was used for displaying the actual numbers of non-English users and foreign born members to give the map user more of a "feel" for the numbers of people in a given county, as well as to effectively display the change in values for each country.
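The difference between proportional and graduated symbols can be illustrated with a short sketch. Proportional symbols scale continuously with the value (here by making the circle's area proportional to it), while graduated symbols use one fixed size per class. The function names, class bounds, and radii below are assumptions chosen for illustration, not the values used in my maps.

```python
import math
from bisect import bisect_left


def proportional_radius(value, max_value, max_radius=20.0):
    """Radius chosen so that the circle's area is proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


def graduated_radius(value, class_upper_bounds, radii):
    """Radius of the class containing the value; one fixed size per class."""
    index = bisect_left(class_upper_bounds, value)
    # Values above the last bound fall into the top class.
    index = min(index, len(radii) - 1)
    return radii[index]


# Hypothetical class bounds (percent of population) and symbol radii.
bounds = [5.0, 10.0, 25.0]
radii = [4, 8, 12]

small_gap = (graduated_radius(9.0, bounds, radii), graduated_radius(10.0, bounds, radii))
across_break = (graduated_radius(10.0, bounds, radii), graduated_radius(10.1, bounds, radii))
```

With graduated symbols, two counties in the same class get identical symbols, and a county just over a class break gets a visibly larger one, which is why the steps between classes are easier to read.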
Region
I chose to map the variables at the state level because I felt that it would provide sufficient variety in the variables without being too overwhelming. The same map at county level would likely not give sufficient variation to allow the user to see any spatial patterns without changing the classes significantly. I chose New York state because I happen to live there, and also because the change in population and demographics as one approaches New York City shows a spatial pattern that is readily recognizable.
The same maps could likely be produced at the national level with few changes. Data is available at the national level for both foreign born population and language spoken at home, and the same symbology could be used. The only substantial differences would be changing the number of people represented by each dot on the dot density maps and possibly changing the classes used in the graduated symbol maps, because of the disparity in populations between states.
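As a rough sketch of how the dot value might be rescaled for a different extent, one option is to cap the number of dots drawn in the most populous unit and derive the people-per-dot value from that cap. The cap, the function names, and the population figures below are all made up for illustration.

```python
import math


def people_per_dot(max_population, max_dots):
    """Smallest whole dot value that keeps the largest unit within max_dots."""
    return math.ceil(max_population / max_dots)


def dot_count(population, dot_value):
    """Number of dots for one unit, rounded to the nearest whole dot."""
    return (population + dot_value // 2) // dot_value


# Hypothetical largest units at two different map extents.
county_level_value = people_per_dot(2_000_000, 4_000)
state_level_value = people_per_dot(30_000_000, 4_000)

dots_for_small_county = dot_count(12_345, county_level_value)
```

Because the largest state is far more populous than the largest county, the same cap on dots produces a much larger dot value at the national level.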
Maps
- Areas of Origin for Foreign Born Citizens (Download a PDF version of this map)
- Predominant Non-English Language Families (Download a PDF version of this map)
- Total Number of Foreign Born Citizens (Download a PDF version of this map)
- Total Number of Non-English using Households (Download a PDF version of this map)
- Percentage of the Total Population that is Foreign Born (Download a PDF version of this map)
- Percentage of the Total Population that uses a Non-English Language at Home (Download a PDF version of this map)






