Media Resource

Immigration and Citizenship Keyword Thesaurus for Chronicling America

Created by partners in the National Digital Newspaper Program, this resource hopes to serve researchers at all levels through demonstrations and explanations of search terms related to immigration and citizenship status in Chronicling America. Established in 2005 through a partnership between the National Endowment for the Humanities and the Library of Congress, Chronicling America contains nearly 23 million pages of newspapers supported by the work of partners in 50 states, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands.  

Searching a database of this size can seem like an impossible feat at the outset. To help, Chronicling America features two search functions that allow search terms, or keywords, to be run through the texts of newspaper pages. This is an immensely efficient and effective way to search millions of pages of content, yet keyword searches still do not always produce the results we want or need. Developing the most useful keywords requires knowing what terms were used in the period you are searching, rather than what the common, contemporary terms might be.

Identifying keywords can be particularly challenging when searching for news about immigration, since much of the language describing the voluntary and involuntary movement of peoples has changed over the centuries, and the meanings of these terms may vary depending on who is using them and in what context.

The pages in this Immigration and Citizenship Keyword Thesaurus serve as a guide to searching topics of people’s voluntary and involuntary movement in Chronicling America, including lists of words used in the past that may help produce more results, as well as strategies for navigating the database. When using this resource, keep in mind that historical newspapers, like all primary documents from the past, use the language of the time they were written, which may include terms considered offensive today. While efforts have been made to include and increase the ethnic press content in Chronicling America, most of the newspapers currently in the database are English-language and produced by white publishers and editors. In this iteration, this Thesaurus is intended primarily to assist researchers in identifying terms on immigration in the English-language press. Researchers should be mindful that immigration does not exist independently but intersects with other aspects of society like community size, class, and gender.
 

Starting Your Search

Keyword Searching

When searching on the web, you can enter your complete research question in the search box and get a number of results. However, if you enter your entire research question in the Chronicling America search bar, you probably won't get any results. Keyword searches rely on the exact words that you enter in the search box, so if the search can't find every one of those words in the information about an article, it won't bring back any results.
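The all-words-must-match behavior described above can be illustrated with a minimal sketch. This is a toy model for explanation only, not Chronicling America's actual search code; the function name `matches_all_keywords` and the sample page text are hypothetical.

```python
import re


def matches_all_keywords(query: str, page_text: str) -> bool:
    """Return True only if every word in the query appears in the page text."""
    page_words = set(re.findall(r"[a-z]+", page_text.lower()))
    query_words = re.findall(r"[a-z]+", query.lower())
    return all(word in page_words for word in query_words)


# Hypothetical page text, invented for illustration.
page = "The alien was admitted at the port of New York."

# A short keyword query: both words appear on the page.
short_query_matches = matches_all_keywords("alien port", page)

# A full research question: several of its words are missing from the page.
long_query_matches = matches_all_keywords(
    "How were immigrants treated at the port?", page
)

print(short_query_matches, long_query_matches)
```

In this sketch the two-word query matches, while the full question does not, because words such as "how" and "immigrants" never appear on the page.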

Keywords, also commonly called search terms, are the words that you enter into the database search boxes. They represent the main concepts of your research topic and are the words used in everyday life to describe the topic, often specific to a time and a place. Without the right keywords, you may have difficulty finding the articles that you need.

The Chronicling America help section offers advice for performing basic and advanced searches, as well as general tips for using the database.  

Choosing your Keywords

Usually, the nouns and adjectives of your research question will give you a good idea of what your keywords will be. From these keywords, make a list of synonyms to use as alternatives. Since different writers will describe the same thing using different words, creating a variety of keywords to use in your search will increase the number of results. This thesaurus aims to assist you with synonyms for researching questions around immigration and citizenship.  

Because historical primary documents like newspapers use the language of the past that is often no longer common parlance, some of today’s language will not make for good keywords to search on Chronicling America. For example, searching the term “migrant” in Chronicling America produces 104,058 results. If you search the term “alien,” however, 1,401,268 results are returned, as it is the most widely used term. As this example illustrates, though we do not often use the word “alien” in contemporary discourse, it was far more common in the past, and therefore it is a better search term if we are researching the history of immigration and citizenship status. Even with the suggested keywords, you should be prepared to perform multiple searches to determine which keywords work best for your topic. Eventually, trial and error will help you find the materials you need.
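The process of turning a research question into a list of keywords and period synonyms can be sketched as follows. This is a rough illustration under stated assumptions: the stop-word list is a crude stand-in for picking out nouns and adjectives (it will also let verbs through), the synonym table holds only the “migrant”/“alien” pair discussed above, and every name in the code is hypothetical.

```python
import re

# Assumption: a tiny stop-word list standing in for "keep the nouns and adjectives".
STOP_WORDS = {"what", "did", "about", "the", "in", "a", "an", "of", "to", "how"}

# Assumption: a one-entry synonym table based on the example in the text.
HISTORICAL_SYNONYMS = {"migrant": ["alien"]}


def candidate_keywords(question: str) -> list[str]:
    """List the content words of a question, each followed by its period synonyms."""
    keywords: list[str] = []
    for word in re.findall(r"[a-z]+", question.lower()):
        if word in STOP_WORDS:
            continue
        for term in [word] + HISTORICAL_SYNONYMS.get(word, []):
            if term not in keywords:  # keep each term once, in order
                keywords.append(term)
    return keywords


keywords = candidate_keywords("What did newspapers report about the migrant in Texas?")
print(keywords)  # ['newspapers', 'report', 'migrant', 'alien', 'texas']
```

Each term in the resulting list would then be tried in its own search, in keeping with the trial-and-error approach described above.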

OCR Considerations

Optical Character Recognition, or OCR, is a technology that transforms images of pages into text that we, and by extension the software, can read. Chronicling America uses OCR to process images of newspaper pages by locating and recognizing characters, such as letters, numbers, and symbols, and presenting them as words that keyword searches identify and return as search results.

OCR works best when the images are clear and free of imperfections, but documents from the past do not always fit this description. Historic newspapers are usually digitized from microform copies of the original pages, which makes for some messy images full of smudges, blurs, and imperfections.  

Unfortunately, OCR software is unable to distinguish intentional characters, like the individual letters of a word, from the unintentional marks found on the page, like the smudges or blurs. For example, a blurry “c” might be read as an “o” by the OCR software, which would make a word like “cat” read as “oat.”  

To account for these errors, researchers must adjust some keywords to search for the correct term as well as the “OCR possibles.” For example, when you perform a search for “Alien,” it may also be useful to perform a search for “Allen” to catch any results that have errors in the OCR text. For each of the keywords provided, the resources in this Immigration and Citizenship Keyword Thesaurus list potential OCR blunders under the section “OCR Considerations” to encourage you to consider how OCR might be determining your search results.
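One way to think about “OCR possibles” is as single-character substitutions applied to a keyword. The sketch below is illustrative only: the confusion table contains just the two misreadings mentioned in this section (“c” read as “o”, and “i” read as “l”), real OCR errors are far more varied, and the names `OCR_CONFUSIONS` and `ocr_variants` are hypothetical.

```python
# Assumption: a minimal confusion table built from the examples in the text.
OCR_CONFUSIONS = {"c": ["o"], "i": ["l"]}


def ocr_variants(keyword: str) -> list[str]:
    """Return spellings that differ from the keyword by one likely OCR misreading."""
    keyword = keyword.lower()
    variants: list[str] = []
    for position, char in enumerate(keyword):
        for replacement in OCR_CONFUSIONS.get(char, []):
            variant = keyword[:position] + replacement + keyword[position + 1:]
            if variant not in variants:
                variants.append(variant)
    return variants


print(ocr_variants("alien"))  # ['allen']
print(ocr_variants("cat"))    # ['oat']
```

Searching for each variant alongside the correctly spelled term may surface pages whose OCR text contains the error.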

Working with the Thesaurus

Created by nationwide partners in the National Digital Newspaper Program, this resource hopes to serve researchers at all levels through demonstrations and explanations of search terms related to immigration and citizenship in Chronicling America. Launched in September 2024, this site hopes to expand its coverage and reach in the years ahead.

Organized around eleven keywords for immigration and citizenship, the Thesaurus consists of words that were used in the past to describe this category and therefore will likely be helpful in searching for articles about a given immigration or citizenship topic. For each keyword, the Thesaurus offers the following information: 

Related Terms 

Here you will find other terms to try in your searches. Some of these terms are synonyms, while others come from different periods and can help when searching across different time periods.

Definitions 

Provided are definitions of these terms throughout the years. This section also includes how various groups of people have used these terms. Many of these definitions have been sourced from the Oxford English Dictionary.

Contextual Considerations, or "How these Terms were Used"

Contextual considerations provide examples of how various groups of people have used these terms, as well as some of the historical reasons surrounding their use.

Examples from Chronicling America 

The images provided in this section show examples of these terms being used in newspapers found on Chronicling America. As you will notice, the examples selected illuminate the “Contextual Considerations” described in the previous section.

OCR Considerations, or "How the Computer Sees it"

Also provided are a few variants that consider the errors made in the OCR. You may want to search a few of these in addition to the correctly spelled term. This section does not aim to be comprehensive; instead, it features a few examples of how the OCR may incorrectly transform the page image into text.  


 

Thesaurus Categories and Keyword Lists

Each of the links below will open a new page in the Thesaurus organized around the broad categories of immigration and citizenship. The keywords suggested on each page represent related terms that were used by newspapers in various historical periods and may improve your Chronicling America search results. 

A Note on Harmful Language

On each keyword list page, a pop-up will appear with the following message about harmful language:

The following resource presents a list of terms that may reflect racist and xenophobic opinions and attitudes. In providing this list of keywords, our goal is to support research into the lives and experiences of various communities, rather than to propagate the use of derogatory or harmful language.

When researching historical materials, especially newspapers, it is often necessary to use language in common use at the time of publication. Historical newspapers reflect the opinions and attitudes of their time. Their pages often contain biased, offensive, and outdated words and images that are now understood to be harmful.    

As responsible researchers, we should acknowledge and be mindful of how these terms oppressed groups of people, and take great care in using these terms to conduct research in the present. 

This pop-up reminds users to take a physical and mental pause when reading these words to consider the ways this language has been used to oppress communities of people. The terms themselves detail the complicated and painful relationship that the country has had with understanding immigration, and they remind us that language holds power in the ways it can support or dismantle systems of oppression.
 

Acknowledgements

The work for our Immigration and Citizenship Keyword Thesaurus for Chronicling America was truly collaborative. The National Digital Newspaper Program’s Working Group on Race and Ethnicity, including David Ferrara (AL), Mary Feeney (AZ), Timothy Gieringer (TX), Molly Hardy (NEH), Melissa Jerome (FL), Lauren Kennedy (OH), Ana Krahmer (TX), Sheila McAlister (GA), Katherine Poland (IL), William Schlaack (IL), Sarah Tew, and Toben Traver (NH), initiated the project. The group's ideas were realized thanks to the skillful work of NEH’s Pathways Intern Britney Henry. We would also like to thank the previous NEH Division of Preservation and Access intern, Samantha Gilmore, for her support and suggestions.