Uncovering Hidden Government Documents

September 2020

Shifts in presidential administrations, agency policies and procedures, and internet standards often result in changes to the content and organization of federal websites. This can make it difficult for researchers to locate documentation—such as consent decrees and guidance memoranda—that may still be in force. The hyperlink or title of such guidance may appear in the Federal Register or within an article or court filing, or it may be referred to by the agency itself, but the link may be broken or the document may be otherwise difficult to locate. The following tips can be used to find elusive government records.

If the URL is Known

If the URL is known, there are several ways to search for the information. First, the Internet Archive1 provides access to 20-plus years of web history through the Wayback Machine.2 After entering a URL, you’ll be taken to a calendar showing when and how successfully the site was saved. Larger circles indicate multiple captures on a particular day, making more of the site’s content available for viewing; blue circles indicate a more successful crawl of the site.3 Click a date, then click the snapshot link that appears, to view the desired site or document. (See Fig. 1.)
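
For researchers who check many addresses, the Wayback Machine lookup can be scripted. The following is a minimal Python sketch, not an official Internet Archive tool. It assumes the publicly visible address pattern web.archive.org/web/<timestamp>/<url>, where an asterisk in place of the timestamp opens the capture calendar; the function names are hypothetical and chosen only for illustration.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"  # assumed public address pattern


def wayback_calendar_url(url: str) -> str:
    """Build the address of the capture calendar for a known URL."""
    # The asterisk stands in for "all capture dates".
    return f"{WAYBACK_PREFIX}/*/{url}"


def wayback_snapshot_url(url: str, timestamp: str) -> str:
    """Build the address of a capture near a date (YYYYMMDD or YYYYMMDDhhmmss)."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


if __name__ == "__main__":
    print(wayback_calendar_url("www.epa.gov"))
    print(wayback_snapshot_url("www.epa.gov", "20200901"))
```

Pasting either printed address into a browser should lead to the same calendar or snapshot you would reach by hand. If no capture exists for the exact date requested, the site generally shows the nearest one it has.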

Another option is to perform a Google search. Google’s cached version of a website shows what the web page looked like the last time it was visited by one of Google’s crawlers. Type the cache: search operator immediately before the URL in the Google.com search field.4 For example, to review a cached version of the Colorado Bar Association (CBA) website, enter cache:cobar.org into the search field. These backup snapshots are generally intended to give Google users quick access to slow or nonresponsive web pages and are typically not very old, but they may still provide the needed entry to a site or document.

Finally, if navigating directly to a known web address is not successful, the URL might provide insight as to where a document used to “live” on a website before its restructuring. For example, the URL www.epa.gov/newsreleases/epa-declares-outdoor-burn-ban-tulalip-reservation.html suggests that the needed document is a news release titled “EPA Declares Outdoor Burn Ban for Tulalip Reservation.” Remove the end of the link (anything following the last forward slash), repeating as necessary, until a valid page is reached. In the case of a major website reorganization, you may need to remove everything but the root of a site’s URL (e.g., www.epa.gov). Then, search the site for the desired guidance using the keywords provided in the URL. This strategy is particularly useful when trying to locate a PDF that’s not text-searchable (i.e., it’s been saved as an image), because a keyword search cannot match text inside such a file.
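
The trimming technique described above can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, and the code assumes an https:// prefix when the address is written without one. It lists each progressively shorter address to try and pulls search keywords out of the last part of the link.

```python
import re
from urllib.parse import urlsplit


def _with_scheme(url: str) -> str:
    # Assumption: default to https when the address has no scheme.
    return url if "://" in url else "https://" + url


def parent_urls(url: str) -> list[str]:
    """Return progressively shorter addresses, ending at the site root."""
    parts = urlsplit(_with_scheme(url))
    root = f"{parts.scheme}://{parts.netloc}"
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    while segments:
        segments.pop()  # drop everything after the last forward slash
        candidates.append(root + "/" + "/".join(segments) if segments else root)
    return candidates


def slug_keywords(url: str) -> list[str]:
    """Split the last part of the link into candidate search keywords."""
    path = urlsplit(_with_scheme(url)).path
    last = path.rstrip("/").rsplit("/", 1)[-1]
    last = re.sub(r"\.[A-Za-z0-9]+$", "", last)  # strip an extension such as .html
    return [word for word in re.split(r"[-_]+", last) if word]


if __name__ == "__main__":
    link = "www.epa.gov/newsreleases/epa-declares-outdoor-burn-ban-tulalip-reservation.html"
    print(parent_urls(link))
    print(slug_keywords(link))
```

For the EPA link, the sketch suggests trying www.epa.gov/newsreleases and then www.epa.gov, and it yields the keywords epa, declares, outdoor, burn, ban, tulalip, and reservation. Note that a link often omits small words that appear in the actual title (here, “for”).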

If the Document Title is Known

Just as a URL can provide insight as to where a document used to reside, it can also indicate the title of the needed guidance. Searching for a known document title in quotation marks ensures the search engine will keep all terms together as a phrase.5 You can use the site: search operator to further refine the search to a specific website.6 Using the previous EPA example, you would type “epa declares outdoor burn ban for tulalip reservation” site:epa.gov into the Google.com search field. This leads to the archived article on the EPA website.
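
If you run this kind of search often, the query can be assembled programmatically. The Python sketch below is illustrative only: the function names are hypothetical, and the google.com/search?q= address form is an assumption based on the address a browser shows after a search, not a documented interface.

```python
from typing import Optional
from urllib.parse import urlencode


def phrase_query(title: str, site: Optional[str] = None) -> str:
    """Wrap a title in quotation marks and optionally limit it to one website."""
    query = f'"{title}"'
    if site:
        query += f" site:{site}"
    return query


def google_search_url(query: str) -> str:
    """Encode a query for use in a browser address bar (assumed address form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    q = phrase_query("epa declares outdoor burn ban for tulalip reservation", "epa.gov")
    print(q)
    print(google_search_url(q))
```

The first printed line is the text you would type into the search field; the second is the same search expressed as a link that can be saved or shared.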

This strategy is often more effective than using a website’s embedded search tool. But because each search engine employs proprietary bots that crawl and index the web and use unique algorithms to rank websites, you may need to use more than one search engine and review more than the first page of search results.

Another helpful resource is OCLC, a global cooperative of libraries that “collectively steward a vast quantity of knowledge”7 through WorldCat, which bills itself as “the world’s largest network of library content and services.”8 With WorldCat’s advanced search, users can search by keyword, title, and author.9 The full record of a search result often contains “links to this item” and a unique identifier called a PURL, or persistent uniform resource locator.10 Even if a known URL is broken, a PURL may still be a successful means of access. “Links to this item” may also list a previously unknown website; use the “Known URL” tips above if this address no longer works.

Finally, HathiTrust is a nonprofit collaborative of academic and research libraries with more than 17 million digitized items.11 The mission of its U.S. Federal Government Documents Program is to enhance digital access to U.S. federal publications, including those issued by the U.S. Government Publishing Office (GPO) and other federal agencies.12 Use the advanced catalog search to input information about the needed document, including author, title, and subject.13 More refined searches may be conducted within the following U.S. Federal Documents Collections: U.S. Federal Documents, U.S. Congressional Serial Set, Bureau of Indian Affairs publications, U.S. Environmental Protection Agency publications, Foreign Relations of the United States, Statistical Abstract of the United States, and U.S. Civil Rights Commission.14 Nonmembers can search across all collections, but viewing and downloading privileges may be restricted.

Additional Search Options

If you have a quote or passage from the desired document or site, try searching for that excerpt within quotation marks. The guidance you seek may have been archived on a website other than the issuing agency’s. If searching for the excerpt generates a lot of results but not the full text of the document, try using the filetype:pdf search operator for a more targeted search by document format. For example, entering “epa wastewise” filetype:pdf into the Google.com search field returns only PDF results. Additionally, both the Wayback Machine15 and HathiTrust16 allow for full-text searching.
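
An excerpt search with a file-format filter can be put together the same way. The short Python sketch below uses a hypothetical function name and, as a precaution rather than a documented requirement, converts curly quotation marks pasted from a document into straight ones before building the query.

```python
def excerpt_query(excerpt: str, filetype: str = "") -> str:
    """Quote an excerpt and optionally restrict results to one file format."""
    # Assumption: straight quotation marks are the safest form for a search field.
    cleaned = excerpt.strip().strip('"“”')
    query = f'"{cleaned}"'
    if filetype:
        query += f" filetype:{filetype.lstrip('.').lower()}"
    return query


if __name__ == "__main__":
    print(excerpt_query("“epa wastewise”", "pdf"))
```

Leaving the second argument empty produces the plain quoted excerpt, which is the better starting point when you do not yet know the document’s format.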

If you don’t have a URL, the document title, or a direct quote, conduct a broader internet search using any known document details, such as author, recipient, parties, agency, date, and document, case, or other identifying number. You can also try to identify online repositories or special collections that typically contain the type of document you need. Examples include the U.S. Department of the Interior’s Office of Hearings and Appeals database17 and the EPA Web Archive.18 As stated above, it may be necessary to use more than one search engine and review multiple pages of search results.

