Ancestry Daily News
In the last few columns, we've talked about name variants, the reasons why they came about, and the frustrations they create for the genealogist. This week we look at some ways to search for variant spellings of names.
Searching for Soundex Equivalents
Before I do any Soundex searches for a name, I refer to my list of variant spellings for the surname. I like to have one sheet of variants (either on paper or in a word processing document) for each surname on which I'm working. For each of these spellings, I then list the Soundex code (usually a letter and three numbers). Most online databases do not require that I enter the Soundex code (or even know what it is). Soundex searches of online databases are usually performed by checking a box for Soundex. However, having the list of my variants and their codes tells me how many separate searches I will have to conduct even if the Soundex option is used.
A Soundex search for Neil will not catch O'Neil. Consequently, two separate searches will have to be conducted for these variants: one for any of the names that have a code of N400 (such as NEIL) and one for any of the names that have a code of O540 (such as O'NEAL). Notice that a different first letter in this case generated a different Soundex code.
TRAUTVETTER has Soundex code T631
Again two separate searches are necessary. A Soundex search for Trautvetter will catch the first four references, but a separate search is required for the Trantvetter spelling.
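The list of variants and their codes can also be built with a short script. What follows is a minimal sketch of the standard American Soundex rules, not the exact routine any particular site uses; the function names `soundex` and `searches_needed` are made up for this illustration, and some databases apply slightly different rules.

```python
# Consonant groups used by standard American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a letter-plus-three-digits Soundex code for a surname."""
    # Keep only the letters A-Z, so the apostrophe in O'Neal is dropped.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (first + "".join(digits) + "000")[:4]


def searches_needed(variants):
    """Group variant spellings by code: one Soundex search per group."""
    groups = {}
    for name in variants:
        groups.setdefault(soundex(name), []).append(name)
    return groups


for code, names in searches_needed(
        ["Neil", "O'Neal", "Trautvetter", "Trantvetter"]).items():
    print(code, names)
```

Each code printed is one Soundex search to run, so spellings that share a code can be covered together.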
Soundex searches work well in some situations, particularly when the name is pronounced in a language with a pronunciation that is reasonably similar to English and the handwriting of the original record is easy to read. In other cases, a Soundex search may not be the most effective tool available to the researcher. For many search interfaces, there are other options in addition to the Soundex that can allow the researcher to overcome the limitations that hinder Soundex.
Before our discussion of wildcards continues, it is important to note that the wildcard operator used can vary from one site to another and the number of known characters required before the search can be conducted varies among different sites.
Any last name beginning with the three letters Nel will be returned. The wildcard operator * or % typically means that any number of characters can be put in the place of the operator, including none.
It may be possible to use this operator in places other than the name box. In the Social Security Death Index at RootsWeb, a zip code for last residence can be entered as 614*.
In this case, all the results will have 614 as the first three numbers of the zip code. This can be a way to broaden the search geographically, without entering any other locality information.
The United States Geological Survey Geographic Names Information System (USGS GNIS) also allows wildcard operators. A search was conducted for b%ville in the state of Illinois (by choosing Illinois from the dropdown menu of states and entering b%ville as the search term). At this site, the % serves the same function as * does at other sites (such as Ancestry and RootsWeb). The search for b%ville in Illinois at the USGS GNIS site resulted in several hits, including:
Of course, it would have located Bubbaville, Illinois, if there had been such a place!
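For readers who keep their own lists of names or places, the behavior of the multicharacter operator can be imitated offline. The sketch below assumes that * and % both mean "any run of characters, including none," as described above; the helper names and the sample lists are invented for illustration and do not reflect how any particular site implements its search.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a * or % wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch in "*%":
            parts.append(".*")           # any number of characters, including none
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    return re.compile("".join(parts), re.IGNORECASE)


def wildcard_search(pattern, entries):
    """Return the entries that match the whole pattern, ignoring case."""
    rx = wildcard_to_regex(pattern)
    return [e for e in entries if rx.fullmatch(e)]


# Hypothetical sample data, not actual search results.
surnames = ["Neill", "Nelson", "Nel", "O'Neil", "Nellis"]
zip_codes = ["61401", "61455", "62301"]
places = ["Belleville", "Bartonville", "Bubbaville", "Springfield"]

print(wildcard_search("Nel*", surnames))
print(wildcard_search("614*", zip_codes))
print(wildcard_search("b%ville", places))
```

Because the operator may stand for no characters at all, Nel* matches the bare name Nel as well as Nelson and Nellis.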
The multicharacter wildcards are great, but sometimes they return too many hits or are not the most effective search tool. This is particularly true if there is a specific number of characters within a name that can vary.
The last name Kile is a good example. By far, the two main variants on this name are Kile and Kyle. If the site allows me to conduct a search for K*le (or K%le), I will get hits such as:
Some are too far off the mark to be the name I need. In this case, the main variants differ by one letter. This is where the single-character wildcards are convenient to use.
Single Character Wildcards
Results other than Kyle and Kile will be returned, but at least the number of matches has been reduced. Of course, if Kile is spelled as Coil, the name will not be located using this method.
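The difference between the two kinds of wildcard is easy to see on a small list. The sketch below uses Python's standard `fnmatch` module, in which * stands for any number of characters and ? stands for exactly one. Treating ? as the single-character operator is an assumption here, since the operator varies from site to site, and the surnames are made-up examples.

```python
from fnmatch import fnmatchcase


def matches(pattern, names):
    """Return the names that match a * / ? wildcard pattern, ignoring case."""
    p = pattern.lower()
    return [n for n in names if fnmatchcase(n.lower(), p)]


# Hypothetical index entries.
surnames = ["Kile", "Kyle", "Kettle", "Kunkle", "Kale", "Coil"]

print(matches("K*le", surnames))  # multicharacter: anything between K and le
print(matches("K?le", surnames))  # single character: exactly one letter between
```

On this list the single-character search drops Kettle and Kunkle but still returns Kale, and neither search finds Coil.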
Considering only these variants, a wildcard search could be constructed for this name as A*gust*.
The * (or %) typically does not have to be replaced by anything, so this search should catch those names that end in an “e” and those that have no letter after the final “t.” I find it helpful before conducting wildcard searches to write down as many of the name variants as possible and determine what letters each have in common. Those letters should be included in my search. Where there are differences, a wildcard operator should be placed.
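One way to check a wildcard search before running it is to test it against the written list of variants and see which spellings it would miss. The sketch below again relies on Python's `fnmatch` module and handles only the * operator; the variant spellings shown are assumed for illustration and are not a complete list for any family.

```python
from fnmatch import fnmatchcase


def uncovered(pattern, variants):
    """Return the variants that a * wildcard pattern would NOT match."""
    p = pattern.lower()
    return [v for v in variants if not fnmatchcase(v.lower(), p)]


# Hypothetical variant list for one name.
variants = ["August", "Auguste", "Agust", "Agusta"]

print(uncovered("A*gust*", variants))  # wildcard placed where spellings differ
print(uncovered("August*", variants))  # too specific: misses spellings without the "u"
```

An empty result means the pattern covers every spelling on the list; anything returned needs either a looser pattern or a separate search.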
Failing that, I conduct these searches in the surname box:
If the database is an English-language database, there will almost always be a few Smith entries.
How Many Characters?
- Read the help guide
Michael John Neill is the Course I Coordinator at the Genealogical Institute of Mid America (GIMA) held annually in Springfield, Illinois, and is also on the faculty of Carl Sandburg College in Galesburg, Illinois. Michael is the Web columnist for the FGS FORUM and is on the editorial board of the Illinois State Genealogical Society Quarterly. He conducts seminars and lectures on a wide variety of genealogical and computer topics and contributes to several genealogical publications, including Ancestry Magazine and Genealogical Computing. You can e-mail him at email@example.com or visit his website at http://www.rootdig.com, but he regrets that he is unable to assist with personal research.
Copyright 2004, MyFamily.com.