Q. I keep finding my family name spelled differently. Why does this happen and how can I make sure I’m not missing a spelling variant when I search?
A. I’ve heard many people say, “That can’t be my ancestor because we spell our name differently.” I made that mistake for many years. In my personal research, it’s turned out that Francis Johnson and Franz Janzen were exactly the same man, with the same wife and children. Beginning genealogists often disregard valid information on an ancestor simply because of the way a name is spelled.
Remember, we enjoy much more formal education than our ancestors typically did. They may not have been able to read or write English, or even to speak the language well. It’s easy to imagine how a New England town clerk could record Johnson when speaking to an older German man named Janzen, or how a census taker in the Deep South could mistakenly record Capley for the young southern belle named Kepley. National and regional dialects can also dramatically affect the way a name might be spelled phonetically. And some of our ancestors anglicized their names intentionally, while others simply preferred a new spelling. The spelling of the name doesn’t change who that person was—after all, how often has someone misspelled your name?
The secret to keeping it all straight is called Soundex, a system that codes a name by the way it sounds rather than the way it’s spelled. Developed to address the name-spelling problems of the 1880 census, Soundex has remained a valuable tool for family historians ever since. In the Soundex system, Johnson, Janzen, Johanson and Jansen are the same name, all coded J525. Sometimes a name is spelled different ways even in the same immediate family: One brother is John Smith while the other is William Smythe. Using Soundex, however, they become John and William S530.
A Soundex code always contains exactly four characters: one letter followed by three numbers. The first letter of the name becomes the first character of the code. The three numbers are drawn from the remaining letters of the name in order, using the number of each letter’s category (see chart). Some letters in a name are ignored. When adjacent letters are from the same category, the second is ignored. An example is Schmidt: Since the number 2 represents both S and C, the C is ignored; likewise the T is ignored after the D, so Schmidt is S530. The letters A, E, H, I, O, U, Y and W are also ignored except at the start of the name (so Adams is A352). If the name runs out of letters before the code has four characters, each empty space is filled with a zero. Once the four-character limit has been reached, all remaining letters are ignored.
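For readers who keep their research on a computer, the rules above can be sketched in a few lines of Python. This is only an illustration, not an official implementation. The letter groups are the standard American Soundex categories, which are assumed here to match the chart, and the function name soundex is our own. One further assumption: H and W are skipped without separating two letters of the same category, which is the convention usually followed for census indexes but is not spelled out in the rules above.

```python
# Standard American Soundex letter groups (assumed to match the article's chart).
GROUPS = {
    "BFPV": "1",
    "CGJKQSXZ": "2",
    "DT": "3",
    "L": "4",
    "MN": "5",
    "R": "6",
}
CODES = {letter: digit for letters, digit in GROUPS.items() for letter in letters}


def soundex(name):
    """Return the four-character Soundex code for a surname (illustrative sketch)."""
    # Keep only the letters A-Z; spaces, hyphens and apostrophes are dropped.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]                      # the first letter is kept as a letter
    previous = CODES.get(letters[0], "")   # its category still counts for adjacency
    for ch in letters[1:]:
        if ch in "HW":
            continue                       # assumption: H and W do not separate letters
        digit = CODES.get(ch, "")          # vowels and Y have no category
        if digit and digit != previous:
            code += digit
        previous = digit                   # a vowel resets the "previous category"
        if len(code) == 4:
            break                          # remaining letters are ignored
    return code.ljust(4, "0")              # fill empty spaces with zeros


for surname in ["Johnson", "Janzen", "Smith", "Smythe", "Schmidt", "Adams"]:
    print(surname, soundex(surname))
# Johnson J525
# Janzen J525
# Smith S530
# Smythe S530
# Schmidt S530
# Adams A352
```

Running a list of surnames from your notes through a function like this is a quick way to spot which spellings would be filed together in a Soundex index.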
You can see how our examples of Johnson, Johanson, Janzen and Jansen all become the same Soundex name, J525. Jones becomes J520. Black is B420, as is Blake. Christopher and Christian are both coded C623. Of course, you can’t assume that all people named Christopher are somehow related to people named Christian, but you can quickly see that Newsom, Newsome and Nuesom (all N250) are not so different that a relationship is impossible. Using Soundex also helps you consider possibilities such as Kristoph (K623) for Christopher (C623). The two codes differ only in their first letter, and since the letters C and K have the same numerical key of 2, it’s possible that these names are related.