Faster than a speeding bullet, more powerful than a microfilm reader, able to leap brick walls in a single bound: it’s super Web searching to the rescue. [BY TARA CALISHAIN]
All genealogists have a little Superman in them. Just like Clark Kent’s alter ego, we’re “fighting a never-ending battle for truth” — about our ancestors. If only we had his superpowers, we could trace our family trees back to the beginning of time. But genealogists do have a powerful tool right at their fingertips: the Internet. And if you make it work for you, the World Wide Web can offer a wealth of resources.
The Web’s often intimidating, though — not to mention frustrating. Who hasn’t typed an ancestor’s name into a search engine such as Google <www.google.com> only to get millions of irrelevant results? You certainly don’t have time to click through thousands of John Smiths in hopes of finding the one related to you. But if you learn the quirks of various search engines, as well as the unique language that can focus each search, you won’t have to. Use these tricks, and you’ll unlock some serious search superpowers you never knew you had.
If you’ve used search engines before, you know basically how they work: You enter a couple of keywords (or even just one), click Search and away you go. But searching this way is like driving a Ferrari at 5 mph — you don’t come close to seeing what that machine can really do. Search engines offer a lot more options that can make your research easier and more fruitful. (At last count, Google indexed more than 4 billion pages, so you want to do everything you can to narrow your search results.) We’ll concentrate on Google, since it’s the most popular search engine. But you can apply these strategies to most engines and even more-specialized genealogy databases.
Start by learning basic Boolean syntax, sometimes referred to as search-engine math. Every search engine has a default behavior for dealing with the keywords you enter. Depending on the default setting, when you type savannah georgia into a search box, the engine will find pages with both of those words (savannah and georgia) or pages with either of those words (savannah or georgia). (Capitalization doesn’t matter in Internet searches.) Most search engines, including Google, assume that you want to find pages with all of the search terms. But you can influence a search engine’s behavior by using Boolean operators to refine your search.
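To make the difference between those two defaults concrete, here is a minimal Python sketch. The three sample pages and the function names are invented for illustration; a real engine indexes billions of pages and ranks its results, which this toy ignores.

```python
# Toy illustration of a search engine's two possible defaults.
# The pages below are made up for this example.
PAGES = {
    "page1": "history of savannah georgia",
    "page2": "savannah cats and kittens",
    "page3": "georgia peach recipes",
}

def matches(text, keywords, require_all=True):
    """Return True if text contains all keywords (AND) or any keyword (OR)."""
    words = set(text.lower().split())
    # Lowercasing both sides mirrors the fact that capitalization doesn't matter.
    hits = [kw.lower() in words for kw in keywords]
    return all(hits) if require_all else any(hits)

def search(keywords, require_all=True):
    """Return the sorted names of the sample pages that match."""
    return sorted(name for name, text in PAGES.items()
                  if matches(text, keywords, require_all))

print(search(["savannah", "georgia"]))                     # "all terms" default
print(search(["Savannah", "Georgia"], require_all=False))  # "any term" default
```

With the "all terms" default only the first page qualifies; with the "any term" default all three do, which is why the same two keywords can return wildly different numbers of hits on different engines.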
The most common operators are the plus sign (+), minus sign (-) and quotation marks (“”). The plus sign specifies that a term must appear in each search result; the minus sign specifies that the term must not appear in your search results; and the quotation marks specify that search terms must appear as a phrase. So searching on selby +massachusetts returns only pages in which the words selby and massachusetts appear, whereas selby -massachusetts finds pages with selby but not massachusetts. And “robert selby” returns only pages in which the words robert and selby appear side by side in that order.
To search for either of two terms, try using the operator or, as in “birth certificate” or “death certificate.” Note that with Google, you must capitalize OR, or you can use the pipe symbol (|), found above the Return key (for example, “birth certificate” | “death certificate”). Neither Google nor Yahoo! <www.yahoo.com> uses a near operator, which finds search terms near one another. Near is particularly useful when you’re trying to find first names in conjunction with surnames, or surnames in conjunction with place names. But you can fake a near operator with Google by using a tool called GAPS (which is short for Google API Proximity Search) at <www.staggernation.com/cgi-bin/gaps.cgi>. GAPS lets you specify that a search term must be within one, two or three words of another term. Want to find your Smith surname near a mention of Walla Walla, Wash.? GAPS is the place to do it.
Now that you’ve learned basic Boolean, let’s look at a few ways you can use it. Suppose you have Dupree ancestors who lived in Onslow County, NC. You’ve seen their name spelled Dupree, Dupre, Dupray and Dupreen, so you want to find records with any of those variations. Try using the or operator. Your Google search might look like this: onslow genealogy “north carolina” dupree OR dupray OR dupre OR dupreen. You’re telling Google to find pages that contain the words onslow, genealogy and north carolina, plus one spelling of the surname Dupree.
You also can use Boolean to weed out results containing certain words. For example, say you have an ancestor named Lloyd Carr, but each time you search for his name, you get thousands of pages devoted to the University of Michigan’s football coach. You can eliminate some of those irrelevant results by excluding the word Michigan from your search: “lloyd carr” -michigan. (Obviously, this isn’t going to work if your Lloyd Carr lived in Michigan. Instead, exclude the word football.) On a second review of your search results, you see pages about a musician named Lloyd Carr, who’s also not your ancestor. So you can eliminate the word musician from your search, too: “lloyd carr” -football -musician. Keep eliminating words until you home in on your ancestor.
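If you run many searches of this kind, a few lines of Python can assemble the query strings for you. This is only a sketch: the function name build_query and its parameters are made up for this example, and it simply produces text to paste into a search box.

```python
def build_query(required=(), any_of=(), excluded=()):
    """Assemble a Google-style query string.

    required: words or phrases that must all appear
    any_of:   alternatives joined with OR (capitalized, as Google requires)
    excluded: words to weed out, each prefixed with a minus sign
    """
    def quote(term):
        # Multi-word terms are wrapped in quotation marks to form a phrase.
        return f'"{term}"' if " " in term else term

    parts = [quote(t) for t in required]
    if any_of:
        parts.append(" OR ".join(quote(t) for t in any_of))
    parts.extend(f"-{quote(t)}" for t in excluded)
    return " ".join(parts)

# The Dupree spelling variants in Onslow County.
print(build_query(required=["onslow", "genealogy", "north carolina"],
                  any_of=["dupree", "dupray", "dupre", "dupreen"]))

# Lloyd Carr, minus the football coach and the musician.
print(build_query(required=["lloyd carr"], excluded=["football", "musician"]))
```

Note the space before each minus sign in the second query; without it, the engine would read the excluded words as part of one hyphenated term.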
Boolean searching’s fairly simple: You add a word, subtract another word and find words that are near one another. But even when you’re doing all those things, you’re still searching a very large pool of content — all types and parts of Web pages. You often can conduct searches more efficiently by restricting results to a certain kind or group of Web pages, or even a specific area of a page. That’s where special syntaxes come in. Different search engines offer different special syntaxes, but we’ll take a look at the most common ones here:
Title: Restricts searches to Web pages’ titles.
URL: Restricts searches to pages’ URLs (or Web addresses).
Site: Restricts searches to either a single domain (for example, archives.gov) or a top-level domain (.gov, .com, .edu, .org).
Link: Restricts searches to pages that link to a specified site.
Again, search engines have their own quirks. For a breakdown of your favorite engine’s special syntaxes, visit the advanced-search pages (on the engine’s home page, look for links to Advanced Search, Search Help or Search Tips). You’ll find Google’s at <www.google.com/help/refinesearch.html> and Yahoo!’s at <help.yahoo.com/help/us/ysearch/basics/basics-08.html>.
Let’s go back to the Dupree example. You want to find information on the Dupree surname, and you want to be extra sure that your search results focus on the name Dupree rather than just mention it in passing. To do this, try limiting your search to pages with Dupree in the title. With Google, which uses the syntax intitle:, you’d type intitle:dupree into the search box (with no space after the colon). To limit your search even more, try adding keywords related to genealogy, for example, intitle:dupree genealogy or intitle:dupree “family history.” Google will return only pages that have dupree in the title and also contain the word genealogy (or the phrase family history).
When you run this search, your results will include message boards, record transcriptions and other genealogical resources dedicated to the Dupree surname. The more unusual a name, the less you’ll need this kind of search. On the other hand, if you’re researching a common surname or a surname that’s also a common noun (Farmer, Archer or Hawk), it can help a lot. The search intitle:farmer genealogy surname can give you far more targeted results than you might think. (I added the keyword surname to make sure that the results focused on the Farmer name, rather than farmers in general.)
Site syntaxes (such as site: in Google) also can help narrow your results, especially if you’re looking for genealogical resources for a specific location. Perhaps you seek Dupree records in Georgia. To find resources provided by the state of Georgia, search on site:ga.us (ga is Georgia’s postal code, and us stands for United States). You can apply this search to any state; just substitute its postal code.
Boolean for Beginners
Still getting thousands of irrelevant hits? Here’s a quick lesson in search-engine math.
1. First, choose a search engine, such as Google, and search on just your ancestor’s name (for example, lloyd carr). The search engine will find any page with the words lloyd and carr. In this case, Google found 183,000 pages, far too many to wade through.
2. Next, try putting quotation marks around your ancestor’s name: “lloyd carr.” This time, Google turns up 12,500 pages in which the words lloyd and carr appear side by side.
3. Now add a search term that’s relevant to your ancestor’s life, such as the state where he lived: “lloyd carr” +michigan. The plus sign (+) tells the search engine to return only the matches for “lloyd carr” that also include michigan. That’s 10,500 pages.
4. Let’s say you’ve seen your ancestor’s name spelled both Carr and Car. To find matches for both spellings, try using the or operator (in Google, you must capitalize the word or): “lloyd carr” OR “lloyd car.” Of course, you’ll get more results this way, but you won’t miss any references containing the variant spelling.
Now add the surname Dupree to your search: dupree site:ga.us, and Google will find Georgia state pages that contain the name Dupree. You’ll see that this search is too broad. Add a keyword related to genealogy for more-targeted results: dupree genealogy site:ga.us. This search will give you only two results, so you might want to change keywords, or start by finding general genealogy resources in Georgia, with a search such as genealogy site:ga.us.
You also can use the site syntax to search multiple sites at one time. Just make sure you use the or operator. For example, to search The Genealogy Home Page <genhomepage.com>, RootsWeb <www.rootsweb.com> and Genealogy.com <www.genealogy.com> for the Dupree surname, you’d key in dupree site:genhomepage.com OR site:www.rootsweb.com OR site:www.genealogy.com. In one fell swoop, you’ve searched three substantial sites. Perhaps. The idea of searching three sites at once sounds appealing, but there’s a caveat: You’re using an external search tool, which might not search an entire site. That’s because Google might not have a complete index of RootsWeb’s, Genealogy.com’s or another site’s pages. Here are three reasons that happens:
1. The site is dynamically generated. After you run a query on a Web site (say you search for a book on Amazon.com <www.amazon.com>), the site generates a results page that fits your unique search — called a dynamic page because the content changes depending on how or when you access it. Google doesn’t guarantee it can index all of these dynamic pages on a site because that would take a tremendous amount of time and resources.
2. The site contains a database accessed only by running queries at that site. A good example of this is the Arizona Genealogy Birth and Death Certificate site <genealogy.az.gov>. A search engine’s software can’t go to the site and run queries, so none of the data will be indexed by a search engine.
3. The site doesn’t want its material indexed. A webmaster can attach a small text file to her site that tells search engines, “Please don’t index any of the content on this site.” In this case, you’ll find no content from the site on any search engine that obeys the robots.txt standard (all the popular ones do).
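Reason 3 is easy to see in action with Python’s standard urllib.robotparser module, which reads robots.txt rules the way a well-behaved crawler would. This is a minimal sketch: the crawler name ExampleBot and the example.com address are placeholders, not a real search engine or genealogy site.

```python
from urllib.robotparser import RobotFileParser

def may_crawl(robots_txt, url, agent="ExampleBot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`.

    "ExampleBot" is a made-up crawler name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Asks every crawler ("*") to stay out of the entire site ("/").
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# An empty Disallow line places no restrictions on crawlers.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

url = "https://www.example.com/records/dupree.html"  # placeholder address

print(may_crawl(BLOCK_ALL, url))  # False: the page stays out of the index
print(may_crawl(ALLOW_ALL, url))  # True: the crawler may read the page
```

The file is only a request. It keeps content out of engines that choose to honor it, which is why the caveat about obeying the standard matters.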
Using special syntaxes, you can restrict search results to a specific kind or group of Web pages, or even a specific area of a page. Let’s take a look at the two most common ones:
Site: Looking for genealogical resources for a specific location? Try a site search. Google uses the syntax site: for these types of searches. (To learn about other search engines’ syntaxes, visit their advanced-search pages.) To find references to the state of Georgia on RootsWeb <www.rootsweb.com>, search on georgia site:www.rootsweb.com.
Title: To find pages devoted to a certain ancestor or surname, run a title search. This will turn up only pages with that name in the title. So to find pages focused on the Dupree family, for instance, you’d type intitle:dupree into Google. You’ll get results such as In Memory of Dupree and Nathalie Dupree’s Web Kitchen.
To limit that search even more, add genealogy-related keywords, for example, intitle:dupree +genealogy. Now the first hit is Ancestor Guide: Dupree Genealogy and Surname Search. Not bad!
If reason 2 or 3 holds true, you’ll know fairly quickly that material on a site isn’t being indexed by the search engine. In the case of reason 1, you’ll just have to see if the material indexed by Google (or another search engine) is sufficient for your genealogy needs. If it is, great. If not, you may have to go to the site and run a search.
Why am I giving you a search tip with such a large caveat? Because even when it doesn’t work perfectly, a multiple-site search can turn up a lot of information, especially if the engine offers search options not available on the original sites.
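A multiple-site search like the one described earlier can be assembled with a short Python sketch. The function name multi_site_query is invented for this example, the site: operator is written with no space after the colon, and the final step assumes that Google accepts the query in the q parameter of its /search address.

```python
from urllib.parse import urlencode

def multi_site_query(keywords, sites):
    """Join several site: restrictions with OR after the keywords."""
    site_part = " OR ".join(f"site:{s}" for s in sites)
    return f"{keywords} {site_part}"

query = multi_site_query(
    "dupree",
    ["genhomepage.com", "www.rootsweb.com", "www.genealogy.com"],
)
print(query)

# Assumption: the query travels in the "q" parameter of /search.
# urlencode takes care of escaping the spaces and colons.
url = "https://www.google.com/search?" + urlencode({"q": query})
print(url)
```

Swapping in a different list of sites is a one-line change, which makes it easy to rerun the same surname search against whichever sites you rely on most.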
As you’ve seen from earlier examples, there’s no reason you can’t combine Boolean operators with special syntaxes. You can exclude words from title searches, specify one site or another; the possibilities are endless. Bear in mind, though, that some special syntaxes may have their own restrictions. For example, you can’t use Google’s link: syntax with any other special syntaxes or operators. (The advanced-search instructions, mentioned earlier, usually alert you to these limitations.) But for the most part, special syntaxes and Boolean operators work well together.
I recently wrote a book on Internet research called Web Search Garage. In it, I discuss 10 principles that can help you rethink your search strategies. A couple of these will have powerful effects on your genealogical research:
The Principle of Nicknames: The idea is that proper names can be expressed several different ways in a search. For instance, I’ve found pages referring to Los Angeles as LA, LosAngeles and even LaLa. You as a genealogist run smack into this principle every day. Should you search for a name forward (John Smith) or backward (Smith John)? Should you assume a middle name (John David Smith) or just an initial (John D. Smith)?
You can anticipate all of these possibilities by using the Boolean operator or and Google’s full-word wildcard feature. It’s not really a syntax, and it’s not really Boolean, but it’s really handy. A full-word wildcard allows you to insert an asterisk (*) into a query to substitute for any word. So if you search for “three * mice,” you’ll find references to “three blind mice,” “three blue mice,” “three green mice” and so on. This helps with the middle-name problem. The search “john * smith” will match both John David Smith and John D. Smith. It won’t, however, match John Smith. There has to be a word where you’ve inserted the full-word wildcard; Google won’t just ignore it.
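The one-word-per-asterisk rule can be imitated with a regular expression. The Python sketch below only mimics the behavior described above on a few sample names; it is not how Google implements the feature, and the function name wildcard_to_regex is made up for this example.

```python
import re

def wildcard_to_regex(phrase):
    """Turn a phrase with * placeholders into a regex where each * is exactly one word."""
    parts = []
    for token in phrase.split():
        # \S+ stands for one unbroken run of non-space characters, i.e. one word.
        parts.append(r"\S+" if token == "*" else re.escape(token))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

pattern = wildcard_to_regex("john * smith")
for name in ["John David Smith", "John D. Smith", "John Smith"]:
    print(name, "->", bool(pattern.search(name)))
```

The first two names match and the third does not, because the pattern insists on some word between john and smith.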
For finding names forward and backward, you’ll have to use your Boolean. If you’re looking for John Smith, just make sure you include both iterations in your search: “john smith” OR “smith john” will cover your bases. Wouldn’t it be terrible if you missed some choice bit of information because you didn’t search for the name in the right order?
Wait, I see a question from one of the cool kids in the back. The cool kid is asking why not just search for the name without the quotes, meaning that any version of it would be found? If you’re looking for an unusual enough name, say Oceleo VonSneezeguard, that’ll work fine, because you won’t get a lot of false-positive results. But if the name is at all common, you’ll find you get too many irrelevant results to make the search worthwhile. Searching for the name forward and backward, in quotes, will keep your results much more focused.
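Typing both name orders by hand gets tedious, so here is a small Python sketch that writes the query for you. The function name and its include_middle option are invented for this example; the option adds the full-word wildcard form for names that might carry a middle name or initial.

```python
def name_variants_query(first, last, include_middle=False):
    """Build a quoted query covering a name written forward and backward."""
    variants = [f'"{first} {last}"', f'"{last} {first}"']
    if include_middle:
        # The asterisk stands in for exactly one middle name or initial.
        variants.append(f'"{first} * {last}"')
    return " OR ".join(variants)

print(name_variants_query("john", "smith"))
print(name_variants_query("john", "smith", include_middle=True))
```

Because every variant stays inside quotation marks, the results remain focused even for a common name.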
The Principle of Unique Language: Certain fields generate their own unique words and phrases. The medical profession is a good example: If you’ve ever reviewed an old death certificate, you’ve probably seen some strange words. Where can you learn what those terms mean? Turn to your favorite search engine.
Let’s take an example: Your third-great-grandfather died of milk sick. What does milk sick mean? Try a simple search: “milk sick” means. (Don’t use quotes around the full query.) You might be able to pull the answer out of the first page of results. But you can narrow those hits even more by adding a keyword that indicates you’re searching for a medical term. The word medical will work fine in this case: “milk sick” means medical. Since you’re looking for old terms, you also could add the keyword archaic. The search “milk sick” means archaic will give you extremely targeted results. If you search for an archaic medical term such as milk sick and a clarifying word such as genealogy, you’ll find your results will be very genealogy oriented.
Genealogy has its own set of unique words: surname, ancestry, census. Even abbreviations such as SSDI (short for Social Security Death Index) and GEDCOM (Genealogical Data COMmunications) are part of the genealogical vocabulary. Of course these words aren’t used exclusively by genealogy sites, but adding them to your searches will focus your results considerably.
See, you don’t even need Superman-like powers to fight your battle for ancestral truth. With your new bag of Web-searching tricks, you’ll find family history facts in no time. Happy hunting!
TARA CALISHAIN is the author of Web Search Garage (Prentice Hall, $19.99) and co-author of Google Hacks (O’Reilly, $24.95). She also edits the weekly search-engine newsletter ResearchBuzz <www.researchbuzz.com>.
How DNA changed my genealogy…
“The Nitz’s: Reunited through DNA”
By the summer of 1999, I’d been an armchair genealogist for the best part of 3 decades. As a youngster I was the kid who always had a genealogy prepared prior to a teacher asking for us to do a ‘special family project’.
Most of my lines were well researched prior to my first visit to Salt Lake City in 1979. Further progress was made during a subsequent trip to the then federal records center located in Bayonne, NJ. However, I had some holes, and several dead ends in my family research…and as I tell friends ‘it goes with the territory, the better researcher you are the sooner you’ll hit a dead end.’
As you know, success breeds excitement and failure breeds frustration, boredom, or worse….at least for me. I have put down my research and resumed it again a few times per decade since my 1979 genealogy summer that culminated with a trip to Europe to look for remnants of family…or at least family villages.
In the summer of 1999 I was updating my mother’s father’s lineage, as she hadn’t seen most of her cousins since she was a youngster in the 30’s, and was coming soon to visit her grandchildren… While nearing the completion of the USA NITZ family I searched a familiar web site and found a person in Argentina searching for the same name…and claiming one city in common with the genealogy that I had recently updated. Within a few seconds I composed an email and let it fly…not to California where nearly all of my Nitz family went during the depression, but all the way to Buenos Aires! Over the next several weeks and despite great efforts on the part of the cousins from ‘down under’ we were not able to link the families by paper…that singular coveted item that all genealogists consider sacrosanct! Maybe I should take up gardening?
A few nights later I was walking the dog, late at night, and I recalled two different studies that had used a part of our DNA for lineage confirmation and authentication. Upon returning home I searched the web and found both articles, which dealt with the male-inherited Y-Chromosome. One was on a group of Jews, called Cohanim, who claimed to be direct male descendants of Aaron, the brother of Moses. The other story was of great interest, as it dealt with early American History, slavery, and the Jefferson family of Virginia.
Soon I was reading the comments of those who agreed with and disputed the finding of each of these two original Y-Chromosome papers…I was learning what one could and could not expect to learn from DNA testing. For example, I discovered that:
a) all males have a Y chromosome and they only receive their Y from their father, who received his Y from his father!
b) the women I was in contact with from Argentina could facilitate, but not contribute to, Y-DNA testing (since females don’t have it), and that was assuming I could find a lab willing to deal with a genealogist!
c) males obtain their surnames AND their Y chromosome from their fathers; therefore men make great candidates for genetic reconstruction.
In my search for a testing facility I discovered that no commercial Y-DNA testing lab existed although dozens of firms were conducting paternity tests (which I found used an entirely different portion of our DNA). The upshot of this was a challenge offer from the University of Arizona’s Michael Hammer (coauthor of the 1997 Cohanim study). He volunteered to test 2 dozen males, of my choice, as a proof of concept, with the condition being that IF this technique worked I’d start a commercial enterprise, with them providing the science and me organizing a company to deal with the clients on the front end.
After 6 weeks of collecting samples and 90 days of waiting for the lab to get around to processing them, the results came back. Of the 24 men tested, the twins matched, as did both other sets of men who had paper trails indicating that they shared a common male ancestor. My two Nitz volunteers, from California and from Argentina, were an exact match as well…even better, no random matches occurred and therefore of the 24 samples 4 sets of 2 matched each other and all the rest (16 men) matched no one, which according to the Anthropologists in Arizona clearly showed that the samples came from unrelated males.
This is how Family Tree DNA got started. All those results showed me that DNA testing, while not a replacement for the traditional tools of our beloved hobby, is the newest tool in the arsenal of the prepared genealogist. Today, 4 years later, over 20,000 individuals have had their DNA tested for genealogy purposes, and more than 1,000 surname projects have been established.
I’ll be happy to tell you more about it if you call me at 713-868-1438 or send me an e-mail at [email protected]
President and Founder
Family Tree DNA