FamilySearch has Duplicates?

There are approximately 8 billion records of individuals on the public FamilySearch Family Tree (hereinafter referred to as FS Tree). Some of them are duplicates: records that refer to the same person but carry slightly different information. Some duplicate records may even contain identical information.

How do the duplicates get there? It’s easy. Either one person enters a record for a given person two or more times without checking to see if the information is already there, or two or more people enter information on the same person without checking to see if the information is already there.

Is that a problem? They have enough storage space for 8 billion records, so it is reasonable to believe that they have planned ahead, and have more storage space available. If they run out of space, they can always buy more hardware.

So why is it important to find and remove duplicates? Let’s say that Jonathan V. Doe was born 3 May 1721 in the town of Fordwich, Kent, England, population 381. Let’s also say that over time 416 people enter Jonathan into the FS Tree. Now let’s say that 364 of those 416 people checked to see if Jonathan’s information was already there. The first one found that there was no record for Jonathan, and the other 363 found that first record and either used it as it was or added to the information known about Jonathan. They did not use, or merge, any other records for Jonathan.

That leaves 52 people who, at some time, each entered a new record for Jonathan V. Doe, born 3 May 1721 in the town of Fordwich, Kent, England, population 381. Some entered him as Jon V. Doe, some as John V. Doe, some as Jonathen Doe, some as John Doe, some as John, some as Jon. Some entered the date as 8 Feb 1721, some as 3 Mar 1721, and some were sure that Fordwich was really in Shropshire.

Still, that is only 52 extra records in a database holding 8 billion. Why is that important? Here’s why. There are now 53 records for Jonathan. There are 53 opportunities for people to extend the line with both ancestors and descendants. There are up to 53 individuals or groups of people doing the same research, but for different versions of Jonathan. Those 53 groups could have been spending their time productively rather than repeating research which would have been available to them if only they had known that it was already there. Jonathan is probably not the only person on the FS Tree with more than one record referring to him. Duplicate records invite duplicate (wasted) research and wasted space.
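For anyone who wants to check the arithmetic in the Jonathan example, here is a small Python sketch. The numbers are the made-up ones from the story above, and the variable names are mine, chosen only for illustration.

```python
# Hypothetical numbers from the Jonathan V. Doe example.
total_contributors = 416   # everyone who entered Jonathan into the FS Tree
checked_first = 364        # those who looked for an existing record first

# The first checker created one record; the other 363 reused it.
shared_records = 1

# Everyone who did not check created a brand-new record.
extra_records = total_contributors - checked_first

total_records = shared_records + extra_records
print(extra_records, total_records)  # 52 53
```

One shared record plus 52 unchecked entries gives the 53 versions of Jonathan discussed above.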

As we reduce the number of duplicate records by combining multiple versions of the same individual, we slow down the increase in the size of the FS Tree, and we increase the lineage-linked aspect of the records. That allows us to take advantage of research done by other people by connecting our research to theirs.

Incline Software has a video where Gaylon Findlay describes how a person can create a database which he calls DuplicateFinder.aq. This file is used to help find duplicate FS Tree records corresponding to the records in a local family file. I don’t know the name of your family, so I will refer to your family file as (Family).aq.

Before I show this video to you, let me emphasize that under NO circumstances should you actually use your (Family).aq file instead of the DuplicateFinder.aq file. All links to the records in FS Tree will be removed from the DuplicateFinder.aq file you are using. If you use your (Family).aq file instead of the DuplicateFinder.aq file you will destroy all of the links you had to the FS Tree. The only two ways to recover that information are by restoring a backup of the (Family).aq file or by re-linking records individually.

Here is how Gaylon created the DuplicateFinder.aq file.

In my posting ‘Find the Dead‘ I described my Centerville.aq file. This file is not the typical (Family).aq file because I have a very high percentage of living people there. It would be an unnecessary waste of time and internet traffic for me to ask AQ to send queries for all of those living people up to FamilySearch just to have them sent back down to me as living people, about whom they can supply no information. So I decided to take a different approach to creating my own DuplicateFinder.aq file.

Stats for the Centerville.aq file.

If you intend to follow along with this process using your (Family).aq file, do this NOW.

Open your (Family).aq file, and MAKE A BACKUP NOW.

Here is what to do after your backup.

From your (Family).aq file click on the ‘File’ tab and select ‘Save As…’ from the menu.

Name the new file. This is an intermediate file which will later be discarded, so name it appropriately. I chose the name ‘Intermediate’ because it is meaningful to me. Click the ‘Save’ button.

STOP

Close the (Family).aq file and open the intermediate file.

Click on the ‘Tools’ tab and select ‘Preferences’ from the menu.

Click on the ‘Database’ tab. Enter the name of your interim file in the ‘Title’ box. I do this so that the top line of the AQ screen will show the Title of the interim file, not the Title of the (Family).aq file. I like all the visual clues I can get.

Off the subject for just a moment, if you configure the name of your backup files as illustrated above, the backups will be grouped in the File Explorer window by (Family) name, and within that, they will be listed chronologically by the date and time of the backup. If you keep multiple copies of backups it’s nice to see them listed in order. The most recent backup of any file will be at the top or bottom of its (Family) group. Your choice.

Click the ‘OK’ button.

In the interim file click on the ‘FamilySearch’ tab and select ‘Unlink Individuals(s)…’ from the menu.

Check again by looking at the bottom right of your screen, and be sure that you are not in your (Family).aq file. You don’t want to make a mistake here. Click on the ‘All Individuals in File’ radio button, then click on the ‘Unlink’ button.

AQ warns you that you are about to remove all links to the FS Tree, and asks if you are sure. Click the ‘Yes’ button.

AQ confirms that the links in the records were removed, and tells you how many links there were. Click on the ‘OK’ button. Your interim file now has no links to the FS Tree.

Click on the ‘Search’ tab at the top of the screen and select ‘Advanced Filter/Focus…’ from the menu.

We are about to remove the records for all individuals whom Ancestral Quest does not recognize as ‘Deceased’ because we don’t want to try to process records which we know FamilySearch will not allow us to know about. Be certain that the ‘Selections by Relationship’ drop down box is set to ‘Individual’ then click on the ‘Define’ button.

Scroll down the ‘Possible Fields’ list and select ‘Deceased’ (1) from the menu. Push (2) it into the ‘Current Filter’ and select the ‘Is not’ (3) radio button. Click on ‘OK’ (4) in the ‘Deceased Field Filter’ window.

Click on the ‘OR’ button, then continue.

In my file I have used the word ‘Dead’ in the ‘Death Date’ to indicate that I do not know whether the person is living or dead. If I know the person is not living, but I don’t know the ‘Death Date’ I enter ‘Deceased’ in that field. AQ accepts either value in the ‘Death Date’ field, and in response to either it places the gray FamilySearch icon on the Family and Pedigree views. For any person with ‘Dead’ in their Death Date, I can click on the icon and see if FS Tree believes they have died. Since almost all of these will be living, I don’t want those records in my interim file. I will add them to this ‘Current Filter’ to be removed with those who are not identified as Deceased by Ancestral Quest.

Scroll down the ‘Possible Fields’ list and select ‘Death Date’ (1) from the menu. Push (2) it into the ‘Current Filter’ and select ‘Contains’ (3) from the drop down box. Enter ‘Dead’ into the ‘Date’ box (4). Click on the ‘OK’ (5) button in the ‘Death Date Field Filter’ window.

Click on the ‘OR’ button, then continue.

Select ‘Name’ (1) from the ‘Possible Fields’ menu. Push (2) it into the ‘Current Filter’ and select the ‘Surname Only’ (3) radio button. Select ‘Does not exist’ (4) from the drop down box. Click on the ‘OK’ (5) button in the ‘Name Field Filter’ window, then click on the ‘OK’ (6) button in the ‘Search for Individual/Marriage’ window, then continue.

Click on the ‘Show results only’ (1) checkbox. Click on the ‘Delete’ (2) button. The Yes/No (3) box will appear. Click on the ‘Yes’ (4) button. Click on the ‘OK’ (5) button in the ‘Search for Individual’ window.
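The three conditions joined with ‘OR’ in the filter above can be pictured as one yes/no test applied to each record. Here is a rough Python sketch of that logic. It is not how Ancestral Quest works internally: the record layout (`deceased`, `death_date`, `surname`) is hypothetical, and I am assuming the ‘Contains’ test ignores upper and lower case.

```python
def should_remove(person):
    """Return True if the record matches any of the three OR-ed conditions."""
    # Condition 1: the program does not consider the person deceased.
    not_deceased = not person.get("deceased", False)
    # Condition 2: the Death Date field contains the word 'Dead'
    # (case-insensitive match is an assumption).
    death_date_is_dead = "dead" in person.get("death_date", "").lower()
    # Condition 3: the surname does not exist.
    no_surname = not person.get("surname", "").strip()
    return not_deceased or death_date_is_dead or no_surname


# Hypothetical sample records, for illustration only.
people = [
    {"rin": 1, "surname": "Doe", "deceased": True, "death_date": "3 May 1790"},
    {"rin": 2, "surname": "Doe", "deceased": False, "death_date": ""},
    {"rin": 3, "surname": "Doe", "deceased": True, "death_date": "Dead"},
    {"rin": 4, "surname": "", "deceased": True, "death_date": "Deceased"},
    {"rin": 5, "surname": "Roe", "deceased": True, "death_date": "Deceased"},
]

kept = [p["rin"] for p in people if not should_remove(p)]
print(kept)  # [1, 5]
```

Note that a Death Date of ‘Deceased’ does not match ‘Dead’, because the letters d-e-a-d never appear together in ‘Deceased’. That is why record 5 survives while record 3 is removed.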

That removed the living from the file, but that leaves us with gaps in the RIN numbers where records of the living were removed. It will be more convenient to have consecutive RIN numbers, so we must renumber the records.

Stats for the Intermediate.aq file.

I now have a DuplicateFinder.aq file with RINs ranging from 1 to 10955. This will allow me to more efficiently remove duplicates from FS Tree.

Since I have fewer than 11,000 records to work with in a file with 18964 RINs, I know I don’t want to do it all at once. I will break it into smaller groups of 50 records. I will also open a DupFindCount.txt file where I will record the starting and ending RINs for the next batch to put through the system. That way, I can stop at any time, including in the middle of a 50-record batch, and come back later to pick up where I left off. If I didn’t complete the last batch I worked on, I will just run it through again.
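I keep my batch list by hand, but the starting and ending RINs for each 50-record batch could also be worked out with a short Python sketch like the one below. The function name `rin_batches` is mine, and the range of 1 to 10955 assumes the consecutively numbered DuplicateFinder.aq file mentioned above.

```python
def rin_batches(first_rin, last_rin, size=50):
    """Yield (start, end) RIN pairs covering first_rin..last_rin inclusive."""
    start = first_rin
    while start <= last_rin:
        # The final batch may be shorter than the others.
        end = min(start + size - 1, last_rin)
        yield start, end
        start = end + 1


batches = list(rin_batches(1, 10955))
print(len(batches))   # 220
print(batches[0])     # (1, 50)
print(batches[-1])    # (10951, 10955)
```

Each pair is what I would write into DupFindCount.txt before starting a batch. If a batch is interrupted, the same pair is simply used again.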

Here is how I will get rid of RINs that are in my Deleted RIN List. This is a list of formerly used RINs of deleted records. The RINs are waiting to be reused. I don’t want any wasted space in my 50-record batches, so I will compact my Intermediate.aq file into a DuplicateFinder.aq file with no unused RINs.

Click on the ‘Export’ icon at the top left of the screen. We need only vital statistics, so let’s not export unneeded information. Uncheck all items in the ‘Include’ box, click on the ‘All’ radio button, then click on the ‘Export’ button.

Click the ‘OK’ button.
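The idea behind compacting the file can be shown in a few lines of Python. This is only a sketch of the concept: it assumes the surviving records keep their relative order and are renumbered from 1, which is my assumption and not a description of what Ancestral Quest does internally. The function name `compact_rins` and the sample numbers are hypothetical.

```python
def compact_rins(used_rins):
    """Map each surviving RIN to a new consecutive RIN starting at 1."""
    return {old: new for new, old in enumerate(sorted(used_rins), start=1)}


# Hypothetical file where RINs 3, 4, 6, 7 and 8 were deleted.
surviving = [1, 2, 5, 9, 10]
mapping = compact_rins(surviving)
print(mapping)  # {1: 1, 2: 2, 5: 3, 9: 4, 10: 5}
```

After compacting, the highest RIN equals the number of records, so every 50-record batch is full except possibly the last one.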

Stats for DuplicateFinder.aq.

Here is a link to the full ‘Find Duplicate Records on FamilySearch‘ video, where you can learn how to make those removals for FS Tree records matching your ancestors.

Here is a link to the ‘Ancestral Quest Learning Center‘ webpage.

4 Comments

  1. Ralph Layne

    I think this is great. However you only deal with one match. What if there are multiple duplicates, as in the case of my great grandfather there are over 30 duplicates. How do you handle this?

    • When you make your request, Find-A-Grave returns either a screen telling you that there were no matches or a screen listing one or more matches, and from there you select the appropriate (or only) match.

  2. Not only duplicates, but also so many errors, especially when linking emigrants with their family in their home country…

    Luckily it is still great for finding inspiration of possible sources that could help in the search 🙂

    • I guess I was lucky. My maternal and paternal lines emigrated relatively recently. My paternal grandparents were born in the UK, but my dad was born here, so that side was simple. On Mom’s side it was nearly that easy.
