DNA Color Clustering: Dealing with 3 Types of Overlap
What Is Overlap?
Overlap is when a DNA relative is sorted into more than one Color Cluster. In the example below, Daisy sorts into both the Blue and Orange Clusters.

Daisy is showing overlap by being in both the Blue and Orange Clusters.
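If you keep your chart in a script rather than a spreadsheet, overlap is easy to spot programmatically. The following is only a minimal sketch: the `clusters` dictionary, the match names other than Daisy, and the `find_overlap` helper are all hypothetical and are not part of any DNA testing site's tools.

```python
# Hypothetical chart: each Color Cluster maps to the set of matches sorted into it.
clusters = {
    "Blue": {"Alice", "Bob", "Daisy"},
    "Orange": {"Carl", "Daisy", "Eve"},
}


def find_overlap(clusters):
    """Return {match: sorted list of clusters} for matches in more than one cluster."""
    membership = {}
    for color, people in clusters.items():
        for person in people:
            membership.setdefault(person, []).append(color)
    # Keep only the matches that landed in two or more clusters.
    return {p: sorted(colors) for p, colors in membership.items() if len(colors) > 1}


print(find_overlap(clusters))  # {'Daisy': ['Blue', 'Orange']}
```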
Three types of overlap have been noticed: extreme overlap, isolated overlap, and heavy overlap.
What Is Extreme Overlap?
Extreme Overlap is where a lot of overlap occurs, often over many columns of color. Extreme overlap likely represents endogamy or pedigree collapse where ancestors are closely related. Unfortunately, at this time there is no known method for “separating” cousins with this kind of overlap.

All Color Clusters are overlapping except for the Yellow Cluster.
What Is Isolated Overlap?
Isolated Overlap is where a “clean sort” has been created and most of the people are in only one Color Cluster. However, one or a few cousins might end up in two or more clusters. These people are likely related to the test taker in more than one way. I will talk about these cases in my next post, but it is best to remove these individuals from our chart so they don’t cause us confusion.

Only Daisy is in more than one Color Cluster, both Blue and Orange.
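Setting aside the few people who show isolated overlap could be sketched like this in Python. This is an illustration under assumed data: the `clusters` dictionary, the names other than Daisy, and the `remove_overlapping` helper are hypothetical.

```python
# Hypothetical chart with one match (Daisy) sorted into two clusters.
clusters = {
    "Blue": {"Alice", "Bob", "Daisy"},
    "Orange": {"Carl", "Daisy", "Eve"},
    "Green": {"Frank"},
}


def remove_overlapping(clusters):
    """Return (cleaned chart, set of removed matches).

    A match is removed from every cluster if it appears in more than one.
    """
    counts = {}
    for people in clusters.values():
        for person in people:
            counts[person] = counts.get(person, 0) + 1
    removed = {p for p, n in counts.items() if n > 1}
    cleaned = {color: people - removed for color, people in clusters.items()}
    return cleaned, removed


cleaned, removed = remove_overlapping(clusters)
print(removed)  # {'Daisy'}
```

Keeping the removed names in a separate set means they are not lost and can be revisited later as possible double relationships.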
What Is Heavy Overlap?
Heavy Overlap is where two or more Color Clusters have a lot of cousin overlap. These individuals are likely all related to one branch of the test taker’s family, but are sharing different pieces of DNA.

The Blue, Yellow, and Purple Clusters show heavy overlap & should be “collapsed” or combined.
The Orange and Green Clusters show heavy overlap & should be “collapsed” or combined.
In this case, it is usually best to “collapse” or combine these columns into one column to make the sort easier to understand.

The Blue, Yellow, and Purple Clusters have been combined into the Blue Cluster
The Orange and Green Clusters have been combined into the Orange Cluster
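Collapsing columns amounts to taking the union of the matches in the clusters being combined. Here is a minimal Python sketch of that idea. The match names, the `groups` mapping, and the `collapse` helper are hypothetical, and deciding which clusters overlap heavily enough to combine is still a judgment call made by looking at the chart.

```python
# Hypothetical chart with five clusters, some of which share many matches.
clusters = {
    "Blue": {"Ann", "Ben", "Cal"},
    "Yellow": {"Ben", "Cal", "Dee"},
    "Purple": {"Ann", "Dee"},
    "Orange": {"Eli", "Fay"},
    "Green": {"Fay", "Gus"},
}

# Which clusters to fold into which kept color (chosen by eye from the chart).
groups = {"Blue": ["Yellow", "Purple"], "Orange": ["Green"]}


def collapse(clusters, groups):
    """Merge each listed cluster into its kept color and drop the merged columns."""
    merged_away = {c for kept, colors in groups.items() for c in colors if c != kept}
    result = {
        color: set(people)
        for color, people in clusters.items()
        if color not in merged_away
    }
    for kept, colors in groups.items():
        combined = set(clusters.get(kept, set()))
        for color in colors:
            combined |= clusters.get(color, set())
        result[kept] = combined
    return result


collapsed = collapse(clusters, groups)
print(sorted(collapsed))  # ['Blue', 'Orange']
```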
Wow, this is a lot of info, but after reading this I feel better. When doing my color clusters I thought I had done something wrong. Thank you for all your work
Hi:
I am trying to wrap my head around a scenario to explain extreme overlap. I have isolated the DNA to just my BF, and then using the Leeds Method – what is a great example to explain (how/why) the majority of 2nd-4th cousins (400 cM-90 cM) share between 5-6 colors with me? BF’s paternal and maternal lines do not overlap. They show their distinct color columns but both sides combined have a total of 13 color columns (2nd-4th cousins only). If each color represents a grandparent…please help me out here. It seems I need a specific example to grasp this better. For example, I have a cousin with 269 shared cM and 6 colors isolated to BF’s paternal line. What’s the assumed scenario here? I have read your posts but I am hitting my head at this point. Probably because there are no shared surnames to remotely hint at what I am looking at or hoping to uncover. Thoughts? Thank you.
Hi, Angel. If you send me a screenshot or copy of your chart, I’m happy to give some quick feedback. Also, please include a column that shows the # of cM each of the matches share with you as this is really helpful.
I have used the method on both parents. My dad’s side worked amazingly well with distinct groups, even to connecting to matches with common ancestors in the 1700s. Yay!
However my mom’s side has a case of extreme overlap. She has 35 matches between 90 and 400 cM, starting at 221. All but two match to the 221 (blue). Of the two left, both are 93 cM. The second person (red) matches to every blue but number one. The third person (pink) matches to everyone but for persons one and two and one other.
If I extend matches to include 60 to 89 cM, nothing much changes. Of 40 matches, all but three match to blue; about half match to red; all but four match to pink.
To get other colors, I have to go to 55 cM, where there are two, and 50 cM, where there is one. However each of these three also match to many of the previous matches of all colors. There are 14 people between 50 and 221 that match to all six colours and 46 who match to five of six colours. There are no independent color pairings at all.
There are no NPEs in my tree and I DNA match to many 2, 3, 4, 5th cousins who are where they should be according to the paper trail.
No incidences of cousins or even second cousins marrying that I can find, however my grandparents emigrated from an insular community founded in the 1500s. Could this overlap be explained by matches sharing many sets of very distant GX grandparents, even if few recent ones?
Hi, Mary. There are several reasons you could have one giant cluster at the range of 90 to 400 cM. This cluster could just represent one-fourth of your family. Or it could represent one-half. You mentioned your grandparents were immigrants. It could be no one from their country has tested so you don’t have any 2nd or 3rd cousins (which is what 90 to 400 cM generally represents) on that side.
Whatever the reason, the key is to figure out why the people clustered together. Are they all maternal? paternal? or just from one grandparent line? That will help you figure out what’s going on.
Your real question was could this be explained by matches sharing many sets of very distant grandparents? Maybe. You might look and see how many segments some of your higher matches share with you. Does the match with 221 cM share around 6-12 segments which would include some fairly large segments? Or do they share around 20? A higher number of segments could indicate endogamy.
Looking at my color segments I’m reminded of the dreaded story math problems I used to hate to do. There are all of these details, and I’m not sure which details mean something. I was able to trace my husband’s paternal tree back to his great grandfather, but no farther. He took a Y-DNA test in the hopes it might provide some answers. And yes it did. He does not match the men with his last name, but does match men in another surname project. So I did the Leeds method, and I got an overlap mess. Some of it can be explained because he has autosomal matches to people with his current surname and matches to the people with the other surname. Also, his great grandfather was married four times, so there are half matches in there. If I’m following you correctly, I should collapse all of those matches in one column? I’m not sure what to do with the rest of the overlaps.
Hi, Cheri. Yes, you can collapse/consolidate them – but you don’t have to. Whichever makes more sense and helps you! I’ll email you, too. 🙂
I’m confused with the idea of overlap. I thought that if you already assign a color to somebody in one group, that you ignore them if they appear under another match.
Hi, Tess. If you are working with another color and come across someone already assigned a color, you DO add them to that color, too. Hope that helps!
Dana,
I’m wondering if you think the extreme and/or heavy overlap would occur more often if one is running the Leeds Method on an older person from a large family who may have lots of 1C and 2C many times removed. Thanks – Ann
Hi, Ann. I’ve been meaning to blog about it, but just haven’t yet. Because of those younger 1C1R, 1C2R, etc., an older person might have clusters form around their paternal and maternal sides instead of 4 grandparents. I’ve seen this in people in their 90s. And the youngest testers (I’ve seen this in teens and people in their early 20s), who have a lot of older 1C1R, 1C2R, etc., can get clusters forming around their 8 great grandparents instead.
Hi Dana:
I am trying to discover the Paternal grand parents of a 92 year old woman, Nancy.
Her father was supposed to have been adopted as a very young child in the 1880’s after his parents died. His true birth name is unknown, but records show he was born in Buffalo, NY.
What is confusing to me is that her father was 40 when he married her 20 year old mother.
So the 5 or so 2nd-3rd cousins on the paternal side by cM are a generation younger than she is (roughly in their 60s-70s), while the same ones on her mother’s side are of her own generation or younger.
1. Am I right in assuming that because of the 40 year gap between Nancy and her father, that her Paternal matches will be younger than one would normally see?
2. Does this mean that Nancy is the 1st cousin of the 2nd Cousin’s grand parents or great Grandparents?
What has been a surprise is all his matches are Italian when he was assumed to be Irish because of his name! No one knows anything about him. He left his marriage when Nancy was 10 and died of TB 6 years later across the country from Nancy and her sister. Nancy’s Daughter and 2 Grandnieces have taken their DNA tests.
It appears he never had any other marriages or children besides Nancy and her sister.
I have been able to establish 2 Lines which I assume are her father’s Paternal and Maternal lines, but I do not have a clue how to determine which of them is the Paternal vs the Maternal. I have grouped the matches in their respective lines but am lost how to proceed. I am thinking from the matches that his parents (Nancy’s Grandparents ) were immigrants or possibly the children of Immigrants.
I am going to follow the surnames list you suggest to see if that makes anything clearer. I am including the matches & their ancestors in floating trees within the tree I have built for Nancy and her family. So their information with sources are recorded for hopefully being able to link them to Nancy one day.
3. Any Suggestions how I might determine which grouping is Paternal and which is Maternal for her Father?
Thank you for any thought you have on how to sharpen my efforts on this task.
Hi, Michele. You might keep the age gap in mind, but 40 isn’t that unusual for a man to have children. One of my grandfathers had his last child at the age of 51. This is more usual in larger families.
You don’t need to determine which is paternal and which is maternal. I’d suggest reading this post – https://www.danaleeds.com/next-steps/ – and watching my webinar on Legacy Family Tree Webinars called “One Man, Multiple Names.” These techniques will hopefully help!
Best wishes,
Dana
Looking for some help. 10 years later and I am still confused and searching for Birth grandfather. I completed a Leeds Method Chart, then combined those overlapping.
Now I have two spreadsheets, but have no idea what they are ‘shouting out’ at me.
I would appreciate a little assistance.
Hi, Annie. I am happy to take a quick look and give some feedback if you email me: drleeds@sbcglobal.net
Also, I have a presentation on Legacy Family Tree Webinars where I identified an unknown great grandfather which might be helpful. You’d need a subscription or a free trial. Here’s the link: https://familytreewebinars.com/webinar/one-man-multiple-names-a-dna-based-case-study/
Hope this helps!
Dana