Automated Clustering

Analyzing a Supercluster Chart

Today, I will continue with the same data I’ve used in my posts from the past two days: “Analyzing a Cluster Chart” and “What Are the Grey Squares on a DNA Match Cluster Chart?” This time, though, I am creating Superclusters.

First, the “old” CLM chart I’ve been analyzing the past two days (without superclusters):

Next, the same chart but “simplified” with the grandparent labels: B, C & D.

Then, the “labeling chart” I’m using which shows the letters A, B, C & D stand for the 4 grandparents of the test taker.

Lastly, the Superclustered chart:

SETTINGS

The settings I ran for this kit (& I’ve added an extra thick black line around each supercluster):

    • cM Range: 50 to 400
    • Inclusion Threshold: 2/3
    • Sort: by cM
    • Cluster Sort: By size, Superclustered
    • Paint Midline
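For readers who want to see what the cM range and sort settings amount to, here is a minimal sketch in Python. It is not the clustering tool's code. The match names and cM values are invented for illustration, and I am assuming that the range bounds are inclusive and that "Sort: by cM" means largest first. The Inclusion Threshold is left out because it depends on how the tool compares shared matches.

```python
# Hypothetical match list: (name, shared cM). All values are invented.
matches = [
    ("Cousin A", 291.0),
    ("Cousin B", 53.0),
    ("Cousin C", 412.5),  # above the 400 cM upper bound
    ("Cousin D", 38.2),   # below the 50 cM lower bound
    ("Cousin E", 111.0),
]

CM_MIN = 50   # lower bound from the settings above
CM_MAX = 400  # upper bound from the settings above


def matches_in_range(matches, cm_min=CM_MIN, cm_max=CM_MAX):
    """Keep matches inside the cM range (assumed inclusive), largest first."""
    kept = [m for m in matches if cm_min <= m[1] <= cm_max]
    return sorted(kept, key=lambda m: m[1], reverse=True)


kept = matches_in_range(matches)
for name, cm in kept:
    print(f"{name}: {cm} cM")
```

Only the matches that survive this filter are candidates for clustering, which is why every cluster below reports a range that sits between 50 and 400 cM.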

Note: While I left them off on the earlier chart, I have included the clusters (within the superclusters) that only have two people this time.

On this chart, I have 3 Superclusters, though the last one is really just a “regular” cluster. By looking at the members of each cluster and what I know about their relationship to my family, I have determined these 3 superclusters represent three grandparent lines:

    • 1st supercluster: D (the mom’s mom’s line of Coppenbarger & Bennett)
    • 2nd supercluster: C (the mom’s dad’s line of Peters & Werther)
    • 3rd supercluster: B (the dad’s mom’s line of Merrill & Eastwood)

Analyzing Superclusters

Let’s look more closely at each individual supercluster.

Supercluster #1 (D)

This supercluster represents the mom’s mom’s family and is composed of 4 clusters. Also, you can see a lot of grey cells within this supercluster, showing a relationship between these groups.

Cluster 1 (Blue): Coppenbarger & Bennett

    • 14 members
    • 291 to 53 cM range
    • can place 13 of 14 cousins on tree
    • 13 of these 14 descend from J R Coppenbarger & Elizabeth Bennett
    • Hypothesis: Coppenbarger/Bennett

Cluster 2 (Orange): Bennett & Bookout

    • 6 members
    • 111 to 51 cM range
    • can place 2 of 6 cousins on tree
    • 2 of these 2 descend from Henry Bennett & Ellender Bookout
    • Hypothesis: Bennett/Bookout

Cluster 3 (Dark Blue): Randolph & Kearns (or Keeran)

    • 3 members
    • 102 to 54 cM range
    • can place 2 of 3 cousins on tree
    • 2 of these 2 descend from William Randolph & Matilda (Kearns or Keeran?)
    • Hypothesis: Randolph/Kearns (or Keeran)

Cluster 4 (Green): Unknown, though grey cells suggest a Bennett match

    • 2 members
    • 94 to 51 cM range
    • can place 0 of 2 cousins on tree

Supercluster #2 (C)

This supercluster represents the mom’s dad’s family and is composed of 5 clusters. Also, you can see a lot of grey cells within this supercluster, showing a relationship between these groups.

Cluster 1 (Red): Peters, Werther, Reinhardt

    • 11 members
    • 182 to 51 cM range
    • can place 9 of 11 cousins on tree
    • 2 of these 9 descend from Carl Peters & Guntherine Werther
    • 7 of these 9 descend from Johann Werther & Amalie Reinhardt
    • Hypothesis: Peters/Werther & Werther/Reinhardt

Cluster 2 (Teal): Peters, Werther

    • 6 members
    • 151 to 54 cM range
    • can place 2 of 6 cousins on tree
    • 2 (both) of these descend from Carl Peters & Guntherine Werther
    • Hypothesis: Peters/Werther

Cluster 3 (Pink): Peters, Bingher

    • 4 members
    • 172 to 73 cM range
    • can place 2 of 4 cousins on tree
    • 2 (both) of these descend from Joachim Peters & Henriette Bingher
    • Hypothesis: Peters/Bingher

Cluster 4 (Lime Green): Reinhardt?

    • 3 members
    • 82 to 62 cM range
    • can place all 3 on tree with the same ancestor, but only under a hypothesis that this ancestor is a brother of one of my “brick wall” ancestors: Reinhardt
    • Hypothesis: Reinhardt surname

Cluster 5 (Hot Pink): Werther, Reinhardt

    • 2 members
    • 80 to 68 cM range
    • can place 2 of 2 cousins on tree
    • 2 (both) of these descend from Johann Werther & Amalie Reinhardt
    • Hypothesis: Werther & Reinhardt

Supercluster #3 (B)

This supercluster represents the dad’s mom’s line and is made up of only one cluster.

Cluster 1 (Blue): Merrill, Smith, Fulkerson

    • 5 members
    • 129 to 50 cM range
    • can place 2 of 5 cousins on tree
    • 1 of 2 descends from the parents of James Merrill: Nicholas Merrill & Eleanor Smith
    • 1 of 2 descends from grandparents of James Merrill: Jacob Merrill & Elizabeth Fulkerson
    • Hypothesis: Merrill/Smith & Merrill/Fulkerson

Now What?

My next step is to reach out to these unknown cousins and point them to this post. I’m going to explain where they fall on the chart, and see if they can help me identify their direct line. I’ve already had one “success” with a cousin answering my message since two days ago. Now, I hope to be able to identify and place even more cousins!
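To see how many cousins are still unplaced in each supercluster, the member and "can place" counts listed above can be tallied. This is just a small sketch in Python using the numbers from this post. The tuple layout and function name are my own choices for illustration, and the 3 placed cousins in the Lime Green cluster are counted even though their placement rests on a hypothesis.

```python
# (supercluster, cluster hypothesis, members, cousins placed on tree)
# Counts are copied from the cluster lists in this post.
clusters = [
    ("D", "Coppenbarger/Bennett", 14, 13),
    ("D", "Bennett/Bookout", 6, 2),
    ("D", "Randolph/Kearns (or Keeran)", 3, 2),
    ("D", "Unknown", 2, 0),
    ("C", "Peters/Werther & Werther/Reinhardt", 11, 9),
    ("C", "Peters/Werther", 6, 2),
    ("C", "Peters/Bingher", 4, 2),
    ("C", "Reinhardt surname (tentative placement)", 3, 3),
    ("C", "Werther & Reinhardt", 2, 2),
    ("B", "Merrill/Smith & Merrill/Fulkerson", 5, 2),
]


def tally_by_supercluster(clusters):
    """Return {supercluster: (total members, total placed)}."""
    totals = {}
    for supercluster, _hypothesis, members, placed in clusters:
        m, p = totals.get(supercluster, (0, 0))
        totals[supercluster] = (m + members, p + placed)
    return totals


totals = tally_by_supercluster(clusters)
for supercluster, (members, placed) in totals.items():
    unplaced = members - placed
    print(f"Supercluster {supercluster}: {placed} of {members} placed, {unplaced} to contact")
```

With the counts above, that leaves 8 unplaced cousins in supercluster D, 8 in C, and 3 in B.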

(P.S. After the 3rd “supercluster,” there were 4 additional clusters of 2 people each that I did not include.)

6 thoughts on “Analyzing a Supercluster Chart”

  • This is great, Dana! I like how you took it cluster-by-cluster to analyze. My father’s auto-cluster has 200+ matches, which makes it a little overwhelming at first. I’m taking a cue from you and breaking my analysis down in a similar way. I just did this first super cluster and made a new connection. 🙂

    • That’s terrific, Jessica! I am really excited about the analysis I’m doing and am excited to do more. Since I first posted, I have now heard back from 2 cousins with “no trees” and they both fit where I expected them to!

  • Cathy

    Hi Dana,

    Thanks for all your research and writing! I’m interested in this cluster method. I’ve been very happy with the Leeds Method and wondering if there is any way to make Excel do the clusters and grey squares? Have you tried that or is it best to just get one of the companies to do it for me? Thanks,
    Cathy

    • Hi, Cathy. Good question! I think creating the larger clusters and grey squares takes some programming knowledge, so I would recommend you use one of the automated adaptations. Do you know where to find them?

  • Hello Dana,
    I have done clustering in the hope of breaking through a brick wall at the great-great-grandfather level. I was encouraged to see you used a lower range of 50 cM, as this is the level I need. Question: same as Cathy above, which functions of Excel allow the regrouping of colour clusters? Were you using Excel? What do you mean by automated adaptations? Many thanks,
    Joanne

    • Hi, Joanne. The Excel function to regroup by color groups is a “Sort.” Select the chart, go to Data, then Sort, and choose to sort “on” cell color. And yes, I was using Excel. The automated adaptations are: Genetic Affairs’ AutoCluster, DNAGedcom’s Collins’ Leeds Method, Shared Clustering (which is inspired by, not based on, the Leeds Method), and DNA2Tree. I talk about the 2 most common ones under the “Leeds Method” tab on my website.

      Best wishes,
      Dana
