There has been much debate over the use of
small autosomal DNA segments. It is
important to understand where they come from and how they can be used for
genetic genealogy. Small segments are often
dismissed as noise or false matches. There
are too many small matches to make sense of, but they are not necessarily
false matches. These segments have been
in the population for longer than we thought.
When I match someone at 2 cM, it is very likely that they are a 12th
cousin, not a 5th cousin.
There is no reason for us to look for small segment matches until we
understand where these segments originated.
When we talk about autosomal DNA, we often
oversimplify the process of genetic inheritance. The simple answer is that we inherit half of
our DNA from dad and half from mom. The
common message is that with every generation the DNA contribution from an
ancestor is randomized and reduced until it is insignificant. Genetic inheritance is actually much more complex
than that, and complex in a great way. There is a tremendous amount of ancestral
information that we are just beginning to tap into.
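To put the "half from each parent" message into numbers, here is a minimal sketch of the simple averaging rule only. It assumes an ancestor's expected contribution is exactly halved with every generation, and the function name `expected_share` is made up for this illustration.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA expected from ONE ancestor.

    Assumption: the simple rule that the expected contribution halves
    with every generation (1 = parent, 2 = grandparent, and so on).
    """
    if generations_back < 0:
        raise ValueError("generations_back must be non-negative")
    return 0.5 ** generations_back


if __name__ == "__main__":
    for generations in (1, 2, 5, 10):
        share = expected_share(generations)
        print(f"{generations:>2} generations back: {share:.4%} on average")
```

These figures are averages. Because DNA is passed down in intact sections, the amount actually inherited from any one ancestor can be well above or below them.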
We inherit DNA from our parents and their
ancestors in large sections. Take a look
at the graphic below. Each example is
the comparison of a grandchild to a set of paternal grandparents. You can see in the first example that the
grandchild inherited over two-thirds of their grandfather’s first chromosome
intact (blue bars). The remaining
section of the first chromosome is from their grandmother. In the third example, the grandchild has
inherited the entire chromosome 14 from their grandmother. It is physically possible that this
grandchild could someday give one of their children the grandmother’s complete
chromosome 14.
To avoid oversimplifying, note that this is just half the story. That grandchild has an equal contribution
from their maternal grandparents.
In the examples above, we can visualize what
happens when DNA recombines. The first
example shows where one section of the grandfather’s DNA swapped places with
the grandmother’s DNA before it was inherited by the grandchild. This is called crossover. In the examples, a) is a single crossover, b)
is a double crossover and c) has no crossover.
On average, each of our chromosomes experienced 2 or 3 crossovers before
we inherited them.
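The three crossover cases can be sketched in a few lines of code. This is only an illustration: the chromosome length of 100 units and the crossover positions are invented numbers, not measurements, and `recombine` is a hypothetical helper name.

```python
def recombine(length, crossovers, first="grandfather"):
    """Return (start, end, source) segments for one inherited chromosome.

    `crossovers` lists the positions where the source switches between
    the two grandparents. All positions are illustrative values.
    """
    sources = ("grandfather", "grandmother")
    current = sources.index(first)
    segments = []
    start = 0.0
    for point in sorted(crossovers):
        if not 0 < point < length:
            raise ValueError("crossover must fall inside the chromosome")
        segments.append((start, point, sources[current]))
        current = 1 - current  # switch to the other grandparent
        start = point
    segments.append((start, length, sources[current]))
    return segments


if __name__ == "__main__":
    print("a) single:", recombine(100.0, [70.0]))
    print("b) double:", recombine(100.0, [30.0, 60.0]))
    print("c) none:  ", recombine(100.0, [], first="grandmother"))
```

With no crossover, the whole chromosome comes from a single grandparent, as in the chromosome 14 example.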
Where DNA
crossover takes place on a chromosome is not random. There are approximate locations where the
chromosome is more likely to split.
These locations are cleavage sites. They exist because there are groups of genes along a chromosome that have
a tendency to stay together, a phenomenon known as genetic linkage. These linked genes only allow for chromosome
splits at either end of their linked section.
In my research, the minimum size for one of these gene-linked sections
is about 2.5 cM. These small segments
then travel in larger groups.
In the graphic above, the blue bar
represents about a 60 cM match. The
intersection between the black and orange ovals is about 2.5 cM and represents
a minimum segment. In this crossover
recombination, the large segment actually split to the right of the minimum
segment. In a future crossover, the
chromosome could split on the left side of the minimum segment, giving a large
segment bound by the orange oval.
Why are these minimum segments
important? My research shows that these
segments stay in the gene pool for dozens of generations. Over time, naturally occurring SNP mutations
accumulate, so these minimum inherited
segments (MIS) can be differentiated into family groups.
In my research, I started with 28 well-known
US colonial surnames and 393 autosomal kits.
For each surname, the associated kits were triangulated. If three or more kits match on the same
segment, you can deduce that it came from a common ancestor. Each of the surnames investigated had 6 to 13
distinct triangulated segments. Taken
together, these triangulated ancestral segments represent an autosomal
haplotype that can be used to identify a descendant’s genetic connection to an
ancestor. Across all of the surnames,
these distinct segments appear at recurring locations on each chromosome. I have listed 21 of these ancestral loci in
my paper.
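The triangulation rule described above (three or more kits matching on the same segment) can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: the `Match` record, the kit names, the positions and the `min_overlap` threshold are all assumptions, and it only checks that three kits match each other pairwise over a common stretch.

```python
from itertools import combinations, product
from typing import NamedTuple


class Match(NamedTuple):
    kit_a: str
    kit_b: str
    chromosome: int
    start: int  # illustrative position units
    end: int


def triangulate(matches, min_overlap=1):
    """Find kit trios whose three pairwise matches share a common stretch."""
    by_pair = {}
    for m in matches:
        key = (frozenset((m.kit_a, m.kit_b)), m.chromosome)
        by_pair.setdefault(key, []).append((m.start, m.end))

    kits = sorted({k for m in matches for k in (m.kit_a, m.kit_b)})
    chromosomes = sorted({m.chromosome for m in matches})
    found = []
    for chrom in chromosomes:
        for a, b, c in combinations(kits, 3):
            pair_keys = (frozenset((a, b)), frozenset((a, c)), frozenset((b, c)))
            segment_lists = [by_pair.get((p, chrom), []) for p in pair_keys]
            # product() yields nothing if any pair has no match here
            for ab, ac, bc in product(*segment_lists):
                start = max(ab[0], ac[0], bc[0])
                end = min(ab[1], ac[1], bc[1])
                if end - start >= min_overlap:
                    found.append(((a, b, c), chrom, start, end))
    return found


# Made-up example data: three kits overlap on chromosome 2, a fourth does not.
example = [
    Match("K1", "K2", 2, 100, 500),
    Match("K1", "K3", 2, 200, 600),
    Match("K2", "K3", 2, 150, 450),
    Match("K1", "K4", 6, 100, 300),
]

if __name__ == "__main__":
    print(triangulate(example))
```

Checking every trio is slow for large kit collections; it is used here only because it mirrors the rule as stated.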
Not all ancestral segments are the same
type. The segments can be categorized
into three groups. The first category is
Common to All. The surnames in this study are predominantly
European. One segment has been
identified on chromosome 2 that triangulates across all surnames. This segment correlates to a Western Atlantic
ethnicity and I call it the Western Atlantic Autosomal Haplotype (WAAH). The Western Atlantic Autosomal Haplotype
should not be confused with ancestry informative markers (AIMs). The WAAH is composed of about 800 SNPs and
there are only about 100 AIM SNPs in that same stretch of chromosome 2.
The next category is Shared. Some segments can be
attributed to two or more surnames.
There was considerable intermarriage between US colonial families. That period was a bottleneck genealogically
and genetically. As two major families
married, their combined DNA segments entered the gene pool and were reinforced
as their descendants intermarried.
The third category is Unique. These shared
segments cannot be attributed to intermarriage of families. Yet the resulting familial autosomal
haplotypes are not composed of a single surname. In the case of Benjamin Franklin, the genetic
proximity to his wife, Deborah Read, and his mother, Abiah Folger, may make it
impossible to distinguish between Folger, Franklin and Read DNA. Therefore, the haplotype represents the
combined inheritance.
Here is one of my case studies. Augustine
Bearse was born in England in 1618 and died in Barnstable, MA before 1697. The Bearse family was chosen due to my
familiarity with the genealogy and the debate surrounding Augustine’s wife. His wife Mary was supposedly the
granddaughter of the Chief of the Cape Cod Native American tribes. The goal was twofold: to identify the autosomal haplotype for the
Bearse family and to determine whether any of the ancestral segments had Native
American ethnicity.
The Bearse study was composed of 48 autosomal
samples. These samples were collected
based on claimed genealogical connections.
The triangulated samples generated 8 ancestral loci and indicated an
additional 5 loci that had the potential to triangulate with more samples. The resulting Bearse autosomal haplotype is
found below.
Bearse Autosomal Haplotype
The Bearse haplotype contains the Western
Atlantic Autosomal Haplotype (chromosome 2) which is common to all haplotypes
in the study. The other 12 loci are more
valuable for genealogical validation.
One of the Bearse descendants triangulates on six of the ancestral segments. It is highly unlikely that a descendant would
match on all of the segments. Although
ancestral segments survive over the generations, the randomness of their
distribution makes it difficult for any one person to have received them
all. Yet, triangulating on just one
segment unique to Bearse is enough to indicate and validate a
relationship. Lack of a match could mean
that an ancestral segment was not inherited or that a non-familial event
(adoption, infidelity, etc.) has occurred and the individual’s family tree is
incorrect.
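As a rough sketch of how a descendant might be checked against a familial haplotype, the snippet below lists which of a person's triangulated loci fall in the Unique category. The locus labels and category assignments are placeholders invented for illustration, not the actual Bearse loci, and `unique_loci_matched` is a hypothetical helper.

```python
def unique_loci_matched(haplotype, matched):
    """Return the matched loci that the haplotype labels as 'unique'.

    `haplotype` maps a locus label to its category ('common', 'shared'
    or 'unique'); `matched` is the set of loci a person triangulates on.
    """
    return sorted(locus for locus in matched if haplotype.get(locus) == "unique")


# Placeholder haplotype: labels and categories are made up.
example_haplotype = {
    "locus_A": "common",
    "locus_B": "shared",
    "locus_C": "unique",
    "locus_D": "unique",
}

if __name__ == "__main__":
    hits = unique_loci_matched(example_haplotype, {"locus_A", "locus_C"})
    print("Unique loci matched:", hits)
    print("Supports a relationship:", bool(hits))
```

An empty result does not disprove a relationship, since an ancestral segment may simply not have been inherited.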
In order to investigate the origins of
Augustine’s wife Mary, each ancestral segment from the haplotype was evaluated
for ethnicity. Only the segment on
chromosome 6 at position 55850885 showed any Native American ethnicity. This ancestral segment had not fully
triangulated, yet a few of the samples match exactly on Native American
SNPs. With additional samples, the
segment could triangulate. Once
validated, the segment might be shared across multiple surnames or unique to
Bearse, indicating Native American genes in the Bearse descendants.
Although each generation receives only half of its autosomal DNA
from each parent, that does not mean
that, given enough generations, a distant ancestor’s genetic contribution will
become negligible. Through genetic
linkage, portions of DNA are inherited intact.
Naturally occurring cleavage sites allow for ancestral segments
of about 2.5 cM to be passed from generation to generation as a minimum
inherited segment (MIS).
Ancestral segment analysis is invaluable for
the identification of distant ancestors.
All of the triangulated ancestral locations combine to become a Familial
Autosomal Haplotype (FAH) that can be used to validate family history.
Since finishing my initial research, I have
gone on to identify over 50 ancestral loci and over 700 autosomal haplotypes
for US colonial ancestors. Stay tuned
for further advances in autosomal research.
References:
Maglio, MR (2015) Minimum Inherited DNA Segment Size and the Introduction of Familial Autosomal Haplotypes.
Website:
© 2015 Michael
Maglio and OriginsConnector. All Rights
Reserved.