Wednesday, January 28, 2015

Ghosts of DNA Past: Irish Kings

   In 2006, Laoise T. Moore and the folks at Trinity College in Dublin published a paper famous for identifying the modal haplotype of Irish High King Niall of the Nine Hostages.  Their work used seventeen Y-DNA STR markers.  Time to most recent common ancestor (TMRCA) calculations have accuracy issues, but with only 17 markers the calculation gives a common ancestor over 2,000 years ago.  What the Trinity folks really accomplished was the identification of Niall’s paternal ancestor from over 400 years earlier.  The media in 2006 had a field day with the interpretation that most of Ireland is descended from Niall: “Niall may be the most prolific male in Irish history.”  At 17 markers there is also a very high probability of convergence.  Through normal mutations, haplotypes can change over time to appear similar or identical to other haplotypes, and the fewer markers tested, the higher the chance of convergence.  At that time only high-level SNPs were tested to determine haplogroup.  Without terminal SNPs, it would have been impossible to recognize convergence if it existed in the samples.

   In my research on the Kings of Ireland, I have used 67 markers to reduce the chance of convergence and to calculate the age of common ancestors on the descendant side of the target rather than the ancestor side.  I will demonstrate traditional median-joining networks and novel “tribal” markers for the identification of four historic Kings of Ireland.  Did Trinity get Niall’s haplotype correct with the limited data they had at the time?

Ghost:  a manifestation of a dead person

Modal haplotype:  a derived haplotype based on the DNA tests of a group of people

   A modal haplotype is a ghost of a person.  When we look at multiple DNA test results and calculate the mode, by definition we are just taking the values that appear most often.  There is no way to determine if the modal haplotype is the actual haplotype of the historic individual we are researching (short of historic samples).  While the modal is not perfect, it will be close enough at 67 markers for us to determine the genetic “ghost”.
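As a rough illustration of "taking the values that appear most often", here is a minimal Python sketch.  The function name and the three test records are made up for the example and are not real data.  Note that the resulting modal does not match any single record, which is what makes it a "ghost".

```python
from collections import Counter

def modal_haplotype(records):
    """Return the most common value for each marker across a list of
    {marker: value} dictionaries (hypothetical helper, for illustration)."""
    modal = {}
    for marker in records[0]:
        values = [r[marker] for r in records if marker in r]
        # most_common(1) gives the value seen most often; on a tie the
        # value encountered first wins.
        modal[marker] = Counter(values).most_common(1)[0][0]
    return modal

# Made-up test results, not real data.
records = [
    {"DYS393": 13, "DYS390": 25, "DYS19": 15},
    {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    {"DYS393": 12, "DYS390": 25, "DYS19": 14},
]

modal = modal_haplotype(records)
print(modal)
```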

   The septs of Ireland provide us an opportunity to develop genetic genealogy techniques and processes.  Irish surnames are typically patronymic.  The surnames generally take the form of Mac Cárthaigh (McCarthy), meaning son of Cárthaigh, or Uí Néill (O’Neill), meaning grandson / descendant of Néill.  Irish septs serve as a collective of related families with shared ancestry and patronymic surnames.  Multiple septs then belong to larger dynasties such as the Eóganachta and the Dál gCais.

   If septs are patrilineal, then Y-DNA haplotypes should be consistent across sept surnames.  Research on the Uí Néill haplotype started with a geographical selection and then a subsequent reduction by sept surnames (Moore et al 2006).  For each target sept, affiliated surnames were identified.  In the case of Uí Néill, the following surnames and associated Y-DNA STR records were accessed from Family Tree DNA projects: O’Neill, Gallagher, Doherty and O’Donnell.  The selection includes 600 records and 5 common European haplogroups.

   Median-joining networks have been in use for over a decade for the visualization of genetic relationships.  Their use at 67 STR markers has been rare, but it should be the norm.  This first image shows the central cluster of a median-joining network based on 25 STR markers from the Uí Néill group.  It is just a single cluster with no differentiation.



Figure 1 - Using only 25 STR markers, the Uí Néill network collapses to a single cluster.

When we look at the same group using 67 markers, we get four distinct clusters, each with its own SNP.  The cluster at the far right is predominantly R-L159 and the cluster at the lower right has R-P311/R-L151 nodes.  The cluster at the left contains all of the Uí Néill dynastic surnames, has the majority of nodes and is SNP R-M222, which is consistent with earlier studies.


Figure 2 - View of the Uí Néill network torso showing four distinct clusters.  Three groups on the right are O’Neill only.

As a double check to make sure that I wasn’t seeing some other phenomenon, I analyzed three random Irish surnames: Duffy, Kelly and McCormick.  The random sample produced over ten unique clusters with no surname overlap.  This comparison shows that septs are patrilineal and that Y-DNA haplotypes are consistent across sept surnames.

Figure 3 - Median-joining network of Y-DNA sampled from three random Irish surnames: Duffy, Kelly and McCormick.

Re-evaluating the Uí Néill data also shows that Trinity was correct in their identification of a 17-marker Uí Néill haplotype.  New data and new techniques allow us to produce a 67-marker haplotype.


Figure 4 - Sixty-seven STR Uí Néill Modal Haplotype (Niall of the Nine Hostages).

   A different technique that I’d like to illustrate involves the fact that not all STR markers are created equal.  This method takes advantage of “slow” mutating STR markers.  Each marker has its own mutation rate.  By selecting the 15 “slowest” markers with an average mutation rate of 0.00024, a virtual tribal haplotype is created that would be stable within the last 2,000 years (90% probability of 80 generations).  This is an order of magnitude lower than the average rate of 0.0029 used as a constant in typical TMRCA calculations.  The “tribal” markers isolated are DYS426, DYS388, DYS392, DYS455, DYS454, DYS578, DYS590, DYS641, DYS472, DYS594, DYS436, DYS490, DYS450 and DYS640.

   To make the 15-microsatellite “tribal” haplotype faster to manipulate, the marker values are concatenated into a single string, e.g. 12121411119168108101212811.  The “tribal” haplotypes are then summarized per surname and plotted to illustrate majority and affinity.
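The concatenate-and-summarize step can be sketched in a few lines of Python.  This is only an illustration under assumptions: the helper names are hypothetical, the marker list is a five-marker subset of the slow markers named above, and the records are made up.

```python
from collections import Counter, defaultdict

# Illustrative subset of the slow markers; the order must stay fixed so
# that every string is built the same way.
TRIBAL_MARKERS = ["DYS426", "DYS388", "DYS392", "DYS455", "DYS454"]

def tribal_string(haplotype, markers=TRIBAL_MARKERS):
    """Concatenate the marker values, in order, into one string."""
    return "".join(str(haplotype[m]) for m in markers)

def summarize_by_surname(records):
    """Count how often each tribal string occurs under each surname."""
    summary = defaultdict(Counter)
    for surname, haplotype in records:
        summary[surname][tribal_string(haplotype)] += 1
    return summary

# Made-up records, not real data.
records = [
    ("O'Neill", {"DYS426": 12, "DYS388": 12, "DYS392": 14, "DYS455": 11, "DYS454": 11}),
    ("O'Neill", {"DYS426": 12, "DYS388": 12, "DYS392": 14, "DYS455": 11, "DYS454": 11}),
    ("O'Neill", {"DYS426": 12, "DYS388": 12, "DYS392": 13, "DYS455": 11, "DYS454": 11}),
    ("Doherty", {"DYS426": 12, "DYS388": 12, "DYS392": 14, "DYS455": 11, "DYS454": 11}),
]

summary = summarize_by_surname(records)
for surname, counts in summary.items():
    print(surname, counts.most_common())
```

Working with one string per record means the dominant haplotype for a surname is a simple count, which is the kind of per-surname summary that gets plotted.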


Figure 5 - Uí Néill dynastic haplotypes converted into 15 marker “tribal” haplotypes and summarized.

   The Uí Néill dataset resolved into 37 unique “tribal” haplotypes.  Figure 5 shows that haplotype 12121411119168108101212811 is the most dominant across the Uí Néill surnames.  As with the median-joining network analysis, this “tribal” haplotype is consistent with SNP R-M222. 

   I repeated these two techniques for the Uí Briúin sept using the following surnames and associated Y-DNA records: O’Brien, Hogan, Kennedy and McMahon.  The selection includes 615 records.  The Mac Cárthaigh dataset has the following surnames: McCarthy, Callaghan, Donovan and Sullivan.  The selection includes 319 records.  The Ua Conchobhair data has the following surnames: O’Connor, McManus, Reilly and Rourke.  The selection includes 352 records.

For more details, see my paper at Academia.edu.



Figure 6 - Sixty-seven STR Uí Briúin Modal Haplotype (Brian Boru).


Figure 7 - Sixty-seven STR Mac Cárthaigh Modal Haplotype (McCarthy Eoganachta Kings).



Figure 8 - Sixty-seven STR Ua Conchobhair Modal Haplotype (Last High King Roderick O'Connor).


   Here are a couple of interesting insights from my research.  Niall Noígíallach was High King of Ireland around 378 CE and founder of the Uí Néill dynasty.  Historically, his half-brother Brión was one of the founders of the Connachta dynasty and an ancestor of the last High King of Ireland, Ruaidrí Ua Conchobair.  If their genealogies are correct, the evidence is in their descendants’ DNA.  The data shows that Uí Néill and Ua Conchobair share the same SNP, R-M222.  The Uí Néill and Ua Conchobair modals are a 6-step match at 67 markers.  There is a 99% probability of a relationship not further than 1,260 years ago.  The results make a strong case for the validity of this historic genealogy.

   Brian Boru, High King of Ireland in 1002 CE, belonged to the Dál gCais dynasty and Tadhg Mac Cárthaigh, the first King of Desmond, belonged to the Eóganachta dynasty.  Ancient genealogies have the Eóganachta and Dál gCais dynasties descended from Ailill Aulom, the son-in-law of legendary king Conn of the Hundred Battles.  The Mac Cárthaighs and Uí Briúins do not share the same SNP (R-L226 vs. R-CTS4466), but by descent they would share a common R-DF13 ancestor.  The Mac Cárthaigh and Uí Briúin modals are an 11-step match at 67 markers.  There is a 99% probability of a relationship not further than 1,920 years ago.  This puts a Mac Cárthaigh-Uí Briúin common ancestor as a contemporary of the legendary Conn.
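For readers unfamiliar with "step" matches, the sketch below counts steps as the sum of absolute differences in repeat values across shared markers.  This is an assumption for illustration: it is a plain stepwise count, it does not handle multi-copy markers specially, and the two short haplotypes are made up rather than taken from the modals above.

```python
def step_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over the markers both
    haplotypes share (simple stepwise count, hypothetical helper)."""
    shared = hap_a.keys() & hap_b.keys()
    return sum(abs(hap_a[m] - hap_b[m]) for m in shared)

# Made-up four-marker haplotypes, not the real 67-marker modals.
modal_a = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11}
modal_b = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}

steps = step_distance(modal_a, modal_b)
print(f"{steps}-step match at {len(modal_a)} markers")
```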

   New and improved genetic genealogy techniques are invaluable for the identification of historic individuals and the reconstruction of distant family trees at the macro level.

Reference:


Maglio, MR (2015) Identifying Y-Chromosome Dynastic Haplotypes: The High Kings of Ireland Revisited

Thursday, May 1, 2014

TribeMapper Contest Winners

Congratulations to all our winners!


The winners are:

  • Michael Durkin
  • George Heubach
  • Sylvia Jackson
  • Paul Smith
  • Jennifer Zinck
Stay tuned as we unravel their history over the next weeks.

Thank you to everyone who entered.  

The TribeMapper Report is now on sale until June 1, 2014.  Details are on the OriginsDNA website.


Where did you come from?

Wednesday, April 30, 2014

Last Day for Entries: TribeMapper Report Give-Away

As part of the DNA Day celebration, we are giving away five (5) TribeMapper Reports.

Tonight, at midnight EST, the contest will be closed.  Tomorrow, May 1st, I will announce the winners.

TribeMapper for the House of Normandy
Haplogroup R-L11*
Haplogroup I-L22 Flow into British Isles
Haplogroup G-Z725

For more details on the content of the report see our website.

Contest Terms & Conditions:

You must have completed at least a 37 marker Y-DNA (paternal line) test.  The results of your Report can be used for research, as the basis for an article or for the promotion of OriginsDNA.com.  Your supplied DNA results will not be disclosed, sold or otherwise transferred.

To enter the contest, please send an email to TribeMapper@OriginsDNA.com.  In the email, provide the full name of the Y-DNA donor, haplogroup (if known) and your Y-DNA marker results.

Good Luck!

Tuesday, April 29, 2014

Exploring Rollo's Roots: DNA Leads the Way


   It’s been nearly a year since I wrote about William the Conqueror’s DNA.  Based on a study of men with surnames historically associated with William and their corresponding Y-DNA, I concluded that I had identified the genetic signature of the first Norman King of England.  Now it’s time to get back to William and more specifically his 3rd great-grandfather, Rollo.  To be honest, the 37 marker Y-DNA haplotype that I published is really connected to Richard the Fearless, William’s great-grandfather.  Genealogically, the surnames in the study trace back to Richard.  As long as there was no hanky-panky, William the Conqueror has the same Y-DNA as Richard.  What that also means is that Richard has the same Y-DNA as his grandfather, Rollo.

   Based on the work done in my previous paper, the following haplotype is that of William the Conqueror (and Richard the Fearless):

DYS393    DYS390    DYS19     DYS391    DYS385a   DYS385b   DYS426    DYS388    DYS439    DYS389i   DYS392    DYS389ii
13        24        14        11        11        14        12        12        12        13        13        29

DYS458    DYS459a   DYS459b   DYS455    DYS454    DYS447    DYS437    DYS448    DYS449    DYS464a   DYS464b   DYS464c   DYS464d
17        9         10        11        11        25        15        19        29        15        15        17        17

DYS460    Y-GATA-H4 YCAIIa    YCAIIb    DYS456    DYS607    DYS576    DYS570    CDYa      CDYb      DYS442    DYS438
11        11        19        23        15        15        17        17        36        37        12        12

   There is an assumption, inherent in genetic genealogy, that there weren’t any non-paternal events between the generations that separate Rollo and William and that this haplotype is that of Rollo as well.  One of the goals for this Rollo study is to get more accurate with his haplotype by narrowing the dataset to only those records with 67 markers.  The second goal is to determine Rollo’s haplogroup R SNP.  The best I was able to determine for William was R-P312, which is a fairly high level SNP.  My third goal is to determine Rollo’s origin using my TribeMapper analysis.  Whether Rollo is Danish or Norwegian has been disputed for hundreds of years.

   I picked up where I left off with William.  There were 152 Y-DNA records that made it into the William the Conqueror Modal Haplotype (WCMH).  For each of these records a 67 marker test result and SNP testing result were added to the analysis, where the data was available.  I threw out any record that didn’t have enough data and retained the ones that grouped into a single SNP of R-DF13 (just downstream of R-L21).  Based on these final 25 records, I have identified the 67 marker Rollo Norman Modal Haplotype (RNMH) as follows:

DYS393    DYS390    DYS19     DYS391    DYS385a   DYS385b   DYS426    DYS388    DYS439    DYS389i   DYS392    DYS389ii
13        24        14        11        11        14        12        12        12        13        13        29

DYS458    DYS459a   DYS459b   DYS455    DYS454    DYS447    DYS437    DYS448    DYS449    DYS464a   DYS464b   DYS464c   DYS464d
17        9         10        11        11        25        15        19        29        15        15        17        17

DYS460    Y-GATA-H4 YCAIIa    YCAIIb    DYS456    DYS607    DYS576    DYS570    CDYa      CDYb      DYS442    DYS438
11        11        19        23        15        15        17        17        36        37        12        12

DYS531    DYS578    DYF395S1a DYF395S1b DYS590    DYS537    DYS641    DYS472    DYF406S1  DYS511    DYS425    DYS413a   DYS413b
11        9         15        16        8         10        10        8         10        10        12        23        23

DYS557    DYS594    DYS436    DYS490    DYS534    DYS450    DYS444    DYS481    DYS520    DYS446    DYS617    DYS568
16        10        12        12        16        8         12        22        20        13        12        11

DYS487    DYS572    DYS640    DYS492    DYS565
13        11        11        12        12

Based on this modal haplotype and the associated SNP, a broader collection of genetic cousin records was identified for use with my new TribeMapper analysis (Biogeographical Multilateration).




   This map shows the geographic distribution of Rollo’s cousins.  The large number of points along the coast of Normandy is a good sign.  If the majority of points were in Eastern Europe, I would have to revisit my whole hypothesis about William the Conqueror.  It is best not to try to interpret any relationships until we look at them through the lens of a phylogenetic tree.



   The TribeMapper analysis takes into consideration the mapped location, the tree node connections and the time between common ancestors.  The time is converted to distance based on the demic diffusion migration rate.  The distance is plotted to ‘triangulate’ the geographic location of each common ancestor.  This is a process called multilateration.
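To make the time-to-distance and multilateration idea concrete, here is a minimal Python sketch.  It is not the TribeMapper implementation.  It assumes flat x/y coordinates in kilometres, a placeholder migration rate of 1 km per year, and a standard linearized least-squares solve; the function names, the reference points and the times are all made up for the example.

```python
import math

def years_to_km(years, rate_km_per_year):
    """Convert time since a common ancestor into a travel distance."""
    return years * rate_km_per_year

def multilaterate(anchors, distances):
    """Estimate an unknown (x, y) from known points and distances to them.

    Subtracting the first circle equation from the others gives a linear
    system, solved here by least squares via the 2x2 normal equations.
    """
    (x0, y0), d0 = anchors[0], distances[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = (xi**2 + yi**2) - (x0**2 + y0**2) - (di**2 - d0**2)
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("need at least three non-collinear known points")
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y

RATE = 1.0  # km per year; placeholder value, an assumption for this sketch

# Made-up known locations (km on a flat grid) and a made-up true origin.
anchors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
true_origin = (30.0, 40.0)

# Synthetic "years to common ancestor", chosen to be consistent with the origin.
years = [math.hypot(true_origin[0] - ax, true_origin[1] - ay) / RATE
         for ax, ay in anchors]
distances = [years_to_km(t, RATE) for t in years]

x, y = multilaterate(anchors, distances)
print(round(x, 3), round(y, 3))
```

With real data the distances would not agree exactly, which is why a least-squares fit is used rather than an exact intersection of circles.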

   The earliest documented origins for Rollo come from Dudo of Saint-Quentin in 1015 and William of Jumièges in 1060.  Both ‘histories’ were commissioned by the House of Normandy and attribute a Danish origin to Rollo.  Commissioned biographies can border on mythology.   The Norwegian Orkneyinga Saga, from the 13th century, gives Rollo a Norwegian origin. 

   I’ve run the analysis with Rollo’s record as an unknown location.  TribeMapper allows us to back into the location for any unknown point.  What we get is a highly constrained location for Rollo’s ancestor, in the middle of Denmark.  The data then shows that Rollo may have lived within 226 km of that paternal ancestor.  The red circle illustrates the range for Rollo.  This covers the majority of Denmark.  The data also shows that Rollo’s ancestors, going back at least 12 generations were also in Denmark.



   We can give the Norwegians some credit also.  The ancestors of Rollo’s ancestors were Norwegian, with an origin on the west coast of Norway.  Rollo’s ancestors were responsible for multiple branches of migration into Europe.  This includes a back migration into Norway that then went on to invade Scotland.



   This was accomplished with a small sample of 65 records for simplification.  Much larger data sets could determine the genetic flow in a greater geographic and chronological view.  Additional records within the same SNP grouping could result in a more accurate origin for Rollo.  Records that are genetically upstream from the SNP and STR group used may identify the nomadic migrations prior to the Western Norway settlement.


   I’ve run this simulation multiple times, getting the same results.  I’m comfortable calling Rollo – “The Dane”.

Reference:

Maglio, MR (2014) Biogeographical Origins and Y-chromosome Signature for the House of Normandy