Friday, July 21, 2017

AND THE WINNER IS... (Comparing Admixture/Heritage Tests on Gedmatch)

Methodology:


  • We ran exhaustive tests of several commercial and free DNA-testing labs and ethnicity calculators.  
  • To test the sites, we used only individuals with well-documented, double-confirmed, 100% known ancestry.  
  • We tested multiple males from multiple lines to ensure, as far as humanly possible, that no non-paternity events (bastardy) occurred.  
  • We even tested minor nobility with documented ties to geographic locales.  
  • We used individuals who do not come from cities or places of cosmopolitanism (influx of foreigners).  
  • We tested only people with all four grandparents from the same locale.  
  • We tested multiple people from different countries in Europe.
As we've posted before, of the commercial labs, 23andme takes first prize, and Ancestry.com is the worst.  23andme provides the most conservative and accurate ethnic ancestry approximations.

We have also completed our testing of all of the ancestry composition tests available on GEDMATCH.  Below is a summary, the results, and the rankings.

  • First of all, the specialty labs, Ethiohelix, Gedrosia DNA, puntDNAL, etc. do not even come close to being accurate, at least for individuals of European heritage.
  • None of MDLP's tests passed our accuracy gauntlet and correctly called west European DNA.
1. The overall winner, and the clear winner of all the tools currently available on Gedmatch, is the Eurogenes K13 test.  It was pretty darn good at distinguishing DNA from various western European lands, for people of "purebred" ancestry.

2. Coming in second was Eurogenes EUtest K15 v2, which also had a pretty darn good record of accurate calls.

3. An honorable mention, and a close third, with an accuracy record roughly on par with the second-place finisher, was Dodecad's K12b test.

  • No other tests besides those three were even close to "often accurate."
  • No tests, including those three, were much use for accurately calling the ancestry of European "mutts."  We found that the same tests that were accurate for individuals with 100% heritage from one country were of limited value as an oracle (that is, for accurately predicting ancestry) for individuals of mixed European heritage.



Tuesday, July 18, 2017

Will Tim Sullivan and Ancestry.com Continue Its VIRTUAL Ethnic Cleansing of Germans?

23andme discloses right off the bat that it cannot identify German or French ancestry 92% of the time.

Ancestry doesn't seem to be able to discern German ancestry too well either, but it doesn't tell its customers that.

Noted: Yet another reader of this blog just wrote in and shared her experience.  She is 100% German, born in Germany, from a small town, not a big city.  Her ancestors are documented in the region she's from for the last 400 years.  Several of them were well-known and documented.

Ancestry.com called her ancestry as about 50% Scandinavian, 25% Italian, and 25% generic European.  What an epic fail.

How many "white bread" regular Americans with German ancestry take one of these tests, only to have their German ancestry misleadingly wiped away?

We note Germans are America's LARGEST ethnic group, but their ancestry is also often hidden, because German surnames Americanize so well.  For example, Kohl becomes Cole; Schmidt becomes Smith, etc.

As an experiment, with our reader's permission, we ran her raw data through Gedmatch.  Both MDLP (the Magnus Ducatus Lituaniae Project) and Eurogenes were able to call her likeliest ancestry as German.   Dodecad, which specializes in Mediterraneans, was able to call her as German in about half of its tests.

So the question remains:

1.  If the amateurs can call German DNA with reasonable regularity, why the heck can't Ancestry.com?

2.  If Ancestry.com is so bad at identifying America's biggest ethnic group, why doesn't it do the decent thing and tell people?

Sunday, June 11, 2017

The Genetics of the Ancient Romans

As we've noted before, there are a bunch of charlatans in the world of Ancient DNA.  The worst offender, perhaps, is a pseudonymous Belgian named Maciamo Hay, who runs a site called Eupedia.  This uneducated man knows just enough to sound knowledgeable, and to delude himself and some of the similarly ignorant.  In the world of Ancient DNA, he is probably the best example of the Dunning-Kruger effect out there.

Many of these Ancient DNA practitioners spend their time trying to digest the most recent DNA studies, but don't ever come close to picking up a history book, much less to acquiring the deep, big-picture understanding of ancient history that is needed to explain the population movements that have occurred in places like Rome and Italy over time.

In this post, we go over those population movements, to review claims made by fools like Maciamo on Eupedia.

Let's start with his baldest misstatement: "In all logic, the ancient Romans, from the original founders of Rome to the patricians of the Roman Republic, should have been essentially R1b-U152 people."  This laughable statement was directly pulled from Eupedia on the same day that this post is dated, and as far as I can tell, it's still up.  (I just refuse to link to it, lest any more misinformation be circulated).

As Maciamo's own maps show (!), the distribution of U152 in Italy is centered in the ALPS, and radiates outward to all the parts of Italy that were previously inhabited by CELTS.

So: Where to begin?  How does one even start to explain history to someone so uneducated?

Let's start with something most people know.  The saying, "he's crossed the Rubicon" is a reference to Caesar crossing the Rubicone river.

Why was that so significant?  Because the Rubicon was the traditional BORDER of Italy at that time.  (49 BC.)  In other words, it was an act of war for Caesar to cross that border.  Where is the Rubicone river?  It's just south of modern Ravenna!

For 700 years, the "Italy" of Roman times -- that which was populated by Italians (versus Gauls) -- was the true peninsula parts (sticking out).  Never forget that.  The distribution of U152 clearly corresponds to where the population was Gaulish versus Roman!  U152 is the OPPOSITE of a Roman marker.

Southern Italy, on the other hand, was considered the most desirable real estate for much of the Roman Republic and early empire.  When Cicero listed the most beautiful and prosperous cities in Italy, most were in Southern Italy.  Places like Reggio Calabria and Capua.  When Mark Antony and Augustus' veterans demanded land, they demanded it in Southern Italy.

Furthermore, Rome devastated places like Samnium (modern Molise/Campania) and modern Cosenza, destroying most of the inhabitants, and then seizing the territory for Roman citizens.  Anyone who knows Roman history knows this.

Rome planted dozens (almost a hundred) of colonies (of Roman citizens) in Southern Italy.  Entire towns (like Vibo Valentia) were populated by tens of thousands of transplanted Romans.  These colonies were stocked BEFORE Rome became an empire, i.e., before it became cosmopolitan.  The people who founded these towns were of "pure" Roman stock.

Why does this matter?  Well, this blog is no Southern Italy apologist.  Southern Italy was a backwater for years.  Isolated and insignificant.  But from a genetic standpoint, those qualities ARE significant.

If you wanted to study the genetics of the Romans, would you go to a place where lots of people had passed through?  A place that was a successful world port in the Middle Ages?  A place where people wanted to move to from elsewhere?  OF COURSE NOT.

You would WANT a backwater; a place unchanged over millennia.  The towns of South Italy (many of which have never been invaded by anyone, thank you very much) are where you can find the descendants of Romans, unadulterated.

Well before modern genetic studies, very intelligent, very thorough researchers did large-scale demographic studies on Rome.  These folks, mostly British historians from Oxford, scoured records in churches and cemeteries, in abbeys and books -- everywhere -- to estimate the population demography of Rome.  This much we know: at the dawn of the empire, "Italy" was Italy south of the Rubicon, well south of the Po.  The population was a mix of the local Italic tribes and Roman Latins, placed there as colonies.

Want to know the genetics of the Romans?  Look at which towns started out as Roman (not Gaulish, Maciamo!) and which towns have largely been untouched since.

Professor Chris Wickham produced exhaustive studies of Italy from 400-1000 AD.  He provides real numbers of the "others" in Italy.  He concludes the Goths and Lombards (Germanic tribes who ruled large parts of Italy from 476 AD - c. 1000 AD) never were more than 2%-9% of the Italian population, and he believes that, aside from pockets in the South, they were clustered mostly in the North.  Again, it's the NORTHERN Italians with the non-Roman influences, not the Southerners.  Again, this skews the DNA of the North.  Don't assume the Southern differences from the North are from Southern exoticness.

Chances are, Northern Italian DNA is different because it started with a large dollop of Gaulish (Celtic) genes, and they received a small smattering of Germanic genes.  This is why northern Italians appear, well, more "northern."  Southern Italian DNA, for the most part is not different because of subsequent influences or invasions.  Southern Italians are generally darker (although not by much) because of the absence of Gaulish and Germanic influences.  But those southerners more closely represent Roman DNA as it was around the years 200 BC - 50 AD.

Wickham also studied the Byzantine (Eastern Roman empire, Greek-speaking), Norman (French Viking) and Saracen (Arab or North African) occupying forces in Italy, and concluded that for peninsular Italy, these forces were tiny, much less than 1% of the population, and that they left no real permanent traces.  Again, this is because these were occupying armies, not settlers.  Please note that, contrary to popular belief, many of the towns and villages of Southern Italy were never physically occupied by ANY of these groups, even though suzerainty and tax payments did change hands.  Was Paris after the Nazis any less French?

Folks like Maciamo also greatly UNDERESTIMATE the effect of Roman colonies throughout the Mediterranean.  Rome, through much of its thousand-year history, was a population EXPORTER.  Romans bred like crazy -- there was never enough land to go around -- and they, as the most powerful people of their era, felt it was their prerogative to seize lands of the conquered and place their citizens' families there, to live long and prosper.  It wasn't like now, where middle class families have 2.5 kids.  Then, (aside from the patricians), a family had as many kids as it could afford -- as many kids as it could feed.  Romans had many kids...

A look at the map of Roman colonies shows just how widespread this practice was.   Note the concentration in Italy and Spain, followed by France and Romania.  Yes folks, there's a reason why the Latin language survived in those regions, and why Romance derivatives are still spoken there today.

Despite the Romans exporting so many people, I have never seen one of these modern, unschooled-in-history geneticists raise the question as to whether the similarities between South/Central Italian DNA and that of, say, Greece or North Africa are due to Roman OUTFLOW of genes.  These idiotic (and perhaps racist?) people only repeat the Quentin Tarantino-esque claims that the similarity between such genes must be from exotic INFLOWS into the population of Italy.

It's really idiotic if you think about it.  Rome locates a colony of 25,000 Italian FAMILIES in some town in backwater Greece (or North Africa), and the town prospers for 1000 years and still exists today.  A Byzantine (or Saracen) garrison of 1000 men holds an Italian town for 100 years and then departs.  But many dummies online ascribe the similarity between Italian and Greek (or North African) genes to the latter?  Incredibly myopic.

Anyway, in conclusion:

Maciamo Hay is an idiot.  He should read some JB Bury, some Sir Ronald Syme, and some Chris Wickham.

Geneticists should realize if they want to find Roman genetics, they should try to discern the similarities between backwater (untouched/remote) towns in Southern Italy and Spain, which were settled around the same time with Roman colonists.  There, you can detect and isolate the signal of Roman genetics.

And genetic similarities between Italy and the rest of the Mediterranean could just as easily be due to pre-Roman factors or Roman OUTFLOWS as they are to post-Roman inflows into Italy.

Related Posts: The Genetics of the Ancient Romans, Part II

Friday, May 12, 2017

Banned from Anthrogenica, Censored by Eurogenes, Laugh at Eupedia

Several posters at Davidski's Eurogenes blog have noted that they've been banned from Anthrogenica for challenging the Kool-Aid drinking orthodoxy that infects that website.

The pattern almost always goes as follows.  A regular Anthrogenica poster says something like, "Isn't the Kool-Aid grand?"  A newcomer says, "I don't want to drink your Kool-Aid."  The Anthrogenica regular says, "I'm right, you idiot."  And then the newcomer says, "You're the idiot" -- and yep, you guessed it, only one of them gets banned.

It's gotten so bad that some of the best citizen-scientist minds, and almost all contrarian voices, are gone from that website.  In the old days, the orthodoxy sought to excommunicate Galileo from the Catholic faith.  Now they excommunicate posters from the major discussion websites.  No dissent allowed.

With Dienekes inactive, Eurogenes is where many go for discussion.  But Davidski has been very heavy with the censorship button there too.  Post something he disagrees with?  He removes your comment.  It's really sad.

I myself have tried to post my most recent thread, about applying simple demographics to his "Conquest and Warfare" fantasies, and he always removes my comments asap.

What does that leave?  Eupedia?  Maciamo is a reductio ad absurdum idiot, who also doesn't hesitate to ban people with any contrarian viewpoint.

So, this is it.  This is your thread.  This thread (and this website) is for anyone Banned From Anthrogenica, or Censored by Eurogenes.

Post away.  You will not be censored here.

Saturday, April 29, 2017

When Is A "Conquest" Not A Conquest?

You are a scientist living in the year 4017, specializing in the ancient civilizations that existed between 1492 and 2200 A.D.  Various cultures came and went, but alas, most written records were lost in the intervening centuries.  So you study DNA.

Your fellow scientists know that in many different regions, the DNA record shows profound change over time, both in autosomal percentages and uniparental markers (Y-Chromosome, mtDNA).

Unfortunately, arrogant bloggers still exist in 4017, and three of them, one called Davidski Futurski (who blogs at Eurogenes-ski), a fellow named Maciamo-Futuriamo, and another named Rocca Futura, are examples of "a little knowledge can be dangerous."  

They blindly state that all changes in DNA indicate evidence of conquest by some superior culture of badass men.  (Never mind that they all believe they descend from the people they assert to be superior; that's irrelevant, we're sure).

Your boss at the university, someone who sees nuance better than the bloggers, asks you to model the record and various types of human interactions, and answer the question:

"When Might A 'Conquest' Not Really Be A Conquest?"  

So you come up with the following four models, and re-create as best as you can some historical examples for the clueless:

1.  Refugees from a war-torn area flood into a nearby land (and even some faraway lands), overwhelming the demographics.  The bloggers post that a people called the Syrians conquered the Lebanese, starting in 2011, but you're not so sure.  Your research finds the opposite: that there was a horrendous war in Syria, causing 11 million people to lose all their belongings and flee.  Therefore, you don't think these people were conquerors, but refugees.  Nevertheless, the stubborn bloggers point out how the record shows a massive DNA shift in Lebanon, where the Syrian markers went from 5% to 25% of the population in just three years.  

"It had to be conquest" they write, of powerful, rich, sophisticated men conquering the weak Lebanese.  

Alas, you tell them: it was the opposite: a beaten-down people streaming into a nearby land (and also places like Sweden), altering the gene pool.  In fact, Lebanon started with 5 million people, and absorbed an influx of 2.5 million refugees.  Thus, the autosomal genetics and uniparental frequencies were both significantly changed.  It's really as simple as that.

2. Disease.  In 1598, slaves from Africa were brought to a place called Puerto Rico.  They brought with them Yellow Fever, something the native American inhabitants did not have exposure or antibodies to.  

Although the natives were, under the caste system at the time, a couple of rungs higher than the African slaves, and although the natives had better sources of food and systems for dealing with the native landscape, they were killed off in the thousands simply because they didn't have antibodies to the new disease.  

But all the bloggers see is that Puerto Rico went from showing Native American DNA patterns to showing African (and European) DNA patterns.  And they cry, there must have been a conquest, led by the African newcomers!  You LOL, pointing out that these newcomers were slaves and vectors.

3.  Economic Opportunity.  The bloggers now discuss Los Angeles.  They point out that the DNA record shows that in the 1950s, Los Angeles was 80% inhabited by an ancient culture called, "whites."  By 2020, it was 60% Hispanic.  The record thus showed profound shifts in autosomal frequencies and Y-chromosome patterns.  

"There must have been a conquest!" the bloggers shout from the rooftops!  War!  Destruction!  A supreme powerful tribe of men, with better tools!  

No, you quietly assert.  Your research shows that poor Hispanic immigrants simply migrated to Los Angeles, looking for better economic opportunities than what existed back home.  Alas, the bloggers still don't grasp this example either.

4.  Simple Cultural Differences in Birthrates.  Palestinian women have vastly greater birthrates than their neighbors.  In the 1960s, it was 8 children per every female.  Even now, it's above 4.0 children for every woman.  The Israeli birthrate, while still relatively high at 3.0, is not as high.  By 2045, Palestinians may outnumber Israelis.  

Our future bloggers, looking at this from the perspective of the year 4017, may try to argue that there was a conquest by the Palestinians.  They must have had superior technology, they claim!  Better weapons!  

But again, your research (and history) shows this NOT to be the case.


Taking these four examples, you explain to the bloggers that many changes in DNA frequency cannot be explained as "conquest," even though it's tempting for the simple-minded to do so.  

There are even examples of multiple of the above factors explaining demographic shifts.  For example, the Catholic Irish replacement of Anglo Saxons in many East Coast cities in the 1800s.  That was due to the Irish being refugees, seeking greater economic opportunity en masse, and having higher birthrates.  

Somewhere in the future, the intellectual heirs of Maciamo, Davidski, and many on Anthrogenica, are arguing that the Irish immigrants of the 1800s were in fact a technologically and militarily superior, overwhelming force of wealthy males who clearly conquered the British Americans of the time.

And you, and anyone with any degree of a nuanced understanding of history, are still LOL'ing.  

Saturday, December 31, 2016

On the Need for More Interdisciplinariness in "Interdisciplinary" Studies

Ah, if they were all as good as Luigi Luca Cavalli-Sforza.  The pioneer of interdisciplinary studies, and a Renaissance man, he would thoroughly immerse himself in genetics, demography, history, archaeology, and linguistics -- or find collaborators who could augment his knowledge.  Thus, his work SAW THE BIG PICTURE. 

A new paper out shows that modern "interdisciplinary" studies aren't so interdisciplinary at all.

It's called Mapping European Population Movement through Genomic Research, by Patrick J. Geary and Krishna Veeramah.

The authors show that many geneticists writing about history simply pick up some bogus two-bit history book.  That is why you get so much pseudo-science out there.

I once talked to a guy, a fairly educated scientist from another discipline, who felt he saw some marker in European genes.  So he did some Google searches as to which tribe had ever moved through the rough place where he found the markers.  He then published a paper claiming he found a Cimbri-specific marker.  But he didn't read the rest of the history; had he done so, he would have grasped perhaps that that tribe was wiped out by Gaius Marius at the end of the second century BC (101 BC)....

The paper also points out that there isn't enough precision in genetics, because geneticists don't bother to understand that different regions have different histories.  What good is knowing some person was French, without logging if that person is Provencal or Norman?  Very little....

Best quote from the paper: "The Ralph and Coop study, while highly rigorous at the level of the population genetic analysis, included no historians or archaeologists, and the only historical literature cited, presumably to »identify« the Hunnic contribution to European population, was a general history of Europe, a survey of Slavic history, and two articles in the New Cambridge Medieval History. The Busby et al. study also included no historians or archaeologists on its team, and the only historical literature cited was a Penguin History of the World, Peter Heather’s survey of the Early Middle Ages, and a survey of Muslims in Italy. Unlike these studies, designed and executed  exclusively by geneticists who then look through a few general historical handbooks to try to find stories that might explain their data..."

In other words, many scientific papers suffer from the same thing that plagues Anthrogenica or, even worse, Maciamo's horrifically bad Eupedia: "a LITTLE knowledge is dangerous."  They don't bother grasping the big picture in genetics, demography, history, archaeology, and linguistics...

Sunday, October 30, 2016

We Are Our Brother's Keeper: Are All Men Cousins? And Is This The Root Of Prejudice?

Many of you already know the following concepts.  Humans intuit a sense of community and family with those with whom they are related.  This has been confirmed in study after study, on child abuse, on ingroup-outgroup dynamics, and on racial prejudices.

The percentages of relatedness to trigger that feeling of kinship need not be large.  As the following chart shows, many of us have folks over to Thanksgiving dinner with whom we share only 1-3% identical DNA.  But that identical DNA is hugely significant.  It's identical.  And that of course makes one much more "related" than this "we share most DNA with all humans and even chimpanzees."  Indeed, it's the margins that seem to count.  And again, studies, on stepfathers in particular, have confirmed this time and time again.

Relationship: expected shared DNA

Parent / Child, Full Sibling: 50%
Grandparent / Grandchild, Aunt / Uncle, Niece / Nephew, Half Sibling: 25%
1st Cousin: 12.5%
1st Cousin once removed: 6.25%
2nd Cousin: 3.13%
2nd Cousin once removed: 1.56%
3rd Cousin: 0.78%
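The cousin rows in the chart follow a simple halving rule, which can be sketched in a few lines of Python. This is a minimal illustration, not how any testing company computes relatedness: the function name is made up for this example, and it assumes full cousins (two shared ancestors) and gives only the expected average, since real sharing varies from person to person.

```python
def expected_shared_fraction(cousin_degree, times_removed=0):
    """Expected autosomal DNA shared by full cousins (an average, not a guarantee).

    Assumption: two shared ancestors, each reached through
    2 * (cousin_degree + 1) + times_removed parent-child steps,
    and every step halves the expected shared DNA.
    """
    steps = 2 * (cousin_degree + 1) + times_removed
    return 2 * 0.5 ** steps


# Print the cousin rows of the chart as percentages.
for degree, removed in [(1, 0), (1, 1), (2, 0), (2, 1), (3, 0)]:
    pct = 100 * expected_shared_fraction(degree, removed)
    print(f"cousin degree {degree}, times removed {removed}: {pct:.4f}%")
```

Each extra generation of distance, or each "once removed," cuts the expected figure in half, which is why the numbers in the chart drop so quickly.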

The weird quality of the Y-Chromosome makes what I am about to post intriguing:

A human genome, including the X and Y chromosomes, is about 3771 cM long.

The Y Chromosome makes up about 2% of that, by length, and about 1% by SNPs. 

Because men in certain haplogroups have IDENTICAL Y-Chromosomes (except for tiny recombining parts), and because, unlike the rest of the DNA, those genes are passed on IDENTICALLY, all men in the same haplogroup share as much DNA as, say, 2nd Cousins Once Removed.

Could this be the explanation why, for example, Western European males, who do not have much Y-chromosome diversity, exhibit a powerful ingroup dynamic with each other?

Fascinating, to be sure.

Tuesday, October 11, 2016

How DNA Ancestry Testing Works and How Can I Know It's Accurate

When a commercial DNA testing site like Ancestry.com or 23andme or FTDNA tests your DNA, they do not know which snippet came from which of your parents.

For example, if at a given point (a gene, in popular parlance), you have a "C" from your dad and a "T" from your mom (meaning you have brown eyes, but carry the blue-eyes gene), the testing service doesn't know which "letter" came from which parent.

What they then do is try to guess, stringing your DNA out into small chunks or strings of letters.

They then compare these to DNA in their reference database.  23andme's reference database, which is one of the best, if not the best in the world, only has about 11,000 samples in it.  To represent the whole world!


So if you have ancestry from a big country (like France or Germany) or a country that has pockets of deep isolation (like Italy), the odds -- that they have someone from your corner of the country, or your little isolated craggy valley in some mountain chain -- are small.

They then compare the little strings of letters and come up with a likelihood that you have ancestry from one of those reference populations.
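The comparison step can be pictured with a toy example. The sketch below is not any company's actual algorithm: the two reference populations, their allele frequencies, and the function names are all invented for illustration. It simply scores one short chunk of "letters" against each reference population and reports which one makes that chunk most likely.

```python
import math

# Hypothetical reference panel: for each population, the frequency of
# allele "A" (versus "B") at five made-up markers. Not real data.
REFERENCE = {
    "pop_north": [0.9, 0.8, 0.2, 0.7, 0.6],
    "pop_south": [0.3, 0.4, 0.8, 0.2, 0.5],
}


def log_likelihood(chunk, freqs):
    """Log-probability of seeing this chunk if it came from a population."""
    total = 0.0
    for allele, p in zip(chunk, freqs):
        total += math.log(p if allele == "A" else 1.0 - p)
    return total


def best_match(chunk, reference=REFERENCE):
    """Return the best-scoring population and the score for every population."""
    scores = {pop: log_likelihood(chunk, f) for pop, f in reference.items()}
    return max(scores, key=scores.get), scores


population, scores = best_match("AABAA")
print(population, scores)
```

The catch is visible even in the toy version: the answer can only ever be one of the populations in the reference panel, so if your corner of the country isn't represented, the chunk gets assigned to whatever neighbor looks closest.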

23andme has the most scientific test in the business, but it gets French/German/Belgian/Dutch/Swiss/Austrian/Luxembourgish ancestry wrong 92% of the time.  It most often shows up as "generic Northwest European."  Similarly, 23andme -- the best in the business -- can't identify Italian ancestry 50% of the time.  It often shows up incorrectly as Middle Eastern or Generic Southern European.

The moral of this story is to be patient with the science.  It's not 100% there yet.

If you have documented ancestry from one region, trust your documents.

If you don't have any cousins from a pool you were identified as, then chances are it was a miscall.  For example, if you have documented Italian ancestry, but the test says you are 1/8 Middle Eastern or 1/8 Spanish, then unless you have a known great-grandparent who is 100% such, it's probably a miscall.  (A genuine 1/8 would mean your parent would test as 1/4, by the way.)

Finally, there is a serious problem with testing sites, particularly FTDNA's, with the issue of timing.  If you go back far enough, we are ALL Africans, right?  Yet a DNA test telling you that you were African would not be too useful.  Do they mean recently or in the past?

Similarly, as has been well-documented, most European ancestry can be broken down into 3 big chunks: ancient hunter gatherers (Ancient Western Europeans, most similar modern population = Lithuanians); ancient farmers (Ancient Near Easterners, most similar modern populations include Greeks, Sardinians, others); and ancient pastoralists/horse rearers (Ancient Eurasian Steppe Dwellers, most similar modern populations include Ukrainians). But the migrations were really, truly all over the place.

Ancient Near Easterners are NOT modern Near Easterners.  Ancient hunter gatherers in France are NOT the modern French, etc.

If a test tells you that you have some Near Eastern blood, it often is sensing this ancient signal.

It doesn't do you much good for them to say that 6000 years ago, you had some ancestry in the Near East.  Everyone did.

Tuesday, May 24, 2016

Neandertals Never Died; Just Their Direct Sirelines and Matrilines

From a piece by Faye Flam in none other than Bloomberg, comes this wonderfully succinct nugget that expresses something that readers of this blog know I subscribe to:

"Scientists have also revised their view of Neanderthal extinction – long attributed to some deficit on their part.  Maybe nothing dramatic happened at all, said Hawks. They would have made up a small fraction of the world’s population, and when larger groups of modern humans joined them in Europe they might have simply been absorbed."

(emphasis added)

This is what I coined the "Demography not Drama" explanation.

It is likely the Neandertal population was tiny, and when modern humans entered Europe, they simply absorbed them, perhaps even absorbed multiple sub-populations (which the genetics data now supports too).

With each generation, there is a great chance that a male line or a female line will disappear.  All it takes is for a man to have only daughters, or a woman to have only sons.  Older lines (which have been around for more generations) face longer odds of appearing to have survived, because each generation increases the chances a line will appear to have died out.  The patrilines and matrilines from a group starting with a smaller population size will also appear to have died out over time.

We have seen this occur in the modern world, both in the example of surnames on isolated islands (the families didn't die out, but the surnames eventually greatly reduced in numbers because of the randomness of males having male children) and with thoroughbreds (the original thoroughbred founding population included 30+ male horses, but only 3 sirelines, akin to surnames or Y-chromosome haplogroups, have survived).

This doesn't mean the others "died out."  Like Neandertals, their genes live on among us.
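The way male lines (or female lines) vanish by chance alone can be sketched with a small simulation. This is a toy model chosen for illustration, not something from the Flam piece: it assumes a constant-size population in which each member of a new generation inherits its line from a randomly chosen member of the previous one (a textbook Wright-Fisher setup). The function name, the seed, and the founder count of 30 (borrowed from the thoroughbred example) are assumptions.

```python
import random


def surviving_lines(founders, generations, seed=0):
    """Track how many distinct founder lines remain, generation by generation.

    Assumptions: constant population size equal to the number of founders,
    and each individual picks its parent line uniformly at random.
    No line has any advantage over another.
    """
    rng = random.Random(seed)
    population = list(range(founders))  # each founder starts a distinct line
    history = [len(set(population))]
    for _ in range(generations):
        population = [rng.choice(population) for _ in range(founders)]
        history.append(len(set(population)))
    return history


history = surviving_lines(founders=30, generations=200)
print("lines at start:", history[0])
print("lines after 10 generations:", history[10])
print("lines after 200 generations:", history[-1])
```

Nobody in this model is conquered or outcompeted, yet the count of surviving lines can only shrink over time. A lost line here means a lost uniparental label, nothing more.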

Friday, February 5, 2016

The Sad Case of the Orthodoxy and the Posth Article on Pleistocene Demographics

Just a couple months ago, in the context of the peopling of Ireland, I emphasized on Eupedia (and here) how important it is to put all the Theories Du Jour that are based on modern uniparental distributions through a model based on population demographics and sound logic.

Specifically, I emphasized that ancient population sizes were minuscule compared to modern ones, and that if a population started a long, long time ago, with a size that was way, way small compared to subsequent waves, it would give a false signal that the original population was "conquered" or "outcompeted" or "never existed" or originated somewhere else.   I cautioned against those four errors.  


This engendered quite the debate on the Eupedia forums.  When backed into a corner and shown the weakness of their "R1b Were Studly Conquerors Theory," the "blindly following the current orthodoxy" folks react badly.
 

Many "Interwebz Scientistz" fail to grasp these concepts.  They favor their own wacky, biased theories based on what they see today only.  If a land is populated by one people, they must be all conquering studs, right?

Today, Posth et al. put out an extensive paper on Pleistocene demographics.  


Its shocking discovery?  Just like Y DNA Hg C existed in Europe in tiny numbers among the very first Europeans, so did mtDNA Hg M.

M eventually disappeared for a simple reason: its initial population was tiny, and it had been there so long that, generation after generation, the odds of some women having no daughters added up until the marker was no longer passed on.  Remember, we're talking uniparental markers here.  

The authors commented exactly as I did: up to now, people mistakenly believed that Hg M never set foot in Europe -- or that if it did, it was killed off or whatever by a new wave.  Sorry, both theories are wrong.

It is WONDERFUL to see another peer-reviewed, scholarly paper making this exact same point, and backing it up with newfound data.

As the paper indicates:

-These first hunter gatherers started with a TINY initial population size.

-Every generation, some lines are lost because some males have no sons and some females have no daughters.

-I've calculated the approximate odds of a male not having a male child or a female not having a female child (i.e. looking like their uniparental marker was "conquered") at 12.5%, each generation, totally random.

-The longer a population has existed in a locale (and being free of mutations), the more generations go by, the greater the chance that random happenstance, chance, etc. will make it appear that a Hg either never existed or was slaughtered in a mass killing/enslavement/mate preference.
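To put a rough number on "the more generations go by," here is a minimal sketch of how that 12.5% compounds. It follows ONE unbroken parent-to-child line only and ignores siblings and side branches, so it is a simplification of mine and not a calculation from the paper. The function name is made up for illustration.

```python
# Assumption taken from the bullet above: each generation, a given line of
# descent has a 12.5% chance of ending (no same-sex child carries the marker).
LOSS_PER_GENERATION = 0.125


def chance_line_unbroken(generations: int, loss: float = LOSS_PER_GENERATION) -> float:
    """Chance that one single parent-to-child line is still unbroken."""
    return (1.0 - loss) ** generations


for g in (1, 5, 10, 20, 40):
    print(f"{g:>3} generations: {chance_line_unbroken(g):.1%} chance the line is unbroken")
```

The exact figures matter less than the shape: under this assumption the chance shrinks every generation and never recovers.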

Now you have further proof of it.

I'm waiting to hear how Hg M died out because of some studly new more beautiful females who moved in.  Oh woops, Maciamo doesn't post here.  And he doesn't himself bear Hg M.  And M is not linked to R1b...

Saturday, January 30, 2016

In Praise of Roberta Estes and DNAeXplained.com

In a world of pseudo-science and echo chambers, a few blogs stick out for being mostly in touch with reality.  In the world of Ancient DNA, Dienekes, although less active than before, has pioneered much in the field of DNA, and still has many serious scientists who comment there.

In the world of DNA for Genealogy, one blog sticks out.  It is Roberta Estes' DNAeXplained.com.  Of all the blogs and websites dedicated to disseminating information about DNA, hers is consistently factual, science-based, and yet easy to understand. 

This scientist came across a few of her posts, and I daresay they are mandatory reading for anyone seeking a better understanding of their DNA.  Below are links and highlights:


Step 1:  Creation of the underlying population data base.
Don’t we wish this was as simple as it sounds.  It isn’t.  In fact, this step is the underpinnings of the accuracy of the ethnicity predictions.  The old GIGO (garbage in, garbage out) concept applies here. . . .

The third way to obtain this type of information is by inference.  Both Ancestry.com and 23andMe do some of this.  Ancestry released its V2 ethnicity updates this week, and as a part of that update, they included a white paper available to DNA participants.  In that paper, Ancestry discusses their process for utilizing contributed pedigree charts and states that, aside from immigrant locations, such as the United States and Canada, a common location for 4 grandparents is sufficient information to include that individuals DNA as “native” to that location.  Ancestry used 3000 samples in their new ethnicity predictions to cover 26 geographic locations.  That’s only 115 samples, on average, per location to represent all of that population.  That’s pretty slim pickins.  Their most highly represented area is Eastern Europe with 432 samples and the least represented is Mali with 16.  The regions they cover are shown below. . .

No matter which calculations you use relative to acceptable Margin of Error and Confidence Level, Ancestry’s sample size is extremely light. . . .
 


"having Haplogroup Origins and Ancestral Origins indicating Native American ancestry does not necessarily mean you are Native American or have Native American heritage. This is a very pervasive myth that needs to be dispelled. . . .

The good news is that more and more people are DNA testing.  The bad news is that errors in the system are tending to become more problematic, or said another way, GIGO – Garbage in, Garbage Out.

....

There are a very limited number of major haplogroups that include Native American results.  For mitochondrial DNA, they are A, B, C, D, X and possibly M.  I maintain a research list of the subgroups which are Native.  Each of these base haplogroups also have subgroups which are European and/or Asian.  The same holds true for Native American Y haplogroups Q and C.
In the Haplogroup Origins and Ancestral Origins, there are many examples where Non-Native haplogroups are assigned as Native American, such as haplogroup H1a below.  Haplogroup H is European...

One of the problems we have today is that because there are so many people who carry the oral history of grandmother being “Cherokee,” it has become common to “self-assign” oneself as Native.  That’s all fine and good, until one begins to “self-assign” those haplogroups as Native as well – by virtue of that “Native” assignment in the Family Tree DNA data base.  That’s a horse of a different color.

Monday, January 25, 2016

Calculating Matches on Gedmatch: Why CentiMorgans (cM) are more important than SNPs

I have discovered that very very very few people know this, so it is worth posting.

The different testing companies, 23andme, Ancestry, FTDNA, etc. all test slightly different SNPs.  In other words, the "points" on the genome, the "genes" that are tested vary from company to company.

I have seen some people on Gedmatch dismiss a match because "it doesn't have enough SNPs."  Or because "it's not above the SNP threshold."

Gedmatch itself uses a 7 cM and 700 SNPs match to qualify someone as a cousin.

The SNP part is faulty thinking.

Because the testing companies don't test the same SNPs, you can have long stretches that match with a low number of SNPs.

Case in point: Someone who tested on 23andme like I did matched me for 10.0 cM and 1024 SNPs.  That same person on FTDNA matched me for 10.0 cM but just 510 SNPs.  FTDNA tested half of the SNPs that 23andme did (or half of the same set).

This is key to grasp.  Expect closer matches to you on Gedmatch if your kits start with the same letter (i.e. M for 23andme, F for FTDNA, and A for Ancestry.)  DO NOT DISMISS LOW SNP MATCHES.
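If you keep your match list in a spreadsheet or script, the idea can be sketched like this. To be clear, this is NOT Gedmatch's code or API. The `Segment` class and both filter functions are hypothetical names I made up, and the only real numbers are the ones quoted above (7 cM, 700 SNPs, and my 10.0 cM match).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kits: str            # hypothetical label for the two kits being compared
    centimorgans: float  # genetic length of the shared segment
    snps: int            # number of SNPs both companies happened to test


def keep_by_cm(segments, min_cm=7.0):
    """Keep segments on genetic length alone, ignoring the SNP count."""
    return [s for s in segments if s.centimorgans >= min_cm]


def keep_by_cm_and_snps(segments, min_cm=7.0, min_snps=700):
    """Stricter filter that also demands a minimum SNP count."""
    return [s for s in segments if s.centimorgans >= min_cm and s.snps >= min_snps]


# The same 10.0 cM match from the post, seen through two different companies.
segments = [
    Segment("23andme vs 23andme", 10.0, 1024),
    Segment("23andme vs FTDNA", 10.0, 510),
]

for s in segments:
    print(f"{s.kits}: {s.centimorgans} cM, {s.snps / s.centimorgans:.0f} SNPs per cM")

print("kept by cM only:", len(keep_by_cm(segments)))
print("kept by cM and SNPs:", len(keep_by_cm_and_snps(segments)))
```

The cM-only filter keeps both rows, while the stricter filter throws away the cross-company row even though it is the very same person and the very same segment.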


Monday, December 28, 2015

The Cassidy Earthquake: Neolithic and Bronze Age migration to Ireland and establishment of the insular Atlantic genome

Lara Cassidy et al. just put out a paper that injects a bit of welcome science into the world of R1b fantasy theories.  Those theories, of marauding bands of R1b warriors, are popular on online message boards.  (One prominent board even maintains that most men in Western Europe -- millions and millions of them -- are R1b because they are descended from royalty.)

Here are the findings from this recent paper:

1.  The very derived downstream clades of R1b like R1b1a2a1a2c were well-established in Ireland by 3750 years before the present.  There is no evidence the ancient specimens in the paper were the first generation in Ireland, so it is likely they were present by 2000 BCE.


2.  The population of Central European migrants to Ireland, who were herders and had some Steppe-derived ancestry, was MUCH higher than that of the hunter gatherers.  In other words, R1b is so common in Ireland because of massive migration of such people.

3.  This is emphatically NOT consistent with pioneer colonization and elite dominance.

4.  The current distributions in many parts of Western Europe are due to a LACK of invasions since (no Anglo-Saxon or Roman penetration).  In other words, this was a second but more pronounced founder effect of sorts.

5.  This is consistent with comparisons to more centrally located, easy to reach locales, like Italy, where the genomes show greater variability in both autosomes and Y DNA, due to introgressions that occurred after the late Neolithic and early Bronze Age migrations.  (Cavalli-Sforza's admonishment to understand the difference between an expansion and an "impansion" comes to mind.)

6. In Western Europe, Bell Beaker culture is the most likely candidate for the spread of R1b and related autosomal genes.


7.  R1b and this Western European expansion are strongly correlated with lactase persistence, which likely provided the demographic advantage to propagate in larger numbers in places like Hibernia.

8.  As an addendum, the megaliths of Western Europe are indeed likely linked to early cardial cultures, who bore a mix of HG and farming genes, which correlate to I-M26 in Ireland and Sardinia.

WOW!




 

Wednesday, December 23, 2015

The Spread of Haplogroups in Europe, Especially R1b

This post is intended to be a general foray into what I call "The Two -Ics" that explain modern haplogroup distributions: demographics and mathematics.  IMO, both are poorly understood.

It's been said, "to be an R1b Fantasist, you have to believe that I2-M26 came to predominate Sardinia by chance (e.g., Founder Effect and Drift) -- but that R1b came to predominate other locales (e.g., Ireland or Spain) by merit (e.g., military superiority or sexual selection)." 

It's also been said, "to be an R1b Fantasist, you have to now believe that R1b marks the spread of the first pastoralists, equestrians, and herders, and that you're now 100% right -- when just 2-3 years ago, you were 100% sure that Hg G2 was the mark of the first pastoralists and herders."

With respect to the first saying, I believe that most of the R1b apologists understand the former concepts (of chance as they apply to archaeogenetics), so this post is designed to build upon that knowledge, and add some demographics and mathematics too.

With respect to the second saying, I believe what is most key in a discipline like archaeogenetics is to recognize that theories and findings change from year to year, but the underpinnings of solid scientific method do not.

Let's get into it:

First, it is crucial to outline the possible outcomes.  Every generation, every clade and subclade of every Haplogroup has three "options" (or three outcomes).  Those are:

1.  Mutate (i.e., become something else)
2.  Propagate, in more or less the same form, by having a male child who survives
3.  Die out, by having only daughters, or by having male children who fail to themselves breed

The "stakes" were more pronounced during prehistory than today, because the population sizes were so profoundly lower.  If you don't grasp this and accept it as fact, you can't grasp what I will detail later.

Population of Europe Over Selected Times  
(YBP = Years Before Present)

~50,000 YBP: No more than 10,000 (Neandertals)

~38,000 YBP - 19,000 YBP: No more than 37,000, likely population just 5,000

~12,000 YBP: About 28,000

~2000 YBP: About 35,000,000

~0 YBP: About 743,000,000
You can read more here.

In essence, you must remember that the population of Europe at the beginning of the time we are discussing (the post-glacial-maximum recolonization through the Bronze Age) was about 28,000 and peaked at maybe 100,000.  This is hard for the modern mind to comprehend, I know.  There were fewer people from Spain to Ukraine then than there are in one city block in London now.

There are two takeaways:
1.  This made the population more susceptible to chance events, like a plague outbreak, or a famine in an area.

2.  This made the population more susceptible to massive dilution, when population started on its massive upward trajectory, after people started drinking milk, wine, and beer, when they started making cheese, when they started farming cereals and living in one spot, and when they started herding animals and having meat at will.

Going back to our three outcomes for Y Haplogroups, every generation: the first "takeaway" above should inform several likely mechanisms of how R1b spread over time.  If they entered a territory and had different disease resistance, it could have meant that large numbers of a tiny starting population would die off. 

Similarly, because the initial population was so small, when larger populations migrated for whatever reason, indeed possibly even as refugees from other regions, the other haplogroups would seem to have shrunk in size, whereas the real cause is simply a difference in population sizes.

All this is just build up.  Our main focus, however, is the simple application of mathematics to Outcome 3 above.

This is what you need to know before we start:

1.  Hunter/gatherer women space babies on average 4.5 years apart, whereas farmers and moderns space them 1.5 years apart.

2.  The average paleolithic woman would have about 3.8 children.

3.  Infant mortality among hunter/gatherers is 30 times higher than among "civilized", and reached approximately 25% at many points during history.

4.  If the average hunter/gatherer family had 3 children who lived to adulthood, the odds of any one family having only daughters survive were 12.5% each generation.  (.5 x .5 x .5)
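The arithmetic in point 4 generalizes to any family size. Here is a minimal sketch, assuming each surviving child is a son or a daughter with equal odds and that children are independent of one another. The function name is mine.

```python
def chance_no_sons(surviving_children: int, p_daughter: float = 0.5) -> float:
    """Chance that every surviving child in one family is a daughter."""
    return p_daughter ** surviving_children


for n in range(1, 6):
    print(f"{n} surviving children: {chance_no_sons(n):.1%} chance of no sons")
```

With 3 surviving children this gives the 12.5% used above. Smaller families make the loss of a Y line in any one family more likely, and larger families make it less likely.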

Now just these numbers by themselves (HGs having fewer kids than farmers or pastoralists) explain a LOT. 

But the main point is thus: "older" non-mutated Y-chromosome haplogroups are found in lesser numbers simply because they are...older...


Every generation that a Hg exists and doesn't change, there is a 12.5% chance that those bearing it, in any one family, will not pass it along.  To be very clear: if a Hg does not mutate into something else -- or does not die entirely -- its numbers and distribution will decrease over time.  This applies to all except the most recent arrival, which is currently breeding like rabbits.  For example:

Many people believe that C1a was the first Y Hg in Europe.  There were probably just 5000-15,000 of them at any time.  By definition, the Hg C1a are folks that did not go on to mutate into any of the downstream clades.  Over time, the odds will catch up.

Many people believe that I2 was the next Y Hg in Europe.  There were probably just 10,000 - 50,000 of them at any time.  By definition, these are members of the IJ branch, and not members of F or K who mutated.  Over time, the odds will catch up.

These very simple concepts explain much of the modern distribution of haplogroups in Europe.  Is it more complex?  Sure.  Were there other factors?  Absolutely.  But over time, you cannot escape mathematics and demography being the biggest factors.
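For readers who want to play with the numbers, below is a rough simulation sketch. It uses a textbook branching process, which is my choice of model and not something taken from any of the papers discussed here. Every man is assumed to have the same fixed number of surviving children, each one a son with 50% odds. The function names, the `cap` shortcut, and the default parameters are all illustrative assumptions.

```python
import random


def lineage_survives(founders, children_per_family, generations, rng, cap=200):
    """Simulate one Y lineage; True if any male-line descendant remains."""
    men = founders
    for _ in range(generations):
        if men == 0:
            return False
        if men >= cap:
            # Shortcut (an approximation): once the lineage is this large,
            # treat it as safe instead of simulating every man.
            return True
        # Each surviving child is a son with probability 0.5.
        men = sum(1 for _ in range(men * children_per_family) if rng.random() < 0.5)
    return men > 0


def survival_rate(founders, children_per_family, generations=40, trials=1000, seed=1):
    """Fraction of simulated lineages that still exist at the end."""
    rng = random.Random(seed)
    survived = sum(
        lineage_survives(founders, children_per_family, generations, rng)
        for _ in range(trials)
    )
    return survived / trials


for founders in (1, 10):
    for kids in (2, 3):
        rate = survival_rate(founders, kids)
        print(f"{founders} founder(s), {kids} surviving kids per family: {rate:.0%} survive")
```

Try different founder counts and family sizes. The outcome is very sensitive to both, which is the whole point: tiny starting numbers and small families leave a haplogroup at the mercy of chance.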

Tuesday, December 15, 2015

A Review of All Theories, on Why R1b Is So Common in Western Europeans

The great Roman historian Tacitus begins the Germania by discussing how the Germans are separated from certain peoples by mountains, and separated from other peoples by rivers -- and where there were no rivers or mountains, the peoples were separated by Fear.

A similar, intangible concept applies today, to understanding why R1b is so common in Western Europe.  Some of it can be gleaned by archaeology, some of it can be gleaned by DNA -- and where archaeology and DNA cannot provide an answer, we must resort to what makes us human: Logic.

Below is a review of all of possible models explaining why R1b and its subclades are common throughout Western Europe.  After reading it, you decide which is the most logical.

1.  The Bronze-Age Badasses.

The theory goes: R1b males were an awesome military force, who swept through Europe and killed the overwhelming majority of other males in their path.  They started in modern Ukraine as bad-ass horsemen.  But by the time they got to the coasts, they turned into bad-ass sailors and navigators.  These horsemen built boats, and Ireland and England were next to be mowed down by their genocidal awesomeness.  Despite traveling the length of Europe, they were still pure R1b by the time they reached Ireland, so much so that some counties in Ireland are 80-98% R1b today.  This R1b Empire was the largest that Europe ever knew.  Not even Caesar's stretched from Ireland to Ukraine!  Even though there was plenty of open space in Europe (the population being less than 1/1000th of what it is today), they decided to conquer this vast expanse and risk the lives of themselves and their children, just because they were such badasses.  They were such efficient killers they left no trace of destruction or razing in the archaeological record of Western Europe.  Despite well-established standards for the evolution of language, the empire spoke vastly different languages (i.e. Latin and Ukrainian), even though this all happened just 1000 years or so before the beginnings of Rome and Greece.

Believe it or not, this theory is favored by some people today, who just happen to be R1b males.

2.  The Irresistible Indo-Europeans  

This theory goes: R1b males had uniformly gorgeous looks, tremendous wealth, and all-star qualities that made all hunter-gatherer women swoon with delight.  Whether they had bright red hair, or looked like James Buchanan, cavewomen of all groups throughout Europe dropped their guys and decided to procreate with these R1b studs.  None of the local guys resisted.  They too were enamored by the R1b good looks, and some kind of genetic superiority that made them and their genes irresistible.  

Believe it or not, this theory is favored by some people today.  No, really.  They actually posted it in comments below.  And they just happen to be R1b males.

3.  Colonizing Conquistadores

This theory is a variation of theory 1, minus the genocide.  The theory is: just model the R1b spread after that of the Spanish conquest of the New World.  Never mind that the Spanish had guns, germs, and steel.  Never mind that they had cannon, smallpox, and boats that could traverse oceans.  Never mind that in most places in Latin America, the native haplogroups like Q and C still dominate.  Just ignore these things and model R1b after the Spanish.

4.  Lactase Persistence

At last we enter the realm of the plausible. 

This theory goes as follows: very basal subclades of R1b were present throughout Europe in tiny pockets for a very long time.  

This is why a slightly more downstream clade of R1b*, ancestral to modern lineages, was already found in Els Trocs, Spain, 7000 years ago.

I mean think about it.  He couldn't have flown there.  And in 5100 BC, he couldn't have even ridden a horse.  

We know that R1 originated in Eurasia, and that it was present on both ends of Europe by 5100 BC.

If you adopt regular migration theories for on-foot migrations, these very basal R1b people in Spain were likely present in small pockets throughout Europe by 6000 BC.  Perhaps in the modern Czech Republic, perhaps in France, perhaps in modern Germany.  We only have ~400 aDNA samples from this epoch, and a smaller percentage of them have been tested for Y DNA.

Perhaps they lived in a moist climate less likely to preserve remains.  Perhaps they cremated their dead.  Perhaps archaeologists ignore their tiny region.  But one thing is certain:  Basal R1b was present in Europe, end to end, by about 6000 BC.

At some point, during a period of profound starvation, Western Europeans evolved a tremendous caloric advantage: the ability to digest milk.  No more killing the cow to eat and therefore live: you can live off of turning grass into protein.

Let's assume the first humans to evolve this, living in some nameless, forgotten pocket of Germany or England or France or Spain were majority R1b, then their population would EXPLODE.  In a time of mass starvation and famine, those with a caloric advantage would propagate exponentially.  

(Perhaps these people come to worship the cow to some degree, creating taboos to killing it, as we see in modern lands, creating idols of bulls, as we saw in many ancient cultures, and creating elaborate drinking vessels in the shape of Bell Beakers -- but I digress...)

The population of Europe at this time was maybe a million people across the whole continent.  If you figure that R1b people had a greater fertility rate (more kids per female, less time to wean because of cow's milk availability, less time between kids, healthier kids, more kids reaching adulthood to propagate), then the simple math of exponential demography will show that within as few as 200 years, your uniparental markers will dominate the landscape.
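Here is a minimal sketch of that "simple math." The starting share (5%), the growth advantage (1.5x per generation), and the 25-year generation are all illustrative assumptions of mine, not measured values. Plug in your own.

```python
import math


def share_after(generations, start_share, relative_growth):
    """Population share of a group that grows `relative_growth` times
    faster per generation than everyone else."""
    grown = start_share * relative_growth ** generations
    return grown / (grown + (1.0 - start_share))


def generations_to_reach(target_share, start_share, relative_growth):
    """Generations needed to go from start_share to target_share."""
    start_odds = start_share / (1.0 - start_share)
    target_odds = target_share / (1.0 - target_share)
    return math.log(target_odds / start_odds) / math.log(relative_growth)


YEARS_PER_GENERATION = 25  # assumption

needed = generations_to_reach(0.5, start_share=0.05, relative_growth=1.5)
print(f"5% -> 50% takes about {needed:.1f} generations "
      f"(~{needed * YEARS_PER_GENERATION:.0f} years)")

for g in (2, 4, 8, 12):
    print(f"after {g:>2} generations: {share_after(g, 0.05, 1.5):.0%}")
```

Smaller advantages take longer and bigger ones work faster, but the curve has the same shape either way: a slow start, then a takeover.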

It should be noted that the various genes for lactase persistence mirror closely the distribution of R1b-S21 even today.

5.  Refugees and Different Cultural Attitudes

If you know a little about history or current events, this one is not hard to imagine.  The historical example is the Goths; the modern example is what is happening in Lebanon with Syrian refugees.

People used to think the Goths were bad-ass, uncivilized, warlike, mighty (insert "supreme" adjective here) Germanic overlords who conquered much of the Roman world. But anyone who knows the history understands that the truth is a little kinder to them (kinder, depending on if you believe being peaceful and not purposefully killing people is a good thing).

The Goths were not some mighty tribe hell-bent on destruction, who willfully took over the Roman Empire. Just the opposite: they started out from modern South Sweden because of FAMINE. They were so weak, they were forced to WANDER for centuries. Finally, they invaded the Roman Empire, because the Huns EVICTED them from their steppe lands in modern Ukraine.

In other words, one of the baddest-ass people in most people's minds were refugees, forced to emigrate not because they wanted to conquer, but because they themselves had been evicted from their homelands by famine (first) and then another people (the Huns).

If that is too hard on you, let's imagine something happening today. The population of Lebanon is about 2 million people. Aside from the districts controlled by terrible people, many of the coastal folks are pretty wealthy, modern, and diverse. They don't have extraordinarily high birthrates.

All hell has broken loose near them, in a country you may have heard a lot of recently. It's called Syria. In the last two years, Lebanon...has been swamped with 2 million Syrian refugees.

In other words, the population of the country has doubled, in a generation, from an influx of refugees.

Now imagine the Lebanese bear Haplogroup L, we will call it. Imagine like many wealthier people today, they're not having 20 kids each. More like 1 or 2.

Imagine the Syrian refugees bear Haplogroup S, we will call it. Imagine like many poorer people today, they DO have many kids...

The "old" samples within this area we call Lebanon will all be Haplogroup L. A future archaeologist would find that to be the case.

The "new" samples, after a few generations, will be like 75-25%, with Haplogroup S clearly "winning out." The cause is a mix of migration -- plus different cultural attitudes toward having kids.


Did the Syrian refugees "conquer" the Lebanese?  (No.)

Is it safe to say that the Syrian refugees' genes were "selected for?" (No.)

That the Syrian men were "more attractive" to women? (No.)

That they bore some kind of genetic advantage, that made them fitter? (Again, no.) 


6.  Different Starting Population Sizes, Different In Time

This one is almost the hardest to fathom, because it is nearly circular.  It states simply that R1b is the most numerous in Western Europe because they started out more populous, and were the most recent immigrants.

Western Europe is a cul-de-sac for overland migrations.  Almost all haplogroups originated in Africa or the Near East, but came into Western Europe via the eastern entry points into Europe.  Iberia is the end of the cul-de-sac.

Imagine a 100-acre parcel. At first, it is a hunting preserve of sorts. It is inhabited by 5 families who own 20 acres each. They love the deer and geese they harvest from said land.

Next some farmers move in. 50 acres are used for farming. They support 10 farming families, who each have 5 acres.

The land is supporting 5 hunting families and 10 farming families. (Have the farmers been "selected for?" No.  They are more numerous and more recent migrants).

Finally, some others land in the area. There are 100 refugee families or maybe just people who tolerate living close to one another, so they squeeze into one acre of the land. They have metals, which they trade for food, so they are able to live in a much smaller parcel.

Have they been selected for?  Again, no.

I just described something that has happened in recorded history several times, and surely in prehistory too.

Older, less numerous populations will appear to be "drowned out," unless you are careful. It's just simple math.  Those who have been in a locale the longest will be diluted over time.  


7. Disease

Many plagues in Western Europe entered through the east.  Since R1b-bearing males were the largest migration from the east, it must be considered that different immune systems played a role in their spread.
  

In sum: R1b could be simply the most common haplogroup in Western Europe because it came there later, in greater numbers, and perhaps as part of a people who had different cultural attitudes toward having children.

In subsequent posts, I will link or recap demographic studies that show the clear power of exponential growth with even tiny differences in birthrates.


 

Another Way of Thinking About Ancient Populations (Autosomes versus just Y and mtDNA)

I don't think Neandertals died out at all.
 
No more so than any population that existed from 600,000 to 25,000 years ago.

 
If you tested ANY species of Homo that old, of course they wouldn't match us exactly. The genus has evolved.


Neandertal population size was tiny. Imagine that there were 40,000 of them in Europe. (That number is actually large.  Many aDNA experts believe that there were never more than 10,000 Neandertals alive at any one time!)


Now imagine that 1 million modern humans come into Europe, during various phases.
 
You mix them together, and you get the 4% of Neandertal autosomes in our populations.


You also get drastically smaller odds that their sex-linked DNA survives over time.
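The back-of-the-envelope version, assuming the two groups mix completely and every individual contributes equally (a big simplification, and the function name is mine):

```python
def expected_autosomal_share(minority: int, majority: int) -> float:
    """Expected share of the merged gene pool that comes from the minority."""
    return minority / (minority + majority)


share = expected_autosomal_share(40_000, 1_000_000)
print(f"40,000 Neandertals + 1,000,000 moderns -> {share:.1%} Neandertal autosomes")

smaller = expected_autosomal_share(10_000, 1_000_000)
print(f"10,000 Neandertals + 1,000,000 moderns -> {smaller:.1%}")
```

With the 40,000 figure the share comes out a little under 4%, in line with the number above.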


Never, ever forget population size.


This "study" has been replicated in modern times.

 
Imagine 100 men are marooned on an island. 4 have the surname "Rarityrareness."


After generations, it is likely that the sons of those 4 will have a generation or two that produces only daughters. In fact, it's almost certain.


So the odds are that there will be no Y chromosomes of the Rarityrareness males.


But did they survive? Yes. Their descendants through their daughters are very much alive in the population.  And like Neandertals, large percentages of their genome would survive autosomally, perhaps as high as 80%!
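You can check the island story with a few lines of code. This sketch assumes the island always holds 100 men, and that each man in the next generation "picks" his father at random from the current one (the classic Wright-Fisher drift model, which is my assumption and not data from any real island). The function names and defaults are illustrative.

```python
import random


def y_surname_lost(total_men, carriers, generations, rng):
    """Run one island history; True if the rare surname has vanished."""
    count = carriers
    for _ in range(generations):
        if count == 0:
            return True
        if count == total_men:
            return False  # the rare surname took over the whole island
        p = count / total_men
        # Every man in the next generation has a carrier father with chance p.
        count = sum(1 for _ in range(total_men) if rng.random() < p)
    return count == 0


def loss_rate(total_men=100, carriers=4, generations=100, trials=300, seed=7):
    """Fraction of simulated islands where the surname is gone."""
    rng = random.Random(seed)
    lost = sum(y_surname_lost(total_men, carriers, generations, rng) for _ in range(trials))
    return lost / trials


print(f"Surname gone within 100 generations in {loss_rate():.0%} of simulated islands")
```

Note that this tracks only the male line. The same 4 men still have descendants through their daughters, which is exactly the point of the example.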

Never ever forget initial and comparable population size. It explains just about every ratio of the newer versus older Y Chromosomes in Europe. 


It explains the lack of Neandertal sex-linked DNA, and it explains the smaller number of the Old Europe haplogroups from small hunter-gatherer populations.

Sunday, December 13, 2015

A Proposal for a New Lexicon for Ancient DNA "Components" Like WHG, EHG, EEF, ANE, and CHG

Some of us a few years back started to decry the ever-ongoing ISOGG renaming process, which coupled with the discovery of new subclades, meant that one year, someone might be deemed R1b1b1a2bab2ba11babd12ba2b1c, and the next year R1b1b2bab2f1faf1fafaf1f1f1a. 

People started saying that it would probably be better to say the first couple letters and the major terminal SNP. For example, R1b-U106 or I2-M26. This was logical and good. Unlike the terminology, the SNPs never change. And they're shorter to write.
  
Here I humbly propose a new terminology for ancient autosomal samples. I think picking terms like, "WHG" was a mistake, and now that I read about EHG and CHG, I really think so. For the uninitiated, these acronyms stand for "Western Hunter Gatherer," "Caucasus Hunter Gatherer," etc.


People compare their modern genomes, or the genomes of modern populations or ethnic groups, to these ancient samples. And then they use the shorthand, like, "Scottish average 19% CHG." This is highly misleading.

 
Let me give the reasons why I think it is deficient, and tell me if you disagree.

 
1. As we get more samples over time, it will be hard to keep renaming the different samples, if they form a different component. We just saw this with the recent CHG finds. Imagine if we find a detectable signal of ancient genes from Iberia. What will we call that component? "Really Western Hunter Gatherer?"

 
2. The shorthand is deeply misleading (i.e., "Scottish are 19% CHG.") This to me is the most important point. Most people reading this are experts. But I see on so many other boards people who seem to think that some scientist somewhere took a survey of a bunch of ancient samples, "averaged" it, and that we are comparing populations to populations.


We're not. We are not comparing Scots to Western Hunter Gatherers. We are comparing Scots (or any other modern individual or group) to ONE SAMPLE. For WHG, it's Loschbour. For EEF, it's Stuttgart. For ANE, it's Mal'ta. Etc.


3. We don't know that that one sample will turn out to be representative of "Western Hunter Gatherers" any more than we know that Danny DeVito or the Harlequin model Fabio is representative of modern Italians. Indeed, as the number of samples we get grows, we know the situation is infinitely more complex.


We all remember, for example, when the first farmers sampled had very unique mtDNA. For a while, people tried to read too much into it. "OMG, what if all farmers bore this odd mtDNA?" was the refrain. But it turned out to be a one-off. This can and will happen again and again as we get more samples over time.


4. The acronyms will get repetitive real fast. We are talking about aDNA, remember? Before farming, the whole world were hunter gatherers. So, many (most) aDNA samples will eventually have -HG after them, if we follow the current convention.


I imagine a world where we have found 26 slightly different hunter gatherer samples, and thus we have one different -HG for every letter in the alphabet! That'd be just silly.

 

For these reasons, but primarily numbers 2 and 3, I think the current practice is misleading and doomed to failure. Europe is a very complicated place. We will find ancient samples with very unique genomes, which are detectable in modern populations. They will all be slightly different from one another, because one sample is, well, one sample... It is highly misleading to say that "John Smith..." or "Estonians are more Western Hunter Gatherer than..." because we have not sampled all, most, or even many Western Hunter Gatherers. (I don't mean to pick on WHG. This applies equally, indeed MORE, with EEF and ANE!)

So, what is the solution?


I think if we purport to be scientific, we need to speak with scientific precision.


If an individual or a modern population bears resemblance to an ancient genome, we should state that it has a percentage similarity to that one sample. And we should not try to make it more than it is by using a very official and expansive term like "Eastern Hunter Gatherers."


As for the sample, we should also include the year discovered, the situs of the discovery, and the years Before Present (BP). 


Remember, many of these sites are caves where there have been and will be more discoveries. In other words, I expect there will be many more Loschbours, more Stuttgarts, etc., and it will get quite confusing unless we speak with specificity about when something was discovered and when in time it came from.
 
Let's avoid a situation like we had with terms like R1b1b1b1a2a1b2bc3d, which lose meaning. Let's refer to things with scientific precision.


Examples:


Instead of, "Scots are 19% Ancient North Eurasian."


SAY: "On average 19% of the genes of the modern Scottish population match 2013Mal'ta-24,000BP."


Instead of, "Southern European populations have a lot more CHG blood than I expected."


SAY: "Southern European populations bear many genes matching 2015Kotias-10,000BP."


Instead of, "Sardinians are 45% WHG."


SAY: "Approximately 45% of the genes in the modern Sardinian population resemble 2013Loschbour-6,000BP."



This convention is much more accurate.
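For anyone who keeps their comparisons in a script or spreadsheet export, the convention is easy to automate. This is just a sketch: the `AncientSample` class and `describe` function are hypothetical names, and the years and dates are the ones from my examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AncientSample:
    year_discovered: int  # year the sample was discovered
    site: str             # situs of the discovery
    years_bp: int         # age in years Before Present

    def label(self) -> str:
        """Build the proposed name: year discovered, site, then age in BP."""
        return f"{self.year_discovered}{self.site}-{self.years_bp:,}BP"


def describe(population: str, percent: int, sample: AncientSample) -> str:
    """Phrase a comparison against ONE sample rather than a whole 'component'."""
    return (f"Approximately {percent}% of the genes in the modern {population} "
            f"population resemble {sample.label()}.")


malta = AncientSample(2013, "Mal'ta", 24000)
kotias = AncientSample(2015, "Kotias", 10000)

print(malta.label())
print(kotias.label())
print(describe("Scottish", 19, malta))
```

If a second Mal'ta or Kotias turns up, it gets its own discovery year and its own date, and nothing already written has to be renamed.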