Wednesday, October 28, 2015

What Is the Best and Most Accurate Ancestry Calculator (DNA Testing)?

What Is the Best and Most Accurate Ancestry or Admixture Calculator from DNA Testing?

We Review 23andme, AncestryDNA, Family Tree DNA (FTDNA), DNA.Land, Dodecad, Eurogenes, etc.

Judging from community discussions in online forums, "admixture" tests are all the rage: a company or other entity takes your raw DNA data, runs it through a calculator, and then purports to tell you where your ancestors came from.  It is not rare for seemingly educated individuals to post sheer and utter nonsense about their results on the Internet, for example, by assuming that a calculator identified their ancestry with something close to 100% accuracy.

In the online world, there is no such thing as perfect privacy.  And in DNA, there is no such thing as 100% accuracy for ancestry calculators. 

This is because all people are admixed, but not all ethnic groups form part of the samples.  Put another way, if your ancestors come from a valley in Switzerland where no one has ever been tested, you might show up in a test as French, German, Italian, Austrian, but not Swiss. 

You might say to yourself that you have ancestry documented back to the dawn of time showing that you are from Switzerland.  You may match other Swiss people exactly.  But because the Swiss are indeed mixes of the groups above, and because there are no specific, micro-targeted Swiss samples in the hypothetical database that match you more closely than those other nationalities, the test would be woefully inaccurate for YOU.  After all, you don't want a test to tell you that you might be Northern Italian if you are Swiss.  (For that matter, do you NEED a test to tell you that?  See below.)

In the online privacy world, they've named protections that are scientifically the best (and do their job pretty darn well) "Pretty Good Privacy."  In the DNA world, all we can hope for is "Pretty Good Accuracy": ancestry calculators that are scientifically grounded, that don't make claims beyond what they can really do, and that get the broad regions correct at the very least.

The coolest benefit of living in a college town (Berkeley for this blogger) is that there are a ton of people from all over the world, with pretty well-defined ancestry.  For example, that Danish exchange student with 500 years of documented ancestors in Denmark?  That's a good candidate for testing some of these calculators.  Enough friends of mine have taken DNA tests, and we've plugged the results into the calculators across several paysites (testing companies) like 23andme and AncestryDNA, and into free calculators, like the ones available on Gedmatch.  Who came out on top?

By far, the best and most accurate ancestry calculator is on 23andme.  Like all good scientists, they are humble instead of full of hubris.  They don't profess to give you one set of results and say, "this is it."  Instead, they give you three different results: standard, conservative, and speculative.  Each is pretty darn accurate for most of the people we know who have tested there and at other sites.  Bottom line: 23andme's "Ancestry Composition" feature is outstanding, and the best, most accurate one online we could find.

It is our opinion that the least accurate ancestry calculator is at the new site  And the one on FTDNA is a close second.  Both are terrible.  Almost everyone who used the feature on reported that the calculator is way off; just not ready for prime time at time of writing this post.

How do these calculators work?  Well, remember, the data that comes out is only as good as the data that goes in.  It is always worth remembering the concept that computer programmers call "GIGO: Garbage In, Garbage Out."  What this means is that if the data on which a conclusion is based is faulty, the answer will also be faulty.  With calculators, this manifests itself in two ways: a shifted focus, or faulty or incomplete baseline data.

By a shifted focus, we mean that several calculators are built around one particular set of populations.  Take, for example, the MDLP Ethnicity Calculator, offered at Gedmatch alongside Eurogenes, Dodecad, and Gedrosia.  MDLP stands for Magnus Ducatus Lituaniae Project.  As you might have guessed from its name, it focuses on the people from lands that used to form the Grand Duchy of Lithuania: places in Northeast Europe, including Poland, Estonia, etc.

MDLP seeks to be very good at calculating ethnic tidbits of interest to those populations.  But is it good for determining the difference between, say, a Catalonian Spaniard and a Northern Italian?  No, it's actually quite bad on that front.  That's simply not its focus.

Similarly, there are other calculators on Gedmatch that exist to focus on and cater to Asians, Africans, even mixed race folks.  And within European populations, you have other focuses, like Dodecad, which seems Grecocentric, for lack of a better word.  None of these will do very well outside their focus areas.  So take the results from those calculators with a grain of salt, unless you happen to hail from their regions of focus.

Don't believe that?  Think I'm being extreme?  If you are European, try putting your data in a calculator that is focused on another population, like the East Asian-focused calculators.  It won't tell you that you are NOT East Asian.  It will tell you which East Asian population you resemble the most.  To be clear: if all a calculator has is East Asian samples, a European will be told he or she is Japanese or Chinese.  This same concept applies within European-focused calculators at the regional level.
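Here is a toy sketch of that "closest available match" behavior.  It is not how any real calculator is implemented; the population coordinates are invented numbers in a made-up two-dimensional "genetic space," and the function name closest_population is purely hypothetical.  The only point is that a calculator limited to its own panel has no way to answer "none of the above."

```python
import math

# Hypothetical reference panel: every coordinate below is invented for
# illustration. Real calculators compare many thousands of markers.
EAST_ASIAN_PANEL = {
    "Japanese": (9.0, 8.0),
    "Han Chinese": (8.0, 9.5),
    "Korean": (9.5, 9.0),
}

def closest_population(sample, panel):
    """Return the name of the nearest panel population and its distance."""
    name = min(panel, key=lambda pop: math.dist(sample, panel[pop]))
    return name, math.dist(sample, panel[name])

# A made-up European sample that sits far away from everything in the panel.
european_sample = (1.0, 1.5)

# With only East Asian references, the "answer" is still an East Asian label,
# even though the distance to that label is large.
narrow_best, narrow_distance = closest_population(european_sample, EAST_ASIAN_PANEL)

# Add a (hypothetical) French reference and the same sample is relabeled.
wider_panel = dict(EAST_ASIAN_PANEL, French=(1.2, 1.4))
wide_best, wide_distance = closest_population(european_sample, wider_panel)

print(narrow_best, round(narrow_distance, 2))
print(wide_best, round(wide_distance, 2))
```

Notice that the only hint something is wrong in the first case is the large distance, and most calculators do not show you anything like that number front and center.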

In terms of bad baselines, recall the Swiss example above.  Europe is filled with micropopulations that exhibit a high degree of population homogeneity (a little inbred, to use the pejorative term).  If a calculator does not have a sample from your micropopulation (the narrow region where your ancestors lived for millennia), then you will get a faulty reading.

Put simply, to use a French example: France is a big country.  Normans are not Basques, Provencals are not Bretons, etc.  That is why the best calculators are HONEST.  23andme discloses quite readily that for the huge populations in the middle of Europe (French, Germans, but also Benelux countries, etc.), it cannot spot the DNA with certainty 92% of the time.

Does the 23andme website have any drawbacks?  Sure it does.  But they are minor compared to the others. 

First, its "Countries of Ancestry" feature is not what it could be.  But it's important to understand three things:  (1) This is NOT their ancestry calculator, but another feature entirely, so perhaps it's unfair for us to even review it in this space.  (2) It's experimental, and they state that.  (3) They are wisely phasing it out.  What was the problem with that feature?  Well, it gave you the list of countries of the people who have the most matches with you.  Let's say, for example, you are half Italian, half Polish (a common mix in Chicago).  In other parts of Chicago, another common mix is half Polish, half Irish.  For whatever reason, people of Irish heritage have tested themselves in far greater numbers than the others.  Your Polish DNA would overlap (match) with the people who reported they were half Polish, half Irish.  And this feature would then tell you that "a high percentage of the people who have DNA similar to yours are from Ireland."  Do you understand?  It's a huge problem, especially for smaller populations, and especially because so many Americans are now half this, half that.  The result is just not that edifying.
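To make the Chicago example concrete, here is a small sketch of how a simple tally of your matches' self-reported countries can go wrong.  The database contents, the counts, and the function name country_tally are all invented for illustration; this is an assumption about the general idea, not a description of 23andme's actual method.

```python
from collections import Counter

# Hypothetical testers who share Polish DNA with you. Each tuple is the pair
# of heritages a tester reported. The counts are made up: the half-Irish
# group is simply over-represented because more of them took the test.
matches = (
    [("Poland", "Ireland")] * 6
    + [("Poland", "Italy")] * 2
    + [("Poland", "Poland")] * 1
)

def country_tally(matches):
    """Count how many matching testers reported each country."""
    counts = Counter()
    for reported in matches:
        # set() so a tester who lists the same country twice counts once.
        for country in set(reported):
            counts[country] += 1
    return counts

tally = country_tally(matches)
irish_share = tally["Ireland"] / len(matches)

print(tally.most_common())
print(f"{irish_share:.0%} of your matches report Irish heritage")
```

You have no Irish ancestry at all in this scenario, yet Ireland shows up in two thirds of your matches, well ahead of Italy, which is actually half of your background.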

23andme also suffers from the same sample issues as many of the other ancestry calculators.  For example, 85% of Italian Americans (TRANSLATION: potential customers, since most people who test are from Britain or the US) hail from just 3 regions in the deep south of Italy: Campania (Naples), Calabria, and Sicily.  Yet the population samples that most of these websites use are from Tuscany.  Even though Dante tried to meld them, Tuscans are not Sicilians and vice-versa. 

Often, when these calculators see Sicilian or rural Southern Italian genes, they, in effect, say: we don't know what you are! You are kind of Italian, but you also resemble, a little bit, people from Cyprus or Jews.  So they give an odd result.  And then you have someone who tested and says, "I might be Jewish."  No.  The answer is that your people were not included in the data-set by which the baseline was developed.  If they were, the calculator would recognize you as a run of the mill Sicilian.
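A rough sketch of why a missing baseline produces those odd "part this, part that" readings follows.  It assumes a calculator that models you as a weighted blend of its reference populations and picks the blend with the smallest squared error.  Every marker frequency below is invented, the brute-force grid search is just the simplest thing that works (real tools use far more sophisticated fitting), and the names best_mixture and blend are hypothetical.

```python
from itertools import product

# Invented frequencies at four markers for each reference population.
TUSCAN = (0.2, 0.8, 0.4, 0.6)
CYPRIOT = (0.8, 0.2, 0.6, 0.4)
SICILIAN = (0.35, 0.65, 0.45, 0.60)

def blend(weights, references):
    """Weighted average of the reference populations, marker by marker."""
    n_markers = len(references[0])
    return [
        sum(w * ref[i] for w, ref in zip(weights, references))
        for i in range(n_markers)
    ]

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_mixture(sample, panel, steps=20):
    """Try every blend in 1/steps increments and keep the closest one."""
    names = list(panel)
    references = [panel[name] for name in names]
    best = None
    for counts in product(range(steps + 1), repeat=len(names)):
        if sum(counts) != steps:
            continue  # weights must add up to 100%
        weights = [c / steps for c in counts]
        err = squared_error(sample, blend(weights, references))
        if best is None or err < best[0]:
            best = (err, weights)
    return dict(zip(names, best[1]))

# The person being tested is, by construction, a typical Sicilian.
sample = SICILIAN

without_sicily = best_mixture(sample, {"Tuscan": TUSCAN, "Cypriot": CYPRIOT})
with_sicily = best_mixture(
    sample, {"Tuscan": TUSCAN, "Cypriot": CYPRIOT, "Sicilian": SICILIAN}
)

print(without_sicily)
print(with_sicily)
```

With these made-up numbers, the panel that lacks a Sicilian reference reports the sample as three quarters Tuscan and one quarter Cypriot, while the panel that includes one reports it as entirely Sicilian.  Same person, same DNA, different baseline.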

All online ancestry calculators also suffer from a lack of inter-operability and from non-standardized terms.  For example, among the calculators on Gedmatch, some use the term "Caucasian" to mean "generalized European" (which is how it is used in common parlance, of course).  Others use it in the specific sense: people from the Caucasus, such as the country of Georgia, Armenia, etc.

Here's the bottom line: don't expect any ethnicity or ethnic-origins calculator to be 100% correct.  Don't expect new insights if you have confirmed records.  In other words, if you look just like your dad (you're not a bastard), and you're not adopted, and you have records going back centuries -- why do you need an ethnicity calculator to begin with?

These admixture tests can help if you were adopted, and want to have a sense of where to start.  But keep in mind, the largest share of Americans come from German heritage, and yet the best calculator currently cannot identify German DNA 92% of the time.

Avoid the mythology and those who oversimplify.  There are reliable sources out there in genetic genealogy, like Debbie Kennett -- and there are a lot of charlatans.  Be careful whenever someone oversimplifies to the point of exaggeration, falls into stereotypes, or tells you what you want to hear.  With DNA as with everything, the most parsimonious answer is often the best.  The exotic is often wrong.

As the science improves, you can't go wrong using the Standard or Conservative setting on the 23andme Ancestry Composition test.



Monday, October 19, 2015

Toward A New Understanding of Etruscan Origins

As this now archived thread on Anthrogenica shows, the two sides to the Etruscan debate are like ships passing in the night.  They can't seem to agree on much.  This post attempts to reconcile them, sort of, while debunking what I call the Contemporaneous Anatolian Origin of the Etruscans (CAOE) Model.

When talking about the origins of a people, it is important to specify timing as well.  Even the best scientists are guilty of disobeying this rule when they speak or write in shorthand.  The most obvious example is this: do you have any African blood?  Do you have an African origin?  You might answer "no" if you took the question to mean in the genealogical time period (the last 500 years) or even since the Upper Paleolithic (the last 40,000 years)!

But, as you know, everyone on the planet has an African origin if you go back far enough.  All modern humans migrated out of Africa.  So the same statement, "that population is of African origin," is both true and false, depending on the time context.

Let's apply this to the Etruscans.

What we have learned recently is that ALL Europeans descend from three primary groups:  Western European Hunter Gatherers (who originated in Western Europe during the Mesolithic), Farmers (who migrated from the Near East during the Neolithic), and Steppe People (who migrated from the flatlands between Europe and Asia during the early Bronze Age).

When the CAOE "Etruscans are exotic" folks ply their wares, they argue that Etruscans had an origin in Anatolia or the Aegean, right before they appeared in Italy.  Now, the first Etruscan sites date from approximately 900 BC.  We have clear Etruscan inscriptions dating to 750 BC, so they were probably writing by 800 BC.

I have always doubted there was a mass migration of Etruscans (from the Near East) before their appearance in Italy.  There are just too many facts weighing against it.

Then it dawned on me: we *all* came from the Middle East at some point.  Is it possible this argument is one of degrees?  That the CAOE folks have their timing wrong?  That the CAOE folks should have the "C" knocked off their theory, and the disagreements would be synthesized?

Here is how it might have worked:

There was a mass migration to Europe of farmers from the Near East, and it appears to have been quite strong around 3000 BC.  The final waves of farmers were migrating to Europe around 2500 BC.  Now, is it proper to call these people "Anatolians" or "Aegeans" or "Near Easterners"?  Insofar as those designations are intended to mean anything beyond geography: no.  This was pre-race, and since these people "became" modern Europeans, any such designation is pretty meaningless.  Most modern Europeans are about 40% descended from these people.

Is it possible that the Etruscans, having a stable, affluent, consistent civilization, retained more of their cultural practices, traditions, and indeed language, and thus some vague collective memory of this mass migration?  Is it possible that the first Italian culture to have writing was able to transmit more culture down between the generations because of it?  Because that is how it works.

In other words, ALL peoples in Europe then and now are partly descended from farmers who originated in the Middle East a long time ago.  If the Etruscan people (bringing the language) were from one of the later waves, and the Etruscan society was stable and had the ability to transmit culture, could these transmissions and this uniqueness be the signals that the CAOE folks misinterpret and cite as evidence for a later Anatolian origin of the Etruscans?

Let's be clear: the land of the Etruscans overlaps perfectly with the land of the Villanovans, and there is no evidence for discontinuity or rapid replacement or trauma when Villanovan culture becomes unequivocally Etruscan.  I firmly believe that an Etruscan "migration" event around 900-750 BC is sheer fantasy.

BUT, I think it is possible that of the peoples in Italy, the Etruscans, by holding the richest, most fertile, most well-defended, and most defendable pieces of real estate, simply did not suffer any further migrations and inflows after they established themselves in say, 2000 BC.  In other words, the Indo-Europeanized peoples of Italy ALSO descend from Western Hunter Gatherers and Neolithic Farmers (and the genetic evidence CERTAINLY backs me up on this point), BUT the Indo-Europeanized peoples of Italy (Latins, Umbrians, Oscans), experienced a more recent inflow of both people and genes, which resulted in language and culture change.  The Etruscans, for reasons already given, did not.

To this day there is very little genetic difference between the people of Tuscany and their neighbors in Italy.  The ancient Etruscans cluster with Southern Italians genetically, which would be consistent with this theory: that the ancient Etruscans had a smidge more Neolithic Farmer, plus cultural continuity, because they did not suffer an upheaval like the other peoples did when the Iron Age Indo-European-speaking Steppe people invaded.

This makes good sense.  This would explain also why the Etruscan language survived as a relic amidst a sea of Indo-European.

So next time you meet someone who thinks the Etruscans were contemporaneous (and ethnic) migrants to Italy from the Near East, remind them of the wealth of evidence against it.  And then, if they are the reasonable type, explain to them how ALL Europeans descend in large part from people who DID migrate to Europe from the same areas, just 1000 years before.  They could be spouting a mere truism, and be off by 1000 years or so.

Saturday, October 3, 2015

Berkeley's "Center for the Study of Ancient Italy" Off to an Inauspicious Start

It started with so much promise.  A new, interdisciplinary center at a stellar university, dedicated to studying all matters ancient Italian.

Many of us had hoped it would focus on the more understudied but significant Italic tribes (Umbrians, Lucani, etc.), but that dream quickly dissipated.  The center will have a "special emphasis on the Etruscans and Romans."  (If that makes you wonder how this makes it different from most existing efforts, you are not alone.)

But it is not this emphasis that calls the Center's academic rigor into question.  It is instead its first major effort, the Center's involvement in a workshop on the "Material Connections" between the Etruscans and Anatolia.

The workshop acts as if noticing similarities (and cultural exchanges) between ancient Mediterranean civilizations is something new and groundbreaking.  Yawn.  It's not.

But that is not the topic of this screed. Instead it is something that is frankly really surprising and deeply disappointing: the Center's website makes several pseudo-scientific statements that should be an embarrassment to anyone with one undergraduate class under her belt on historical or scientific method.

We quote verbatim from the Center for the Study of Ancient Italy's website, with commentary in bold italics (no pun intended).

"Similarities in Etruscan and Anatolian material culture have long been noted, but disciplinary boundaries ... have prevented scholars from exploring their implications."

Really?  You've got to be kidding me.  Plenty of scholars and other individuals have "explored" the "implications" of similarities ad nauseam.  In fact, there appears sometimes to be a neverending quest to find such similarities, based on the prejudiced assumption that if a culture was advanced and Italian, it simply had to be exogenous.

Then, the website continues with the somewhat redundant but absolutely bizarre statement that archaeologists apparently haven't sufficiently studied the possible Anatolian connections with the Etruscans, followed by a non-sequitur that recent DNA studies have muddied things further.  (Actually, they haven't, but we won't go into that here.)

Then the two whoppers of all whoppers:

"This workshop will bring together international scholars for the very first time to explore the striking similarities between Anatolian and Etruscan material culture, without an agenda of proving or disproving Herodotus, [i.e., the ancient writer who claimed the Etruscans originated in Anatolia]."

1.  Really?  Is this really the "very first time" that any scholars have gotten together to talk about the alleged similarities between Etruscans and certain ancient Anatolian cultures? 

This is hype that is completely inappropriate for a scholarly web page.  

You want to bring together scholars for something totally new?  Put on a symposium about Dionysius of Halicarnassus, who unlike Herodotus had met an Etruscan, and lived among them for 20 years, and likely spoke their language, and had access to their histories now lost -- and who stated unequivocally that they were autochthonous.  Prove or disprove him.

2.  And the second part of that clause ("without an agenda of proving or disproving Herodotus") is criminal from the standpoint of historical or scientific method.

You have a statement that this workshop will explore the "striking similarities between the two cultures" but that it won't be taking a side.

Come again?  

The conclusion has been stated before the study.  

The outcome has been determined before the workshop.  

Put in layman's terms, there is none of the "if" here, that marks scholarly hypothesis, with a hope for rigorous testing.  The bias is apparent from the statement.  "There are these massive similarities, but we're not taking a side."  LOL.

To add to the ridiculousness of this webpage, which really must be viewed in its entirety to be appreciated, it shows a painting of an Etruscan male holding up his right hand, and (gasp!) a painting of an Anatolian male holding up his right hand.

If this Center wants to be taken seriously, here are some suggestions for topics:

1.  The similarities between Etruscan culture and Egyptian culture.  The tombs, the attitudes towards the afterlife, certain gods and goddesses, certain foods and drinks consumed, certain pottery styles, the fact that the longest Etruscan text discovered was on an Egyptian mummy;

2.  The similarities between Etruscan culture and Campanian (native South Italic) culture.  Certain pottery styles, certain gods and goddesses, certain terms for officials, etc.

3.  The similarities between Etruscan culture and Greek culture.

4.  The similarities between Etruscan culture and Tartessian (ancient south of Spain) culture . . . Phoenician (ancient Lebanon) culture . . . Nuragic (ancient Sardinian) culture . . . on and on.

And then the question to pose:  Why is it that, despite these other similarities, which in certain areas are "striking," scholars continue to focus to the point of obsession on the alleged Anatolian similarities?

Why, despite tremendous similarities between the Etruscans and the Faliscans (Central Italy), Campanians (South Italy), Egyptians, Greeks, etc., does no one try to connect the Etruscans to them?  It really is all Herodotus.  And the wacky "proof" to connect the Etruscans to Anatolians is the very same "evidence" that exists linking them to these other ancient cultures!

In other words, everyone knows the Mediterranean was the "superhighway" of the Ancient World, and that the traders, pirates, and warships traversed and interacted to a much higher degree than we moderns typically assume.  

So why do we continue to attach significance to the fact that the highly civilized Etruscans, the pirates and merchants of their time, borrowed culture from Anatolia?

Why the focus on Anatolia, if it isn't to prove Herodotus?

Now THAT is a workshop I would love to see.

The other day, as I enjoyed a Sapporo, watched my Samsung TV, and gazed at the replica Terracotta Army figures on my lawn, I wondered if some future Berkeley interdisciplinary student would assume I am of Japanese, Korean, and Chinese heritage.  (I'm an Irish-American, living in the Bay Area, which happens to trade a lot with Asia over the Pacific Ocean.)

Sometimes the questions asked reveal a bias.

Sometimes the bias is so overwhelming that it overcomes all science.

Sometimes the premise is the conclusion.

Berkeley's "Center for the Study of Ancient Italy" has disappointed here.