Ancestry Composition

Ancestry Composition is one of the main tools on 23andme to learn more about your African ancestry by way of regional admixture analysis. On this page I will mainly focus on two key aspects:

  1. Reference Populations
  2. Comparison with Ancestry’s “Ethnicity Estimates”

____________________

Your Ancestry Composition report shows the percentage of your DNA that comes from each of 45 different ancestry populations worldwide. We calculate your Ancestry Composition by comparing your genome to the genomes of over 14,000 people with known ancestry. When a segment of your DNA matches the DNA from one of the 45 populations with high probability, we assign that ancestry to that segment of your DNA. We calculate the ancestry for individual segments of your genome separately, and then we add them together to get your overall Ancestry Composition.

____________________
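Just to illustrate the last step in the description above (assigning each segment separately and then adding the segments together), here is a minimal sketch. It is not 23andme's actual algorithm: the segment lengths, the probabilities, the 0.5 threshold and the fallback label are all hypothetical values I picked for the example.

```python
from collections import defaultdict

# Hypothetical cut-off: below this probability a segment gets no specific label.
CONFIDENCE_THRESHOLD = 0.5
# Hypothetical fallback category for segments that cannot be assigned confidently.
FALLBACK_LABEL = "Broadly Sub-Saharan African"


def ancestry_composition(segments, threshold=CONFIDENCE_THRESHOLD):
    """Add per-segment assignments together into overall percentages.

    segments: list of (segment_length, {population: probability}) tuples.
    """
    length_per_label = defaultdict(float)
    for length, probabilities in segments:
        # Pick the best-matching population for this segment only.
        best_population = max(probabilities, key=probabilities.get)
        if probabilities[best_population] >= threshold:
            label = best_population
        else:
            label = FALLBACK_LABEL
        length_per_label[label] += length

    total_length = sum(length for length, _ in segments)
    return {label: round(100 * length / total_length, 1)
            for label, length in length_per_label.items()}


# Made-up segments; leftover probability is assumed to be spread over other populations.
example_segments = [
    (50, {"Nigerian": 0.9, "Ghanaian, Liberian & Sierra Leonean": 0.1}),
    (30, {"Nigerian": 0.4, "Angolan & Congolese": 0.45}),
    (20, {"Senegambian & Guinean": 0.8, "Nigerian": 0.2}),
]
composition = ancestry_composition(example_segments)
print(composition)
```

In this toy example the second segment does not reach the threshold for any single population, so it ends up in the fallback category rather than being forced into a specific region.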

For a greater understanding of your estimated admixture scores it is always advisable to learn more about the methodology used by 23andme to produce their results. In particular 23andme’s reference datasets and its customized algorithm are crucial. To read more about how Ancestry Composition works follow these links:

One of the main things to keep in mind is that what is shown in Ancestry Composition is only as valid as the next update! Whenever new reference populations get added and/or the underlying algorithm gets tweaked or redesigned your results will change! After a long period without any meaningful improvements 23andme has actually implemented several new updates in the period 2018-2020. See also:

Figure 1

Nine specific regions and 3 broad macro-regions are now available on 23andme’s Ancestry Composition to describe your origins across the African continent (aside from “Broadly Sub-Saharan African” and “North African & Arabian”). As always, the country labeling is not to be taken too literally. But it is actually quite indicative if you simply take it as a proxy and also take into account surrounding countries. Although less specific, the macro-regional areas within Africa are also helpful to distinguish: “West African” versus “Central & Southern East Africa” versus “Northern East Africa”. In the meantime “Coastal West Africa” has been renamed “Ghanaian, Liberian & Sierra Leonean”, while “Congolese” has been changed to “Angolan & Congolese”.

***

Of course with each of these updates the expectation will be that 23andme’s analysis gets more refined and more accurate. However no guarantees of course 😉 This is why throughout the years I have undertaken many surveys in order to find out to what degree DNA testing for Afro-descendants (both regional admixture and DNA matches) corresponds with expectations based on historical plausibility (especially slave trade patterns). In addition I have also looked into the DNA results of African DNA testers to see how well their results already correlate with known genealogy/backgrounds.

Obviously there are still several shortcomings to take into account. But based on both my African and Afro-Diasporan survey findings I find it quite impressive that 23andme is often able to describe a person’s African origins in a meaningful regional framework, which will usually correspond quite closely with either known genealogy or historical plausibility. The additional non-African scores and Recent Ancestral Locations actually reinforce the robustness of 23andme’s predictions. See also:

______________________________________________________________________________

Reference Populations

____________________

The reference datasets are made up of individuals from publicly available datasets, including the Human Genome Diversity Project, HapMap, and the 1000 Genomes project, as well as individuals from private 23andMe data collections and a large number of 23andMe customers who have consented to participate in research. In total, there are 11,742 research-consented customers and 2,651 non-customers in these population reference datasets.

____________________

For more details go to:

Figure 2

This overview seems to be the current one since the update in October 2019 (Ancestry Composition v5.2). Take note that sometimes the samples are also specified by ethnic group! However this probably only goes for the samples taken from academic databases. The customer samples are more numerous overall but their ethnic background is probably not known to 23andme in most cases. However you can get an indication by going by 23andme’s “recent ancestral locations” as well as the plausible background of African migrants living in the USA. Following this rationale it seems most likely that, for example, Igbo samples are also used by 23andme.

***

Figure 3

This overview was put online by 23andme after their update in January 2019. The total number of African samples (2165) seems to have decreased somewhat afterwards (2038 in May 2019 and 1980 since October 2019; see this screenshot), possibly because of quality pruning. Either way the most important thing to take away from this is that 23andme mostly relies on African customer samples! These are African migrants and/or their children who have taken a 23andme test and consented to having their data used for this purpose. Greatly beneficial! If you meet certain criteria you can still apply for a free test with 23andme. See also: Global Genetics Project

Figure 4

Taken from 23andme’s website, highlighting their elaborate analysis of how accurate the predictions by Ancestry Composition are when applied to their own reference database of African samples, in particular in terms of so-called precision and recall. The numbers speak for themselves. The somewhat lower recall is to be explained by 23andme’s more conservative approach (up till the 2020 update!) of assigning DNA to “Broadly” categories whenever a more solid detection is not apparent. For more details follow this link: Ancestry Composition: 23andMe’s State-of-the-Art Geographic Ancestry Analysis.
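For readers unfamiliar with these two measures, a small sketch may help. Precision asks: of all the DNA labeled as a given region, how much truly belongs there? Recall asks: of all the DNA that truly belongs to a region, how much was labeled as such? The labels below are made up by me for illustration and the function is my own, not something published by 23andme.

```python
def precision_recall(true_labels, predicted_labels, population):
    """Return (precision, recall) for one population over labeled segments."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == population and p == population)
    predicted_total = sum(1 for _, p in pairs if p == population)
    actual_total = sum(1 for t, _ in pairs if t == population)
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / actual_total if actual_total else 0.0
    return precision, recall


# Hypothetical example: four truly Nigerian segments, two truly Angolan & Congolese.
true_labels = ["Nigerian"] * 4 + ["Angolan & Congolese"] * 2
# A conservative classifier puts one uncertain Nigerian segment under a "Broadly" label.
predicted_labels = [
    "Nigerian", "Nigerian", "Nigerian", "Broadly West African",
    "Angolan & Congolese", "Angolan & Congolese",
]

precision, recall = precision_recall(true_labels, predicted_labels, "Nigerian")
print(precision, recall)
```

In this toy case everything labeled “Nigerian” really is Nigerian, so precision stays perfect, while recall drops because one Nigerian segment was moved to a “Broadly” category. That is the trade-off of a conservative approach.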

***

Comparing with Ancestry

Not meant to be exhaustive in any way! I may elaborate in the near future. First of all it is crucial to realize that even when similarly labeled, ancestral categories on different DNA tests will not be perfect equivalents or measure the same thing. Basically this is due to differences in the reference samples as well as in the algorithm applied by each separate DNA test. Either way the labeling of regional admixture categories is never to be taken too literally. Instead of fussing about it or being overly dismissive, understand that this labeling is merely intended as an approximate proxy, which can still be helpful if you also take into account: neighbouring countries; macro-regions (see this map); the known migrations of ethnic groups; pre-colonial history etc. (see note 2).

A second thing to keep in mind is that several updates carried out in the last two years have made a great impact on the regional admixture estimates reported by both 23andme and Ancestry. Arguably 23andme has maintained a greater degree of consistency and has shown steady improvement with their updates. The updates on Ancestry show a much more mixed record…although to be fair Ancestry is showing some positive changes again with its latest update (see this link). When comparing the African regional framework of AncestryDNA (2013-2018 version!) with that of 23andme (2018/2019 version) the following aspects stand out to me:

3 West African regions on 23andme versus 5 West African regions on Ancestry
  • 23andme does not have any regions in place comparable to either “Benin/Togo” or “Mali” on Ancestry. This inevitably results in some shifts within the African breakdown. In theory this allows for a finer West African resolution on Ancestry. But regrettably after a promising start in its 2013-2018 version this potential has not yet been fully realized by Ancestry. The wild fluctuations caused by Ancestry’s updates between 2018 and 2020 have undermined the coherence of especially its West African framework.
  • 23andme applies a more conservative approach than Ancestry, whereby African DNA which cannot be classified reliably (given their limited set-up) is put under either “Broadly West African”, “Broadly Congolese & Southern East African”, or “Broadly African”. Some people don’t like this but frankly I prefer for DNA testing companies to go by the motto: “don’t be more specific than your underlying data allows for”.
  • “Senegambian & Guinean” is quite similar to “Senegal” on Ancestry but also covers Guinea Conakry. It seems that 23andme has an edge when it comes to predictive accuracy, especially given that the current version of “Senegal” on Ancestry is not always reported consistently among Afro-descendants. See also this blogpost.
  • “Ghanaian, Liberian & Sierra Leonean” is quite similar to “Ivory Coast/Ghana” on Ancestry, but with a clear shift to Sierra Leone and decreased coverage of Ghanaian DNA (especially for Ewe!). Still pretty accurate for native West Africans from these countries. Given the dramatic loss in predictive accuracy of “Ivory Coast & Ghana” on Ancestry since its 2018 update, 23andme clearly has the upper hand here. See also this blogpost.
  • “Nigerian” has nearly the exact same labeling as “Nigeria” on Ancestry. However it is much more predictive, going by results for southern Nigerians, at least when comparing with Ancestry’s 2013-2018 version. Ancestry has actually greatly improved its detection of Nigerian lineage in its 2019 & 2020 updates. The “Nigerian” category on 23andme still also covers DNA found in neighbouring countries, all the way west into Ghana even! This will be relevant especially for Afro-Diasporans. See also this blogpost.

Map 1

The map on the left shows the distribution of “Senegambian & Guinean” scores among my African 23andme survey participants. The map on the right shows the group averages for “Senegal” on Ancestry. Both regions appear to be quite similar in scope. The predictive accuracy is somewhat greater on 23andme though, and its coverage also extends into Guinea Conakry. Ancestry’s updates in 2019 & 2020 did lead to considerable improvement for West Africans themselves and especially Cape Verdeans. However for other parts of the Afro-Diaspora this region seems to be understated.

***

Map 2

23andme vs ADNA

The map on the left shows the distribution of “Ghanaian, Liberian & Sierra Leonean” scores among my African 23andme survey participants. The map on the right shows the group averages for “Ivory Coast/Ghana” on Ancestry. Both regions appear to be quite similar in scope. Going by preliminary group averages, the predictive accuracy is somewhat greater on 23andme though. In particular 23andme can now provide a much better coverage of Sierra Leonean DNA. However in Ancestry’s 2013-2018 version the additional regions of “Benin/Togo” and “Mali” did enable greater and often also meaningful resolution, helpful for making the distinction between Akan and Ewe results for example. Regrettably the predictive accuracy of “Ivory Coast & Ghana” has greatly decreased on Ancestry in subsequent updates.

***

Map 3

NIGERIAN

This map shows the group averages of “Nigerian” among my African 23andme survey participants. Impressive coverage of southern Nigerian DNA for Nigerians themselves, quite similar to the current situation on Ancestry (after its 2019 & 2020 updates). However there is also considerable overlap with DNA found to the west and the east of Nigeria. Native Ghanaians and Cameroonians without any recent Nigerian lineage will still often show “Nigerian” scores in excess of 30%. Not a perfect outcome therefore.

***

3 Central & Southeast African regions on 23andme as well as Ancestry
  • “Angolan & Congolese” is quite similar to “Cameroon/Congo” on Ancestry. However it is fortunately much more focused on describing genuine Central African lineage. With hardly any overlap with the Bight of Biafra as was the case for “Cameroon/Congo” (2013-2018 version!). See also this blog post.
  • “Southern East African” is somewhat similar to “Southeastern Bantu” on Ancestry. But it is much better defined because 23andme also has 3 separate regions in place for Northeast Africa. A big improvement therefore when wanting to single out such lineage. See also this blog post.
  • “African Hunter-Gatherer” is pretty much the same thing as “South-Central Hunter-Gatherers” on Ancestry (currently renamed “Khoisan, Aka & Mbuti people”). In both cases the scores are very minimal for almost all people and hardly relevant, as they are often to be traced back thousands of years rather than within a genealogically meaningful timeframe. South African Coloureds are a notable exception.
  • Actually 23andme also has 3 additional categories for Northeast Africa as well as 1 for North Africa. Except for “Sudanese” these currently all have their equivalents on Ancestry as well (since the 2018-2020 updates). But all of these regions are usually practically absent for Atlantic Afro-descendants. From my observations it is quite rare to see any such scores above trace level (>1%). They could be merely “noise” in many cases or otherwise an indication of Sahelian derived ancestry. The exception is “North Africa”, which is frequently reported with minor but still distinctive percentages (even >5%) for Latin Americans and Cape Verdeans in particular. In most cases this is to be correlated with their Iberian ancestry though, although at times a Sahelian/Fula ancestral scenario might also apply.

Map 4

Congolese

This region has now been renamed into “Angolan & Congolese”. And it is indeed quite predictive of both Angolan and Congolese lineage. But not exclusively so! In reality it is measuring genetic similarity among Bantu-speaking populations who are dispersed over a far greater territory! As my survey findings clearly demonstrate this “Angolan & Congolese” region can be found as far south as Mozambique and South Africa! To the north it also has a substantial presence in Cameroon. But more so among certain ethnic groups closely related to Bantu populations. Given the absence of Cameroonian samples on 23andme (unlike Ancestry!) this creates considerable genetic overlap with “Nigerian”. Otherwise still very useful as a proxy of Central African lineage. Similar to Ancestry’s current “Cameroon, Congo & Western Bantu” region.

***

Map 5

Southern East African

The labeling of this region may be referring to the southeast of Africa. But the real focus is on samples from the Swahili countries: Tanzania, Kenya, Rwanda and Uganda. Notice that for example Mozambican DNA seems to be more so covered by “Angolan & Congolese”. Among my survey participants from Zimbabwe and Zambia “Southern East African” scores are mostly between 10% and 20%. But it can get as low as 4% for one of my Mozambican samples! For Afro-Diasporans it is also worthwhile to know that Malagasy DNA is partially described by this region, but not convincingly so. It is therefore not an equivalent of the former “Southeastern Bantu” region on Ancestry. It tends more towards the current “Eastern Bantu” region perhaps, but it also covers some of the same area as the newly introduced “Southern Bantu” region on Ancestry.

***

YouTube

Just for illustration, a few African 23andme testers who discuss their results on YouTube. I have always believed that when it comes to regional admixture the proof of the pudding is when people who are “100%” from one particular ethnic background take the test, or also people of recently mixed but still known background. See how well their ancestry is being predicted or described and that already tells you a lot about what you can expect for yourself 😉

***

AFRICAN AMERICAN & CAPE VERDE (1/4)

***

***

AFRICAN AMERICAN & LIBERIAN

***

***

NIGERIA (Yoruba)

***

***

CAMEROON (Bamenda/Bafut)

***

***

ZAMBIA (Lunda)

***

***

TANZANIA

***

***

NIGERIA (Yoruba) & RWANDA (Tutsi)

***

***

TUTSI (Banyamulenge)

***

***

ETHIOPIA & SUDAN

***

***

MOROCCO

***

***

___________________________________________________________________________

Notes

1) According to many pundits only continental admixture is to be taken seriously in DNA testing, with sub-continental admixture (a.k.a. ethnicity estimates, a.k.a. regional admixture) only being fit for entertainment purposes. I myself have never taken this stance. I prefer to judge each case on its own merits, attempting to maximize informational value despite imperfections and avoiding source snobbery. This is why I have conducted my AncestryDNA surveys among both Africans and Afro-descendants in the past. Since 2018 I have also started similar surveys based on 23andme results.

From these ongoing research efforts I have learnt that regional admixture DOES matter and is of course NOT randomly determined. Correct interpretation and knowing how to really “read the data” is a crucial requirement though to get the most out of your results. The ancestral predictions by 23andme may not be 100% accurate but still in most cases they are reasonably well-aligned with the known or historically plausible backgrounds of my African & Afro-Diasporan survey participants. It can therefore be very valuable in your journey to Trace African Roots! To be combined with any other ancestral clues you may have, especially DNA matches. See also:

2) Several valid objections can be made about the country name labeling being applied on both Ancestry and 23andme. But the truth is that the labeling of ancestral categories will always be tricky and a trade-off! Ancestral categories referring to ethnic groups might be just as deceptive or even more so, as many people will again tend to take them too literally. They underestimate not only the sheer number of ethnic groups existing in Africa (thousands!) but also the complexity of the interplay between fluid ethnicity, overlapping genetics and shifting political borders. The same goes for precolonial African kingdoms, which again were not static entities but instead very often ended up being multi-ethnic after expansion and assimilation of neighbouring peoples.

I do agree that more appropriate labels than the present ones can be conceived of. Also knowledgeable scholars in African & Afro-Diasporan history should be involved to redo the regional descriptions so that people will more immediately be aware of the ancestral connections being implied. An intermediate solution might be ancestral regions which are referring to either non-political geography or meta-ethnic/linguistic groups. Such as Atlantic, Mande, Kru, Akan, Gbe etc. (see this page). But I fear that inherently there will always be some degree of blurriness involved and exact delineation might be impossible to achieve in many cases. Instead of generating false hope it might be a more honest approach to go by the motto of “don’t be more specific than your data supports”.

Understandably many people desire to have the most specific degree of resolution when searching for their African roots. They want to be able to pinpoint their exact ethnic origins and preferably also know the exact location of their ancestral village, in a way following in the footsteps of the still very influential ROOTS author Alex Haley. Unfortunately these are rather unrealistic expectations to have in regards to DNA testing (at least in regards to admixture analysis). This is not only because of current scientific possibilities, but also because such expectations rest on widely spread misconceptions about ethnicity, genetics, genealogy as well as Afro-Diasporan history.

Too often people ignore how the melting pot concept is really nothing new but has always existed! Also in Africa, where inter-ethnic mixing has usually been frequent throughout (pre)history and maybe even more so in the last 50 years or so. Generally speaking ethnicity is a fluid concept which is constantly being redefined across time and place.

Too often people fail to take into consideration how due to genetic recombination our DNA will never be a perfect reflection of our family tree but might actually also at times suggest very ancient migrations.

Too often people underestimate the actual number of relocated African-born ancestors they might have (dozens or even hundreds!). As well as the inevitable ethnic blending which must have taken place across the generations.

Too often people are still not informing themselves properly about Africa itself and the documented origins of the Afro-Diaspora. Many specific details may have been lost forever but there is a wealth of solid and unbiased sources available which can help you see both the greater picture as well as zoom in more closely to your own relevant context. See also: