Suggestions for improving the African breakdown on AncestryDNA

In previous blog posts I have demonstrated how the current African breakdown on AncestryDNA can be very insightful for gaining a greater understanding of the regional African roots of people across the Afro-Diaspora as well as actual Africans themselves. Despite several shortcomings and the continued need for correct interpretation, my survey findings on a group level have still been reasonably in line with either historical plausibility or actual verifiable genealogy.

A new version of AncestryDNA’s Ethnicity Estimates has been rolled out gradually (and quietly..) to a subset of Ancestry’s customers since at least April 2018. I do not have all the needed information in place yet to make a proper assessment. Therefore I reserve my final judgment on this intended update for later. However in this blog post I will discuss some suggestions on how to improve on the current African breakdown, hopefully ensuring that Ancestry’s update will be a step forward and not a step backwards. Below is a short summary of these suggestions. If you continue reading I will provide more details.

  1. Maintain current coherency of African breakdown and improve by creating less overlapping and more predictive regions
  2. Add more historically relevant African samples to Ancestry’s Reference Panel. In particular from Angola, Burkina Faso, Guinea Bissau/Conakry, Liberia, Madagascar, Mozambique and Sierra Leone.
  3. Create new regions and/or migrations centered around these historically relevant samples.
  4. Bring back the continental breakdown display (subtotals specified for each continent)
  5. Create new African “migrations”, a.k.a. genetic communities. In particular for Nigeria & Ghana, as sufficient customer samples may already exist.
  6. Mention the “aggregate ethnicity estimates” for each migration/genetic community.
  7. Enable the Ethnicity Estimate Comparison feature for all customers and not just USA-based smartphone users.
  8. Show ethnicity/admixture of shared DNA segments with your matches.
  9. Avoid misleading labeling of ancestral regions, which provides a false sense of accuracy.

Updated results for a Nigerian (Bini, Itsekiri, Urhobo & Isoko)

***

NAIJA updatea

Even though these are only individual results, this outcome for an actual Nigerian could imply that Ancestry’s update may also lead to a substantial decrease of “Nigeria” amounts for other people of (southern) Nigerian descent, while the “Benin/Togo” as well as the “Cameroon, Congo, and Southern Bantu Peoples” regional scores may drastically increase. This would undo the imperfect yet still reasonably predictive accuracy of the “Nigeria” region in the current set-up. See also: Nigerian AncestryDNA results.

***

***

Upcoming Update?

***

region1

Source: Ancestry.com

***

It is sometimes said that your DNA results are only as good as the next update. So it’s best not to get too attached to them 😉 After all it is just a snapshot of how your DNA compares with the reference samples in Ancestry’s current database according to their current algorithm. Given scientific advancements and a greater number of relevant African reference samples, hopefully a greater degree of accuracy may be obtained in the near future. But naturally there is no guarantee that any given update will automatically lead to an improvement (or at least not on all fronts).

As shown above one major change in regards to AncestryDNA’s current African breakdown might be the combining of the “Cameroon/Congo” and “Southeastern Bantu” regions into one single region labeled: “Cameroon, Congo & Southern Bantu Peoples”. In regards to Central/Southern African DNA this change therefore seems to be leading to more generic rather than specific results. That said, the appearance of the new region labeled “Eastern Africa” could in itself represent an improvement, enabling the identification of Northeast African DNA.

Another change concerns Ancestry’s algorithm which now “reads longer stretches of your DNA at once”. This might also be an improvement in itself as it may lead to a decrease of trace region reporting and a greater focus on a genealogically meaningful timeframe (going back 500 years or so). It might also be that some of the 13,000 newly added samples in Ancestry’s Reference Panel (partially) apply to the already existing African regions, although I have not seen any specification yet of these newly added samples. At the moment of writing this blog post there is still some remaining uncertainty whether AncestryDNA’s intended update will indeed be implemented or remain stuck in beta phase (as happened in 2016). I will therefore refrain from any in-depth judgement for now. For more details about the update:

Again I will need more data to make a detailed assessment of how Ancestry’s proposed update will work out for the African breakdown. I have however already seen more than a dozen updated results, including for several African Americans, Cape Verdeans as well as a few Africans. Based on just these results and at the risk of speaking prematurely I regret to say that I am doubtful that Ancestry’s intended update will be an improvement for the African breakdown [1]. Rather I am quite concerned that it will lead to more people being confused and even misled by their DNA results. Due to the often drastic and seemingly incoherent changes compared with the current set-up it might understandably also lead to a loss of confidence in admixture analysis. This would be unfortunate, as I strongly believe that this aspect of DNA testing can provide very valuable information as long as it is interpreted correctly, and even more so when combined with other ancestral clues (population averages to be used as regional benchmarks, DNA matches, haplogroups, genealogy, relevant historical context etc.). See also these blog posts:

 Updated results for a Cape Verdean

***

CV

This Cape Verdean person went from having one of the highest “Senegal” amounts in my survey to having one of the highest “Mali” percentages. This is all the more remarkable because receiving “Senegal” as a top region was a very consistent outcome of my Cape Verdean survey (90/95) while it was quite rare to have “Mali” show up as biggest region (3/95). Still this is not a completely random change as in my survey I have been treating so-called “Senegal” and “Mali” as closely related *Upper Guinean* regions. I am not sure if such changes will be commonly seen. However it must be noted that “Senegal” might be a more fitting proxy label for Cape Verde’s Upper Guinean lineage (which is more so coastal, incl. Guiné Bissau, rather than interior: see this link).

***

Suggestions for improving AncestryDNA in regards to Tracing African Roots

What is about to follow partially represents my own subjective point of view as an Ancestry customer and an interested layman. However it is also based on the many observations I was able to make during my survey of AncestryDNA results among Afro-Diasporans and Africans, which I started five years ago already. I have also been inspired by the frequent interaction I have had with people sharing their results with me or when engaging in online discussion boards. This overview is not meant to be exhaustive. The suggestions being made are also not ranked in any particular order of priority. Not wishing to come across as overly demanding I have attempted to balance feasibility with valid needs/wishes for an improved regional framework to describe the African roots of people from the Afro-Diaspora as well as actual Africans. For previous discussion on this topic see also:

Regional descriptions for “Benin/Togo”: current version and updated version

***

Benin Togo region (current)

Current map and regional description on Ancestry’s website given for “Benin/Togo”. Notice how due to genetic similarity this region (despite the modern day country name labeling) is also found in: Ghana, Nigeria and Mali. At first perhaps confusing but given proper follow-up research still more or less manageable: How to make more sense of “Benin/Togo” scores

***

Benin Togo region (new)

This map and regional description for “Benin/Togo” is being provided on Ancestry’s website for people whose results have been updated. Notice how the map has greatly expanded, covering almost all of West Africa! Despite its seemingly exact labeling so-called “Benin/Togo” is now to be found as far west as Sierra Leone and as far south as Gabon, decreasing rather than increasing its predictive accuracy!

***

1) Maintain the coherency of the current African breakdown and improve by creating less overlapping and more predictive regions. I fully realize that this is by no means an easy task given the inherent trade-offs to be dealt with when aiming for regional delineation despite genetic similarities. Many hurdles and pitfalls are to be overcome while designing an appropriate regional configuration. In my opinion the current African breakdown on AncestryDNA was a pioneering initiative which already did succeed in providing a very useful tool for Tracing African Roots, given correct interpretation. It would be a true shame if this accomplishment goes down the drain due to perhaps inadequate QA or unintended side-effects of Ancestry’s upcoming update…

Reviewing the current 9 African regions there is indeed much room for improvement. Many shortcomings however may be dealt with quite effectively and with relatively little effort:

  • “Senegal”: despite the minimal sample size (n=28) this region has already been very useful in singling out Upper Guinean lineage. Its very high predictive accuracy for the “typical native” (100%) is confirmed by the results of Cape Verdeans and Hispanics in particular, but also for example Hausa-Fulani. In order to reduce its current coverage/overlap into Sierra Leone and even Liberia, more defining samples are however needed. Perhaps Wolof or other Atlantic samples would be more suitable than the presumably Mandenka samples being used right now (see also suggestion 3).
  • “Mali”: this region is probably most in need of additional and appropriate sampling (along with “Southeastern Bantu”). Right now only 16 samples are available, with a very low predictive accuracy for the “typical native” from Mali (39%). But going by my survey results this region is still reasonably predictive of Upper Guinean lineage for Afro-Diasporans. Then again it still remains ambivalent because of genetic overlap with Burkina Faso and surrounding areas in Ivory Coast/Ghana/Togo/Benin. The creation of a new region based on Gur samples could very well solve or at least diminish this issue (see also suggestion 3).
  • “Ivory Coast/Ghana”: a rather robust region already (n=99), even if it also covers ancestral ties with Liberia and Sierra Leone. There is less overlap to the east though, which is helpful for distinguishing between possible Akan and Gbe lineage. Replacing the current “Ivory Coast/Ghana” region with three separate and properly labeled regions to describe and measure genetic affiliations with either Kru, Akan/Kwa or southwestern Mandé samples could increase its informational value tremendously. See also:
  • “Benin/Togo”: also a fairly robust region based on underlying sampling (n=60) and prediction accuracy for the “typical native” (82%). But again it overlaps with neighbouring countries, especially (eastern) Ghana and (southern) Nigeria. It was probably the most confusing region in the current set-up. Especially for African Americans it was often unexpected when reported as main region (for Brazilians & Haitians however it was in line with historical plausibility). It is highly unfortunate therefore that in the proposed update “Benin/Togo” seems set to be even more wide-ranging across borders! The creation of a more narrowly focused region to describe and measure genetic affiliations with Gbe samples from not only Benin and Togo but also Ghanaian Ewe would be much more beneficial. It will probably remain difficult to decrease the inevitable overlap with fellow Volta-Niger speaking southern Nigerians. Just as a tweaking idea it might be worthwhile to for once dispense with the ubiquitous Yoruba samples for the “Nigeria” region (all too often used as a generic stand-in for West African DNA). See also:
  • “Nigeria”: one of the most admixed regions according to Ancestry’s own data (along with “Mali”) and therefore tending to underestimate genuine Nigerian ancestry, even if my Afro-Diasporan survey findings for this region were mostly in line with historical expectations. For many Afro-Diasporans the crucial question to be answered is whether their Nigerian lineage is Yoruba or Igbo. I suspect however that making this very specific distinction could still prove to be quite difficult, given genetic similarities among southern Nigerians. And instead of feeding false hope it might be best to maintain the status quo. Still, by including Middle Belt Nigerian samples a higher prediction accuracy may possibly be obtained than right now: 69% for the “typical native” (n=67) but only around 50% according to my survey findings, which now have a higher sample size (n=73) than Ancestry’s Reference Panel! Other tweaking possibilities based on adding various Nigerian sample sets may also be explored. However a northern shift of this region does not seem recommendable given the mostly southern Nigerian roots of Afro-Diasporans! Integrating Ancestry’s migration feature (based on the number of IBD matches with either Yoruba or Igbo customers/samples) could possibly also be very helpful (see also suggestion 5).
  • “Cameroon/Congo”: this region was probably one of the most robust regions together with “Senegal”. It has a high prediction accuracy for the “typical native” (92%; n=115) and also confirms historically known patterns of Central African heritage among Afro-Diasporans. The geographical range of this region was larger than indicated by Ancestry itself though, according to my African survey reaching into Zambia, Zimbabwe and even Madagascar! As I have argued from the beginning the inclusion of Cameroonian samples (despite ample availability..) should be reconsidered in order to create a clear distinction between Bight of Biafra origins and proper Central African roots. Given prevailing slave trade patterns this is a crucial issue for Afro-Diasporans! I do not know if the current Cameroonian samples might be genetically compatible with any southeast Nigerian samples. But if so then a new region centered on southeast Nigeria & Cameroon might be a very good alternative! The remaining Congolese samples should preferably be supplemented with Angolan samples to create a genuine Central African region (see also suggestion 3). Such a region would then be much better suited to uncover historically documented Central African lineage for Afro-Diasporans. See also:
  • “Southeastern Bantu”: this region was undersampled (n=18) and had its inherent flaws because it was mislabeled and based on wide-ranging samples from presumably Kenya, Namibia and South Africa. Southwestern Bantu origins from Angola/DR Congo were also being covered, while additionally an overlap with Northeast African DNA was implied, due to a lack of Northeast African samples in Ancestry’s Reference Panel. The creation of the new “Eastern Africa” region in the update is certainly appropriate therefore. Given correct interpretation the distinction being made between “Cameroon/Congo” and “Southeastern Bantu” was still very useful for Afro-descendants as well as many Africans. This was demonstrated most clearly by the frequency of top-ranking scores for “Southeastern Bantu” for my Brazilian and Mexican survey participants, corroborating their strong ancestral ties with Angola (see this blog post). I would therefore strongly urge Ancestry to cancel the proposed “Cameroon, Congo and Southern Bantu Peoples” region, as this will only lead to less specification rather than more! Instead I would argue for the creation of two separate regions for describing Congolese/Angolan (“western Bantu”) and Mozambican/Malagasy (“southeastern Bantu”) DNA. This will be most relevant and highly informative for African Americans as well as other Afro-Diasporans and actual Africans!
  • “South-Central Hunter-Gatherers”: this region is currently based on very genetically distinctive samples (n=35) from the Khoi-San & Pygmy people. Its predictive accuracy is quite solid (86% for the “typical native”). Still it almost always showed up as a minimal trace region for most people. Therefore its informational value was rather limited, especially given that usually very ancient ancestral ties were indicated rather than anything from a genealogically meaningful timeframe (~500 years). The main exception is South Africans, and in particular South African Coloureds. Judging from a few updated Northeast African results as well as the new map it seems that Ancestry might have replaced its Pygmy samples with Sandawe samples from Tanzania, another much studied yet very marginalized hunter-gathering population. This has resulted in peculiar and inflated “Hunter Gatherer” scores for Northeast Africans, undoing the in itself useful addition of the new “Eastern Africa” region. For example see this screenshot for a Sudanese person. Frankly I do not believe there is much added value in reporting these genetic affinities with marginalized hunter-gathering populations (no matter how distinctive and fascinating in themselves) as they usually go back thousands of years. It only leads to confusion, while the labeling may also be perceived as awkward by some people. I would only keep the Khoi-San samples as they are meaningful and relevant to describe the recent origins of especially South Africans. See also:
  • “Africa North”: this region showed an impressive prediction accuracy for the “typical native”: 100%! Still going by the results of actual mainstream North Africans in my survey it seems that the 26 samples used by Ancestry may not have been the most representative ones (most likely Mozabite Berbers from Algeria). This region usually was only reported in trace amounts or was simply absent for African Americans and West Indians. For Cape Verdeans and Hispanics it showed up more regularly, but still almost always below 10% (see this overview). In most cases inheritance by way of an Iberian or Canarian ancestor seems most plausible for them. In the current set-up especially Portuguese people tended to receive quite consistent “Africa North” scores of around 5%. Given the creation of new regions for “Portugal” and “Spain”, such “Africa North” scores might actually decrease or even disappear as these improved Iberian regions might tend to incorporate older genetic affiliations. An ancestral scenario involving a Fula ancestor could also theoretically be possible in selected cases. After all, according to my survey (n=42) the Fula could have around 13% “Africa North” on average. I suppose after the update such scenarios might be easier to distinguish, especially when Ancestry decides to implement a chromosome browser or starts mentioning the ethnicity/admixture of shared DNA segments with your matches (see suggestion 8). See also:

2) Add more historically relevant African samples to Ancestry’s Reference Panel, in particular from Angola, Burkina Faso, Guinea Bissau/Conakry, Liberia, Madagascar, Mozambique and Sierra Leone. All of these countries are currently missing from AncestryDNA’s regional set-up. This historical relevancy is to be determined by the documented ethnic/regional provenance of people from the Afro-Diaspora in the first place, as this group of Afro-descendants is after all most reliant on admixture analysis to learn more about their background. The selection of ethnic groups & individual samples within these countries should naturally be done very carefully, avoiding people with known mixed backgrounds (also in intra-African terms) such as the Krio people from Sierra Leone, Americo-Liberians, as well as Mestiços from either Angola or Guiné Bissau, who aside from having minor Portuguese lineage at times may also be partially Cape Verdean. Generally speaking there’s always much ado about the lack of African sampling in DNA testing. The difficulties involved may be underestimated to some extent. However it seems no more than reasonable to me that at least some of these badly needed African samples may be obtained in the following ways:

  • customer samples: based on my ongoing survey of African AncestryDNA results I would estimate that there could very well be hundreds of Africans (incl. 1st and 2nd generation migrants living in the USA or Europe) already within Ancestry’s customer database. I have personally seen results from practically all African countries already! However some countries tend to be overrepresented (such as Nigeria & Ghana) while others are underrepresented (such as Angola & Burkina Faso). Obviously also consent for research purposes would be required.
  • academic databases: Ancestry is already making use of the HGDP database as well as its own Sorenson database. However several other academic collections of African DNA samples exist. Possibly restrictions may be in place against commercial use etc. However the following collections might provide a very valuable source of appropriate African samples:
    • 1000 Genomes database (incl. samples from Gambia, Sierra Leone, Nigeria, Kenya)
    • MalariaGEN database (incl. no less than 1,266 individuals (!) from Burkina Faso, Cameroon, Gambia, Ghana, Kenya, Malawi, Mali, Tanzania)
  • creative recruiting of African samples: through targeted marketing, for example offering free kits among African migrant associations in either the USA or Europe. Or else also by crowd-sourcing: third parties/individuals travelling to Africa in order to test local Africans. After all where there is a will there is a way 😉 When carried out effectively the costs involved may be quite minimal while the added value could be enormous! For some very praiseworthy examples:

3) Create new regions centered around these historically relevant samples, whenever possible and provided that the current coherency of the African regional framework as a whole is not compromised.

  • Atlantic samples from either Senegambia (Wolof, Sereer etc.) or Guiné Bissau (Balanta, Papel etc.) may be used to solidify the current “Senegal” region (hundreds of Gambian samples from various ethnic groups are possibly to be obtained via the MalariaGEN database!). Helping to pinpoint such lineage while also creating a sharper delineation for the “Senegal” region. Given sufficient genetic differentiation and appropriate sampling I suppose a very helpful distinction between northern Senegambian versus southern Senegambian/Guinean origins may also be enabled.
  • Mande samples from either Guinea or Mali may be used to solidify the current “Mali” region. Helping to pinpoint such lineage while also hopefully enabling a sharper delineation with the “Senegal” region.
  • Southwestern Mande samples from either Liberia, Sierra Leone, Ivory Coast or Guinea might be used to create a separate region. Helping to pinpoint such lineage while also hopefully contributing to a sharper delineation of both the “Mali” and “Ivory Coast/Ghana” regions.
  • Kru samples from Liberia and/or Ivory Coast may be used to create a new region. Helping to pinpoint such lineage while also contributing to a sharper delineation of the “Ivory Coast/Ghana” region, which would then become even more predictive of especially Akan lineage.
  • Gur samples from Burkina Faso might be used to create a greatly needed intermediate region to cover the genetic legacy of people nowadays found in northern areas of Ivory Coast, Ghana, Benin and Togo as well as Burkina Faso itself. It may also result in a sharper delineation of especially the “Mali” region which will become more strictly suggestive of Upper Guinean roots.
  • Angolan & Mozambican samples (preferably from relevant and currently undersampled populations such as the Mbundu and the Makua) might be used to solidify any Bantu orientated region. But given sufficient genetic differentiation I suppose an extremely useful distinction between western and southeastern Bantu origins may also be realized.

***

BR2x

Both of these results belong to the same Brazilian person. Notice how on the left the continental breakdown is still in place creating a more organized overview. On the right we can see how a “Portuguese” migration is being mentioned, which actually also includes Brazilians and therefore is quite informative. Having migrations in place for either Angolans or Mozambicans could have an even greater informational value! Notice also the useful distinction being made between “Southeastern Bantu” and “Cameroon/Congo”. Something which will be lost when combining both regions.

***

4)  Bring back the continental breakdown within the Ethnicity Estimate display. With subtotals specified for each continent. This used to be standard until it was changed about a year ago without any explanation why (as far as I am aware).  Right now the display merely shows you a seemingly haphazard listing of regions sorted from biggest to smallest amount, regardless of continent. This creates a lot of inconvenience for people who are also interested in knowing their continental percentages. This is especially relevant for Afro-Diasporans given their generally admixed genetics.
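
To illustrate how little is needed for such a display, here is a minimal sketch in Python. It is not Ancestry's implementation: the region-to-continent mapping, the function name `continental_subtotals` and the percentages in the example are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical mapping of region labels to continents (illustrative only,
# not Ancestry's actual configuration).
REGION_TO_CONTINENT = {
    "Senegal": "Africa",
    "Mali": "Africa",
    "Nigeria": "Africa",
    "Cameroon/Congo": "Africa",
    "Southeastern Bantu": "Africa",
    "Iberian Peninsula": "Europe",
    "Native American": "America",
}


def continental_subtotals(estimate):
    """Sum regional percentages per continent.

    Regions missing from the mapping are grouped under "Other".
    The result is ordered from largest to smallest subtotal.
    """
    totals = defaultdict(float)
    for region, pct in estimate.items():
        totals[REGION_TO_CONTINENT.get(region, "Other")] += pct
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Made-up ethnicity estimate, in percent.
example = {
    "Iberian Peninsula": 40,
    "Cameroon/Congo": 20,
    "Southeastern Bantu": 15,
    "Native American": 15,
    "Nigeria": 10,
}
print(continental_subtotals(example))
```

With subtotals like these placed above the regional listing, the three African regions in the example would be grouped under one African heading instead of being scattered between the other continents.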

5)  Create new African “migrations”, a.k.a. genetic communities.  As far as I am aware currently there are only two “migrations” in place for Africans. One of them is centered on South Africans (in particular Afrikaners & Coloureds, see this screenshot). The other one is based principally on Cape Verdeans (see this screenshot), even if it is misleadingly labeled “Portuguese Islander” [2]. I understand that this potentially very insightful “migration” feature is a work in progress. Naturally a certain minimum number of DNA tested Africans with a common background will be required to create new genetic communities. Then again from my ongoing survey of African AncestryDNA results I have learnt that there could very well be hundreds of Africans (incl. 1st and 2nd generation migrants) already within Ancestry’s customer database. Especially for Nigerians and Ghanaians I would imagine something could already be set up, even more so when appropriate academic samples can be added. Given the pressing need for more specification of African lineage I would argue for a loosening of certain thresholds and/or requirements provided that a minimum level of robustness for this “migration” tool can still be maintained.

6) Mention the “aggregate ethnicity estimates” for each migration / genetic community. This aggregate was basically an average of the ethnicity estimates for all people belonging to a certain migration. It was briefly available in 2017 for customers with access to the beta version of the “migration” tool, being mentioned in a third tab called “insights” (see this screenshot). This information can be very useful as some sort of regional benchmark in order to see how you yourself or others fit in the bigger picture. Keeping in mind variations around the mean, any statistically significant deviations could possibly still provide valuable ancestral clues. It would function in much the same way as the very helpful admixture averages being provided by Ancestry for the “typical native” (see this link), and it is also very similar to the group averages I have been calculating during my survey of AncestryDNA results. As a crucial precondition Ancestry should however single out people who have 4 grandparents from the same area as mentioned in the migration. Given their access to people’s family trees this should not be very difficult I imagine.
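
As a rough sketch of what such an aggregate amounts to, the Python below averages each region over the members of one community, keeping only members with 4 grandparents from the area. The data layout, the field names (`local_grandparents`, `estimate`) and the numbers are my own assumptions for illustration. I do not know how Ancestry actually computed its aggregate.

```python
from statistics import mean


def aggregate_estimates(members, min_local_grandparents=4):
    """Average each region's percentage over qualifying community members.

    A member qualifies when at least `min_local_grandparents` of their
    grandparents come from the community's area. A region that is absent
    from a member's estimate counts as 0% for that member.
    """
    qualifying = [
        m for m in members if m["local_grandparents"] >= min_local_grandparents
    ]
    if not qualifying:
        return {}
    regions = sorted({region for m in qualifying for region in m["estimate"]})
    return {
        region: mean(m["estimate"].get(region, 0) for m in qualifying)
        for region in regions
    }


# Made-up members of a hypothetical community, percentages per region.
members = [
    {"local_grandparents": 4, "estimate": {"Senegal": 70, "Mali": 30}},
    {"local_grandparents": 4,
     "estimate": {"Senegal": 50, "Mali": 40, "Africa North": 10}},
    # Excluded from the average: only 2 grandparents from the area.
    {"local_grandparents": 2, "estimate": {"Senegal": 10, "Nigeria": 90}},
]
print(aggregate_estimates(members))
```

In this made-up case the third member is left out, so the benchmark is not skewed by a “Nigeria” score that has nothing to do with the community itself.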

7) Enable the Ethnicity Estimate Comparison feature for *all* customers, incl. PC users and people outside of the US. As far as I am aware this comparison feature (see this screenshot) is now only available for smartphone users who have downloaded the Ancestry app, which is restricted to the USA. Obviously it is not a fair policy to deprive other customers of this potentially very insightful tool! Basically by using this feature you can compare your own ethnicity estimates with those of each one of your DNA matches. This is very useful for example when wanting to find out a plausible background of a possibly African match! For more details see also this informative blog post:

8) Show ethnicity/admixture of shared DNA segments with your matches. This can be very useful for many purposes. For example when reviewing your African matches ideally you will want to verify if the shared DNA segment is showing up as a certain region, let’s say “Ivory Coast/Ghana”. Because that way you could have more certainty that these matches will indeed relate to your own “Ivory Coast/Ghana” amount. Given that most Africans when tested by AncestryDNA tend to be described as a composite of adjacent regions and not just one single one. Regrettably this potentially very insightful information is not available because Ancestry so far has not implemented a chromosome browser. See also:
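
To make the idea more concrete, here is a minimal sketch in Python of how a shared segment could be looked up against region-labeled stretches of your own DNA. Everything in it is hypothetical: Ancestry does not expose such data, and the tuple layout, the function name `labels_on_shared_segment` and the positions are invented for illustration only.

```python
def labels_on_shared_segment(painted_segments, shared):
    """Return the overlap (in base pairs) per region label for one shared segment.

    painted_segments: list of (chromosome, start, end, region_label) tuples
                      describing which region each stretch of DNA is assigned to.
    shared:           (chromosome, start, end) of the segment shared with a match.
    """
    chrom, shared_start, shared_end = shared
    overlap = {}
    for p_chrom, p_start, p_end, label in painted_segments:
        if p_chrom != chrom:
            continue
        # Length of the intersection of the two intervals (<= 0 means no overlap).
        length = min(p_end, shared_end) - max(p_start, shared_start)
        if length > 0:
            overlap[label] = overlap.get(label, 0) + length
    return overlap


# Made-up example: chromosome 5 is partly "Ivory Coast/Ghana", partly "Benin/Togo".
painted = [
    (5, 0, 1000, "Ivory Coast/Ghana"),
    (5, 1000, 3000, "Benin/Togo"),
    (7, 0, 2000, "Nigeria"),
]
print(labels_on_shared_segment(painted, (5, 500, 1500)))
```

In this invented example the shared segment falls half on “Ivory Coast/Ghana” and half on “Benin/Togo”, which is exactly the kind of nuance that is hidden when only overall percentages are shown.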

9) Avoid misleading labeling of ancestral regions, as it provides a false sense of accuracy. That said, in my opinion the current country name labeling by AncestryDNA is to be preferred above ethnic labeling. Ancestral categories named after ethnic groups will tend to be overlapping across ethnic boundaries just as much and therefore will be even more misleading!

Generally speaking the whole framing if you will of admixture analysis can be misleading and is often catering to unrealistic expectations. Specifically in regards to how ancestral categories should conform exactly to a person’s family tree and all the known ethnic lineage it may contain. Even when there are still so many misunderstandings and uncertainties about the genetic underpinnings of ethnicity. And to add to complexity ethnic groups are of course to some extent also social constructs due to fluid ethnic identities and inter-ethnic unions.

None of this is to deny the potential informational value to be gained from admixture analysis. As always I prefer to see the glass as half full rather than half empty 😉 However correct interpretation is a must! On the one hand this requires an educational effort on the part of DNA testing companies. But frankly I believe that customers have their own responsibility in this matter too and should invest more time in informing themselves about inherent limitations etc. A good start is to take a proper look at the regional maps integrated in your Ethnicity Estimate. Compared with other DNA testing companies I actually find that Ancestry does a rather good job at providing helpful sections/pages offering guidance and context. I sincerely hope they will continue to do so. For example see this overview:

Appeal for true commitment 

Especially a few years ago I often got the notion that some DNA companies simply blurt out their admixture reports as well as their updates just like that, without realizing or caring about what kind of additional questions they might raise or even what kind of emotional reactions they might trigger. Providing a minimum of context is the least they could do in my opinion. It is often still the customers themselves who need to put two and two together. The much hyped but eventually underwhelming update of 23andme’s Ancestry Composition earlier this year brought back memories of previous updates carried out in a rather cynical manner by that company [3], in particular 23andme’s lack of clarifying communication and general indifference towards its customers over the years. Despite lip service and FIVE years of building up expectations 23andme is still not able to give any helpful insight into the African origins of Afro-descendants…

I am of Cape Verdean descent myself and I am personally not expecting any special treatment or favours from profit-driven companies such as 23andme or Ancestry, even though I can easily imagine that for a USA-based company some degree of consideration for African Americans is in order. After all, due to historical circumstance Afro-Diasporans are arguably in most pressing need of receiving finer regional resolution of their admixture results (aside from adoptees). Most people in the Afro-Diaspora do not have any detailed knowledge about their African roots and are usually very eager to learn more. They are not even asking for anything with “100% accuracy”, but just for something meaningful and relevant which goes beyond the lump category of “West African”. Again this sets them apart from customers with a verifiable background who have the luxury to be snobbish about admixture analysis.

Unfortunately it seems that many DNA testing companies are either not able or do not have a true commitment to cater to the particular needs of Afro-Diasporans when it comes to admixture analysis and other aspects of DNA testing. I always thought Ancestry was an exception but I might have to change my mind after this upcoming update… Again I do not have a full view of what will effectively be implemented and how this might impact AncestryDNA’s current African breakdown. But based on what I have seen so far, maintaining the current African breakdown seems like a better option right now.

Ultimately, of course, we would want to see an update that actually improves on how African DNA is being described in regional terms, leading to greater insights and further specification rather than confusion and the risk of being misled about your ancestry. Given Ancestry’s ample resources, incl. probably the biggest number of African tested customers, I do think they can do so much more. Which is why I have presented this overview of suggestions for improvement.

It is quite sad, as up till now I have always regarded AncestryDNA’s African breakdown as the best on the market (see note 4). Certainly not without shortcomings, but still very insightful already for understanding the roots of both Africans and people across the Afro-Diaspora. It would be a true loss if Ancestry’s pioneering analysis of especially West African DNA turned out to have been downgraded rather than upgraded…

Updated results for an African American

***

Take note of how the combined sum of the previous “Cameroon/Congo” + “Southeastern Bantu” scores (23%) seems inconsistent with the 47% “Cameroon, Congo, and Southern Bantu”! Also striking is how the formerly predominant region of “Benin/Togo” has decreased by more than 20% without any increase of neighbouring regions! Also very peculiar is how “Nigeria” was only reported as a minimal trace amount before the update and is even absent after it. But after analyzing this person’s matches using my filtering method this African American seems to have at least 5 Nigerian matches out of most likely 10 African matches! These kinds of seemingly incoherent discrepancies are not going to increase people’s confidence in their results!

***
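To make the arithmetic behind the first observation in this caption explicit, here is a minimal sketch in Python. It uses only the two figures quoted above (23% combined before the update, 47% after). The function name `merged_region_shift` is hypothetical and purely illustrative, and nothing here reflects how Ancestry actually calculates its estimates.

```python
def merged_region_shift(old_combined_pct, new_merged_pct):
    """Return the change, in percentage points, between the summed
    pre-update scores of the regions that were merged and the single
    post-update score of the merged region."""
    return new_merged_pct - old_combined_pct

# Figures quoted in the caption above.
old_combined = 23   # "Cameroon/Congo" + "Southeastern Bantu" before the update
new_merged = 47     # "Cameroon, Congo, and Southern Bantu" after the update

shift = merged_region_shift(old_combined, new_merged)
print(f"Shift for the merged region: {shift:+d} percentage points")
```

If the update had merely relabeled and merged the two older regions, the shift would be expected to stay close to zero. Since a person's estimates add up to 100%, a large positive shift suggests the merged region is now absorbing DNA that was formerly assigned to other regions.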

I normally don’t actively plead for my blog posts to be shared on social media. However, given that the stakes are quite high and as they say strength is in numbers 😉 I want to urge everyone who is in agreement with the main outline of these suggestions to share this blog post on Facebook, Twitter etc., as well as with Ancestry.com as soon as your results have been updated, because you will then be given the opportunity to let them know if you found the update to be helpful or not. I am not sure how exactly they will handle such feedback, but it might just be that given sufficient complaints Ancestry will rethink this update or at least the African part of it… Of course you are free to personalize this feedback and add your own suggestions as well! In fact I would also very much like to encourage my blog readers to discuss these suggestions in the comment section below, in order to get a fruitful and constructive exchange of ideas going which again hopefully Ancestry will take notice of, starting from this Independence Day!

___________________________________________________________________________

Notes

1) It might be a different story for the European and Asian breakdowns. I have actually seen quite encouraging updated results in this regard. And generally speaking they could be an improvement indeed. The non-African regional breakdowns are however not a topic of discussion in this blog post.

2) I strongly urge Ancestry to change the labeling of the so-called “Portuguese Islander” migration into “Cape Verdeans”. This will be much more appropriate given that, from what I have seen, the vast majority of people being assigned to this genetic community share common Cape Verdean lineage. I understand there might be an overlap with actual Portuguese Islanders from the Azores & Madeira. However there are already several separate “migrations” in place for them.

3) Even though 23andme’s recent update of its Ancestry Composition was widely regarded as an anti-climax (see this discussion thread), it still did have some merit too. The addition of the so-called Recent Ancestor Locations potentially does have added value (even when they strike me as just being a stripped down version of the former Countries of Ancestry tool). But by setting up high thresholds which only cover potential ancestry from “the last two hundred years” it was bound to leave out any African matches (measured by dots) for most Afro-Diasporans. History teaches us that their African origins are mostly to be traced back to the 1700s or even earlier (depending on specific background, see this link). This basic aspect of the Afro-Diaspora seems to have escaped 23andme for some reason…

Then again on the European side things looked more positive (for those willing to explore that side of their ancestry). Historically plausible matches (measured by dots) were being reported: from the UK & Ireland for African Americans, French ones for Haitians, Spanish ones for Hispanics and Portuguese ones for Cape Verdeans and Brazilians. In this aspect 23andme might be said to have gained the upper hand on Ancestry, because the so-called migration feature on Ancestry is generally speaking not picking up on recent European connections for New Worlders. For an interesting comparison read this blog post:

4) A very promising development is however taking place with a new DNA testing company called Living DNA. This is something which I hope to cover in greater detail in the near future. See the map on their West African project page. It looks very ambitious to be honest, but even if only half of the intended resolution is achieved this could be MAJOR!

  • https://www.livingdna.com/one-family/research/west-africa

    “Living DNA, working with the world’s leading academics, scientists and genealogists are seeking your help. Together we are looking to map the world’s genetic ancestry to the finest scale possible, one where we identify patterns of DNA within countries. Following our collaboration with the academic team involved in the landmark publication “The fine-scale genetic structure of the British population”, we are now looking to extend the level of genetic detail throughout West Africa. Our preliminary research indicates at least 55 areas of West Africa may have distinct genetic differences.

    The aim of the project is to confirm whether the proposed genetic boundaries are correct, and redefine them based upon the genetic data submitted by participants that fall within these regions. By participating in this project, you will help us to map the genetic heritage of west Africa and show how we are all connected based on our DNA.”


6 thoughts on “Suggestions for improving the African breakdown on AncestryDNA”

  1. “Replacing the current “Ivory Coast/Ghana” region with three separate and properly labeled regions to describe and measure genetic affiliations with either Kru, Akan/Kwa or southwestern Mandé samples could increase its informational value tremendously.”

    A question, aren’t the Gbe languages in the Kwa family? Has science found these Kwa-speakers to be different from the Kwa speakers further west in Ghana/Ivory Coast? It’s weird because I’ve seen much talk of how well “Benin/Togo” matches with Gbe-speakers, genetically. But that doesn’t seem to mesh with language maps, which show that while Kwa languages pretty clearly end at the Benin/Nigerian borderlands, no such language distinction seems to exist between, say, Ewe speakers and Akan speakers.

    I’ve also been surprised by this seeming genetic cluster (Gbe-speakers) since Ancestry specifically makes note of the culture of Togo being closer to Ghana (Akan) and the culture of Benin being closer to Nigeria (Yoruba). So is the argument here that the “Akan/Kwa” region needs to include “Benin/Togo” Gbe-speakers, or that this region is in fact a defined genetic cluster that needs to be left out of the proposed “Akan/Kwa” region?

    Thanks!


    • My argument is for a separate region centered on Akan samples and a separate region centred on Gbe samples (incl. Ewe). Because from my survey findings it seems that Ancestry is able to make a reasonable distinction despite inevitable overlap. See also this chart where a group average of nearly 90% “Ivory Coast/Ghana” is shown for my Akan samples and a group average of around 70-80% for “Benin/Togo” for my Ewe and Beninese samples. The sample size is admittedly rather minimal however I have seen more results in preview which seem to confirm these preliminary patterns:

      About the linguistic classification of the Gbe languages, it seems to have changed recently due to new insights (see Wikipedia). Formerly they appear indeed to have been grouped under the Kwa language family along with Akan languages and even Kru languages from Liberia at one time (these are now seen as forming their own group though!). However a new language family called Volta-Niger has been proposed lately which combines the Gbe languages rather with southern Nigerian languages such as Yoruba and Igbo. I do not speak any of these languages so obviously I have no way of knowing if such a classification is better suited, lol. But I find it striking that it does seem to fit quite well with my survey findings. I discuss this in more detail in section 3 of this page:

      https://tracingafricanroots.wordpress.com/ancestrydna/west-african-results-part-1/

      I have a section on my blog which features several language maps (see this link). It’s good to keep in mind though that due to different definitions and new insights they will often not look exactly the same! Also it seems there may be different linguistic theories in Francophone countries as opposed to the Anglo mainstream. Either way this map shows the new Volta-Niger group in purple, combining Gbe with southern Nigerian languages.

      Btw about the comment on Ancestry’s website that Togolese culture is geared more so toward Ghana (Akan) and Benin is closer to Yoruba culture. This could indeed be true. I’m far from an expert on Ghanaian culture but I’ve read that the Ewe have incorporated several Akan elements in their culture. Given their proximity and historical power balance that would kind of be as expected. Then again the adoption or similarity of cultural practices does not negate possibly different origins/genetics.


      • See, this is what I don’t quite understand. Whether they are grouped with Kwa or “Volta-Niger” it appears that the main languages spoken in Benin/Togo are not a separate branch. And then given how genetically so many Southern Nigerians get “Benin/Togo” as a primary or even major secondary region, it makes me wonder about the usefulness of “Benin/Togo” as a separate genetic region. Are they really so much less genetically diverse than “Nigeria” (and thus distinct in being a more homogeneous population that speaks “Volta-Niger” languages) that they can justify having their own region apart from southern Nigeria?

        I’m just genuinely curious, as I’m not 100% convinced of this region, yet. Is there enough of a delineation from “Nigeria” do you think?

        Thanks! Love the discussion, Fonte.


        • Yes I also love this discussion, very stimulating! Honestly I think there will always be trade-offs to be made with admixture analysis. You can’t always have clear delineation on all fronts. I understand why you would want to question the “Benin/Togo” region as it seems to have caused a great deal of confusion for many people when appearing as a main region. Legitimate ancestral ties with southern Nigeria and eastern Ghana are being obscured by it. And I’m afraid with the way Ancestry’s update seems to be set up right now this will happen even more so after everyone has had their results updated.

          For me the usefulness of this region lies mostly in its delineation to the west. As discussed it may not be a perfect measure but it allows for a distinction to be made between Gbe and Akan speakers. In my survey among the Afro-Diaspora (see this chart) for example this is seemingly confirmed by how “Benin/Togo” shows up very strongly (as expected) for Haitians. Also for Brazilians “Benin/Togo” is clearly prevailing over “Ivory Coast/Ghana”. It’s less clearcut for Jamaicans and Barbadians, who also show high levels of this region. But they could in fact have more Gbe lineage (also dating from earlier time periods) than people usually tend to be aware of, given the narrative of predominant Akan ties. For African Americans it’s probably most ambivalent as I have already blogged about elsewhere. And the degree of any genuine ancestral ties with either Benin or Togo might be the least for them. However also for African Americans it might pinpoint such lineage. See for example:

          https://www.dnatestedafricans.org/single-post/2018/05/11/An-Amazing-Success-Story-of-DNA-Testing-and-a-Benin-Reconnection-

          To the east, into southern Nigeria, there’s indeed too much overlap though. I wouldn’t know how to solve it. I have suggested taking out the Yoruba samples, just as a tweaking idea, to see what kind of results will then be obtained. Intuitively it seems to me that a greater degree of delineation may be obtained by including populations on the far ends of the range, to avoid too much overlap. By cutting out the “middle men” so to speak and having only Igbo or otherwise southeastern Nigerian + perhaps also Middle Belt samples for “Nigeria” it may result in a greater prediction accuracy. But I’m not sure.

          If major genetic overlap is indeed inevitable between Gbe and southern Nigerians then perhaps combining the Gbe samples with Yoruba samples might make for a workable alternative? This would then by necessity be a broader region. But historically you could then correlate this with the slave trade records from the Bight of Benin. In turn you might want to set up a new region combining Igbo and other southeastern Nigerian samples with appropriate Cameroonian samples to create a proxy region for covering Bight of Biafra ties. I’m doubtful people would want to forsake the current regional specification though.

          Again I have no personal experience with admixture modeling myself. But I have been part of a very early endeavour called the African Ancestry Project by Razib Khan: http://blogs.discovermagazine.com/gnxp/2011/05/admixture-african-ancestry-project-and-confirmation-bias/#.W0MipNIzZPY

          This experience taught me how you can manipulate the admixture results just by adding or leaving out certain reference populations and also by increasing or decreasing the number of preset clusters (K=..). Furthermore it ingrained in me from early on how the labeling of ancestral categories (or rather clusters) is subjective to some degree and therefore not to be taken as gospel haha.

          So what are your thoughts on this? Would you rather just have a region for Ghana and one for Nigeria with nothing in between? Also how do you feel about the other suggestions I made? Curious to know especially about 4, 7 & 8.


          • I actually really like your idea for these two regions: Bight of Biafra and Bight of Benin. I guess the one issue that’s been sticking with me is how the Benin and Togo political borders seem particularly artificial even as artificially drawn borders go. And, I guess it depends on what any individual thinks of the importance of the Gbe-speaking peoples as a distinct cluster, and particularly in the diaspora. I guess as an American in the diaspora, I see them as much more a historical branching off of proto-Yoruban people than some sharp delineation from them, but I imagine that that would be offensive in their eyes.

            Like you said, there are trade-offs either way you go. I like the geographical regions of Biafra and Benin as opposed to the political regions. And like you said, the delineation at the western border is fairly clear for this region, so then you could have an Akan-centered “Gold/Ivory Coast” region that takes in the Kwa-speakers and maybe some southern/eastern Kru-speakers. The whole “Ghana/Ivory Coast”, “Benin/Togo” and “Nigeria” just seems to confuse too many people on their first viewing of the names of these regions.

            I’m still unsure of what to do west of Ghana, though, and most of that is my ignorance of the genetics and their delineations to the west. “Windward Coast” could be a region, but that would seem to overlap with a “Ghana/Ivory Coast” region. Maybe “Ghana/Ivory Coast” would go as far west as the middle of Ivory Coast’s shoreline and then “Windward Coast” would pick up the Kru-speakers west of that line.

            In any case, as you can see, I’d like the use of the more historic and geographical names than the current political names for the regions if the genetics show those delineations.


            • Benin & Togo do indeed really have extra artificial borders. Especially Togo is an extreme example of a wider pattern within that area (Lower Guinea) of borders generally running from south to north, delineating sometimes very narrow land strips or otherwise rectangular shapes, and throwing together coastal areas that have been Christianized for the most part with interior areas that are mostly Muslim. It’s the opposite in Upper Guinea, where borders seem to run first from west to east, Gambia being the perfect counterexample to Togo.

              I do also think the use of historical or geographical names could work better than political or ethnic labels. But it might be tricky to find appropriate names which will find broader acceptance. Of course you can’t always please everyone. But I imagine that some people might also take offense to the usage of historical regions devised by Europeans during the Slave Trade period. No matter how instructive such labeling might actually be. Then again for Africans themselves it would not be informative at all. It’s again a trade-off I think.

              I suspect most people prefer ethnic labeling but I’m opposed to it because I find it highly misleading and catering to unrealistic expectations about the exact pinpointing of ethnic lineage when currently the science simply isn’t there yet. I’m very curious to know how Living DNA is going to solve this issue. They seem to go with ethnic clusters for now. For example for Benin & Togo they have identified potential clusters named: Fon, Adja, Aizo, Ewe, Somba, Bargu etc. I’m pretty sure though that given widespread inter-ethnic unions and shared origins from ancient times, actual Beninese & Togolese people are probably going to come out as a composite of these clusters, instead of just being “100% Fon” or “100% Ewe” as people might expect given their self-identification.

              Perhaps a broader grouping based on linguistic (sub) families such as Gbe or Akan could work better, although it could still be potentially misleading as well in cases of overlap, again providing a false sense of accuracy. I do believe customers have their own responsibility in informing themselves about the inherent limitations of admixture analysis. But perhaps something must be included explicitly within the regional labeling to indicate right away that it’s only meant as an approximation. Could something like “Gbe proxy” or “Akan affiliation” work?

