6 New Geographic Codes Added
- UAE (though AE existed before)
- ER-DU Eritrea, Debub
- AF-LOW Afghanistan, Logar
- SD-13 Sudan, Janub Kurdufan
- AL-LB Albania, Librazhd
- AL-PR Albania, Përmet
Corrected Codes
Moved UAE closer to geographic center and Abu Dhabi
Many incorrect codes in Sudan have been corrected. The mistakes came from my original source and were usually the result of confusion with similarly named places in Egypt or Saudi Arabia.
The corrected codes are SD-01 Ash Shumaliyah, SD-18 Al Buhayrat, SD-07 Al Jazirah, SD-26 Al Baḩr al Aḩmar, SD-17 Baḩr al Jabal, and SD-25 Sinnar.
Improvements to Migration Calculating Algorithm
Consider a clade with three children (basal samples and/or subclades), where one child is close to the clade's parent and grandparent and the other two are close to each other but further away. In most cases such a clade should now be computed to have originated near the parent location rather than near the (possibly oversampled) child locations.
I tested this behavior with haplogroup I-Y8943, which had previously been computed to have formed in Ireland because the majority of its subclades (2 of 3) are found there.
However, haplogroup researchers informed me that, given the deep diversity in Scandinavia, a Scandinavian origin was more likely, and I agree based on the YFull tree.
A second improvement affects clades with exactly two children (samples and/or subclades). When these two children are so far apart that the total distance from the clade's parent, through the interpolated point between the children, to each of the two children exceeds the sum of the distances from the clade's parent directly to each child, the direct migration is used instead.
This avoids clades being computed to have formed in a 'no man's land' far from either child location and from the clade's parent.
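The two-child rule can be illustrated with a minimal Python sketch. This is not PhyloGeographer's actual code. It assumes great-circle (haversine) distances, takes the great-circle midpoint as the "interpolated point", and assumes that choosing the direct migration means placing the clade at its parent's location. All function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def _to_xyz(p):
    """Convert a (lat, lon) point in degrees to a unit vector."""
    lat, lon = map(math.radians, p)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def midpoint(a, b):
    """Great-circle midpoint, used here as the assumed 'interpolated point'."""
    x, y, z = (u + v for u, v in zip(_to_xyz(a), _to_xyz(b)))
    if x == 0 and y == 0 and z == 0:
        raise ValueError("midpoint is undefined for antipodal points")
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return (lat, lon)


def place_two_child_clade(parent, child_a, child_b):
    """Return the assumed location of a clade with exactly two children.

    parent is the location of the clade's parent. If routing through the
    interpolated point is longer than going from the parent directly to
    each child, the clade is placed at the parent location (an assumption
    about what the direct migration implies).
    """
    mid = midpoint(child_a, child_b)
    via_mid = (haversine_km(parent, mid)
               + haversine_km(mid, child_a)
               + haversine_km(mid, child_b))
    direct = haversine_km(parent, child_a) + haversine_km(parent, child_b)
    return parent if via_mid > direct else mid
```

For example, with a parent at (0, 0) and children at (0, -40) and (0, 50), the route through the midpoint is longer than the two direct routes, so the sketch keeps the clade at the parent location. With children at (50, 10) and (50, -10), the midpoint route is shorter and the midpoint is returned.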
These changes are algorithmic and will affect all previously computed paths, to a greater or lesser extent.
Improving PhyloGeographer is an iterative process. You can help me by pointing out subclades that, based on the YFull sample distribution, you believe ought to be placed at a different location.
Please keep in mind that the algorithm does not make use of coastline information. Coastlines change over time, and fully solving the problem of clades being computed in the water would require a set of coastline geometries for the whole world covering different time periods. That is beyond the scope of what I can do as one person who is not funded.
If you want me to work toward this goal, consider becoming a sponsor of PhyloGeographer on Patreon so you can vote for this feature in my regular polls for future improvements.
PhyloGeographer Project on Patreon