- 65 new regional codes represented for the first time on the YFull tree.
- 6 new regional codes in Turkey, 4 in Azerbaijan, 3 in Mongolia, the first two regional codes inside Peru (Lima and Cusco) and the first sample from Madagascar.
- Because Finland, Kazakhstan and Saudi Arabia each contain some very large regions, I’ve entered the area of every regional code in these countries. Samples assigned to a larger region now spread further but at a lower intensity, in proportion to the region’s total area. This is the same treatment I implemented for Russian regional codes several months ago. Expect smoother distributions for haplogroups containing samples geolocated to the larger regions of these countries.
- In the next update, I plan to extend this treatment to other countries with notably large regional codes, including but not limited to Canada, the US, Mexico and China.
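To make the area-based spreading concrete, here is a minimal sketch of one way it could work. It is not the actual PhyloGeographer code: the function name `spread_parameters`, the equal-area disc approximation and the example areas are all assumptions for illustration. Each region is treated as a disc with the same area, so a sample's weight is spread over a radius that grows with the square root of the area while the intensity per square kilometre drops in inverse proportion to the area.

```python
import math


def spread_parameters(area_km2, sample_weight=1.0):
    """Return (radius_km, intensity_per_km2) for one regionally located sample.

    Hypothetical helper: approximates the region as a disc of equal area.
    """
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    # Radius of a disc whose area equals the region's area.
    radius_km = math.sqrt(area_km2 / math.pi)
    # The same total weight is spread over the whole area, so larger
    # regions contribute a lower intensity at any single point.
    intensity_per_km2 = sample_weight / area_km2
    return radius_km, intensity_per_km2


# Illustrative areas only, not real regional figures.
small_radius, small_intensity = spread_parameters(10_000.0)
large_radius, large_intensity = spread_parameters(40_000.0)
print(large_radius / small_radius)        # the larger region spreads further
print(large_intensity / small_intensity)  # but at a lower intensity
```

Under these assumptions the total contribution of a sample stays the same whatever the size of its region; only how widely and how faintly it is drawn changes.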
Deduplication of multiple samples from the same person
I’ve added logic to handle multiple samples from the same person, which YFull indicates with a tooltip when you hover over the ‘i’ on the tree. In these cases only the first geolocated sample encountered counts toward the Y Heatmap and the PhyloGeographer migration calculation.
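The rule can be sketched roughly as follows. This is an illustration rather than the real implementation: the dictionary fields `sample_id`, `person_id` and `location` are hypothetical, and how a shared person is actually detected from the YFull tooltip is not shown.

```python
def deduplicate_samples(samples):
    """Keep only the first geolocated sample seen for each person.

    Each sample is a dict with hypothetical keys:
      sample_id - unique sample identifier
      person_id - shared by samples from the same person, or None if unknown
      location  - regional code or coordinates, or None if not geolocated
    """
    seen_people = set()
    kept = []
    for sample in samples:
        if sample.get("location") is None:
            # Samples without a location never count, and they do not
            # use up the person's single slot either.
            continue
        person = sample.get("person_id")
        if person is not None:
            if person in seen_people:
                continue  # a geolocated sample from this person already counted
            seen_people.add(person)
        kept.append(sample)
    return kept


samples = [
    {"sample_id": "A1", "person_id": "p1", "location": None},
    {"sample_id": "A2", "person_id": "p1", "location": "region-x"},
    {"sample_id": "A3", "person_id": "p1", "location": "region-y"},
    {"sample_id": "B1", "person_id": None, "location": "region-z"},
]
print([s["sample_id"] for s in deduplicate_samples(samples)])
```

In this sketch, skipping non-geolocated samples before checking the person means a later sample with a location can still represent that person.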
Another very small improvement affects only the YFull regional sample rate map. I noticed today that the United Arab Emirates was missing from the map despite being a very well sampled country. The cause was a mismatch in three-letter country codes: YFull uses “UAE”, whereas both the mapping library I am working with and the World Bank population data source I use have “ARE”.
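A fix of this kind can be as small as an alias table applied before joining the data sets. The sketch below assumes hypothetical names (`YFULL_TO_ISO3`, `to_iso3`); only the "UAE" to "ARE" pair comes from the case described above, with "ARE" being the ISO 3166-1 alpha-3 code for the United Arab Emirates.

```python
# Aliases for country codes that differ from the three-letter codes used
# by the mapping library and the population data (hypothetical table).
YFULL_TO_ISO3 = {
    "UAE": "ARE",  # United Arab Emirates
}


def to_iso3(yfull_code):
    """Normalise a YFull country code to the code used for the map join."""
    code = yfull_code.strip().upper()
    # Codes without an alias are assumed to match already.
    return YFULL_TO_ISO3.get(code, code)


print(to_iso3("UAE"))
print(to_iso3("TUR"))
```

Keeping the aliases in a single table makes it easy to add further entries if other mismatched codes turn up.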