Users of this database should be aware of some basic points:

This database is a supplement to the book "The Distribution of the Human DNA-PCR-Polymorphisms" by W Huckenbeck, K Kuntze and H-G Scheil (Verlag Dr. Köster, Berlin); ISBN 3-89574-300-3; 1997.

We offer data published after the book went to press (June 1997). The user will find the complete population data and references. In addition, the database includes updated pooled data (the data of the book and the new data, pooled by weighted arithmetic means) for the relevant populations.

The pooled values are set in bold type. The data are available at the links listed at the end of this page.

Misprints in the book appear in corrected form at the same site. The corrected values are marked in red.

In contrast to the book, we decided that the sample size "n" should indicate the number of individuals instead of the number of alleles. This change will be carried over into the second edition of the book.
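
When comparing an "n" taken from the book with an "n" taken from this database, the two conventions can be converted into each other. The following is a minimal sketch that assumes a diploid, autosomal system in which every individual contributes exactly two alleles; it would not hold for X-chromosomal markers typed in males. The function names are illustrative and not part of the database.

```python
def alleles_to_individuals(n_alleles: int) -> int:
    """Convert an allele count (book convention) to a number of individuals.

    Assumption: diploid, autosomal locus, two alleles per individual.
    """
    individuals, remainder = divmod(n_alleles, 2)
    if remainder:
        raise ValueError("an odd allele count cannot come from whole diploid individuals")
    return individuals


def individuals_to_alleles(n_individuals: int) -> int:
    """Convert a number of individuals (database convention) to an allele count."""
    return 2 * n_individuals
```

For example, a sample listed in the book with n = 200 alleles would correspond to 100 individuals under this assumption.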

The descriptions of the populations examined are often insufficient. As far as possible, they were cautiously standardized. In borderline cases the statements of the cited authors were adopted. Nevertheless, insufficient definitions could sometimes not be avoided: a description like ‘China’ is not sufficiently precise for such a large area. In addition, it sometimes seemed more sensible to group by ethnic group rather than by political area. For example, the data of Basque populations were pooled as ‘Basques’ without regard to their political affiliation.

As briefly mentioned in the preface, a considerable loss of data was caused by the fact that a number of authors presented only bar charts instead of allele frequencies. To avoid falsifying these authors’ data, we deliberately refrained from evaluating the bar charts visually and extrapolating the allele frequencies, which would be possible on a small scale. Nevertheless, we hope that one day these data will be published in a proper form and can then be taken into account for a second edition. As far as possible, frequencies with three decimal places were converted to four decimal places, and obviously incorrect allele frequencies were recalculated using the published genotype frequencies.
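
The recalculation of allele frequencies from genotype frequencies can be illustrated by standard allele counting: each genotype contributes half of its frequency to each of its two alleles. The sketch below assumes a diploid system and genotype frequencies given as a mapping from allele pairs to relative frequencies; the function name and the example values are illustrative only and do not come from any cited publication.

```python
from collections import defaultdict


def allele_frequencies(genotype_freqs, decimals=4):
    """Derive allele frequencies from genotype frequencies by allele counting.

    genotype_freqs maps a pair of allele labels, e.g. ("6", "7"), to the
    relative frequency of that genotype. A homozygote ("6", "6") contributes
    its full frequency to allele "6"; a heterozygote contributes half to each.
    Results are rounded to four decimal places, as in the tables.
    """
    totals = defaultdict(float)
    for (first, second), freq in genotype_freqs.items():
        totals[first] += freq / 2
        totals[second] += freq / 2
    return {allele: round(value, decimals) for allele, value in totals.items()}


# Illustrative genotype frequencies (not real data)
example = {("6", "6"): 0.25, ("6", "7"): 0.5, ("7", "7"): 0.25}
result = allele_frequencies(example)
```

With these illustrative values, both alleles come out at a frequency of 0.5.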

The partly differing nomenclature has also created some difficulties. Although the International Society of Forensic Haemogenetics (ISFH) has recommended standardization (4, 5, 6), some authors have used their own or an obsolete nomenclature. In each case we tried to bring these data into line with the usual nomenclature, but this was not always feasible. The accuracy of these adaptations can be verified by the authors themselves. If we have made mistakes here, we ask for their leniency.

A further problem lies in the use of different DNA protocols. Differing techniques, such as the use of native or denaturing gels, resulted in a severe loss of data, for example in the SE33 system.

The aim of this work was to create sample sizes as large as possible by pooling. Here a further problem occurred: some authors described closely spaced alleles (sub-alleles) separately, while others combined them. To avoid a major loss of data, in most cases we used the combined data for our calculations. This concerns, for example, the alleles FES*10/10 a and 11/11 a, but also the subtyping in the HLA-DQa system (alleles *4.1 and *4.2/4.3). Because of the population-specific importance of the allele TH01*10, population samples of this system with separately typed alleles *9.3 and *10 were handled both in combined form and separately.
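
Combining separately typed sub-alleles so that a sample becomes comparable with samples in which they were reported together amounts to adding their frequencies. The following is a minimal sketch of that step; the function name, the combined label "9.3/10" and the frequencies are illustrative assumptions, not values from the tables.

```python
def combine_alleles(freqs, groups):
    """Merge separately typed sub-alleles into combined allele classes.

    freqs  maps allele labels to frequencies.
    groups maps a combined label to the list of sub-allele labels it covers.
    Alleles that belong to no group are passed through unchanged.
    """
    member_to_label = {
        member: label for label, members in groups.items() for member in members
    }
    combined = {}
    for allele, freq in freqs.items():
        label = member_to_label.get(allele, allele)
        combined[label] = combined.get(label, 0.0) + freq
    return combined


# Illustrative frequencies (not real data) with *9.3 and *10 typed separately
separate = {"6": 0.125, "9": 0.5, "9.3": 0.25, "10": 0.125}
merged = combine_alleles(separate, {"9.3/10": ["9.3", "10"]})
```

Keeping the original mapping alongside the merged one mirrors the approach of presenting such samples both separately and in combined form.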

In many cases the sample sizes are still inadequate. Apart from these exceptions, very small samples were not taken into account. Where possible, data of political (‘Germany’) and ethnic (‘Basques’) units were pooled using weighted arithmetic means. The aim was to create more solid databases. Besides this advantage, the technique also has disadvantages. It seems bold to the authors (even though it is done) to pool data from areas as large as China, Russia or India; differing populations may have been combined without regard to their historical or genetic relations. For this reason, future studies should attach more importance to the exact definition of the population samples. Another disadvantage of pooling is that existing differences between populations may be concealed. In our opinion (with sample sizes as large as they are today), the errors caused by this effect are small and can be neglected. In the future it will certainly be desirable to have solid databases for smaller geographical and political units, too. In any case, the user of the tables can fall back on the individual data sets, which are cited as well.
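
Pooling by a weighted arithmetic mean can be sketched as follows. The sketch assumes that each sample is weighted by its sample size n and that an allele not reported in a sample has frequency 0 there; both are assumptions made for illustration, as are the function name and the example values.

```python
def pool_frequencies(samples):
    """Pool allele frequencies of several samples by a weighted arithmetic mean.

    samples is a list of (n, freqs) pairs, where n is the sample size and
    freqs maps allele labels to frequencies. Each pooled frequency is
    sum(n_i * p_i) / sum(n_i). Alleles missing from a sample are treated
    as having frequency 0 in that sample (assumption).
    """
    total_n = sum(n for n, _ in samples)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    weighted = {}
    for n, freqs in samples:
        for allele, freq in freqs.items():
            weighted[allele] = weighted.get(allele, 0.0) + n * freq
    return {allele: value / total_n for allele, value in weighted.items()}


# Two illustrative samples (not real data): n = 100 and n = 300
pooled = pool_frequencies([
    (100, {"A": 0.5, "B": 0.5}),
    (300, {"A": 0.25, "B": 0.75}),
])
```

In this example the larger sample dominates: allele A is pooled to (100 × 0.5 + 300 × 0.25) / 400 = 0.3125 and allele B to 0.6875.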

In some cases of pooled data the sum of the allele frequencies deviates clearly from 1. This is because the published data were quite often rather unclear, which inevitably affected the pooled data as well.
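
A user who wants to identify such cases can check the sum of the allele frequencies of a data set before using it. The sketch below is one possible check; the function names and the tolerance of 0.005 are arbitrary assumptions and are not thresholds used for the tables.

```python
import math


def frequency_sum_deviation(freqs):
    """Return how far the allele frequencies of one data set deviate from 1."""
    return math.fsum(freqs.values()) - 1.0


def sums_to_one(freqs, tolerance=0.005):
    """Report whether the frequencies sum to 1 within the given tolerance.

    The default tolerance is an illustrative assumption.
    """
    return abs(frequency_sum_deviation(freqs)) <= tolerance


# Illustrative values (not real data)
complete = {"A": 0.5, "B": 0.5}
incomplete = {"A": 0.5, "B": 0.25, "C": 0.125}
```

Here `sums_to_one(complete)` holds, while `incomplete` falls short of 1 by 0.125 and would be flagged.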


CD4 system
Worldwide data CD4 system
CD4.pdf [255.7 KB]
Worldwide data CSF1PO system
CSF1PO.pdf [590.1 KB]
Worldwide data CYP19 system
CYP19.pdf [99.5 KB]
Worldwide data D1S80 system
D1S80.pdf [1.5 MB]
Worldwide data D2S1338 system
D2S1338.pdf [557.4 KB]
Worldwide data D3S1358 system
D3S13558.pdf [2.1 MB]
Worldwide data D5S818 system
D5S818.pdf [1.8 MB]
Worldwide data D7S820 system
D7S820.pdf [582.3 KB]
Worldwide data D8S1132 system
D8S1132.pdf [103 KB]
Worldwide data D8S1179 system
D8S1179.pdf [558.3 KB]
Worldwide data D12S391 system
d12s391.pdf [285.3 KB]
Worldwide data D13S317 system
D13S317.pdf [621.9 KB]
Worldwide data D16S539 system
D16S539.pdf [262.3 KB]
Worldwide data D18S51 system
D18S51.pdf [1.2 MB]
Worldwide data D19S433 system
D19S433.pdf [602.4 KB]
Worldwide data D21S11 system
D21S11.pdf [1 MB]
Worldwide data F13A1 system
F13A1.pdf [527.5 KB]
Worldwide data F13B system
F13B.pdf [271.9 KB]
Worldwide data FES system
FES.pdf [333.3 KB]
Worldwide data FGA system
FGA.pdf [1.5 MB]
Worldwide data HLA-DQa system
hla-dq.pdf [847.4 KB]
Worldwide data HPRTB system
HPRTB.pdf [212 KB]
Worldwide data LPL system
Lipol.pdf [235 KB]
Worldwide data PENTA D system
Penta D.pdf [636.6 KB]
Worldwide data PENTA E system
Penta E.pdf [726.8 KB]
Worldwide data Polymarker system
Polymarker.pdf [226 KB]
Worldwide data TH01 system
TH01.pdf [1.2 MB]
Worldwide data TPOX system
TPOX.pdf [563 KB]
Worldwide data VWA31 system
VWA31.pdf [957.1 KB]
YNZ22 (D17S5, D17S30)
Worldwide data YNZ22 system
YNZ22.pdf [168.6 KB]