Pentaploid genotype frequencies

I know that if I am dealing with a diploid case, and I have 3 alleles, then I can have 6 possible genotypes. I am doing this by adding up all the numbers from 1 to 3. $$1+2+3 = 6$$ But if I want to generalize this, say for a pentaploid, how do I effectively compute the number of distinct possible genotypes? (Assuming perfect HW equilibrium)

I want to say this is just a simple problem of combination with replacement, but I'm not sure if the biology allows me to make this statement.

Ploidy number vs number of alleles

You seem to be confounding the number of alleles with the ploidy number. You correctly worked out the number of possible genotypes for a diploid individual when there are 3 possible allelic states. When you ask about a pentaploid, how many alleles are you willing to consider?

Pentaploids are very rare

If I am not mistaken, species with an odd ploidy number are very rare. Tetraploids and hexaploids, while still rare, are much more common than pentaploids.

General case

It is indeed a simple math problem. Let's answer the question in the most general case. If the ploidy number is $P$ and there are $S$ possible alleles, then the number of possible genotypes is

$$\binom{P+S-1}{S-1} = \binom{P+S-1}{P}$$

where $\binom{n}{k}$ refers to the binomial coefficient. This is exactly the count of combinations with replacement: each genotype is an unordered choice of $P$ alleles from $S$ types. For a pentaploid organism with 3 possible alleles, there are therefore $\binom{7}{2} = 21$ possibilities.
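As a quick check of the formula, here is a minimal Python sketch. The function names `genotype_count` and `enumerate_genotypes` are made up for this illustration; the sketch only counts and lists genotypes and says nothing about their frequencies.

```python
from itertools import combinations_with_replacement
from math import comb


def genotype_count(ploidy: int, n_alleles: int) -> int:
    """Number of unordered genotypes: C(P + S - 1, P)."""
    return comb(ploidy + n_alleles - 1, ploidy)


def enumerate_genotypes(ploidy: int, alleles: str) -> list[str]:
    """List every genotype as a string of allele labels (order ignored)."""
    return ["".join(g) for g in combinations_with_replacement(alleles, ploidy)]


print(genotype_count(2, 3))            # diploid, 3 alleles
print(genotype_count(5, 3))            # pentaploid, 3 alleles
print(enumerate_genotypes(2, "ABC"))   # the diploid genotypes, listed explicitly
```

The explicit enumeration agrees with the closed-form count, which supports treating the problem as combinations with replacement.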

More information

You can have a look at this webpage that offers much more info about the combinatorics of genotypes.

Allele and genotype frequencies of CYP2C9, CYP2C19 and CYP2D6 in an Italian population

The polymorphic cytochrome P450 isoenzymes (CYPs) 2C9, 2C19 and 2D6 metabolise many important drugs, as well as other xenobiotics. Their polymorphism gives rise to important interindividual and interethnic variability in the metabolism and disposition of several therapeutic agents and may cause differences in the clinical response to these drugs. In this study, we determined the genotype profile of a random Italian population in order to compare the CYP2C9, CYP2C19 and CYP2D6 allele frequencies among Italians with previous findings in other Caucasian populations. Frequencies for the major CYP2C9, CYP2C19 and CYP2D6 mutated alleles and genotypes were evaluated in 360 unrelated healthy Italian volunteers (210 males and 150 females, aged 19-52 years). Genotyping was carried out on peripheral leukocyte DNA by molecular biology techniques (PCR, RFLP, long-PCR). CYP2C9, CYP2C19 and CYP2D6 allele and genotype frequencies were in Hardy-Weinberg equilibrium. One hundred and fourteen subjects (31.7%) carried one and 23 subjects (6.4%) carried two CYP2C9 mutated alleles. Sixty-eight (18.9%) volunteers were found to be heterozygous and six (1.7%) homozygous for CYP2C19*2, while no CYP2C19*3 was detected in the evaluated population. Volunteers could be divided into four CYP2D6 genotype groups: 192 subjects (53.3%) with no mutated alleles (homozygous extensive metabolisers, EM), 126 (35.0%) with one mutated allele (heterozygous EM), 12 (3.4%) with two mutated alleles (poor metabolisers, PM) and 30 (8.3%) with extra copies of a functional gene (ultrarapid metabolisers, UM). Frequencies of both CYP2C9 and CYP2C19 allelic variants, as well as CYP2D6 detrimental alleles, in Italian subjects were similar to those of other Caucasian populations. Conversely, the prevalence of CYP2D6 gene duplication among Italians was very high, confirming the higher frequency of CYP2D6 UM in the Mediterranean area compared to Northern Europe.

Population Genetics

Scope of Population Genetics

Population genetics seeks to understand how and why the frequencies of alleles and genotypes change over time within and between populations. It is the branch of biology that provides the deepest and clearest understanding of how evolutionary change occurs. Population genetics is particularly relevant today in the expanding quest to understand the basis for genetic variation in susceptibility to complex diseases. Many of the factors that affect allelic frequency and associations among alleles of linked genes have been first characterized in Drosophila and other model organisms, but the same principles apply to virtually all organisms.

Shortly after the rediscovery of Mendel's laws in 1900, a raging controversy developed over the relevance of the kind of variation and transmission that Mendel characterized to the smooth, continuous variation that biologists had noted and measured in virtually all organisms. Could the continuous variation in stature, for example, be explained by underlying genes of the sort Mendel described? One of the arguments against Mendel's genes was that recessive alleles would soon be lost from a population by virtue of their recessiveness. Godfrey Hardy and Wilhelm Weinberg independently demonstrated the folly of this argument, and showed instead that randomly mating populations would be expected to retain the allelic variation by simple Mendelian principles unless some other force acted on the variation. But this did not fully resolve the question of why parents and offspring have correlated phenotypes for continuously varying traits.

It was the theoretical population geneticist Ronald Fisher who developed the mathematics to show exactly how many genes acting together could produce the precise quantitative degrees of familial resemblance that are observed. This was one of many instances in the history of population genetics in which a formal mathematical model of the problem paved the way to understanding what empirical data needed to be gathered to test the new conceptualization. Fisher went on to develop, along with Sewall Wright and J. B. S. Haldane, much of the theory for allelic frequency change under simple models of natural selection. Wright and Fisher developed the theoretical machinery needed to understand the complex process of recurrent sampling that we now call random genetic drift. By 1940 much of the theory for the ‘modern synthesis’ of Darwinian evolution and Mendelian transmission genetics had been developed.

Before considering the development of the empirical aspects of population genetics, the basic mechanisms that underlie the modern synthesis are briefly reviewed below.

The Deiodinase Type 2 (DIO2) Gene and Mental Retardation in Iodine Deficiency

Ting-Wei Guo, ... Lin He, in Comprehensive Handbook of Iodine, 2009

Statistical Analysis

Allele frequencies were calculated using SPSS 10.0 software for Windows (SPSS, Inc., Chicago, IL). Deviations from Hardy–Weinberg equilibrium (HWE), differences in allele and genotype distributions, and odds ratios (OR) with 95% confidence intervals (CI) were calculated using Finetti (Sasieni, 1997). Linkage disequilibrium (LD) between two loci was measured using a two-locus LD calculator (2LD) (Zhao, 2002). Haplotypes were inferred by Bayesian methods (Stephens et al., 2001) as implemented in the PHASE package version 1.0. Differences in genotype and haplotype distribution between patient and control groups were assessed by the Monte Carlo method using the CLUMP program version 1.9 with 10000 simulations (Sham and Curtis, 1995).

1) A study on blood types in a population found the following genotypic distribution among the people sampled: 1101 were MM, 1496 were MN and 503 were NN. Calculate the allele frequencies of M and N and the expected numbers of the three genotypic classes (assuming random mating). Using $\chi^2$, determine whether or not this population is in Hardy-Weinberg equilibrium.

Freq of M = $p = p^2 + \frac{1}{2}(2pq) = 0.356 + \frac{1}{2}(0.482) = 0.356 + 0.241 = 0.597$

Freq of N = $q = 1 - p = 1 - 0.597 = 0.403$

$\chi^2 = (1101-1107)^2/1107 + (1496-1491)^2/1491 + (503-503)^2/503 \approx 0.049$

$\chi^2$ (calculated) < $\chi^2$ (table) [3.841, 1 df, 0.05 significance level].

Therefore, conclude that there is no statistically significant difference between what you observed and what you expected under Hardy-Weinberg. That is, you fail to reject the null hypothesis and conclude that the population is in HWE.
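The test worked by hand above can be sketched in Python. This is a minimal illustration and not part of the original answer key: the function name `hwe_chi_square` and the constant `CRITICAL_1DF_05` are made up here, and the allele frequency is computed from the raw counts without intermediate rounding, so the expected numbers may differ slightly from the rounded hand calculation.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Return (p, expected counts, chi-square) for a two-allele diploid locus."""
    n = n_aa + n_ab + n_bb
    # Each homozygote carries two copies of the first allele, each heterozygote one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


# Tabulated critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_1DF_05 = 3.841

p, expected, chi2 = hwe_chi_square(1101, 1496, 503)  # MM, MN, NN counts
print(round(p, 3), [round(e, 1) for e in expected], round(chi2, 3))
print("reject HWE" if chi2 > CRITICAL_1DF_05 else "fail to reject HWE")
```

The same function can be reused for the other chi-square problems in this set by passing their genotype counts.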

2) A scientist has studied the amount of polymorphism in the alleles controlling the enzyme Lactate Dehydrogenase (LDH) in a species of minnow. From one population, 1000 individuals were sampled. The scientist found the following frequencies of genotypes: AA = .080, Aa = .280, aa = .640. From these data calculate the allele frequencies of the "A" and "a" alleles in this population. Use the appropriate statistical test to help you decide whether or not this population was in Hardy-Weinberg equilibrium.

p = Freq A = 0.08 + 1/2 (0.28) = 0.08 + 0.14 = 0.22

q = Freq a = 1 - p = 1 - 0.22 = 0.78

IF population is in HWE, then you'd expect the following frequencies:

Genotype    Expected Numbers          Observed Numbers
AA          0.0484 X 1000 = 48.4      0.080 X 1000 = 80
Aa          0.3432 X 1000 = 343.2     0.280 X 1000 = 280
aa          0.6084 X 1000 = 608.4     0.640 X 1000 = 640

$\chi^2 = (80 - 48.4)^2/48.4 + (280 - 343.2)^2/343.2 + (640 - 608.4)^2/608.4 \approx 33.9$

$\chi^2$ (calculated) > $\chi^2$ (table) [3.841, 1 df, 0.05 significance level], therefore reject null hypothesis. Not in HWE.

3) The compound phenylthiocarbamide (PTC) tastes very bitter to most persons. The inability to taste PTC is controlled by a single recessive gene. In the American white population, about 70% can taste PTC while 30% cannot (are non-tasters). Estimate the frequencies of the Taster (T) and nontaster (t) alleles in this population as well as the frequencies of the diploid genotypes.

Estimated Freq t = $q = \sqrt{q^2} = \sqrt{0.30} = 0.5477$

Freq T = $p = 1 - q = 1 - 0.5477 = 0.4523$

TT = $p^2 = (0.4523)^2 = 0.2046$

Tt = $2pq = 2(0.4523)(0.5477) = 0.4954$

tt = $q^2 = 0.30$
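The square-root estimate used in this problem can be wrapped in a small Python helper. This is only a sketch under the same assumptions as the hand calculation (complete dominance and Hardy-Weinberg proportions); the function name `from_recessive_fraction` and the dictionary keys are made up for this illustration.

```python
from math import sqrt


def from_recessive_fraction(recessive_fraction: float) -> dict[str, float]:
    """Estimate allele and genotype frequencies from the recessive phenotype frequency.

    Assumes the recessive phenotype frequency equals q^2 (HWE, complete dominance).
    """
    q = sqrt(recessive_fraction)
    p = 1 - q
    return {
        "p": p,
        "q": q,
        "hom_dominant": p * p,
        "heterozygous": 2 * p * q,
        "hom_recessive": q * q,
    }


ptc = from_recessive_fraction(0.30)  # 30% non-tasters
print({name: round(value, 4) for name, value in ptc.items()})
```

The same helper applies to any question in this set that starts from the frequency of a recessive phenotype, for example a recessive condition affecting 1 in 10,000 newborns.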

4) In another study of human blood groups, it was found that among a population of 400 individuals, 230 were Rh+ and 170 were Rh-. Assuming that this trait (i.e., being Rh+) is controlled by a dominant allele (D), calculate the allele frequencies of D and d. How many of the Rh+ individuals would be expected to be heterozygous?

Number of dd individuals = 170, therefore the frequency of the genotype dd ($q^2$) is 170/400 = 0.425. From this, we can estimate q as:

$q = \sqrt{q^2} = \sqrt{0.425} = 0.652$

The allele frequency of D is:

$p = 1 - q = 1 - 0.652 = 0.348$

Assuming HWE, genotype frequencies are as follows:

DD = $p^2 = (0.348)^2 = 0.121$

Dd = $2pq = 2(0.348)(0.652) = 0.454$

dd = $q^2 = 0.425$

Using the expected genotype frequencies, the number Dd among the Rh+ individuals is:

$0.454 \times 400 \approx 182$, that is, about 182 of the 230 Rh+ individuals.

5) Phenylketonuria is a severe form of mental retardation due to a rare autosomal recessive allele. About 1 in 10,000 newborn Caucasians are affected with the disease. Calculate the frequency of carriers (i.e., heterozygotes).

Given the above, estimate q from $q^2$:

$q = \sqrt{q^2} = \sqrt{1/10{,}000} = \sqrt{0.0001} = 0.01$

Therefore, $p = 1 - q = 1 - 0.01 = 0.99$

Using the Hardy-Weinberg law, calculate the expected frequency of heterozygotes as:

$2pq = 2(0.99)(0.01) = 0.0198$

Therefore, 1.98% of the population is expected to be carriers.

6) For a human blood group, there are two alleles (called S and s) and three distinct phenotypes that can be identified by means of the appropriate reagents. The following data were taken from people in Britain. Among the 1000 people sampled, the following genotype counts were observed: SS = 99, Ss = 418 and ss = 483. Calculate the frequency of S and s in this population and carry out a $\chi^2$ test. Is there any reason to reject the hypothesis of Hardy-Weinberg proportions in this population?

Observed Genotype frequencies:

SS = 99/1000 = 0.099, Ss = 418/1000 = 0.418, ss = 483/1000 = 0.483

Frequency of S = $p = p^2 + \frac{1}{2}(2pq) = 0.099 + \frac{1}{2}(0.418) = 0.308$

Frequency of s = $q = 1 - p = 1 - 0.308 = 0.692$

Expected Genotype frequencies:

SS = $p^2 = (0.308)^2 = 0.095$

Ss = $2pq = 2(0.308)(0.692) = 0.426$

ss = $q^2 = (0.692)^2 = 0.479$

Expected number of individuals:

SS = 95, Ss = 426, ss = 479

$\chi^2 = (99-95)^2/95 + (418-426)^2/426 + (483-479)^2/479 \approx 0.35$

$\chi^2$ (calculated) < $\chi^2$ (table) [3.841, 1 df, 0.05 significance level].

Therefore, fail to reject null hypothesis and conclude that the population is in HWE.

7) A botanist is investigating a population of plants whose petal color is controlled by a single gene whose two alleles (B & B1) are codominant. She finds 170 plants that are homozygous brown, 340 plants that are homozygous purple and 21 plants whose petals are purple-brown. Is this population in HWE (don't forget to do the proper statistical test)? Calculate "F" (inbreeding coefficient) and explain what is happening in this population.

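Freq. of brown (BB) = $p^2$ = 170/531 = 0.32

Freq. of purple-brown (B1B) = $2pq$ = 21/531 = 0.04

Freq. of purple (B1B1) = $q^2$ = 340/531 = 0.64

Freq of B = $p = p^2 + \frac{1}{2}(2pq) = 0.32 + \frac{1}{2}(0.04) = 0.34$

Freq of B1 = $q = 1 - p = 1 - 0.34 = 0.66$

Expected Genotype Frequencies:

BB = $p^2 = (0.34)^2 = 0.1156$

B1B = $2pq = 2(0.34)(0.66) = 0.4488$

B1B1 = $q^2 = (0.66)^2 = 0.4356$

Expected numbers (out of 531): BB = 61.4, B1B = 238.3, B1B1 = 231.3

$\chi^2 = (170-61.4)^2/61.4 + (21-238.3)^2/238.3 + (340-231.3)^2/231.3 \approx 441.3$

$\chi^2$ (calculated) > $\chi^2$ (table) [3.841, 1 df, 0.05 significance level], therefore reject null hypothesis. Not in HWE.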
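The question also asks for the inbreeding coefficient, which the worked answer above does not show. A minimal sketch follows, assuming the common definition $F = 1 - H_{obs}/H_{exp}$ (observed heterozygote frequency over the frequency expected under HWE). The function name `inbreeding_coefficient` is made up for this illustration, and frequencies are computed from the raw counts without intermediate rounding.

```python
def inbreeding_coefficient(n_aa: int, n_ab: int, n_bb: int) -> float:
    """F = 1 - H_obs / H_exp for a two-allele diploid locus (assumed definition)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    h_obs = n_ab / n                  # observed heterozygote frequency
    h_exp = 2 * p * (1 - p)           # heterozygote frequency expected under HWE
    return 1 - h_obs / h_exp


f = inbreeding_coefficient(170, 21, 340)  # brown, purple-brown, purple
print(round(f, 2))
```

A value of F close to 1 reflects a strong deficit of heterozygotes relative to Hardy-Weinberg expectations. That pattern is consistent with heavy inbreeding (for example self-fertilization) or strongly assortative mating, although the counts alone cannot distinguish between such causes.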

What is Allele Frequency

The allele frequency is the proportion of all copies of a gene in a population that are of a particular allelic form. In the simplest two-allele case, the two forms are a dominant and a recessive allele. Each allele frequency is calculated by dividing the number of copies of that allele by the total number of allele copies in the population (twice the number of individuals for a diploid organism). Here, p represents the frequency of the dominant allele and q represents the frequency of the recessive allele. The allele frequencies in a population sum to 1.
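For a diploid locus with two alleles, the calculation can be written out as follows. The symbols $n_{AA}$, $n_{Aa}$, $n_{aa}$ (genotype counts) and $N$ (number of individuals) are notation introduced here for illustration:

$$p = \frac{2n_{AA} + n_{Aa}}{2N}, \qquad q = \frac{2n_{aa} + n_{Aa}}{2N}, \qquad p + q = 1$$

For example, with $n_{AA} = 80$, $n_{Aa} = 280$ and $n_{aa} = 640$ in a sample of $N = 1000$:

$$p = \frac{2(80) + 280}{2000} = 0.22, \qquad q = \frac{2(640) + 280}{2000} = 0.78$$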

Figure 2: Inheritance of Dominant and Recessive Alleles

Notes for Chapter 5: Genetic Equilibrium

RQUE5.1: Under assumptions of random mating, why is it that genotype frequencies tend to remain about constant?

RQUE5.2: If there is little difference in the fitness of rare vs. common alleles, why do rare alleles tend to persist through time?

IV. Hardy-Weinberg Equilibrium

RQUE5.3: Estimate the expected frequency of genotype Aa if the allele frequency of A = 0.3 and a = 0.7. Would it be unexpected to have a case like this, where a dominant allele is less common than a recessive one?

V. Estimating the Frequency of Heterozygotes

RQUE5.4: How common are rare recessive abnormal alleles, for example, in human populations?

RQUE5.5: If Hardy-Weinberg equilibrium theory describes conditions under which populations do not evolve, why is this theory still important to understanding how populations might evolve?

Genotype-free estimation of allele frequencies reduces bias and improves demographic inference from RADSeq data

Vera M. Warmuth, Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden.

Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden

Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden

Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Martinsried, Germany

Restriction-site associated DNA sequencing (RADSeq) facilitates rapid generation of thousands of genetic markers at relatively low cost; however, several sources of error specific to RADSeq methods often lead to biased estimates of allele frequencies and thereby to erroneous population genetic inference. Estimating the distribution of sample allele frequencies without calling genotypes was shown to improve population inference from whole genome sequencing data, but the ability of this approach to account for RADSeq-specific biases remains unexplored. Here we assess to what extent genotype-free methods of allele frequency estimation affect demographic inference from empirical RADSeq data. Using the well-studied pied flycatcher (Ficedula hypoleuca) as a study system, we compare allele frequency estimation and demographic inference from whole genome sequencing data with that from RADSeq data matched for samples, using both genotype-based and genotype-free methods. The demographic history of pied flycatchers as inferred from RADSeq data was highly congruent with that inferred from whole genome resequencing (WGS) data when allele frequencies were estimated directly from the read data. In contrast, when allele frequencies were derived from called genotypes, RADSeq-based estimates of most model parameters fell outside the 95% confidence interval of estimates derived from WGS data. Notably, more stringent filtering of the genotype calls tended to increase the discrepancy between parameter estimates from WGS and RADSeq data. The results from this study demonstrate the ability of genotype-free methods to improve allele frequency spectrum- (AFS-) based demographic inference from empirical RADSeq data and highlight the need to account for uncertainty in NGS data regardless of sequencing method.

Hardy-Weinberg & Population Genetics

The Hardy-Weinberg principle is a mathematical model used to describe the equilibrium of two alleles in a population in the absence of evolutionary forces. This model was derived independently by G.H. Hardy and Wilhelm Weinberg. It states that the allele and genotype frequencies in a population will remain constant across generations in the absence of evolutionary forces. The equilibrium holds only under several assumptions:

  1. An infinitely large population size
  2. The organism involved is diploid
  3. The organism only reproduces sexually
  4. There are no overlapping generations
  5. Mating is random
  6. Allele frequencies are equal in both sexes
  7. Absence of migration, mutation or selection

As we can see, many items in the list above cannot be controlled for, but the model gives us a baseline for comparison in situations where evolutionary forces (selection, etc.) come into play.

Hardy-Weinberg Equilibrium

The terms in the equation are defined as follows:

  • Genotype frequency is the number of individuals with a given genotype divided by the total number of individuals.
  • Allele frequency is the number of copies of a given allele divided by the total number of allele copies in the population.
  • In a two-allele system with dominant/recessive alleles, we designate the frequency of one as $p$ and the other as $q$ and standardize to: $p + q = 1$
  • Therefore the total frequency of all alleles in this system equals 100% (or 1).
  • Likewise, the total frequency of all genotypes is expressed by the following quadratic, which also equals 1: $p^2 + 2pq + q^2 = 1$
    • This equation is the Hardy-Weinberg theorem; it holds when no evolutionary forces are altering the gene frequencies.

    Calculating Hardy-Weinberg Equilibrium (activity)

    This exercise refers to the PTC tasting exercise. One can test for selection on one allele within the population using this example. Though the class size is small, pooling results from multiple sections can enhance the exercise. Remember to infer the dominant/recessive traits from the class counts.

    1. What is the recessive phenotype and how can we represent the genotype?
    2. What is the dominant phenotype and how can we represent the genotypes?
    3. What is the frequency of the recessive genotype? ($q^2$)
    4. What is the frequency of the recessive allele? ($q$)
    5. What is the frequency of the dominant allele? ($p = 1 - q$)
    6. Use Hardy-Weinberg to calculate the frequency of heterozygotes in the class. ($2pq$)
    7. Use Hardy-Weinberg to calculate the frequency of homozygous dominant individuals in the class. ($p^2$)
    8. Using an aggregate of multiple sections, compare the local allelic and genotypic frequencies with what Hardy-Weinberg would predict.
    9. With this small number in mind, we can see that there are problems with the assumptions required for this principle. The instructor will perform the following simulation in class to illustrate the effects of selection and/or limited population size on multiple populations. A coefficient of fitness can be applied to illustrate a selective pressure against an allele.
      • Population Genetics Simulation of Alleles
    10. In the case of a selective pressure, a fitness coefficient ($w$) can be introduced. A research article has shown that the Tas2R38 receptor aids in the immune response against Pseudomonas. Imagine a situation where there is an epidemic of antibiotic-resistant Pseudomonas. This would mean that the dominant allele has a selective advantage.

    • Modify the fitness coefficient in the Population Genetics Simulator and describe the effects this would have over many successive generations.
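To get a feel for what changing the fitness coefficient does, here is a minimal Python sketch. It is not the Population Genetics Simulator referred to above; it is the standard deterministic one-locus selection recurrence for an infinitely large, randomly mating population, so it ignores drift. The function names, the starting frequency of 0.45 and the fitness value of 0.9 are all made up for illustration.

```python
def next_allele_frequency(p: float, w_dom_hom: float, w_het: float, w_rec_hom: float) -> float:
    """One generation of selection on the dominant allele frequency p."""
    q = 1 - p
    # Mean fitness: genotype frequencies (p^2, 2pq, q^2) weighted by their fitness.
    mean_fitness = p * p * w_dom_hom + 2 * p * q * w_het + q * q * w_rec_hom
    # Share of surviving allele copies that are the dominant allele.
    return (p * p * w_dom_hom + p * q * w_het) / mean_fitness


def simulate(p0: float, generations: int, w_dom_hom: float = 1.0,
             w_het: float = 1.0, w_rec_hom: float = 1.0) -> list[float]:
    """Return the trajectory of p over the requested number of generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], w_dom_hom, w_het, w_rec_hom))
    return freqs


neutral = simulate(0.45, 50)                   # no selection: p should stay put
selected = simulate(0.45, 50, w_rec_hom=0.9)   # recessive homozygotes 10% less fit
print(round(neutral[-1], 3), round(selected[-1], 3))
```

With equal fitnesses the allele frequency does not move, which is the Hardy-Weinberg expectation. Lowering the fitness of the recessive homozygote makes the dominant allele rise generation after generation, more slowly as the recessive allele becomes rare and hides in heterozygotes.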