Why does low sequence diversity cause sequencing problems

When sequencers are processing sequence fragments, if there is little diversity at a particular bp locus this can cause problems for the sequencer. I have the superficial understanding that when a flow cell contains a significant percentage of a single base on any given cycle, it has difficulty deciding what base is present in each sequence. Why is this?

I think the problem is more that the software has a hard time identifying individual clusters if way more than a quarter of them are lighting up for a given base. If you have more diversity in the first few cycles, when the cluster coordinates are being identified, I think this is less of a problem.
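One way to see whether a library is likely to run into this is to tabulate base composition per cycle before sequencing, or on a pilot run. The snippet below is only a rough sketch: the function names and the 50% threshold are my own assumptions for illustration, not anything taken from the instrument software.

```python
from collections import Counter


def per_cycle_base_fractions(reads):
    """For each cycle (read position), return the fraction of reads showing each base."""
    if not reads:
        return []
    n_cycles = min(len(r) for r in reads)  # only compare positions every read covers
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads)
        total = sum(counts.values())
        fractions.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return fractions


def low_diversity_cycles(reads, max_fraction=0.5):
    """Return 0-based cycles where a single base exceeds max_fraction.

    The 0.5 default is an arbitrary illustrative threshold (an assumption).
    """
    return [
        i
        for i, frac in enumerate(per_cycle_base_fractions(reads))
        if max(frac.values()) > max_fraction
    ]


reads = ["ACGT", "ACTA", "ACCG", "ATGC"]
print(low_diversity_cycles(reads))  # prints [0, 1]
```

Here the first two cycles are dominated by one base (A, then C), which is the kind of pattern an amplicon library with a shared primer sequence produces.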

Climate change causes loss of genetic diversity
April 2012

Where's the evolution?
Genetic variation refers to the idea that different individuals in a population are likely to carry different genetic sequences for corresponding regions of the genome. For example, in humans, one individual may carry genetic sequences that code for type A blood, and someone else may carry sequences that code for type O blood. Similarly, people may carry different sequences in regions of the genome that don't code for anything at all. Both are examples of genetic variation. Populations that have fewer genetic differences from one individual to the next have lower levels of genetic variation.

In the case of the chipmunks, scientists were able to study DNA from museum specimens collected in the early 1900s and compare those to samples from chipmunks collected in the last 10 years. The modern chipmunks were much more genetically homogenous — i.e., had fewer different genetic sequences at the seven DNA regions studied — than their historic counterparts. Though the study only looked at a few non-coding regions, it is very likely that all parts of the chipmunks' genome — those that help produce important traits, as well as those that don't code for anything — have lost genetic variation.

Climate change is the obvious culprit. As shown on the maps below, the range of the alpine chipmunk (and almost certainly, its population size) has decreased significantly over the last 90 years as the climate has warmed. And of course, as the population lost individuals, it also lost the genetic variants they carried, reducing its level of genetic variation. For comparison, the lodgepole chipmunk, T. speciosus, is not as finicky and has maintained its historic range despite the heat. When scientists studied the lodgepole chipmunk, they found similar levels of genetic variation in historic and modern samples. Since the main difference between the two species seems to be their response to rising temperatures, climate change is the most likely cause of reduced genetic variation in the alpine chipmunk.

    First, it corroborates the idea that the population is shrinking and becoming more fragmented — i.e., small subpopulations are being cut off from one another. Both of these factors increase the species' chance of extinction.

All three of these factors have the potential to increase extinction risk for the chipmunks.

Of course, at the moment, the chipmunks seem to be doing fine and are not listed as a threatened species. But the new research highlights the risks they are likely to face in the near future, as well as the risks faced by many other species whose ranges are contracting due to the warming climate. Alpine chipmunks are one of the few species for which we can easily examine changes in levels of genetic variation over the last century because we happen to have the right museum specimens. Many other high-altitude species have been moving uphill as well and are likely facing the same double-whammy that the chipmunks are — decreased population size and the hidden side effects of reduced genetic variation.

    Moritz, C., Patton, J. L., Conroy, C. J., Parra, J. L., White, G. C., and Beissinger, S. R. (2008). Impact of a century of climate change on small-mammal communities in Yosemite National Park, USA. Science. 322: 261-264.

Discussion and extension questions

    Why is genetic variation so important to evolution?

a. What percent of the population has the Aa genotype?

b. What percent of the population has poor vision caused by the aa genotype?

c. If the population mates randomly, in what percentage of matings will an Aa individual mate with another Aa individual?

d. In what percentage of the matings will an AA individual mate with an Aa individual?

Now imagine that two unrelated individuals (one with the genotype AA and one with the genotype Aa) mate and produce 20 pups over the course of their lifetimes.

e. What percentage of the pups would you expect to be carriers of the a allele?

f. How many pups would you expect to be carriers of the a allele?

g. What percentage of the pups would you expect to have poor vision?

Now imagine that this set of pups is isolated from other chipmunks and the pups wind up mating with each other — i.e., with siblings.

h. If the sibs mate randomly with one another, in what percentage of matings will an Aa individual mate with another Aa individual?

i. In those cases (when an Aa sib mates with another Aa sib), what percentage of the offspring will have poor vision?

Related lessons and teaching resources

    In this lesson for grades 9-12, students conduct a classwide inventory of human traits, construct histograms of the data they collect, and play a brief game that introduces students to major concepts related to human genetic variation and the notion of each individual's uniqueness.

    This lesson for grades 9-12 uses paper chromatography to simulate electrophoresis of DNA. The problem posed is to identify the genetic similarities among several sub-species of wolf in order to provide information for a conservation/breeding program.

    This news brief for grades 9-16 describes a conservation plan to save the Florida panther by infusing genetic variation into the population. This article comes with a set of discussion questions for use in the classroom, as well as links to related lessons.

Genes and Inherited Diseases

Genes are special segments of DNA letters that, when read correctly by the body's proteins, can provide a specific and important instruction for the body to function properly. Researchers estimate that there are about 22,000 genes contained in the genome. Although genes are very important, they make up only a small percentage of all of the DNA in the genome. Each gene has a specific location on one of our 23 pairs of chromosomes and is inherited, or passed down, from generation to generation as a unit. We have two copies of each chromosome and, thus, two copies of each gene.

We inherit one copy from each of our parents and, in turn, pass on one of our two copies to each of our children.

Each gene contains a specific set of instructions for the body. Some genes contain multiple sets of instructions. Usually these instructions make a protein. There are many different types of proteins in our bodies which can perform multiple important tasks. For example, proteins form the basis of our organ tissues, bones, and nervous system. They also guide how we digest food and medications.

What is an Inherited Disease?

Although genetic factors play a part in nearly all health conditions and characteristics, there are some conditions in which the genetic changes are almost exclusively responsible for causing the condition. These are called genetic disorders, or inherited diseases.

Since genes are passed from parent to child, any changes to the DNA within a gene are also passed. DNA changes may also happen spontaneously, showing up for the first time within the child of unaffected parents. This is referred to as a new mutation, where the word mutation means change.

Sometimes this change can cause mistakes in the protein instructions, leading to production of a protein that doesn't work properly or cannot be made at all. When one protein is missing or not working as it should, it can cause a genetic disorder.

The genetics of each disorder are unique. In some cases, all the mistakes in a particular gene cause one specific genetic disorder. In other cases, different changes within the same gene can lead to different health or developmental problems or even to different genetic disorders. Sometimes changes in several similar genes may all lead to the same genetic disorder.

When Do Inherited Diseases Appear?

Genetic disorders are typically inherited (passed down) in either a dominant or recessive manner. We each have two copies of every gene on our 22 numbered chromosomes. In addition, females have two copies of all the genes on the X chromosome, whereas males have one copy of the X chromosome genes and one copy of the Y chromosome genes.

When a disorder is dominant, the disease can occur when there are DNA mistakes in only one of the two gene copies. This means that if a parent has the DNA change, there is a 50-50 chance that it will be passed on to each child.

When a disorder is recessive, there must be mistakes in both copies of the gene for the disorder to occur. This means that both parents must carry at least one copy of the specific gene change in order to produce an affected child.

If both parents have one changed copy, there is a 1 in 4, or 25% chance, that a child may inherit both changed copies at the same time, causing the disorder in the child. Parents who have only one changed gene copy usually do not display any symptoms of the disorder and may not even know they carry a gene change. Researchers estimate that we each have ("carry") 6-10 recessive gene changes. Certain recessive gene changes may be more common in different population groups. For example, sickle cell gene changes are found more often in individuals with West African ancestry and cystic fibrosis gene changes are more common in individuals with North European ancestry.

In addition to the inheritance pattern, some genetic disorders may be inconsistent when it comes to whether a person develops symptoms and their severity. Penetrance refers to whether the person who has the causative gene changes actually develops any symptoms of the disorder. Expressivity refers to the symptoms that may develop and their severity.

[D] Why not use momentum based optimizer with WGAN?

I keep reading that I should not use any momentum when optimizing a WGAN. Nothing I have read offers an explanation as to why. So why shouldn't I use a momentum based optimizer? Anyone know the reason?

This is presumably because of the adversarial training dynamics. You want the generator/critic to be able to respond to any new features identified by the other, so momentum would hurt this responsiveness and could lead to instabilities and mode collapse.

Finally, as a negative result, we report that WGAN training becomes unstable at times when one uses a momentum based optimizer such as Adam [8] (with β1 > 0) on the critic, or when one uses high learning rates. Since the loss for the critic is nonstationary, momentum based methods seemed to perform worse. We identified momentum as a potential cause because, as the loss blew up and samples got worse, the cosine between the Adam step and the gradient usually turned negative. The only places where this cosine was negative was in these situations of instability. We therefore switched to RMSProp [21] which is known to perform well even on very nonstationary problems [13].
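The cosine diagnostic described in that quote can be reproduced on a toy problem. The sketch below is not the authors' code: it is a plain-Python reimplementation of the standard Adam moment updates, the gradient sequence is made up, and the sign convention (cosine between the Adam update direction before negation and the current gradient) is my assumption about what the quote means.

```python
import math


def adam_direction_cosines(grads, beta1=0.9, beta2=0.999, eps=1e-8):
    """Cosine between Adam's update direction and the current gradient at each step.

    A negative value means momentum is still pushing along stale gradients
    while the current gradient points the other way.
    """
    dim = len(grads[0])
    m = [0.0] * dim  # first-moment (momentum) estimate
    v = [0.0] * dim  # second-moment estimate
    cosines = []
    for t, g in enumerate(grads, start=1):
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - beta1 ** t) for mi in m]  # bias correction
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        direction = [mh / (math.sqrt(vh) + eps) for mh, vh in zip(m_hat, v_hat)]
        dot = sum(d * gi for d, gi in zip(direction, g))
        norm = math.sqrt(sum(d * d for d in direction)) * math.sqrt(sum(gi * gi for gi in g))
        cosines.append(dot / norm if norm > 0 else 0.0)
    return cosines


# Hypothetical nonstationary gradients: the sign flips on the last step.
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [-0.1, -0.1]]
print(adam_direction_cosines(grads)[-1] < 0)             # momentum lags the flip
print(adam_direction_cosines(grads, beta1=0.0)[-1] > 0)  # no momentum: stays aligned
```

With `beta1=0.0` the first moment is just the current gradient, so the update can never point against it. That is roughly the behaviour you get from RMSProp, which is what the quote says the authors switched to.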

I never really understood what happens if you are inbred. I know it means your parents are closely related, but does having closely related parents mess with your DNA?

-A high school student from Michigan

That is an interesting question with an equally interesting answer! Having closely related parents doesn’t exactly ‘mess with your DNA’, as you put it. But it does mean that you have less diversity, or variety, in your DNA. And diversity can be very important to your health.

Less variety in your DNA can increase your chances for getting rare genetic diseases. You may have heard of some of these diseases: albinism, cystic fibrosis, hemophilia and so on.

Less variety in your DNA can also make you unhealthy in another way – it can weaken your immune system so you can’t fight off diseases as well. You can end up a very sickly person!

Now of course inbreeding doesn’t mean you will definitely get a genetic disease or wind up sickly. You are just more likely to have health problems. And the more inbreeding, the greater the risk.

So inbreeding doesn’t actually make your DNA change in any way. Instead, inbreeding is risky because it means the DNA from your mom and your dad is similar. And as you’ll see below, when these similar parts come together in their child, this child can end up with problems.

Same DNA, Same Diseases

Every person has 46 chromosomes and each chromosome holds a bunch of genes. Each gene has the directions for one small part of you. So there is a gene that determines if you’ll have red hair, one that gives color to your skin by making melanin, another one that helps blood to carry oxygen, and so on.

You actually have two sets of 23 chromosomes. One set of 23 comes from mom and the other 23 comes from dad. Since each set of chromosomes has the same set of genes*, this means that you have two copies of most every gene. What is important for making us each unique is that the copy you get from your mom can be very different than the copy you get from your dad.

So for example, the gene that causes red hair comes in a red version, and a not-red version (these different versions are called 'alleles'). And the gene that makes a pigment called melanin comes in a normal version that makes melanin and a broken one that doesn't. If you only have the broken one, you will end up with albinism.

Having two copies of everything is actually a really great system. This is because if one copy is broken, you still have a second copy to use as back up.

This is the case for the gene that makes melanin. People with just one broken copy don’t have albinism, because their good copy makes enough to keep albinism away.

But people with one bad gene copy can still pass it down to their kids. We call these people ‘carriers’, because they carry a single copy, but don’t have the actual disease. And this is where the trouble can start with inbreeding.

But with inbreeding, it is more likely that your spouse could carry the same broken gene. So in the example of albinism, it would mean that both mom and dad are carriers for the broken gene for making melanin. Then both mom and dad have a 50% chance of passing a broken gene to their child. This translates to each child having a 25% chance for getting the disease (0.5 x 0.5 = 0.25). That’s a pretty high risk!
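If you want to check that arithmetic yourself, here is a small sketch that lists every combination of gene copies two parents can pass on. The function name and the letters are just made up for illustration: "A" stands for a working copy and "a" for a broken one.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return the probability of each child genotype, assuming each parent
    passes on one of their two gene copies with equal chance."""
    combos = Counter(
        "".join(sorted(copy1 + copy2))  # sorting makes "aA" and "Aa" the same genotype
        for copy1, copy2 in product(parent1, parent2)
    )
    total = sum(combos.values())
    return {genotype: Fraction(n, total) for genotype, n in combos.items()}


# Two carriers: each has one working copy (A) and one broken copy (a).
print(offspring_genotypes("Aa", "Aa")["aa"])  # prints 1/4
```

The same function shows that a carrier and a non-carrier ("Aa" and "AA") cannot have a child with two broken copies, although half of their children are expected to be carriers.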

Now, I’m not saying that people with albinism (or any rare disease for that matter), are always the result of inbreeding. Everyone has five or ten of these broken genes lurking in their DNA. This means it is always a roll of the dice when you pick a spouse as to whether they’ll carry the same broken genes as you do.

But with inbreeding, the risk that you’ll both carry the same bad genes is much higher. Each family is likely to have its own type of disease genes, and inbreeding is an opportunity for two carriers of the same broken gene to pass two copies of it to their children. And then their kid can end up with that disease.

As you can see, it’s good to have babies with someone that has different DNA from you. Then you can give your babies a diverse collection of DNA, and they will always have a back-up allele for any broken ones.

But this isn’t the only reason you want parents to be pretty different genetically. The second reason you need a lot of variety from each parent is to be able to fight off as many diseases as possible.

Same Genes, Weak Immune System

Having diverse DNA is important for having a strong immune system. This is why inbreeding can make for some sickly children. And it is why laboratory mice and some farm animals get sick so easily.

The immune system depends on a very important part of DNA called the MHC or Major Histocompatibility Complex region. This is a lot of big words, but basically the MHC region is made up of a bunch of genes that help you fight off disease.

The MHC region’s secret to fighting off disease is to have as many different types of alleles (or versions of genes) as possible. The more variety you have, the better you are at fighting disease.

Diversity is important because each MHC gene is good at fighting a different set of diseases. You can think of it like a lock-and-key system. Each disease is a different shaped lock, and each MHC gene is a key. The more keys you have, the more diseases you can unlock and destroy.

While this may sound very oversimplified, it is quite similar to how our bodies actually work. Our bodies are constantly trying to detect foreign material in the body. Scientists think that each MHC gene allows us to detect a different type of foreign material.

And even more importantly, each allele of an MHC gene can help detect a different type of foreign material. We don’t fully understand yet what types of foreign material each allele can help detect, but we do know that every unique allele helps to detect a different type.

Now I think you can see why inbreeding can cause problems here. When inbreeding happens and two closely related people have children, these children are likely to have less diversity in their DNA. Which means these inbred children would have fewer types of MHC alleles (or fewer keys).

With fewer types of MHC alleles, they can detect fewer types of foreign material (or locks). They will be more likely to get sick as they can’t successfully fight off as many diseases. The end result is a more sickly person.

As you can see, diversity is the most important thing lost with inbreeding. Whether it’s to ensure that you don’t get two bad alleles and end up with some rare genetic disease, or if it’s to ensure that you get many different MHC alleles, you need diversity to protect yourself.

Examples of species with low genetic diversity and consequences

Large populations tend to have high levels of genetic diversity. However, as populations shrink, they lose much of their diversity. The result is that the remaining individuals are more genetically similar to one another.

This becomes a problem if survival traits have been lost and if genetic combinations causing diseases are expressed with marked frequency.

The potato famine

The absence of genetic diversity was the reason behind one of history’s biggest famines. The causes of the Potato Famine in Ireland which took place in the 19th century can be traced back to the susceptibility of the new potato plant to a specific disease.

Because new potato plants are not a result of reproduction – they are instead created from one parent plant – they exhibit very low genetic diversity.

The potato’s low genetic diversity meant that the disease, late blight caused by the water mold Phytophthora infestans, spread to the vast majority of the potato crop, which was a staple food for the Irish population, leaving one million people to starve to death. Absent or low genetic diversity is also part of what makes agricultural monocultures more susceptible to disease.

Atlantic wild salmon endangered by salmon hatcheries

Atlantic wild salmon may be losing the traits needed to survive in ocean waters.

Raising fish for restocking, whether as a solution to populations dwindling from overfishing or, in the case of Sweden, to mitigate the impact of hydropower plants, appears to be creating a different problem: lower genetic diversity.

The rivers of Sweden feed into the Baltic Sea, providing it with 90 percent of its juvenile wild salmon. A recent study of Atlantic salmon populations across thirteen rivers in Sweden, five of which are home to salmon raised in hatcheries, shows that the fish are more genetically similar than they were one hundred years ago, before the stocking measures began.

One might expect those raised in the hatcheries to be, since they are selected for fast growth rather than speed and prowess in the wild and are breeding within a defined population. However, it appears that when they breed with the wild salmon, the fish are passing on their inferior genes. This is jeopardizing the survival abilities of the entire salmon population [11].

Platypus population on King Island

A study of the platypus population on King Island in the Bass Strait, off the northwestern coast of Tasmania, Australia, found that low levels of genetic diversity are affecting the reproductive success, survival, and parasite resistance of these animals.

The low genetic diversity in an important immune response system is of especially deep concern. Scientists are worried that this could have devastating consequences for the species if mucormycosis from the Tasmanian mainland reached the island population. Mucormycosis is a disease caused by the fungus Mucor amphibiorum, which causes infection-prone skin lesions and can be deadly to platypuses.

Without genetic variation a population does not have the arsenal to help it respond to changing environmental variables.

In 1492 when explorers brought a host of new diseases to the Americas including smallpox, measles and flu the Native American population experienced a “massive demographic collapse” [12] .

It is estimated that the new diseases killed 90% of the Native American population [13] . If a population does not have the genes to resist a disease and the disease is so virulent it threatens to wipe out the entire population before the species has an opportunity to evolve a resistance, the population then may face extinction.

Neandertal genome FAQ

With the release of the initial two papers describing chromosomal DNA sequences from a Neandertal, I thought I would put together some frequently asked questions and answers to them. I actually have been frequently asked most of these questions this week -- mostly by journalists -- so I think this is a good list.

I'll be following up over the next few weeks with additional details, particularly as some of our own work moves forward. I've left some loose ends dangling here deliberately -- sometimes for the sake of brevity, in other cases because they await further developments.

UPDATE (11/17/2006): I'm editing through this, making changes here and there to make things clearer. So as this progresses, it won't be identical to the initial version, although changes will be minor.

There are two papers in two journals, by two different teams of people. What's the difference?

Both teams used samples from the same specimen, Vindija (Vi) 80 -- so in principle, they are sequencing the same genome. The difference between the two comes from their methods of sequencing the DNA.

The Rubin group (Noonan et al. 2006) is using a metagenomics method based on the creation of a clone library from the ancient DNA. To make a clone library, DNA from a sample is cut with a restriction enzyme, which cuts the DNA at every place that it displays the same short sequence (usually 4- or 6-bp sequences, such as "ATTA"). The short fragments of DNA are mixed together and bound to vectors that can be maintained and replicated in cells. This is the "cloning" process, and the "library" consists of all the short fragments, which (hopefully) overlap each other so that they can be reconstructed.

People have made libraries for a long time. For example, the entire mRNA complement in a given tissue type may be made into a library of complementary DNA (cDNA). Once the library is made, it can be probed with short, labeled DNA sequences to assess whether a given gene is expressed in that tissue type. Or contrariwise, after cDNA from the library is sequenced, it can be used to design probes to find where in the genome it came from.

The unique aspect of the metagenomic approach is that all DNA sequences from a sample will be included in the library, potentially sequenced, and ultimately reconstructed with computers into separate genomes. Usually cloning is preceded by an amplification step (generally using PCR), which selects and amplifies DNA of particular interest for cloning. But metagenomic methods skip this amplification -- because they cannot predict in advance what they are looking for. One of the most important early applications of metagenomics has been to reconstruct the genomes of microbes that cannot be cultured. Even though these organisms are not amenable to keeping in laboratory colonies, their genomes can be reconstructed by sampling their environments -- for example, soil or pondwater.

Or fossils. For the Vindija 80 fossil, the extract includes only around 6 percent identifiable "primate" DNA sequences. Out of the roughly 20 percent that are identifiable at all, over half are microbial.

I suppose if you were interested in the long-term microbial decomposition of fossil bone, you could do your dissertation on those. For the rest of us, the final step is to let the computer spit out the humanlike sequences, which are assumed to be the Neandertal DNA plus some proportion of human contamination.

In contrast, the 454 group (Green et al. 2006) used a method called bead-based emulsion PCR. That is a mouthful, so it bears some explanation (for which I'm paraphrasing material from Margulies 2005 and Ronaghi 2001).

The "polymerase chain reaction," or PCR, is a method of replicating many copies of a DNA sequence from a single template. Usually to do PCR, you design a "primer," which is a short sequence of DNA that causes the target sequence to be preferentially replicated by the DNA polymerase. With a number of heat cycles and sufficient primer, you end up with a whole lot of copies of just the piece of DNA that you want.

This is, of course, exactly why standard PCR is so problematic for ancient sequences. There, you can't get exactly what you want, because it is broken into tiny bits and damaged. You would be happy to get anything. But if you amplify everything together in one giant vat, then the less damaged sequences will be the ones that amplify preferentially, and these are going to be worthless to you because they all represent contaminants of various kinds, like microbial DNA or modern human sequences.

The 454 method attaches all the tiny bits of sequence to tiny beads and separates these beads into oil droplets within a water suspension. The oil droplets are the "emulsion" part, and by separating them in this way, the process can employ PCR while keeping all the tiny sequences separate from each other. Because they are kept separate, one good sequence can't swamp out all the others in the solution. The PCR products all stick to the bead so that after they come out of the emulsion the copies of different sequences are still separate.

After PCR, the DNA is broken down into single strands, still attached to their beads, and the beads are deposited on a fiber-optic slide assembly. The slide has tiny wells that are optically connected to a light-sensing CCD, which is essential for the "pyrosequencing" step. Nucleotides flow across the slide and into these wells one after another (T, A, C, then G). When the DNA polymerase connects one of these nucleotides to the single-strand DNA in a well, it releases a molecule of pyrophosphate (PPi).

That's when the magic happens. The solution also contains luciferase -- the enzyme that makes fireflies glow. With some additional chemistry, the PPi gives a burst of energy to the luciferase, which then emits a spark of light. The CCD picks up the light, which is a signal that the nucleotide was incorporated into the sequence.

Since nucleotides are added only every few seconds, a clever person with a notebook could reconstruct the sequence of the DNA fragment in each well. The real trick is that the fiber-optic slide contains well over a million wells, all being sequenced simultaneously. As the CCD picks up the series of flashes from every well, the system is tracking many megabases of DNA in every run.
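To make the "clever person with a notebook" concrete, here is a toy sketch of the bookkeeping for a single well. It is my own simplification, not 454's software: the function names are made up, the signal is assumed to be a clean integer count of how many identical nucleotides were incorporated in one flow, and real flowgrams are noisy light intensities.

```python
FLOW_ORDER = "TACG"  # nucleotides flow across the slide in this repeating order


def simulate_flowgram(synth_seq, flow_order=FLOW_ORDER):
    """Return the idealized light signal for each flow while synth_seq is built.

    Each value is the number of bases incorporated in that flow (0 = no flash).
    """
    if any(base not in flow_order for base in synth_seq):
        raise ValueError("sequence contains a base that is never flowed")
    signals = []
    pos = 0
    flow = 0
    while pos < len(synth_seq):
        base = flow_order[flow % len(flow_order)]
        run = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(synth_seq) and synth_seq[pos] == base:
            run += 1
            pos += 1
        signals.append(run)
        flow += 1
    return signals


def decode_flowgram(signals, flow_order=FLOW_ORDER):
    """Rebuild the synthesized sequence from the per-flow signals."""
    return "".join(
        flow_order[i % len(flow_order)] * count for i, count in enumerate(signals)
    )


signals = simulate_flowgram("TTAGC")
print(signals)                   # prints [2, 1, 0, 1, 0, 0, 1]
print(decode_flowgram(signals))  # prints TTAGC
```

The zeros are the flows where nothing was incorporated and the well stayed dark. The 2 on the first flow is a run of two T's read in one flash, which in practice has to be inferred from how bright the flash is.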

At present, this is the fastest method of DNA sequencing on the planet. It can construct the complete genome of a microbe in a couple of hours.

If the 454 sequencing method is so much faster, then why would anybody ever want to build clone libraries?

The claim is that the library approach is superior as a way to probe for specific genetic loci. For instance, here's a passage from p. 1071 of the Pennisi article:

This sounds similar to the study earlier this year that found Mc1r variants in different mammoths, but in fact that study used direct PCR rather than cloning (I suppose because they have a heck of a lot more mammoth tissue to work with!).

It's not obvious to me that this is really that much of an advantage. I mean, it's certainly true that we really want to sample some genes (like MCPH1) from several different Neandertal fossils. But I don't see any point to drilling into fossils for this purpose without also sequencing their full genomes.

Now, somebody will say, "Well, sequencing the full genome of every fossil is just too expensive. We can limit to work on just a few genes much more cheaply, and we can use the same samples later to sequence other genes, or whole genomes."

Personally, I don't see the rush. These fossils were in the ground for 40,000 years, and they're not going anywhere. If we can sequence whole genomes cheaply in 10 or 20 years, and additionally have better means of dealing with contamination, I don't see why we just shouldn't wait. Training graduate students in metagenomics is not a good enough reason to work on these rare fossils.

One may say that the same samples will be sufficient for later sequencing of whole genomes, or other genes, or Neandertal athlete's foot fungus, or whatever, but in my experience it somehow never works out that way. Somebody is always coming back to grind up, dissolve, or laser ablate more bone.

In fact, if I were looking to make the next advance in metagenomics, I would take some of that mammoth flesh, mix in some elephant blood, and find ways to resolve the parts of the resulting mix. That would be something.

Are you saying you are against destructive sampling of these fossils?

Not at all. In fact, I think that genomics gives the most compelling reason ever for grinding up more bones.

There is just a huge quantity of information in DNA sequences, far more than in the morphology -- especially for samples like bone fragments or isolated teeth.

Heck, if the devil came to me and said I could have the full genome sequence of every fossil if I would agree to their destruction, I think that would be a good bargain!

But it's pretty clear that we're not in that situation. We can have our cake and eat it too -- and the longer we wait, the cheaper and less destructive this is likely to be. And frankly, just one Neandertal genome is going to give us plenty to work on for a long time.

But then, I was trained as a fossil guy, and I'm used to working with a few bits and pieces. It gives me a natural advantage!

They say there's no significant evidence of interbreeding. Yet you told us last week that there is significant evidence of interbreeding. What gives?

A few years ago I gave a talk where I laid out what I saw as the problems interpreting nuclear DNA sequences from Neandertals. Now, this was long before we had any reasonable prospect of getting such sequences, so it was purely based on knowledge about human genetic variation. As I saw it then, there were two problems:

  1. Human mtDNA is really variable, with greater than 1 percent sequence divergence between people, and much higher in some places. In contrast, human nuclear DNA has less than one base pair in a thousand different between copies. To get a reasonable picture of variation among people, you need long nuclear sequences so that you will find polymorphisms. But ancient DNA is broken into short little sequences that are very difficult to reconstruct. With mtDNA, this is less of a problem because it is clonal and a person basically has one sequence in many copies. But most nuclear DNA (all autosomal DNA) exists in two, possibly different copies. So reconstructing long enough sequences to study polymorphisms is very difficult.
  2. The coalescence age of human mtDNA is only a couple hundred thousand years, so sampling ancient humans is sort of likely to result in sequences that lie outside this range of variation -- and with Neandertals, that is precisely what happened. But nuclear loci have coalescence ages on the order of 600,000 to 2 million years or older. With these dates, the diversity among living people must significantly predate any divergence of archaic humans for most nuclear genetic loci. This means that Neandertals ought to have shared a high proportion of polymorphisms that are still variable in humans. Since we can expect that Neandertals will not be very genetically divergent for these nuclear genes, compared to the genetic differences among living people, we can conclude that no gene is likely to tell us very much about the phylogenetic relationships of an ancient Neandertal with living people.

These two problems are still stumbling blocks for interpreting Neandertal sequences. But the research teams found a very clever way to circumvent them, by using genomics approaches instead of genetic approaches.

If you've been scratching your head wondering exactly why "genomics" has a buzz, then this is a good example.

Because of projects like the HapMap and the chimpanzee genome project, we know a lot (not everything, but a lot) about human genetic polymorphisms and our genetic differences from chimpanzees. In fact, we have databases of human single nucleotide polymorphisms (SNPs), and human-chimpanzee comparisons. For each SNP, some humans have an ancestral nucleotide -- generally the one that chimpanzees have. Other humans have a derived nucleotide -- the one that appeared in some ancient human, and different from chimpanzees.

For the most part, derived SNP alleles are recent. A few of them are very old, and these tend to be found at high frequencies (because the person who originated them had lots of descendants in that time). But many more of them are recent, found in a relatively small number of people today, who descend from a common ancestor during the past couple hundred thousand years.

If Neandertals diverged from humans over 200,000 years ago, and they didn't interbreed after that time, then the Neandertal genome should have relatively few derived human SNPs. In contrast, if the two populations continued to interbreed after 200,000 years ago, they might share fairly many of these derived SNPs.

Hence, we have a potential test for Neandertal-human genetic interactions.
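The counting logic behind this test can be sketched in a few lines. This is only an illustration, not the pipeline either team used: the function names and the toy sites are hypothetical, and the chimpanzee base is simply assumed to be the ancestral state.

```python
def classify_allele(neandertal, chimp, human_alleles):
    """Label a Neandertal base at a human SNP as 'ancestral', 'derived', or 'other'."""
    if neandertal == chimp:
        return "ancestral"  # assumption: the chimpanzee base is the ancestral state
    if neandertal in human_alleles:
        return "derived"    # matches the human allele that differs from chimpanzee
    return "other"          # matches neither; possibly damage or a Neandertal-specific change


def derived_fraction(sites):
    """Fraction of classifiable sites where the Neandertal carries the derived allele."""
    labels = [classify_allele(n, c, h) for n, c, h in sites]
    usable = [lab for lab in labels if lab != "other"]
    if not usable:
        return 0.0
    return usable.count("derived") / len(usable)


# Hypothetical toy data: (Neandertal base, chimpanzee base, set of human alleles)
toy_sites = [
    ("A", "A", {"A", "G"}),
    ("G", "A", {"A", "G"}),
    ("C", "C", {"C", "T"}),
    ("T", "C", {"C", "T"}),
    ("C", "C", {"C", "A"}),
    ("G", "T", {"T", "C"}),  # matches neither allele, so it is set aside
]
print(derived_fraction(toy_sites))
```

A low fraction would fit long isolation; a high one would point toward continued gene flow. What counts as "low" or "high" depends on how the SNPs were ascertained, which this sketch ignores.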

Noonan et al. (2006) looked for these derived SNPs and found very few of them. They concluded that there was no significant evidence of Neandertal-human interbreeding, although their statistical test couldn't rule out as much as 25 percent admixture (for reference, Plagnol and Wall 2006 estimated only 5 percent ancestry from all archaic humans, not only Neandertals).

Green et al. (2006) also looked for derived SNPs. They had a much bigger sample of DNA to work with, so they ought to have a stronger test. Here's what they wrote (p. 334):

If this observation holds (i.e., if it is not influenced by contamination, and the ascertainment function does indeed show this to be an excess of derived SNPs), then it is one of the strongest pieces of evidence for genetic intermixture of Neandertals and modern humans. Note that there are two avenues for this gene flow -- either from the ancient ancestors of modern humans into Neandertals, or out of Neandertals into early modern humans. I'm sure we will hear more about this when they have more sequence.

In the meantime, the other source of evidence about Neandertal-human genetic interaction is the genomic variation of living people. Last week's paper on MCPH1 (discussed here) is a good example of what that evidence looks like. The key feature is that if you troll through the genome, you begin to notice some loci with interesting genealogies. The interestingness is a combined signature of recent selection and ancient population structure.

Looking for genes like MCPH1 in the Neandertal genome is a no-brainer. We probably won't find a lot of them, because the Neandertals were a small subset of the ancient human population.

There is one further problem. We can recognize these interesting loci in living people because they lie on relatively long haplotypes with little recombination. The inference is that such an allele must have begun from a very low copy number around 30,000 years ago, presumably because it was introduced from some archaic population. But the SNPs that are presently linked to the selected site were probably polymorphic within the archaic population, not fixed on a long haplotype. Unless we know exactly which SNP is the selected site on a human allelic variant, we may have some trouble telling whether an archaic genome has the allele. And as I note below, a large proportion of SNPs are going to be missing from the draft Neandertal genome even when it reaches an average 1x coverage.

This just means that evidence from the genomics of living people and from the Neandertal genome won't mesh together seamlessly. There remains some complexity interpreting these relationships.

The divergence date of Neandertal and human sequences is estimated at around 520,000 years ago. What does that mean?

First, what it doesn't mean. It doesn't mean that the human and Neandertal populations diverged 520,000 years ago. I noted above that the estimate of the genetic divergence time comes from the proportion of chimpanzee-human differences for which the Neandertal shares the human allele. But of course, some living humans have the ancestral, chimpanzee-like allele for many polymorphisms, so this comparison of polymorphisms is not saying that Neandertals were like chimps. Instead, we are just disregarding the Neandertal-specific evolutionary events.

I'm sticking with the 520,000 year genetic divergence estimate from Green et al. (2006), instead of the older estimate from Noonan et al. (2006), because of the vastly larger sample in the Green paper. Still, most of the discussion does not hang too critically on the precise date although the date changes the interpretation by degrees.

The really interesting observation is the Neandertal-human genome draft difference compared to the human-human difference. Here's a passage from p. 354 of Green et al. (2006):

They don't specify where this "contemporary human" was from. The draft human genome is a chimera made up of anonymous people from different populations. That means that wherever the "contemporary human" is from, it will be the same region as represented by some part of the draft genome, but not all. So the divergence between these two mystery sequences is likely to be greater than average within a single population, and less than average between different populations.

Keeping that in mind, the human-Neandertal difference is startlingly close to this human-human difference measurement. The Neandertal is only 10 percent more different from the draft human genome than these two human sequences are from each other.

It seems very likely that we will find pairs of living human populations where the average genetic divergence is older -- maybe much older -- than this human-Neandertal divergence. For instance, it seems almost certain that the great genetic variability among living African groups will exceed this human-Neandertal difference.

Some geneticists have noted that European and Asian populations seem to be a genetic "subset" of African populations, at least for many genetic loci. With these kinds of numbers, it looks like Neandertals may be a subset of living human diversity in the same sense. I've never much liked that formulation, because "subset" is never really an accurate description of the genetic relationships. But if the seat of living human diversity is Africa, adding Neandertals to the mix may not change that pattern at all.

As Green and colleagues note, most of the genetic divergence between humans and Neandertals, and between humans and other living humans, is actually much older than the divergence of these populations from each other.

At one limit (that is, assuming complete isolation of humans and Neandertals after some date), the population divergence time depends on the effective size of the population that was ancestral to living humans and Neandertals. It is basically not possible to obtain a good estimate of this ancestral effective population size from the current Neandertal data -- mainly because good estimates depend on heterogeneity in divergence times among loci, which we can't infer for the short Neandertal sequences.

Both papers assume that this ancestral effective population size was small -- even smaller than the long-term human effective population size of around 10,000 individuals. A smaller effective size for the human-Neandertal ancestral population is fairly unlikely, though, since it must have been distributed across large parts of Europe and Africa at a minimum. More likely, the effective size was close to 10,000, just as in humans, since the human effective size is inferred to have been that small over at least the past million years.

If you're reading the term "effective population size" for the first time, don't worry. It doesn't mean "population size", and it has mainly a technical genetic meaning. It is sort of important that the Neandertal sequence supports this particular effective size over the long term, but it will take another post to explain why.

As noted above, the populations may never have been isolated. The derived SNP evidence might suggest that there was never any population divergence, or at least no long period of complete isolation, between humans and Neandertals. We'll have to wait and see.

Why does this bone have such a low level of contamination compared to other Neandertals?

I should start by pointing out that "contamination" here means "modern human sequence". All fossil bones are loaded with exogenous DNA, like bacterial and fungal genomes that invaded after the animal died. From a certain point of view, those exogenous genes are contaminants -- we are generally not interested in their sequences, and sorting them out from the endogenous Neandertal DNA is a real nuisance. But because we have a reference genome from humans to compare with the sequences from the ancient bone, we can sort out these bacterial and other exogenous sequences. So although they do "contaminate" the bone, they don't distort our picture of the sequence.

The real problem is that there are contaminating sequences from recent humans in the ancient bones. These sequences come from excavators, anthropologists who studied the bones, museum personnel, graduate students who cleaned and prepared the bones for sequencing, other samples from the labs doing the work, and who knows where else.

I have been asked many times why they can't eliminate this contamination. For example, why can't they just clean the bone, or take samples from deep inside the bone, or take samples from deep inside of teeth, or use a clean room, yada yada yada.

The answer is that they do wash the bones, and they do eliminate the outer surface, and they do take samples from deep inside of bones, and they do work in a clean room, with ultraviolet lights and positive air pressure so that DNA can't get sucked into the room, and rubber gloves and bunny suits, and the whole nine yards. And the bones are still contaminated, deep inside them.

Now, you may imagine anthropologists picking their noses with the bones, and using them as chopsticks, and putting them up to their ears to hear them breathing, and all manner of other things. The truth is, I have no idea how the contamination gets in there, and neither does anybody else. It's just there, and apparently we can't avoid it.

The extraction team looked at lots of Neandertal specimens, with one question in mind: How much human contamination does this bone have? To answer this question, they amplified mtDNA sequences, and assessed what proportion of transcripts were Neandertal-like and what proportion were human-like. Vindija 80 stood out as having a very low proportion of human-like transcripts -- less than 2 percent. So they inferred that there was little contamination of the sample by recent human DNA, and are working under the assumption that the nuclear genome is contaminated in a similar low proportion.

As for why this particular bone has such low contamination, well, nobody really knows that either. Svante Pääbo speculates that it is because Vi 80 was originally identified as fauna and hasn't been handled much. He might well be right. Which would bring us back to the nose-picking chopstick bone theory, I suppose.

If Vindija 80 was put in a box with fauna, it can't be very diagnostic. This high preservation seems very unusual. How do they know it was a Neandertal?

The radiocarbon date is 38,310 +/- 2130, and they found very high preservation of a Neandertal-like mtDNA sequence. If you think that fails to answer the question, well.

How can they deal with the damage to ancient DNA sequences?

One of the things that has become clear about ancient DNA research is that DNA from ancient fossils undergoes various kinds of damage. The most obvious is the fragmentation of the DNA into very small pieces, a problem that both the sequencing approaches have been designed to circumvent.

But a more serious problem is that some bases become degraded over time in ways that cause the sequencing methods to misidentify them. For example, cytosine (the "C" base) can be chemically modified over time into a base called uracil, which sequencing methods misidentify as a thymine (the "T" base).

There seems to be no way to tell which base pair changes are diagenetic (i.e. DNA damage-induced) and which are genuine Neandertal changes.

So, the teams took a radical approach: just ignore all the changes that are possibly damage. Instead of analyzing Neandertal-specific changes, they decided to assess the status of human polymorphisms and human-chimpanzee differences in the Neandertal sequence. This method is how they estimated the Neandertal-human genetic divergence time, for example -- because the Neandertals have approximately 96 percent similarity with humans for human-chimpanzee genetic differences, it is possible to infer that their genes diverged from the average human gene only 4 percent of the evolutionary time separating humans and chimpanzees. The research teams assumed that humans and chimpanzees are separated by 13 million years of evolution -- this includes the time on both the human and chimpanzee lineages since their common ancestor, assumed to be 6.5 million years ago. These dates and genetic differences produce an estimate of around 520,000 years ago for human-Neandertal genetic divergences.
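The arithmetic in that estimate is short enough to write out. Here is a minimal sketch using only the figures quoted above; the function name is made up for illustration, and the calculation follows the simple proportional reading given in the text rather than the papers' full statistical treatment.

```python
def divergence_time(shared_fraction, total_separation_years):
    """Scale the total human-chimpanzee separation by the fraction NOT shared."""
    return (1.0 - shared_fraction) * total_separation_years


split_age = 6.5e6              # assumed age of the human-chimpanzee common ancestor
total = 2 * split_age          # time summed along both lineages: 13 million years
estimate = divergence_time(0.96, total)
print(round(estimate))         # about 520,000 years
```

Because the estimate scales linearly with the assumed split age, moving the human-chimpanzee ancestor earlier or later moves the Neandertal date by the same proportion.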

In the long run, it should be possible to sequence the genome with multiple coverage, which would allow damage to be resolved. With many copies, the damage to any individual DNA sequence will be unique, while changes that are evident in multiple copies are probably real.
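To make the idea concrete, here is a toy majority-vote consensus over overlapping reads. It is a sketch under strong assumptions: the reads are already aligned and of equal length, "-" marks no coverage, and the depth threshold is arbitrary. Real ancient-DNA consensus calling is considerably more involved.

```python
from collections import Counter


def consensus(reads, min_depth=3):
    """Majority-vote consensus across equal-length aligned reads ('-' = no coverage)."""
    length = len(reads[0])
    called = []
    for i in range(length):
        column = [r[i] for r in reads if r[i] != "-"]
        if len(column) < min_depth:
            called.append("N")  # too few copies to tell damage from real change
            continue
        base, count = Counter(column).most_common(1)[0]
        # Accept the base only if a strict majority of copies agree
        called.append(base if count > len(column) / 2 else "N")
    return "".join(called)


# Hypothetical reads: two carry a one-off C->T change at different positions
reads = [
    "ACGTTCGA",
    "ACGTCCGA",
    "ATGTCCGA",
    "ACGTCC--",
]
print(consensus(reads))
```

Each isolated C-to-T change is outvoted by the other copies, which is the sense in which damage "will be unique" to a single molecule. At 1x coverage there is nothing to vote with, so every position would come back as N under this rule.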

But we are quite a ways from the long run, so for the time being we have to deal with DNA damage. For individual genes, it may be possible to reason exactly what effects changes would have and thereby arrive at a conclusion about which changes are diagenetic. For instance, only a minority of such changes will affect coding regions, and some of those will be synonymous changes, so only a small proportion will make amino acid changes, and if there are only a couple of these per gene the resulting protein structure may be able to be analyzed. So from a functional perspective, it should be possible to work with damaged sequence.

The main problem is from the statistical perspective (i.e., assuming neutrality), and here I think the teams have taken a very reasonable approach by just throwing the changes out.

Will they really be able to sequence the full Neandertal genome in two years?

I got a lot of questions from journalists on this point. I really see no reason to doubt it -- they know their average sequence yield from a given amount of extract, and the proportion of that yield that is actually Neandertal DNA.

The main caveat is a statistical one: 3 billion base pairs of sequence is -- on average -- one full coverage of the genome, but in practice some loci will be sequenced many times, while a fairly large proportion (a bit over 30 percent) won't be sequenced at all.

A billion missing bases may not seem like a big deal, but there is a catch: the short average fragment size means that the missing patches will be distributed throughout every gene. Since the average gene covers a region of a few kilobases, complete gene sequences will be pretty rare -- most will have gaps in them amounting to around 30 percent of their length.

Or to put it another way, a bit more than 30 percent of informative SNPs in humans will not be represented in the first Neandertal genome draft.
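The "bit over 30 percent" figure is what you get if reads are assumed to land uniformly at random, so that per-base depth is roughly Poisson. That is a simplifying assumption of mine for illustration, not something the papers state in this form. Under it, the fraction of bases with zero reads at mean coverage c is e^(-c).

```python
import math


def fraction_uncovered(mean_coverage):
    """Poisson probability that a given base is hit by zero reads."""
    return math.exp(-mean_coverage)


genome_size = 3_000_000_000  # roughly 3 billion base pairs, as in the text
for c in (1, 2, 5, 10):
    missing = fraction_uncovered(c)
    print(f"{c}x: {missing:.1%} uncovered, ~{missing * genome_size:,.0f} bases")
```

At 1x this gives a little under 37 percent, or about 1.1 billion bases. The missing fraction drops off exponentially as coverage rises.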

A second issue is that the genome of Vindija 80 is not haploid -- there are two copies of most everything in that bone. Some of these copies were polymorphisms in Neandertals, and if these are reconstructed into a single sequence, there will be mixed-up haplotypes. This means that it will be difficult, if not impossible, to assess whether there were functional multi-SNP differences between the human and Neandertal sequences of particular genes.

Anyway, that's probably getting beyond ourselves. No doubt somebody will think of some way to get around these problems, and it will eventually become cheap enough to do 10x coverage instead of 1x coverage.

They're already making plans to clone Neandertal super-soldiers, aren't they?

Maybe unsurprisingly, this question about Neandertal cloning is the one most journalists so far have wanted to ask me. I'm sure they're asking everybody, hoping that somebody will slip a really pithy quote for them.

Since I have clones here at home, I can't bring myself to get too worked up about it. A Neandertal clone army would definitely be an improvement over a Neandertal Jar-Jar.

Personally, I have another problematic scenario in mind, which I am developing elsewhere.


Green RE, Krause J, Ptak SE, Briggs AW, Ronan MT, Simons JF, Du L, Egholm M, Rothberg JM, Paunovic M, Pääbo S. 2006. Analysis of one million base pairs of Neanderthal DNA. Nature 444:330-336.

Lambert DM, Millar CD. 2006. Evolutionary biology: Ancient genomics is born. Nature 444:275-276.

Margulies M and 55 others. 2005. Genome sequencing in microfabricated high-density picolitre containers. Nature 437:376-380.

Noonan JP, Coop G, Kudaravalli S, Smith D, Krause J, Alessi J, Chen F, Platt D, Pääbo S, Pritchard JK, Rubin EM. 2006. Sequencing and analysis of Neanderthal genomic DNA. Science 314:1113-1118.

Pennisi E. 2006. The dawn of stone age genomics. Science 314:1068-1071.

Römpler H and 8 others. 2006. Nuclear gene indicates coat-color polymorphism in mammoths.

Ronaghi M. 2001. Pyrosequencing sheds light on DNA sequencing. Genome Res 11:3-11.

Schloss PD, Handelsman J. 2003. Biotechnological prospects from metagenomics. Current Opinion in Biotechnology 14:303-310.

Updated: November 17, 2006


Set 3: Question 4

Note to 7.014 students who are Biology majors: Use these problems for practice; some may go further than you are used to. But try them anyway, because they are interesting. Use the Hypertext for help, or the textbook, if you want to.

Enzyme Biochemistry Practice Problems

1) Which of the following assumptions are made in Michaelis-Menten kinetics? For Review material on Michaelis-Menten kinetics, consult the Hypertextbook

2) An enzyme (E) in a certain flower converts a colorless substance (S) into a pink colored compound (P). The enzyme follows Michaelis-Menten kinetics.
E + S <--> ES --> E + P

a) Which of the following are valid definitions of the specific activity of enzyme E?

b) What is Vmax? For Review material on Km and Vmax, consult the Hypertextbook

3) You are a research scientist studying a novel enzyme X, and you want to characterize this new enzyme. You measure the velocity of the reaction with different substrate concentrations and get the following data:

[substrate] (mM)    Initial velocity (mmol/min)
3.0                 10.4
5.0                 14.5
10.0                22.5
30.0                33.8
90.0                40.5

a) Given these data, choose the correct graph. To review material on enzyme kinetics, consult the Hypertextbook.

b) What would be the effect on the initial reaction velocities if the concentration of enzyme was reduced to 10% of the amount used in part a)? To review material on Michaelis-Menten kinetics, consult the Hypertextbook.

c) How would the ten-fold decrease in enzyme concentration affect the observed Km? To review material on enzyme kinetics examples, consult the Hypertextbook.

4) You have isolated the enzyme milleniase from the rare Y2K bug. This enzyme cleaves the first two numbers off of four digit dates. You flex your skills as an enzymologist and come up with the following graph of initial velocity as a function of substrate concentration:

a) From the graph, estimate Km. To review material on measuring Km and Vmax, consult the Hypertextbook.

b) What is the approximate value of Vmax? To review material on measuring Km and Vmax, consult the Hypertextbook.

c) Is milleniase an allosteric enzyme? Explain.

5) You wish to find the amino acid sequence of an enzyme from the baker's yeast, Saccharomyces cerevisiae. Which of the following methods might you use to determine this?

a) Purify the protein and subject it to N-terminal sequencing.

b) Treat the purified enzyme with protease and then sequence the proteolytic fragments by Edman degradation.

c) Clone the gene and sequence the DNA of the coding sequence, then generate a deduced protein sequence.

d) Query GenBank (through the National Center for Biotechnology Information) and look up the sequence in the database.

8) Your biology professor states in his lecture that all enzymes are proteins, but you see your TA raising her eyebrows. The reason for her expression is:

9) Lactose is a disaccharide found in milk. Many adults throughout the world get sick from drinking milk because they cannot digest lactose. Lactose intolerance varies markedly among various human populations. For example, only about 3% of people of Danish descent are lactose intolerant, compared with 97% of people of Thai descent. When someone who is lactose intolerant ingests milk, the lactose accumulates in the lumen of the small intestine because there is no mechanism for uptake of the disaccharide. This causes abdominal distension, cramping, and watery diarrhea. Adults who can drink milk can do so because of the enzyme lactase which is located on the outer surface of epithelial cells lining the small intestine. Lactase hydrolyzes lactose into its two component monosaccharides, glucose and galactose. Both glucose and galactose can cross the epithelial cells, and therefore do not cause illness.

a) Why can't lactose diffuse across the membranes of the intestinal epithelial cells in the absence of a carrier-mediated uptake system?

b) Why does the accumulation of sugar (or any solute) in the intestinal lumen cause an influx of water that leads to watery diarrhea?

c) You decide to study lactase further, and see whether it can also cleave other common disaccharides, such as maltose. (Maltose = glucose + glucose.) You find that maltose is NOT cleaved by lactase, and furthermore, maltose appears to have some kind of inhibitory effect on lactase's ability to cleave lactose. Is maltose a more likely candidate for competitive or noncompetitive inhibition of lactase?

d) In order to confirm your hypothesis in part (c), you quantitatively study the kinetics of lactase with lactose alone, and in the presence of both lactose and maltose. You measure the initial velocity of the reaction (rate at which lactose is cleaved) at varying concentrations of substrate. The data are given below.

[Lactose] (mol/L)    Velocity (mol/min)
                     lactose only    with maltose
0.3 x 10^-5          10.4            4.1
0.5 x 10^-5          14.5            6.4
1.0 x 10^-5          22.5            11.3
3.0 x 10^-5          33.8            22.6
9.0 x 10^-5          40.5            33

Do these data support the model that maltose is a competitive or noncompetitive inhibitor? (You may need to graph 1/V vs. 1/[S] for lactase in the presence and absence of maltose).
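If you would rather compute the double-reciprocal lines than draw them, a plain least-squares fit of 1/V against 1/[S] will do. The sketch below is one possible approach rather than the course's official method: the function name is invented, and the fit inherits the usual weakness of double-reciprocal plots, which give the low-concentration points the most weight.

```python
def lineweaver_burk(substrate, velocity):
    """Least-squares line through (1/[S], 1/V); returns (Km, Vmax)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                    # slope = Km / Vmax
    intercept = mean_y - slope * mean_x  # intercept = 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Data from the table in part (d)
lactose = [0.3e-5, 0.5e-5, 1.0e-5, 3.0e-5, 9.0e-5]   # mol/L
v_alone = [10.4, 14.5, 22.5, 33.8, 40.5]             # lactose only
v_maltose = [4.1, 6.4, 11.3, 22.6, 33.0]             # lactose plus maltose

km_alone, vmax_alone = lineweaver_burk(lactose, v_alone)
km_inhib, vmax_inhib = lineweaver_burk(lactose, v_maltose)
print(f"lactose only: Km = {km_alone:.2e} mol/L, Vmax = {vmax_alone:.1f}")
print(f"with maltose: Km = {km_inhib:.2e} mol/L, Vmax = {vmax_inhib:.1f}")
```

Compare the two fits. A competitive inhibitor leaves Vmax essentially unchanged and raises the apparent Km. A noncompetitive inhibitor lowers Vmax and leaves Km about the same.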

Human papillomaviruses (HPV)

HPV are important viral pathogens that cause a variety of infections of the anogenital and oral mucosa. These infections can lead to epithelial cancers at these mucosal sites. Papillomaviruses (PVs) are species-restricted such that HPVs cannot be directly studied in pre-clinical laboratory animal models. Two rabbit papillomavirus models have been used extensively to study various aspects of papillomavirus biology, including vaccine testing [96], anti-viral treatments [97], papillomavirus biology [98], and latent viral infections [99]. The viruses include cottontail rabbit papillomavirus (CRPV), which is a cutaneous-tropic virus whose lesions spontaneously progress to cancer, and rabbit oral papillomavirus (ROPV), which is a mucosa-tropic virus that induces oral infections. Numerous viral mutant genomes [98] and a unique HLA.A2.1 transgenic rabbit line [100] have been developed to study host immune responses to viral infection, therapeutic T-cell-based vaccines and host anti-CD8 immunity to viral proteins.

Gene Transfer Between Species Is Surprisingly Common

Bacteria are known to share genes, spreading drug resistance, for example. But how common is it in other organisms, including mammals like us? Two new studies show that most bacteria have genes or large groups of genes shared by other bacteria.

Even among higher organisms, shared genes are the rule rather than the exception, UC Berkeley and LBNL researchers say.

Two new studies by University of California, Berkeley, scientists highlight the amazing promiscuity of genes, which appear to shuttle frequently between organisms, especially more primitive organisms, and often in packs.

Such gene flow, dubbed horizontal gene transfer, has been seen frequently in bacteria, allowing pathogenic bacteria, for example, to share genes conferring resistance to a drug. Recently, two different species of plants were shown to share genes as well. The questions have been: How common is it, and how does it occur?

In a report appearing this week in the Proceedings of the National Academy of Sciences (PNAS), UC Berkeley and Lawrence Berkeley National Laboratory (LBNL) researchers analyzed more than 8,000 different families of genes coding for proteins - families that represent the millions of proteins in all living creatures - to assess the prevalence of horizontal gene transfer.

They found that more than half of all the most primitive organisms, Archaea, have one or more protein genes acquired by horizontal gene transfer, as compared to 30 to 50 percent of bacteria that have acquired genes this way. Fewer than 10 percent of eukaryotes - plants and animals - have genes acquired via horizontal gene transfer.

In a second report published by Nature, two species of bacteria living together in the pink slime of an acidic California mine were found to share large groups of genes. These genes code for proteins that work together, so by acquiring the entire block from another organism, bacteria can gain a new function that helps them adapt more quickly to the same type of environment - in this case, a hot, highly acidic, metal-rich broth.

This is the first observation of exchange of very large genomic blocks between organisms in a natural microbial community, according to UC Berkeley's Jill Banfield, who led the team of researchers from LBNL, Oak Ridge National Laboratory (ORNL), Lawrence Livermore National Laboratory and the U. S. Department of Energy's Joint Genome Institute (JGI).

"One of the key questions being debated was, 'Is horizontal gene transfer extensive and rampant, or is it a relatively rare event?'" said Sung-Hou Kim, professor of chemistry at UC Berkeley and coauthor of the PNAS paper. "This becomes important in classifying organisms and comparing whole genomes to find their relationships.

"Our study shows that gene transfer is fairly common, but the extent in a given organism is fairly low - that is, most organisms have received one or more genes from a closely related organism. And while it's very likely that genes are transferred in chunks that are linked metabolically, I bet it's not always true. If a group of genes doesn't have value in a new environment for a new organism, it's not going to stick around."

"This provides important information about the conservation of genetic resources to enable life to survive and thrive," said ORNL's Bob Hettich, a co-author of the Nature paper. "Ultimately, the basic knowledge gained from this research will lead to a greater understanding of genetic diversity in related organisms and should lead to developments in human health and bioremediation."

Though the Nature findings about mine slime bear on the issue of horizontal gene transfer, the study's main goal was to detect, with high resolution, which organism is able to carry out what function within a natural, uncultivated microbial community, according to Banfield.

"In addition to revealing a history of genetic exchange between two dominant organism types in the mine, we show that it is possible to identify a large fraction of the proteins from coexisting organisms and determine which organism most of the proteins comes from, even if the organisms are quite closely related," said Banfield, a professor of earth and planetary science and of environmental science, policy and management at UC Berkeley and also an LBNL researcher.

Banfield leads a long-term study of the community of organisms in mine slime obtained from the Richmond Mine near Redding, Calif. This microbial biofilm has turned out to be an ideal research subject, Banfield said, because the simple community contains few enough organisms that they can be used as a model system to uncover aspects of how microbes interact with each other and their surroundings in ways that are difficult or impossible in other environments.

Banfield contrasts her strategy of ever more detailed studies of a single site to that of Craig Venter, who has been sailing the world's oceans aboard his boat, Sorcerer II, sampling large communities of organisms to survey global diversity. After four years collecting vast amounts of genomic information, he plans to publish some of his analyses next week in the Public Library of Science, or PLoS.

In 2002, the mine was the source of samples for the first fairly comprehensive community genomic, or metagenomic, characterization of a natural microbial consortium. In 2005, Banfield and colleagues presented the first relatively large-scale analysis of the proteins that consortia members make to carry out the various metabolic tasks needed for life underground - work that revealed information about the machinery used to adapt to the extreme conditions in which they live. More recently, in 2006, research scientist Brett Baker, Banfield and colleagues reported that the biofilms harbor novel archaeal organisms that appear to be extremely small compared to other life forms.

"Analysis of how microorganisms respond to their environments and the role of exchange of genetic material in adaptation and evolution is important if we are to understand important environmental processes such as acid mine drainage, or even degradation of cellulose for ethanol production by microbial communities," added Banfield.

In their new paper, the researchers combine metagenomics with strain-resolved shotgun proteomics to show that different organisms are exchanging large blocks of their genes.

"Who's there and what are they doing are key questions in microbial ecology," said Banfield's colleague Vincent Denef, a post-doctoral researcher in UC Berkeley's Department of Earth and Planetary Science. "Our high-resolution, mass spectrometry-based community proteomics approach answers both at the same time. We can now tell apart closely related organisms, which we previously would have grouped as one species, and we can monitor and discriminate their behavior within the same natural community. These abilities will allow us to understand the implications of small differences in genome sequence and content on ecological performance, one of the key goals of the current microbial genomic sequencing efforts."
