Why are riboswitches mostly present in bacteria and not in eukaryotes?


Riboswitches are a rather elegant way to regulate gene expression without any additional machinery. A small ligand binds to the mRNA and directly influences transcription or translation.

Most of the known riboswitches are found in bacteria; there are only a few examples in eukaryotes. As far as I know, there are no classical riboswitches in humans (there is one example, but it is triggered by a protein and not a metabolite). It seems that more complex organisms tend to use other methods of gene regulation.

Are there any known reasons for this? What are the drawbacks of regulating gene expression with riboswitches compared to using regulatory proteins? Is there an explanation for the lack of riboswitches in more complex organisms?


You get higher accuracy and more timely control at the post-translational level (think about the time it takes to affect a pathway with cofactor stimulation compared to gene expression stimulation). You also have more points of control throughout the pathway if you affect activity post-translationally, allowing for a more complex interaction between stimulating and repressing factors.


Genetics Chapter 16

Inducer becomes available and binds to regulatory protein and causes a conformation change and the complex of regulator bound to the inducer cannot remain bound to operator.

When co-repressor is available, it will bind to inactive regulator causing conformational change to activate the regulator which will bind to operator and downregulate transcription preventing RNA polymerase from binding.

Promoter for I gene (repressor): the native form binds to the operator region and blocks polymerase from binding to the lac promoter.

No transcription of lacZ, lacY, and lacA, the 3 structural genes of the lac operon

When lactose is present, lactose gets converted to allolactose, which will bind to the repressor and inactivate it.

Bacteria will preferentially use glucose as energy but if not available, they'll switch to sugars in the system

When glucose is high, cell doesn't need to transcribe lac operon

Decision matrix for transcribing or not transcribing

Transcribes only when lactose is present, whatever the glucose level; transcription is high only when glucose is also low

cAMP binds to CAP and together they bind to the binding site upstream of the promoter and recruit polymerase.

This works because when cAMP and CAP bind upstream of the promoter, they cause DNA to bend and they interact with RNA polymerase, stabilizing it at the promoter, allowing transcription to occur at a high rate.
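The decision matrix described in these notes can be sketched as a small truth table. This is a simplified illustration only: the function name and the qualitative levels "off", "low" and "high" are assumptions made for the example, and the small basal (leaky) expression of a repressed operon is ignored.

```python
def lac_operon_transcription(lactose_present: bool, glucose_present: bool) -> str:
    """Return a qualitative transcription level for the lac operon."""
    # Allolactose (made from lactose) inactivates the repressor, so the
    # repressor sits on the operator only when lactose is absent.
    repressor_on_operator = not lactose_present
    # cAMP accumulates when glucose is depleted, so the cAMP-CAP complex
    # binds upstream of the promoter only when glucose is absent.
    cap_camp_bound = not glucose_present

    if repressor_on_operator:
        return "off"
    return "high" if cap_camp_bound else "low"


if __name__ == "__main__":
    for lactose in (False, True):
        for glucose in (False, True):
            level = lac_operon_transcription(lactose, glucose)
            print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only the combination of lactose present and glucose absent gives the "high" level, which matches the two-input logic of repressor release plus cAMP-CAP activation.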

In mRNA, there's the ribosomal binding site, start codon, and there's regulatory genes

Transcription regulator is expressed in an inactive form

Once translation reaches the end of region 1, the ribosome will stall because trp is low and there are not enough tRNAs loaded with tryptophan.

Stalling prevents region 1 from hybridizing with region 2, so region 2 pairs with region 3 instead and transcription continues.

When trp is high, the ribosome will not stall; it sits on region 2, which therefore cannot bind to region 3.
The 3+4 hairpin loop forms, which dislodges RNA polymerase from the template.
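The attenuation logic in the notes above can be written out as a short sketch. The function name and the returned dictionary keys are hypothetical, chosen only for this illustration; the pairing rules follow the notes (stalled ribosome leaves region 2 free to pair with region 3, otherwise the 3+4 terminator forms).

```python
def trp_attenuation(trp_high: bool) -> dict:
    """Qualitative outcome of attenuation in the trp leader."""
    # With low trp, charged tRNA-Trp is scarce and the ribosome stalls
    # in region 1.
    ribosome_stalls_in_region_1 = not trp_high

    if ribosome_stalls_in_region_1:
        # Region 2 is free and pairs with region 3 (antiterminator),
        # so the 3+4 terminator cannot form.
        hairpin = "2+3"
    else:
        # The ribosome covers region 2, leaving region 3 to pair with
        # region 4 (terminator hairpin).
        hairpin = "3+4"

    return {
        "hairpin": hairpin,
        "transcription_continues": hairpin == "2+3",
    }


if __name__ == "__main__":
    for trp_high in (False, True):
        print("trp high" if trp_high else "trp low", trp_attenuation(trp_high))
```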

RNA molecules acting as regulatory regions

Ribosome binding site becomes engaged in a complementation (base-pairing) event.

Ribosome cannot bind and translation doesn't happen.

Have a specific or general affinity for either ssDNA or dsDNA

Generally interact with the major groove of B-DNA because it exposes more functional groups that identify a base pair

Acts as an agent to bind two or more molecules together

When an inducer molecule binds to the riboswitch, the binding changes the configuration of the RNA molecule and alters the expression of the RNA

Small molecules in prokaryotes

Loop of amino acids with zinc at the base

Helix of leucine residues and basic arm

Two leucine residues interdigitate

Two α-helices separated by a loop of amino acids

Because the gene expression necessary for utilizing other sugars is turned off, only enzymes involved in the metabolism of glucose will be synthesized.

Operons that exhibit catabolite repression are under the positive control of catabolite activator protein (CAP)

For CAP to be active, it must form a complex with cAMP.

If a mutation prevents CAP from binding to the site, then RNA polymerase will bind the lac promoter poorly.

In Streptococcus pneumoniae, the ability to carry out transformation requires from 105 to 124 genes, collectively termed the com regulon.

The com regulon is activated in response to a protein called competence-stimulating peptide (CSP), which is produced by bacteria and is exported into the surrounding medium. When enough CSP accumulates, it attaches to a receptor protein that stimulates the transcription of genes within the com regulon and sets in motion a series of reactions that ultimately results in transformation.

Negative repressible: No, the regulon requires CSP for gene expression. If it were a negative repressible operon, gene expression would occur until repressed by the activated repressor.

Positive inducible: Yes. Increased levels of CSP induce the receptor to stimulate transcriptional activator functions, which allow gene expression to occur similar to what occurs in positive inducible regulation.


Key Concepts and Summary

  • Gene expression is a tightly regulated process.
  • Gene expression in prokaryotes is largely regulated at the point of transcription. Gene expression in eukaryotes is additionally regulated post-transcriptionally.
  • Prokaryotic structural genes of related function are often organized into operons, all controlled by transcription from a single promoter. The regulatory region of an operon includes the promoter itself and the region surrounding the promoter to which transcription factors can bind to influence transcription.
  • Although some operons are constitutively expressed, most are subject to regulation through the use of transcription factors (repressors and activators). A repressor binds to an operator, a DNA sequence within the regulatory region between the RNA polymerase binding site in the promoter and first structural gene, thereby physically blocking transcription of these operons. An activator binds within the regulatory region of an operon, helping RNA polymerase bind to the promoter, thereby enhancing the transcription of this operon. An inducer influences transcription through interacting with a repressor or activator.
  • The trp operon is a classic example of a repressible operon. When tryptophan accumulates, tryptophan binds to a repressor, which then binds to the operator, preventing further transcription.
  • The lac operon is a classic example of an inducible operon. When lactose is present in the cell, it is converted to allolactose. Allolactose acts as an inducer, binding to the repressor and preventing the repressor from binding to the operator. This allows transcription of the structural genes.
  • The lac operon is also subject to activation. When glucose levels are depleted, some cellular ATP is converted into cAMP, which binds to the catabolite activator protein (CAP). The cAMP-CAP complex activates transcription of the lac operon. When glucose levels are high, its presence prevents transcription of the lac operon and other operons by catabolite repression.
  • Small intracellular molecules called alarmones are made in response to various environmental stresses, allowing bacteria to control the transcription of a group of operons, called a regulon.
  • Bacteria have the ability to change which σ factor of RNA polymerase they use in response to environmental conditions to quickly and globally change which regulons are transcribed.
  • Prokaryotes have regulatory mechanisms, including attenuation and the use of riboswitches, to simultaneously control the completion of transcription and translation from that transcript. These mechanisms work through the formation of stem loops in the 5′ end of an mRNA molecule currently being synthesized.
  • There are additional points of regulation of gene expression in prokaryotes and eukaryotes. In eukaryotes, epigenetic regulation by chemical modification of DNA or histones, and regulation of RNA processing are two methods.


Gene regulation by riboswitches

Riboswitches are structured domains within the non-coding portions of some mRNAs, where they serve as metabolite-sensing genetic switches. Metabolite binding causes allosteric changes in the mRNA that bring about changes in gene-expression processes such as transcription termination and translation initiation.

Riboswitches comprise two domains: an aptamer and an expression platform. The aptamer is highly conserved even in distantly related organisms, and serves as a precise sensor for its target metabolite. The expression platform is far more variable in sequence and in structure as it can function by assuming one of many structural forms to control gene expression.

Experimental data that are now known to correspond to riboswitch function date back at least 30 years. Recent studies have confirmed that a variety of gene-control 'mysteries' described in the literature over the past decades can be explained by the presence of seven distinct classes of riboswitches.

The aptamer domains of riboswitches exhibit surprising selectivity and specificity that compare favourably with protein receptors. These findings, along with the possibility that modern riboswitches might be evolutionary holdouts of an ancient form of gene-control system, indicate that the performance characteristics of riboswitches are competitive with those that are exhibited by proteins.

The mechanisms of gene control by bacterial riboswitches are largely based on transcription termination and translation initiation. However, the discovery of a riboswitch that has ribozyme function, and evidence which indicates that eukaryotes might use riboswitches for splicing control, hint at the potential for far greater diversity for riboswitch function in ancient and modern organisms.

New studies indicate that bacteria express numerous new RNA motifs and small non-coding RNAs. These findings suggest that more riboswitches will be identified, and so riboswitches seem to be a significant form of genetic control in bacteria.


4 LIGAND BINDING AFFINITY AND KINETICS OF RIBOSWITCH FUNCTION

Riboswitch aptamers can tightly bind their target ligands with values for dissociation constants (KD) ranging from the mid micromolar as measured for GlcN6P binding by glmS ribozymes (Winkler et al. 2004) to the mid picomolar range as observed for some riboswitch representatives that bind TPP (Welz and Breaker 2007), FMN (Lee et al. 2009) and c-di-GMP (Sudarsan et al. 2008). For Escherichia coli, the presence of a single molecule per cell corresponds to a concentration in the low nanomolar range. Therefore, picomolar KD values for aptamers are at least two orders of magnitude better than required for a riboswitch to detect compounds as rare as one per cell!
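The "one molecule per cell" figure can be checked with a short calculation. The sketch below assumes a cell volume of about 1 femtolitre for E. coli, which is a commonly used round number and not a value given in the text; the function and constant names are likewise illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration(molecules: float, volume_litres: float) -> float:
    """Convert a molecule count in a given volume to mol/L."""
    return molecules / (AVOGADRO * volume_litres)


# Assumption for illustration: an E. coli cell holds roughly 1 fL (1e-15 L).
ASSUMED_E_COLI_VOLUME_L = 1e-15

one_molecule_molar = molar_concentration(1, ASSUMED_E_COLI_VOLUME_L)

if __name__ == "__main__":
    print(f"1 molecule per cell ~ {one_molecule_molar * 1e9:.2f} nM")
    # Ratio to a hypothetical picomolar-range KD of 10 pM.
    print(f"ratio to a 10 pM KD: {one_molecule_molar / 1e-11:.0f}x")
```

Under that volume assumption a single molecule corresponds to a concentration between 1 and 2 nM, which is why a KD in the picomolar range is far tighter than detection of one molecule per cell would require.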

There are several possible explanations for this KD and metabolite concentration paradox (Ames and Breaker 2009). Perhaps, the conditions used to estimate ligand affinities are not a good approximation of cellular conditions, and the actual KD values may be much poorer. It is also known that nucleotides flanking the aptamer domains can diminish measured KD values, presumably by forming alternative folds that compete with the ligand-receptive structure. Thus the affinities of riboswitch aptamers may be poorer for nascent transcripts as they emerge from RNA polymerase. Another hypothesis that would resolve this contradiction is that some riboswitches may not reach equilibrium with their target metabolite, but they rely on the kinetics of RNA folding and ligand binding to properly modulate gene expression. In other words, the speed of ligand association, rather than the equilibrium constant reflecting ligand affinity, may be the critical determinant for the concentration of ligand needed to trigger riboswitch action. Unless the ligand binds before RNA polymerase passes beyond the terminator stem, transcription will progress to completion even if ligand binding eventually occurs.

Evidence for the importance of ligand-binding kinetics has emerged from an analysis of an FMN riboswitch from the ribD operon of B. subtilis (Wickiser et al. 2005b). This riboswitch operates via transcription termination whereby ligand binding permits formation of a terminator stem (Fig. 3A). The study revealed that the concentration required to trigger efficient transcription termination is more than 10-fold higher than the KD value measured for the minimal aptamer domain. Furthermore, conducting transcription reactions under conditions that accelerate the speed of RNA polymerization (e.g., increasing NTP concentrations or using mutations to remove RNA polymerase pause sites) creates a demand for even higher FMN concentration to trigger termination.
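The difference between equilibrium and kinetic control can be illustrated with a minimal sketch. It compares the equilibrium occupancy, [L] / ([L] + KD), with the fraction of transcripts that have bound ligand within a limited time window, 1 - exp(-k_on [L] t), ignoring ligand dissociation during that window. The KD, association rate constant and time window used here are hypothetical round numbers, not values measured for the ribD riboswitch.

```python
import math


def equilibrium_fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Occupancy if the aptamer had time to reach equilibrium."""
    return ligand_molar / (ligand_molar + kd_molar)


def kinetic_fraction_bound(ligand_molar: float, k_on_per_molar_s: float,
                           window_s: float) -> float:
    """Fraction bound before the window closes (dissociation ignored)."""
    return 1.0 - math.exp(-k_on_per_molar_s * ligand_molar * window_s)


# Hypothetical parameters, for illustration only.
KD = 1e-8        # mol/L
K_ON = 1e5       # 1/(mol/L * s)
WINDOW_S = 2.0   # time before polymerase passes the terminator stem

if __name__ == "__main__":
    for ligand in (1e-8, 1e-7, 1e-6, 1e-5):
        eq = equilibrium_fraction_bound(ligand, KD)
        kin = kinetic_fraction_bound(ligand, K_ON, WINDOW_S)
        print(f"[L]={ligand:.0e} M  equilibrium={eq:.3f}  kinetic={kin:.3f}")
```

With these assumed numbers, a ligand concentration equal to KD gives half occupancy at equilibrium but almost no binding inside the window, and shortening the window (a faster polymerase) lowers the kinetic fraction further, which is the qualitative behaviour described above.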

Kinetic function of an FMN riboswitch. (A) Sequence and secondary structure models for the ribD FMN riboswitch from B. subtilis (Winkler et al. 2002a). The RNA functions as a genetic “OFF” switch wherein FMN binding stabilizes P1 formation, precludes the formation of an antiterminator stem, and permits the formation of a terminator stem that represses gene expression. (B) Simplified kinetic scheme for the function of the ribD FMN riboswitch depicted in A. Steps represented by black and gray arrows lead to termination and full-length mRNA production, respectively. See elsewhere for details (Wickiser et al. 2005b).

This and related data (Wickiser et al. 2005a; Gilbert et al. 2006) indicate that at least some riboswitches are not thermodynamically driven, but rely on the kinetics of transcription, RNA folding and metabolite binding to tune the concentration of ligand needed to trigger genetic control. Interestingly, this characteristic may give riboswitches an advantage over protein genetic factors in some instances (Ames and Breaker 2009). A kinetically driven riboswitch can be tuned to respond to a different concentration of metabolite by accruing mutations in the aptamer that change the rate constant for ligand association. Perhaps more likely to occur are mutations in the span of nucleotides linking aptamer to expression platform that change the time needed for RNA polymerase to reach the terminator stem. Kinetically driven riboswitches could lower or raise the concentration needed to trigger function simply by inserting nucleotides into this linker or deleting nucleotides from it, respectively. RNA World riboswitches may have exploited similar characteristics to experience a smoother evolutionary landscape where functional tuning could occur by mutations outside the binding and catalytic sites of receptors and ribozymes.
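The linker-length argument can be made concrete with a toy model. Assuming the time available for binding is the linker length divided by the polymerase speed, and that binding within that window follows 1 - exp(-k_on [L] t), the ligand concentration giving half-maximal binding is ln 2 / (k_on t). All parameter values and names below are hypothetical and chosen only to show the direction of the effect.

```python
import math


def half_max_ligand_concentration(linker_nt: int,
                                  polymerase_nt_per_s: float,
                                  k_on_per_molar_s: float) -> float:
    """Ligand concentration at which half the transcripts bind in time."""
    # Time for RNA polymerase to traverse the linker between the aptamer
    # and the terminator stem.
    window_s = linker_nt / polymerase_nt_per_s
    return math.log(2) / (k_on_per_molar_s * window_s)


if __name__ == "__main__":
    k_on = 1e5    # hypothetical association rate constant, 1/(mol/L * s)
    speed = 50.0  # hypothetical polymerase speed, nt/s
    for linker in (50, 100, 200):
        c50 = half_max_ligand_concentration(linker, speed, k_on)
        print(f"linker={linker:3d} nt -> half-maximal [L] = {c50:.2e} M")
```

In this sketch, doubling the linker length halves the concentration needed, and speeding up the polymerase raises it, in line with the insertion and deletion argument in the text.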


Signalling Mechanism in Prokaryotes and Eukaryotes | Microbiology

In this article we will discuss the signalling mechanisms in both eukaryotes and prokaryotes.

1. Eukaryotic Cell-to-Cell Signaling:

The integrative nature of biological systems could be understood after the pioneering work of Claude Bernard (1813-1878) of France. He gave the concept of the milieu intérieur and suggested the system of ductless glands (i.e. endocrine glands) for integrating function and maintaining homeostasis.

In 1902, Bayliss and Starling demonstrated a marked flow of pancreatic juice in dogs after injecting an acid extract of duodenum. Starling coined the term ‘hormone’ (Greek, I excite) for such intercellular messenger molecules. An American physiologist, Walter Cannon, coined the term ‘homeostasis’ (a condition that may vary but is relatively constant).

There are three types of signalling systems in multicellular organisms like mammals: neuronal, endocrine and cytokine signalling (Fig. 27.3). Neuronal signalling occurs over very long distances, e.g. brain to toe. Synaptic junctions communicate rapidly. Many chemicals are involved in signalling at junctions and are associated with inflammation.

Endocrine signalling involves the release of a hormone from its gland and its transport by the blood to a limited number of cells in the target tissue. It occurs over a long distance and is limited by the rate of blood flow and diffusion from blood to tissues.

Most of the intercellular signalling is that of the cytokines. Much of this signalling occurs through paracrine signalling (over a short distance, from a cell to a nearby cell) or by autocrine signalling (stimulation of the cell producing the cytokines).

Certain bacterial toxins target neuronal signalling, hence these are called neurotoxins, such as those produced by Clostridium tetani and Cl. botulinum. These have metalloproteinase activity and cleave specific intracellular proteins. Thus they prevent the release of neurotransmitters. A recent discovery also points to the interaction of a bacterial toxin with neuroendocrine signalling and synthesis of cytokines.

(i) Endocrine Hormone Signalling:

Endocrine hormones are mostly produced by specific glands (such as pituitary, hypothalamus, and parathyroid glands) and glandular tissues (such as pancreas and intestine).

There are three major groups of hormones: peptide hormones (produced especially by the intestine, which have neurotransmitter-like activity), steroid hormones (produced by adrenal cortex, gonads and skin), and tyrosine derivatives (e.g. thyroid hormones T3, T4, etc. and the catecholamines noradrenaline, adrenaline and dopamine, which have neurotransmitter activity).

After secretion from glands, endocrine hormones are circulated as free hormone or bound to carrier proteins, for example a serum protein, albumin. It binds to several circulating hormones and exerts action (only in bound form) to specific cell receptors in target tissues.

The peptide hormones bind to specific membrane receptors which result in specific intracellular signalling pathway. On the other hand the steroid and sterol hormones enter into cells and bind to cytoplasmic receptor proteins and then move to nucleus and act as a factor for transcription.

Endocrine hormones control energy metabolism through insulin and glucagon, adrenaline and noradrenaline, involving the production and breakdown of carbohydrate stored as glycogen in the liver and muscle.

The lack of control of this system is visible in diabetes. Bacterial infection results in hormonal imbalance in the body. Certain bacteria and viruses affect neural tissue. Mycobacterium leprae and Treponema pallidum have a tropism for nervous tissue.

The tissues of the gastric and intestinal mucosae are highly regulated. They respond to and produce several endocrine signals including gastro-intestinal hormones, e.g. gastrin, secretin, cholecystokinin and guanylin. E. coli alters the fluid balance in the intestine and causes diarrhoea. Seven different strains of E. coli have now been reported which induce different pathological symptoms.

The enterotoxigenic E. coli strains produce heat-labile toxin (LT) and heat-stable toxin (ST). ST is the first bacterial analogue of an endocrine hormone (guanylin) that activates guanylyl cyclase and controls fluid release from intestinal cells so that the mucin layer can be kept wet (Fig. 27.4). There are other heat-stable guanylin-like toxins in other strains of E. coli and other bacteria.

(ii) Cytokines:

Cytokines are a large group of over 1000 proteins which are involved in cell-to-cell signalling and control the inflammatory response to bacterial infection. These are polypeptide hormones secreted by a cell that affect the growth and metabolism of the same cell (autocrine signalling) or another cell (paracrine signalling).

Their overproduction causes disease. They are found at the site of infection. They induce lipid mediators (prostaglandins, leukotrienes, lipoxins and platelet-activating factor) and the mediators from mast cells (e.g. histamine and enzymes such as tryptase).

(a) Nomenclature of cytokines:

Cytokines are divided into six sub-families (Table 27.7) on the basis of several criteria such as historical types, sequence homology, chromosomal localisation and biochemical actions. In 1979, the term interleukin (inter: between, leukin: leukocytes) was coined to denote the proteinaceous factors which modulate the function of other leukocytes.

At present there are over 20 interleukins (IL-1 to IL-18). The endotoxin-injected mice expressed a tumour necrosis factor (TNF) grouped under cytotoxic cytokines. The TNFs kill certain tumour cell lines via induction of apoptosis and are potent pro-inflammatory molecules. The members of the TNF receptor family are themselves membrane-bound proteins (e.g. CD27, CD30 and CD40).

Table 27.7 : Cytokines: nomenclature and sub-families.

The interferons (IFNs) were the first cytokines to be discovered. They are involved in inhibiting the growth and spread of viruses. They are of three types: IFN-α, IFN-β and IFN-γ. Interferons also act against protozoa, rickettsia and mycobacteria.

The colony-stimulating factors (CSFs) control the growth and differentiation of neutrophils, monocytes and cell populations derived from monocytes in the bone marrow. The monocytes/macrophages are the phagocytic cells which engulf and kill bacteria. They are also called antigen-presenting cells and stimulate T and B lymphocytes.

Growth factors include families of proteins such as the fibroblast growth factor (FGF) family, platelet-derived growth factors (PDGF), and transforming growth factor-β (TGFβ). The FGF cytokines also act on mesenchymal cells and epithelial cells.

The peptide chemotactic factors are called chemokines, which form a large sub-group of cytokines. Chemokines have a molecular mass of 8-10 kDa, with 20-50% sequence homology at the protein level and conserved cysteine residues which form disulphide bonds within the molecules.

On the basis of chromosomal location of genes and protein structure, chemokines are divided into two families: the α-chemokine and β-chemokine families. A third family of chemokines, discovered in 1994, currently has one member called lymphotactin, which is a strong attractant of T cells.

(b) Receptors of cytokines:

Cytokine receptors have high affinity for their ligands. The number of individual receptors present on a target cell is low. On the basis of sequence homology and structural motifs, cytokine receptors are grouped into a small number of families. At present there are nine receptors for CC chemokines (CCR), five receptors for CXC chemokines (CXCR) and one receptor for fractalkine (CX3CR1).

The cytokine receptors are shed from the cell via proteolytic cleavage. Cell surface metalloproteinases (sheddases) help the release of cytokine receptors. The released receptors bind the soluble cytokines and inhibit their activity or stimulate cells lacking the cytokine receptor.

(c) Biological action of cytokines:

Cytokines play a role in physiological development. They are found at all developmental stages in mammals. On the other hand, cytokine receptors present on the cell membrane also play a physiological role. They act as portals for viral entry into cells. For example, HIV enters through binding to cytokine receptors. Similarly, herpes simplex virus enters through binding to the TNF receptor family.

After binding, receptors induce selective intracellular signalling resulting in the switching on or off of particular genes and the production of cyclooxygenase II; nitric oxide (NO) is synthesised after induction of nitric oxide synthase.

Aspirin and ibuprofen are non-steroidal anti-inflammatory drugs which block cyclooxygenase activity. Prostaglandins and prostacyclin lower the threshold of pain nerves, so blocking their synthesis with these drugs relieves pain and fever.

Various molecules that produce pathology are made after cytokines bind to cytokine receptors [prostaglandins, NO, tissue plasminogen activator (tPA), plasminogen activator inhibitor and collagenases]. Tissue damage is directly induced by collagenase and tPA.

Besides, cytokines also induce the synthesis of their own and other cytokines which result in a complex network of interactions. Cytokines can also modify the behaviour of cells in many ways. Various actions of cytokines on cells are shown in Fig. 27.5.

2. Prokaryotic Cell-to-Cell Signalling: Quorum Sensing and Bacterial Pheromones:

Until the 1980s, little attention was paid to the idea that bacteria could talk to one another. Thereafter, examples of cell-to-cell signalling in bacteria were put forth. Conjugation is one of the methods of DNA transfer between two bacteria. To establish conjugation, the two bacteria must establish cell-to-cell contact. Enterococcus faecalis is a Gram-positive mammalian pathogen.

Its aggregation is controlled by the secretion of small peptide pheromones. Pheromones induce adhesin production; consequently the bacteria form cell clumps which facilitate conjugation. Several pheromones have been isolated which are hepta- or octa-peptides found in low concentration (5 × 10⁻¹¹ M).

Endospores of Clostridium tetani are regarded as resting forms of bacteria and a part of the virulence mechanism. In contrast, some bacteria such as myxobacteria undergo complex morphological changes under adverse environmental conditions. Polyangium vitellinum forms a cyst-like structure with an outer covering of polysaccharide to resist dehydration.

Myxococcus xanthus forms myxospores (fruiting body) which alternate with vegetative cells. This programme is triggered by starvation, which causes morphological changes within 4 hours. A dense mound-shaped structure is formed when the cell density of bacteria has reached about 10⁵. After 20 hours of starvation the cells inside this mound differentiate into myxospores.

Myxospores are heat- and starvation-resistant dormant cells. They germinate during favourable conditions and produce vegetative cells. Again myxospores are formed when conditions are unfavourable. This type of cell differentiation is controlled by extracellular signals. Cell-to-cell signalling mechanism is given in Table 27.8.

Quorum Sensing:

The term quorum refers to ‘a fixed number of members of any committee of the society whose presence is mandatory for proper transaction of business’. Quorum sensing in bacteria is a mechanism through which they take a census of their number. After reaching a quorum of cell number they can transact the business of switching on or switching off of specific genes.

The current knowledge of quorum sensing began with the study of luminescence in Vibrio fischeri and V. harveyi. They are marine bacteria forming symbiotic relationships with monocentrid fish and with bobtail squids (e.g. Euprymna scolopes). The light organ of the bobtail squid contains a very high concentration of V. fischeri. The light organ is supposed to be part of a counter-illumination system, the details of which are not clear.

The newly hatched squids develop a symbiotic association with only certain strains of V. fischeri. Within hours after hatching, the light organ is colonised by V. fischeri. The light organ positively selects only certain strains of V. fischeri and negatively selects the others, excluding colonisation by other bacteria present in sea water.

It is not known how this selection is made. One of the possible mechanisms may be the expression of a specific adhesin for V. fischeri by the epithelium of the light organ. The epithelium is exposed to trypsin which directly triggers a specific morphogenetic response in the squid. This results in formation of the complex.

(a) Mechanism of quorum sensing:

It is a feedback control system. Bacteria continuously produce a small amount of a signal called an autoinducer. Most of the Gram-negative bacteria produce autoinducers which are acylhomoserine lactones (AHLs). Staphylococcus aureus and other bacteria produce peptide autoinducers. E. coli and S. typhimurium produce a quorum sensing molecule of 1 kDalton. These extracellular inducers diffuse out of the cell.

Besides, bacteria also recognise the presence of the autoinducer. A bacterial membrane protein performs this function. It acts both as a receptor of the autoinducer and as an activator of gene transcription. V. fischeri produces luminescence. The V. fischeri system is the best studied quorum sensing system.

Luminescence is associated with the lux operon system, which consists of two main regulatory genes, luxI and luxR (Fig. 27.6), and other genes (luxCDABEG) which synthesise chemicals to produce light. luxI encodes a protein which catalyses the synthesis of a wide range of AHLs. The autoinducer of V. fischeri is N-(3-oxo-hexanoyl)-L-homoserine lactone.

luxR encodes a protein which acts both as a receptor for AHL and as a transducer of the signal that activates the other genes of the lux operon. The luxCDABEG genes are expressed after AHL binds to the LuxR protein (Fig. 27.6). The luxA and luxB genes encode the α- and β-subunits of bacterial luciferase. The other genes encode polypeptides which facilitate the synthesis of the substrate and the production of light.
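The threshold behaviour of quorum sensing can be sketched with a deliberately simple model: the autoinducer level is taken to be proportional to cell density, and the lux genes switch on once it reaches a threshold. The function names, the arbitrary units and the threshold value are assumptions made for this illustration, and the positive feedback on autoinducer synthesis is left out.

```python
def autoinducer_level(cell_density: float, output_per_cell: float = 1.0) -> float:
    """Autoinducer level in arbitrary units, proportional to cell density."""
    return cell_density * output_per_cell


def lux_genes_on(cell_density: float, threshold: float = 1e7,
                 output_per_cell: float = 1.0) -> bool:
    """True once enough autoinducer has accumulated to activate the operon."""
    return autoinducer_level(cell_density, output_per_cell) >= threshold


def generations_to_quorum(start_density: float, threshold: float = 1e7) -> int:
    """Count doublings needed before the population reaches a quorum."""
    if start_density <= 0:
        raise ValueError("start_density must be positive")
    density = start_density
    generations = 0
    while not lux_genes_on(density, threshold):
        density *= 2  # one round of cell division
        generations += 1
    return generations


if __name__ == "__main__":
    print("dilute culture lit: ", lux_genes_on(1e3))
    print("dense culture lit:  ", lux_genes_on(1e9))
    print("doublings to quorum:", generations_to_quorum(1e3))
```

The point of the sketch is only that each cell produces the same small amount of signal, so the genes respond to the size of the population and not to any single cell.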

(b) Quorum sensing as a virulence mechanism:

In addition to V. fischeri, there is a large number of Gram-negative bacteria which produce AHLs to quorum sense. These are medically important bacteria, for example Pseudomonas aeruginosa, Proteus mirabilis, Serratia liquefaciens and Yersinia enterocolitica. In these bacteria LuxI/LuxA homologues are involved in quorum sensing system. Ps. aeruginosa utilises two quorum systems, the las and rhl.

The las operon expresses LasR protein which is similar to LuxR and acts as transcriptional activator in the presence of PAI of Pseudomonas. The LasI (the Luxl homologue) produces AHL. The autoinducer of P. aeruginosa at a threshold concentration swich on a group of virulence gene including lasB, lasA apv and toxA.

The rhl system is the second quorum sensing system. It involves RhlR (the transcriptional activator protein) along with the autoinducer N-butyryl-L-homoserine lactone, which is synthesised by RhlI. This quorum sensing system results in the production of additional virulence factors, e.g. elastase, which cleaves and inhibits interleukin-2 (a key host defence cytokine). The las system is dominant and is activated before the rhl system.

Many Gram-positive bacteria use oligopeptides as signalling molecules. For example, two different peptides are secreted by Bacillus subtilis. These are necessary for competence (the ability to take up DNA) and sporulation.

In Staphylococcus aureus, the agr locus controls the expression of many virulence factors, namely exotoxins, capsular polysaccharide type 8 and V8 protease. An octapeptide quorum sensing autoinducer is encoded by the agr locus and in turn induces the agr locus. The quorum sensing autoinducer interacts with the host defence system and inhibits it, albeit at high concentration (Fig. 27.7).


FINDING RIBOSWITCHES

Although the usual method to define a riboswitch involves locating a conserved secondary structure in the RNA molecule, the highly restricted nature of the sensing element argues that sequence alone should be enough to locate riboswitches correctly. We have previously developed a computer algorithm capable of finding bacterial regulatory motifs, based exclusively on sequence conservation in the regulatory regions of orthologous groups of genes (5). The main restrictions of our method are that a regulatory element must be closely associated with at least one COG (cluster of orthologous groups of proteins) (6) and it must be present in at least five non-redundant genomes. On the other hand, the advantage is that it is an automatic process, requiring no previous regulatory information to produce relevant results, and as such, can be easily run every time that new genomes or annotations are available.

We updated our previous results (5), taking into account 223 complete genomes. From these, a reduced set of 145 non-redundant organisms was obtained using CVtree (7). We were able to recover 10 out of the 11 currently reported riboswitches. Additionally, our results included many regulatory elements that are also known to depend on structured RNA for recognition, such as the Gram-positive T-box and the PyrR protein binding site. We thus call our set of regulatory elements riboswitch-like elements (RLEs), given that almost all the identified conserved signals were RNA-dependent regulatory elements.

RibEx is a web server that allows any user to easily find any RLE in a sequence of interest. Since most known riboswitches are associated with attenuators, we have included the option of searching for transcriptional and translational attenuators, which can help in selecting the most likely candidates, as has been shown by Barrick et al. (4). Additionally, our web server displays representative drawings of the open reading frames (ORFs) and their corresponding regulatory elements, any of which can be selected in order to acquire its sequence for submission to NCBI's BLAST server (8). Every RLE is linked to a list of genes that are predicted to be subject to its regulation. The genome context of these genes, analyzed with our local GeConT web server (9), in addition to the scores of the pre-computed RLEs, can be of great assistance when evaluating the likelihood of a new prediction.

A great resource when working with RNA families is the Rfam database (10). We have used their models to annotate our RLEs. As of version 7.0, Rfam contains a total of 503 families; 125 of them are non-coding, and 11 of these are annotated as riboswitches. We were able to recover automatically all but one of these riboswitches, missing the ykoK element. Our matrices for the most abundant riboswitches perform very well when compared with the co-variance models used by Rfam (∼90% coverage when analyzing bacterial sequences). Less common riboswitches (e.g. lysine and purine) are more difficult to model with sequence-based weight matrices. Our method thus tends to recover between 70 and 80% of these Rfam members. Our data set also contains six more RLEs that coincide with an Rfam cis-regulating member and 341 RLEs that do not have a match and thus remain as predicted elements. We have calculated a P-value, assuming a hypergeometric distribution, for each RLE to be over-represented in a given COG or KEGG pathway (11). Thus, we provide every RLE with a tentative functional assignment.
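The over-representation P-value mentioned above can be illustrated with a minimal sketch. The function name and the toy numbers are hypothetical, and the source does not state which tail or counts were used; this assumes the usual one-sided (upper-tail) hypergeometric test.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: total number of genes considered
    annotated:  genes in the population that belong to the category (e.g. a COG)
    sample:     genes predicted to be regulated by one RLE
    hits:       regulated genes that also belong to the category
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total

# Toy example (assumed numbers): 3 of 10 genes are in the category and
# all 3 genes linked to the RLE fall in it.
p = hypergeom_enrichment_p(population=10, annotated=3, sample=3, hits=3)
```

A small P-value here means that the overlap between the RLE's gene list and the category is unlikely to arise by chance under random sampling without replacement.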

As far as we know, there are only two servers besides ours that can be used to locate riboswitches in a given sequence: riboswitch finder (12), which in its current implementation searches only for the purine-sensing riboswitch, and Rfam, which has an option to locate riboswitches in any sequence but, because co-variance searches have high computational requirements, limits the sequence length to 2 kb. RibEx, in addition to performing searches on larger sequences, gives the user a broader view of the regulatory potential of a sequence by showing the ORFs and predicted attenuators. The 341 predicted RLEs also make RibEx a great complement to the curated families contained in Rfam.


Definition of eukaryotes and prokaryotes

Prokaryotes (pro-KAR-ee-ot-es) (from Old Greek pro-, "before", + karyon, "nut" or "kernel", referring to the cell nucleus, + suffix -otos, pl. -otes; also spelled "procaryotes") are organisms without a cell nucleus (= karyon) or any other membrane-bound organelles. Most are unicellular, but some prokaryotes are multicellular.

Eukaryotes (IPA: [juːˈkæɹɪɒt]) are organisms whose cells are organized into complex structures by internal membranes and a cytoskeleton. The most characteristic membrane-bound structure is the nucleus. This feature gives them their name (also spelled "eucaryote"), which comes from the Greek ευ, meaning good/true, and κάρυον, meaning nut, referring to the nucleus. Animals, plants, fungi, and protists are eukaryotes.


Materials and methods

Computational analysis

In-house Perl scripts were used to organize the execution of other software tools, compute various statistics, and maintain local relational databases of genome and gene information. Many of these scripts rely on Bioperl [82], and the Bio::Graphics module was particularly useful for visualizing the genomic contexts of riboswitch matches.

Riboswitch identification

Covariance models were trained on sequence alignments adapted from various sources (Table 1) using the Infernal software package (version 0.55) [83]. Heuristic filtering techniques [16] were used to accelerate CM searches of microbial sequences in the RefSeq database (version 12) [84] and environmental shotgun sequences from an acid mine drainage community [85], the Sargasso Sea [25], and Minnesota soil and whale fall sites [86]. CM searches for TPP riboswitches were also conducted against the plant and fungal portions of the RefSeq database (version 13).

The regulatory potentials of putative riboswitch aptamers were assessed by examining their genomic contexts. To uniformly predict gene functions, protein domains were assigned to COGs (orthologous gene clusters) [87] using RPS-BLAST and scoring matrices from the Conserved Domain Database (CDD) [88]. The plausibility of putative aptamer structures was assessed by computationally aligning hits to the original CM with Infernal and manually examining divergent RNA structures. Using these two complementary criteria, we established trusted CM score cutoffs. All hits in the microbial RefSeq database above these thresholds were judged to be functional riboswitches. Since gene context information is not available for most environmental sequences, hits from these data sets were included only if they had CM scores above the trusted threshold. Additional low-scoring sequences from the RefSeq database were also included when their genomic contexts and alignments strongly indicated that they were functional riboswitches.

To verify that this approach efficiently recovers known riboswitches, the final results were compared to a list of TPP riboswitches compiled in a comparative genomics analysis of thiamin metabolic genes and this regulatory RNA element [48]. The new searches successfully found all TPP riboswitches that had been previously identified in the set of complete microbial genomes analyzed in both studies. They also discovered a small number of TPP riboswitches upstream of thiamin-related genes (for example, a pnuC homolog in Helicobacter pylori and thiM in Lactococcus lactis) in genomes examined by the former study that had not yet been reported.

For the glycine riboswitch, a single aptamer covariance model and a tandem model containing both the first and second aptamers were used to separately identify matches. Every aptamer that is part of a tandem configuration was found by the single aptamer CM search, and cases of lone aptamers were noted. For consensus structure and MI calculations only the tandem glycine aptamer alignment was considered, but the complete set of lone and tandem aptamer glycine riboswitches was included in the expression platform analysis. Expression platform counts for other riboswitch classes that rarely occur in tandem were not corrected.

Mechanism classification

Expression platforms were classified according to the scheme in Figure 2 for a subset of the riboswitch matches found in complete and unfinished microbial genomes. Aptamer sequences with more than 95% pairwise identity at reference columns (positions where ≥50% of the weighted sequences in the alignment do not contain a gap) were omitted to avoid biasing statistics with duplicate sequences. Riboswitches with suspect gene annotations where >60 nucleotides (nt) of an open reading frame (ORF) on the same strand overlapped the aptamer or >700 nt separated the aptamer and the nearest downstream ORF were also screened out. Most of these cases appear to result from incorrect start codon choices, overpredictions of hypothetical ORFs, or missing annotation of real genes. The remaining sequences constituted the expression platform data set, and sequences beginning at the 5' end of each aptamer and continuing through the first 120 nt of the downstream ORF were extracted for further analysis.
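The two filters described above can be sketched as follows. This is an illustrative simplification, not the authors' scripts: sequences are unweighted (the source uses weighted sequences to define reference columns), duplicates are removed greedily in input order, and all function names are hypothetical.

```python
GAPS = {"-", "."}

def reference_columns(alignment, min_occupancy=0.5):
    """Indices of columns where at least `min_occupancy` of sequences have no gap."""
    n = len(alignment)
    cols = []
    for i in range(len(alignment[0])):
        non_gap = sum(1 for seq in alignment if seq[i] not in GAPS)
        if non_gap / n >= min_occupancy:
            cols.append(i)
    return cols

def pairwise_identity(a, b, cols):
    """Fraction of reference columns where both sequences carry the same base."""
    if not cols:
        return 0.0
    matches = sum(1 for i in cols if a[i] == b[i] and a[i] not in GAPS)
    return matches / len(cols)

def drop_near_duplicates(alignment, max_identity=0.95):
    """Greedy filter: keep a sequence only if it is <= max_identity to all kept ones."""
    cols = reference_columns(alignment)
    kept = []
    for seq in alignment:
        if all(pairwise_identity(seq, k, cols) <= max_identity for k in kept):
            kept.append(seq)
    return kept

def has_suspect_annotation(orf_overlap_nt, distance_to_orf_nt):
    """Screen out cases with >60 nt ORF overlap or >700 nt to the nearest ORF."""
    return orf_overlap_nt > 60 or distance_to_orf_nt > 700

toy_alignment = ["ACGU-", "ACGU-", "ACGA-", "UUUUA"]
filtered = drop_near_duplicates(toy_alignment)
```

In the toy alignment the last column is gapped in three of four sequences, so it is not a reference column, and the second sequence is dropped as a duplicate of the first.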

Riboswitches where the downstream gene was on the opposite strand were examined as candidates for antisense regulation. Other riboswitches were classified as directly regulating translation initiation when the downstream gene's start codon was within 15 nt of the end of the conserved aptamer core structure (usually the P1 paired element). The remaining expression platforms were scanned with the local RNA secondary structure prediction program Rnall (version 1.1) [89] for intrinsic transcription terminators with a scanning window of 50 nt, a U-tail weight threshold of 4.0, a U-tail pairing stability cutoff of -8.3 kcal/mol, and default settings for other parameters. Riboswitches with a terminator predicted in their expression platform sequence were assigned transcription attenuation mechanisms. These riboswitches were classified as also regulating translation if the distance between the terminator hairpin and the gene's start codon was no more than 10 nt. Expression platforms that did not match any of the above criteria were assumed to employ translation attenuation mechanisms.
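The classification rules in this paragraph amount to a short decision tree, which can be sketched as below. The data class, field names, and category labels are hypothetical, and terminator detection (done with Rnall in the source) is treated here as a precomputed input.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExpressionPlatform:
    gene_on_opposite_strand: bool
    aptamer_to_start_nt: int               # aptamer core end to start codon
    terminator_found: bool                 # result of a terminator scan
    terminator_to_start_nt: Optional[int] = None

def classify(p: ExpressionPlatform) -> str:
    # 1. Downstream gene on the opposite strand: antisense candidate.
    if p.gene_on_opposite_strand:
        return "antisense candidate"
    # 2. Start codon within 15 nt of the aptamer: direct translation control.
    if p.aptamer_to_start_nt <= 15:
        return "translation initiation (direct)"
    # 3. Predicted intrinsic terminator: transcription attenuation,
    #    also translational if the terminator is within 10 nt of the start codon.
    if p.terminator_found:
        if p.terminator_to_start_nt is not None and p.terminator_to_start_nt <= 10:
            return "transcription attenuation + translation"
        return "transcription attenuation"
    # 4. Everything else is assumed to use translation attenuation.
    return "translation attenuation"

example = classify(ExpressionPlatform(False, 80, True, 40))
```

The order of the checks matters: a platform is tested for terminators only after the antisense and direct-translation cases have been ruled out, mirroring the order given in the text.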

Rnall and distance parameters were calibrated by comparing expression platform predictions to expert predictions for a large and phylogenetically diverse collection of TPP riboswitches [48]. Rnall correctly predicts 46 out of 52 terminators in this data set with only 3 predictions of terminators in sequences not manually evaluated as containing a terminator (a sensitivity of 88% and an accuracy of 94%). The three false positives resemble terminators and may be functional, whereas the terminators that Rnall misses usually have large hairpins with poor thermodynamic stabilities. Overall, the decision tree classifies 159 out of 180 TPP riboswitch expression platforms (88%) correctly into the category assigned in the control set.
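The percentages quoted above can be reproduced from the stated counts. In this sketch the 94% figure is computed as true positives over all predicted terminators (46 of 49), which is an assumption about how the source defines "accuracy"; the variable names are illustrative.

```python
true_positives = 46      # terminators predicted by Rnall and confirmed manually
known_terminators = 52   # terminators in the manually evaluated set
false_positives = 3      # predictions with no manually assigned terminator

# Fraction of known terminators that were recovered.
sensitivity = true_positives / known_terminators
# Fraction of predictions that were correct (assumed meaning of "accuracy").
fraction_correct = true_positives / (true_positives + false_positives)
# Fraction of expression platforms placed in the expected category.
tree_agreement = 159 / 180

summary = {
    "sensitivity_pct": round(100 * sensitivity),
    "fraction_correct_pct": round(100 * fraction_correct),
    "tree_agreement_pct": round(100 * tree_agreement),
}
```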

Consensus secondary structures

We manually adjusted the covariance model alignments of riboswitch aptamers while refining their consensus secondary structures. In particular, bases taking part in pseudoknotted pairings that cannot be represented by CMs were shifted to accurately represent these interactions. Bases flanking gapped consensus columns, which are sometimes ambiguously spread out across many possible positions by the alignment algorithm, were also systematically condensed into a minimum number of overall consensus columns. As new structure motifs and base-base interactions became evident, the alignments were adjusted to reflect these new constraints. Riboswitch sequences in the final alignments were weighted using Infernal's internal implementation of the GSC algorithm [90] to reduce biases from duplicate and similar sequences before calculating consensus structure statistics.

Mutual information significance

Duplicate sequences were purged and columns with >50% gaps were removed from riboswitch alignments prior to the MI analysis, and, if necessary, alignments were further pruned to the 300 most diverse sequences (as judged by pairwise base differences). A customized version of the program Rate4Site (version 2.01) [91] with modified output options was used to simultaneously estimate distances and per-column rates of evolution according to a gamma distributed model with at least 16 rate categories and a phylogenetic tree created with Jukes-Cantor distances that treated gaps as missing information. The resulting trees, rates, and distances were used to simulate 10,000 resampled alignments starting from an arbitrary ancestral sequence. Then, gaps and sequence weights were re-inserted into each of these derivative alignments at the same positions that they occupied in the original alignment.
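As an illustration of the distance step, the standard Jukes-Cantor correction d = -(3/4) ln(1 - (4/3) p), where p is the fraction of differing sites, can be sketched as below. Gaps are treated as missing information by skipping any column where either sequence has a gap. This is a minimal stand-alone sketch with a hypothetical function name, not the Rate4Site implementation.

```python
import math

GAPS = {"-", "."}

def jukes_cantor_distance(a, b):
    """Jukes-Cantor distance between two aligned sequences, ignoring gapped columns."""
    compared = [(x, y) for x, y in zip(a, b) if x not in GAPS and y not in GAPS]
    if not compared:
        return None  # no shared ungapped columns, distance undefined
    p = sum(1 for x, y in compared if x != y) / len(compared)
    if p >= 0.75:
        return math.inf  # correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)

d = jukes_cantor_distance("ACGU", "ACGA")  # one difference in four sites
```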

Mutual information was calculated between column pairs for all alignments according to standard formulas [60], taking into account sequence weights and treating gaps as a fifth character state. The resampled alignments were used to estimate what the MI score distribution would have been if the bases present in each column had evolved independently, without covariation constraints. The p value significance of the actual MI between two columns is the fraction of the resampled alignments that have a greater MI score than the value observed between those two columns in the real alignment.
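A minimal sketch of the two calculations described above follows. It assumes the Shannon mutual information in bits (the source cites only "standard formulas"), treats the gap character as an ordinary fifth symbol, and takes the MI scores of the resampled alignments as a precomputed list. Function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_x, col_y, weights=None):
    """Weighted mutual information (bits) between two alignment columns."""
    if weights is None:
        weights = [1.0] * len(col_x)
    total = sum(weights)
    px, py, pxy = Counter(), Counter(), Counter()
    for x, y, w in zip(col_x, col_y, weights):
        px[x] += w / total          # marginal frequency in column x
        py[y] += w / total          # marginal frequency in column y
        pxy[(x, y)] += w / total    # joint frequency of the pair
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in pxy.items()
    )

def empirical_p_value(observed_mi, resampled_mis):
    """Fraction of resampled alignments with a strictly greater MI score."""
    greater = sum(1 for m in resampled_mis if m > observed_mi)
    return greater / len(resampled_mis)

covarying = mutual_information("AAGG", "UUCC")    # perfectly covarying columns
independent = mutual_information("AAGG", "UCUC")  # no covariation
p_value = empirical_p_value(1.0, [0.2, 1.5, 0.9, 1.0])
```

With 10,000 resampled alignments, as in the text, the smallest nonzero p value this estimate can report is 1/10,000.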