How to generate simulated mass-spectrometry data for phosphorylated proteins?

I am trying to generate simulated MS data (top-down and bottom-up) for a phosphorylated protein such as the platelet-derived growth factor receptor (PDGFR-B). It has 10 tyrosine sites that are phosphorylated. Is there a good software application that could help me do that?

Also, how can I include in the input file (FASTA) that there are 10 modification sites on the protein sequence?

What are the parameters required to be set for top-down and bottom-up MS separately?


I know this is a two-year-old question, but I recently developed a web application that could be useful for you: Prot pi

The Protein tool lets you simulate the mass spectrum of the intact protein, while the Peptide tool gives you the fragment ions of peptides. Modifications can easily be attached to any site you want.
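
As a rough illustration of what an intact-protein simulation computes, the sketch below converts a neutral average mass into the m/z values of a positive-mode charge-state series, with and without phosphate groups. It is not Prot pi's implementation. The function names, the 50 kDa example mass and the charge range are assumptions chosen for illustration; each phosphorylation is taken to add HPO3 (about 79.98 Da average mass).

```python
PROTON = 1.007276    # mass of a proton in Da
HPO3_AVG = 79.9799   # average mass added per phosphorylation (HPO3) in Da


def phospho_mass(neutral_mass, n_phospho):
    """Neutral average mass after adding n_phospho phosphate groups."""
    return neutral_mass + n_phospho * HPO3_AVG


def charge_series(neutral_mass, z_min, z_max):
    """Return {z: m/z} for ions of the form [M + zH]z+."""
    return {z: (neutral_mass + z * PROTON) / z for z in range(z_min, z_max + 1)}


if __name__ == "__main__":
    base = 50000.0  # hypothetical unmodified protein mass in Da (not PDGFR-B)
    for n in (0, 1, 10):
        series = charge_series(phospho_mass(base, n), 30, 33)
        print(n, {z: round(mz, 3) for z, mz in series.items()})
```

At high charge states the spacing between the unmodified and singly phosphorylated forms shrinks to a few m/z units, which is one reason intact measurements of large, multiply modified proteins are demanding.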


For peptides (bottom-up), mMass accepts a sequence of interest, after which phosphorylated tyrosine can be specified as a modification. The program also performs in silico enzyme digests and in silico MS/MS fragmentation.
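
To make the bottom-up steps concrete, here is a minimal Python sketch of an in silico tryptic digest, phosphopeptide mass calculation and b/y fragment-ion calculation. It is not how mMass works internally. The toy sequence, the phosphosite positions and all function names are made up for illustration, and the digest assumes the common simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages). Residue masses are standard monoisotopic values; the `re.split` call relies on Python 3.7 or later.

```python
import re

# Monoisotopic residue masses in Da (standard values).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276
PHOSPHO = 79.96633  # HPO3, monoisotopic


def tryptic_digest(sequence):
    """Cleave C-terminal to K or R, except before P; no missed cleavages.

    Returns (start, peptide) tuples, where start is the 1-based position
    of the peptide's first residue in the protein.
    """
    peptides, start = [], 1
    for pep in re.split(r"(?<=[KR])(?!P)", sequence):
        if pep:
            peptides.append((start, pep))
            start += len(pep)
    return peptides


def residue_masses(start, peptide, phosphosites):
    """Per-residue masses, adding HPO3 at the listed 1-based protein positions."""
    return [RESIDUE[aa] + (PHOSPHO if start + i in phosphosites else 0.0)
            for i, aa in enumerate(peptide)]


def peptide_mass(start, peptide, phosphosites):
    """Neutral monoisotopic mass of the (possibly phosphorylated) peptide."""
    return sum(residue_masses(start, peptide, phosphosites)) + WATER


def by_ions(start, peptide, phosphosites):
    """Singly charged b and y ion m/z values (b1..b(n-1), y1..y(n-1))."""
    masses = residue_masses(start, peptide, phosphosites)
    b = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b, y


if __name__ == "__main__":
    protein = "AYEKSLPRDYMK"  # made-up toy sequence, not PDGFR-B
    sites = {2, 10}           # hypothetical phosphotyrosine positions
    for start, pep in tryptic_digest(protein):
        print(start, pep, round(peptide_mass(start, pep, sites), 4))
```

Keeping the modification positions in a separate set, indexed against the protein sequence, is one simple way to carry site information alongside a plain sequence string.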

There are other tools that attempt to accurately mimic the intensities of the various fragments that could be generated.

The meaning of the final question is unclear. If you want to know which parameters to set for the in silico analyses, use those that match the sample processing and analyses that would take place in the real world. Sample processing and instrument setup are highly context dependent, and there is insufficient information here to help with those.

Obtaining good quality intact measurements from a protein the size of PDGFR-B would be a significant challenge.


Have a look at PeptideMass: its "post-translational modifications" option will output the masses of phosphorylated peptides (among other modifications).


Abstract

The systematic and quantitative molecular analysis of mutant organisms that has been pioneered by studies on mutant metabolomes and transcriptomes holds great promise for improving our understanding of how phenotypes emerge. Unfortunately, owing to the limitations of classical biochemical analysis, proteins have previously been excluded from such studies. Here we review how technical advances in mass spectrometry-based proteomics can be applied to measure changes in protein abundance, posttranslational modifications and protein–protein interactions in mutants at the scale of the proteome. We finally discuss examples that integrate proteomics data with genomic and phenomic information to build network-centred models, which provide a promising route for understanding how phenotypes emerge.


Amphitrite: A program for processing travelling wave ion mobility mass spectrometry data

Since the introduction of travelling wave (T-Wave)-based ion mobility in 2007 a large number of research laboratories have embraced the technique, particularly those working in the field of structural biology. The development of software to process the data generated from this technique, however, has been limited. We present a novel software package that enables the processing of T-Wave ion mobility data. The program can deconvolute components in a mass spectrum and uses this information to extract corresponding arrival time distributions (ATDs) with minimal user intervention. It can also be used to automatically create a collision cross section (CCS) calibration and apply this to subsequent files of interest. A number of applications of the software, and how it enhances the information content extracted from the raw data, are illustrated using model proteins.

Highlights

► Development of the first software to automate the processing of travelling wave ion mobility mass spectrometry (TWIM-MS) data.
► Automatic creation and application of collision cross section calibration to TWIM-MS data.
► Creation of fine-grained collision cross section vs. m/z heat maps that can be overlaid between experimental conditions.



Connexins require an integrated network for protein synthesis, assembly, gating, internalization, degradation and feedback control to regulate the biosynthesis and turnover of gap junction channels. At the most fundamental level, the introduction of sequence-altering modifications changes protein conformation, activity, charge, stability and localization. Understanding the sites, patterns and magnitude of protein post-translational modification, including phosphorylation, is absolutely critical. Historically, connexin phosphorylation has been examined on the assumption that one or a small number of modification sites strictly corresponds to one molecular function. However, the release of high-profile proteomic datasets appears to challenge this dogma by demonstrating that connexins undergo multiple levels of multi-site phosphorylation. With the growing prominence of mass spectrometry in biology and medicine, we are now getting a glimpse of the richness of connexin phosphate signals. Because this has implications for health and disease, this review provides an overview of technologies in the context of targeted and discovery proteomics, and further discusses how these techniques are being applied to “fill the gaps” in our understanding of connexin post-translational control. This article is part of a Special Issue entitled: The Communicating junctions, roles and dysfunctions.

Highlights

► We provide an up-to-date overview of mass spectrometry (MS)-based proteomics.
► Cx43 appears to undergo multiple levels of multisite phosphorylation.
► Bioinformatics reveal the majority of sites are consensus to multiple kinases.
► Multisite phosphorylation sequence and order provides insight into the Cx code.


A STEP-BY-STEP GUIDE TO THE IDENTIFICATION AND ANALYSIS OF PHOSPHORYLATION SITES

The following summarizes key steps and considerations for successful analysis of phosphorylation sites on individual proteins. We assume here that the kinase of interest phosphorylates its target independently of other kinases, which is usually the case. However, some kinases can only phosphorylate proteins that have previously been “primed” by another kinase, which can significantly complicate in vitro reconstitution of phosphorylation and analysis of phosphorylation site mutants.

Step 1: Optimization of assays for detection of phosphorylation events in vivo and in vitro

An important first step is to optimize methods for detecting phosphorylation events. The most convenient means of detecting protein phosphorylation is via electrophoretic mobility shift. Detection is rapid and simple, works on proteins modified in vivo and in vitro, and allows one to determine the stoichiometry of phosphorylation simply by assessing the fraction of shifted protein. A limitation of this approach is that many phosphorylation events do not cause a shift in electrophoretic mobility. Phosphorylation-induced shifts in electrophoretic mobility are likely due to local context-dependent effects on the flexibility of the peptide chain rather than to changes in molecular weight or overall charge. As a result, the effects of phosphorylation on electrophoretic mobility are unpredictable. Empirical manipulation of the ratio of acrylamide to bisacrylamide can improve resolution of differently phosphorylated forms of a protein (Nishiwaki et al., 2007). In addition, a recently developed method for incorporating a phosphate-binding molecule (termed Phos-Tag) into polyacrylamide gels can allow high-resolution detection of multiple phosphorylation events (Kinoshita et al., 2006, 2012). Isoelectric focusing can also provide high resolution of multiple sites. Although it is more labor intensive, isoelectric focusing works on most proteins and can provide information on how many phosphates are attached for each charge isoform.
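
The stoichiometry estimate from a mobility shift is simple arithmetic, sketched below in Python. The function name and the band intensities in the usage example are hypothetical. The calculation assumes background-subtracted densitometry values and assumes that every phosphorylated form shifts, which, as noted above, is not always the case.

```python
def shift_stoichiometry(shifted, unshifted):
    """Fraction of the protein found in the shifted (phosphorylated) band(s)."""
    total = shifted + unshifted
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return shifted / total


if __name__ == "__main__":
    # Hypothetical densitometry values in arbitrary units.
    print(shift_stoichiometry(shifted=30.0, unshifted=70.0))
```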

Incorporation of ³²P either in vitro or in vivo is highly sensitive but provides no information on the stoichiometry of phosphorylation of individual sites. However, an advantage of in vivo labeling with ³²P is that it can provide definitive results in less than a week as to whether a protein is phosphorylated in vivo (methods available in Cooper et al., 1984; Den Haese et al., 1995). Immunoblotting with phosphospecific antibodies is another common method for monitoring phosphorylation. However, not all sites can be detected with this approach. Effective antibodies that recognize phosphotyrosine have been available for >20 years. However, no comparably reliable antibodies exist for detecting phosphoserine or phosphothreonine. A number of antibodies that recognize phosphorylated residues within specific short linear motifs are available. These phospho-motif antibodies can be used to track phosphorylation due to certain classes of kinases (Zhang et al., 2002a).

To preserve phosphorylation, it is essential that extract buffers and SDS–PAGE sample buffers contain high concentrations of phosphatase inhibitors. Sodium fluoride and β-glycerol phosphate, used together at 50 and 100 mM, respectively, are good inexpensive options.

Once a site has been identified, it may be possible to generate phosphospecific antibodies that recognize the site and its surrounding context. These powerful reagents allow rapid detection of the site in vivo and in vitro, but their production is expensive and not guaranteed. In our collective experience, only 50% of phosphopeptides have yielded useful phosphospecific antibodies. In addition, such antibodies provide no information on the stoichiometry of phosphorylation.

Step 2: Identification of the kinase or kinases that act directly on the protein of interest

Consider the consensus recognition site of the kinase of interest. Although far from perfect, there are many clear examples of site preferences, and there are good computational tools for examining consensus sites (e.g., http://elm.eu.org/, http://netphorest.info/, http://scansite.mit.edu/; Songyang et al., 1994; Yaffe et al., 2001; Obenauer et al., 2003; Miller et al., 2008; Dinkel et al., 2012). Matches between mapped phosphorylation sites and the minimal consensus recognition site for the kinase under study increase confidence that the relevant kinase has been matched to its target sites.
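
A quick first pass at comparing mapped sites with a minimal consensus can be sketched with a regular expression, as below. This is an illustration only and no substitute for the tools cited above. The example uses the minimal proline-directed motif [ST]P (as for cyclin-dependent kinases); the sequence, the mapped sites and the function names are made up, and the code assumes the motif begins with the phosphoacceptor residue.

```python
import re


def consensus_sites(sequence, motif):
    """1-based positions where the motif starts (the phosphoacceptor residue).

    The motif is wrapped in a lookahead so that overlapping matches are found.
    """
    return [m.start() + 1 for m in re.finditer(f"(?=({motif}))", sequence)]


def split_by_consensus(mapped_sites, sequence, motif):
    """Partition mapped phosphosites into consensus-matching and other sites."""
    expected = set(consensus_sites(sequence, motif))
    matching = sorted(s for s in mapped_sites if s in expected)
    other = sorted(s for s in mapped_sites if s not in expected)
    return matching, other


if __name__ == "__main__":
    seq = "MASPTRSPEKTPLS"   # made-up sequence
    mapped = {3, 11, 14}     # hypothetical sites mapped by mass spectrometry
    print(consensus_sites(seq, "[ST]P"))
    print(split_by_consensus(mapped, seq, "[ST]P"))
```

Sites that fall outside the consensus are not necessarily wrong, but they deserve a closer look, for example as possible products of a copurifying kinase.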

Loss-of-function or gain-of-function mutations or small interfering/short hairpin RNA knockdown of the kinase should have the predicted effects on phosphorylation of the putative target protein. Of course, one must be aware of potential redundancies between kinases, which can complicate the analysis.

Kinase catalytic domains generally do not show a significant affinity for their target proteins; however, many kinases associate with their targets via secondary docking sites (for examples, see Choi et al., 1994; Mortensen et al., 2002; Harvey et al., 2005; Dard and Peter, 2006). Thus, detecting binding interactions between the kinase and a putative target increases confidence that the protein is a direct target.

Purified kinase should be capable of efficiently and quantitatively phosphorylating the target protein on a biologically relevant time scale in vitro. If quantitative phosphorylation requires a 10-fold excess of kinase and a 2-h reaction time but signaling in vivo occurs within minutes, the protein may not be a direct target or additional factors are required.

Fulfilling these criteria is relatively straightforward in yeast, which is why yeast are indispensable organisms for signaling analysis. It can be more difficult to fulfill all of the criteria in animal cells, but one should aim to fulfill as many as possible.

Step 3: Mapping of sites phosphorylated in vitro

The best mapping results come from an approach that combines data obtained in vivo and in vitro. Thus it is preferable to develop an in vitro reconstituted system to generate phosphorylated protein for analysis. To obtain good sequence coverage and quality spectra that yield high-confidence phosphopeptide matches, it is best to obtain as much phosphorylated protein as possible. The amount of starting material required for successful analysis will vary widely, but a “more is better” rule should apply. High-occupancy sites on MS-friendly peptides may be detectable from as little as 2 pmol of total protein (∼100 ng of a 50-kDa protein); however, much higher levels (≥10 pmol) are often required.
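
The conversion behind the "2 pmol is about 100 ng of a 50-kDa protein" figure is worth having at hand when planning how much material to prepare. A minimal Python sketch follows; the function names are illustrative. It relies on the fact that 1 pmol of a 1-kDa molecule weighs 1 ng.

```python
def pmol_to_ng(pmol, mw_kda):
    """Mass in ng of a given amount (pmol) of a protein of mw_kda kilodaltons."""
    return pmol * mw_kda


def ng_to_pmol(ng, mw_kda):
    """Amount in pmol contained in a given mass (ng) of a protein of mw_kda."""
    if mw_kda <= 0:
        raise ValueError("molecular weight must be positive")
    return ng / mw_kda


if __name__ == "__main__":
    print(pmol_to_ng(2, 50))    # lower bound quoted in the text
    print(pmol_to_ng(10, 50))   # the more typical requirement for the same protein
```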

Ideally, controls should be carried out when establishing the in vitro system to ensure that the kinase of interest is directly responsible for phosphorylation of the protein. Use of a kinase-dead control, specific kinase inhibitors, or analogue-sensitive kinase mutants can help rule out the possibility that phosphorylation is due to a copurifying kinase.

One should also consider whether the isolated substrate protein is already phosphorylated, potentially by multiple kinases. If so, the protein should be treated with phosphatase during purification. If the protein is isolated by affinity chromatography, this can be achieved by treating the protein with lambda or calf intestinal phosphatase while it is still bound to the affinity beads. Even proteins produced in bacteria sometimes have undergone nonspecific phosphorylation and may need to be treated with phosphatase before being used as a substrate.

Two reactions should be carried out: one containing kinase, and a reference reaction containing no kinase or a kinase-dead mutant. If one is using protein labeled by SILAC, the reference and experimental reactions are combined after the reactions and should contain equal amounts of protein labeled with heavy or light isotope. The combined reactions are analyzed directly by mass spectrometry after protease treatment to identify sites phosphorylated by the kinase. Alternatively, the combined reactions can be resolved by electrophoresis, and the band representing the protein of interest can be excised and analyzed by mass spectrometry to compare levels of phosphorylation, which can improve detection in some cases. If one is not using SILAC, the kinase and control reactions can be independently labeled with mass tags, combined, and analyzed.

As a complement to mass spectrometry analysis, phosphoamino acid analysis can be applied to the protein of interest isolated from 32P-labeled cells or after in vitro phosphorylation by specific kinases (Kamps and Sefton, 1989). For example, if phosphoamino acid analysis reveals that the protein of interest is phosphorylated on both serine and threonine residues in vivo and in vitro, but mass spectrometry analysis identifies only serine phosphorylation sites, it might be that potentially important modifications were missed because the corresponding threonine phosphopeptides could not be detected by mass spectrometry. Similarly, two-dimensional phosphopeptide mapping can provide a sense of the complexity of phosphorylation, as well as a means of testing whether all important sites of phosphorylation have been identified after mutagenesis, without the need for complex mass spectrometry experiments (Boyle et al., 1991). Classic phosphopeptide mapping also has the advantage that all the phosphopeptides are detected, in contrast to mass spectrometry.

Step 4: Mapping of sites phosphorylated in vivo

To determine whether phosphorylation sites identified in vitro are relevant, it is important to show that they are also phosphorylated in vivo. A number of databases catalogue sites that have been found to be phosphorylated in vivo in large-scale surveys, so an easy first step is to determine whether sites identified in vitro have already been identified in vivo, while keeping in mind that large-scale surveys provide low sequence coverage and miss many sites (for databases, see PhosphoSitePlus, Phospho.ELM, and PHOSIDA).

To map sites phosphorylated in vivo, one should first define physiological conditions under which a significant fraction of the protein is phosphorylated. This could involve treating cells with a stimulus that activates signaling or synchronizing cells in the cell cycle. Another approach is to manipulate the signaling pathway genetically such that the kinase of interest is hyperactivated. An ideal situation is when one can purify the substrate protein from control cells, cells in which the relevant kinase is hyperactivated, and cells in which the relevant kinase has been inactivated, which allows one to determine which sites depend on the kinase of interest in vivo.

Once the appropriate conditions have been defined, the substrate protein must be purified under conditions that preserve phosphorylation and yield sufficient amounts of protein. For best results, one should aim for ≥10 pmol of protein. Affinity purification methods that allow specific release of the purified protein will produce the best results. For example, proteins tagged with hemagglutinin or FLAG can be purified using antibody beads and eluted with an excess of peptide (Ho et al., 2002; Harvey et al., 2011). Tandem affinity purification (TAP), multifunctional TAP, or other affinity-based tags can also be used (Rigaut et al., 1999; Ma et al., 2012). Purification of proteins using antibodies raised against the protein of interest can be problematic because elution from the antibody requires harsh conditions that also release large amounts of antibody from the beads, complicating the analysis, although this problem can be circumvented by cross-linking the antibodies to beads. Nonspecific elutions also generate a higher background of contaminant proteins that can interfere with target peptide detection.

To preserve phosphorylation, purification should be carried out in buffers that contain high concentrations of salt and phosphatase inhibitors. High salt concentrations help inhibit phosphatases and reduce nonspecific binding, whereas phosphatase inhibitors minimize dephosphorylation during purification.

Step 5: Interpretation of phosphorylation-site mapping data

Mass spectrometry analysis will yield a list of identified sites. It should include a tally of peptide spectral matches (PSMs), each representing a unique MS/MS spectrum that matches a peptide containing that site, and scoring parameters for peptide identification and site localization. Peptide identification data are routinely filtered to ∼1% false-discovery rate using the target-decoy strategy (Elias and Gygi, 2007). Researchers should be aware that the data will contain incorrect matches. The observation of multiple PSMs for a given site, either through multiple observations of the same peptide or the detection of different peptide sequences harboring the same site, bolsters the confidence of correct site assignment. As discussed earlier, site assignments often cannot be resolved to a single residue. In some cases, for example, where the study is focused on a kinase with a known consensus motif, local sequence can be used to guide the choice of sites to pursue for further validation. However, in most cases, all possible sites on each peptide must be considered. Sites that are phosphorylated both in vivo and in vitro have high confidence of being relevant sites. Sites that change in occupancy in response to changes in relevant upstream signals are also high-confidence sites. Another consideration that can enhance confidence that correct identification has been made is the conservation of the phosphorylation site(s) throughout evolution. Although proteins phosphorylated at multiple sites within unstructured regions may not show evolutionary conservation of phosphorylation sites, within folded domains, phosphorylation sites are often conserved (Landry et al., 2009; Niu et al., 2012).
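The target-decoy filtering mentioned above can be illustrated with a minimal sketch. It assumes a concatenated target-decoy search in which the false-discovery rate at a score cutoff is approximated as (number of decoy PSMs) / (number of target PSMs) at or above that cutoff. The function name and the `(score, is_decoy)` input layout are hypothetical, and tied scores are not treated specially.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) tuples, higher score = better match.
    Returns None if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything accepted so far.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Toy data: 100 good target PSMs, then a mix of decoys and one more target.
example_psms = [(float(s), False) for s in range(101, 201)]
example_psms += [(100.0, True), (99.0, False), (98.0, True), (97.0, True)]
cutoff = score_threshold_at_fdr(example_psms, max_fdr=0.01)
```

Even after such filtering roughly 1 in 100 accepted PSMs is expected to be wrong, which is why multiple PSMs per site increase confidence.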

Phosphorylation-site mapping screens are almost never saturating. Failure to observe a site is insufficient evidence to conclude that the site is not phosphorylated in the cell. Many factors, including length, hydrophobicity, and charge, affect the chromatographic properties and ionization efficiencies and thus the ease of detection of different peptides. Phospho and nonphospho forms of the same peptide can have very different signal intensities. Thus even the observation of an unphosphorylated peptide is no guarantee that the correlate phosphopeptide is easily detectable. Despite these caveats, the unphosphorylated peptide sequence coverage still provides some indication of the depth of analysis. With sufficient amounts of protein, and barring long stretches of intractable sequence, it should be possible to achieve ≥80% amino acid coverage. Lower coverage decreases confidence that all sites have been identified and often indicates that more protein or an additional protease is needed to generate peptides for the analysis.

For quantitative experiments, the data will include abundance ratios for each phosphopeptide along with signal intensities, often recorded as a signal-to-noise ratio. Unlike protein-level analysis, in which multiple quantified peptides are often observed, phosphopeptides are more frequently detected and quantified only once. There is a strong correlation between signal strength and reproducibility. When selecting sites for further study, investigators should pay close attention to the number of PSMs and the signal strength for peptides harboring each site. It should also be noted that compiling ratios at the site level from multiple peptide measurements is not always trivial. Simply calculating averages or medians of all peptides containing a given site might not reveal the full complexity of cellular phosphorylation patterns. Singly and doubly phosphorylated forms of a peptide might be present at different levels. One must also not forget that changes in total protein level are not reflected in the phosphopeptide ratios. Wherever possible, separate protein-level measurements made from unmodified peptides should be performed and used to normalize phosphopeptide ratios.
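The normalization recommended above can be sketched as follows. This is an illustration only: it assumes the protein-level ratio is summarized as the median of the ratios of unmodified peptides, which is one common choice and not one prescribed by the text. Function names are hypothetical.

```python
from statistics import median


def protein_level_ratio(unmodified_peptide_ratios):
    """Summarize protein abundance change from unmodified peptide ratios (median is an assumption)."""
    return median(unmodified_peptide_ratios)


def normalized_phospho_ratio(phospho_ratio, unmodified_peptide_ratios):
    """Divide a phosphopeptide ratio by the protein-level ratio.

    A result near 1 means the phosphopeptide change is explained by a change
    in total protein; a result far from 1 suggests a change in site occupancy.
    """
    return phospho_ratio / protein_level_ratio(unmodified_peptide_ratios)


# A 4-fold phosphopeptide increase on a protein that itself doubled.
corrected = normalized_phospho_ratio(4.0, [1.9, 2.0, 2.2])
```

Here the apparent 4-fold change reduces to a 2-fold change in phosphorylation once the doubling of total protein is taken into account.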

Step 6: Analysis and interpretation of phosphorylation-site mutants

After identifying as many phosphorylation sites as possible and assigning them to likely protein kinases, the next step is to mutate the sites so that their biological significance can be ascertained. Typically, serine and threonine phosphosites are mutated to alanine (or valine for threonine), and tyrosine phosphorylation sites are mutated to phenylalanine. Because mass spectrometry can miss sites, it is important to verify that most or all key sites have been identified and mutated. If the mutant protein loses its SDS–PAGE shift, it is likely that most sites have been identified. However, this does not exclude the possibility that some sites have been missed that do not cause a shift. Thus a more rigorous approach is to show that the mutant protein fails to incorporate 32P in a reconstituted in vitro system or that the relevant phosphopeptides identified by in vivo, 32P-labeled, two-dimensional phosphopeptide mapping disappear. It can sometimes also be informative to switch serine for threonine residues or vice versa. Many protein kinases do not distinguish serine from threonine, and if the site is targeted in vivo, the protein's gel shift should not be lost with this switch, and yet a change in phosphoamino acid content of the corresponding peptide can be readily identified, which allows one to directly verify that the relevant phosphorylation site has been identified.
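When designing such mutants it can help to generate the mutant sequence programmatically and to check that every listed position really is a serine, threonine, or tyrosine. The following is a minimal sketch; the function name, the 1-based position convention, and the toy sequence are assumptions made for illustration.

```python
# Substitutions described in the text: S/T -> A, Y -> F.
NONPHOSPHORYLATABLE = {"S": "A", "T": "A", "Y": "F"}


def mutate_phosphosites(sequence, sites):
    """Return the sequence with the given 1-based phosphosites made nonphosphorylatable."""
    residues = list(sequence)
    for pos in sites:
        residue = residues[pos - 1]
        if residue not in NONPHOSPHORYLATABLE:
            raise ValueError(f"position {pos} is {residue}, not S, T or Y")
        residues[pos - 1] = NONPHOSPHORYLATABLE[residue]
    return "".join(residues)


# Toy sequence, not a real protein.
mutant = mutate_phosphosites("MSTYK", [2, 3, 4])
```

The explicit check guards against off-by-one errors in site numbering, for example when positions were reported relative to a different isoform.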

If a phosphorylation site mutant causes a loss of function, there can be the concern that it causes nonspecific damage to the protein. The vast majority of phosphorylation sites occur in regions of proteins that are predicted to be disordered, so it is unlikely that phosphorylation-site mutants disrupt protein structure (Iakoucheva et al., 2004; Gsponer et al., 2008). In addition, a number of criteria can be used to help rule out this possibility. For example, if normal levels of the protein are expressed in vivo, it is likely that the protein undergoes normal folding, since proteins that cannot fold correctly are destroyed. Another helpful test is to determine whether the phosphorylation-site mutant retains a subset of normal functions, which would indicate that the mutants affect specific functions of the protein. If the protein shows normal localization, it clearly retains key functions.

It has become common in protein phosphoregulation studies to mutate phosphorylation sites to “phosphomimetic” residues in an attempt to study the constitutively phosphorylated state. In this approach, serine and threonine are typically mutated to aspartic or glutamic acid residues, whereas tyrosine is substituted with glutamic acid. This approach has two significant shortcomings. First, if the phosphorylation site serves as a recognition signal for an adaptor protein (e.g., 14-3-3, FHA-domain, PTB-domain, and SH2-domain proteins), phosphomimetic mutants will not bind to the adaptor protein (Durocher et al., 1999; Zisch et al., 2000; Roberts-Galbraith et al., 2010) because they do not fit into the binding pocket (van der Geer and Pawson, 1995; Yaffe et al., 1997; Durocher et al., 1999). Second, the negative charge introduced by aspartate or glutamate substitutions (−1) does not match that of the phosphorylated residue (generally −1.5) at physiological pH. Neighboring pairs of aspartic or glutamic acid side chains can overcome the charge differential and may act as better phosphomimetics (Strickfaden et al., 2007; Pearlman et al., 2011). However, the size of the ionic shell produced by a phosphate group is also different, and so the overall chemical environment created by phosphorylation is very different from that of negatively charged amino acids (Hunter, 2012). It is therefore not surprising that phosphomimetic mutations often fail to reproduce the changes to a protein caused by phosphorylation. As a result of these two limitations, the behavior of phosphomimetic mutations can be uninterpretable. Of course, there are examples in which phosphomimetic substitutions have been highly informative. The constitutive activation of MEK kinases by a phosphomimetic mutation is an excellent example (McKay and Morrison, 2007).


Thermal proximity coaggregation (TPCA)

Thermal Proximity Coaggregation (TPCA) is a relatively recent and unconventional approach for proteome-wide profiling of protein complex dynamics [87]. It exploits the phenomenon that interacting proteins co-aggregate after heat-induced denaturation and co-precipitate. As a result, they have a high similarity in their thermal solubility compared to non-interacting proteins. The assembly state of known protein complexes can be inferred from the similarity or changes in protein thermal solubility to identify those modulated across cellular states or physiological conditions. To simultaneously monitor the dynamics for hundreds to thousands of protein complexes, proteome-wide quantification of protein thermal solubility is determined using quantitative MS, similar to that of thermal proteome profiling [88], which employs isobaric TMT (tandem mass tag) reagents to simultaneously quantify protein solubility across ten different temperatures from CETSA (Cellular Thermal Shift Assay) experiments [89] (Fig. 5).

The TPCA workflow. TPCA can be performed on intact cells or cell lysate. Lysed samples are first divided into equal aliquots and subjected to heat treatment across an increasing temperature gradient. Heat treatment induces denaturation and coaggregation of interacting proteins, which then co-precipitate. Upon centrifugation, the supernatant containing the soluble proteins from each temperature treatment is retrieved for isobaric TMT labelling and quantitative LC–MS/MS analysis. The abundance of each soluble protein identified and quantified is then plotted against temperature to generate the “protein melting curve”.

The current implementation of TPCA utilizes the CETSA protocol [90] to denature proteins and extract the soluble fraction, followed by TPP for proteome-wide quantification of protein solubility [91]. When the thermal solubilities of proteins are plotted against increasing temperatures, the so-called melting curve of proteins can be constructed to visualize the TPCA signature across cell types or conditions. The similarity in protein thermal solubility between pairs of proteins across multiple temperatures can be quantified using measures like Euclidean distance [87] and Pearson's correlation [92]. The statistical significance of observed similarities and changes in thermal solubility between pairs of proteins is estimated through a bootstrapping approach using random pairs of proteins to establish a random background distribution [87].
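The similarity measures and the random-pair background described above can be sketched in a few lines. This is a simplified illustration, not the published TPCA implementation: melting curves are assumed to be lists of soluble fractions at the same temperatures, and the empirical p-value is assumed to be the fraction of randomly drawn protein pairs that are at least as similar as the pair of interest. All names are hypothetical.

```python
import math
import random


def euclidean_distance(curve_a, curve_b):
    """Distance between two melting curves sampled at the same temperatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(curve_a, curve_b)))


def pearson(curve_a, curve_b):
    """Pearson correlation between two melting curves (both must be non-constant)."""
    n = len(curve_a)
    mean_a = sum(curve_a) / n
    mean_b = sum(curve_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(curve_a, curve_b))
    var_a = sum((x - mean_a) ** 2 for x in curve_a)
    var_b = sum((y - mean_b) ** 2 for y in curve_b)
    return cov / math.sqrt(var_a * var_b)


def empirical_p_value(curves, pair, n_random=1000, seed=0):
    """Fraction of random protein pairs whose distance is <= that of `pair`."""
    rng = random.Random(seed)
    observed = euclidean_distance(curves[pair[0]], curves[pair[1]])
    names = sorted(curves)
    hits = 0
    for _ in range(n_random):
        a, b = rng.sample(names, 2)  # two distinct proteins
        if euclidean_distance(curves[a], curves[b]) <= observed:
            hits += 1
    return hits / n_random


# Toy curves: A and B melt identically, C never melts, D is always insoluble.
toy_curves = {
    "A": [1.0, 0.5, 0.0],
    "B": [1.0, 0.5, 0.0],
    "C": [1.0, 1.0, 1.0],
    "D": [0.0, 0.0, 0.0],
}
p_ab = empirical_p_value(toy_curves, ("A", "B"))
```

With real data the background would be built from thousands of proteins, so that a small p-value indicates a pair that co-aggregates more tightly than expected by chance.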

Using TPP and CETSA protocols, data for TPCA analysis can be obtained from both cell lysate and intact cells. In the former, cells are first lysed before heat denaturation, while in the latter, intact cells are first heated before cell lysis. In the first proof-of-concept work, TPCA was used to identify protein complexes modulated across different cell types, cellular states and cellular conditions [87]. Protein complexes from intact cells exhibited a much stronger TPCA signature (i.e., a higher level of co-aggregation) than those from cell lysate [87]. This observation suggests that the integrity of protein complexes might have been compromised after cell lysis. Notably, many protein complexes that exhibit a TPCA signature only in intact cells are associated with, and likely dependent on, subcellular scaffolds such as chromatin and membranes for structural stability, which are probably absent in cell lysate. Taken together, these observations suggest TPCA will be valuable for studying protein complexes in situ, particularly for weak-binding protein complexes that easily dissociate after cell lysis. Importantly, TPCA can reveal the subcomplex organization of megacomplexes like the nuclear pore complex and the proteasome [87, 92]. Also, it has been reported that phosphorylation can affect the thermal solubility of proteins through modulating PPIs, suggesting the ability to identify phosphorylation-dependent protein complexes [93]. Interestingly, similar to CETSA and TPP, it has also been shown that TPCA analysis can be extended to in vivo specimens such as tissues and blood samples [87, 94].

TPCA for system-wide profiling of protein complex dynamics has the advantage of requiring neither antibodies nor epitope tagging. It requires little preparation time compared to existing methods and, most importantly, permits the study of protein complexes in situ and in vivo. The current version of TPCA can be deployed to efficiently study the dynamics of known or predicted protein complexes across cellular states and physiological conditions, but it needs to incorporate existing interaction data with graph/network clustering algorithms to identify novel protein complexes. Nevertheless, Hashimoto et al. recently demonstrated that novel protein–protein interactions could be inferred among the small set of viral proteins using only TPCA data [95]. Large-scale human interactome projects and integrative data analysis have uncovered many novel but functionally uncharacterized protein complexes. TPCA profiling can be rapidly deployed to unravel the assembly state of these protein complexes across cellular states, cell types, tissues and physiological conditions to provide insight into their functions in normal and diseased cells. Thermal solubility data can be rapidly generated across species, and data are now available for over 13 species ranging from human to archaea. Thus, we envision that the TPCA analysis approach could be widely adopted to study protein complexes and protein interactions across the tree of life [96,96,98].


Methods

Biochemical processes (such as proteolysis) can be described by ordinary differential equations (ODEs). This allows one to simulate and analyze a process and thus to draw conclusions about its properties, such as steady states or changes in the concentration of its constituents over time. A simple example of such a system is Tyson’s cell cycle model [18]. Graphs are often used to visualize these ODE systems, with nodes representing the reactants and edges representing the reactions between them. Note that both representations (ODE and graph) are equivalent. For modeling and visualizing proteolytic processes, Kluge et al. introduced the cleavage graph [17], which they used to model exoproteolytic cleavage reactions. In the following we extend this concept to also include endoproteolytic reactions. We call the resulting data structure a degradation graph, since it can be used to model all degradation reactions of a proteolytic process and also allows a convenient and comprehensible visualization.

Degradation Graph

A proteolytic process where single or multiple peptides are generated by cutting peptides into smaller fragments can be modeled as a graph $G = (V, E)$.

The nodes V correspond to the degraded and generated peptides and the edges E to the proteolytic reactions. Since proteolysis is an irreversible reaction under physiological conditions the edges in the graph are directed from the degraded to the generated peptides.

As mentioned above, one can distinguish two types of proteolytic reactions: exoproteolytic reactions, where a single amino acid is removed from one of the free termini of the peptide, and endoproteolytic reactions, where the targeted peptide is cleaved at a position between the N- and C-terminus. For exoproteolytic reactions we connect two nodes with a directed edge from node u to v if we can obtain the amino acid sequence of v by removing a single amino acid from the beginning or the end of the amino acid sequence of u. Endoproteolytic reactions are less straightforward. Since we need to connect three nodes (the targeted peptide u and the two resulting fragments v and w), we have to give up the idea that one reaction equals one edge in the graph. To ensure that we can still associate the reaction with a single edge, we introduce pseudo-nodes that represent the endoproteolytic process of cutting the peptide u at a specific position c. A pseudo-node can also be seen as a representation of the endoprotease that cuts the peptide u at position c. We can now connect u to the pseudo-node and associate all reaction-specific information (e.g., the reaction rate) with this single edge. We further connect the pseudo-node to v and w with so-called pseudo-edges.

Both reaction types are shown separately in Figure 3. An example with real peptide sequences and both reaction types is shown in Figure 2.

(a) Exoprotease reaction, (b) Endoprotease reaction. See Figure 2 for an example containing both reaction types.
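The degradation graph described above can be represented with a small data structure. The following is a minimal sketch under stated assumptions: peptides are identified by their sequence strings, a pseudo-node is encoded as a hypothetical `("cut", u, c)` tuple, and cutting at position c is taken to mean splitting after the c-th residue. None of these encodings come from the original work.

```python
from dataclasses import dataclass, field


@dataclass
class DegradationGraph:
    peptides: set = field(default_factory=set)
    # One entry per reaction: (substrate, product) for exo reactions,
    # (substrate, pseudo_node) for endo reactions. Rates would attach here.
    reactions: list = field(default_factory=list)
    # Pseudo-edges connect a pseudo-node to the two fragments it produces.
    pseudo_edges: list = field(default_factory=list)

    def add_exo(self, u, v):
        """Add an exoproteolytic reaction u -> v (v is u minus one terminal residue)."""
        if not v or v not in (u[1:], u[:-1]):
            raise ValueError("v must be u without its first or last residue")
        self.peptides.update((u, v))
        self.reactions.append((u, v))

    def add_endo(self, u, c):
        """Add an endoproteolytic reaction that cuts u after residue c (1-based)."""
        if not 1 <= c < len(u):
            raise ValueError("cut position must leave two non-empty fragments")
        v, w = u[:c], u[c:]
        pseudo = ("cut", u, c)
        self.peptides.update((u, v, w))
        self.reactions.append((u, pseudo))
        self.pseudo_edges.extend([(pseudo, v), (pseudo, w)])
        return v, w


graph = DegradationGraph()
graph.add_exo("PEPTIDE", "EPTIDE")
fragments = graph.add_endo("PEPTIDE", 3)
```

Keeping exactly one entry in `reactions` per reaction, regardless of type, mirrors the purpose of the pseudo-node: a single edge carries the kinetic information.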

Constructing the Graph from Mass Spectrometry Data

In the previous section we defined the degradation graph and its relation to proteolytic processes. Now we present an approach to construct this graph based on a series of N mass spectra collected at different time points and a seed sequence S, which we will also call the base peptide from here on. Based on this input we try to identify signals in the mass spectra that represent fragments of S produced by a proteolytic process. The seed sequence needs to be provided as input. It can, for instance, be the sequence of a known peptide probe that was incubated with an unknown mixture of proteases or a sequence taken from MS/MS identifications.

We briefly introduce some notation to ease the following explanations. Given a node v in the degradation graph, $s(v)$ denotes the amino acid sequence of the peptide associated with the node v. The length of the amino acid sequence is given by $|s(v)|$. $s(v)[a,b]$ with $1 \le a \le b \le |s(v)|$ is the subsequence of the amino acid sequence from position a to position b. $m(v)$ denotes the mass of the peptide associated with the node v. If we can identify a signal that corresponds to the peptide associated with v, we denote its intensity by $I_{m(v)}$. The association between mass and intensity takes into account that mass spectrometers measure only mass-to-charge ratios and therefore cannot distinguish peptides with equal mass. Therefore, different peptides with equal mass can be associated with the same intensity value without counting the signal twice in the later analysis. The set of all peptide masses in the graph is denoted by M. We further introduce a queue of nodes L, which is empty at the beginning of the construction.

The construction of the graph is divided into two parts, verification and extension, which are executed on each of the input spectra. Before we can execute these steps, we need to initialize the degradation graph. This is done by adding a node for the seed sequence to the degradation graph. Afterwards we start with the verification step for the first spectrum, recorded at time point $t_1$, followed by the extension step. This is repeated for each of the input spectra. The pseudocode for both parts is shown in the Supporting Information (Figure S1).

Verification

The first step is the verification of the degradation graph on the new spectrum. We therefore check for each node in the degradation graph whether we can find a signal that corresponds to this node in the spectrum. In general, we identify signals by peptide mass fingerprinting [19]. Our approach is described in the Supporting Information (Text S1). Existing MS/MS identifications [20] are used solely for validation, since relying only on MS/MS identifications during the construction phase of the algorithm would introduce a bias towards the acquisition strategy used. Each node v that could be identified in the spectrum is added to L and annotated with the observed intensity $I_{m(v)}$.

Extension

The extension step is performed on the current spectrum as long as L is not empty. In each cycle a node u is removed from L and the following procedure is executed.

Given the node u, we start by removing the N- and C-terminal amino acid separately from $s(u)$, the amino acid sequence of u, to simulate exoproteolytic degradation and search for the corresponding signals. If we find a signal, we add the corresponding node v to the graph, annotate it with the signal intensity $I_{m(v)}$, set its sequence to either $s(u)[2, |s(u)|]$ or $s(u)[1, |s(u)|-1]$, and connect the nodes u and v by an edge pointing from u to v. The generated node v is appended to the queue L.

Subsequently we simulate the endoproteolytic reactions by splitting the sequence $s(u)$ into two parts at each position c with $1 \le c < |s(u)|$. If we can identify both fragments of such a split in the mass spectrum, we add a pseudo-node, annotated with the sequence $s(u)$ and the cutting position c, to the graph and connect it to the degraded node u. We then add nodes v and w for the two fragments to the graph, annotate them with the corresponding signal intensities ($I_{m(v)}$, $I_{m(w)}$) and sequences ($s(u)[1, c]$ and $s(u)[c+1, |s(u)|]$), and connect them to the pseudo-node. The generated nodes are appended to the queue L.
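The extension step can be sketched as a queue-driven search. This is a simplified stand-in, not the authors' procedure: signals are assumed to be a list of neutral monoisotopic masses matched within a fixed tolerance (the actual signal identification is described in their Text S1, which is not reproduced here), only a single spectrum is processed, and the seed is assumed to be already verified. The function names and the `(u, c)` encoding of pseudo-nodes are hypothetical.

```python
from collections import deque

# Standard monoisotopic residue masses (Da); one water is added per peptide.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def has_signal(mass, observed_masses, tol=0.01):
    """True if any observed mass lies within tol (Da) of the given mass."""
    return any(abs(mass - m) <= tol for m in observed_masses)


def extend(seed, observed_masses, tol=0.01):
    """Grow a degradation graph from the seed; returns (nodes, exo_edges, endo_cuts)."""
    nodes = {seed}
    exo_edges, endo_cuts = set(), set()
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        # Exoproteolytic candidates: drop the N- or the C-terminal residue.
        for v in (u[1:], u[:-1]):
            if v and has_signal(peptide_mass(v), observed_masses, tol):
                exo_edges.add((u, v))
                if v not in nodes:
                    nodes.add(v)
                    queue.append(v)
        # Endoproteolytic candidates: both fragments must be observed.
        for c in range(1, len(u)):
            v, w = u[:c], u[c:]
            if all(has_signal(peptide_mass(f), observed_masses, tol) for f in (v, w)):
                endo_cuts.add((u, c))  # stands in for the pseudo-node
                for f in (v, w):
                    if f not in nodes:
                        nodes.add(f)
                        queue.append(f)
    return nodes, exo_edges, endo_cuts


spectrum = [peptide_mass(p) for p in ("EPTIDE", "PEP", "TIDE")]
nodes, exo_edges, endo_cuts = extend("PEPTIDE", spectrum)
```

Each peptide enters the queue at most once and all candidates are substrings of the seed, so the search terminates. Note that leucine and isoleucine share a residue mass, a simple case of the isobaric ambiguity the text discusses.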

Estimation of Kinetic Parameters

After we generated the model representing the proteolytic process, i.e., the degradation graph, the next task is to estimate the kinetic parameters of the underlying process. To achieve this we first generate a system of ordinary differential equations (ODE) based on a degradation graph as described in the following section. For this system we estimate the kinetic parameters based on the observed signal intensities.

Generating an ODE Model for the degradation graph

Following the ideas presented by Yi et al. [16], the mathematical model is derived from the law of mass action, and each proteolytic reaction is modeled as a first-order reaction, i.e., the rate of the reaction depends on the concentration of only one reactant. In the case of proteolytic reactions, this reactant is the protein or peptide that is degraded. We neglect side effects like saturation of the degradation products, but incorporating these would be possible by an extension of the ODE system. We write the rate equations for an exoprotease reaction, where u is degraded to v, as follows:

$$\frac{d c_u(t)}{dt} = -k_{u,v}\, c_u(t), \qquad \frac{d c_v(t)}{dt} = k_{u,v}\, c_u(t),$$

where $c_u(t)$ and $c_v(t)$ denote the concentrations of peptides u and v at time t, and $k_{u,v}$ is the kinetic rate constant for the reaction. Endoprotease reactions are represented in the same manner, with the slight difference that we need to model both degradation products.
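As an illustration of the endoprotease case, the first-order mass-action assumption stated above suggests the following form. The notation is assumed for this sketch and is not taken from the original: $c_u(t)$, $c_v(t)$ and $c_w(t)$ are the concentrations of the degraded peptide u and its two fragments v and w, and $k_{u,c}$ is the rate constant of the cut of u at position c.

```latex
\begin{align}
\frac{d c_u(t)}{dt} &= -k_{u,c}\, c_u(t), \\
\frac{d c_v(t)}{dt} &= \phantom{-}k_{u,c}\, c_u(t), \\
\frac{d c_w(t)}{dt} &= \phantom{-}k_{u,c}\, c_u(t).
\end{align}
```

Both fragments are produced at the rate at which u is consumed, since every cut yields exactly one copy of each fragment. When a peptide takes part in several reactions, the corresponding terms are summed.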

This transformation can be done for each reaction and each reactant in the degradation graph. As an example we transformed the degradation graph shown in Figure 2 into the following system of differential equations.

Since the degradation process as well as the mass spectrometry measurements happen ex-vivo, the base peptide ( in the above example) has a fixed starting concentration and there will be no further production of the base peptide. In settings where this does not hold, one would need to explicitly model the generation of the base peptide into the equations (e.g., by a constant generation rate).
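A first-order system of this kind can be simulated numerically. The following is a minimal sketch using a hand-written classical Runge-Kutta (RK4) integrator; the reaction encoding `(substrate, products, rate)` and all names are assumptions for illustration, and the base peptide is given a fixed starting concentration with no further production, as described above.

```python
import math


def derivatives(conc, reactions):
    """Right-hand side of the ODE system for first-order proteolytic reactions.

    reactions: list of (substrate, products, k); exo reactions have one
    product, endo reactions have two.
    """
    rates = {peptide: 0.0 for peptide in conc}
    for substrate, products, k in reactions:
        flux = k * conc[substrate]
        rates[substrate] -= flux
        for product in products:
            rates[product] += flux
    return rates


def simulate(conc0, reactions, t_end, dt=0.01):
    """Integrate from t = 0 to t_end with fixed-step RK4; returns final concentrations."""
    conc = dict(conc0)
    for _ in range(int(round(t_end / dt))):
        k1 = derivatives(conc, reactions)
        k2 = derivatives({p: conc[p] + 0.5 * dt * k1[p] for p in conc}, reactions)
        k3 = derivatives({p: conc[p] + 0.5 * dt * k2[p] for p in conc}, reactions)
        k4 = derivatives({p: conc[p] + dt * k3[p] for p in conc}, reactions)
        for p in conc:
            conc[p] += dt / 6.0 * (k1[p] + 2 * k2[p] + 2 * k3[p] + k4[p])
    return conc


# One endoproteolytic cut of a base peptide "u" into fragments "v" and "w".
start = {"u": 1.0, "v": 0.0, "w": 0.0}
final = simulate(start, [("u", ("v", "w"), 0.5)], t_end=2.0)
```

For this single reaction the result can be compared with the analytic solution $c_u(t) = c_u(0)\,e^{-kt}$, which is a useful sanity check before simulating larger graphs.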

Transforming peptide concentrations to signal intensities

The presented ODE model is based on concentrations of peptides but with a mass spectrometer we can only observe intensities associated with a specific mass. The obvious question is what kind of relationship exists for a single peptide between its concentration and the intensity observed with the mass spectrometer. Moreover one cannot guarantee that two peptides with equal concentration will have the same intensity in the mass spectrometer.

Different studies [21], [22] have shown that for a single peptide a linear relationship between intensity and concentration is a reasonable assumption. Based on this we introduced a linear transformation from the model concentrations to the predicted signal intensities:

$$\hat{I}_m(t) = \alpha_p \, c_p(t),$$

where $\hat{I}_m(t)$ is the intensity associated with the mass m at time point t, m is the mass of the peptide p, $\alpha_p$ is a peptide-specific factor, and $c_p(t)$ is the concentration, computed by the model, for peptide p at time point t. Yi et al. [16] already used a similar transformation successfully in their study. This transformation also implicitly solves the second problem of comparability between two observed intensities. Since each observed intensity is transformed individually into the common concentration domain, the resulting concentrations can be compared afterwards. This transformation can also be used to compensate for systematic effects that occur in each measurement, e.g., quantification errors or incomplete ionization.

Another problem is that two or more different peptides can have the same or a nearly identical mass. These isobaric peptides cannot be distinguished in a mass spectrum. We therefore transform them into a single intensity value. For every observed mass $m$, we compute a linear combination of the concentrations of all peptides with a mass equal (or nearly equal) to $m$:

$$\hat{I}_m(t) = \sum_{p \in P_m} \alpha_p \, c_p(t),$$

where $P_m$ is the set of all peptides which have a mass of $m$.
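The transformation from model concentrations to predicted intensities, including the merging of isobaric peptides, can be sketched as follows. The grouping rule (peptides sorted by mass and merged when within a fixed tolerance of the group's first mass) and all names are assumptions made for this illustration.

```python
def predicted_intensities(concentrations, masses, factors, tol=0.01):
    """Sum factor * concentration over peptides that share (nearly) the same mass.

    concentrations, masses, factors: dicts keyed by peptide.
    Returns a list of (mass, predicted_intensity), sorted by mass.
    """
    groups = []  # each entry is [representative_mass, summed_intensity]
    for peptide in sorted(concentrations, key=lambda p: masses[p]):
        contribution = factors[peptide] * concentrations[peptide]
        if groups and abs(masses[peptide] - groups[-1][0]) <= tol:
            groups[-1][1] += contribution  # isobaric with the current group
        else:
            groups.append([masses[peptide], contribution])
    return [(mass, intensity) for mass, intensity in groups]


# "PE" and "EP" have the same composition and are therefore isobaric.
toy_conc = {"PE": 1.0, "EP": 2.0, "PEP": 0.5}
toy_mass = {"PE": 244.1059, "EP": 244.1059, "PEP": 341.1587}
toy_factor = {"PE": 10.0, "EP": 5.0, "PEP": 4.0}
signals = predicted_intensities(toy_conc, toy_mass, toy_factor)
```

The two isobaric peptides contribute to one predicted signal, so their individual factors can only be separated if the kinetics of the two peptides differ over time.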

Estimating reaction rates

To estimate kinetic parameters we first generated an ODE model based on a degradation graph as described above. We now need to find the optimal set of model parameters ($k$) as well as transformation parameters ($\alpha$), so that the difference between the computed model intensities and the observed intensities is minimal. Following standard practice, we use a weighted sum of squared differences between observed and model intensities as an error measure:

$$E(k, \alpha) = \sum_{m \in M} \sum_{i=1}^{N} w(m, t_i) \left( I_m(t_i) - \hat{I}_m(t_i) \right)^2,$$

where $M$ is the set of all observed masses, $I_m(t_i)$ is the intensity observed for mass $m$ at time point $t_i$, $\hat{I}_m(t_i)$ is the intensity predicted by the ODE system for the mass $m$ at time point $t_i$, and $w$ is a weighting function. The weighting function can, for instance, be used to use relative instead of absolute deviations, i.e.,

$$w(m, t_i) = \frac{1}{I_m(t_i)^2}.$$

This is used to reduce the effect of different intensities being on different orders of magnitude. This minimization problem can theoretically be solved by any available optimization technique. After testing different available techniques we decided to use POEM, a Matlab-based version of BioPARKIN [23], [24], to estimate the model parameters as well as the transformation parameters. We further use POEM to estimate the initial concentration of the base peptide. POEM is based on damped Gauss-Newton techniques for solving the above optimization problem. Lack of robustness of damped Gauss-Newton techniques as observed often in model discrimination contexts, see [25], can be overcome by using dimension reduction in parameter space [26].
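The error measure itself is easy to write down, independent of the optimizer. The following sketch only evaluates the weighted sum of squares; it does not reproduce POEM or any Gauss-Newton scheme. It assumes intensities are stored in dicts keyed by `(mass, time)` and that observed intensities are strictly positive when relative weighting is used. All names are hypothetical.

```python
def weighted_sse(observed, predicted, relative=True):
    """Weighted sum of squared differences between observed and predicted intensities.

    observed, predicted: dicts mapping (mass, time_point) -> intensity.
    relative=True uses the weight 1 / observed**2 (relative deviations);
    relative=False uses the weight 1 (absolute deviations).
    """
    total = 0.0
    for key, obs in observed.items():
        weight = 1.0 / obs ** 2 if relative else 1.0
        total += weight * (obs - predicted[key]) ** 2
    return total


# A strong and a weak signal, each mispredicted by 10 percent.
obs = {(244.1, 0): 100.0, (244.1, 1): 10.0}
pred = {(244.1, 0): 110.0, (244.1, 1): 9.0}
relative_error = weighted_sse(obs, pred, relative=True)
absolute_error = weighted_sse(obs, pred, relative=False)
```

With absolute deviations the strong signal dominates the error by a factor of 100, whereas with relative weighting both signals contribute equally, which is the effect described above.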

How to choose initial values

As the prior knowledge on the modeled system is very limited good initial values for the estimation of the model parameters are hard to find. We therefore chose the initial values based on the following scheme: For each node the edge (i.e., proteolytic reaction) is selected, which leads on the shortest path to the root node. For the corresponding reaction rate () we assign an initial value of . For all other incoming reactions the initial value is set to a value of . All transformation parameters () are set to .
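The selection scheme above can be sketched as follows. The sketch simplifies reactions to single (substrate, product) edges, and the function name `initial_rates` and the two numeric values in the usage lines are hypothetical placeholders; the concrete initial values are not reproduced here, so they are passed in as arguments.

```python
from collections import deque


def initial_rates(edges, root, primary_value, secondary_value):
    """Assign an initial rate to every edge (substrate, product).

    For each product, the incoming edge whose substrate is closest to the
    root gets `primary_value`; all other incoming edges get `secondary_value`.
    """
    children, incoming = {}, {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        incoming.setdefault(child, []).append(parent)

    # Breadth-first search gives the shortest distance of every node to the root.
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in dist:
                dist[child] = dist[node] + 1
                queue.append(child)

    rates = {}
    for child, parents in incoming.items():
        reachable = [p for p in parents if p in dist]
        best = min(reachable, key=lambda p: dist[p]) if reachable else None
        for parent in parents:
            rates[(parent, child)] = primary_value if parent == best else secondary_value
    return rates


# Hypothetical toy graph; 1.0 and 0.1 are placeholder values only.
edges = [("ABCD", "ABC"), ("ABCD", "BCD"), ("ABC", "BC"), ("BCD", "BC"), ("ABCD", "BC")]
rates = initial_rates(edges, "ABCD", primary_value=1.0, secondary_value=0.1)
print(rates)
```

In the toy graph, BC can be produced three ways; only the direct reaction from the root receives the primary value.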

Evaluation and Optimization of the Degradation Graph Structure

The approach to constructing the degradation graph presented above is greedy, i.e., it assumes that every signal in a spectrum that could match a subsequence of the base peptide is part of the proteolytic process and that every possible reaction occurred. This assumption is not always true. The signals could also originate from peptides with equal or at least similar masses, as we have already seen in the previous section, and these peptides do not necessarily take part in the proteolytic reactions that we want to model. We will call such peptides decoy peptides. Alternatively, we may have multiple reactions to explain the formation of a peptide where only one is true. Hence, the degradation graph may contain peptides or reactions that did not occur in the actual underlying proteolytic process. To account for this, we present a method to rank different subgraphs of an initial degradation graph with respect to their ability to explain the observed data, followed by a heuristic approach to construct a series of smaller models from the initially generated degradation graph without the need to compute every possible subgraph.

Evaluating different models

To find the degradation graph that optimally explains the observed data it is necessary to rank the different graphs. Here we describe a scoring scheme that can be used to rank the generated models.

To ease the following explanations, we introduce some further notation. Given a degradation graph, a subgraph is defined by a subset of its nodes and a subset of its edges. We also require that the subgraph is connected, i.e., for every pair of its nodes there exists a path within the subgraph that connects them. The subgraph also defines the subset of all masses, together with their associated intensities, that are explained by the subgraph.

The proposed score consists of two components. The first score component is the average Pearson correlation of the intensities predicted by the model (with estimated reaction parameters) and the actual observed data. This component should reflect the goodness of fit between the measured intensities and the computed model intensities. We compute for each explained mass the Pearson correlation between the observed intensity values and the predicted values from the model.

We then use the mean of all Pearson correlation values as measure for the goodness of fit.
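The first score component can be sketched as below. The data layout (one observed and one predicted intensity series per explained mass) and the function names are assumptions for illustration; a constant series has zero variance and would cause a division by zero in this simple version.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_score(observed, predicted):
    """Mean Pearson correlation over all explained masses."""
    values = [pearson(observed[mass], predicted[mass]) for mass in observed]
    return sum(values) / len(values)


# Hypothetical toy data: one mass is fitted perfectly, the other is anti-correlated.
observed = {1000.5: [1.0, 2.0, 3.0], 1200.0: [3.0, 2.0, 1.0]}
predicted = {1000.5: [2.0, 4.0, 6.0], 1200.0: [1.0, 2.0, 3.0]}
score = correlation_score(observed, predicted)
print(score)
```

Because the correlation is scale invariant, the first mass scores 1 even though the predicted intensities are twice the observed ones.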

The second component of the score is the part of the standard deviation of the original degradation graph that is conserved by the specific subgraph. It is computed from the standard deviations of the signals corresponding to the individual masses and reflects the ability of the subgraph to explain the important parts of the originally collected signals.

To compute a single score from these two components we build the weighted sum of both scores.

To determine good weights and we carried out several experiments on simulated data. A weight of for the correlation score and for the variability score showed the best separation of the correctly and wrongly identified models. For datasets with low quality (e.g., due to high amounts of noise or too few sampling points) weights of and have shown a good performance. For such datasets we expect a less reliable fit for the time series and therefore decreased the weighting factor for the quality of the fit.
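A sketch of the second component and of the weighted sum follows. It assumes that the conserved part is the ratio of the summed per-mass standard deviations explained by the subgraph to the sum over all masses of the original graph; this exact form, the use of the population standard deviation, the function names, and the weights in the usage lines are hypothetical placeholders.

```python
from statistics import pstdev


def variability_score(all_series, explained_masses):
    """Fraction of the summed signal standard deviation retained by a subgraph.

    all_series: dict mass -> intensity series for the original graph.
    explained_masses: masses still explained by the subgraph.
    """
    total = sum(pstdev(series) for series in all_series.values())
    kept = sum(pstdev(all_series[mass]) for mass in explained_masses)
    return kept / total


def combined_score(correlation, variability, w_corr, w_var):
    """Weighted sum of the two score components."""
    return w_corr * correlation + w_var * variability


# Hypothetical toy data: mass 3 is a flat signal and carries no variability.
all_series = {1: [0.0, 2.0], 2: [0.0, 6.0], 3: [5.0, 5.0]}
full = variability_score(all_series, [1, 2])
partial = variability_score(all_series, [2])
# Placeholder weights of 0.5 each, for illustration only.
total = combined_score(1.0, partial, w_corr=0.5, w_var=0.5)
print(full, partial, total)
```

Dropping the flat signal (mass 3) costs nothing in this component, while dropping a strongly varying signal lowers it.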

Heuristic search for the optimal graph

Constructing all possible subgraphs, generating the associated ODE systems, and estimating the corresponding reaction and transformation parameters is possible for small graphs. With increasing size in terms of the number of nodes and reactions, estimating the reaction and transformation parameters for all subgraphs becomes more computationally intensive. If we wanted to generate every possible combination of reactions, a graph with r reactions would yield 2^r possible subgraphs. Even if we filter out some of the subgraphs (e.g., those that do not contain the root node or are not connected), we would still have to consider exponentially many subgraphs. For each of these subgraphs we would then need to derive the associated ODE system and estimate the reaction and transformation parameters.

To speed up this procedure, we present a heuristic approach. Preliminary tests have shown that the presented graph score improves as the structure of the degradation graph gets closer to the original one. This can be explained by the composition of the score. The first component reflects the goodness of fit between model and observed data. It should improve if we remove peptides and reactions that do not belong to the underlying process. The second component reflects the variability of the signals. If we remove only nodes that do not participate in the reaction, i.e., whose variability is low compared to the signals of peptides that are degraded and produced, this score component should remain close to the optimal value.

Based on the construction algorithm, we know that the identified degradation graph is maximal in the sense that it contains all signals that were produced by the assumed process and possibly also parts that do not belong to the process. To find the optimal subgraph, we start by removing each terminal reaction of the graph (i.e., each reaction that produces at least one leaf) separately. For each of the resulting subgraphs we estimate the kinetic parameters as described earlier. Subsequently, we rate all subgraphs according to the criteria presented above. Then we take the best models and again remove all leaves separately. We continue with this procedure as long as we can find at least one graph whose score is among the top k of all subgraphs computed so far and that was not trimmed in a previous iteration.

With this approach we can drastically reduce the number of parameter optimizations that need to be carried out while still finding the originally embedded graph.

Preliminary tests have shown that setting k, the number of top-scoring subgraphs considered for further trimming, to either 2 or 3 is sufficient to effectively bound the number of unnecessary model evaluations while still identifying the original degradation graph.
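The trimming procedure can be sketched as a small search loop. This is an illustration under assumptions, not the authors' code: a graph is represented as a set of (substrate, product) edges, `score_fn` stands in for the full parameter estimation and scoring step (higher is better), and the toy score in the usage lines is purely hypothetical.

```python
def terminal_reactions(graph):
    """Edges whose product is a leaf, i.e., is not the substrate of any edge."""
    substrates = {parent for parent, _ in graph}
    return [edge for edge in graph if edge[1] not in substrates]


def heuristic_search(initial_graph, score_fn, k=2):
    """Repeatedly trim terminal reactions from the top-k scored graphs.

    Stops when every graph among the current top k has already been trimmed.
    Returns the best graph found and its score.
    """
    initial = frozenset(initial_graph)
    scores = {initial: score_fn(initial)}
    trimmed = set()
    while True:
        ranked = sorted(scores, key=scores.get, reverse=True)
        candidates = [g for g in ranked[:k] if g not in trimmed]
        if not candidates:
            break
        for graph in candidates:
            trimmed.add(graph)
            for edge in terminal_reactions(graph):
                sub = graph - {edge}
                if sub and sub not in scores:
                    scores[sub] = score_fn(sub)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy example: the "true" process plus two decoy reactions.
true_graph = frozenset({("ABCD", "ABC"), ("ABC", "AB")})
initial_graph = true_graph | {("ABCD", "BCD"), ("ABC", "BC")}


def toy_score(graph):
    # Stand-in for the real score: penalize every edge that differs from the truth.
    return -len(graph ^ true_graph)


best, best_score = heuristic_search(initial_graph, toy_score, k=2)
print(sorted(best), best_score)
```

In practice `score_fn` is the expensive step, since it involves one parameter optimization per subgraph; caching scores in the `scores` dictionary avoids evaluating the same subgraph twice.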

Run-time Considerations

The combination of degradation graph construction, parameter estimation and structure optimization presented above requires a considerable amount of time if the initial degradation graph is large. We therefore describe an approximation of the worst-case running time. The run time of the initial degradation graph construction is determined by the number of verifications needed. Under the assumption that we construct the complete degradation graph, i.e., all peptides are degraded in every possible way, we would create a degradation graph that contains all possible substrings of the initial peptide sequence. Since we would need to verify each of these substrings once, the running time is in the worst case bounded by the maximal number of possible substrings of the initial peptide sequence. Given a seed sequence of length n, we can construct at most n(n+1)/2 possible fragments, which could be checked in the spectrum. If we now analyze T spectra, we will have at most T · n(n+1)/2 verifications.
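The worst-case bound can be checked with a short enumeration. A sequence of length n has n(n+1)/2 non-empty contiguous substrings (counting repeated substrings separately and including the full sequence). The function names and the example sequence below are hypothetical.

```python
def count_fragments(n):
    """Number of non-empty contiguous substrings of a sequence of length n."""
    return n * (n + 1) // 2


def enumerate_fragments(sequence):
    """All contiguous substrings, one per (start, end) position pair."""
    n = len(sequence)
    return [sequence[i:j] for i in range(n) for j in range(i + 1, n + 1)]


def worst_case_verifications(n, num_spectra):
    """Upper bound on fragment checks when every fragment is verified in every spectrum."""
    return num_spectra * count_fragments(n)


fragments = enumerate_fragments("ABCD")
print(len(fragments), count_fragments(4), worst_case_verifications(4, 10))
```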

The complexity of the parameter estimation procedure can be approximated by , where is the number of time points, i.e., the number of evaluated mass spectra, and is the number of unknown parameters, i.e., the number of edges in the graph minus the number of edges connecting pseudo- and real nodes. Given this the time required for the parameter estimation will decrease with the subgraphs getting smaller. Under the assumption that even the proposed heuristic could require the computation of each subgraph, we would need to trigger optimizations in the worst case.


FAK phosphorylation sites mapped by mass spectrometry

Pablo R. Grigera, Erin D. Jeffery, Karen H. Martin, Jeffrey Shabanowitz, Donald F. Hunt, J. Thomas Parsons; FAK phosphorylation sites mapped by mass spectrometry. J Cell Sci 1 November 2005; 118 (21): 4931–4935. doi: https://doi.org/10.1242/jcs.02696

The protein tyrosine kinase, focal adhesion kinase (FAK), is a central regulator of integrin-mediated signaling (Mitra et al., 2005; Parsons, 2003; Schlaepfer and Mitra, 2004). FAK is present in focal adhesions, adhesive structures that mediate attachment of cells to the extracellular matrix (ECM) (Hynes, 1992; Lauffenburger and Horwitz, 1996). In addition, FAK is present in other adhesion structures within the cell, most notably the dynamic adhesion complexes found at the periphery of lamellipodia of migrating cells (Webb et al., 2004). FAK is essential for the turnover (e.g. breakdown) of this latter class of cell adhesions and has also been implicated in the release of adhesion at the rear of the cell (Carragher et al., 2001; Webb et al., 2004). Inhibition of FAK activity or targeted deletion of FAK expression results in cell migration defects (Ilic et al., 1995; Richardson et al., 1997a; Schlaepfer and Mitra, 2004). Increased FAK expression is a hallmark of many cancers and may contribute to the metastatic phenotype of highly malignant cells (Gabarra-Niecko et al., 2003).

Clustering of cell surface integrins, as a consequence of cell attachment to ECM proteins or treatment with certain integrin-specific antibodies, leads to the activation of FAK catalytic activity and an increase in FAK tyrosine phosphorylation (Mitra et al., 2005; Parsons, 2003). Activation of FAK results in autophosphorylation of Y397, the recruitment of Src and Src-family kinases, and the increased phosphorylation of other proteins present in the adhesion complexes, notably paxillin and p130Cas (Mitra et al., 2005). The present evidence suggests that the initiation of this tyrosine phosphorylation cascade is essential for the downstream signaling mediated by cellular adhesion complexes. In addition to being a protein tyrosine kinase, FAK appears to perform a scaffolding function, binding to a variety of membrane-associated proteins including growth factor receptors and ERM proteins. In adhesions, FAK colocalizes with and binds to the focal adhesion proteins paxillin, p130Cas and talin, as well as SH3-domain-containing GTPase-activating proteins for the Rho and Arf family of small GTPases. In addition, the phosphorylation of specific tyrosine residues appears to be important for the recruitment of SH2-domain-containing signaling proteins, including Src and Src family kinases, PI 3-kinase and Grb2 (Mitra et al., 2005; Parsons, 2003; Parsons et al., 2000).

Little is known about the spatial and temporal organization of FAK in newly formed or more stable cellular adhesions. Further, little information is available about the factors that regulate the enzymatic activation of FAK or the interaction of FAK with its binding proteins. To understand the possible role of post-translational modifications on the regulation of FAK function, we have mapped the phosphorylation sites within FAK using immunoaffinity purification of FAK and mass spectrometry (MS). In this report, we identify 25 sites of phosphorylation (15 serine, 5 threonine and 5 tyrosine residues), including a number of the serine and tyrosine phosphorylation sites previously reported. In addition, we note the juxtaposition of phosphoserine-, phosphothreonine- and phosphotyrosine-containing residues, which suggests that coordinated phosphorylation of FAK by serine/threonine and tyrosine-specific kinases may be an important aspect of regulation of FAK function.

To identify the sites of phosphorylation in FAK, MS analysis was performed on FAK immune complexes prepared from HEK293 cells expressing chicken FAK. In some cases the cells were pretreated with phosphatase inhibitors to enhance the recovery of phosphorylated residues. Through this analysis, peptides representing 90% of the FAK protein were identified. Fig. 1 shows the sequence of FAK with all of the observed phosphorylation sites marked in red. The amino acid numbering in this report is based on the chicken sequence, which differs slightly from the numbering for human or mouse FAK. The regions of the protein that were not identified by this MS analysis are shown in lower case. The phosphorylation of residues within these regions is therefore unknown. In addition, there were two instances in which the specific phosphorylated amino acids within a peptide could not be assigned by the MS/MS spectra (shown in brackets). A total of 38 phosphopeptides were identified, and their sequences were confirmed by manual inspection of the spectra (Table 1). From these peptides, we have identified the phosphorylation of 15 serine, 5 threonine and 5 tyrosine residues.

Table 1. Relative abundance of phosphopeptides identified by mass spectrometry

Columns: Residue; FAK peptide sequence (chicken); Relative abundance with (+) inhibitors; Relative abundance without (–) inhibitors
T13 DPNLNHTPSSSAK ++
S29 THLGTGMERSPGAMERVLK +++ ++++
Y155 NDYMLEIADQVDQEIALK ++++
S386/T388 QGVRSHTVSVSETDDYAEIIDEE IMAC
S386/T388 GVRSHTVSVSETDDYAEIIDEE IMAC
S386/S390 KQGVRSHTVSVSE IMAC
T388/S390 KQGVRSHTVSVSE IMAC
T388/S392 QGVRSHTVSVSETDDYAEIIDEE IMAC
T388/Y397 QGVRSHTVSVSETDDYAEIIDEE IMAC
S390 QGVRSHTVSVSETD IMAC
S390 QGVRSHTVSVSETDDYAEIIDEE ++ IMAC
S390 GVRSHTVSVSET IMAC
S390/[T394-Y397] QGVRSHTVSVSE [TDDY] AEIIDEE IMAC
[T394-Y397] QGVRSHTVSVSE [TDDY] AEIIDEE IMAC
Y397 QGVRSHTVSVSETDDYAEIIDEE ++
T406/Y407 DTYTMPSTRDYE IMAC
Y570 DFGLSRYME IMAC +
T700/S708 RMRMESRRQVTVSWDSGGSD ++++
S722/[S725-S726] DEAPPKPSRPGYPSPR [SS] EGF IMAC
S732 GFYPSPQHMVQPNHYQVSGYSGSHGIPAMAGSIYPGQASLLDQTDSWNHRPQE IMAC
S766 SHGIPAMAGSIYPGQASLL IMAC
T793 DSGTLDVRGMGQVLPTHLM IMAC
S845 RFLVMKPDVRLSRGSIE IMAC
S888 KPPRPGAPHLGSLASLNSPV ++++
S888 KPPRPGAPHLGSLASLNSPVDSYNEGVK +++
S888/S891 KPPRPGAPHLGSLASLN IMAC
S888/S891 KPPRPGAPHLGSLASLNSPV IMAC
S888/S891/S894 KPPRPGAPHLGSLASLNSPV IMAC
S888/S894 KPPRPGAPHLGSLASLNSPV +++
S888/S894 KPPRPGAPHLGSLASLNSPVDSYNEGVK +++
S894 KPPRPGAPHLGSLASLNSPV ++++
S894 KPPRPGAPHLGSLASLNSPVDSYNE IMAC
S894 KPPRPGAPHLGSLASLNSPVDSYNEGVK ++++
S894/S898 KPPRPGAPHLGSLASLNSPVDSYNEGVK ++
S894/S898/Y899 KPPRPGAPHLGSLASLNSPVDSYNEGVK IMAC
S894/Y899 SLNSPVDSYNEGVK IMAC
S911 IKPQEISPPPTANL ++++ ++++
S911 IKPQEISPPPTANLDRSNDK ++++

Each of the phosphopeptides identified from FAK is listed, with the phosphorylated residue(s) shown in red. The phosphorylated amino acids in brackets could not be distinguished. In some experiments, the cells were treated with phosphatase inhibitors before lysis. All of the cells were lysed in the presence of inhibitors. Relative phosphopeptide abundance is expressed in terms of ion counts (peak heights) observed for the most abundant charge state. Ion counts for the most abundant phosphopeptides are displayed as ++++; those that exhibit ion currents decreased by a factor of 10, 100 and 1000 are shown as +++, ++ and +, respectively. The relative abundance of peptides can only be compared within an individual experiment and, thus, peptides that were identified under different conditions may have been present at different levels in different experiments. We have indicated the highest relative amount that was observed. Peptide abundance with inhibitors (+) cannot be compared to abundance without inhibitors (–). In some cases, peptides were only observed after enrichment with immobilized metal-affinity chromatography (IMAC), indicating that these peptides were present in limited amounts.
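For use in a simulation, the abundance symbols can be decoded into relative factors following the legend above (++++ as the reference, each missing + a tenfold decrease). The sketch below is an illustration only; the names are hypothetical, and entries marked IMAC carry no abundance estimate, so they are mapped to `None`.

```python
# Relative ion-count factor per symbol, as stated in the table legend.
ABUNDANCE_FACTOR = {"++++": 1.0, "+++": 0.1, "++": 0.01, "+": 0.001}


def relative_abundance(symbol):
    """Return the relative factor for a symbol, or None if it is not a +-code (e.g. IMAC)."""
    return ABUNDANCE_FACTOR.get(symbol.strip())


# A few symbols as they appear in the table rows.
rows = [("Y155", "++++"), ("Y397", "++"), ("S766", "IMAC")]
decoded = {residue: relative_abundance(symbol) for residue, symbol in rows}
print(decoded)
```

As the legend notes, such factors are only comparable within one experiment, so they should not be mixed across the two inhibitor conditions.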

Fig. 2 shows the superposition of the detected phosphorylation sites on the domain structure of FAK. A number of features of the phospho-landscape are notable. The N-terminal FERM domain serves an autoinhibitory function (Cooper et al., 2003; Dunty et al., 2004) and as a bridge to growth factor receptors and membrane adapter proteins such as ezrin (Poullet et al., 2001; Sieg et al., 2000). Two sites of phosphorylation, T13 and S29, were observed N-terminal to the beginning of the FERM domain. Both residues are conserved in FAK from man, mouse and frog, indicating conservation of function in this region (Table 2). This region is unique to FAK and does not share sequence similarity with the related FAK kinase, PYK2/CAKβ. In the middle of the FERM domain, Y155 is a site of phosphorylation. Y155, which is conserved in FAK from man, mouse and frog, as well as in PYK2, is proximal to the previously identified site of FAK sumoylation, and phosphorylation of this residue may play a role in regulating nuclear trafficking of FAK (Kadare et al., 2003).

Table 2. Conservation of phosphorylation sites between species

Residue    Human    Mouse    Xenopus
T13        +        +        -
S29        +        +        +
Y155       +        +        +
S386       -        -        -
T388       -        -        -
S390       +        +        +
S392       +        +        +
[T394]     +        +        +
Y397       +        +        +
T406       +        +        +
Y407       +        +        +
Y570       +        +        +
T700       +        +        +
S708       +        +        +
S722       +        +        +
[S725]     +        +        +
[S726]     +        +        +
S732       +        +        +
S766       +        +        +
T793       -        -        -
S845       +        +        +
S888       +        -        -
S891       +        +        -
S894       +        +        +
S898       +        +        -
Y899       +        +        +
S911       +        +        +

The sequence of chicken FAK was aligned with the sequences from human (GenBank accession no. L13616), mouse (GenBank accession no. M95408) and the frog, Xenopus laevis (GenBank accession no. L33920) by the Clustal method using the DNAstar MegAlign program. The phosphorylated amino acids that are identical are shown as '+'. The numbering of residues was based on the sequence of chicken FAK and may vary slightly for other species.

Clustered sites of phosphorylation were observed proximal to the major site of autophosphorylation, Y397 (Schaller et al., 1994), and included five newly identified sites of phosphorylation (S386, T388, S390, S392 and T406), as well as Y407, which was previously identified as a site of Src phosphorylation (Calalb et al., 1995). S390, S392 and T406 are conserved amongst FAK proteins from mouse, man and frog, whereas S386 and T388 are chicken specific. The proximity of this cluster of phosphorylation sites to Y397 suggests that phosphorylation of these residues plays a role in the regulation of FAK enzymatic activity. Current evidence indicates that FAK autophosphorylation takes place via the dimerization of two FAK molecules (Toutant et al., 2002). The efficiency of dimerization and/or Y397 phosphorylation may be influenced by the proximal serine/threonine phosphorylations. In addition, phospho-Y397 constitutes a docking site for the SH2 domain of Src and Src family kinases (Schaller et al., 1994). It is possible that phosphorylation influences the association of FAK with Src or other binding partners that use phospho-Y397 as a docking site.

Fig. 1. Identified sites of serine, threonine and tyrosine phosphorylation in FAK. The sequence of chicken FAK (GenBank accession no. M86656) is shown with the amino acids that were identified by mass spectrometry (MS) shown in capital letters. The phosphorylated residues identified by the MS analysis are shown in red. There were two peptides in which the phosphorylated residue could not be distinguished (indicated by brackets). Because T394 was only observed within a peptide containing Y397, a known site of phosphorylation, the phosphorylation status of T394 is questionable.



