• Technology
    • Resources
  • Company
  • Collaborate
  • Careers
  • Investors
    • News & Events
      • News Releases
      • Events
      • Presentations
    • Stock Information
      • Stock Quote & Chart
    • Corporate Governance
      • Documents & Charters
      • Leadership
      • Board of Directors
      • Committee Composition
    • Financial Information
      • SEC Filings
    • IR Resources
      • FAQs
      • Email Alerts
      • Contact IR
  • Blog
  • Contact
Nautilus Biotechnology
  • Technology
    • Resources
  • Company
  • Collaborate
  • Careers
  • Investors
    • News & Events
    • Stock Information
    • Corporate Governance
    • Financial Information
    • IR Resources
  • Blog
  • Contact

Traditional protein analysis methods: Protein sequencing

November 3, 2022
Tyler Ford
LinkedInTwitter

A proteome is the full set of proteins within a biological sample like a cell, tissue, or organism. Proteomics studies all of these proteins and their interactions through various protein analysis methods.

 

The particular proteomic make-up of any biological entity, including individual protein abundances and locations, determines how that entity works at the molecular level. Thus, mapping a cell’s proteome provides mechanistic information on biological activities and makes it possible to study the mechanisms behind these activities as well as disease states in human health. Knowledge of the proteome may lead to a broad range of discoveries and applications including everything from new therapeutics to new ways to protect plants from drought.

 

Given the essential importance of the proteome, scientists have been studying it for many years using various protein analysis methods like mass spectrometry, protein sequencing and more. In this “Traditional protein analysis” series, we cover how researchers typically study the proteins that make up the proteome.

 

These protein analysis methods are not necessarily “ omics” scale technologies, which generally aim to measure all or a large amount of a particular type of biological molecule in a sample. Nonetheless, these methods form the foundations upon which proteomic analysis methods have been built. In a future series, we will dive into the technical concepts behind emerging omics technologies, which aim to enable scientists to study the full proteome.

Comparing different protein analysis methods

In the graphics portraying the technologies below, we provide qualitative assessments of proteome coverage and ease-of-use for each. Technologies with low proteome coverage, like western blotting and flow cytometry, are generally used for targeted experiments analyzing a small number of proteins at once. Medium coverage technologies like affinity arrays can look at 100’s to 1000’s of proteins in a single experiment. Truly comprehensive, high proteome coverage technologies can analyze tens of thousands of proteins at once.

 

Some protein analysis methods are easier to use than others. Low ease-of-use protein analysis technologies generally require complicated, difficult, or customized sample preparation involving a lot of hands-on experimenter time, and their data may be difficult to analyze or require bioinformatics support. High ease-of-use protein analysis technologies employ simple, standardized sample preparation, include more automation, and provide simple, data-rich outputs including protein abundance. Medium ease-of-use technologies have some mix of these attributes.

 

In addition to assessing standard metrics like these, we also discuss some of the advantages and disadvantages of each protein analysis technique. Many protein profiling technologies have specific pros and cons that make them ideal for certain applications, but not others.

 

Importantly, the technology assessments here should not be viewed as definitive. Rather, our goal is to help you think about how you can best leverage these technologies for your specific experimental goals.

The fundamental work of protein sequencing

Graphic depicting the process of protein sequencing by Edman degradation.

 

Delving into the proteome starts with exploring proteins themselves.  For instance, a scientist may find cells that carry out a particular function and want to know what protein or proteins give them that function. To figure this out, they can extract the proteins from the cells and separate the proteins in fractions based on a variety of characteristics such as size and hydrophobicity. Then, a researcher can test each protein fraction to see if it still has the function of interest. If it does, the researcher can repeat the process, making the protein fractions smaller and smaller, hopefully isolating an individual protein with the function they’re interested in.

 

Even after isolating a protein, researchers still don’t know the identity of the protein, what gene encodes it, or how that protein carries out its function. That’s where protein sequencing methods come in.

 

Proteins are composed of chemical building blocks called amino acids. There are 20 amino acids with a variety of chemical properties. These are attached to one another in a linear fashion to form full proteins, and their precise order and abundance in a protein gives the protein its specific structure and function. Protein sequencing methods can determine the order and abundance of all the amino acids that make up a protein.

 

Sequencing proteins can give scientists information about:

 

  • Protein identity – A protein’s sequence determines its identity. The sequence of amino acids in a protein ultimately determines how it folds in 3D space and gives the protein its particular shape, and function. Once researchers sequence a protein, it can be identified.

 

  • What gene encodes the protein – Individual amino acids are encoded by codons, three letter combinations of the four nucleic acids in DNA (A, G, C, and T). If researchers know the amino acid sequence of a protein, they can search through the genome of the organism the protein came from to find any DNA sequences (genes) capable of encoding that protein. Knowing the gene and its location in the genome may provide clues about the production of the protein. Knowing the gene that encodes a protein also enables researchers to direct cells to produce that protein. Rather than go through an arduous purification process every time they want to study a protein, with a gene in hand, researchers can instead instruct cells to produce the protein in large amounts or a in a way that makes that protein much easier to purify.

 

  • Protein function – Knowing a protein’s sequence gives researchers more information about its function. In addition, computational methods for determining a protein’s structure based on its amino acid sequence have improved phenomenally over the last couple of decades (Jumper et al 2021). Using modern methods, scientists can compare amino acid sequences and structures across proteins to make hypotheses about protein function with confidence.

 

Protein sequencing by Edman degradation

So how do researchers determine protein sequences? Traditionally, protein sequencing is done with Edman degradation. In this protein sequencing method, a purified protein is incubated with a chemical that attaches to an amino acid on one of its ends. Manipulating reaction conditions causes this chemical and the terminal amino acid to break off. The chemical-terminal amino acid compound can then be extracted from the rest of the protein.

 

Researchers can use a variety of analytical techniques to determine which of the 20 amino acids is contained in the extracted compound. Repeating this process many times over reveals the full amino acid sequence of a protein and thus its identity.

 

The Edman degradation protein sequencing method is very slow and can only be used to sequence short fragments of full proteins called peptides. Nonetheless, by breaking apart a purified protein and sequencing its component peptides, researchers can determine the full sequence of the original protein.

 

Advantages of Edman degradation

  • Novel protein sequencing – Using Edman degradation, researchers can sequence newly purified proteins without any prior information about them.

 

Disadvantages of Edman degradation

  • Proteome coverage – Before sequencing proteins using Edman degradation, researchers must first isolate them. This makes it difficult to sequence many proteins at once. In addition, it’s only possible to sequence one protein at a time in a single reaction. Finally, certain post-translational modifications of the peptide sequence can interfere with the chemical reactions involved in Edman degradation, limiting its use on some fraction of the proteome.

 

  • Peptide-based inference of protein identity – This technique infers a protein sequence by compiling the sequences of multiple component peptides. Such inferences can be inaccurate if multiple protein species in a sample are composed of similar peptides. 

 

  • Low throughput – The Edman degradation protein sequencing process is slow. Automation can speed up the process, but it will still take about a day to get a full protein sequence.

 

  • Low ease-of-use – Preparing proteins for Edman degradation can be time consuming and difficult. While the Edman degradation process can be automated, data analysis can be difficult because many peptides have similar sequences and it may not be clear how they should be combined into a full protein sequence.

 

Researchers are currently developing variations on Edman degradation to sequence many proteins at once. While Edman degradation is slow, one of its key advantages is it does not require prior knowledge of a protein’s sequence to determine its full identity.

 

Antibodies, on the other hand, require known, well-validated protein targets to be useful for protein identification. And, as you’ll learn in the next post in this series, mass spectrometry relies on prior knowledge of proteins to determine protein sequence as well. Thus, while protein sequencing may be a bit old school, it’s still very useful when researchers have limited information!

 

Browse by topic

  • Nautilus in the News
  • Events
  • Research and collaborations
  • Nautilus and our Platform
  • Proteomics
  • Life at Nautilus
  • Industry Interviews
SUBSCRIBE
Back To Blog

Nautilus Biotechnology

 

Corporate Headquarters
2701 Eastlake Avenue East
Seattle, WA 98102

 

Research Headquarters
835 Industrial Road, Suite 200
San Carlos, CA 94070

 

     

+1 (206) 333-2001

© 2023 Nautilus Biotechnology Inc.
All Rights Reserved.

Privacy Policy | Terms of Use

We use cookies in order to continually improve your experience on our website. By clicking “Accept cookies” or clicking on any content on our site, you are consenting to our use of cookies as described in our Privacy Policy. You can opt-out if you wish. Accept Reject All
Manage consent

Privacy Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may affect your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. These cookies ensure basic functionalities and security features of the website, anonymously.
CookieDurationDescription
cookielawinfo-checkbox-analytics11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional11 monthsThe cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy11 monthsThe cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.
Functional
Functional cookies help to perform certain functionalities like sharing the content of the website on social media platforms, collect feedbacks, and other third-party features.
Performance
Performance cookies are used to understand and analyze the key performance indexes of the website which helps in delivering a better user experience for the visitors.
Analytics
Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.
Advertisement
Advertisement cookies are used to provide visitors with relevant ads and marketing campaigns. These cookies track visitors across websites and collect information to provide customized ads.
Others
Other uncategorized cookies are those that are being analyzed and have not been classified into a category as yet.
SAVE & ACCEPT