What is the proteome?

Tyler Ford

March 16, 2023

The proteome is the collection of all the proteins inside a cell, organism, or biological sample. The proteome is to proteins what the genome is to genes. Because proteins are so fundamental to life, studying the proteome means studying the inner workings of life itself.

Scientists typically study the proteome using techniques like mass spectrometry or traditional protein sequencing, but much of the proteome remains unexplored. Next-generation proteomics technologies will uncover far more of the proteome, vastly expanding what’s possible. We call this coming wave of proteome exploration and application the proteomics revolution.

Human body with proteins from various organs highlighted next to a definition of “The Proteome.”

What is the proteome?

A proteome is all the proteins in a biological sample. A researcher might want to understand the proteome of a single cell, a tissue sample, an individual organism, or an entire species. So, how is the proteome defined? The exact definition of the proteome – and the size of it – depends on the scale of your investigation.

Nonetheless, proteomes are generally huge and complex, making them a challenge to study. A single cell can contain billions of proteins, some of which are entirely different from each other, while others are variations on the same gene product (called proteoforms). But understanding the proteome is worth it: By studying the proteome of a cell, you can learn not only which proteins you have and in what amounts but also dig into how the cell works.

History of proteomics and the proteome

It first became possible to study the proteome in 1975, when a technique known as two-dimensional gel electrophoresis was invented. This proteomics technique separates proteins in two dimensions, such as vertically and horizontally, with each dimension corresponding to a different property (typically isoelectric point in one dimension and molecular mass in the other).

Early proteome studies included analyses of the proteomes of mice, guinea pigs, and Escherichia coli bacteria. These first studies were groundbreaking in that they allowed researchers to begin to understand what proteins existed inside these organisms, but they were fairly limited in capability.

More than twenty years later, around the turn of the century, mass spectrometry began to be used more heavily for proteomics research, allowing researchers to break proteins apart to study them and to work with mixtures of many proteins at once. This marked another turning point in the history of the proteome, greatly expanding what could be studied.

The term “proteomics” itself wasn’t coined until 1995, in a paper studying the proteome of a species of bacteria. Similarly, the word “proteoform” was first proposed in a commentary in Nature Methods in 2013 as a single term encompassing all of the variations of a protein.

Tools for studying the proteome

There are many different approaches to studying the proteome, including broadscale proteomics techniques (often called discovery proteomics) that sample much or all of the proteome of a sample, as well as targeted proteomics approaches that selectively study just a subset of the proteome of a cell, sample, or organism. Soon, new next-generation proteomics tools may enable far more in-depth and high-resolution studies of the proteome, fully unlocking all of the proteins in a sample.

One of the most common ways to study the proteome is with mass spectrometry. This proteomics technique typically involves breaking proteins apart into smaller pieces called peptides, which are then ionized and run through a mass spectrometer. The ions are separated by their mass-to-charge ratio, which lets scientists identify them and match them to known proteins.
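To make the mass spectrometry workflow a little more concrete, here is a minimal Python sketch of the "break apart, then weigh" idea. It assumes digestion with trypsin (a commonly used enzyme that cuts after lysine or arginine unless proline follows) and uses standard monoisotopic residue masses. The example sequence and the function names are made up for illustration, and real instruments and search software do far more than this.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water is added per intact peptide
PROTON = 1.007276  # each charge adds one proton in positive-ion mode


def tryptic_digest(protein):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mass_to_charge(peptide, charge):
    """m/z of the peptide carrying `charge` extra protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to show the cleavage rule.
    for pep in tryptic_digest("MKWVTFRPLLK"):
        print(pep, round(mass_to_charge(pep, 2), 4))
```

In practice, the measured mass-to-charge values are compared against values predicted in this way from protein sequence databases, which is how peptides get matched back to known proteins.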

Another common way to explore the proteome is with antibodies and other affinity reagents. These molecules bind to specific proteins. When they are combined with fluorescent labeling or other detection techniques, scientists can see which proteins are present by observing which have been bound. Affinity reagents can also show how much of a specific protein is in a sample, and where that protein is.

Researchers also use a technique called Edman degradation to study the proteome. It’s a form of protein sequencing that strips off the amino acids at one end of a protein (the N-terminus) one by one, identifying each in turn. Cycle by cycle, this shows scientists what sequence of amino acids makes up a protein, which reveals its identity and can be used to model its structure.
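The cycle-by-cycle logic of Edman degradation can be sketched in a few lines of Python. This is a toy model only: it assumes every cycle removes and identifies exactly one residue without error, which real chemistry does not guarantee, and the function name and example peptide are invented for illustration.

```python
def edman_cycles(peptide, max_cycles=None):
    """Simulate idealized Edman degradation.

    Each cycle removes the N-terminal residue, records it, and leaves
    the shortened peptide for the next cycle. Returns a list of
    (cycle_number, residue_identified, remaining_peptide) tuples.
    """
    n_cycles = len(peptide) if max_cycles is None else min(max_cycles, len(peptide))
    remaining = peptide
    calls = []
    for cycle in range(1, n_cycles + 1):
        residue, remaining = remaining[0], remaining[1:]
        calls.append((cycle, residue, remaining))
    return calls


if __name__ == "__main__":
    # Hypothetical five-residue peptide, read for three cycles.
    for cycle, residue, remaining in edman_cycles("GIVEQ", max_cycles=3):
        print(f"cycle {cycle}: identified {residue}, remaining {remaining}")
```

Joining the residues identified in each cycle gives the sequence read from the N-terminus, which is why the method reports one amino acid at a time.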

From the genome to the proteome

For years, scientists have been asking, “What makes a living cell tick?” If you peer inside a cell, you’ll find lipids, carbohydrates, nucleic acids, and proteins – the four macromolecules that form the building blocks of life. All four combine to form the various structures in cells, from their outermost membranes to their innermost machinery. But when it comes to the mechanisms that make cells function, it’s proteins that get work done.

The proteins a cell can produce are determined by its genome. This DNA contains the blueprints for life in the form of the genetic code. Inside a cell, DNA is transcribed into RNA, which is in turn translated into proteins. This process of moving from DNA to RNA to protein is so foundational to all life that it’s referred to as the central dogma of biology.
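The flow from DNA to RNA to protein can be illustrated with a short Python sketch. It uses the standard genetic code and makes several simplifying assumptions: the input is the coding strand of the DNA, translation starts at the first base, and splicing and other processing are ignored. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """DNA coding strand -> mRNA (same sequence with T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCAAGTAA"  # hypothetical 12-base coding sequence
    mrna = transcribe(dna)
    print(mrna, "->", translate(mrna))
```

The mapping runs in one direction only: a DNA sequence predicts which protein sequence can be made, but not whether, when, or how much of that protein a cell actually produces.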

Ever since the Human Genome Project completed its first sequence 20 years ago, advances from genetic and genomic research have made it easier to learn about every component of the central dogma, including proteins. But truly effective tools for studying all the proteins in the last component of that dogma, the proteome, are still lacking. While we can comprehensively sequence DNA and RNA, scientists are still unable to see the full scope of the proteome.

The Human Proteome Project

With increasing interest in proteomics, we may soon see the completion of a Human Proteome Project to complement the Human Genome Project. This extraordinary undertaking is organized by the Human Proteome Organization (HUPO). The goal of this international effort is to reveal the function of every protein. As of March 2022, researchers working on the project have reported finding 18,407 proteins, and estimate they have 6.8% of the human proteome to go.

Characterizing each human protein in this way will supply an important framework for proteome research but is just the first step to understanding the human proteome, like having an encyclopedia entry about an animal in hand before setting out to study its ecology. To build upon this baseline knowledge of the proteome, researchers will need new technologies that allow them to understand how the proteome varies between cells and organisms, between health and disease or stress, in response to drug treatments, and much more.

Size of the human proteome

The size of the proteomes of different organisms can differ widely. Much work to date has gone into studying the human proteome to discover proteins involved in various biological pathways, diseases, and conditions. Despite all that research, there is still some debate over the actual size of the human proteome.

The question isn’t very straightforward. There are around 20,000 protein-coding genes in the human genome. Yet there are far more than 20,000 protein variants in the human body. That’s because mRNA transcripts can be modified before they are translated into proteins, and because proteins themselves can be altered after they are translated, creating many proteoforms. Altogether, there are likely millions of human proteoforms.
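A quick back-of-the-envelope calculation shows how about 20,000 genes could give rise to millions of proteoforms. The Python sketch below multiplies the number of transcript variants by the number of states at each modifiable site. All of the numbers are hypothetical, and the calculation assumes every site is modified independently, so it gives a theoretical upper bound and not a count of proteoforms that actually occur.

```python
from math import prod


def max_proteoforms(n_transcript_variants, states_per_site):
    """Upper bound on proteoforms from a single gene.

    n_transcript_variants: distinct mRNA variants of the gene
    states_per_site: number of possible states at each modifiable
        site, e.g. 2 for "modified or not"
    Assumes all combinations are possible, which overestimates
    what real cells produce.
    """
    return n_transcript_variants * prod(states_per_site)


if __name__ == "__main__":
    # Hypothetical gene: 3 transcript variants and 5 sites that are
    # each either modified or unmodified.
    per_gene = max_proteoforms(3, [2] * 5)
    print(per_gene)
    # Scaled naively to roughly 20,000 protein-coding genes.
    print(per_gene * 20_000)
```

Even with these modest made-up inputs, the combinations per gene multiply quickly, which is why estimates of the total number of human proteoforms run into the millions.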

Scientists are beginning to try to study all of those proteins, but it’s a difficult task. For example, the Human Proteome Project in 2020 released a blueprint of the human proteome covering a little more than 90% of the proteins that genes encode. A 2021 update notes that they’ve slightly increased that number to 92.8%, or 18,357 of the proteins predicted by genomic studies of protein-producing genes.
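The coverage figures can be turned around to estimate how many predicted proteins remain unobserved. The short Python sketch below does that arithmetic using only the 2021 numbers quoted above. Because the percentage is rounded, the results are approximate, and the function name is just for illustration.

```python
def implied_total(found, percent_covered):
    """Total predicted proteins implied by a count and its coverage."""
    return found / (percent_covered / 100)


if __name__ == "__main__":
    found = 18_357   # proteins reported in the 2021 update
    coverage = 92.8  # percent of predicted proteins covered
    total = implied_total(found, coverage)
    print(f"implied total: about {total:.0f}")
    print(f"still missing: about {total - found:.0f}")
```

By this estimate, a little over 1,400 of the roughly 19,800 predicted proteins were still unaccounted for at the time of that update.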

Even with those efforts, there are still millions of proteoforms to account for, meaning that studies of the human proteome are far from finished. Importantly, many studies get only crude measurements of protein abundance, and more quantitative proteomics techniques must be used to measure protein levels in different kinds of human tissue. Quantifying protein levels more accurately and precisely is particularly important for understanding human biology and combating disease. That kind of work is only getting started.

Next-generation proteomics tools could significantly enhance this type of research. These technologies aim to provide researchers with accessible means of quantifying every protein in a sample quickly. They have the incredible potential to improve our mechanistic understanding of all facets of biology.

Applications of proteomics

Unlocking the proteome is as fundamental to biology as understanding DNA and RNA. This means proteomics applications span all sectors that deal with cells and life in general. Furthermore, by integrating data from genomics (DNA), transcriptomics (RNA), and proteomics (proteins), the burgeoning field of multiomics is bringing with it an unparalleled depth of biological understanding.

One of the most valuable things studying the proteome adds to the –omics space is mechanism of action. For instance, the chain of events from a mutation in a gene to a disease phenotype is complex, relying on interactions between the genome, other parts of the cell, and the environment. Proteins directly carry out the vast majority of these interactions and perform most cellular functions. Thus, studying proteins can show us the mechanisms through which mutations, drugs, toxins, and more cause changes in biological function. Genomics shows us what proteins a cell could possibly make, but proteomics shows us what proteins are active now, how many of them there are, where they are, and, with specialized techniques, what proteins and other molecules interact with one another. Adding insights from proteomics connects the dots between environment, DNA, RNA, proteins, and biological function.

Even if DNA encodes the architecture for these interactions, the proteins are the last step in this chain and the most closely linked to how cells, tissues, and organisms behave. This means studying proteins gives us the clearest views into real-time biology, and manipulating proteins can have direct impacts on processes across all levels of biology.

Lessons from the proteome

Some of the most immediate applications of proteomics are in health. Measuring the proteins in both healthy and diseased cells increases our understanding of disease mechanisms and provides potential protein biomarkers for diagnosis as well as targets for treatment.

Importantly, proteins are generally easier to target with drugs than DNA or RNA, so knowing what proteins are directly involved in a disease provides researchers with clear paths to therapeutic development.

For example, an understanding of the proteomes of cancer cells can enable the development of precision medicines that target the mechanisms controlling tumor growth. In addition, many think diseases like Alzheimer’s are caused by a buildup of proteins. A better understanding of the proteome could help researchers identify biomarker proteins that are indicative of the early stages of this buildup and could be targeted to either prevent or reverse this process.

Some research is already using discovery proteomics to find protein networks involved in Alzheimer’s and to link genes and proteins in different kinds of neurological disease. This proteomics research and more may lead to novel treatments for Alzheimer’s and other neurological conditions.

Unleash the proteome

We are excited to be at the forefront of efforts to unleash the proteome and bring proteomics applications like those described above to the researchers, doctors, and patients who need them. To learn how Nautilus is driving a revolution in proteomic analysis, subscribe to the Nautilus blog and never miss an update on the upcoming #ProteomicsRevolution.
