Do You Know How to Fold Right?

Youthful Aging Depends on
Proper Protein Folding

Protein misfolding is strongly associated with
age-related degenerative diseases
By Will Block

Surprisingly, there are no humorous quotations on the
subject of protein folding. You’ll have to make up your own.

If you’re skeptical about the importance of this strange-sounding topic, consider the following quotation from a recent paper by the distinguished scientist Richard I. Morimoto, who is a professor of biochemistry, molecular biology, and cell biology and director of the Rice Institute for Biomedical Research at Northwestern University:1
Aging and stress, stress and aging—these two human conditions, when paired, can profoundly affect the quality of life. When events go awry, molecular processes take place that, over time, can lead to neurodegenerative disease. At the root of the problem is a fundamental process: protein folding. . . . When proteins misfold, they can acquire alternative proteotoxic states [proteins becoming toxic] that seed a cascade of deleterious molecular events resulting in cellular dysfunction. When these events occur in neurons, the consequences can be devastating. . . . Collectively, these observations provide support for the hypothesis that graceful aging depends on the cell’s ability to counter the effects of stress by maintaining protein folding, which in turn permits appropriate protein function.

Now do you think protein folding is worth reading about? You do want to age gracefully, don’t you? Of course, to understand the significance of protein folding and how it can go awry (leading to a variety of inherited and age-related diseases, not just neurodegenerative ones), you have to know a few things about proteins and how they fold. In this article, we’ll talk about that subject. Next month, we’ll talk about a way in which the damage done by protein misfolding might be mitigated. You’ll understand that article better if you’ve read this one first.

Proteins Are Not Simple

You’re probably not going to believe this, but here goes anyway: If you were to make small chains consisting of only five amino acids (linked like beads on a string), but using all possible combinations of the 20 different amino acids of which our proteins are composed, then the number of possible three-dimensional molecular configurations arising from all those five-unit chains would be 104,857,600,000! (Ripley, where are you when we need you?)

The number is that large because: (1) there are 3,200,000 different ways (20 x 20 x 20 x 20 x 20) in which a 5-unit chain can be made from among the 20 amino acids;* and (2) the number of different configurations that are possible when those chains twist and turn and fold in on themselves is 32,768 for each chain (based on certain molecular-geometric considerations). And that’s for only five amino acids! With each additional amino acid in the chain, the number of possible configurations increases exponentially.

*You might be thinking that some of those ways are identical to each other and should not be counted twice (good thinking!). But, for technical reasons having to do with the directional nature of the chemical bonding in amino acid chains, they are not identical, so the full count is valid.
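
For the skeptics, the arithmetic is easy to check. Here is a minimal Python sketch that reproduces the two figures. It takes the 32,768 folds per chain as given rather than deriving it, and the variable names and the example chain are ours, chosen purely for illustration.

```python
# Check the five-amino-acid arithmetic from the text.
AMINO_ACIDS = 20          # distinct amino acids in our proteins
CHAIN_LENGTH = 5          # units in the example chain
FOLDS_PER_CHAIN = 32_768  # figure quoted in the text, taken as given

sequences = AMINO_ACIDS ** CHAIN_LENGTH
configurations = sequences * FOLDS_PER_CHAIN

print(f"{sequences:,} possible sequences")
print(f"{configurations:,} possible configurations")

# Chains are directional, so a sequence and its mirror image are counted
# separately. One-letter codes: G = glycine, A = alanine (example only).
forward = "GAAAA"
backward = forward[::-1]
print("forward and backward chains differ:", forward != backward)
```

The last three lines illustrate the footnote's point: read from one end, the chain starts with glycine; read from the other, it ends with it, and chemically those are two different molecules.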

Most actual proteins consist of several hundred linked amino acids, but some have thousands, and the largest known protein (named titin) contains 26,926 amino acids in one looong chain. All protein chains are synthesized in our cells from information encoded in our genes, which are specific segments of another polymeric molecule, DNA. After the synthesis (sometimes during the synthesis), the protein chains fold themselves into their ultimate, optimal, compact, three-dimensional configurations, as if by magic. (There is, of course, no magic in the natural world, just things that are really hard to understand.)

When 1 Million Is a Tiny Number

For a small protein consisting of only 100 amino acids, the number of possible sequences of the amino acids in the chain (ignoring the incomparably larger number of possible three-dimensional configurations arising from those sequences) is 20^100 (20 to the 100th power), or 1.3 x 10^130. That number is 1 trillion trillion trillion trillion times greater than the estimated number of elementary particles in the known universe (about 10^82). It’s even greater than the number of times Congress has screwed things up!
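
If numbers this size make your calculator give up, Python won't: its integers are exact at any length. The sketch below recomputes the comparison, using the particle estimate quoted in the text as given. The variable names are ours.

```python
# Compare the number of 100-unit sequences with the particle estimate.
sequences = 20 ** 100   # exact; Python integers have no size limit
particles = 10 ** 82    # estimate quoted in the text, taken as given

ratio = sequences // particles  # how many times larger

print(f"sequences: about {sequences:.1e}")
print(f"ratio:     about {ratio:.1e}")
print("digits in the sequence count:", len(str(sequences)))
```

The ratio comes out a little over 10 to the 48th power, and since a trillion is 10 to the 12th, that is the "trillion trillion trillion trillion" of the text.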

Obviously, the number of different kinds of proteins that actually exist is infinitesimal by comparison. In humans, it’s estimated to be at least 1 million.2 That may surprise those who know that the current best estimate of the total number of genes in the human genome is only about 20,000 to 25,000. What happened to the well-known “one gene, one protein” rule? That rule was undercut by discoveries of several mechanisms by which many different (but related) proteins can arise from a single gene. Most of the diversity, however, is the result of chemical modifications that occur after the protein has been synthesized.

How Do Proteins Fold?

Proteins are so complex that almost everything about them poses huge challenges for the chemists and biologists who seek to understand their structures—and thus their functions—in detail.* (If you’re not too familiar with the basics of protein structure, now would be a good time to read the sidebar “A Protein Primer.”)

*By comparison with proteins, the other major classes of biological polymers—nucleic acids (DNA and RNA) and polysaccharides (complex carbohydrates, such as starch and glycogen)—are simple molecules whose structures are relatively easy for chemists to understand.

A Protein Primer

Not counting the water, about half of your body weight consists of proteins. These diverse and dazzlingly versatile compounds constitute much of the machinery of life, providing both structural stability and functional capability for the myriad workings of your body. They are the product of life’s master molecule, DNA, whose primary role is to be a genetic “blueprint” for your proteins; it does this by encoding their amino acid sequences in its molecular structure.

All of the biochemical and biomechanical wonders performed by proteins come about as a result of their molecular structures, most of which are exceedingly complex. To help our understanding, we can view these structures at four levels, which develop in sequence when proteins are synthesized in our cells:

  • Primary structure – This is simply the linear sequence of amino acids in the chain, as determined by the gene encoding that protein. A mutation in the gene may alter the amino acid sequence of the protein, resulting in functional changes that can range from inconsequential to fatal.
  • Secondary structure – This is the occurrence, along some segments of the chain, of localized, three-dimensional amino acid structures that are held together by a type of weak chemical bond called a hydrogen bond, whose importance to the existence of life cannot be overstated. The two most common of these protein “scaffold” structures are: (1) the α-helix, a spiral configuration of the amino acid chain; and (2) the β-pleated sheet, a configuration in which the chain loops back and forth upon itself a number of times so that parallel segments of it lie side-by-side in a pattern whose atomic landscape resembles pleats (which run perpendicular to the segments). The prediction and discovery of these structures by Linus Pauling in 1951 was a pivotal event in the history of chemistry.
  • Tertiary structure – This is the protein’s ultimate three-dimensional configuration, the result of the amino acid chain’s intricate twisting and turning and folding in upon itself until it settles into a stable molecular structure, usually more or less globular or ellipsoidal in overall shape (many others, however, are long and fibrous). The process is called intramolecular self-assembly—or simply folding—and it’s enormously complex. Despite the virtually infinite number of ways in which the chain could fold, it almost invariably folds to the one “correct” structure, i.e., the one that we know (from experimental evidence) to be the normal structure of the protein under the physiological conditions in question. This seemingly magical ability to select the one correct structure from an infinity of choices is a consequence of the laws of chemical thermodynamics, which determine the outcomes of all chemical processes and, therefore, of all biological processes.
  • Quaternary structure – This represents the special case in which fully formed protein molecules—sometimes of the same kind, sometimes of different kinds—become bonded to each other to form a supramolecular complex with a definite overall structure. This important process occurs naturally with many proteins, yielding complexes (e.g., hormone and neurotransmitter receptors in our cell membranes) that are essential for life.

[Figure: A protein’s primary structure (amino acid chain, left) folds itself spontaneously into its tertiary structure, or native state (right), which incorporates the two main types of secondary structure: α-helix (coiled segments) and β-pleated sheet (ribbon segments).]

With proteins, structure determines function. In most cases, it’s not so much the overall molecular structure that determines the protein’s function as it is the detailed atomic force fields associated with various features of its irregular surface—the protuberances, depressions, clefts, and channels that give each protein its unique form. All of these features result from the protein’s folding pattern, and all of them influence, in one way or another, the ways in which the protein interacts with the thousands of other kinds of molecules in the chemical “soup” of our cells and body fluids.

The interactions occur via different kinds of interatomic and intermolecular forces that are governed by the laws of quantum mechanics, and they’re of the same kinds that hold the protein together in its folded configuration in the first place. The mathematical analysis of these interactions in proteins is very difficult because of the large numbers of atoms, their complex geometric arrays, and the varied (and variable) force fields associated with them.

One of the major challenges (first tackled by Linus Pauling in 1936) is to understand how the extremely long, flexible amino acid chains fold in on themselves to achieve their ultimate configurations, each of which is uniquely determined by the exact amino acid sequence. Why they fold is well understood—it’s the how that is devilishly difficult to understand. (For different proteins, by the way, the folding process can take anywhere from microseconds to hours—a roughly 1 billion-fold range—depending on various factors.)

Even more challenging is to predict, from an amino acid sequence alone, how a given chain will fold into its final, correct configuration, called the native state. This requires a deep knowledge of molecular physics and vast computational power in the form of supercomputers or distributed computing for performing the calculations involved. (To learn how you—yes, you!—can get involved in this endeavor, see the sidebar “Fun and Games with Protein Folding.”)

Fun and Games with Protein Folding

Would you like to make a contribution to science without even being a scientist? It’s amazingly easy—except for your personal computer, which will have to do all the work. All you have to do is own a PC, have an Internet connection, and sign up for the Folding@Home project developed at Stanford University’s chemistry department. And there’s no final exam! Here’s how it works.

[Figure: A Human Proteome Folding Project screensaver.]

Ever heard of distributed computing? It’s a strategy that scientists sometimes use when dealing with hugely complex computational problems (such as protein folding) requiring long amounts of run time even when done on modern supercomputers. Instead of having one supercomputer grind away at the problem for a very long time (at great cost), the problem is divided mathematically into many small, manageable chunks that can be distributed to ordinary PCs—many of them—belonging to regular folks like you. The idea is to make up—in very large numbers of PCs, running simultaneously—for the small amount of computing power that each one can bring to the task.

As a PC owner, you sign up for the program, download an assigned chunk of the problem (with the requisite software for solving it) and let your PC have at it. It does the work while its central processing unit (CPU) is not engaged in enabling you to surf the Net, write letters to the editor, e-mail jokes to your friends, or whatever it is you do. Actually, the CPU can be working on the folding problem in the background while you’re doing those things, because most CPU capacity is unused most of the time, even when your computer appears to be busy.

For you it’s a no-brainer—literally—yet it helps scientists pursue worthy objectives efficiently at almost no cost. And when all the chunks of computed data are sent back and assembled to achieve the end result, you can share (a wee bit) in the credit. It’s a win-win situation!
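
To make the divide-and-assemble idea concrete, here is a toy Python sketch. It is emphatically not how Folding@Home itself works: the scoring function is a made-up stand-in for real folding physics, threads on one machine stand in for volunteers' PCs, and every name in it is hypothetical.

```python
# Toy model of distributed computing: split one big job into chunks,
# let each "volunteer PC" handle a chunk, then assemble the results.
# The scoring function is invented for illustration, not real physics.
from concurrent.futures import ThreadPoolExecutor


def score(candidate: int) -> int:
    """Hypothetical 'energy' of one candidate fold (lower is better)."""
    return (candidate * 7919 + 1234) % 10_007


def best_in_chunk(chunk: range) -> tuple[int, int]:
    """One volunteer's work: the lowest-scoring candidate in its chunk."""
    return min((score(c), c) for c in chunk)


def split(total: int, chunk_size: int) -> list[range]:
    """Divide candidates 0..total-1 into small, manageable chunks."""
    return [range(start, min(start + chunk_size, total))
            for start in range(0, total, chunk_size)]


chunks = split(total=100_000, chunk_size=5_000)

# Four threads play the part of four volunteers working simultaneously.
with ThreadPoolExecutor(max_workers=4) as volunteers:
    partial_results = list(volunteers.map(best_in_chunk, chunks))

# The coordinator assembles the partial answers into the final one.
best_score, best_candidate = min(partial_results)
print(len(chunks), "chunks processed; best candidate:", best_candidate)
```

The point of the design is that no volunteer needs to see the whole problem. Each one returns a small answer for its own chunk, and the final result is the same as if a single machine had ground through all 100,000 candidates.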

If you like the idea, Google the term Folding@Home (there’s a Wikipedia article about the project, of course), or go directly to the project’s website, where you can read all about it and join the “fold,” i.e., the hundreds of thousands of other folks who have made this one of the world’s largest distributed computing projects. The goal of the project is “to understand protein folding, misfolding, and related diseases.”* It’s very worthy.

*Folding@Home is not the only distributed computing project devoted to protein folding. Some others are Rosetta@home, Predictor@home, Human Proteome Folding Project, and TANPAKU. For links, see the article “Protein structure prediction” on Wikipedia. And for projects in many other fields, see the article “List of distributed computing projects.”

Now here’s a kicker: while your computer’s CPU is slaving away in the background, you could be playing an online video game called Foldit, which is about . . . protein folding! (This is one game where folding is a good thing.) Playing the game will help a different group of scientists, at the University of Washington. Their objective is to see whether human beings’ innate pattern-recognition and puzzle-solving abilities are superior to those that have been programmed into computers used for studying protein folding. If this turns out to be true, the scientists hope to be able to teach the computers better methods for attacking this monster problem, thereby saving time and money.

To find out how to play, Google Foldit or go directly to the game’s website, where you can read all about it. Let the games begin!

Protein Misfolding Can Cause Disease

The proper folding of proteins is indispensable to life as we know it. Understanding how it occurs could shed new light on some of the basic mechanisms of life and could pave the way for a better understanding of diseases and how to treat them. By the same token, understanding how protein folding can go awry through misfolding (incorrect folding) could illuminate many aspects of life, disease, and death, showing us how better to preserve life, prevent disease, and delay death.

Recall that a genetic mutation can lead to a protein with an abnormal primary structure and probably, therefore, an abnormal tertiary structure, which could cause disease. Similarly, misfolding in a normal protein’s tertiary structure could cause disease. Here’s another analogy: just as an error in the folding of an intricate origami figure could ruin the whole thing, a misfolding error in a protein could cause disease or death.

Misfolded proteins are implicated in a number of diseases, including cancer, cystic fibrosis, emphysema, chronic liver disease, hypercholesterolemia, nephrogenic diabetes insipidus, and a host of neurodegenerative diseases, such as Alzheimer’s, Parkinson’s, and Huntington’s diseases, amyotrophic lateral sclerosis (ALS, Lou Gehrig’s disease), and Creutzfeldt-Jakob disease (the human analog of bovine spongiform encephalopathy, or mad cow disease). Because these diseases are due in part to protein misfolding—a pathological change in the protein’s conformation—they are called conformational disorders.

Are Proteins Cotton Candy or Steel Wool?

But what causes misfolding? Age, for one thing. Like many other processes that maintain health and vitality, protein folding seems to deteriorate with age, for reasons that are not well understood. On the other hand, some causes of misfolding are well known. A common one is a temperature that is too high (or, sometimes, too low) for the protein’s zone of thermal stability. Others include: too high or too low a pH (acidity level); excessive radiation exposure; too high or too low a concentration of some cellular component; the presence of harmful reactive oxygen species, including free radicals; and disruptive mechanical forces acting on the protein.

Yet another cause is the presence of denaturants. These are compounds, such as urea (a waste product of nitrogen metabolism), that can make the protein’s tertiary structure unravel, partially or totally. Denaturation is often irreversible, but not always. (For more on this topic, with a “ball-of-spaghetti” view of protein folding, see the sidebar “How to Unboil an Egg” in the article “Clear Eyes with N-Acetylcarnosine” in the December 2006 issue.)

Most proteins are very sensitive to outside influences such as those mentioned above, because the forces that hold their tertiary structure together—primarily hydrogen bonds and a variety of weak, nonbonding interactions—are, for the most part, much weaker than conventional chemical bonds. Think of a protein as a ball of cotton candy rather than a ball of steel wool, and you’ll have a pretty good idea of its vulnerability.*

*Another thing to realize is that, like all other molecules, proteins are not static entities with rigid molecular structures. On the contrary, they’re highly dynamic entities, seething with motion in the form of atomic vibrations in all their chemical bonds, as well as rotations of groups of atoms about the axes of many of their bonds. Although their energy levels can change, these vibrations and rotations never cease, because they’re quantum mechanical in nature.

Chaperones Are There to Help

[Figure: The protein hexokinase, seen at atomic resolution (minus the hydrogen atoms for clarity). The white, red, blue, and yellow atoms are carbon, oxygen, nitrogen, and sulfur, respectively. Seen at the upper right, to scale, are the substrate molecules ATP (the three orange atoms are phosphorus) and glucose, whose reaction is catalyzed by hexokinase, deep within the cleft between the two large lobes.]

On the other hand, the very susceptibility of proteins to outside influences presents an opposite opportunity. Just as there are chemical agents that can harm our proteins, there are others that can help protect them from harm or rescue them from harm that has already been done.3

Remember your high school dances, where there was always a certain potential for, shall we say, misbehavior? Who was there to watch over you and keep you in line? Chaperones. In our cells, molecular chaperones (also called heat-shock proteins) are specialized proteins that have evolved to help other proteins fold properly or to help stabilize or refold proteins that have become misfolded. By repairing the damage, the chaperones allow the proteins to regain their functionality. Without this vital cellular function, life as we know it could not exist.

Complementing molecular chaperones are chemical chaperones, which are small organic molecules that serve the same functions, albeit via different mechanisms. The confusing terminology is unfortunate, as the terms “molecular” and “chemical” are obviously relevant to both classes of chaperones. (There are also pharmacological chaperones, but let’s not get into that.)

Certain Osmolytes Can Help Protect Us from Protein Misfolding

There are two major classes of chemical chaperones, one of which is called osmolytes. Virtually all organisms have osmolytes in every cell, and they depend critically on them for protection against the damage caused by protein misfolding. Although most osmolytes, such as the amino acid proline, serve this protective role, some others have the opposite effect. Urea, for example, is an osmolyte, but it’s a protein denaturant, as we saw above.

Thus there are both good guys and bad guys in the osmolyte domain, and part of the cellular responsibility of the former is to protect our proteins from the effects of the latter. By and large, they do a good job of it, because if they didn’t, none of us would be alive. In next month’s article, we will see how the good osmolytes help protect us from protein misfolding and how we can take advantage of their chemical benefits through supplementation right now. As our proteins function, so do we. Stay tuned.

Also see The Origami of Aging.


  1. Morimoto RI. Stress, aging, and neurodegenerative disease. NEJM 2006;355(21):2254-5.
  2. Jensen ON. Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry. Curr Opin Chem Biol 2004;8(1):33-41.
  3. Leandro P, Gomes CM. Protein misfolding in conformational disorders: rescue of folding defects and chemical chaperoning. Mini Rev Med Chem 2008;8:901-11.

Will Block is the publisher and editorial director of Life Enhancement magazine.
