Protein structures revealed at record pace

One of the hardest problems in biology is predicting the structure of a protein. Proteins are complicated: the many interactions among their side chains and backbones make it very difficult to predict how a protein will fold into its 3D structure based solely on its amino acid sequence (primary structure). In our AP Biology class, we talked extensively about how this 3D (tertiary) structure is extremely important because it determines the function of the protein. For example, the success of the delta variant of SARS-CoV-2 is largely due to a change in the tertiary structure of its spike proteins. Thus, if the 3D structure of a protein is known, it is much easier to predict the function of that protein and how well it performs that function. However, the experimental methods of determining the tertiary structure of proteins are extremely costly: determining the structure of a single protein can cost up to $120,000 and take up to one year.

AlphaFold 2.0 is a breakthrough on this problem, which was long thought impossible. AlphaFold, created by DeepMind, uses deep learning to predict proteins’ tertiary structures. In particular, it uses an architecture built on transformers, a relatively new and increasingly popular deep learning technique. Using this method, AlphaFold achieves remarkably accurate and detailed results, even at the atomic level.

Because of its ability to predict the structure of unknown proteins, AlphaFold can be used to determine how a single nucleotide mutation can affect the structure of a protein. Interestingly, many diseases result from improperly folded proteins, including cystic fibrosis, Alzheimer’s, and Parkinson’s. While the protein structures themselves do not often lead directly to the creation of new treatments, they do offer a better understanding of how the protein works. This deeper understanding can then be used to develop new therapies. Thus, AlphaFold has the potential to accelerate the development of new treatments for many currently untreatable diseases at a much lower cost.
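To make the idea of a single nucleotide mutation concrete, here is a minimal Python sketch of how one changed base can alter the amino acid sequence that a structure predictor would then be given. It uses the standard genetic code, but the DNA sequence and the helper names `translate` and `point_mutate` are made up for illustration. This is not part of AlphaFold itself.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def point_mutate(dna, position, new_base):
    """Return a copy of dna with the base at a 0-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]

# Hypothetical toy sequence, not a real gene.
original = "ATGCCTGAGGAG"
mutated = point_mutate(original, 7, "T")  # changes the codon GAG to GTG

print(translate(original))  # MPEE
print(translate(mutated))   # MPVE
```

In this toy case, a single A-to-T change swaps a charged glutamate (E) for a hydrophobic valine (V). Both sequences could then be passed to a structure prediction tool to compare the predicted folds.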

Beyond diseases resulting from misfolded proteins, AlphaFold can be used to predict the effect mutations will have on the folding of the SARS-CoV-2 spike proteins. This can help to quickly determine how a mutation will change the shape (and thus function) of the spike proteins, which makes it much easier to predict how the mutation will affect the spread and severity of a new variant and, using this information, to classify that variant.

However, AlphaFold is not perfect. While most predictions are quite good, a small percentage of the generated protein structures are clearly inaccurate, placing hydrophobic amino acids on the outside of the protein, where they would be exposed to water. Because of this, it is still necessary to check any prediction made by the computational model before using it for biological analysis. Nonetheless, AlphaFold is a powerful tool that will revolutionize the field of computational protein structure prediction.
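One simple sanity check along these lines is to measure how many hydrophobic residues end up on the surface of a predicted structure. The Python sketch below assumes you already have a relative solvent accessibility value (0 = buried, 1 = fully exposed) for each residue, computed by a separate structure analysis tool. The function name, the hydrophobic residue set, the 0.25 cutoff, and the example numbers are all illustrative assumptions, not something AlphaFold provides.

```python
# Simplified set of hydrophobic residues (one-letter codes), chosen for illustration.
HYDROPHOBIC = set("AVILMFW")

def exposed_hydrophobic_fraction(sequence, accessibility, threshold=0.25):
    """Return the fraction of hydrophobic residues whose relative solvent
    accessibility is above `threshold`, i.e. that sit on the surface."""
    if len(sequence) != len(accessibility):
        raise ValueError("sequence and accessibility must have the same length")
    hydrophobic = [
        acc for res, acc in zip(sequence, accessibility) if res in HYDROPHOBIC
    ]
    if not hydrophobic:
        return 0.0
    exposed = sum(1 for acc in hydrophobic if acc > threshold)
    return exposed / len(hydrophobic)

# Made-up values for a six-residue toy peptide.
sequence = "MKVLAE"
accessibility = [0.10, 0.80, 0.05, 0.60, 0.20, 0.90]

fraction = exposed_hydrophobic_fraction(sequence, accessibility)
print(f"{fraction:.2f}")  # 0.25
```

A high fraction would not prove that a prediction is wrong, since real proteins do have some exposed hydrophobic residues, but it is a reasonable signal to look at the structure more carefully.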

If you want to experiment with AlphaFold yourself, a working notebook can be found here. Any PDB sequence can be queried, and the AlphaFold model will predict its structure to the best of its ability.

 

 
