A new era of biological research has been ushered in, with an artificial intelligence (AI) predicting the 3D shape of nearly every protein known to science – just a year after its first data release.
Thanks to AlphaFold, an AI tool developed by Google-owned AI company DeepMind, more than 200 million protein structures are now shared online in a free-to-access, searchable database, called AlphaFold DB.
This breakthrough paves the way for untold avenues of scientific discovery in proteins, the building blocks of life. And researchers are excited.
“Determining the 3D structure of a protein used to take months or years, now it takes seconds,” explained cardiologist Eric Topol of the Scripps Research Translational Institute in a statement about the data release.
“With this new addition of structures illuminating nearly the entire protein universe, we can expect more biological mysteries to be solved each day.”
Together with scientists from the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), DeepMind unveiled its first batch of AlphaFold predictions in July last year.
Touted as a revolutionary tool that will transform biological research and accelerate drug discovery, AlphaFold predicts the 3D shape of proteins based on their amino acid sequence.
Linked together in chains, amino acids form long proteins that fold into pleated sheets and coiled ribbons.
By understanding the shape into which any protein folds, scientists can gain a better understanding of how that protein works, explaining what its primary role is inside cells.
AlphaFold was designed to speed up this process, providing more than 200 million predicted structures of proteins found in plants, bacteria, animals and other organisms in this latest data release.
“This hope has become a reality faster than we could have dreamed,” DeepMind Chief Executive Demis Hassabis said in a statement about the latest data release.
Already, researchers have used the first batch of AlphaFold predictions to improve their understanding of deadly diseases like malaria, open the door to better vaccines, and unravel biological puzzles about a behemoth protein that had held scientists back for decades.
Not to mention identifying never-before-seen enzymes that could help tackle plastic pollution.
“AlphaFold has sent ripples through the molecular biology community,” said Sameer Velankar, a structural biologist who heads EMBL-EBI’s Protein Data Bank.
“In the past year alone, there have been more than 1,000 scientific articles on a wide range of research topics that use AlphaFold structures; I have never seen anything like it.”
“And that’s just the impact of 1 million predictions,” Velankar added.
“Imagine the impact of making more than 200 million protein structure predictions openly accessible in the AlphaFold database.”
Although the open-source AlphaFold software has been available to researchers since its release last year, having millions of predicted protein structures readily available in a searchable database will no doubt accelerate research.
According to EMBL-EBI, about one-third of the more than 214 million predictions are highly accurate, on par with protein structures derived from routine experimental methods, such as X-ray crystallography and cryo-electron microscopy.
For decades, scientists have painstakingly inferred molecular structures from the blurred images produced by these methods; perhaps the most famous is Rosalind Franklin’s image of helical DNA. However, the quality of AlphaFold predictions varies, and they may be less accurate for rare proteins about which scientists know little.
So in some cases, its predicted structures may be best used to help make sense of experimental data. Despite such a large data release, there is still a lot of life that AlphaFold doesn’t capture, including predictions of how proteins interact once assembled.
Microbial proteins identified from traces of genetic material in soil and seawater are also not in the database, yet these microorganisms represent an untapped resource of powerful compounds, as scientists have catalogued only a tiny fraction of all microbial life on Earth.
Some scientists have also raised concerns about access to the AlphaFold database and its staggering 23 terabytes of content. Sophisticated analyses of that much data require expensive computing power and cloud-based storage, which could make it less accessible to some research teams.
All the same, the benefits to human health, which DeepMind says it has carefully weighed against potential biological risks, are too great to imagine.