Machine learning is transforming all areas of biological science and industry, but its use is typically limited to a few users and scenarios. A team of researchers at the Max Planck Institute for Terrestrial Microbiology led by Tobias Erb has developed METIS, a modular software system for optimizing biological systems. The research team demonstrates its usability and versatility with a variety of biological examples.
Engineering of biological systems is indispensable in biotechnology and synthetic biology, and machine learning has become useful in all fields of biology. However, applying and improving algorithms (computational procedures made up of lists of instructions) is not easily accessible. Their use is limited not only by the programming skills required but often also by insufficient experimentally labeled data. At the intersection of computational and experimental work, there is a need for efficient approaches that bridge the gap between machine learning algorithms and their application to biological systems.
Now a team at the Max Planck Institute for Terrestrial Microbiology led by Tobias Erb has succeeded in democratizing machine learning. In their recent publication in “Nature Communications”, the team, together with collaboration partners from the INRAE institute in Paris, presented their tool METIS. The application is built on such a versatile and modular architecture that it requires no computational skills and can be applied to different biological systems and with different lab equipment. METIS is short for Machine-learning guided Experimental Trials for Improvement of Systems and is also named after the ancient goddess of wisdom and crafts, Μῆτις (lit. “wise counsel”).
Less data required
Active learning, also known as optimal experimental design, uses machine learning algorithms to interactively suggest the next set of experiments after being trained on previous results. It is a valuable approach for wet-lab scientists, especially when only a limited amount of experimentally labeled data is available. One of the main bottlenecks, however, is that the labeled data generated in the lab are not always sufficient to train machine learning models. “While active learning already reduces the need for experimental data, we went further and examined various machine learning algorithms. Encouragingly, we found a model that is even less dependent on data,” says Amir Pandi, one of the lead authors of the study.
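The loop of training on previous results and suggesting the next experiments can be sketched in a few lines of Python. This is an illustration only, not the METIS code: the function `hidden_yield` is a hypothetical stand-in for a wet-lab measurement, and the simple nearest-neighbour predictor with an exploration bonus is an assumed substitute for whatever model the study actually uses.

```python
import itertools
import random

def hidden_yield(cond):
    """Hypothetical stand-in for a lab measurement; its peak is at (3, 1, 2)."""
    a, b, c = cond
    return 100 - 10 * (a - 3) ** 2 - 8 * (b - 1) ** 2 - 5 * (c - 2) ** 2

def distance(p, q):
    """Euclidean distance between two conditions."""
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

def score(cond, measured, k=3, explore=5.0):
    """Mean result of the k nearest tested conditions plus a bonus
    that grows with the distance to the closest tested condition."""
    nearest = sorted(measured, key=lambda m: distance(cond, m))[:k]
    predicted = sum(measured[m] for m in nearest) / len(nearest)
    return predicted + explore * distance(cond, nearest[0])

def active_learning(rounds=5, batch=6, seed=0):
    rng = random.Random(seed)
    # Three factors with five levels each: 125 candidate conditions.
    space = list(itertools.product(range(5), repeat=3))
    # Round 0: a random first batch of experiments.
    measured = {c: hidden_yield(c) for c in rng.sample(space, batch)}
    for _ in range(rounds):
        untested = [c for c in space if c not in measured]
        # Rank untested conditions using everything measured so far.
        untested.sort(key=lambda c: score(c, measured), reverse=True)
        # "Run" the top suggestions and add the results to the data.
        for cond in untested[:batch]:
            measured[cond] = hidden_yield(cond)
    return space, measured

space, measured = active_learning()
best = max(measured, key=measured.get)
print(f"tested {len(measured)} of {len(space)} conditions")
print(f"best condition so far: {best} with result {measured[best]}")
```

Each round uses all results gathered so far to decide which conditions to test next, so only a fraction of the candidate space is ever measured.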
To show the versatility of METIS, the team used it for a variety of applications, including optimization of protein production, genetic constructs, combinatorial engineering of enzyme activity, and a complex CO2 fixation metabolic cycle named CETCH. For the CETCH cycle, they explored a combinatorial space of 10²⁵ conditions with only 1,000 experimental conditions and reported the most efficient CO2 fixation cascade described to date.
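To put the size of such a combinatorial space in perspective: a space of ten to the power of 25 conditions arises, for example, from 25 factors with 10 possible levels each. The short calculation below uses exactly that layout, which is a hypothetical choice for illustration and not the study's actual experimental design, to show how small a share 1,000 experiments covers.

```python
import math

# Hypothetical layout (not the study's design): 25 factors, 10 levels each.
levels_per_factor = [10] * 25

# Every combination of levels is one possible condition.
space_size = math.prod(levels_per_factor)

# Number of experimental conditions reported for the CETCH cycle.
experiments = 1_000

fraction = experiments / space_size
print(f"possible conditions: {space_size:.0e}")
print(f"share actually tested: {fraction:.0e}")
```

Testing every combination is out of the question at this scale, which is why an approach that chooses informative experiments matters.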
Optimizing biological systems
In application, the study provides novel tools to democratize and advance current efforts in biotechnology, synthetic biology, genetic circuit design, and metabolic engineering. “METIS allows researchers to either optimize their already discovered or synthesized biological systems,” says Christoph Diehl, co-lead author of the study. “But it is also a combinatorial guide for understanding complex interactions and hypothesis-driven optimization. And what is probably the most exciting benefit: it can be a very helpful system for prototyping new-to-nature systems.”
METIS is a modular tool that runs as Google Colab Python notebooks. It can be used via a personal copy of the notebook in a web browser, without installation, registration, or the need for local computational power. The materials provided in this work can guide users in customizing METIS for their applications.