Hey everyone, you might have noticed I’ve been gone for a bit. Fortunately, I’ve had a good reason: I just pushed another project to the world after a month of research and prototyping.
It’s called Accelerator, and it’s a chemical search engine that lets chemists optimize the production of any molecule using enzymes from nature.
Feel free to play around with it at accelerator-beta.onrender.com. Fair warning though…it takes its sweet time to load.
Or watch the demo here:
I think it’s safe to say our world runs on chemistry.
From the caffeine in your latte to the ammonium nitrate in most fertilizers, it’s sobering to think just how much our everyday lives depend on the existence of these molecules.
The point I’m trying to make here is that we tend to see chemistry as a black box. We just stumble on some important molecule, and out it comes from the factory, ready to use.
What we forget are the countless gears whirring inside that box. Many of the most important molecules in the world aren’t natural. Some of the world’s brightest minds spend their lives figuring out how to make them from simpler molecules we already have.
How Chemistry Works
Today, practically every synthetic molecule is made using an approach called retrosynthesis. The name combines the Latin retro (“backward”) with the Greek synthesis (“putting together”), so it roughly translates to “making in reverse”.
To understand this, let’s say you’re a chemist trying to synthesize aspirin (C₉H₈O₄).
Without retrosynthesis, you’d begin by picking some arbitrary starting molecules, then explore the ways they could react to get you closer to your product. With this approach, you quickly run into a problem.
After just four or five reactions, you already have a tree of hundreds of thousands of potential reactions, with no clear path to the molecule you want 1. Nobel chemistry laureate E.J. Corey called this “the combinatorial explosion” in 1985.
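To get a feel for how fast that tree grows, here is a rough back-of-the-envelope sketch. The branching factor of 12 reactions per intermediate is purely an assumption for illustration, and `forward_tree_size` is a made-up helper name. Real numbers depend heavily on the molecules involved.

```python
def forward_tree_size(branching_factor: int, depth: int) -> int:
    """Count every candidate reaction in a forward search tree.

    Level k of the tree holds branching_factor**k candidate reactions,
    so the total is a geometric sum over levels 1..depth.
    """
    return sum(branching_factor ** level for level in range(1, depth + 1))


# Assumption: every intermediate can undergo 12 plausible reactions.
ASSUMED_BRANCHING_FACTOR = 12

for depth in range(1, 6):
    total = forward_tree_size(ASSUMED_BRANCHING_FACTOR, depth)
    print(f"after {depth} reaction(s): {total} candidate reactions")
```

Under that assumption, the fifth level alone pushes the total into the hundreds of thousands, which is the kind of growth described above.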
Using retrosynthesis, on the other hand, you start with aspirin. First, you find a reaction that can directly produce aspirin. Then you ask yourself what reactions can set you up for that reaction, and so on and so forth.
You work backward, step by step, toward simpler and simpler precursors until you end up with starting molecules you can find off the shelf.
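The backward search can be sketched in a few lines of code. This is a toy illustration, not how professional retrosynthesis software works: the reaction table, the list of purchasable molecules, and the `plan_route` function are all hypothetical. The two aspirin steps themselves (phenol plus carbon dioxide to salicylic acid, then salicylic acid plus acetic anhydride to aspirin) are standard textbook chemistry.

```python
# Toy lookup table: product -> precursor sets that can make it in one step.
# Illustrative only; a real tool would draw on a large reaction database.
KNOWN_REACTIONS = {
    "aspirin": [("salicylic acid", "acetic anhydride")],
    "salicylic acid": [("phenol", "carbon dioxide")],
}

# Assumption: these are the molecules we can buy off the shelf.
PURCHASABLE = {"phenol", "carbon dioxide", "acetic anhydride"}


def plan_route(target, depth_limit=5):
    """Work backward from target; return steps in forward order, or None."""
    if target in PURCHASABLE:
        return []  # nothing to make, just buy it
    if depth_limit == 0:
        return None  # give up on overly long routes
    for precursors in KNOWN_REACTIONS.get(target, []):
        steps = []
        for precursor in precursors:
            sub_route = plan_route(precursor, depth_limit - 1)
            if sub_route is None:
                break  # this precursor is a dead end, try the next reaction
            steps.extend(sub_route)
        else:
            # Every precursor is reachable, so this reaction completes the route.
            return steps + [(precursors, target)]
    return None


for precursors, product in plan_route("aspirin"):
    print(" + ".join(precursors), "->", product)
```

Because the search only follows reactions that lead to the target, it never wanders through the huge forward tree.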
The Insight
With my background in biology, researching organic chemistry was a treat. It basically gave me another lens to interpret everything I knew.
But almost as soon as I began, there was an idea I couldn’t shake from my mind.
Anyone who’s studied biology will tell you that nature has the coolest piece of chemical machinery ever. It’s called the enzyme.
Enzymes 101
In molecular biology, we often paint this portrait of the cell as a miniature chemical factory. Even at an average temperature of 37°C (98.6°F), the molecules inside are moving at breakneck speeds, colliding thousands of times a second 2.
Based on geometry alone, these molecules could react in thousands of different ways, with most of them producing unwanted (or even toxic) products. For cells—which absolutely have to maintain tight control over their composition at all times—that simply won’t do.
This is where enzymes come in. In the bustling interior of a cell, they bind to specific starting molecules (substrates) and lower the energy barrier needed to turn them into a desired product.
TL;DR: Enzymes are catalysts that drive molecules to prefer some reactions over others.
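To see why lowering the energy barrier matters so much, here is a small sketch based on the Arrhenius equation, k = A·exp(−Ea/RT). It assumes the catalyzed and uncatalyzed reactions share the same pre-exponential factor A, which is a simplification, and the 50 kJ/mol barrier drop is a hypothetical number chosen for illustration, not a measured value for any particular enzyme.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def rate_enhancement(barrier_drop_kj_per_mol, temperature_c=37.0):
    """Ratio of catalyzed to uncatalyzed rate constants (Arrhenius).

    Assumes both reactions share the same pre-exponential factor, so the
    ratio reduces to exp(barrier_drop / (R * T)).
    """
    temperature_k = temperature_c + 273.15
    barrier_drop_j = barrier_drop_kj_per_mol * 1000
    return math.exp(barrier_drop_j / (GAS_CONSTANT * temperature_k))


# Hypothetical barrier drops, evaluated at body temperature.
for drop in (10, 30, 50):
    print(f"{drop} kJ/mol lower barrier -> ~{rate_enhancement(drop):.1e}x faster")
```

The relationship is exponential, so even a modest drop in the barrier translates into a very large speed-up.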
But that’s not all. Enzymes evolved for billions of years to perfect their tasks, which means they’re unbelievably good at what they do.
Consider the enzyme catalase, which accelerates the splitting of hydrogen peroxide into water (H₂O) and oxygen (O₂).
With no enzyme, this reaction moves at a snail’s pace. And yet, in ideal conditions, one molecule of catalase can break down 4×10⁷ molecules of hydrogen peroxide per second 3!
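To put that turnover number in perspective, here is a quick calculation. It takes the 4×10⁷ per second figure above at face value and assumes the enzyme runs at that maximum rate the whole time, which real reactions rarely manage. The function name and the example quantities are my own.

```python
TURNOVER_PER_SECOND = 4e7  # H2O2 molecules per catalase molecule per second


def seconds_to_clear(substrate_mol, enzyme_mol, turnover=TURNOVER_PER_SECOND):
    """Time for enzyme_mol of enzyme to process substrate_mol of substrate.

    Avogadro's number cancels out, so moles can be used directly.
    Assumes the enzyme stays saturated and works at its maximum rate.
    """
    return substrate_mol / (enzyme_mol * turnover)


# Example: one nanomole of catalase versus a full mole of hydrogen peroxide.
print(f"{seconds_to_clear(1.0, 1e-9):.0f} seconds")
```

Under those idealized assumptions, a billionth of a mole of catalase would get through a whole mole of hydrogen peroxide in about 25 seconds.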
Implication: for practically any organic molecule you want to make, you can find a cheat code hiding somewhere in nature in the form of an enzyme.
As a biologist, it surprised me that enzymes weren’t already mainstream in chemistry. Sure, chemists used inorganic catalysts like nickel and platinum all the time 4, but unlike enzymes, their effects weren’t specific, and were often orders of magnitude weaker.
Even today, if you want to make a certain molecule, there’s no systematic way to find enzymes that can get you there. That didn’t make sense to me.
Why reinvent chemistry when nature already has it figured out?
What I’ve Been Up To
Accelerator is a search engine I’ve been developing for the past month to better connect the worlds of biology and chemistry. It lets chemists streamline the synthesis of any molecule using enzymes from the natural world.
Here’s how it works:
Open up Accelerator
List the SMILES of the molecules in your synthesis reaction.
Click Optimize
That’s it. (And two of those steps weren’t even real steps!)
Behind the scenes, Accelerator compares your molecules against >13,000 enzyme-catalyzed reactions spanning nearly 300 unique organisms.
In your search results, you’ll find the name of your enzyme, the reaction it catalyzes, and even the species of plant, animal, or bacterium it comes from. If you’re interested, just click Learn More to find your enzyme’s page on the Kyoto Encyclopedia of Genes and Genomes (KEGG).
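For the curious, here is a heavily simplified sketch of what a search like this could look like. It is not Accelerator’s actual code: the `EnzymeReaction` record, the two-entry `REACTIONS` table, and `find_enzymes` are all hypothetical, and it matches SMILES by exact string comparison, which only works if every SMILES is written in the same canonical form. It also ignores stoichiometry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnzymeReaction:
    enzyme: str
    organism: str
    substrates: frozenset  # SMILES strings
    products: frozenset    # SMILES strings


# Tiny stand-in for the real dataset of enzyme-catalyzed reactions.
REACTIONS = [
    # Hydrogen peroxide -> water + oxygen
    EnzymeReaction("catalase", "Homo sapiens",
                   frozenset({"OO"}), frozenset({"O", "O=O"})),
    # Urea + water -> carbon dioxide + ammonia
    EnzymeReaction("urease", "Canavalia ensiformis",
                   frozenset({"NC(N)=O", "O"}), frozenset({"O=C=O", "N"})),
]


def find_enzymes(substrates, products, reactions=REACTIONS):
    """Return reactions that contain all the given substrates and products."""
    wanted_substrates = frozenset(substrates)
    wanted_products = frozenset(products)
    return [
        reaction for reaction in reactions
        if wanted_substrates <= reaction.substrates
        and wanted_products <= reaction.products
    ]


for hit in find_enzymes(substrates=["OO"], products=["O=O"]):
    print(hit.enzyme, "from", hit.organism)
```

A real version would need a cheminformatics library to canonicalize SMILES and score near matches instead of relying on exact strings.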
So, what can a chemist do with this information?
After you isolate an enzyme from a living organism, you can add it directly to your reaction chamber to speed up synthesis. And since enzymes aren’t used up in reactions, they can be recovered and reused many times, which keeps the cost of scaling low.
By relying on nature, we can make game-changing molecules just as quickly as we discover them.
The Future
Chemistry has come a long way in a short time. Just a few centuries ago, a chemist resembled more of a wizard, boiling and mixing mysterious liquids until something interesting happened. Everything was qualitative and subjective.
Now, we’re dealing with the inverse. The chemist is a mathematician and an analyst. The total space of known molecules is measured in the hundreds of millions, and is steadily growing every year. The most trivial detail about any molecule is hiding in some crevice of the internet.
If anything, we have too much data to handle. And I think the future of chemistry belongs to whoever can wrangle that mountain of information by combining and extrapolating it in useful ways.
That’s especially true when it comes to making molecules. Even if it’s a marginal improvement, think about what that could mean.
It could mean synthesizing cancer drugs that we once thought were impossible. It could mean discovering the next generation of fertilizers to feed our growing world. It could mean creating greener jet fuel to power supersonic flight. It could mean artificial meat that’s healthier (and cheaper) than the real thing. The limit is our ambition.
I don’t need to prove this. Great things happen when we accelerate science.
Let’s step on the gas.
If you’re new here…
This project is part of my latest commitment to build something new every month and share it with the world.
Accelerator was the first of what will be many projects. Now that I’ve had some time to reflect on what I made, here are my thoughts.
What I’m proud of
>Accelerator runs on a large, custom dataset of metabolic reactions. In other words, it’s not just some code running on a file pulled from Kaggle. It provides unique value.
>The user experience is intuitive enough for anyone to follow (see for yourself)!
>It’s been about a year since I seriously coded in Python, and I barely ever coded anything in JavaScript. I’m glad I could learn the important things quickly.
What I want to fix / next steps
>After a user finds an enzyme, Accelerator should refer them directly to chemical suppliers with discounted prices. (Could be a potential business model).
>Updating the platform to filter results based on the price, speed, and efficiency of enzymes.
>In my last newsletter, I promised this project would be live on June 21st. Instead, I ended up publishing it on the 25th. Never again! Time to hold myself to higher standards!
That’s my monthly invention. On to the next one 🎯
Footnotes
1. For context, the preclinical cancer drug (-)-thapsigargin usually requires >30 steps to synthesize. Things get very hairy very fast.
2. Apparently, you can derive the average velocity of these molecules through something called the Langevin equation, which contains all sorts of mathematical witchcraft that I can’t explain. Feel free to give it a look, though!
3. Again, this number varies a lot from study to study, but that’s the general range. Here’s one of the more reliable papers I found from MIT on the max velocity (Vmax) of catalase.
4. For an example of metal catalysts in a real-world reaction, check out the hydrogenation of alkenes (unsaturated fats) to alkanes (saturated fats).