free science
At first, AIs were created to understand the world. Now, a world is created that AIs can understand better.
The Model Context Protocol (MCP) defines a standardised interface between things and AI. Until MCP, LLMs like ChatGPT or Claude had to figure out where to look for data, how to use an application, or how to navigate a website. This often went wrong, because every app is different and data is often inaccessible.
Now, if you’d like an AI to easily access your (app, system, data), you can create an MCP server. An MCP server consists of a few fundamental building blocks, like tools, resources and prompts, for whatever task you’d like AIs to do. These building blocks are attached to your app and provide exactly the information an AI needs to use it. Eventually, if everything from browsers to online shopping to booking flights has an MCP server, AIs will be able to do all these things for us, because they’ll know how to use them.
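To make the building blocks concrete, here is a minimal sketch of a server exposing tools, resources and prompts. This is a toy stand-in in plain Python, not the actual MCP SDK; the class, the weather tool and the URI scheme are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToyMCPServer:
    """Toy stand-in for an MCP server: it registers the three
    building blocks (tools, resources, prompts) and lets a client
    discover and call them by name."""
    tools: dict = field(default_factory=dict)      # name -> callable
    resources: dict = field(default_factory=dict)  # uri  -> text/data
    prompts: dict = field(default_factory=dict)    # name -> template

    def tool(self, name: str):
        # Decorator that exposes a function as a callable tool.
        def register(fn: Callable) -> Callable:
            self.tools[name] = fn
            return fn
        return register

    def call(self, name: str, *args, **kwargs):
        return self.tools[name](*args, **kwargs)

server = ToyMCPServer()
server.resources["weather://docs"] = "Temperatures are in Celsius."
server.prompts["forecast"] = "Summarise tomorrow's weather for {city}."

@server.tool("get_temperature")
def get_temperature(city: str) -> float:
    # A real tool would query an API; this one is hard-coded.
    return {"Edinburgh": 9.0}.get(city, 15.0)

# An AI client can now discover what the server offers...
print(sorted(server.tools))                          # ['get_temperature']
# ...and call a tool without knowing the app's internals.
print(server.call("get_temperature", "Edinburgh"))   # 9.0
```

The point of the pattern: the client never parses a website or guesses an API; it only sees named tools, documented resources and reusable prompts.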
Wouldn’t it be cool if science had MCPs? Say each paper had its own MCP server that cleanly exposed all its important parts, such as methods, conclusions, code and data, independent of the journal’s layout or the structure of the code or data repo. Each paper-MCP would also be registered somewhere, so that AIs can simply search for it. Let’s call this protocol the Science Model Context Protocol (SMCP).
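As a sketch of what a paper-level server plus registry might look like: one object per paper, exposing the same parts regardless of journal or repo layout, discoverable through a shared index. No such protocol exists yet, so every name, field, DOI and URL below is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PaperSMCP:
    # One server per paper, exposing the same parts regardless of
    # the journal's layout or the repo's structure.
    doi: str
    methods: str
    conclusions: str
    code_url: str
    data_url: str

# A hypothetical global registry that AIs could search.
REGISTRY: dict = {}

def register(paper: PaperSMCP) -> None:
    REGISTRY[paper.doi] = paper

def search(keyword: str) -> list:
    # Naive full-text match over conclusions, for illustration only.
    return [p for p in REGISTRY.values()
            if keyword.lower() in p.conclusions.lower()]

register(PaperSMCP(
    doi="10.0000/example",                       # made-up DOI
    methods="Randomised trial, n=200, mixed-effects model.",
    conclusions="The intervention reduced recovery time.",
    code_url="https://example.org/code",         # made-up URL
    data_url="https://example.org/data",         # made-up URL
))

hits = search("recovery")
print(hits[0].doi)  # 10.0000/example
```

In practice the registry would be a service and `search` would be semantic rather than keyword-based, but the interface an AI sees would be just this simple: register, search, read the standard fields.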
Here’s a list of things that a high-bandwidth, high-accuracy AI-science interface through SMCPs would enable:
- automated synthesis: AI agents could reliably synthesise knowledge through systematic reviews and meta-analyses, effectively enabling anyone to summarise state-of-the-art knowledge on any question. This could dramatically accelerate science and improve or save many lives.
- decentralised knowledge: tacit knowledge and skills are highly centralised within a few institutions. I was a postdoc at the University of Edinburgh, which is a hub for statistics and genetics; that made it much easier to produce high-quality publications, even as a fresh PhD student. What if everyone, no matter their university, had easy access to the models, the code, and the rationale behind them? This is what SMCPs would do.
- de-duplicating efforts: too much time is spent replicating code for data processing and analyses. Through an SMCP, AI agents could recreate existing pipelines and adapt them to different use cases, freeing up time and capacity for researchers to explore new things.
- live science + digital twins: most research is a one-off. Get the data, run the experiment, analyse, publish. But what if more data comes along, especially in the context of a research synthesis? SMCPs would facilitate continuous analyses, adding more data and updating the results over time. I imagine that many important papers will have live “digital twins” which incorporate and publish continuous updates.
- streamlined evaluation: AI agents can review bugs in code, flaws in statistical modelling and experimental design. Humans can then evaluate the bigger picture and the conclusions. This wouldn’t just save time, but upskill humans and AIs in the process.
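The “live science” idea can be made concrete with a toy digital twin of a meta-analysis: a standard fixed-effect, inverse-variance-weighted pooled estimate that an agent could update whenever a new study’s SMCP publishes an effect size. The class and the numbers are invented for illustration; the weighting formula itself is the textbook fixed-effect one.

```python
class LiveMetaAnalysis:
    """Toy 'digital twin' of a meta-analysis: maintains a
    fixed-effect, inverse-variance-weighted pooled estimate
    that updates incrementally as new studies arrive."""

    def __init__(self):
        self.sum_w = 0.0    # running sum of weights (1 / variance)
        self.sum_wy = 0.0   # running weighted sum of effect sizes

    def add_study(self, effect: float, variance: float) -> None:
        # Each study is weighted by the inverse of its variance,
        # so precise studies count more.
        w = 1.0 / variance
        self.sum_w += w
        self.sum_wy += w * effect

    @property
    def pooled_effect(self) -> float:
        return self.sum_wy / self.sum_w

    @property
    def pooled_variance(self) -> float:
        return 1.0 / self.sum_w

twin = LiveMetaAnalysis()
twin.add_study(effect=0.30, variance=0.04)  # small, noisy study
twin.add_study(effect=0.10, variance=0.01)  # larger, precise study
print(twin.pooled_effect)  # 0.14 — pulled towards the precise study
```

A published “digital twin” would simply re-run `add_study` and republish `pooled_effect` whenever a new paper’s SMCP appears in the registry, instead of waiting years for a manual review update.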
Let’s be clear though, there are risks too:
- streamlining scientific information for AIs will speed up AGI timelines. Are we ready for this?
- it also makes it easier for bad actors to access dangerous knowledge via AI, e.g. in biotech or weapons development
Whenever we decentralise information, it comes with benefits and risks. In the age of AI, the trajectory is less clear than ever. Should we free science?