Links and short posts on agent swarms and autonomous/agent-mediated science

The landscape for autonomous science agents is moving fast enough that a link dump with some opinionated annotation is more useful than a polished essay that's outdated by next week. So here's what I've been tracking, grouped by what I think actually matters vs. what's just interesting vs. what needs a warning label.


The core tension: ungrounded agents produce hyperreal science

Start here: Amber Liu's thread on why you should not use Claw Scientist for fully autonomous research. Her point is the important one — unembodied AI agents doing research with no skin in the game tend to descend into the hyperreal. They generate plausible-sounding outputs that aren't anchored to anything. This should be the caveat hanging over everything else in this post.

A better-designed alternative: Curie (arXiv:2502.16069) — actually tries to build rigor and experimental grounding into the agent loop rather than just letting it rip.

Ethan Mollick's cautionary note is also worth reading alongside Amber's thread.

Related biosecurity angle: The Lucretius Problem of Biosecurity — Olivia Scharfman argues a small-scale bioterrorist attack is closer to "inevitable" than most people's priors suggest. Relevant because autonomous agents lower the barrier.


Agent swarm platforms that actually exist now

ScienceClaw × Infinite (from Markus Buehler's lab at MIT) — open-source agent swarm platform for crowdsourcing discovery across institutions. The agents self-coordinate and evolve to use hundreds of scientific tools. The paper behind it: Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange (arXiv:2603.14312). More from Buehler's lab here and a related Science paper. I wrote a post on Infinite too: my infinite-lamm post.

ClawInstitute autoresearch wiki: https://clawinstitute.aiscientist.tools/w/autoresearch

beach.science — another forum where agents post hypotheses. S/N ratio isn't the highest, but some agents are genuinely better than others. Worth monitoring.

Periodic — yet another AI scientist platform.

Orchestra (Amber Liu / Zechen Zhang) — just taking off. Amber may be at the beginning of an important inflection point. Their AI research skills breakdown is worth reading, and AmberLJC's paper list is a good index of the LLM systems space.

projectNANDA / join39.org — Maria Gorskikh's "every robot is an agent" framing. They run hackathons every Thursday and are genuinely good at building and winning them.

New conference specifically for AI scientists: announcement thread

https://www.superintelligent.group/blog/technical-deep-dive


Coordination mechanisms and swarm architecture

Echo Field Dynamics (EFD v1.0) — zero-channel coordination framework where agents synchronize without message passing. No new messages, no predefined topology, no central coordinator. Haven't evaluated in depth but the framing is interesting.
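I haven't seen EFD's internals, but the flavor of message-free coordination is easy to sketch: agents never address one another, they only read an aggregate trace the environment exposes (stigmergy-style) and adjust toward it. Everything below — the shared field, the update rule, the convergence criterion — is my own toy assumption, not EFD's actual mechanism.

```python
import random

class Agent:
    """Toy agent that aligns its state by reading a shared field,
    never by exchanging messages with other agents."""
    def __init__(self, rng):
        self.value = rng.uniform(0.0, 1.0)

    def step(self, field, alpha=0.5):
        # Move partway toward the field value the environment exposes.
        self.value += alpha * (field - self.value)

def simulate(n_agents=10, n_steps=50, seed=0):
    rng = random.Random(seed)
    agents = [Agent(rng) for _ in range(n_agents)]
    for _ in range(n_steps):
        # The "field" is an aggregate trace left in the environment;
        # agents read it but never send or receive messages.
        field = sum(a.value for a in agents) / len(agents)
        for a in agents:
            a.step(field)
    return [a.value for a in agents]

values = simulate()
spread = max(values) - min(values)  # shrinks toward 0 as agents synchronize
```

The point of the sketch is just that synchronization can emerge with no topology and no coordinator: each agent's deviation from the shared field halves every step, so the swarm converges without a single direct message.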

CORAL — multiagent evolution, from the Paul Liang lab in Cambridge.

DGM with Hyperagents (arXiv:2603.19461) — Darwin Gödel Machines. Self-referential, recursively self-improving agent systems. This paper is load-bearing for several of my other posts on the hyperagent problem.

Swarm-adjacent: thread from habermolt, James Zou on agent evaluation


Recursive depth, game theory, and what agents can actually model

Chelsea Zou's poker research — "We Made AI Gamble. Here's What Poker Revealed About Frontier Models." The question is how many levels of recursive reasoning (level-k thinking) an agent can handle. This connects to recursive language models and to the simulations I've been running.
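For concreteness, here's the textbook toy model of bounded recursive reasoning — level-k thinking in the "guess 2/3 of the average" game. This is standard behavioral game theory, not Chelsea Zou's poker setup; poker adds hidden information on top of the same recursion.

```python
def level_k_guess(k, base_guess=50.0, factor=2/3):
    """Level-k reasoning in the 'guess 2/3 of the average' game.

    A level-0 player anchors on base_guess; a level-k player
    best-responds to a population of level-(k-1) players.
    """
    if k == 0:
        return base_guess
    return factor * level_k_guess(k - 1, base_guess, factor)

# Each added level of recursion shrinks the guess by 2/3, driving it
# toward the Nash equilibrium of 0 as k grows.
guesses = [level_k_guess(k) for k in range(6)]
```

The empirical question for frontier models is where on this ladder they sit — and whether they can adapt their effective k to the opponent, which is what makes poker a good probe.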

Pedro Ortega on universal AI as imitation

My own Claude conversations exploring this:

The RL Spiral — RL and neuro connections.


World models and bigger-picture framing

Not Boring: World Models

Apoth3osis projects

Nature paper on foundation models for science

"AI will keep getting better at physics. We will not." — but again, see all the caveats from Amber Liu above.


Cybershamanism appendix

A little more "out there" relative to most progress studies, but worth monitoring if you're interested in where agent-mediated meaning-making gets weird:

  • https://chatgpt.com/share/69b9c37b-3830-800c-8abe-ef851053fe3b
  • https://claude.ai/share/160adb55-6ac0-47bd-8abb-65089c054506
  • https://claude.ai/share/d4d00be2-a89f-4cf8-b261-f363860fde41
  • https://claude.ai/share/4e6905de-dcfb-4c4b-8717-af85b671e9f3
  • https://claude.ai/share/c9d15d8f-b749-4d23-91dc-7342710b0ba8
  • https://claude.ai/share/a9cac104-145b-4bf9-9f44-b6045d1fd732


The wider AI-for-science landscape: platforms, tools, and automated labs (collected links + commentary)

Pulling together related threads on AI science platforms, protein reasoning, chemistry automation, and self-driving labs — all of which feed into or compete with what ClawInstitute is trying to do.


AI Scientist Platforms

The space is getting crowded fast. Here's what exists:

Edison Scientific (Sam Rodriques) — automating research across the entire drug development pipeline. Rodriques is one of the smartest and most visionary people in this area, along with Patrick Hsu. His writeup on The Humanity Project is worth reading.

Prism (OpenAI) — free workspace for scientists to write and collaborate on research, powered by GPT-5.2. Available to anyone with a ChatGPT personal account: https://prism.openai.com. starspawn0 tried it and found it good at turning pidgin LaTeX into readable solution sheets, less good at generating novel exam problems, and inconsistent at making LaTeX documents Title II accessibility compliant (which would actually be a huge win if they nailed it — it's a massive pain across university departments right now).

Orchestra Research (Amber Liu / Zechen Zhang) — see also their workflow for crawling high-quality AI research papers (154 messages).

Ai2 Asta — scholarly research assistant combining literature understanding and data-driven discovery. 108M+ abstracts, 12M+ full-text papers. From Allen AI. See also: https://x.com/rbhar90/status/2016239480458657953

Aristotle (Autopoiesis Sciences) — built for scientists and researchers to tackle hypotheses, analyze experimental data, generate research directions.

Axon — AI-assisted transcripts for research.

Analemma — relevant for longevity research automation. Open question: of these platforms, which is most accessible for agents like those on beach.science?

Also: Long-running Claude for scientific computing from Anthropic.

Higher-order knowledge representations for agentic scientific reasoning: https://arxiv.org/pdf/2601.04878

A framing I keep coming back to: AI has fundamentally changed coding and research, making both at least 5× faster. Publishing papers alone matters less now. The only real standard is: can your work be used to train or improve AI models?

Don't forget latch.bio and DeepOrigin either.


Protein Reasoning: BioReason-Pro

This makes protein interactions WAY more interpretable through reasoning. This is real interpretability — a foundationally more important kind than even Jude Stiel's work.

https://x.com/i/status/2035013002244866547
https://x.com/Radii2323/status/2035012134979961132

Parsa Idehpour launched BioReason-Pro — combining biological foundation models with LLMs to reason across biological modalities. Key claims:

  • Can hypothesize deep molecular functions that have been validated in the lab
  • Can reason in depth on mutations and protein structure
  • Achieved SOTA on Gene Ontology term prediction with more in-depth annotations than what scientists currently use
  • Training: SFT on synthetic reasoning traces (GPT-5, grounded by biological data), then RL to reduce hallucinations and increase accuracy
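The two-stage recipe (SFT on synthetic traces, then RL) can be sketched with a scalar stand-in for the model. The stages, targets, and reward below are purely illustrative assumptions about the ordering of the pipeline, not BioReason-Pro's actual training code.

```python
import random

def sft_stage(model, traces, lr=0.1):
    """Supervised stage sketch: nudge a scalar 'model' toward the
    mean of synthetic reasoning-trace targets (stand-in for SFT)."""
    target = sum(traces) / len(traces)
    for _ in range(100):
        model += lr * (target - model)
    return model

def rl_stage(model, reward_fn, lr=0.05, steps=200, seed=0):
    """RL stage sketch: hill-climb on a reward signal, standing in
    for hallucination-reducing reinforcement fine-tuning."""
    rng = random.Random(seed)
    for _ in range(steps):
        candidate = model + rng.gauss(0, 0.1)
        if reward_fn(candidate) > reward_fn(model):
            model = candidate
    return model

# Stage 1: supervised fit toward synthetic trace targets.
model = sft_stage(0.0, traces=[0.8, 0.9, 1.0])
# Stage 2: RL against a reward that penalizes distance from the
# ground truth (here the toy optimum is 1.0).
model = rl_stage(model, reward_fn=lambda m: -abs(m - 1.0))
```

The design point the sketch preserves: SFT gets you close to the average of the (possibly noisy) synthetic supervision, and the RL stage can only improve against the reward from there — which is exactly why the ordering matters.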

Team includes Adib Vafa, Arman Isa, with advising from Bo Wang and Patrick Hsu. Paper: BioReason-Pro: Advancing Protein Function Prediction with Multimodal... (also has Arnav Shah from Vector Institute).

Related threads: https://x.com/momo_mattomo/status/2035328956669272335 and https://x.com/duguyuan/status/2035331075527110828

Also see: evedesign from Debora Marks lab — unified protein design for computational researchers and experimentalists.


Generalist Biological AI (GBAI)

"Generalist biological artificial intelligence represents a transformative approach to modeling the 'language of life' — the flow of information from DNA to cellular function."

Nature Biotechnology paper: https://www.nature.com/articles/s41587-026-03064-w
Thread: https://x.com/i/status/2034986902789791957


Compbio Tools: Rosalind/LiteFold, Biomni

Rosalind / LiteFold — live and free: https://app.litefold.ai https://x.com/try_litefold/status/2025636684659118088

Biomni: https://x.com/phylo_bio/status/2025971413929320893


Automated Chemistry and Self-Driving Labs

The chemistry automation story goes back a couple years but has accelerated sharply:

LLMs directing automated chemistry labs (Dec 2023, Nature): https://www.nature.com/articles/d41586-023-03790-0 — AI not just controlling robots but planning their tasks from simple human prompts. Eric Topol thread: https://twitter.com/EricTopol/status/1737508532583604552

Automated chemical synthesis (Nature Communications, Mar 2024): https://www.nature.com/articles/s41467-024-45444-3

Lee Cronin's chemputation for drug discovery: Spotlight talk video — programmable chemistry.

"What Biology Can Learn from Physics" (Asimov Press, Apr 2024): https://asimov.press — predictive models as billion-dollar moonshots.

SLAS 2026 takeaway (Feb 2026): Liquid handling is no longer just about throughput; it's about integration and control. Automation teams in Boston aren't asking for "another pipetting robot." They want modules that integrate into robotic systems, execute complex liquid handling with real channel-level control, and report status in real time. The Veon Scientific i.Prep2 was designed for exactly this (open-frame, 8 independent channels, REST API + Swagger UI, real-time telemetry). See also the Hamilton A1 / Trisonic Discovery acoustic dispensing thread on LinkedIn.

LUMI-lab (Bo Wang lab, Cell, Feb 2026): A self-driving lab closing the loop between an AI foundation model and robotics for LNP discovery / mRNA delivery. Pretrained on 28M+ molecular structures, then iteratively improved with closed-loop experimental data. Across ten active-learning cycles, synthesized and evaluated 1,700+ new LNPs. Unexpectedly identified brominated lipid tails as a new design feature — these delivered mRNA into human lung cells more efficiently than approved benchmarks despite being a small fraction of the explored chemical space.

  • Paper: https://authors.elsevier.com/a/1mg4aL7PXy21V
  • Code: https://github.com/bowenli-lab/LUMI-lab
  • Video: https://youtube.com/watch?v=POOgIiKRSiE
  • Bo Wang tweet: https://x.com/BoWang87/status/2026349938746048744
  • Follow-up: https://x.com/i/status/2026981708290035843
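The closed-loop structure LUMI-lab describes — propose, synthesize, assay, update, repeat — is the core active-learning pattern of any self-driving lab. Here's a minimal sketch of that loop; the `oracle` stands in for the wet-lab readout (e.g. mRNA delivery efficiency), and every name and number below is an illustrative assumption, not LUMI-lab's actual pipeline.

```python
import random

def closed_loop_discovery(oracle, n_cycles=10, batch=8, seed=0):
    """Minimal active-learning loop in the spirit of a self-driving lab:
    propose candidates, 'synthesize and assay' them via an oracle,
    and bias the next round toward the best performer so far."""
    rng = random.Random(seed)
    center, best = 0.0, float("-inf")
    history = []
    for _ in range(n_cycles):
        # Propose a batch of candidates near the current design center.
        candidates = [center + rng.gauss(0, 1.0) for _ in range(batch)]
        # "Run the experiment" and keep the best measurement.
        top_score, top_x = max((oracle(x), x) for x in candidates)
        if top_score > best:
            best, center = top_score, top_x
        history.append(best)
    return history

# Toy fitness landscape: efficiency peaks at x = 3.
history = closed_loop_discovery(lambda x: -(x - 3.0) ** 2)
```

The interesting part of LUMI-lab's result fits this frame: the surprise (brominated lipid tails) showed up because the loop got to explore regions a human prior would have down-weighted, while the oracle — real experiments — kept it honest.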

All of this is context for why ClawInstitute's approach — structured agent reasoning over biological knowledge graphs with real quality control — matters. The tools are proliferating fast. The question is which ones produce results that survive contact with wet-lab reality.

ClawInstitute: Why this agent science platform might actually work (Marinka Zitnik, Ada Fang, and the Cambridge agent swarm ecosystem)

ClawInstitute is a public exchange for AI scientists and agent swarms, built by Marinka Zitnik's lab (with Ada Fang). It's designed for things like protein engineering and scale-dependent biological context — the kind of problems where you need structured reasoning over messy, multi-scale biology.

https://clawinstitute.aiscientist.tools

https://x.com/AdaFang_/status/2033920328154681700

Harvard/Kempner writeup: Harvard Researchers Create Social Network for "AI Scientists" to Collaborate


The case for why this is different

Some important figures have raised legitimate concerns about autoresearch and agent science. Amber Liu (founder of Orchestra Research, who partnered with Harvard-based Zechen Zhang) wrote a thread essentially begging people not to trust autonomous research agents uncritically: https://x.com/JIACHENLIU8/status/2034398199541317814 — "I Built an Auto Research Claw Too. I'm Begging You Not to Trust It." This is a legitimate worry, especially as the internet may soon contain more agent writing than human writing.

Other platforms exist — beach.science, ScienceClaw × Infinite (from Buehler's lab at MIT: https://x.com/ProfBuehlerMIT/status/2033832967542342021). But the quality control and level of detail on ClawInstitute are notably higher. Beach.science and ScienceClaw × Infinite may have been too easily impressed by some of their early examples.

The reason I think ClawInstitute has unusually high upside: Marinka Zitnik is rigorous in a way many are not. Her lab has had genuinely smart generalist/GNN systems biologists — Michelle Li, Ayush Noori — and a track record in representation learning for biology, not just generic LLM enthusiasm. Much of their historical research has been on GNN representations of biological networks, which provides context and lets them apply Michael Bronstein-style geometric operators to the logic in GNNs (e.g. in Pinnacle).


Why GNNs + agents is a particularly interesting combination for biology

Agent swarms improve on single-shot generation in context and nuance, and scale-sensitive GNNs help too. The hard part is where scales interact — protein to molecule, cell/tissue to protein — which is exactly where translatable results get lost, or where you need to type-check what's hypothesized or simulated against proper biological measurements and readouts (this is what MBJ keeps trying to point out). ClawInstitute goes further than any past effort on this.

A GNN over a curated graph works best when:

  • the node and edge types are meaningful
  • uncertainty is represented rather than collapsed
  • context dependence is not erased
  • the ontology is flexible enough to handle borderline or mixed biological types

That problem is especially acute in systems biology because "type" is often conditional, fuzzy, state-dependent, or scale-dependent. A cell state can be halfway between canonical categories. A protein's role depends on tissue, binding partner, timing, perturbation, and assay regime. If the graph hardens these into neat bins, the agent gets a very elegant wrong answer, which is humanity's favorite genre of mistake.
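The representational requirements in that list can be made concrete with a small sketch: edges that carry a relation type, an explicit confidence, and a context tuple, with lookups conditioned on both. The schema and field names here are my own illustration, not ClawInstitute's or the Zitnik lab's data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    """A knowledge-graph edge that keeps uncertainty and context
    instead of collapsing them into a bare binary link."""
    source: str
    target: str
    relation: str
    confidence: float   # uncertainty represented, not erased
    context: tuple = () # e.g. tissue, assay regime, perturbation

class BioGraph:
    def __init__(self):
        self.edges = []

    def add(self, *args, **kwargs):
        self.edges.append(Edge(*args, **kwargs))

    def neighbors(self, node, min_conf=0.0, context=None):
        """Context-conditional lookup: the same protein can have
        different partners depending on tissue or assay regime."""
        out = []
        for e in self.edges:
            if e.source != node or e.confidence < min_conf:
                continue
            if context is not None and context not in e.context:
                continue
            out.append(e.target)
        return out

g = BioGraph()
g.add("TP53", "MDM2", "binds", confidence=0.95, context=("ubiquitous",))
g.add("TP53", "BAX", "activates", confidence=0.6, context=("stress",))
```

An agent querying `g.neighbors("TP53", context="stress")` gets a different answer than one querying without context — which is the whole point: the graph refuses to pretend a protein's role is context-free, so the "elegant wrong answer" failure mode gets harder to reach.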

The Zitnik lab's recent work points toward multimodal, contextual, and single-cell/spatial modeling — they're not treating biology as a static clean ontology problem.

(Though with GNN representations, you can't guarantee consistent typing of interactions with "messy biology.")

Why this could work unusually well:

  • GNNs and knowledge graphs give agents a structured action space
  • Biomedical tasks reward explicit tool use and retrieval
  • Multi-agent review loops are a better fit for science than single-shot generation
  • Zitnik's group has a track record in representation learning for biology, not just generic LLM enthusiasm

Where it could still fail:

  • Ontologies may discretize away biologically important ambiguity
  • Tool outputs can create false confidence if not tied to experimental design
  • Agent societies can converge on polished mediocrity if review loops are shallow
  • "Autoformalization" can be most seductive exactly where biology is least formalizable

The promise is not that agents magically solve biology. It's that in domains where there already exists a rich ecosystem of graphs, ontologies, databases, assay outputs, and mechanistic priors, agents can become unusually effective navigators and hypothesis-combiners. The key bottleneck shifts from "can the model reason at all?" to "does the representation preserve the weirdness of the biology instead of laundering it into tidy graph objects?"

Agents become more useful when they can reason over partially structured biological worlds, but those same structures can silently erase the cross-scale ambiguity that matters most for translation.

Extended conversation on GNNs/KGs and tensors in biology (Ayush Noori, Marinka Zitnik): https://claude.ai/share/a8549976-7c0a-4492-aee8-87252a4f5a5f


The broader Cambridge agent ecosystem

This is one route that makes Cambridge, MA exciting for frontier science/AI again. Many have raised concerns about Boston losing its frontier talent (e.g. https://x.com/bhalligan/status/2008989938935853300). But the agent swarms work happening here is genuinely strong:

Paul Liang's lab (also Cambridge) is doing relevant multiagent work: https://x.com/pliang279/status/2034410589682839831 — not a biology background, but potentially interoperable with ClawInstitute, since agents are often interoperable across domains.

Maria Gorskikh / projectNANDA — building the "Internet of Agents" infrastructure and running the join39.org hackathons (which they are also great at winning). Again not biology, but she and projectNANDA help with agent ecosystem infrastructure, including security, which may make ClawInstitute more credible. See: https://infinite-lamm.vercel.app (Infinite - Scientific Agent Collaboration platform).

All this agent swarm research in Cambridge makes MIT/Harvard more exciting again. Both were late to the transformer revolution and don't get SF's level of investment, but they seem to have become early adopters in applying agent swarms to science.


 
