Markov Biosciences


We built a world model of the cell—a single self-supervised model, trained on rankings of mRNA counts, that encodes:

  • Where proteins sit in the cell — from nucleus to membrane to secreted
  • Which proteins physically interact — direct substrates, scaffolds, complex partners
  • How receptors signal — multi-hop kinase cascades recovered zero-shot
  • Whether a target is druggable — and by which modality
  • What transcription factors bind — including the sign of regulation and complex composition
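To make "rankings of mRNA counts" concrete, here is a minimal sketch of how one cell's raw count vector could be turned into a ranked gene list. The function name `rank_genes`, the choice to drop unexpressed genes, the tie-breaking rule, and the toy counts are all illustrative assumptions; the page does not describe the actual preprocessing.

```python
import numpy as np

def rank_genes(counts, gene_names):
    """Return expressed genes ordered from highest to lowest mRNA count.

    Hypothetical helper: zero-count genes are dropped and ties keep the
    original gene order. Both choices are assumptions for illustration.
    """
    counts = np.asarray(counts)
    expressed = np.flatnonzero(counts > 0)  # indices of genes with count > 0
    # Stable sort on negated counts gives descending order with stable ties.
    order = expressed[np.argsort(-counts[expressed], kind="stable")]
    return [gene_names[i] for i in order]

# Toy example with made-up counts for a single cell.
genes = ["ACTB", "GAPDH", "TP53", "MYC", "EGFR"]
cell_counts = [120, 45, 0, 45, 3]
ranking = rank_genes(cell_counts, genes)
print(ranking)
```

A ranking like this keeps only the relative order of expression and discards absolute magnitudes, which is one plausible reason a rank-based objective could tolerate noisy counts.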

State-of-the-art perturbation prediction. Monotonic scaling. No injected knowledge. No task-specific pretraining.

The model learned the cell because the cell is what generates the data. The right objective was all that was missing.

Paper: Generative ranking enables scalable pretraining on noisy biological multisets

GCTCAGAAGCGCCGAGAGCGCGGCCGGGACGGTTGGAGAAGAAGGCGGCTCCCGGAAGGGGGAGAGACAAACTGCCGTAACCTCTGCCGTTCAGGAACCCGGTTACTTATTTATTCGTTACCCTTTTTCTTCTTCCTCCCCCAAAAACCTTTTCCTTTTCCCTTCTTTTTTTTTCCTTTTTGGGAGCTGAAAAATTTCCGGTAAGGGAAAGAAGGGCTCCTTTCGCTCCTTATTTCCCCGCCTCCTTCCCTCCCCCACCTTCCCCTCCTCCGGCTTTTTCCTCCCAACTCGGGGAGGTCCTTCCCGGTGGCCGCCCTGACGAGGTCTGAGCACCTAGGCGGAGGCGGCGC
TACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAACCGTAGACCAGATAGCATAGACATACCGTAGACCAATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGATAGCATAGACATACCGTAATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACATACCGTAGACCAGATAGCATAGACA