January 22, 2024

The Virchow Foundation Model, Explained: A Q&A with an AI Scientist

On January 18, 2024, Paige released new performance results for Virchow. Some of these results are detailed below, and the full results are available on arXiv.


In the fall of 2023, Paige announced its collaboration with Microsoft Research to build the world’s largest image-based AI model to fight cancer. True to its commitment, Paige promptly delivered by releasing results for the million-slide digital pathology Foundation Model named Virchow¹. In the few weeks since those initial results were made public, the model’s capabilities have already advanced substantially.

To shed light on Paige’s unique approach to this groundbreaking version of the Foundation Model, we sat down with Siqi Liu, Director of AI Science, to gain a deeper understanding of Paige’s methodology and its implications for the future of AI in cancer diagnosis and treatment.

Q: Can you provide an overview of the updates made to the Foundation Model, Virchow?

A: We are thrilled to introduce Virchow V1, a major upgrade to our Foundation Model, now fully trained on 1.5 million H&E-stained slides for superior task performance. The Model produces detailed tile embeddings from whole slide images (WSIs), offering a rich, data-driven foundation for a broad array of digital pathology applications. These embeddings can be seen as intricate digital fingerprints of tissue tiles, capturing unique histological features that empower pathologists with advanced insights for diagnosis, research, and tailored patient care.
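As a rough illustration of the idea of tile embeddings, the sketch below cuts a slide into a grid of tiles and averages per-tile vectors into one slide-level vector. It is only a minimal sketch: `toy_embed_tile` is a hypothetical stand-in for the model, the tile size of 224 is an assumption, and mean pooling is a deliberately simple aggregation chosen for illustration, not a description of Paige’s pipeline.

```python
from typing import Iterator, List, Sequence, Tuple


def tile_coordinates(width: int, height: int, tile_size: int) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) corners of non-overlapping tiles that fit fully inside the image."""
    for y in range(0, height - tile_size + 1, tile_size):
        for x in range(0, width - tile_size + 1, tile_size):
            yield (x, y)


def toy_embed_tile(x: int, y: int) -> List[float]:
    """Hypothetical stand-in for a foundation model: returns a made-up 3-dimensional vector."""
    return [float(x), float(y), 1.0]


def mean_pool(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average tile embeddings element-wise into a single slide-level vector."""
    if not embeddings:
        raise ValueError("need at least one tile embedding")
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


# Toy "slide" of 448 x 224 pixels split into 224-pixel tiles (assumed tile size).
coords = list(tile_coordinates(448, 224, 224))
tile_vectors = [toy_embed_tile(x, y) for x, y in coords]
slide_vector = mean_pool(tile_vectors)
```

In practice the per-tile vectors would come from the model itself, and a downstream application would typically learn its own way of combining them.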

The updated model represents a true breakthrough in computational pathology for several reasons:

First, the novel Virchow embeddings have enabled us to develop a pan-tumor detection system that can identify cancer across various organ types, one of the first of its kind for pathology. Our findings demonstrate that the Virchow embeddings surpass baseline models in accuracy, with especially strong performance in the detection of rare cancers.

Additionally, our updated model leads in tile-level benchmarks, surpassing both baselines and its predecessor. This includes a range of public and internally curated pan-tissue benchmarks, reinforcing our model’s robustness and versatility.

Finally, our efforts have also advanced the frontiers of predictive analytics, as the model exhibits extraordinary precision in pinpointing digital biomarkers, a testament to the potential of AI and machine learning in enhancing diagnostic methodologies.

Q: What are some of the early results of the pan-tumor model? What does this mean for the future of AI in cancer detection?

A: Through well-designed experiments leveraging high-quality clinical data, our pan-tumor model—powered by Virchow’s embeddings—has been found to excel in detecting a broad spectrum of cancers. It shows a specimen-level AUC of 0.95 for common cancers and 0.93 for rare cancers occurring in fewer than 40,000 cases annually in the US.
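For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen cancer specimen receives a higher score than a randomly chosen benign one. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, not Virchow results, and the function name `roc_auc` is chosen here for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Compute AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative specimen")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = cancer, 0 = benign; scores are invented for illustration.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

An AUC of 1.0 would mean every cancer specimen is scored above every benign one, while 0.5 corresponds to random ranking.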

The pan-tumor model’s effectiveness demonstrates the Foundation Model’s strength, bolstering our confidence in a unified AI-driven approach and setting the stage for continued development of tools that can support pathologists in critical areas, such as rare tumor detection, that have previously lacked AI support. The Virchow model also enables the simultaneous development of multiple AI applications, including cancer detection, grading, subtyping, measurement, quantification, and segmentation. This will significantly enhance Paige’s ability to develop robust product suites across many tissue types, offering comprehensive support to pathologists throughout their clinical workflow.

Q: Why were other applications not previously able to effectively identify rare cancers? How has the Foundation Model solved this issue?

A: Traditional machine learning models often struggle with rare cancers because limited data impedes the accurate pattern recognition necessary for diagnosis. Our Foundation Model overcomes these challenges by utilizing a diverse, large-scale dataset that enables the model to learn from a vast array of tissue types, understand cancer morphology, and apply this knowledge to effectively discern the tissue patterns of rare cancers despite data scarcity.

Q: Why is training on a million-slide dataset crucial for advancing digital pathology imaging?

A: Training on a massive, million-slide dataset is vital for digital pathology to create algorithms that are precise and universally applicable. Such extensive data encompasses the diverse and complex scenarios encountered in clinical practice, ensuring that the algorithms are not only empirically robust but also clinically valid.

Real-world clinical case data is ideal for Foundation Models as it includes the subtle variations and intricate patterns necessary for accurate diagnostics, especially for rare conditions often underrepresented in smaller datasets. This equips the model to perform effectively in real-world medical settings, leading to improved patient care. The dataset also includes common artefacts seen on whole slide images, such as cracks, bubbles, dust, and pen markings, as well as variations in slide preparation and staining reagents.

For example, powered by robust clinical data, the new pan-tumor model can recognize cancers that pathologists, especially in smaller hospitals around the globe, may not have seen before, ensuring that these cancers are not overlooked and that all patients have access to the best possible care.

Q: How is the Virchow model tailored to meet the demands of real-world digital pathology applications?

A: The model employs a Vision Transformer (ViT-H), striking a strategic balance between model representation power and computational cost. This makes it both powerful enough to process complex pathology data and cost-effective for widespread use in real clinical environments, thus aligning with the practical needs of digital pathology products.
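To give a sense of scale, the sketch below estimates the number of parameters in the transformer blocks of a ViT-H encoder. It assumes the standard ViT-H configuration from the original Vision Transformer work (32 layers, hidden width 1280, MLP width 5120); whether Virchow uses exactly these values is not stated in this interview, so treat the numbers as a back-of-the-envelope estimate rather than a specification.

```python
def transformer_block_params(hidden: int, mlp: int) -> int:
    """Count weights and biases in one standard pre-norm transformer encoder block."""
    attention = 3 * (hidden * hidden + hidden)    # query, key, value projections
    attention += hidden * hidden + hidden         # attention output projection
    feed_forward = hidden * mlp + mlp             # first MLP layer
    feed_forward += mlp * hidden + hidden         # second MLP layer
    layer_norms = 2 * (2 * hidden)                # two LayerNorms, scale and shift each
    return attention + feed_forward + layer_norms


# Assumed standard ViT-H settings (not confirmed for Virchow in this article).
VIT_H_LAYERS = 32
VIT_H_HIDDEN = 1280
VIT_H_MLP = 5120

encoder_params = VIT_H_LAYERS * transformer_block_params(VIT_H_HIDDEN, VIT_H_MLP)
print(f"Approximate encoder block parameters: {encoder_params / 1e6:.0f}M")
```

The estimate leaves out the patch embedding, position embeddings, and final normalization, which add comparatively little to the total.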

Q: What makes Virchow V1 the state-of-the-art Foundation Model for computational pathology?

A: When we describe the Virchow model as ‘state-of-the-art,’ we’re referring to:

  • Its exceptional performance across a spectrum of benchmark tasks specifically selected for their relevance to practical digital pathology.
  • Its use of the most advanced computer vision and AI technologies to date, tailored for computational pathology.

This superior level of performance indicates that products developed using our Foundation Model are poised to deliver greater clinical impact. By integrating the latest advancements in AI and machine learning, these products can enhance diagnostic precision, accelerate pathology workflows, and support faster pharmaceutical research. The term ‘state-of-the-art’ reflects not just our current technological excellence but also the potential to transform future practices in digital pathology, helping to improve patient outcomes and unlock more efficient healthcare delivery systems.

Q: What’s next for Virchow?

A: We plan to broaden our benchmark suite, enhancing the testing of the Virchow V1 model across a wider spectrum of computational pathology applications. This expansion will yield deeper insights for ongoing improvements and pinpoint how Virchow’s embeddings can be leveraged to enhance existing and develop new digital pathology AI applications.

In parallel, we’re excited to advance the development of Virchow V2. In collaboration with the exceptional team at Microsoft Research, our focus will be twofold:

Firstly, we aim to significantly enlarge the training dataset and model size while also extending our capabilities to include a broader spectrum of stains beyond H&E.

Secondly, we’re dedicated to refining our training methodologies to establish the most effective strategies for model research and development.

Together, these two goals will evolve the concept of a Foundation Model in digital pathology and in oncology more broadly. With cutting-edge algorithm research, Virchow V2 is set to surpass the current capabilities of Virchow V1, paving the way for new innovations in the field that will positively impact patients.

In advancing our roadmap, we aim to complement the collective efforts in the academic field and the industry. Our goal with the Virchow V1 model is to highlight the significant potential of scaling in computational pathology. We are committed to collaborating and sharing our findings with the broader research community to collectively push the boundaries of the field.


To learn more about Paige’s Foundation Model and the published results, view our publication on arXiv.

¹ Vorontsov E, Bozkurt A, Casson A, et al. Virchow: A Million-Slide Digital Pathology Foundation Model. arXiv. Updated preprint posted online January 18, 2024. arXiv:2309.07778v5