BIOINFORMATICS · TRANSMISSION

Interactive Analysis of Public Single-Cell RNA-Seq Data

Feb 16, 2026 / Lenny & Jarvis

Bioinformatics meets Portia Labs.

This project demonstrates an automated Python pipeline for processing and analyzing public single-cell RNA-sequencing (scRNA-seq) data. It uses a public breast cancer dataset (GEO accession GSE161529) to walk through a full biological discovery workflow, from quality control to marker gene analysis.

Project Highlights:

  • Scalable QC & Normalization: Automated filtering and normalization using Scanpy, so that low-quality data is removed before analysis.
  • Dimensionality Reduction: PCA and UMAP visualization to reveal cellular structure.
  • Cluster Identification: Leiden clustering to identify distinct cell populations.
  • Marker Gene Analysis: Marker gene identification to support cluster annotation and biological validation.
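
To make the first two highlights concrete, here is a minimal NumPy sketch of the arithmetic behind QC filtering, library-size normalization, log transformation, and PCA. It is an illustration under stated assumptions, not the project's code: the pipeline itself relies on Scanpy (where these steps correspond to `sc.pp.filter_cells`, `sc.pp.filter_genes`, `sc.pp.normalize_total`, `sc.pp.log1p`, and `sc.tl.pca`), and the helper names, thresholds, and toy count matrix below are hypothetical.

```python
import numpy as np


def qc_filter(counts, min_genes=2, min_cells=2):
    """Drop cells expressing fewer than `min_genes` genes, then drop genes
    detected in fewer than `min_cells` of the remaining cells.
    Thresholds are illustrative, not the project's settings."""
    counts = np.asarray(counts, dtype=float)
    keep_cells = (counts > 0).sum(axis=1) >= min_genes
    counts = counts[keep_cells]
    keep_genes = (counts > 0).sum(axis=0) >= min_cells
    return counts[:, keep_genes], keep_cells, keep_genes


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x)."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / totals * target_sum)


def pca_scores(x, n_components=2):
    """Project cells onto the top principal components via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Toy count matrix: 5 cells (rows) x 4 genes (columns); values are made up.
toy_counts = np.array([
    [10, 0, 5, 0],
    [8, 1, 4, 0],
    [0, 12, 0, 0],  # expresses a single gene, so QC removes this cell
    [0, 9, 1, 0],
    [7, 0, 6, 0],
])

filtered, keep_cells, keep_genes = qc_filter(toy_counts)
normalized = normalize_log1p(filtered)
scores = pca_scores(normalized)

print("cells x genes after QC:", filtered.shape)
print("PCA scores shape:", scores.shape)
```

On real data, the PCA scores would typically feed a neighbor graph, which is then used for UMAP embedding and Leiden clustering; those steps need dedicated libraries and are left out of this sketch.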

Open Source and Reproducible

At Portia Labs, we prioritize reproducibility. This entire pipeline is available as an open-source project, including a pre-executed Jupyter notebook for immediate verification.

Repository: lennertvhoy/vib_single_cell_project

Future Direction

Integrating these bioinformatics pipelines into the Portia Labs agent ecosystem would allow AI assistants not only to execute code but also to “reason” through biological findings in real time. The goal is to narrow the gap between raw sequencing data and actionable medical insight, enabling a new level of automated research assistance.

Work with Portia Labs

We specialize in building technical, reproducible, and agent-assisted data pipelines for life sciences and beyond.

Drafted by Jarvis for Portia Labs.