Research Radar — 2026-06-22
Methods & AI
Computational
Hierarchical classification of immune cell transcriptomes at population-scale
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.05.30.728980
single-cell RNA-seq, immune cell classification, machine learning, hierarchical classification, tumour immunology, tumour-associated macrophages, T cells, population-scale
Summary: Presents Suco (single-cell universal classification omnibus), a resource of independent uniform expert annotations, and Compocyte, a modular hierarchical classifier for immune cell transcriptomes. Accurate immune cell classification is essential for single-cell RNA-seq interpretation, but progress in automated annotation is constrained by the lack of independent high-resolution benchmarks: routine data integration introduces statistical dependencies that inflate apparent model generalizability. Together, Suco and Compocyte establish a framework that substantially outperforms existing classifiers while facilitating expert review of ambiguous annotations. Applying Compocyte across 50 studies, including three newly generated datasets, the authors classified 15.6 million leukocytes from 3,965 patients. Within this cohort, they identified a new tumour-associated resorptive macrophage phenotype, a non-canonical monocyte subtype in subclinical cytokine release syndrome, and the programmatic erosion of T cell memory stemness across metastatic sites. The framework provides a generalizable system to uncover principles governing human immunity at population scale.
Why it matters: Cell type annotation remains one of the most laborious and error-prone steps in single-cell analysis, and most existing classifiers are trained on integrated datasets that introduce leakage between training and test data, artificially inflating reported performance. Suco and Compocyte address this by providing independent benchmarks and a modular hierarchical architecture that can be extended with new cell types without retraining the entire model. The scale of application (nearly 4,000 patients and more than 15 million cells) is unusually large for immune cell classification and demonstrates that systematic, unbiased annotation at population scale can uncover novel biology (the resorptive macrophage, the cytokine-release monocyte) that would be missed by manual annotation or simpler classifiers.
Why for Yiru: This framework is directly relevant to TME single-cell analysis. The ability to classify immune cells hierarchically and at population scale with consistent, unbiased annotations would greatly accelerate cross-study comparisons of TME immune composition. The newly identified tumour-associated resorptive macrophage phenotype is particularly interesting for TME biology — understanding whether these macrophages are pro- or anti-tumour and how they relate to existing TAM classification schemes could reveal new therapeutic targets. The finding that T cell memory stemness erodes programmatically across metastatic sites has direct implications for immunotherapy: it suggests that metastatic lesions harbour T cells with progressively diminished capacity for long-term tumour control. Methodologically, the Compocyte hierarchical architecture could be extended to include TME-specific cell states (e.g., exhausted T cell subsets, immunosuppressive macrophage states) as additional classification layers.
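A minimal sketch of top-down hierarchical classification with a confidence cut-off, the general idea behind a modular hierarchical classifier. This is not Compocyte's implementation: the toy hierarchy, the two-feature centroids, the `SHARPNESS` constant, and the `min_conf` threshold are all hypothetical, and a nearest-centroid rule stands in for whatever per-node models the preprint uses.

```python
import math

# Hypothetical toy hierarchy: every internal node owns a small local classifier,
# here a nearest-centroid rule over two made-up marker scores.
HIERARCHY = {
    "leukocyte": {"T cell": [1.0, 0.0], "myeloid": [0.0, 1.0]},
    "T cell": {"CD4 T": [1.0, 0.3], "CD8 T": [1.0, -0.3]},
    "myeloid": {"monocyte": [0.3, 1.0], "macrophage": [-0.3, 1.0]},
}
SHARPNESS = 5.0  # hypothetical scale that turns distances into softmax logits


def local_probs(x, centroids):
    """Softmax over scaled negative squared distances to each child's centroid."""
    scores = {
        child: -SHARPNESS * sum((a - b) ** 2 for a, b in zip(x, mu))
        for child, mu in centroids.items()
    }
    top = max(scores.values())
    exps = {child: math.exp(s - top) for child, s in scores.items()}
    total = sum(exps.values())
    return {child: e / total for child, e in exps.items()}


def classify(x, root="leukocyte", min_conf=0.8):
    """Descend the hierarchy; stop early and flag the cell if a node is unsure."""
    path = [root]
    node = root
    while node in HIERARCHY:
        probs = local_probs(x, HIERARCHY[node])
        best = max(probs, key=probs.get)
        if probs[best] < min_conf:
            return path, True  # ambiguous below this node: candidate for review
        path.append(best)
        node = best
    return path, False


print(classify([1.0, 0.5]))  # confident all the way down to a leaf
print(classify([0.9, 0.1]))  # stops at "T cell" and is flagged
```

In this layout, adding a new cell type only touches the node that gains a child, which is one way to read the "extend without retraining the entire model" property. Cells that stop at an internal node keep a valid coarse label and can be routed to expert review.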
Antibody-Antigen Affinity Prediction with Chain-Aware Protein Language Modeling
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.06.19.733375
antibody design, protein language model, affinity prediction, deep learning, computational immunology, drug discovery
Summary: Presents AbAffinity, a sequence-only chain-aware three-stream architecture for predicting antibody-antigen binding affinity that maintains heavy chain, light chain, and antigen as distinct streams. Antibody-antigen affinity determines which candidates advance in therapeutic discovery, repertoire analysis, and affinity maturation, but experimental measurements are sparse relative to the scale of sequence libraries. Structure-based predictors require reliable complexes that are often unavailable in early discovery, while existing sequence-based models frequently compress heavy and light chains into a single representation, obscuring chain-specific and epitope-specific signals. AbAffinity integrates frozen ESM-2 embeddings with heavy-chain CDR-focused pooling, heavy-light self-attention, adaptive fusion gating, and gated cross-attention — training only a compact interaction module. On the SAAINT-DB benchmark, AbAffinity achieves strong predictive performance under ten-fold cross-validation and maintains robust accuracy on novel antigens. It consistently outperforms recent sequence-based models across SAbDab, AB-Bind, and SKEMPI 2.0 external benchmarks. Integrated Gradients attributions recover known paratope and epitope residues at structurally validated interfaces, providing interpretability.
Why it matters: Antibody discovery remains heavily dependent on experimental screening because computational affinity prediction — especially for novel antigens without solved structures — is unreliable. AbAffinity demonstrates that chain-aware architecture, which explicitly models the distinct contributions of heavy chain, light chain, and antigen rather than collapsing them into a single embedding, yields substantial performance improvements. The fact that it uses only sequence information (via frozen ESM-2 embeddings) and trains a compact interaction module makes it both scalable and practical for early discovery where thousands of candidate antibody-antigen pairs must be ranked. The interpretability — recovering known paratope residues — is also practically valuable for guiding affinity maturation.
Why for Yiru: Computational antibody design and immune receptor-ligand interaction prediction are increasingly important for TME research as immunotherapies diversify beyond checkpoint blockade to include bispecific antibodies, antibody-drug conjugates, and engineered cytokine receptors. AbAffinity's chain-aware architecture could be adapted to predict TCR-pMHC binding affinity, which remains a holy grail of computational immunology. The CDR-focused pooling strategy is particularly relevant for T cell receptor analysis, where CDR3 loops largely determine specificity. More broadly, the three-stream architecture — maintaining separate representations that interact through gated cross-attention — is a design pattern applicable to any problem involving molecular recognition between two or more distinct entities, such as ligand-receptor or peptide-MHC interactions in the TME.
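A minimal sketch of the gated cross-attention pattern mentioned for AbAffinity, in which each stream keeps its own tokens and receives a gated residual update from another stream. It is an illustration under simplifying assumptions, not the model itself: there are no learned query/key/value projections, the gate is a single scalar, and the two-dimensional "embeddings" are made-up numbers standing in for frozen ESM-2 outputs.

```python
import math


def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention: each query token mixes the other stream's tokens."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys_values]
        weights = softmax(logits)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(dim)]
        )
    return attended


def gated_cross_attention(stream, other, gate_logit=0.0):
    """Residual update: stream + sigmoid(gate_logit) * attention(stream -> other)."""
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    attended = cross_attend(stream, other)
    return [[s + gate * a for s, a in zip(tok, att)] for tok, att in zip(stream, attended)]


# Three separate streams (hypothetical toy embeddings).
heavy = [[1.0, 0.0], [0.0, 1.0]]
light = [[0.5, 0.5]]
antigen = [[1.0, 1.0]]

heavy_ctx = gated_cross_attention(heavy, antigen, gate_logit=0.0)
light_ctx = gated_cross_attention(light, antigen, gate_logit=0.0)
print(heavy_ctx, light_ctx)
```

Because the gate multiplies only the attended term, a strongly negative `gate_logit` leaves a stream essentially unchanged. That is the usual motivation for gating when the backbone embeddings are frozen and only a compact interaction module is trained.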
πDIA-CLIP: efficient identification of highly heterogeneous proteomics data via a generalized zero-shot framework
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.02.09.704949
proteomics, mass spectrometry, DIA, zero-shot learning, contrastive learning, computational method, single-cell proteomics
Summary: Presents πDIA-CLIP, a generalized framework that shifts data-independent acquisition (DIA) mass spectrometry analysis from semi-supervised training to zero-shot cross-modal representation learning. DIA has emerged as a cornerstone for characterizing heterogeneous biological systems including single-cell proteomics, metaproteomics, and spatial proteomics, yet current analysis frameworks require semi-supervised training within each run for peptide-spectrum match re-scoring — a process prone to overfitting and lacking generalizability across species and experimental conditions. πDIA-CLIP integrates dual-encoder contrastive learning with encoder-decoder architectures to establish a unified, high-precision representation for spectral features and peptides. The zero-shot design enables an inference-only architecture with exceptional computational efficiency. Across five distinct benchmarks, πDIA-CLIP consistently outperforms existing tools, yielding up to a 44.6% increase in protein identification alongside up to 52.5% reduction in false entrapment identifications. The enhanced depth facilitates discovery of novel biomarkers and elucidation of intricate cellular mechanisms.
Why it matters: DIA mass spectrometry is increasingly the method of choice for proteomics, but analysis remains a bottleneck: current tools require per-experiment training that is computationally expensive and prone to overfitting when applied to new sample types or species. πDIA-CLIP's zero-shot approach means it can be applied to new DIA datasets without retraining, which is particularly valuable for emerging applications (single-cell proteomics, spatial proteomics, metaproteomics) where experimental conditions vary widely and reference spectral libraries are incomplete. An increase of up to 44.6% in protein identifications is practically significant: it means recovering proteins, and potentially biology, that would otherwise remain hidden in the noise.
Why for Yiru: Spatial proteomics and single-cell proteomics are frontier technologies for TME characterization. Understanding the spatial organization of the TME proteome — which immune checkpoints are expressed where, which cytokine gradients exist, how the extracellular matrix is remodelled — requires robust DIA analysis pipelines. πDIA-CLIP's zero-shot capability means it could be applied to TME spatial proteomics data from diverse tumour types, sample preparation methods, and mass spectrometry platforms without needing to retrain for each experiment. The improved identification depth could reveal low-abundance TME proteins (chemokines, checkpoint ligands, proteases) that are functionally important but easily missed by conventional analysis.
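A minimal sketch of the dual-encoder contrastive idea behind CLIP-style models, applied here to spectrum and peptide embeddings. It assumes the two encoders already exist and simply uses hand-written vectors in their place. The symmetric InfoNCE loss, the `temperature` value, and the function names are illustrative choices, not details taken from πDIA-CLIP.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clip_loss(spectra, peptides, temperature=0.1):
    """Symmetric InfoNCE: embedding i in one modality should match embedding i in the other."""
    sims = [[cosine(s, p) for p in peptides] for s in spectra]
    n = len(sims)

    def cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            top = max(logits)
            log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
            total += log_norm - logits[i]
        return total / n

    columns = [list(col) for col in zip(*sims)]
    return 0.5 * (cross_entropy(sims) + cross_entropy(columns))


def zero_shot_match(spectrum, candidate_peptides):
    """Inference only: index of the candidate peptide embedding closest to the spectrum."""
    scores = [cosine(spectrum, p) for p in candidate_peptides]
    return max(range(len(scores)), key=scores.__getitem__)


spectra = [[1.0, 0.0], [0.0, 1.0]]       # hypothetical spectrum embeddings
peptides = [[0.9, 0.1], [0.1, 0.9]]      # hypothetical matching peptide embeddings
print(clip_loss(spectra, peptides))
print(zero_shot_match([1.0, 0.0], peptides))
```

Once the shared embedding space is trained, scoring a new run needs only the similarity lookup in `zero_shot_match`, which is the sense in which an inference-only pipeline avoids per-run re-scoring models.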
SIEVEseq: One-stop differential expression, variability, and skewness analyses using RNA-Seq data
bioRxiv Published 2026-06-21 preprint DOI: 10.1101/2024.04.09.588804
RNA-seq, differential expression, statistical method, gene expression variability, skewness, Alzheimer's disease
Summary: Presents SIEVEseq, a statistical methodology that unifies differential expression, variability, and skewness testing for RNA-seq data within a single framework. RNA-seq analysis is commonly biased toward detecting differentially expressed genes and insufficiently conveys the complexity of gene expression changes between conditions. This bias arises because discrete count models cannot fully and independently parameterize the mean, variance, and skewness of gene expression distributions. SIEVEseq uses a compositional data analysis strategy to transform discrete RNA-seq counts into continuous form well-fitted by the skew-normal distribution, enabling simultaneous testing of all three distributional properties. Both parametric and nonparametric simulations show SIEVEseq better controls false discovery rate and Type II error than existing differential expression methods. Analysis of the Mayo RNA-seq dataset for Alzheimer's disease demonstrates that gene sets with significant differences in mean, variance, and skewness between control and disease groups strongly predict disease state. Functional enrichment analysis indicates that relying solely on differentially expressed genes identifies only part of the biological spectrum, whereas incorporating genes with differential variability and skewness reveals additional disease-related aspects.
Why it matters: Two decades of RNA-seq analysis have been dominated by differential expression testing, yet much of the biologically meaningful signal in transcriptomic data lies in changes to expression variability (heterogeneity) and distributional shape (skewness) rather than mean expression alone. SIEVEseq provides a principled statistical framework for capturing all three types of change simultaneously, with proper false discovery rate control. The demonstration that variability and skewness changes capture disease-relevant biology missed by differential expression — and improve disease state prediction — is a compelling argument for moving beyond mean-only analysis. This is particularly relevant in the context of cellular heterogeneity, where a treatment may not change average expression but may alter which cells express a gene and by how much.
Why for Yiru: TME transcriptomic analysis is fundamentally about heterogeneity — tumour cells, immune cells, and stromal cells exhibit highly variable gene expression programmes that cannot be captured by mean expression alone. For example, immune checkpoint expression may become more variable (some cells high, some low) under checkpoint blockade without changing mean expression — a pattern that SIEVEseq would detect but standard differential expression would miss. Similarly, T cell exhaustion markers may exhibit increased expression variability as some cells progress toward terminal exhaustion while others retain progenitor potential. Applying SIEVEseq to TME scRNA-seq or bulk RNA-seq datasets could reveal regulatory programmes that operate through changes in cellular heterogeneity rather than mean expression shifts.
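A minimal sketch of the three quantities SIEVEseq is described as testing: mean, variance, and skewness of a gene's transformed expression. It assumes a centred log-ratio (CLR) transform with a pseudocount as the compositional step, which is a common choice but not necessarily the one the preprint uses. It also reports plain sample moments and does not fit a skew-normal distribution or run any hypothesis test.

```python
import math
from statistics import fmean


def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of one sample's gene counts (pseudocount is an assumption)."""
    logs = [math.log(c + pseudocount) for c in counts]
    centre = fmean(logs)
    return [v - centre for v in logs]


def moments(values):
    """Return (mean, variance, skewness) using population (divide-by-n) formulas."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return mean, m2, skew


# Hypothetical CLR values of one gene across five samples per group.
control = [0.1, -0.1, 0.0, 0.05, -0.05]
treated = [-0.4, -0.4, -0.4, -0.4, 1.6]

for name, group in (("control", control), ("treated", treated)):
    mean, var, skew = moments(group)
    print(f"{name}: mean={mean:.3f} var={var:.3f} skew={skew:.3f}")
```

The two toy groups share the same mean, so a mean-only comparison sees nothing, while the variance and skewness differ clearly. That is the kind of change the entry above argues differential expression alone would miss.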
Biomedical discoveries
Biomedicine
Combination epigenetic-targeted therapy increases the immunogenicity of poorly immunogenic sarcomas
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.06.18.733244
sarcoma, immunotherapy, epigenetic therapy, cancer testis antigens, immune checkpoint, vaccine, T cells
Summary: Demonstrates that sequential treatment with the hypomethylating agent decitabine and the histone deacetylase inhibitor entinostat converts poorly immunogenic sarcomas into T cell-sensitive tumours. Immunotherapy approaches have shown limited efficacy in paediatric sarcomas, partly because these tumours have low mutation burden and few neoantigens. Using a mutated Kras-driven murine sarcoma model (KP Sarc), sequential treatment with decitabine and entinostat significantly increased expression of epigenetically silenced genes — including cancer testis antigens — and enhanced antigen presentation including MHC I expression, compared with either agent alone. Vaccination with irradiated, epigenetically treated KP Sarc cells in a GM-CSF-secreting whole-cell vaccine induced T cell immunity against a matched tumour challenge. The anti-tumour response was directed toward epigenetically upregulated antigens, was T cell dependent, was further potentiated by immune checkpoint inhibition, and conferred immunologic memory. Epigenetically regulated antigens were shown to be shared between tumours, providing protective immunity against both the treated KP Sarc and a second murine sarcoma M-3-9M. Treatment of human sarcoma lines with decitabine and entinostat induced similar gene expression changes, including shared antigen targets, and increased MHC I expression.
Why it matters: Low-mutation-burden tumours, including most paediatric cancers and many sarcomas, have been largely excluded from the immunotherapy revolution because they present few neoantigens for T cell recognition. This study demonstrates a clinically pragmatic strategy for converting such tumours into immunologically recognizable targets: epigenetic drugs that are already approved or in clinical development for other indications can de-repress silenced antigen genes, effectively creating neoantigen-like targets from the tumour's own genome. The finding that these epigenetically upregulated antigens are shared across independent tumour lines is particularly exciting, because it suggests the possibility of off-the-shelf epigenetic vaccines that prime immunity against conserved tumour antigens without requiring patient-specific neoantigen prediction. The synergy with checkpoint blockade indicates a plausible clinical path: epigenetic priming followed by anti-PD-1/PD-L1.
Why for Yiru: This work opens a new angle for TME immunotherapy research: epigenetic manipulation of tumour immunogenicity. Many TME-relevant questions follow directly: do epigenetically upregulated antigens include TME-modulating factors (chemokines, cytokines, ECM remodellers) that could reshape the immune infiltrate? Does epigenetic priming alter the repertoire of tumour-specific T cells that traffic to the tumour? Could the epigenetic agents themselves modify the TME stroma or immune cells in ways that complement the increased antigen presentation? From a computational perspective, predicting which epigenetically silenced genes in a given tumour type can be reactivated — and which of those encode immunogenic peptides — is a tractable machine learning problem that could guide patient selection for epigenetic-immunotherapy combinations.
Cytokines driven by caspase-1 and RIPK3 are antagonistic
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.06.19.733458
innate immunity, cell death, cytokines, caspase-1, RIPK3, pyroptosis, necroptosis, type I interferon, IL-1β
Summary: Reveals that the cytokines produced by caspase-1-dependent inflammasome signalling and RIPK3-dependent necroptotic signalling are functionally antagonistic, and that this cytokine balance — not cell death itself — determines the outcome of intracellular bacterial infection. Pyroptosis, apoptosis, and necroptosis are thought to provide redundant protection against intracellular pathogens. The authors found that mice lacking inflammasome signalling together with caspase-8 and MLKL were highly susceptible to infection by an environmental cytosol-invasive bacterium. Surprisingly, deletion of Ripk3 completely restored resistance despite the continued absence of all three cell death pathways. The outcome was determined not through cell death but via production of antagonistic cytokines: deleting Casp8 and Mlkl initiated an effector-triggered immunity-like response in which RIPK3 induced type I interferons causing catastrophic susceptibility, while caspase-1-dependent IL-1β production counteracted these type I interferons and restored resistance. Thus, susceptibility arose not from failure of regulated cell death, but from the activation of RIPK3-driven type I interferons unopposed by caspase-1-driven IL-1β.
Why it matters: This study fundamentally reframes our understanding of innate immune cell death pathways. The prevailing dogma holds that pyroptosis, apoptosis, and necroptosis serve as redundant cell-autonomous defences — if one pathway is blocked, another compensates by killing the infected cell. Here, the key determinant of infection outcome is not which cell death pathway executes but rather the balance of cytokines produced by each pathway's signalling machinery. Type I interferons and IL-1β are both potent immunomodulators with opposing effects in many contexts, and their antagonism may explain puzzling observations in infectious disease, autoimmunity, and cancer where blockade of one pathway paradoxically worsens outcome. The concept of "cytokine antagonism" embedded within cell death signalling cascades is a new paradigm with broad implications for targeting these pathways therapeutically.
Why for Yiru: The antagonistic relationship between type I interferon and IL-1β signalling is directly relevant to TME immunobiology. Type I interferons have complex, context-dependent effects in the TME — they can promote anti-tumour T cell responses but also drive immunosuppression through PD-L1 upregulation and T cell exhaustion. IL-1β, conversely, is often pro-tumourigenic, promoting inflammation, angiogenesis, and metastasis. Understanding how the balance between these cytokines is set by the relative activity of inflammasome versus RIPK3 signalling in TME cells (tumour cells, macrophages, dendritic cells) could explain differential immunotherapy outcomes. Tumour cell death induced by chemotherapy or radiation may trigger one pathway or the other depending on the specific damage signals, potentially determining whether the resulting immune response is protective or suppressive.
An APOC1+ inflammatory CAF-like state drives a senescent, treatment-resistant niche in rheumatoid arthritis
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.04.17.718831
rheumatoid arthritis, fibroblasts, CAF, senescence, treatment resistance, single-cell, spatial transcriptomics, stromal biology
Summary: Identifies a CXCL12-high, APOC1+ fibroblast population selectively enriched in treatment-refractory rheumatoid arthritis (RA) synovitis that resembles inflammatory cancer-associated fibroblasts (iCAFs). RA synovitis frequently persists despite cytokine-targeted therapies, suggesting pathogenic stromal programmes that sustain chronic inflammation independently of canonical immune pathways. Using multimodal single-cell and spatial profiling of synovial tissue from 54 RA patients with prospective treatment-response data, the authors found that CXCL12-hi APOC1+ fibroblasts establish CXCL12-dependent plasmablast niches within inflamed synovium, mirroring how iCAFs orchestrate immune cell recruitment in the tumour microenvironment. These fibroblasts exhibited a senescence-associated iCAF-like transcriptional programme characterized by STAT3-C/EBP activation and APOC1 expression, and were associated with poor response to TNF and IL-6 pathway inhibition. Mechanistically, APOC1 knockdown attenuated invasive mesenchymal behaviour and disrupted senescence-associated inflammatory programmes. Genetic or pharmacological elimination of senescent cells ameliorated experimental arthritis and enhanced the efficacy of TNF blockade.
Why it matters: The convergence of cancer biology and inflammatory disease biology around fibroblast states is increasingly evident. This study identifies an iCAF-like fibroblast in rheumatoid arthritis that functions analogously to tumour-promoting CAFs — establishing chemokine-dependent immune niches, driving treatment resistance, and maintaining a senescence-associated secretory phenotype. The finding that senolytic therapy enhances the efficacy of standard-of-care biologic therapy (TNF blockade) suggests a general principle: treatment-refractory inflammatory diseases may be driven not by failure to block the primary inflammatory cytokine but by stromal senescence programmes that sustain inflammation through cytokine-independent mechanisms. This has immediate translational implications for RA and potentially for other chronic inflammatory and fibrotic diseases.
Why for Yiru: Cancer-associated fibroblasts are a major TME component whose heterogeneity and functional plasticity remain poorly understood. The identification of a specific fibroblast state (APOC1+ iCAF-like) that drives immune niche formation and treatment resistance in RA provides a blueprint for studying analogous CAF states in the TME. The same CXCL12-dependent immune cell organization, senescence-driven inflammatory programmes, and STAT3-C/EBP transcriptional regulation likely operate in tumour-promoting CAFs. Single-cell and spatial profiling studies of TME fibroblasts could specifically look for this APOC1+ signature to determine whether treatment-refractory tumours harbour analogous fibroblast populations. The senolytic approach — eliminating pathogenic fibroblasts to restore therapy sensitivity — is directly translatable to cancer, where CAF-targeted senolytics could potentially overcome resistance to immunotherapy or chemotherapy.
A motif-vocabulary model of CAR T-cell intracellular domains identifies determinants of immunophenotype differentiation
bioRxiv Published 2026-06-21 preprint DOI: 10.64898/2026.02.01.700582
CAR T-cell, immunotherapy, signalling motifs, costimulatory domain, protein engineering, systems immunology, high-throughput screening
Summary: Screens 1,243 naturally occurring intracellular domains as costimulatory modules in an anti-CD20 CAR backbone in primary human CD8+ T cells, quantifying construct enrichment across memory-differentiation and PD-1-defined immunophenotypic compartments. CAR T-cell efficacy depends critically on the costimulatory domain, yet engineering has focused on a narrow set of domains — principally CD28 and 4-1BB — leaving much of the signalling design space unexplored. Using Eukaryotic Linear Motif (ELM) annotations, the authors analysed motif-phenotype associations via complementary statistical approaches. Mann-Whitney screening and negative binomial regression identified ELM features associated with differential construct representation, while Dirichlet-Multinomial modelling — which properly accounts for the compositional structure of FACS-partitioned data — revealed that individual linear motifs primarily affect proliferation or survival rather than differentiation fate. In contrast, construct-level analysis identified specific costimulatory domains with significant phenotype-shifting effects, demonstrating that combinations of motifs — rather than individual motifs — determine immunophenotype. The results provide a combinatorial framework for rational costimulatory domain engineering.
Why it matters: CAR T-cell therapy has been transformative for haematological malignancies but has achieved limited success in solid tumours, partly because the two dominant costimulatory domains (CD28 and 4-1BB) drive distinct T cell differentiation programmes — CD28 promotes rapid effector function and exhaustion, while 4-1BB favours memory formation and persistence. This study systematically explores the natural diversity of intracellular signalling domains, providing a roadmap for engineering CARs with custom-tuned immunophenotypes. The finding that individual motifs control proliferation/survival while motif combinations control differentiation is mechanistically important: it suggests modular engineering strategies where one motif set drives expansion and another drives memory formation. The 1,243-domain screening resource itself is a valuable reference for the field.
Why for Yiru: CAR T-cell therapy for solid tumours is intimately connected to TME biology — CAR T cells must traffic into tumours, survive in the immunosuppressive TME, and maintain effector function despite chronic antigen exposure. Engineering costimulatory domains that optimize this TME-adapted phenotype is a key challenge. The motif-vocabulary framework could be extended to predict which domain combinations would generate CAR T cells with enhanced TME infiltration, resistance to exhaustion, or ability to reprogramme the TME through cytokine production. More broadly, the high-throughput domain screening approach could be applied to other synthetic receptors (synNotch, T cell engagers) and to study how endogenous T cell signalling motifs are rewired in the TME during chronic stimulation.
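A minimal sketch of the Dirichlet-multinomial likelihood that underlies the compositional modelling mentioned above, where one construct's cells are split across FACS-sorted compartments. The log-pmf formula is the standard one. The bin counts and concentration vectors are hypothetical, and the preprint's actual regression structure on motif features is not reproduced here.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """log P(counts | alpha) for a Dirichlet-multinomial with concentration vector alpha."""
    n = sum(counts)
    a0 = sum(alpha)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    log_ratio = math.lgamma(a0) - math.lgamma(n + a0)
    log_terms = sum(math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha))
    return log_coeff + log_ratio + log_terms


# Hypothetical cell counts of one construct across four sorted compartments.
counts = [30, 10, 5, 5]
alpha_null = [5.0, 5.0, 5.0, 5.0]      # no preference between compartments
alpha_shift = [12.0, 4.0, 2.0, 2.0]    # enrichment in the first compartment

log_lr = (dirichlet_multinomial_logpmf(counts, alpha_shift)
          - dirichlet_multinomial_logpmf(counts, alpha_null))
print(f"log-likelihood ratio (shift vs null): {log_lr:.2f}")
```

Because the counts in the sorted bins compete for the same pool of cells, modelling them jointly avoids treating a gain in one compartment and the matching loss in another as independent signals. The overdispersion controlled by the size of `alpha` absorbs sorting and sampling noise.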
Cross-disciplinary watchlist
Other Fields
The recount3 Python package for programmatic access to uniformly processed RNA-seq data
bioRxiv Published 2026-06-20 preprint DOI: 10.64898/2026.06.17.732943
RNA-seq, data resource, Python, bioinformatics, reproducibility, open science
Summary: Presents the recount3 Python package, providing native Python and command-line access to tens of thousands of uniformly processed RNA-seq samples across human and mouse from major sequencing repositories. The recount3 online resource has been a cornerstone of large-scale transcriptomic analysis, but access has traditionally been centred in the R/Bioconductor ecosystem. With the growing prominence of Python in bioinformatics and machine learning, the recount3 Python package fills a critical gap, offering robust API and CLI interfaces for discovering, downloading, and materializing recount3 data collections, metadata, and gene expression matrices. The package enables programmatic access to uniformly processed data from the Sequence Read Archive, GTEx, TCGA, and other major projects — enabling rapid cross-study analyses without requiring raw data reprocessing.
Why it matters: Data reusability is one of the most persistent bottlenecks in bioinformatics. Tens of thousands of RNA-seq datasets sit in public repositories, but accessing and uniformly processing them for cross-study analysis requires substantial computational expertise and resources. recount3 addresses this by providing pre-processed, uniformly quantified gene expression data at scale. By making these data accessible natively in Python — the dominant language for machine learning and data science — this package dramatically lowers the barrier to large-scale transcriptomic meta-analysis and machine learning applications. Researchers can now query and download thousands of uniformly processed samples with a few lines of Python, enabling analyses that were previously impractical.
Why for Yiru: Large-scale transcriptomic meta-analysis is increasingly important for TME research — for example, comparing immune signatures across thousands of tumours from different studies, training pan-cancer models of TME composition, or identifying robust transcriptional correlates of immunotherapy response. The recount3 Python package makes these analyses practical by eliminating the data processing bottleneck. For computational TME projects, being able to query all available sarcoma RNA-seq data, or all melanoma samples with immunotherapy response annotations, through a uniform Python interface could accelerate biomarker discovery and model development substantially.
Fast Multi-objective RNA Optimization with Autoregressive Reinforcement Learning (RNAJog)
bioRxiv Published 2026-06-20 preprint DOI: 10.1101/2025.08.26.672486
mRNA design, codon optimization, reinforcement learning, deep learning, vaccine development, RNA structure
Summary: Presents RNAJog (RNA Joint Optimization with autoregressive Generative model), a framework integrating autoregressive generation with reinforcement learning to simultaneously optimize codon sequences for minimum free energy (MFE), codon adaptation index (CAI), and GC content. Codon optimization is essential for mRNA vaccine and therapeutic development, yet existing tools face limitations in computational efficiency, sequence diversity, and universality. RNAJog addresses these by framing codon optimization as a multi-objective reinforcement learning problem, where the autoregressive model generates candidate sequences and the reward function balances MFE minimization, CAI maximization, and GC content targeting. Critically, the framework can design sequences without requiring annotated training data, making it applicable to novel or poorly characterized coding sequences. Evaluations in both in silico and wet-lab experiments confirmed that RNAJog-optimized sequences achieve superior expression compared to conventionally optimized sequences.
Why it matters: The mRNA vaccine revolution — catalysed by COVID-19 — has created an urgent need for better codon optimization tools. Current methods are largely heuristic (e.g., using the most frequent codon for each amino acid), which ignores the complex interplay between codon usage, RNA secondary structure, and translation efficiency. RNAJog's multi-objective optimization approach explicitly models these trade-offs, and the reinforcement learning framework can discover non-obvious codon combinations that satisfy multiple constraints simultaneously. The ability to operate without training data is particularly valuable for emerging applications — therapeutic mRNAs encoding novel proteins, personalized cancer vaccines, or mRNAs for non-model organisms — where codon usage data may be sparse or unavailable.
Why for Yiru: mRNA-based cancer vaccines and in situ CAR T-cell generation are frontier TME immunotherapies where codon optimization directly impacts efficacy. Poorly optimized mRNAs produce less antigen, reducing the magnitude of the anti-tumour T cell response. RNAJog could be applied to optimize mRNAs encoding tumour neoantigens, cytokines (IL-12, IL-2), or CAR constructs for intratumoural delivery. The multi-objective framework, which balances structural stability (MFE), codon adaptation (CAI), and GC content, may be particularly well suited to the TME context, where mRNAs must function in an inflammatory, hypoxic, and nuclease-rich tumour environment. The reinforcement learning architecture also provides a template for optimizing other nucleic acid therapeutics (siRNA, ASOs, guide RNAs) for TME applications.
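A minimal sketch of how a multi-objective codon-optimization reward over CAI, GC content, and MFE might be scalarized. Only the reward is shown, with no autoregressive policy or reinforcement learning loop, and it is not RNAJog's reward function. The two-amino-acid codon usage table, the weights, and the GC target are hypothetical, and the MFE value is assumed to come from an external RNA folding tool.

```python
import math

# Hypothetical codon usage frequencies for two amino acids (illustration only).
CODON_USAGE = {
    "GCU": 0.2, "GCC": 0.4, "GCA": 0.2, "GCG": 0.2,  # alanine
    "AAA": 0.4, "AAG": 0.6,                          # lysine
}
AMINO_ACID = {"GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A", "AAA": "K", "AAG": "K"}


def codons(seq):
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def relative_adaptiveness(codon):
    """Codon frequency divided by the frequency of the best synonymous codon."""
    best = max(f for c, f in CODON_USAGE.items() if AMINO_ACID[c] == AMINO_ACID[codon])
    return CODON_USAGE[codon] / best


def cai(seq):
    """Codon adaptation index: geometric mean of relative adaptiveness values."""
    ws = [relative_adaptiveness(c) for c in codons(seq)]
    return math.exp(sum(math.log(w) for w in ws) / len(ws))


def gc_content(seq):
    return sum(base in "GC" for base in seq) / len(seq)


def reward(seq, mfe, gc_target=0.55, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of CAI, closeness to a GC target, and per-nucleotide stability."""
    w_cai, w_gc, w_mfe = weights
    gc_term = 1.0 - abs(gc_content(seq) - gc_target)
    mfe_term = -mfe / len(seq)  # more negative MFE (more stable fold) scores higher
    return w_cai * cai(seq) + w_gc * gc_term + w_mfe * mfe_term


# Two synonymous encodings of Ala-Lys; the MFE values are placeholders.
print(reward("GCCAAG", mfe=-1.0), reward("GCUAAA", mfe=-1.0))
```

A weighted sum is the simplest way to combine the objectives. The weights decide how much codon adaptation is traded against structure, which is where a learned policy would be expected to find less obvious compromises than a most-frequent-codon heuristic.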
Vessel Spatial Analysis (VeSpA): a tool for whole slide image segmentation, morphometry, and QuPath extension
bioRxiv Published 2026-06-20 preprint DOI: 10.64898/2026.06.15.732366
computational pathology, whole slide imaging, vessel analysis, image segmentation, QuPath, spatial analysis
Summary: Presents VeSpA (Vessel Spatial Analysis), a computational pathology tool for whole slide image segmentation and morphometric analysis of blood vessels, implemented as a QuPath extension. Quantitative analysis of vascular architecture in tissue sections is important for understanding tumour angiogenesis, tissue perfusion, and drug delivery, yet existing tools are either proprietary, require programming expertise, or lack spatial analysis capabilities. VeSpA provides automated vessel segmentation from H&E or immunohistochemistry-stained whole slide images, extraction of morphometric features (vessel density, diameter, shape, tortuosity), and spatial analysis of vessel distribution relative to tissue landmarks. The QuPath integration makes it accessible to pathologists and biologists without requiring programming, while the open-source nature enables customization and extension for specific research questions.
Why it matters: Vascular architecture is a critical but under-quantified feature of the TME. Tumour blood vessels are structurally and functionally abnormal — they are tortuous, leaky, and poorly perfused — contributing to hypoxia, immune exclusion, and poor drug delivery. Yet most TME studies that include vessel analysis rely on simple metrics like microvessel density, which capture only a fraction of the biologically relevant information. VeSpA enables comprehensive, spatially aware vessel morphometry from routine histology slides, making high-dimensional vessel phenotyping accessible to any lab with a slide scanner. The QuPath integration is key: QuPath is widely used in digital pathology, and VeSpA extends it with specialized vessel analysis capabilities.
Why for Yiru: Tumour vasculature is a major determinant of TME immune infiltration — well-perfused tumours with normal-like vessels tend to have better T cell infiltration, while chaotic, poorly perfused vasculature correlates with immune exclusion. VeSpA could be applied to quantify vascular architecture in TME studies, correlating vessel morphometric features with immune cell infiltration patterns from multiplex immunohistochemistry or spatial transcriptomics. Understanding which vascular features predict immunotherapy response — and whether vascular normalizing agents improve T cell infiltration — are key translational questions that tools like VeSpA can help address at scale. The spatial analysis component (vessel distribution relative to tumour regions, immune aggregates, or necrotic zones) is particularly relevant for TME spatial biology.
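A minimal sketch of a few vessel morphometric features of the kind listed for VeSpA, computed from already-segmented geometry. These are common textbook definitions (arc-to-chord tortuosity, equivalent circular diameter, vessels per unit area) and are assumptions about what such features mean, not VeSpA's code or its QuPath API. The coordinates and areas are made up.

```python
import math


def path_length(points):
    """Sum of straight segment lengths along an ordered centreline."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def tortuosity(centreline):
    """Arc length divided by the straight-line distance between the endpoints (>= 1)."""
    chord = math.dist(centreline[0], centreline[-1])
    if chord == 0:
        raise ValueError("closed centreline: arc-chord tortuosity is undefined")
    return path_length(centreline) / chord


def equivalent_diameter(lumen_area):
    """Diameter of the circle with the same area as the segmented vessel lumen."""
    return 2.0 * math.sqrt(lumen_area / math.pi)


def vessel_density(n_vessels, tissue_area_mm2):
    """Vessel count per square millimetre of analysed tissue."""
    return n_vessels / tissue_area_mm2


straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
print(tortuosity(straight), tortuosity(zigzag))
print(equivalent_diameter(math.pi), vessel_density(120, 4.0))
```

Tortuosity and diameter capture vessel shape, whereas density only counts vessels, so the three together describe more of the abnormal architecture than microvessel density alone.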