Research Radar — 2026-06-21
Methods & AI
Computational
Morpho-FM: spatial molecular reconstruction from routine H&E histology using transcriptomic foundation-model priors
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.15.732498
spatial transcriptomics foundation model histology H&E deep learning prostate cancer computational pathology
Summary: Introduces Morpho-FM, a weakly supervised framework that predicts spatial gene expression from routine H&E whole-slide images by conditioning a pretrained single-cell transcriptomic foundation-model prior on local histological neighbourhoods. Routine H&E histology captures tissue architecture at clinical scale but lacks a direct molecular readout of the transcriptional programmes that organise tumour epithelium, stroma, vasculature, and immune compartments. Spatial transcriptomics provides this context, yet cost, workflow complexity, and sparse sampling limit routine use. Most existing histology-to-expression models are trained de novo on small paired cohorts and remain weakly constrained when extrapolating from sparse measurements to dense, tissue-wide molecular maps. Morpho-FM addresses this by using a lightweight morphology-to-transcriptome adapter that maps cached whole-slide histology features into a transcriptomic decoder, enabling prediction at measured locations, dense full-section reconstruction, and re-aggregation to the original measurement support. Across harmonized prostate cancer benchmarks, Morpho-FM achieved the strongest overall performance among five representative methods (mean per-gene Pearson correlations of 0.286-0.298). Controlled ablation analyses identified pretrained transcriptomic initialization as a reproducible source of performance gain. Beyond accuracy benchmarks, Morpho-FM recovered ERBB2-enriched tumour compartments, boundary-associated molecular gradients, and annotation-aligned tissue domains across breast cancer datasets. The framework demonstrates that transcriptomic foundation-model priors serve as an effective constraint for morphology-conditioned molecular decoding.
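The headline metric above, mean per-gene Pearson correlation, can be sketched as follows. This is a minimal illustration rather than the authors' evaluation code: the function names, the spot-by-gene list layout, and the toy values are assumptions, and skipping genes with undefined correlation is one possible convention among several.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def mean_per_gene_pearson(measured, predicted) -> float:
    """Average Pearson r over genes.

    Both inputs are spot-by-gene matrices given as lists of rows.
    Genes whose correlation is undefined (constant across spots) are skipped.
    """
    n_genes = len(measured[0])
    scores = []
    for g in range(n_genes):
        r = pearson([row[g] for row in measured], [row[g] for row in predicted])
        if not math.isnan(r):
            scores.append(r)
    return sum(scores) / len(scores)


# Toy data: 4 spots x 2 genes (hypothetical values, not taken from the paper).
measured = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 1.0]]
predicted = [[1.1, 2.0], [1.9, 2.0], [3.2, 2.0], [3.9, 2.0]]
score = mean_per_gene_pearson(measured, predicted)
print(f"mean per-gene Pearson r = {score:.3f}")
```

Because the average is taken over genes rather than over spots, a model that predicts a few highly variable genes well but most genes poorly still scores low, which helps explain why values near 0.3 can represent competitive performance on this task.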
Why it matters: Inferring spatial gene expression from routine H&E slides would substantially expand the reach of spatial transcriptomics. H&E staining is performed on virtually every tumour biopsy worldwide, yet the molecular information encoded in tissue morphology remains inaccessible without expensive spatial transcriptomics assays. Morpho-FM demonstrates that foundation models pretrained on massive single-cell transcriptomic datasets can serve as powerful priors for this inference task, in effect letting the model impute plausible molecular profiles that are consistent with both the observed histology and the biological constraints learned from millions of cells. If validated prospectively, this approach could enable retrospective spatial transcriptomic analysis of archival tissue collections and provide molecular context for routine pathology workflows at minimal additional cost.
Why for Yiru: This work is directly aligned with tumour microenvironment (TME) spatial omics research. The ability to predict spatial gene expression from H&E images opens the door to large-scale, low-cost spatial analysis of tumour-immune interactions, precisely the kind of tool needed to study how immune infiltration patterns relate to clinical outcomes across thousands of archival samples. The foundation-model approach is particularly relevant: by conditioning on pretrained transcriptomic representations, Morpho-FM can potentially capture immune cell type signatures and cell-cell interaction programmes that are encoded in tissue morphology. The framework could be extended to predict spatial distributions of immune checkpoint expression, cytokine gradients, or T cell receptor repertoire features from H&E alone. Methodologically, the use of weak supervision and transfer learning from foundation models is a design pattern worth adopting for other TME prediction tasks where paired training data are scarce.
smDeepFLUOR: single-molecule deep learning fluorescence classification
Nature Communications Published 2026-06-21 journal_article DOI: 10.1038/s41467-026-74716-3
deep learning single-molecule imaging fluorescence microscopy classification computational imaging biophysics
Summary: Presents smDeepFLUOR, a deep learning framework for classifying single-molecule fluorescence signals with high accuracy and throughput. Single-molecule fluorescence microscopy has become an indispensable tool for studying biomolecular dynamics, interactions, and conformations at the nanoscale. However, analysing the resulting data — distinguishing genuine single-molecule signals from background noise, classifying different molecular states, and quantifying transition kinetics — remains a labour-intensive bottleneck requiring expert manual curation. smDeepFLUOR addresses this by training deep neural networks on large datasets of labelled single-molecule fluorescence traces, enabling automated, high-throughput classification of fluorescence time series into distinct molecular states. The method achieves near-human-level accuracy while processing thousands of traces in seconds — a task that would take days of manual analysis. The framework is validated across multiple fluorophore types and experimental conditions, demonstrating robustness to variations in signal-to-noise ratio and photobleaching kinetics.
Why it matters: Single-molecule biophysics experiments generate vast amounts of data that are currently bottlenecked by manual analysis. Automating this pipeline with deep learning not only accelerates discovery but also reduces analyst bias — different human experts may classify the same trace differently, whereas a trained model provides consistent, reproducible classifications. The ability to process thousands of traces rapidly also enables statistical analyses (e.g., transition rate distributions, state occupancy histograms) that would be impractical with manual curation. As single-molecule techniques scale to multiplexed and high-throughput formats, automated analysis tools like smDeepFLUOR become essential infrastructure.
Why for Yiru: Single-molecule fluorescence techniques are increasingly applied to study immune receptor signalling — for example, tracking individual T cell receptor–pMHC binding events, monitoring conformational changes in checkpoint receptors, or observing cytokine receptor dimerization dynamics. Automated classification of these signals could accelerate TME-relevant biophysical studies by enabling higher-throughput analysis of receptor-ligand interactions. More broadly, the deep learning approach to time-series classification demonstrated here could be adapted to other single-cell trajectory data types — such as classifying cellular response dynamics from live-cell imaging of T cell activation or tumour cell drug responses.
Geometric Deep Learning Reveals Ligandable and Cryptic RNA Binding Small Molecule Pockets (SMARTPocket)
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.18.732920
RNA structure drug discovery geometric deep learning binding pocket prediction RNA therapeutics computational method
Summary: Introduces SMARTPocket, an atomic-level geometric deep learning framework for predicting RNA-small molecule binding pockets directly from three-dimensional RNA structure. RNAs are important therapeutic targets, yet identifying ligandable small-molecule binding pockets remains a major barrier to RNA-targeted drug discovery. SMARTPocket represents RNA as full-atom point clouds and uses transfer learning from more than 110,000 protein binding interface structures to overcome the limited number of experimentally elucidated RNA-ligand complexes. Across four established single-chain benchmarks and three broader curated benchmarks, SMARTPocket consistently outperforms existing RNA pocket predictors and general biomolecular modelling approaches. The model generalizes to apo RNA structures when conformational changes are modest, identifies cryptic ligandable pockets, and recapitulates experimentally validated binding sites in the SARS-CoV-2 frameshifting element and an RNA aptamer evolved to bind small molecules. SMARTPocket-guided docking further improves near-native RNA-ligand pose recovery and computational efficiency compared with blind docking.
Why it matters: RNA-targeted drug discovery is an emerging frontier that has been held back by the difficulty of identifying druggable pockets in RNA structures. Unlike proteins, which have well-characterized binding pockets with defined chemical properties, RNA structures are more dynamic, more charged, and have fewer examples of successful small-molecule targeting. SMARTPocket addresses this data scarcity through transfer learning from the much larger corpus of protein-ligand complexes, effectively teaching the model what a "bindable" pocket looks like in atomic detail and then adapting that knowledge to RNA. The ability to identify cryptic pockets — binding sites that are not visible in static structures but emerge through conformational dynamics — is particularly valuable, as many therapeutically relevant RNA structures (riboswitches, splicing regulatory elements, viral RNA elements) sample multiple conformations.
Why for Yiru: RNA biology is increasingly recognized as important in the TME. Non-coding RNAs, RNA-binding proteins, and RNA modifications regulate immune cell function, tumour cell plasticity, and therapy resistance. SMARTPocket could be applied to identify druggable pockets in TME-relevant RNAs — for example, structured elements in mRNAs encoding immune checkpoints, cytokines, or chemokine receptors. The geometric deep learning approach, which operates directly on 3D atomic coordinates, is also methodologically interesting as a general framework for structure-based prediction tasks that could extend to protein-RNA interactions, RNA modifications, or aptamer design for TME-targeted delivery.
Tox21mer: A transformer foundation model for Tox21 high-throughput concentration-response curves data
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.15.732308
transformer foundation model toxicology drug safety deep learning high-throughput screening
Summary: Presents Tox21mer, a 43.5-million-parameter transformer that encodes each Tox21 concentration-response curve together with assay metadata into a 768-dimensional representation. The U.S. Tox21 collaboration has generated a large reference library of high-throughput concentration-response assays, but extracting transferable knowledge from these data has been challenging. Tox21mer was pretrained on approximately 2.5 million curves from 102 assay protocols and 6,727 compounds using masked-response reconstruction as the primary objective, with low-weight auxiliary supervision on assay outcome and AC50. The learned representation supported a macro-F1 of 0.985 for three-class outcome prediction (agonist, antagonist, inactive) and an R² of 0.87 for log₁₀(AC50). A masked-only pretraining variant retained near-baseline probe performance, indicating that the representation is learned largely from the self-supervised objective rather than from auxiliary labels. The embeddings formed coherent groupings by curve-class category. The model can support extrapolation to untested compounds through integration with chemical features or distillation into chemistry-only student models.
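For readers less familiar with the two probe metrics quoted above, the sketch below computes macro-F1 for a three-class outcome and R² for a continuous target. It is a generic illustration in plain Python, not code from the Tox21mer preprint; the function names and the toy labels are assumptions.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy labels (hypothetical): one agonist is misclassified as inactive.
true_labels = ["agonist", "agonist", "antagonist", "inactive", "inactive", "inactive"]
pred_labels = ["agonist", "inactive", "antagonist", "inactive", "inactive", "inactive"]
print(f"macro-F1 = {macro_f1(true_labels, pred_labels):.3f}")
```

Macro averaging weights the three outcome classes equally, so a high macro-F1 cannot be reached by predicting only the most frequent class. That matters here if inactive curves dominate the data, as is common in screening sets.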
Why it matters: Foundation models are transforming many areas of biology, but their application to toxicology and drug safety has lagged behind fields like protein structure prediction and genomics. Tox21mer demonstrates that self-supervised pretraining on large-scale assay data can produce representations that capture essential pharmacological properties — whether a compound is an agonist or antagonist, and at what concentration it is active. The self-supervised approach is particularly noteworthy: it suggests that the "grammar" of concentration-response curves can be learned without explicit labels, potentially enabling transfer to new assays or compounds where labelled data are scarce. This is directly relevant to drug repurposing and safety screening applications.
Why for Yiru: Drug repurposing and computational pharmacology are increasingly relevant to TME research as immunotherapy combinations and targeted therapies proliferate. Tox21mer's self-supervised representations could be used to screen existing drugs for potential TME-modulating activity — for example, identifying compounds that affect immune cell viability or cytokine production at specific concentrations. The ability to predict concentration-response curves for untested compounds could also guide dose selection for in vitro TME perturbation experiments. Methodologically, the transformer architecture applied to structured assay data (rather than sequences or images) is an interesting paradigm that could be adapted to other high-throughput screening data types relevant to TME research.
A network approach to DNA methylation clocks
bioRxiv Published 2026-06-20 preprint DOI: 10.64898/2026.06.18.733218
DNA methylation aging epigenetics network analysis biomarker computational biology
Summary: Takes a network approach to understand DNA methylation clocks by building a co-methylation network from 12 public blood datasets. Biological age predicts health and lifespan better than chronological age, and DNA methylation-based "clocks" are leading molecular proxies. However, different established clocks share a vanishingly small number of CpG sites, many of which show weak associations with age, and clocks often do not transfer across array platforms. By building a co-methylation network of the sites showing the strongest age correlation and pruning weak links, the authors find a small number of large modules of covarying CpGs surrounded by many small modules and singleton sites. These modules are biologically interpretable, associated with CpG island contexts, and enriched for distinct Gene Ontology functions. Mapping five established clocks (Horvath, Hannum, AltumAge, Skin & Blood, Han) onto this network reveals that they select some CpGs from the same module, suggesting they are more similar than they appear. A simple clock retaining one CpG per module matches the performance of established clocks, while a clock built from module-level principal components outperforms all five established clocks in three validation cohorts and is transferable across array platforms.
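The module-building idea described above (correlate CpGs, prune weak links, group what remains, keep one CpG per module) can be sketched in a few lines. This is an illustrative simplification, not the authors' pipeline: the fixed correlation threshold, the use of connected components as modules, the rule for picking a representative, and all names and toy values are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def find_root(parent, node):
    """Union-find lookup with path halving."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def comethylation_modules(beta, min_abs_r=0.8):
    """Group CpGs into modules: connected components of the pruned network.

    beta maps CpG id -> methylation values across samples. An edge is kept
    only if |Pearson r| >= min_abs_r (assumed pruning rule).
    """
    ids = sorted(beta)
    parent = {cpg: cpg for cpg in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if abs(pearson(beta[a], beta[b])) >= min_abs_r:
                parent[find_root(parent, b)] = find_root(parent, a)
    modules = {}
    for cpg in ids:
        modules.setdefault(find_root(parent, cpg), []).append(cpg)
    # Largest modules first; ties broken alphabetically for determinism.
    return sorted(modules.values(), key=lambda m: (-len(m), m))


def one_cpg_per_module(modules, beta, age):
    """Pick, per module, the CpG with the strongest absolute age correlation."""
    return [max(m, key=lambda cpg: abs(pearson(beta[cpg], age))) for m in modules]


# Toy data (hypothetical): five samples, three CpGs.
age = [20, 30, 40, 50, 60]
beta = {
    "cg_a": [0.20, 0.30, 0.40, 0.50, 0.60],
    "cg_b": [0.25, 0.33, 0.41, 0.52, 0.58],
    "cg_c": [0.50, 0.10, 0.40, 0.20, 0.30],
}
modules = comethylation_modules(beta)
representatives = one_cpg_per_module(modules, beta, age)
print(modules, representatives)
```

In this toy example cg_a and cg_b covary and fall into one module, while cg_c remains a singleton. That mirrors, at very small scale, the reported picture of large modules surrounded by small modules and singleton sites.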
Why it matters: DNA methylation clocks are widely used as biomarkers of biological aging, but their biological basis has remained opaque — different clocks use different CpGs, and it has been unclear whether they capture the same underlying biology or different aspects of aging. This network analysis reveals that despite apparent differences, established clocks sample from shared co-methylation modules, suggesting a common underlying structure. The demonstration that a simple module-based clock outperforms existing clocks while being transferable across platforms is practically important: it suggests a more principled approach to clock construction that should generalize better across populations and technologies.
Why for Yiru: Epigenetic aging is relevant to cancer immunology because TME cells, both tumour and immune, can exhibit accelerated or decoupled epigenetic aging patterns that affect function. Tumour-infiltrating T cells, for example, can acquire exhaustion-associated methylation changes that resemble aging. The network-based approach to methylation analysis could be applied to TME methylation data to identify co-methylation modules associated with immune infiltration, checkpoint expression, or therapy response. More broadly, the demonstration that network analysis can reconcile apparently discordant biomarkers into a coherent framework is a methodological lesson applicable to other TME biomarker development efforts.
Biomedical discoveries
Biomedicine
Systemic viral vector vaccination induces brain resident memory T cells to drive anti-glioblastoma immunity
bioRxiv Published 2026-06-20 preprint DOI: 10.64898/2026.06.18.733241
glioblastoma cancer immunotherapy vaccination T cells brain immunity resident memory viral vector
Summary: Demonstrates that systemic vaccination using a heterologous prime-boost regimen with simian adenovirus ChAdOx1 and poxvirus modified vaccinia Ankara (MVA) induces brain-resident memory T cells capable of driving therapeutic anti-glioblastoma immunity. Glioblastoma is a lethal brain tumour unresponsive to current immunotherapeutic approaches including immune checkpoint blockade, suggesting that initial priming of T cells, rather than their expansion and licensing as effectors, is the rate-limiting step. To overcome this limited initiation of CD8⁺ T cell responses, the authors employed strong viral vector vaccination encoding tumour antigens. Vaccination conferred therapeutic efficacy against orthotopic, immune checkpoint blockade-refractory tumours, with protection dependent on CD8⁺ T cells that established long-term residence in the brain. The induced brain-resident memory T cells displayed canonical tissue residency markers (CD69, CD103) and were capable of rapid effector function upon tumour rechallenge. The study establishes that systemic vaccination can effectively prime T cell responses that seed the central nervous system compartment, overcoming a fundamental barrier to brain tumour immunotherapy.
Why it matters: Brain tumours have been among the most challenging targets for cancer immunotherapy due to the immune-privileged nature of the central nervous system, the blood-brain barrier, and the immunosuppressive glioblastoma microenvironment. This study demonstrates that the priming bottleneck — rather than effector function within the tumour — may be the key limiting factor, and that powerful viral vector vaccines can overcome it. The induction of bona fide tissue-resident memory T cells in the brain is particularly significant: these cells provide long-term surveillance and rapid recall responses without requiring continuous recruitment from the periphery. If translatable to patients, this approach could fundamentally change the immunotherapy landscape for glioblastoma and potentially other brain malignancies.
Why for Yiru: The concept that vaccination route and vector choice can dictate whether T cells establish residency in immune-privileged tissues has implications beyond brain tumours. Similar principles may apply to other TME compartments where immune exclusion is a barrier, for example pancreatic cancer or certain metastatic niches. The viral vector approach also raises interesting questions about whether different vaccine platforms (mRNA, adenoviral, poxviral) differentially shape T cell trafficking and residency programmes. From a computational perspective, the transcriptional programmes that distinguish vaccination-induced tissue-resident memory T cells from exhausted or circulating memory cells could inform TME T cell state classification in single-cell data.
Heritable single-cell gene expression states shape functional variability in innate immune responses
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.17.732820
innate immunity single-cell biology transcriptional memory TLR signalling systems biology macrophages
Summary: Integrates transcriptomics, high-content imaging, and mathematical modelling to quantify transcriptional heritability within the evolutionarily conserved Toll-like receptor (TLR) system. Activation of innate immunity at the single-cell level is inherently heterogeneous, yet the mechanisms underlying this variability remain incompletely understood. Using RNA-seq-based fluctuation tests in clonal macrophage populations, the authors identified a subset of TLR4-dependent genes — including cytokines and immune effectors — that retain transcriptional heritability for more than 25 cell divisions. High-content microscopy and stochastic modelling revealed that this heritable variation is propagated through cell division and establishes pre-existing cellular states that bias subsequent responses to TLR stimulation. Cells in "high-responder" states produce substantially more cytokine upon stimulation than their "low-responder" counterparts, even though both populations are genetically identical and were exposed to the same stimulus. The findings demonstrate that apparently stochastic cell-to-cell variation in innate immune responses has a deterministic, heritable component that shapes population-level response heterogeneity.
Why it matters: This study challenges the prevailing view that heterogeneity in innate immune responses is primarily stochastic noise. Instead, it reveals that a significant fraction of response variability is epigenetically encoded and transmitted through cell division — effectively creating metastable "memory" states in cells that are not classically considered to have immunological memory. This has profound implications for understanding why some individuals respond strongly to infection or vaccination while others do not, why inflammatory diseases exhibit fluctuating severity, and how trained immunity (innate immune memory) is maintained at the cellular level. The combination of experimental fluctuation tests with mathematical modelling provides a rigorous framework for distinguishing stochastic from heritable sources of variation.
Why for Yiru: The concept of heritable transcriptional states in immune cells is directly relevant to TME biology. Tumour-associated macrophages exhibit remarkable phenotypic heterogeneity — some are pro-inflammatory, others immunosuppressive — and the extent to which this heterogeneity reflects stable, heritable states versus plastic responses to local cues is poorly understood. The fluctuation test framework developed here could be applied to TME macrophages to determine whether immunosuppressive versus pro-inflammatory phenotypes are epigenetically encoded and transmitted through cell division. If so, reprogramming these heritable states — rather than simply blocking polarizing signals — may be necessary for durable TME reprogramming. The mathematical modelling framework for distinguishing heritable from stochastic variation could also be generalized to other single-cell TME data types.
Sequential and coordinated control of human plasma cell differentiation by IRF4 and BLIMP1 utilizing a discriminating ISRE/EICE motif lexicon
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.15.732353
B cells plasma cells transcription factors IRF4 BLIMP1 gene regulation CRISPR
Summary: Investigates how the transcription factors IRF4 and BLIMP1 (PRDM1) partition and coordinate their genomic activities during human plasma cell differentiation. Using naive human B cells and a stepwise in vitro differentiation system, the authors performed CRISPR/Cas9 perturbations of IRF4 or PRDM1 in plasma cell precursors followed by single-cell RNA sequencing. Despite their mutually reinforced expression and shared requirement for plasma cell differentiation, IRF4 and BLIMP1 were found to regulate largely distinct gene sets through different DNA recognition motifs — IRF4 primarily through ISRE (interferon-stimulated response element) motifs and BLIMP1 through EICE (Ets-IRF composite element) motifs. The two factors operate sequentially: IRF4 first activates a programme of secretory pathway expansion and metabolic remodelling, while BLIMP1 subsequently represses B cell identity genes and drives terminal differentiation. The study reveals a discriminating motif lexicon that enables two closely cooperating transcription factors to partition the genome and execute temporally ordered functions.
Why it matters: Plasma cells are the antibody factories of the immune system, and their differentiation is critical for humoral immunity. Understanding how IRF4 and BLIMP1 coordinate this process at the genomic level is fundamental to B cell biology and has practical implications for vaccine design (promoting durable antibody responses) and autoimmune disease (where pathogenic plasma cells drive tissue damage). The finding that these two factors use distinct DNA motifs to regulate separate gene programmes — despite being co-expressed and mutually reinforcing — reveals a general principle of how transcription factor pairs can achieve functional specialization through motif-level discrimination.
Why for Yiru: B cells and plasma cells are increasingly recognized as important players in the TME. Tertiary lymphoid structures containing plasma cells are associated with favourable prognosis in many cancers, and intratumoural antibody production can contribute to anti-tumour immunity. Understanding the transcriptional control of plasma cell differentiation could inform strategies to promote protective antibody responses within tumours. The motif-level discrimination mechanism described here is also relevant to T cell biology, where IRF4 plays critical roles in effector differentiation — similar motif-lexicon mechanisms may operate in T cell fate decisions within the TME.
APOE4 Drives Uniquely Dysfunctional Human Microglial States in Alzheimer's Disease
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.18.733295
Alzheimer's disease microglia APOE neuroimmunology spatial proteomics single-cell multiomics
Summary: Combines spatially resolved proteomic profiling with single-nucleus multiomic analyses to define microglial organization across APOE3/3 and APOE4/4 genotypes in Alzheimer's disease. The APOE ε4 allele is the strongest genetic risk factor for late-onset Alzheimer's disease, yet how it remodels human microglial states remains unresolved. Quantifying condition-associated variation across the cellular manifold reveals a continuous landscape of microglial states. APOE4/4 shifts cells toward terminal states marked by loss of homeostatic identity, metabolic disruption, and incomplete activation programmes, a uniquely dysfunctional state distinct from the classical homeostatic-to-activated continuum observed in APOE3/3 microglia. The spatial proteomic data further reveal that these dysfunctional microglial states are enriched in amyloid plaque-proximal regions, suggesting local microenvironmental drivers of APOE4-dependent dysfunction.
Why it matters: Understanding how the strongest genetic risk factor for Alzheimer's disease alters microglial biology at the single-cell and spatial level is critical for developing targeted therapies. Microglia are the brain's resident immune cells, and their dysfunction is increasingly recognized as a central driver of neurodegeneration. This study provides the most detailed characterization to date of APOE4-dependent microglial states in human tissue, revealing that the ε4 allele does not simply accelerate normal aging processes but drives cells into qualitatively distinct dysfunctional states. The integration of spatial proteomics with single-nucleus multiomics provides a powerful template for studying how genetic risk factors reshape tissue-resident immune cells in their native context.
Why for Yiru: The experimental and analytical framework used here — combining spatial proteomics with single-cell multiomics to study tissue-resident immune cells in their native context — is directly applicable to TME research. Tumour-associated macrophages, like microglia, are tissue-resident myeloid cells whose function is shaped by local microenvironmental cues and whose dysfunction contributes to disease progression. The approach of quantifying how genetic variation (here APOE genotype) shifts the cellular state manifold could be adapted to study how tumour mutations or host genetic variants shape TME immune cell states. The spatial dimension is particularly important: understanding which TME microenvironments drive dysfunctional versus protective immune cell states is a key goal of spatial TME analysis.
Distinct melanoma EV subpopulations reflect immune- and stress-induced tumor states
bioRxiv Published 2026-06-20 preprint DOI: 10.64898/2026.06.16.732566
melanoma extracellular vesicles liquid biopsy tumour immunity biomarkers proteomics
Summary: Investigates how distinct extracellular vesicle (EV) subpopulations reflect dynamic tumour states induced by immune pressure and cellular stress in melanoma. Using proteomic analyses, the authors identified melanoma-associated antigens gp100 (PMEL) and GPNMB in EVs derived from B16F10 melanoma cells and developed sandwich ELISAs that capture total EVs while selectively detecting tumour-derived subpopulations. Different EV subpopulations were found to carry distinct molecular cargo reflecting the tumour cell state at the time of release — immune pressure induced EVs enriched in stress-response and immunomodulatory proteins, while nutrient deprivation produced EVs with altered metabolic signatures. The study demonstrates that EV subpopulation analysis can serve as a liquid biopsy readout of dynamic tumour states that are otherwise inaccessible without invasive tissue sampling.
Why it matters: Liquid biopsy approaches are transforming cancer monitoring, but most current methods focus on circulating tumour DNA or whole-population EV analysis, which average across heterogeneous tumour cell states. This study shows that EV subpopulations carry state-specific molecular information — effectively providing a real-time, non-invasive window into how tumour cells are responding to immune pressure and microenvironmental stress. If validated clinically, this could enable dynamic monitoring of tumour immune evasion, therapy response, and disease progression without repeated biopsies.
Why for Yiru: EVs are emerging as important mediators of intercellular communication in the TME — tumour-derived EVs can suppress T cell function, reprogram macrophages, and prepare metastatic niches. The ability to distinguish EV subpopulations based on their molecular cargo could be applied to TME studies to track how different tumour regions or treatment conditions produce distinct EV signatures. From a computational perspective, the proteomic data generated in this study could be integrated with single-cell TME data to map EV cargo back to specific tumour cell states, enabling deconvolution of bulk EV samples into their cellular origins.
Cross-disciplinary watchlist
Other Fields
Accurate detection of tumor clonality and ongoing expansion mode from genomic data (DECODE)
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.15.732415
cancer genomics tumour evolution clonality mutation calling computational method prognosis
Summary: Presents DECODE (Deciphering Cancer Origin from DNA Evolution), a mutation clustering method that estimates intra-tumour heterogeneity (ITH) while accounting for sample-specific sequencing coverage and mutation calling biases. On synthetic data, DECODE outperformed existing methods across multiple clonality metrics and accurately detected the neutral tail in the site frequency spectrum (SFS), which encodes the tumour's ongoing expansion mode. In acute myeloid leukaemia, accounting for the neutral tail enabled DECODE to yield more parsimonious clonal decompositions that align more closely with the known subclonal dynamics driving relapse. Applied to TCGA data, DECODE detected a neutral SFS tail in most samples across tumour types and uncovered a clinically meaningful link between ITH and survival in low-grade glioma. By jointly inferring clonality and expansion mode, DECODE provides two complementary and prognostically relevant readouts of tumour evolution from single tumour genomic samples.
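To make the "neutral tail" concrete: under neutral expansion, the cumulative number of subclonal mutations with frequency at least f is expected to grow linearly in 1/f. The sketch below checks that linearity with an ordinary least-squares fit. It illustrates this widely used neutrality check only and is not DECODE's algorithm; the function name, the default frequency window, and the toy frequencies are assumptions.

```python
def neutral_tail_r2(vafs, f_min=0.12, f_max=0.24):
    """R^2 of a linear fit of cumulative mutation count against 1/f.

    vafs: variant allele frequencies of called mutations.
    Only mutations inside the assumed window [f_min, f_max] are used.
    Values near 1 are compatible with a neutral 1/f tail.
    """
    tail = sorted((f for f in vafs if f_min <= f <= f_max), reverse=True)
    if len(tail) < 3:
        raise ValueError("need at least three mutations in the frequency window")
    inv_f = [1.0 / f for f in tail]
    # cumulative[i] = number of mutations with frequency >= tail[i]
    cumulative = [float(i + 1) for i in range(len(tail))]
    n = len(tail)
    mean_x = sum(inv_f) / n
    mean_y = sum(cumulative) / n
    sxx = sum((x - mean_x) ** 2 for x in inv_f)
    if sxx == 0.0:
        raise ValueError("all frequencies in the window are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_f, cumulative))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(inv_f, cumulative))
    ss_tot = sum((y - mean_y) ** 2 for y in cumulative)
    return 1.0 - ss_res / ss_tot


# Toy spectra (hypothetical): one evenly spaced in 1/f, one clustered.
neutral_like = [1.0 / x for x in (4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0)]
clustered = [0.23, 0.23, 0.22, 0.22, 0.21, 0.21, 0.13]
print(neutral_tail_r2(neutral_like), neutral_tail_r2(clustered))
```

A simple fit like this ignores coverage and calling biases, which are the confounders the summary says DECODE is designed to model.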
Why it matters: Accurate estimation of intra-tumour heterogeneity is fundamental to understanding tumour evolution, predicting therapy resistance, and stratifying patients. However, existing methods are confounded by technical artefacts — sequencing coverage variation, mutation calling biases — that can produce spurious subclonal structures. DECODE addresses these confounders directly, producing more reliable clonality estimates. The joint inference of clonality and expansion mode is a conceptual advance: knowing not just how many clones exist but whether the tumour is expanding neutrally or under selection provides complementary prognostic information. The finding that a clinically meaningful ITH-survival link emerges after proper technical correction underscores the importance of methodologically rigorous clonality inference.
Why for Yiru: Tumour evolution shapes the TME by generating antigenic diversity, selecting for immune-evasive clones, and producing heterogeneous landscapes of targetable vulnerabilities. Accurate clonality inference is a prerequisite for studying how subclonal architecture influences immune infiltration and immunotherapy response. DECODE could be applied to TME-focused genomic studies to characterize the clonal structure of tumours with different immune phenotypes — for example, determining whether immune-inflamed tumours tend to be more or less clonally heterogeneous. The joint inference of expansion mode could also reveal whether immune pressure imposes selective versus neutral dynamics on tumour subclones.
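A minimal sketch of what a "neutral tail" in the SFS looks like numerically. It is not DECODE's estimator, which the summary does not describe. It assumes the classical neutral-growth expectation that the number of subclonal mutations at frequency f or above grows linearly with 1/f. The function names, the frequency window (0.12 to 0.24), and the synthetic data are all illustrative assumptions.

```python
import random

def sample_neutral_vafs(n, f_min=0.12, f_max=0.24, seed=0):
    """Draw n synthetic allele frequencies with density proportional to 1/f^2.

    Under that density, 1/f is uniform on [1/f_max, 1/f_min], so we sample
    1/f uniformly and invert. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    inv_lo, inv_hi = 1.0 / f_max, 1.0 / f_min
    return [1.0 / (inv_lo + rng.random() * (inv_hi - inv_lo)) for _ in range(n)]

def neutral_tail_fit(vafs, f_min=0.12, f_max=0.24, n_points=50):
    """Fit M(f) = slope * (1/f - 1/f_max) through the origin; return (slope, R^2).

    M(f) is the number of mutations with frequency in [f, f_max].
    """
    xs, ys = [], []
    for i in range(n_points):
        f = f_min + (f_max - f_min) * i / (n_points - 1)
        xs.append(1.0 / f - 1.0 / f_max)
        ys.append(sum(1 for v in vafs if f <= v <= f_max))
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, 1.0 - ss_res / ss_tot

vafs = sample_neutral_vafs(500)
slope, r2 = neutral_tail_fit(vafs)
print(f"slope={slope:.1f}, R^2={r2:.3f}")
```

A fit that stays close to linear in 1/f is the usual signature of neutral expansion in this simplified model. Departures from linearity are the kind of signal that would instead suggest subclonal selection, subject to the coverage and calling biases the paper sets out to correct.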
HTS-Oracle v2: Prospective AI-Guided Discovery and Experimental Validation of Small Molecule Modulators Across Multiple Immune Checkpoint Targets
bioRxiv Published 2026-06-19 preprint DOI: 10.64898/2026.06.15.732399
drug discovery immunotherapy immune checkpoints AI high-throughput screening small molecules
Summary: Introduces HTS-Oracle v2, an AI-guided platform for prospective small molecule discovery against immunotherapy targets. High-throughput screening (HTS) remains the cornerstone of early-phase small molecule discovery yet consistently underperforms against immunotherapy targets, yielding validated hit rates below 0.1%. HTS-Oracle v2 features rigorous cross-validation and was trained across four clinically significant immune checkpoint targets — CD28, ICOS, LAG-3, and TIGIT — achieving ROC-AUC values of 0.968, 0.969, 0.875, and 0.928 respectively. For prospective validation, HTS-Oracle v2 was applied to an 8,960-compound library, selecting only 25 compounds per target for experimental testing using TRIC technology — a 99.7% reduction in screening burden. The platform identified 4-6 validated binders per target from 25 prospectively selected compounds, corresponding to validated hit rates of 16-24%. Notably, 67-80% of all experimentally confirmed hits across the full library were captured within just 25 model-selected compounds per target, representing up to a 28-fold improvement over the previous version.
Why it matters: Small molecule modulators of immune checkpoints could complement or replace antibody-based immunotherapies, offering advantages in oral bioavailability, tissue penetration, and manufacturing cost. However, traditional HTS has struggled to find such molecules, with hit rates below 0.1%. HTS-Oracle v2 demonstrates that AI-guided screening can enrich hit rates by orders of magnitude — finding most of the true hits in the library while testing only 0.3% of compounds. This 99.7% reduction in experimental burden could make small-molecule immunotherapy discovery economically viable at a scale that traditional HTS cannot achieve. The prospective experimental validation across four distinct targets provides strong evidence that the approach generalizes.
Why for Yiru: Directly relevant to TME immunotherapy research. Small molecule modulators of immune checkpoints could be used as tool compounds to study checkpoint biology in the TME — for example, acutely inhibiting LAG-3 or TIGIT in co-culture systems to dissect their contributions to T cell dysfunction. The AI-guided screening approach could also be applied to discover small molecules targeting other TME-relevant proteins — chemokine receptors, metabolic enzymes, epigenetic regulators — that are currently considered "undruggable" by traditional HTS. From a computational perspective, the rigorous cross-validation framework and prospective validation design set a high standard for AI-guided drug discovery studies.
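The screening figures quoted for HTS-Oracle v2 can be checked with a few lines of arithmetic. This is a sketch using only numbers stated in the summary. The 0.1% baseline is the upper bound given for traditional HTS, so the fold-enrichment values are lower bounds. The function names are illustrative.

```python
# Figures taken from the summary above.
LIBRARY_SIZE = 8960          # compounds in the screened library
SELECTED_PER_TARGET = 25     # model-selected compounds tested per target
BASELINE_HIT_RATE = 0.001    # "below 0.1%": upper bound for traditional HTS

def screening_reduction(library_size, n_selected):
    """Fraction of the library that did not need to be tested."""
    return 1.0 - n_selected / library_size

def hit_rate(n_hits, n_tested):
    """Validated binders as a fraction of compounds tested."""
    return n_hits / n_tested

def fold_enrichment(rate, baseline):
    """How many times higher the hit rate is than the baseline."""
    return rate / baseline

reduction = screening_reduction(LIBRARY_SIZE, SELECTED_PER_TARGET)
low, high = hit_rate(4, 25), hit_rate(6, 25)

print(f"Screening reduction: {reduction:.1%}")   # 99.7%
print(f"Hit rate range: {low:.0%} to {high:.0%}")  # 16% to 24%
print(f"Enrichment over baseline: at least "
      f"{fold_enrichment(low, BASELINE_HIT_RATE):.0f}x to "
      f"{fold_enrichment(high, BASELINE_HIT_RATE):.0f}x")
```

The enrichment figure compares against the generic sub-0.1% HTS baseline, which is a different comparison from the "up to 28-fold" improvement reported relative to the previous version of the platform.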
Divergent 3D genome architecture of male germ cells across vertebrates
Nature Communications Published 2026-06-20 journal_article DOI: 10.1038/s41467-026-74695-5
3D genome evolution germ cells chromatin architecture Hi-C comparative genomics
Summary: Reports a comparative analysis of 3D genome architecture in male germ cells across vertebrate species, revealing both conserved features and lineage-specific innovations in chromatin organization during spermatogenesis. Using Hi-C and complementary genomic approaches, the authors profiled three-dimensional chromatin conformation in testes from representative species spanning mammals, birds, and reptiles. While core features of genome organization — such as compartmentalization into active and inactive chromatin domains — were conserved, the study uncovered striking divergence in higher-order chromatin structures including topologically associating domains and chromatin loop configurations. These differences correlate with species-specific features of spermatogenesis, including the presence or absence of meiotic sex chromosome inactivation and variations in post-meiotic chromatin compaction. The findings establish that 3D genome architecture in the germline evolves rapidly between vertebrate lineages while maintaining core functional constraints.
Why it matters: The 3D organization of the genome is increasingly recognized as a critical layer of gene regulation, yet most of our knowledge comes from a handful of model organisms and somatic cell types. This study extends our understanding to germ cells — a uniquely important cell type where chromatin undergoes dramatic reorganization during meiosis and post-meiotic maturation — and to a broad phylogenetic range. The finding that core features are conserved while higher-order structures diverge rapidly provides insight into the evolutionary plasticity of genome organization and suggests that different structural solutions can satisfy the same functional requirements.
Why for Yiru: While not directly TME-focused, this study is relevant to understanding how 3D genome architecture evolves under different selective pressures — a concept that applies to tumour evolution as well. Cancer genomes undergo dramatic structural rearrangements, and understanding which features of 3D organization are constrained (and therefore potentially targetable) versus which are plastic (and therefore sources of heterogeneity) is important for interpreting cancer Hi-C data. The comparative framework — profiling 3D architecture across species with different genomic features — provides a template for comparative analyses of genome organization across tumour types or across normal versus malignant tissues.
Efficient site-specific gene addition using R2 retrotransposons in tobacco and rice
Nature Biotechnology Published 2026-06-19 journal_article DOI: 10.1038/s41587-026-03181-6
genome engineering retrotransposons plant biotechnology CRISPR gene insertion crop improvement
Summary: Demonstrates efficient site-specific gene addition in tobacco and rice using engineered R2 retrotransposon systems. R2 elements are non-LTR retrotransposons that integrate into specific sites within 25S ribosomal DNA through a mechanism involving sequence-specific DNA cleavage and target-primed reverse transcription. The authors optimized R2 element components, including the reverse transcriptase and endonuclease domains, to achieve high-efficiency, targeted DNA insertion at ribosomal DNA loci in both dicot (tobacco) and monocot (rice) plants. The system enables stable integration of multi-kilobase cargo sequences without introducing double-strand breaks or requiring host DNA repair pathways, distinguishing it from CRISPR-Cas9-based knock-in approaches that rely on inefficient homology-directed repair. The work establishes R2 retrotransposons as a practical platform for precision gene insertion in crop plants.
Why it matters: Precision gene insertion remains a major challenge in plant biotechnology. Current CRISPR-based approaches for inserting large DNA sequences rely on homology-directed repair, which is inefficient in plants and competes with non-homologous end joining. R2 retrotransposons offer an alternative mechanism — "copy and paste" integration that does not create double-strand breaks and is naturally site-specific. Demonstrating this system in both a model dicot and a major crop species (rice) establishes its translational potential. Technologies for efficient, precise DNA insertion have broad applications in crop improvement, synthetic biology, and potentially gene therapy.
Why for Yiru: While this is a plant biotechnology paper, the R2 retrotransposon mechanism is conceptually relevant to genome engineering more broadly. The "copy and paste" integration mechanism that avoids double-strand breaks could inspire new approaches for therapeutic gene insertion in mammalian cells, where CRISPR-induced DNA damage can trigger p53 responses and oncogenic transformation. More tangentially, the study demonstrates how a deep understanding of natural mobile genetic elements can yield practical engineering tools, a principle that also applies to harnessing transposons for T cell engineering or CAR-T manufacturing improvements.