Research Radar — 2026-05-07

Generated 2026-05-07 09:10 +0800 | DeepSeek-V4-Pro | Academic articles only

Hot Topic

Foundation Models and Spatial Multi-Omics Converge on Tumor Immunology

Three themes converge this week: foundation models and LLM-based frameworks for biological and chemical data (Waypoint, EvoSyn, UNKAI), graph-based and multi-agent frameworks for spatial multi-omics (scHG, STAT), and mechanistic dissection of tumor-immune microenvironments across several cancer types.

Multiple foundation model papers in a single week: microbiome transformers (Waypoint), LLM multi-agent drug design (EvoSyn), ESM-C protein function (UNKAI)
Spatial transcriptomics approaches expanding: STAT multi-agent framework, scHG supercell graph learning, glioblastoma subcellular translatomics
TME immunology focus: MINDS multispecific degraders, DIPG myeloid recruitment, ccRCC retroelement subtypes, FLT3-ITD hypoxia resistance

Methods & AI

Computational

6 selected

Computational #1 READ FULL

scHG: A supercell framework with high-order graph learning enables scalable multi-omics analysis

PLOS Computational Biology | Published 2026-05-06 | research article | DOI: 10.1371/journal.pcbi.1013851

Authors: Huang Y et al.

multi-omics, graph neural networks, single-cell, clustering, spatial biology

Summary: Introduces the supercell paradigm for multi-omics clustering, grouping expression-coherent cells into intermediate units using angle-aware similarity and second-order co-occurrence neighbors. scHG, a high-order graph learning framework with an omics-weighted optimizer, outperforms state-of-the-art methods across six benchmark datasets (up to 30,672 cells), improving mean ARI by 3.97% and reducing runtime by 26.40%. Notably, it resolves rare populations, including dendritic cells and NK-like B cells, that standard pipelines hide.

Why it matters: The supercell approach bridges the gap between single-cell resolution and computational tractability for large-scale multi-omics integration, with direct relevance to rare cell detection in tumor microenvironments.

Why for Yiru: Multi-omics integration with graph learning is directly applicable to spatial transcriptomics and tumor microenvironment analysis.
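The ARI (adjusted Rand index) quoted in the scHG summary scores how well a predicted clustering agrees with reference labels, corrected for chance agreement. The sketch below is a minimal standard-library implementation of the usual pair-counting formula, given only to make the metric concrete. The function name and toy labels are illustrative assumptions, not taken from the paper, and the digest does not say whether the 3.97% gain is absolute or relative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting adjusted Rand index between two labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical partitions

    # Pairs of cells that share a cluster in both labelings, in the reference
    # only, and in the prediction only.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = same_true * same_pred / total_pairs  # agreement expected by chance
    maximum = (same_true + same_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put every cell in one cluster
    return (same_both - expected) / (maximum - expected)


# Hypothetical toy labels: cluster names differ, but the partition is identical.
reference = ["T", "T", "B", "B"]
print(round(adjusted_rand_index(reference, [1, 1, 0, 0]), 3))  # 1.0
print(round(adjusted_rand_index(reference, [0, 1, 0, 1]), 3))  # -0.5
```

Because ARI depends only on the partition, renaming clusters leaves the score unchanged, and a labeling that is worse than chance can score below zero.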

Computational #2 READ FULL

Learning the Language of the Microbiome with Transformers

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.02.722381

Authors: Treloar NJ et al.

foundation models, transformers, microbiome, self-supervised learning, benchmarking

Summary: Presents Atlas, a pretraining dataset of over 539,000 microbiome datapoints, and the Waypoint family of GPT-2 style causal language models (6M–170M parameters) for microbiome analysis. Introduces Compass, a curated benchmark of eight predictive tasks including biome classification, drug-microbiome interactions, and infant gut development. Pretrained transformers begin to reliably outperform classical methods once training data exceeds ~10,000 examples.

Why it matters: Packages a pretraining dataset, a model family, and a benchmark into one foundation model framework for microbiome data, and indicates where self-supervised pretraining pays off: pretrained transformers overtake classical methods once training data exceeds roughly 10,000 examples.

Why for Yiru: Foundation model approaches applied to biological sequence data are directly relevant to building analogous models for single-cell and spatial omics.

Computational #3 READ FULL

Bridging LLM Reasoning and Chemical Knowledge via an Evolutionary Multi-Agent Framework for Molecular Synthesis

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.02.722342

Authors: Chen Y et al.

large language models, drug discovery, multi-agent systems, reinforcement learning, molecular design

Summary: Proposes EvoSyn, an evolutionary multi-agent framework that synergizes LLM reasoning with domain experts for molecular synthesis. Uses a dual-process evolutionary paradigm: co-evolving linguistic capabilities with multi-objective constraints and self-evolving through a Markov Game formulation. Domain feedback penalizes invalid proposals and grounds generation in feasible reaction pathways. Significantly outperforms state-of-the-art baselines on comprehensive benchmarks.

Why it matters: Demonstrates how LLM-guided evolution with rigorous domain validation can overcome hallucination problems in generative molecular design, producing molecules that are both bioactive and synthetically actionable.

Why for Yiru: Multi-agent LLM frameworks and evolutionary optimization strategies are applicable to biological sequence design, including peptide and antibody engineering.

Computational #4 READ FULL

UNKAI: A protein functional identity prediction model based on ESM-C latent representations and the attention mechanism

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.02.722384

Authors: Ukai K et al.

protein language models, ESM, attention mechanism, enzyme function, deep learning

Summary: Develops a deep learning method to predict whether two proteins catalyze the same enzymatic reaction, using ESM Cambrian (ESM C) latent representations processed through an attention-based neural network. Outperforms existing methods including sequence similarity and AlphaFold-based approaches. Attention weight analysis reveals autonomous highlighting of catalytic and binding residues, eliminating the need for manual feature engineering.

Why it matters: Shows that protein language model embeddings combined with attention mechanisms can achieve interpretable enzyme function prediction without structural information, democratizing functional annotation.

Why for Yiru: Attention-based interpretation of protein language models provides a blueprint for interpretable deep learning in biological sequence analysis.

Computational #5 READ FULL

Tumor cell specific total mRNA expression informed neural networks predicts cancer progression

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.01.722212

Authors: Paul A et al.

deep learning, cancer genomics, multi-omics, prognosis, transcriptomics

Summary: Presents TmSNet, a deep learning framework that predicts tumor cell-specific total mRNA expression (TmS) from mRNA, DNA methylation, miRNA, and immune cell proportions. Integrates structured feature selection (gradient boosting, LASSO, elastic net) with specialized neural architectures. Achieves cross-validated CCC up to 0.93 across 12 TCGA cancer types and generalizes to external cohorts. Predicted TmS effectively stratifies patients by risk.

Why it matters: Provides a scalable, alignment-free method for inferring tumor transcriptional activity without matched DNA sequencing, enabling analysis of large heterogeneous cohorts.

Why for Yiru: Multi-omic feature integration with neural networks for cancer prognosis is directly relevant to building clinically applicable models from spatial and single-cell data.
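The CCC reported for TmSNet is presumably Lin's concordance correlation coefficient, which penalizes systematic offset and scale differences that Pearson correlation ignores. The sketch below is a minimal standard-library version of that textbook formula. The function name and toy values are assumptions for illustration and do not come from the paper.

```python
def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient, using population (1/n) moments."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n

    denominator = var_o + var_p + (mean_o - mean_p) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denominator


# Hypothetical toy values: predictions are perfectly correlated with the
# observations but shifted up by one unit.
observed = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
print(round(concordance_correlation(observed, observed), 3))  # 1.0
print(round(concordance_correlation(observed, shifted), 3))   # 0.714
```

The shifted predictions would score a Pearson correlation of 1, but the CCC drops to about 0.714 because the values no longer fall on the identity line.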

Computational #6 READ FULL

STAT: A multi-agent framework for integrated and interactive spatial transcriptomics analysis

bioRxiv | Published 2026-05-05 | preprint | DOI: 10.64898/2026.05.01.722244

Authors: Chen Y et al.

spatial transcriptomics, multi-agent systems, large language models, interactive analysis, benchmarking

Summary: Introduces STAT, a multi-agent framework making spatial transcriptomics analysis conversational and interactive. Features a persistent session, shared tissue viewer, and staged skill-aware pipeline. Outperforms baseline LLMs and existing autonomous agents across 11 analytical task categories on three spatial platforms. Successfully reproduces published Visium HD colorectal cancer findings from natural language prompts alone.

Why it matters: Represents a practical integration of LLM agents with spatial biology workflows, maintaining transparency and user control while dramatically reducing analysis overhead.

Why for Yiru: Multi-agent LLM frameworks for spatial transcriptomics analysis are directly relevant to building intelligent analysis pipelines for spatial omics data.

Biomedical discoveries

Biomedicine

6 selected

Biomedicine #1 READ FULL

Multispecific nanobody degraders co-deplete membrane receptors and enable targeted delivery of diverse payloads

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.02.722401

Authors: Kabir M et al.

targeted protein degradation, nanobody, ADC, PROTAC, cancer therapy, EGFR, cMET

Summary: Develops MINDS (Multivalent Interchangeable Nanobody Degradation System), a modular nanobody-Fc chassis co-engaging EGFR, cMET, and TfR1 for lysosomal co-depletion and intracellular payload delivery. Tritazumab achieves picomolar degradation potency with near-maximal depletion within ~1.5 hours. BRD4 molecular glue conjugate improved selectivity window >100-fold; EZH2 PROTAC conjugate achieved ~1,000-fold increase in intracellular degradation potency versus free PROTAC.

Why it matters: A platform technology integrating multispecific receptor degradation with diverse payload delivery (cytotoxic, molecular glue, PROTAC) that could transform targeted cancer therapy by addressing receptor heterogeneity and compensatory signaling.

Why for Yiru: Multispecific targeted degradation and payload delivery is highly relevant to immunotherapy and tumor microenvironment engineering.

Biomedicine #2 READ FULL

Local translation drives glioblastoma heterogeneity and tumor invasion

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.02.722387

Authors: Layer N et al.

glioblastoma, tumor invasion, spatial transcriptomics, subcellular translation, tumor microenvironment

Summary: Establishes local protein translation as a fundamental driver of tumor microtube (TM) dynamics and invasive cell states in glioblastoma. A subcellular transcriptomics approach that combines organelle organization with spatially resolved transcriptomics reveals that TM gene expression drives cell state identity. Targeted disruption of TM-localized translation via photoswitchable puromycin, and knockdown of the TM-enriched proteins GPM6A and GAP43, impair invasion and reduce tumor growth.

Why it matters: Identifies a previously unappreciated subcellular mechanism driving glioblastoma invasion, with direct therapeutic implications for targeting local translation to block brain colonization.

Why for Yiru: Subcellular spatial transcriptomics approach and the link between local translation and tumor cell state plasticity are highly innovative and relevant to spatial multi-omics methods development.

Biomedicine #3 READ FULL

Glutamine-dependent downregulation of FLT3-ITD is a mechanism of FLT3 inhibitor resistance in FLT3-ITD AML in hypoxia

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.02.722336

Authors: Silvestri G et al.

AML, FLT3-ITD, drug resistance, tumor microenvironment, hypoxia, glutamine metabolism

Summary: Reveals that hypoxia, characteristic of the bone marrow niche, causes a 3–5-fold increase in FLT3 inhibitor IC50 through glutamine-dependent upregulation of the ubiquitin ligase c-CBL, which accelerates FLT3-ITD proteasomal degradation (half-life 1.0 vs 2.5 hours). The glutaminase inhibitor telaglenastat abrogates c-CBL upregulation, preserves FLT3-ITD expression, and synergizes with FLT3 inhibitors in hypoxia.

Why it matters: Explains the clinical observation that FLT3 inhibitors clear blood blasts but not bone marrow blasts, and identifies a metabolic intervention (glutaminase inhibition) that restores sensitivity.

Why for Yiru: Metabolic microenvironment-driven drug resistance is a critical theme in tumor immunology and directly relevant to understanding spatial heterogeneity in treatment response.
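To put the two FLT3-ITD half-lives from the summary (1.0 vs 2.5 hours) in perspective, the sketch below converts them to decay rates and remaining protein fractions. It assumes simple first-order decay, which the digest does not state, and the function names are illustrative. Reading the shorter half-life as the hypoxic condition follows the summary's wording about accelerated degradation.

```python
import math


def decay_rate_per_hour(half_life_h):
    """First-order rate constant k = ln(2) / half-life (assumed kinetics)."""
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_h


def fraction_remaining(elapsed_h, half_life_h):
    """Fraction of the starting protein left after elapsed_h hours."""
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_h / half_life_h)


SHORT_HALF_LIFE_H = 1.0  # from the summary; read here as the hypoxic condition
LONG_HALF_LIFE_H = 2.5   # from the summary; read here as the comparison condition

rate_ratio = decay_rate_per_hour(SHORT_HALF_LIFE_H) / decay_rate_per_hour(LONG_HALF_LIFE_H)
print(f"decay is {rate_ratio:.1f}x faster with the shorter half-life")

for hours in (1.0, 2.5, 5.0):
    short = fraction_remaining(hours, SHORT_HALF_LIFE_H)
    long_ = fraction_remaining(hours, LONG_HALF_LIFE_H)
    print(f"after {hours:.1f} h: {short:.3f} vs {long_:.3f} of protein remaining")
```

Under this assumption the shorter half-life corresponds to a 2.5-fold higher degradation rate, so after 2.5 hours roughly 18% of the protein remains instead of 50%.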

Biomedicine #4 READ FULL

Retroelement Hypomethylation Links Hypoxia Signaling, Immune Phenotypes, and Survival in Clear Cell Renal Cell Carcinoma

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.01.722263

Authors: Nnam CF et al.

ccRCC, epigenetics, retrotransposons, tumor microenvironment, immune infiltration, cGAS-STING

Summary: Identifies three reproducible retroelement (RE) methylation subtypes (Repressed, Transient, Active) in ccRCC through genome-wide prediction across Alu, LINE-1, and LTR elements. The Active subtype shows significantly worse survival, reduced EPAS1/HIF2A expression, increased immune infiltration, elevated PD-1, and heightened cGAS-STING/interferon signaling: an immune-inflamed yet immunosuppressed state. Findings validated in CPTAC and independently replicated in an institutional cohort.

Why it matters: Establishes retroelement methylation as a novel molecular classifier in ccRCC linking epigenetic dysregulation to immune phenotypes, with potential for improving risk stratification.

Why for Yiru: Integration of epigenomic features with tumor immune microenvironment phenotypes using multi-omic computational approaches is a core interest.

Biomedicine #5 READ FULL

Integrated Multi-Omics Identifies Lineage-Dependent Myeloid Cells Recruitment and the APP-CD74 Axis as an Immunoregulatory Target in Pediatric High-Grade Glioma

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.01.722277

Authors: Wang Z et al.

DIPG, pediatric glioma, tumor-associated macrophages, tumor microenvironment, immunotherapy, spatial biology

Summary: Uses bulk RNA-seq (26 DIPG autopsy specimens), scRNA-seq (8 patients), and CellChat analysis to reveal that DIPG tumors actively recruit monocytes through chemokine-mediated mechanisms driven by a mesenchymal-like lineage state. Identifies APP-CD74 signaling as a prominent tumor-TAM interaction pathway. APP suppression in tumors attenuates proinflammatory TAM activity. Protein docking identifies the APP-CD74 binding interface for therapeutic targeting.

Why it matters: Identifies a druggable tumor-myeloid communication axis in a devastating pediatric brain cancer with no effective treatments, providing a structural basis for therapeutic development.

Why for Yiru: Multi-omics deconvolution of tumor-immune communication with structural follow-up is an exemplary workflow for translational computational immunology.

Biomedicine #6 READ FULL

Spatial transcriptomics identifies a translayer architecture of pyroptosis-related transcription in systemic sclerosis skin

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.03.722547

Authors: Oryoji D et al.

spatial transcriptomics, pyroptosis, systemic sclerosis, autoimmunity, skin, inflammasome

Summary: Reanalyzes public Visium skin sections (4 healthy, 9 systemic sclerosis), revealing a conserved translayer pyroptosis architecture: epidermal NLRP1/PYCARD/CASP4 bias vs. dermal GSDMD bias. This spatial separation is detectable in healthy skin and enhanced in SSc. Spatial deconvolution shows that dermal GSDMD is associated with endothelial abundance. Findings replicated in an independent cohort (10 SSc sections).

Why it matters: Demonstrates that inflammatory programs in autoimmune disease have a reproducible spatial architecture that may require compartment-specific therapeutic targeting rather than systemic inhibition.

Why for Yiru: Spatial transcriptomics deconvolution of immune programs in inflammatory disease is directly relevant to building spatial analysis pipelines for tumor and autoimmune microenvironments.

Cross-disciplinary watchlist

Other Fields

3 selected

Field #1 READ FULL

ArchaicSeeker 3.0: A deep-learning framework for scalable, haplotype-resolved inference of archaic introgression

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.05.722798

Authors: Wang B et al.

deep learning, population genetics, human evolution, local ancestry inference, genomics

Summary: Introduces ArchaicSeeker 3.0, a deep-learning framework for haplotype-resolved detection of archaic introgression. Integrates tract-scale sequence modeling with overlap-aware reassembly and boundary refinement. Simulation-trained model avoids inference-time recalibration, outperforming existing methods in precision, recall, and F1 score. Applied to 3,453 genomes from 209 populations, identifies novel introgressed regions with locus-level phylogenetic support.

Why it matters: Advances deep learning in population genetics by providing a scalable framework for detecting archaic ancestry that needs no inference-time recalibration and is reported to generalize across diverse demographic scenarios.

Why for Yiru: Deep learning frameworks for sequence-based inference that achieve robustness without demographic assumptions are methodologically relevant to building generalizable models for biological sequence analysis.
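The precision, recall, and F1 comparison in the ArchaicSeeker 3.0 summary can be made concrete with a small sketch. It scores predicted introgressed positions against a simulated truth set, per position. The per-position unit, the function name, and the coordinates are all assumptions for illustration, since the digest does not say how the paper defines a true positive.

```python
def precision_recall_f1(true_positions, predicted_positions):
    """Precision, recall, and F1 over sets of genomic positions."""
    true_set = set(true_positions)
    predicted_set = set(predicted_positions)
    true_positives = len(true_set & predicted_set)

    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical tract: truth covers positions 100-199, the call covers 150-199.
truth = range(100, 200)
call = range(150, 200)
precision, recall, f1 = precision_recall_f1(truth, call)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# precision=1.000 recall=0.500 f1=0.667
```

A conservative caller that reports only half of a tract keeps perfect precision but loses recall, which is why the three numbers are usually read together.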

Field #2 READ FULL

Simple baselines rival protein language models in mutation-dense design tasks

bioRxiv | Published 2026-05-06 | preprint | DOI: 10.64898/2026.05.01.722313

Authors: Talpir I et al.

protein language models, benchmarking, protein design, machine learning, methodology

Summary: Benchmarks widely used protein language models against conventional baselines in dense, experimentally validated multi-mutant landscapes. Finds that regardless of architecture and parameter count, pLMs are statistically similar to one another and none consistently outperforms conventional methods. Zero-shot functional variant discrimination is comparable to homology-based methods. Suggests pLMs may need biophysical/structural priors for protein function design.

Why it matters: A sobering reality check for the protein language model field: on mutation-dense design tasks, simpler conventional methods rival pLMs, and no pLM consistently outperforms them. This is important for directing future ML research efforts.

Why for Yiru: Critical methodological benchmarking of foundation models against simple baselines is an essential practice that should be applied to spatial and single-cell foundation models as well.

Field #3 READ FULL

Uncertainty-aware localization microscopy by variational diffusion

bioRxiv | Published 2026-05-05 | preprint | DOI: 10.64898/2026.05.01.722206

Authors: Seitz C et al.

diffusion models, variational inference, super-resolution microscopy, uncertainty quantification, computer vision

Summary: Proposes a conditional variational diffusion model (CVDM) for kernel density estimation in single-molecule localization microscopy. Models a probability distribution over high-resolution solutions to the ill-posed inverse problem of localizing fluorescent molecules in dense images. Enables uncertainty quantification of reconstructed images, a capability absent from existing deep models for localization microscopy.

Why it matters: Introduces uncertainty quantification to deep learning-based super-resolution microscopy, enabling researchers to assess confidence in reconstructed molecular localizations — critical for biological interpretation.

Why for Yiru: Uncertainty-aware generative models for imaging are methodologically relevant to spatial transcriptomics analysis, where confidence in spatial feature detection is essential.
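The CVDM summary describes kernel density estimation as the quantity being modeled. The sketch below shows only that target representation: a Gaussian kernel density image rendered from known molecule coordinates. It is not the diffusion model, and the function name, grid size, and kernel width are illustrative assumptions.

```python
import math


def render_density(localizations, width, height, sigma):
    """Sum an isotropic 2D Gaussian kernel per localization, sampled at pixel centers."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)  # each kernel integrates to 1
    image = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            center_x, center_y = col + 0.5, row + 0.5
            for x, y in localizations:
                dist_sq = (center_x - x) ** 2 + (center_y - y) ** 2
                image[row][col] += norm * math.exp(-dist_sq / (2.0 * sigma ** 2))
    return image


# Hypothetical example: two nearby emitters on a 6 x 6 pixel grid.
density = render_density([(2.5, 2.5), (3.5, 2.5)], width=6, height=6, sigma=0.8)
for row in density:
    print(" ".join(f"{value:.2f}" for value in row))
```

A deterministic rendering like this yields a single image. The approach described in the summary instead models a distribution over such images, which is where the uncertainty estimates come from.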