Meta-D: Metadata-Driven Attention via Transformer Maximizer (Tₘₐₓ) Routing

Meta-D Architecture

MICCAI 2026 Paper System Code

1. The Genesis: Curing Image-Only Blind Spots with Metadata

The Hypothesis: Injecting clinical metadata, such as MRI sequence and imaging plane, into the pipeline should improve tumor detection accuracy over standard image-only models.

The Visual Proof: After the metadata-conditioned model achieved higher F1 scores, Grad-CAM maps suggested why: with metadata, the network's attention moves away from spurious regions and onto the tumor core.

The Evolution: Showing that metadata reduces 2D misclassifications became the foundation for scaling Meta-D to handle entirely missing modalities in 3D.

Figure 1: Grad-CAM visualization. The Image-Only baseline (left) attends to image artifacts. With Meta-D modulation (right), attention concentrates on the tumor core.

2. Initial Validation: Brain Tumor Classification (Meta-D)

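To quantify the visual evidence from the Grad-CAM analysis, the architecture was tested on a 2D classification task, adding metadata fields incrementally. Combining sequence and plane metadata gives the best mean score in every row of the table below. The single-field variants do not always beat the image-only baseline, and several differences fall within the reported ± spread.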

Dataset	Bias corr.	2D image only	Meta-D (ours): Img + Seq	Meta-D (ours): Img + Plane	Meta-D (ours): Img + Seq + Plane
BraTS 2020	✓	0.9037 ± 0.0033	0.9129 ± 0.0015	0.9108 ± 0.0255	0.9138 ± 0.0089
BraTS 2020	✓	0.9055 ± 0.0209	0.9030 ± 0.0130	0.9070 ± 0.0028	0.9099 ± 0.0134
BRISC	✓	0.6890 ± 0.0226	0.6860 ± 0.0622	0.6955 ± 0.0118	0.6989 ± 0.0199
BRISC	✓	0.6896 ± 0.0670	0.6676 ± 0.0260	0.6321 ± 0.0874	0.7016 ± 0.0483
BRISC (skull-stripped)	✓	0.6986 ± 0.0393	0.7091 ± 0.0988	0.7116 ± 0.0479	0.7248 ± 0.0163
BRISC (skull-stripped)	✓	0.7222 ± 0.0671	0.6976 ± 0.0109	0.7005 ± 0.0560	0.7233 ± 0.0608
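
One way to picture the "Img + Seq", "Img + Plane", and "Img + Seq + Plane" variants is as a small metadata vector that rescales image features. The sketch below is only an illustration under stated assumptions: the one-hot vocabularies, the function names (`encode_metadata`, `modulate`), and the FiLM-style scale-and-shift form are hypothetical and are not taken from the Meta-D code.

```python
# Hypothetical sketch: metadata encoding plus FiLM-style feature modulation.
# The vocabularies and the modulation form are assumptions, not Meta-D's code.

SEQUENCES = ("FLAIR", "T1c", "T1", "T2")
PLANES = ("axial", "coronal", "sagittal")


def one_hot(value, vocabulary):
    """Return a one-hot list for value, or all zeros when value is None."""
    if value is None:
        return [0.0] * len(vocabulary)
    if value not in vocabulary:
        raise ValueError(f"unknown metadata value: {value!r}")
    return [1.0 if item == value else 0.0 for item in vocabulary]


def encode_metadata(sequence=None, plane=None):
    """Concatenate the sequence and plane encodings into one vector."""
    return one_hot(sequence, SEQUENCES) + one_hot(plane, PLANES)


def modulate(features, metadata, scale_weights, shift_weights):
    """Apply feature * (1 + gamma) + beta, with gamma and beta linear in metadata."""
    gamma = [sum(w * m for w, m in zip(row, metadata)) for row in scale_weights]
    beta = [sum(w * m for w, m in zip(row, metadata)) for row in shift_weights]
    return [f * (1.0 + g) + b for f, g, b in zip(features, gamma, beta)]


# "Img + Seq + Plane": both fields are supplied.
meta = encode_metadata(sequence="T1c", plane="axial")
features = [0.5, -1.0]
# One weight row per feature channel; the values are arbitrary demo numbers.
scale_w = [[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
shift_w = [[0.0] * 7,
           [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0]]
modulated = modulate(features, meta, scale_w, shift_w)
```

Passing only `sequence` or only `plane` leaves the other field as zeros, which is one simple way to represent the single-field variants with the same vector layout.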

3. The Core Innovation: Severing Computational Waste in 3D

Having shown that metadata steers attention in 2D, we applied the same routing logic to the harder problem of entirely missing modalities in 3D.

Conventional Transformers

Valid modality + missing modality → dense self-attention (forces interpolation of all inputs) → O(N²) complexity, hallucinates fake boundaries.

Standard networks allocate attention to all inputs, including missing ones. This wastes compute on connections to absent data and can propagate overconfident false positives into the final output.

Meta-D (Ours)

Valid modality + missing modality → T_max routing module (calculates entropy and evaluates metadata) → missing modality blocked → O(N) complexity, extracts true features only.

Meta-D estimates signal integrity via T_max, severs attention to corrupted or missing data, and routes computation only to valid features.
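
The routing idea can be sketched in a few lines. This is a minimal illustration, not the T_max module itself, whose exact formulation is not given in this section: the histogram entropy, the `min_entropy` threshold, the `present_flags` metadata dictionary, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical sketch of entropy-plus-metadata routing with masked attention.
# Thresholds, names, and the entropy estimate are illustrative assumptions.


def shannon_entropy(values, bins=8):
    """Entropy in bits of a histogram over intensities scaled to [0, 1]."""
    counts = [0] * bins
    for v in values:
        index = min(max(int(v * bins), 0), bins - 1)
        counts[index] += 1
    total = len(values)
    entropy = 0.0
    for count in counts:
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def route(modalities, present_flags, min_entropy=0.5):
    """Keep a modality only if metadata marks it present and its entropy is high enough."""
    return {
        name: bool(present_flags.get(name, False)
                   and shannon_entropy(voxels) > min_entropy)
        for name, voxels in modalities.items()
    }


def masked_softmax(scores, keep):
    """Softmax over kept entries; blocked entries get exactly zero weight."""
    kept = [s for s, k in zip(scores, keep) if k]
    if not kept:
        raise ValueError("at least one modality must be kept")
    peak = max(kept)
    exps = [math.exp(s - peak) if k else 0.0 for s, k in zip(scores, keep)]
    norm = sum(exps)
    return [e / norm for e in exps]


# T1 is flagged as present but is all zeros, so its entropy is 0 and it is blocked.
volumes = {"FLAIR": [0.1, 0.4, 0.7, 0.9], "T1": [0.0, 0.0, 0.0, 0.0]}
present = {"FLAIR": True, "T1": True}
keep = route(volumes, present)
weights = masked_softmax([1.0, 5.0], [keep["FLAIR"], keep["T1"]])
```

Even though the blocked modality has the larger raw score in this toy input, it receives zero attention weight, which is the behavior the diagram labels "Blocked".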

4. Final Proof: 3D Segmentation Performance & Robustness

By treating corrupted 3D data as a deterministic routing problem rather than relying on deep feature interpolation, Meta-D outperforms heavier transformer baselines while using fewer parameters. In Table 1 it reaches a 91.84% Dice score with 34.7M parameters, against 86.72% with 45.8M parameters for MMFormer.

Table 1: Architectural Comparison (Missing Modality Scenario)

Model Architecture	Computational Complexity	Parameter Count (M)	Overall Dice Score (%)
UNETR (Heavyweight Baseline)^[1]	O(N²)	102.1	84.52
MMFormer (Missing Modality Baseline)^[2]	O(N²)	45.8	86.72
Meta-D (Ours)	O(N)	34.7 (-24.1%)	91.84 (+5.12 pp)
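
For readers less familiar with the metric, the snippet below shows the standard Dice coefficient for binary masks, followed by the arithmetic behind the deltas in the Meta-D row. It is a sketch under assumptions: how the "overall" score is aggregated across tumor regions is not stated here, and the helper names are hypothetical. From the rounded table values, the Dice difference from MMFormer is 5.12 in absolute terms, and the relative parameter change computes to about -24.2%, close to the -24.1% listed.

```python
# Standard Dice coefficient on flat binary masks, plus delta arithmetic.
# Helper names are hypothetical; aggregation across regions is not modeled.


def dice_score(pred, target):
    """Dice = 2 * |P and T| / (|P| + |T|) for equal-length binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def absolute_gain(ours, baseline):
    """Difference in percentage points between two scores given in percent."""
    return round(ours - baseline, 2)


def relative_change_percent(ours, baseline):
    """Relative change of ours against baseline, in percent."""
    return round(100.0 * (ours - baseline) / baseline, 1)


example_dice = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
dice_gain = absolute_gain(91.84, 86.72)             # Meta-D vs. MMFormer
param_change = relative_change_percent(34.7, 45.8)  # Meta-D vs. MMFormer
```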

Table 2: Comprehensive Missing Modality Robustness (BraTS 2018)

Beyond efficiency, Meta-D is robust to missing inputs. We evaluated the architecture on all 15 non-empty combinations of the four MRI sequences (● = available, ○ = missing). In every scenario, T_max metadata routing outperformed the 3D image-only baseline, with the largest gain (68.95 to 74.07) when only T1 is available.

FLAIR	T1c	T1	T2	3D image only	Meta-D (ours): Img + Seq
●	○	○	○	82.80	83.38
○	●	○	○	71.85	73.98
○	○	●	○	68.95	74.07
○	○	○	●	81.34	83.78
●	●	○	○	85.74	86.27
●	○	●	○	85.38	85.76
●	○	○	●	86.56	87.00
○	●	●	○	76.00	77.89
○	●	○	●	85.04	85.74
○	○	●	●	84.52	85.66
●	●	●	○	86.35	86.93
●	●	○	●	87.46	87.88
●	○	●	●	87.26	87.72
○	●	●	●	85.82	86.28
●	●	●	●	87.72	88.24
Average				82.85	84.04
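
The 15 scenarios and the averages in Table 2 can be reproduced with a short script. The scores below are copied from the table in row order; the variable and function names are illustrative only.

```python
from itertools import combinations

MODALITIES = ("FLAIR", "T1c", "T1", "T2")

# (3D image only, Meta-D Img + Seq), copied from Table 2 in row order.
SCORES = [
    (82.80, 83.38), (71.85, 73.98), (68.95, 74.07), (81.34, 83.78),
    (85.74, 86.27), (85.38, 85.76), (86.56, 87.00), (76.00, 77.89),
    (85.04, 85.74), (84.52, 85.66), (86.35, 86.93), (87.46, 87.88),
    (87.26, 87.72), (85.82, 86.28), (87.72, 88.24),
]


def modality_subsets(modalities):
    """All non-empty subsets, smallest first, matching the table's row order."""
    return [
        subset
        for size in range(1, len(modalities) + 1)
        for subset in combinations(modalities, size)
    ]


subsets = modality_subsets(MODALITIES)   # 2**4 - 1 = 15 scenarios
table = dict(zip(subsets, SCORES))       # available modalities -> scores

mean_image_only = sum(base for base, _ in SCORES) / len(SCORES)
mean_meta_d = sum(meta for _, meta in SCORES) / len(SCORES)
wins = sum(1 for base, meta in SCORES if meta > base)
largest_gain_subset = max(table, key=lambda s: table[s][1] - table[s][0])

print(f"scenarios: {len(subsets)}, Meta-D wins: {wins}")
print(f"mean image only: {mean_image_only:.2f}, mean Meta-D: {mean_meta_d:.2f}")
print(f"largest gain when available modalities are: {largest_gain_subset}")
```

The means round to the "Average" row (82.85 and 84.04), and the largest improvement occurs in the T1-only scenario.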