What Does the Internet Do to the Brain?
Mapping Cortical Activation Fingerprints Across Digital Content Modalities Using a Deep fMRI Encoder
Activation Cartography maps 3,008 natural language stimuli across 13 internet content categories against predictions from TRIBE v2 — a 177M-parameter deep neural encoder trained on real fMRI recordings — revealing statistically significant, category-level differences in predicted cortical recruitment.
Abstract
Background
Despite extensive single-stimulus neuroscience on emotional, narrative, and threatening media, no large-scale comparative study has examined how distinct categories of internet content differentially engage the cortex.
Methods
Activation Cartography maps 3,008 natural language stimuli across 13 internet content categories against predictions from TRIBE v2 — a 177M-parameter deep neural encoder trained on real fMRI recordings that predicts whole-cortex haemodynamic responses. Each stimulus yielded a predicted activation profile across 20,484 cortical surface points, summarised into six anatomical regions.
Results
A one-way ANOVA revealed a significant main effect of content type (F(12, 2995) = 13.51, p < 10⁻²⁶, η² = 0.051). ThreatSafety content ranked highest and Narrative lowest (Cohen’s d = −0.82). A dominant cortical gradient (PC1 = 96.9% variance) contrasts sensory-language against executive-motor cortex across all categories.
Implications
Different internet content categories engage distinct brain circuits with statistically significant differences in predicted intensity. The Global Workspace Theory (GWT) prediction that threat-laden content drives broad cortical activation received the strongest support. The analysis pipeline and registered hypotheses are released with the project.
Introduction
Digital media consumption has become a defining feature of contemporary cognitive life. Recent estimates indicate the average adult consumes six to eight hours of digital content per day — a duration that exceeds sleep for many subpopulations. A fundamental empirical question follows: whether distinct categories of internet content engage the cerebral cortex equivalently, or whether systematic, category-level differences in predicted neural recruitment can be identified at scale.
Activation Cartography
Rather than running a single-category fMRI study, Activation Cartography uses a validated deep neural encoder (TRIBE v2) to predict whole-brain responses at scale — enabling a 13-category comparative study with 3,008 stimuli that would be logistically impossible with real scanner time.
The neuroscience of media consumption has historically been constrained by the throughput of fMRI acquisition: each participant yields a few hundred trials per session, making large comparative studies prohibitively expensive. TRIBE v2 breaks this barrier by predicting whole-cortex haemodynamic responses from text inputs, enabling population-scale analysis of content-type effects on predicted brain activation.
Four neuroscientific frameworks are evaluated against the activation patterns: Global Workspace Theory (GWT), Free Energy Principle (FEP), Default-mode Circuit Theory (DCT), and Integrated Information Theory (IIT). Each makes distinct predictions about which content types should drive the broadest or most intense cortical recruitment.
Methods
Stimulus Construction
The 3,008 stimuli are drawn from established NLP benchmarks and live internet sources and are distributed across 13 content categories: ThreatSafety, News, Social, Scientific, Narrative, Emotional, AudioText, ImageVisual, Educational, Persuasive, Humour, Instructional, and Commerce. Each stimulus is a short natural language passage (1–3 sentences).
TRIBE v2 Encoder
TRIBE v2 is a 177-million-parameter deep neural encoder trained on real functional MRI recordings. Given a text input, it predicts a whole-cortex haemodynamic response across 20,484 cortical surface points. Two encoding modes are used: hash-mode (fast, token-level) and semantic-mode (LLaMA-3.2-3B embeddings, N = 390 replication sample).
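To make the shape of the encoder output concrete, the sketch below reduces one 20,484-point profile to per-region means. It is a minimal illustration under stated assumptions, not the project's pipeline: the function `summarise_by_region`, the toy parcellation, and the random profile are all hypothetical. Five of the region names come from regions mentioned in the Results, and `other` is a placeholder for the sixth, which the text does not name.

```python
import random
from collections import defaultdict

N_VERTICES = 20_484  # number of cortical surface points stated in the text

# Assumed region names: five appear in the Results; "other" is a placeholder.
REGIONS = ["auditory", "visual", "language", "prefrontal", "motor", "other"]


def summarise_by_region(profile, labels):
    """Average a per-vertex activation profile within each region label."""
    if len(profile) != len(labels):
        raise ValueError("profile and labels must have the same length")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, region in zip(profile, labels):
        sums[region] += value
        counts[region] += 1
    return {region: sums[region] / counts[region] for region in sums}


rng = random.Random(0)
# Toy parcellation: vertices are assigned to regions round-robin (not a real atlas).
labels = [REGIONS[i % len(REGIONS)] for i in range(N_VERTICES)]
# Stand-in for one predicted activation profile (random values, not model output).
profile = [rng.gauss(0.0, 1.0) for _ in range(N_VERTICES)]

region_means = summarise_by_region(profile, labels)
global_mean = sum(profile) / len(profile)
```

With a real parcellation the regions would differ in size, so the unweighted mean of the six regional values would generally not equal the global mean used in the ANOVA.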
Statistical Analysis
Each stimulus’s 20,484-point activation profile is summarised into six anatomical regions. A one-way ANOVA tests the main effect of content type on mean global activation. Effect sizes are reported as η² and Cohen’s d. Principal component analysis across category mean profiles identifies dominant cortical gradients.
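The statistics named above can be sketched in a few lines. This is a from-scratch illustration, not the released analysis code: `one_way_anova` and `cohens_d` are hypothetical helper names, the three small groups are toy numbers, and the pooled-standard-deviation form of Cohen's d is an assumption, since the text does not say which variant was used.

```python
import math


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_sq


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    pooled = math.sqrt(
        ((len(a) - 1) * var_a + (len(b) - 1) * var_b) / (len(a) + len(b) - 2)
    )
    return (mean_a - mean_b) / pooled


# Toy data: three groups standing in for per-category global activation values.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w, eta_sq = one_way_anova(toy_groups)
d_first_vs_last = cohens_d(toy_groups[0], toy_groups[2])
```

For the toy groups this gives F(2, 6) = 13.0, η² = 0.8125, and d = −4.0. The sign of d depends only on which group is passed first.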
Results
A one-way ANOVA on predicted global activation revealed a statistically significant main effect of content type across all 13 categories.
ANOVA: F(12, 2995) = 13.51, p < 10⁻²⁶
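The reported degrees of freedom and effect size can be cross-checked from figures already given in the text. For a one-way ANOVA, η² = F·df_between / (F·df_between + df_within), so the stimulus count, category count, and F statistic should reproduce η² ≈ 0.051. The variable names below are illustrative.

```python
n_stimuli = 3008      # stimuli reported in the text
n_categories = 13     # content categories reported in the text
f_reported = 13.51    # reported F statistic

df_between = n_categories - 1          # 12
df_within = n_stimuli - n_categories   # 2995

# Identity linking F and eta-squared in a one-way ANOVA.
eta_sq_check = (f_reported * df_between) / (f_reported * df_between + df_within)
print(df_between, df_within, round(eta_sq_check, 3))  # 12 2995 0.051
```

The result matches the reported F(12, 2995) and η² = 0.051. By the usual rough benchmarks this is a small-to-medium effect, even though the p-value is extremely small at this sample size.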
Under hash-mode encoding, ThreatSafety content ranked highest and Narrative lowest (Cohen’s d = −0.82). The semantic replication (N = 390, LLaMA-3.2-3B) produced a 4× wider activation spread, with AudioText, ImageVisual, and Emotional leading — a ranking essentially uncorrelated with hash-mode ordering (r = 0.09).
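Agreement between two category orderings can be quantified with a rank correlation. The text does not say whether the reported r is a Pearson or Spearman coefficient, so the sketch below shows Spearman's coefficient (Pearson correlation on average ranks) as one plausible reading. The helper names and the four toy values are hypothetical and are not the study's category means.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-category means under two encoders (made-up numbers for illustration).
toy_hash_means = [0.9, 0.4, 0.7, 0.1]
toy_semantic_means = [0.2, 0.8, 0.3, 0.6]
rho = spearman(toy_hash_means, toy_semantic_means)
```

For these toy values rho is −0.8. With only 13 categories, a coefficient near zero has a wide confidence interval, so it indicates a lack of agreement rather than establishing independence.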
A dominant cortical gradient (PC1, explaining 96.9% of variance) contrasts sensory-language cortex (high loading: auditory, visual, language network) against executive-motor cortex (low loading: prefrontal, motor) across all 13 categories. This gradient is consistent across encoding modes despite the different category rankings.
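The "variance explained by PC1" figure is the largest eigenvalue of the covariance matrix of category mean profiles, divided by the sum of all eigenvalues (the trace). The sketch below computes it with plain power iteration on a toy category-by-region matrix. The function names, the three toy columns (two "sensory-language" regions and one "executive" region), and the values are hypothetical, and a production analysis would more likely use a linear algebra library.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns of a row-major matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    p = len(matrix)
    # Non-uniform start so the vector is not orthogonal to a simple contrast axis.
    vec = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of the matrix")
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the (unit-length) converged vector.
    eigenvalue = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue, vec


def pc1_variance_ratio(rows):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance_matrix(rows)
    eigenvalue, loadings = top_eigenpair(cov)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    return eigenvalue / total_variance, loadings


# Toy category means: columns 0 and 1 rise together while column 2 falls.
toy_profiles = [
    [1.0, 0.9, -1.0],
    [2.0, 2.1, -2.0],
    [3.0, 2.9, -3.0],
    [4.0, 4.1, -4.0],
]
ratio, loadings = pc1_variance_ratio(toy_profiles)
```

In the toy matrix nearly all variance lies along one axis, and the loadings for the first two columns take the opposite sign to the third, which mirrors the kind of contrast described above. A very high PC1 share can also arise when categories differ mainly in overall activation level, so the loadings are worth inspecting alongside the percentage.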
The regional breakdown shows that ThreatSafety content activates the language network and prefrontal cortex most strongly under hash-mode, while AudioText and ImageVisual content drives the largest visual and auditory cortex responses under semantic encoding. This pattern suggests that hash-mode captures surface lexical features while semantic-mode captures deeper representational content.
Theory Evaluation
Four neuroscientific frameworks were assessed against predicted activation patterns, each making distinct testable predictions about which content types should drive the broadest cortical recruitment.
Global Workspace Theory — Strongest Support
GWT predicts that threat-laden content should trigger a global broadcast, driving widespread cortical ignition. ThreatSafety content ranking highest under hash-mode (d = −0.82) directly supports this. The finding that a single dominant gradient accounts for 96.9% of between-category variance is also consistent with GWT’s single-workspace model.
The Free Energy Principle predicts that prediction-error-rich content (novel, surprising, uncertain stimuli) should drive higher activation. The evidence gives moderate support: ThreatSafety and News content (high surprise value) rank highly, but the correlation with uncertainty proxies is weak (r ≈ 0.3).
Default-mode Circuit Theory predicts that narrative and self-referential content should activate the default mode network (DMN) most strongly. This receives mixed evidence: Narrative ranked lowest under hash-mode but higher under semantic encoding, suggesting that encoding mode mediates the narrative-DMN link.
Integrated Information Theory predicts that content with higher integrated information (Φ) should drive more cortical activation. IIT is classed as mixed evidence, but the result is better read as inconclusive: there is no reliable proxy for Φ in natural language stimuli, so this prediction cannot be tested at current resolution.
Conclusion
Activation Cartography demonstrates that different internet content categories engage distinct brain circuits with statistically significant differences in predicted intensity (F(12, 2995) = 13.51, p < 10⁻²⁶, η² = 0.051). The dominant cortical gradient (sensory-language vs executive-motor) is stable across encoding modes and accounts for 96.9% of between-category variance.
The encoding-mode dependence of category rankings (r = 0.09 between hash-mode and semantic-mode) is the study’s most important methodological finding: surface lexical features (hash-mode) and deep semantic representations (semantic-mode) produce systematically different activation predictions, suggesting that fMRI encoding models are sensitive to which level of linguistic representation is used as input.
Future directions include higher-powered semantic replication (N ≥ 150 per category) to resolve the hash/semantic discrepancy, extension to multimodal stimuli (images, audio) using TRIBE v2’s full multimodal encoder, and pre-registration of the category-ranking hypothesis for confirmatory testing.