Research Paper

What Does the Internet Do to the Brain?

Mapping Cortical Activation Fingerprints Across Digital Content Modalities Using a Deep fMRI Encoder

Does scrolling news light up your brain like watching a video, or reading a story? We used an AI brain model to find out which parts of the cortex wake up for 13 kinds of online content.

Activation Cartography maps 3,008 natural language stimuli across 13 internet content categories against predictions from TRIBE v2 — a 177M-parameter deep neural encoder trained on real fMRI recordings — revealing statistically significant, category-level differences in predicted cortical recruitment.

3,008 stimuli evaluated
13 content categories
20,484 cortical surface points

Concept Overview

Background

Despite extensive single-stimulus neuroscience on emotional, narrative, and threatening media, no large-scale comparative study exists of how distinct categories of internet content differentially engage the cortex.

Past brain studies have looked at one thing at a time: a scary clip here, a sad story there, a single emotional jolt. Nobody has lined up the many kinds of stuff we actually scroll through online and compared them side by side. That is the gap this project fills.

Methods

Activation Cartography maps 3,008 natural language stimuli across 13 internet content categories against predictions from TRIBE v2 — a 177M-parameter deep neural encoder trained on real fMRI recordings that predicts whole-cortex haemodynamic responses. Each stimulus yielded a predicted activation profile across 20,484 cortical surface points, summarised into six anatomical regions.

We collected 3,008 short text snippets — a mix of headlines, posts, stories, and more — sorted into 13 content categories. We then fed each snippet through TRIBE v2, an AI model with 177M tunable knobs that was trained on real fMRI brain scans. The model predicts how blood flow (the signal fMRI tracks) would shift across 20,484 points on the brain’s surface, which we group into six big regions.
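The reduction from a per-point profile to regional values can be pictured with a minimal sketch. It assumes that each cortical point carries a region label and that a region's value is the mean of its points; the source does not state the aggregation rule, and the function name, labels, and numbers below are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

def summarise_by_region(profile, region_labels):
    """Average a per-point activation profile within each labelled region."""
    if len(profile) != len(region_labels):
        raise ValueError("profile and region_labels must have the same length")
    buckets = defaultdict(list)
    for value, label in zip(profile, region_labels):
        buckets[label].append(value)
    # Assumption: the regional summary is a plain mean of the points in the region.
    return {label: fmean(values) for label, values in buckets.items()}

# Toy input: 6 points and 3 regions stand in for 20,484 points and six regions.
profile = [0.2, 0.4, 1.0, 1.2, -0.5, -0.1]
labels = ["visual", "visual", "language", "language", "motor", "motor"]
region_means = summarise_by_region(profile, labels)
print(region_means)
```

With a real surface atlas the label list would have one entry per cortical point, and the same function would return six values per stimulus.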

Results

A one-way ANOVA revealed a significant main effect of content type (F(12, 2995) = 13.51, p < 10⁻²⁶, η² = 0.051). ThreatSafety content ranked highest and Narrative lowest (Cohen’s d = −0.82). A dominant cortical gradient (PC1 = 96.9% variance) contrasts sensory-language against executive-motor cortex across all categories.

A standard statistical test (a one-way ANOVA) confirms that content type really matters (F(12, 2995) = 13.51, p < 10⁻²⁶, η² = 0.051). Scary or threatening posts light up the brain the most. Stories light it up the least (Cohen’s d = −0.82, a big gap). One pattern dominates everything: a tug-of-war between the brain’s sense-and-language side and its planning-and-movement side, explaining 96.9% of the variation. That tug-of-war shows up no matter which category you look at.

Implications

Different internet content categories engage distinct brain circuits with statistically significant differences in predicted intensity. The prediction of Global Workspace Theory (GWT) that threat-laden content drives broad cortical activation received the strongest support. The analysis pipeline and registered hypotheses are released with the project.

Different kinds of online content really do switch on different brain circuits, and the differences in intensity are not just noise. The theory that fits best is GWT — the idea that threatening content commands a brain-wide broadcast. The full code, data, and the predictions we made before running anything are public.

Digital media consumption has become a defining feature of contemporary cognitive life. Recent estimates indicate the average adult consumes six to eight hours of digital content per day — a duration that exceeds sleep for many subpopulations. A fundamental empirical question follows: whether distinct categories of internet content engage the cerebral cortex equivalently, or whether systematic, category-level differences in predicted neural recruitment can be identified at scale.

Most of our waking mental life now runs through a screen. Estimates put the average adult at six to eight hours of digital content a day, more than many of us spend asleep. So a fair question is: do a news headline, a meme, and a short story all hit the brain the same way? Or do they reliably pull on different circuits in ways we can actually measure?

Activation Cartography

Rather than running a single-category fMRI study, Activation Cartography uses a validated deep neural encoder (TRIBE v2) to predict whole-brain responses at scale — enabling a 13-category comparative study with 3,008 stimuli that would be logistically impossible with real scanner time.

Instead of putting a few people in an fMRI machine and showing them one type of content, Activation Cartography uses TRIBE v2 — a tested AI model that predicts whole-brain responses — to do the work at scale. That lets us compare 13 categories across 3,008 snippets, which would never be possible with real scanner time and human volunteers.

The neuroscience of media consumption has historically been constrained by the throughput of fMRI acquisition: each participant yields a few hundred trials per session, making large comparative studies prohibitively expensive. TRIBE v2 breaks this barrier by predicting whole-cortex haemodynamic responses from text inputs, enabling population-scale analysis of content-type effects on predicted brain activation.

Brain science about media has always hit the same wall: fMRI is slow and expensive. A single person in a scanner gets through only a few hundred items per session, so big comparison studies are out of reach. TRIBE v2 jumps that wall. Feed it text, and it predicts how the whole brain would respond — making it possible to study content effects at a scale real scanners cannot reach.

Four neuroscientific frameworks are evaluated against the activation patterns: Global Workspace Theory (GWT), Free Energy Principle (FEP), Default-mode Circuit Theory (DCT), and Integrated Information Theory (IIT). Each makes distinct predictions about which content types should drive the broadest or most intense cortical recruitment.

We compare the results against four big ideas in brain science: Global Workspace Theory (GWT), the Free Energy Principle (FEP), Default-mode Circuit Theory (DCT), and Integrated Information Theory (IIT). Each one makes a different bet about which kind of content should set off the widest or strongest brain response.

01

Stimulus Construction

3,008 stimuli drawn from established NLP benchmarks and live internet sources, distributed across 13 content categories: ThreatSafety, News, Social, Scientific, Narrative, Emotional, AudioText, ImageVisual, Educational, Persuasive, Humour, Instructional, and Commerce. Each stimulus is a short natural language passage (1–3 sentences).

We pulled 3,008 short text snippets from well-known language research datasets and from the live web, then sorted them into 13 content categories: ThreatSafety, News, Social, Scientific, Narrative, Emotional, AudioText, ImageVisual, Educational, Persuasive, Humour, Instructional, and Commerce. Each snippet is just 1–3 sentences long — roughly the length of a tweet or a caption.

3,008 Stimuli · 13 Categories · NLP Benchmarks

02

TRIBE v2 Encoder

TRIBE v2 is a 177-million-parameter deep neural encoder trained on real functional MRI recordings. Given a text input, it predicts a whole-cortex haemodynamic response across 20,484 cortical surface points. Two encoding modes are used: hash-mode (fast, token-level) and semantic-mode (LLaMA-3.2-3B embeddings, N = 390 replication sample).

TRIBE v2 is an AI model with 177 million tunable knobs, trained on real brain scans (fMRI) so it can guess how the brain would react to text. Hand it a sentence and it predicts a brain-wide response across 20,484 points on the cortex. We run it in two flavors: hash-mode, which is fast and looks at the words themselves, and semantic-mode, which uses a small language model (LLaMA-3.2-3B) to grasp the meaning — tested on 390 of the snippets as a double-check.

177M Parameters · 20,484 Cortical Points · LLaMA-3.2-3B

03

Statistical Analysis

Each stimulus’s 20,484-point activation profile is summarised into six anatomical regions. A one-way ANOVA tests the main effect of content type on mean global activation. Effect sizes are reported as η² and Cohen’s d. Principal component analysis across category mean profiles identifies dominant cortical gradients.

For each snippet, we shrink the 20,484-point brain map into six big anatomical regions to keep things manageable. A one-way ANOVA — a basic test for “does the group really make a difference?” — checks whether content type changes the overall brain response. We report effect sizes (η² and Cohen’s d) so it’s clear how large the differences are, not just whether they exist. Principal component analysis — a way of finding the strongest underlying pattern — pulls out the dominant cortical gradient.
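To make the test concrete, here is a minimal from-scratch sketch of a one-way ANOVA with η², run on made-up numbers. It illustrates the textbook computation only and is not the project's pipeline; the function name and the toy values are assumptions.

```python
from statistics import fmean

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a dict of label -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = fmean(all_values)
    group_means = {label: fmean(values) for label, values in groups.items()}
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(values) * (group_means[label] - grand_mean) ** 2
        for label, values in groups.items()
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (v - group_means[label]) ** 2
        for label, values in groups.items()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared

# Made-up global activation values for three of the category names.
toy = {
    "ThreatSafety": [5.0, 6.0, 7.0],
    "News": [2.0, 3.0, 4.0],
    "Narrative": [1.0, 2.0, 3.0],
}
f_stat, df_b, df_w, eta_sq = one_way_anova(toy)
print(f_stat, df_b, df_w, eta_sq)
```

The same bookkeeping explains the degrees of freedom in the reported result: 13 categories give 13 − 1 = 12, and 3,008 stimuli give 3,008 − 13 = 2,995.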

One-way ANOVA · Cohen’s d · PCA

A one-way ANOVA on predicted global activation revealed a statistically significant main effect of content type across all 13 categories.

The headline finding: content type really does make a difference. The standard statistical test (a one-way ANOVA) showed a clear, reliable effect across all 13 categories.

ANOVA: F(12, 2995) = 13.51, p < 10⁻²⁶
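The effect size reported elsewhere in the paper (η² = 0.051) can be cross-checked from this F statistic and its degrees of freedom alone, using the standard one-way ANOVA identity. This is a worked consistency check, not an additional result:

```latex
\eta^2
  = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}}
  = \frac{F \cdot df_{\text{between}}}{F \cdot df_{\text{between}} + df_{\text{within}}}
  = \frac{13.51 \times 12}{13.51 \times 12 + 2995}
  = \frac{162.12}{3157.12}
  \approx 0.051
```

Content type therefore accounts for roughly 5% of the variance in predicted global activation: highly significant at this sample size, but modest in magnitude.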

Under hash-mode encoding, ThreatSafety content ranked highest and Narrative lowest (Cohen’s d = −0.82). The semantic replication (N = 390, LLaMA-3.2-3B) produced a 4× wider activation spread, with AudioText, ImageVisual, and Emotional leading — a ranking essentially uncorrelated with hash-mode ordering (r = 0.09).

In hash-mode (the fast, word-level version), ThreatSafety content tops the chart and Narrative sits at the bottom (Cohen’s d = −0.82, a hefty gap). The meaning-aware semantic run (N = 390, using LLaMA-3.2-3B) spreads things out 4× wider, and a different cast leads: AudioText, ImageVisual, and Emotional. The two rankings barely match (r = 0.09), so the two modes are picking up on very different things in the same text.
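For readers unfamiliar with Cohen's d, the sketch below computes it for two independent groups using the pooled standard deviation. The source does not say which variant or subtraction order was used, so the pooled form and the Narrative-minus-ThreatSafety direction are assumptions, and the values are made up.

```python
from math import sqrt
from statistics import fmean, variance

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (fmean(a) - fmean(b)) / sqrt(pooled_var)

# Made-up global activation values; a negative d means the first group is lower.
narrative = [1.0, 2.0, 3.0]
threat_safety = [2.0, 3.0, 4.0]
d = cohens_d(narrative, threat_safety)
print(d)
```

Under this convention a negative value such as −0.82 means the first group sits about 0.8 pooled standard deviations below the second.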

Dominant cortical gradient (PC1 = 96.9% variance) contrasts sensory-language cortex (high loading: auditory, visual, language network) against executive-motor cortex (low loading: prefrontal, motor) across all 13 categories. This gradient is consistent across encoding modes despite the different category rankings.

One pattern dominates everything (PC1 = 96.9% variance): a tug-of-war between the brain’s sensing-and-language side (hearing, vision, language areas) and its planning-and-doing side (the prefrontal cortex, which weighs decisions, and the motor cortex, which moves the body). That tug-of-war shows up in all 13 categories. Even though the two encoding modes rank categories differently, this underlying pattern stays the same.
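A figure such as "PC1 = 96.9% variance" is the leading eigenvalue of the covariance matrix divided by the sum of all eigenvalues. The sketch below computes that ratio with power iteration on a toy matrix of category-by-region means. Whether the project ran PCA on six regional values or on all cortical points is not stated, so the input shape, function name, and numbers are assumptions.

```python
from math import sqrt

def pc1_variance_ratio(rows, iterations=200):
    """Return (share of variance on PC1, PC1 loadings) for equal-length rows."""
    n, p = len(rows), len(rows[0])
    # Centre each column (region) on its mean across rows (categories).
    col_means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - col_means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix between regions.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))
    # Power iteration for the leading eigenvector. A non-uniform start vector is
    # less likely to be orthogonal to a contrast-type axis than a uniform one.
    vec = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the leading eigenvalue.
    cov_vec = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
    top_eigenvalue = sum(v * cv for v, cv in zip(vec, cov_vec))
    return top_eigenvalue / total_variance, vec

# Toy category means over three regions; every row lies on a single line, so
# one component captures all of the variance by construction.
toy_means = [
    [-2.0, -1.0, 1.0],
    [0.0, 0.0, 0.0],
    [2.0, 1.0, -1.0],
    [4.0, 2.0, -2.0],
]
ratio, loadings = pc1_variance_ratio(toy_means)
print(ratio, loadings)
```

In the toy data the first and third columns load with opposite signs, which is the kind of contrast the paper describes between sensory-language and executive-motor cortex.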

Regional breakdown shows that ThreatSafety content activates the language network and prefrontal cortex most strongly under hash-mode, while AudioText and ImageVisual content drives the largest visual and auditory cortex responses under semantic encoding — suggesting hash-mode captures surface lexical features while semantic-mode captures deeper representational content.

Zooming in by brain region: in hash-mode, ThreatSafety content lights up the language areas and the prefrontal cortex (the brain’s decision-maker) the most. In semantic-mode, AudioText and ImageVisual content drive the biggest responses in the visual and auditory areas. The likely reason: hash-mode reacts to the words themselves, while semantic-mode reacts to what those words actually describe.

Four neuroscientific frameworks were assessed against predicted activation patterns, each making distinct testable predictions about which content types should drive the broadest cortical recruitment.

We checked the results against four big theories of how the brain works. Each one makes a specific bet about which kind of content should set off the widest brain response — predictions we can actually test against our data.

Global Workspace Theory — Strongest Support

GWT predicts that threat-laden content should trigger a global broadcast, driving widespread cortical ignition. ThreatSafety content ranking highest under hash-mode (d = −0.82) directly supports this. The finding that a single dominant gradient accounts for 96.9% of between-category variance is also consistent with GWT’s single-workspace model.

GWT pictures the brain as a stage. When something matters — like a threat — it gets broadcast to the whole stage, lighting up many regions at once. Our results fit: ThreatSafety came out on top in hash-mode by a big margin (d = −0.82). And the fact that a single pattern explains 96.9% of the differences between categories also fits GWT’s idea of one central stage. Of the four theories, GWT comes out looking the strongest.

The Free Energy Principle predicts that prediction-error-rich content (novel, surprising, uncertain stimuli) should drive higher activation. The observed support is moderate: ThreatSafety and News content (high surprise value) rank highly, but the correlation with uncertainty proxies is weak (r ≈ 0.3).

Free Energy Principle: this theory says the brain is constantly guessing what comes next, and content that surprises it — the unexpected or uncertain — should fire harder. Partial fit: ThreatSafety and News are both high on surprise, but the link between “surprising” and “brain lights up” is only modest (r ≈ 0.3).

Default-mode Circuit Theory predicts that narrative and self-referential content should activate the default mode network most strongly. The evidence is mixed: Narrative ranked lowest under hash-mode but higher under semantic encoding, suggesting that encoding mode mediates the narrative-DMN link.

Default-mode Circuit Theory: the default mode network is the set of brain regions that hums along when you’re daydreaming or thinking about yourself. This theory says stories and self-focused content should turn it on the most. Our results are split: Narrative ranked dead last in hash-mode but climbed when we used semantic-mode — so the link only shows up if the model is reading for meaning, not just words.

Integrated Information Theory predicts that content with higher integrated information (Φ) should drive more cortical activation. The evidence for IIT is inconclusive: there is no reliable proxy for Φ in natural language stimuli, so this prediction is untestable at current resolution.

Integrated Information Theory: IIT measures “richness of experience” with a number called Φ, and predicts that richer content should activate more of the brain. The honest verdict is: we cannot really tell yet. There is no good way to measure Φ in a short piece of text, so this prediction is essentially untestable with the tools we have today.

Activation Cartography demonstrates that different internet content categories engage distinct brain circuits with statistically significant differences in predicted intensity (F(12, 2995) = 13.51, p < 10⁻²⁶, η² = 0.051). The dominant cortical gradient (sensory-language vs executive-motor) is stable across encoding modes and accounts for 96.9% of between-category variance.

Activation Cartography shows that different kinds of online content really do switch on different brain circuits, and the differences in intensity are big enough to take seriously (F(12, 2995) = 13.51, p < 10⁻²⁶, η² = 0.051). The strongest pattern, a tug-of-war between the brain’s sense-and-language side and its planning-and-doing side, holds up no matter how we run the model and accounts for 96.9% of the differences between categories.

The encoding-mode dependence of category rankings (r = 0.09 between hash-mode and semantic-mode) is the study’s most important methodological finding: surface lexical features (hash-mode) and deep semantic representations (semantic-mode) produce systematically different activation predictions, suggesting that fMRI encoding models are sensitive to which level of linguistic representation is used as input.

The most important method lesson is that the two modes barely agree on which categories rank highest (r = 0.09). Reading words on the surface (hash-mode) and reading them for meaning (semantic-mode) give very different brain predictions. That is a warning to anyone using these AI brain models: the answer you get depends a lot on how you describe the input.
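Agreement between two category orderings can be quantified as sketched below. The source reports r = 0.09 without naming the coefficient, so this example shows one plausible choice, a Spearman rank correlation computed as Pearson's r on ranks. The category means are made up and the helper names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values with 1 = largest; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out

# Made-up category means under the two encoding modes.
hash_means = {"ThreatSafety": 0.9, "News": 0.7, "Emotional": 0.5, "Narrative": 0.1}
semantic_means = {"ThreatSafety": 0.3, "News": 0.2, "Emotional": 0.8, "Narrative": 0.5}

categories = sorted(hash_means)  # fixed order so both lists line up
hash_values = [hash_means[c] for c in categories]
semantic_values = [semantic_means[c] for c in categories]

# Spearman's rho is Pearson's r applied to the ranks.
rho = pearson_r(ranks(hash_values), ranks(semantic_values))
print(rho)
```

A value near zero, as reported for the two modes, means that knowing a category's position in one ordering says almost nothing about its position in the other.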

Future directions include higher-powered semantic replication (N ≥ 150 per category) to resolve the hash/semantic discrepancy, extension to multimodal stimuli (images, audio) using TRIBE v2’s full multimodal encoder, and pre-registration of the category-ranking hypothesis for confirmatory testing.

Next steps: run a bigger semantic-mode pass (N ≥ 150 per category) to settle which mode is closer to the truth, push beyond text by feeding TRIBE v2 real images and audio, and lock in the category rankings as a prediction up front, so a future study can confirm or refute them cleanly.
