
Understanding the Visual Cortex

Completed

7 days · 1 hr / day · Started March 21, 2026

The Spark

I read Phantoms in the Brain by V.S. Ramachandran and was blown away. Phantom limbs, visual agnosias, the fact that perception is essentially a controlled hallucination. His TED talk made me want to dig deeper into the visual cortex specifically, since that's the part of the brain closest to what I work on in computer vision.

The question that started it all: how does the brain process what the eye sees?

The Rules

  • 1 hour/day, 1 topic, 1 question
  • Each day's question comes from the previous day's gaps
  • No syllabus; follow whatever is most interesting

Experimental Setup

1. Ask the question: Start with whatever I was most curious about, then ask AI models for a high-level map of the key concepts. Claude was significantly more useful than ChatGPT here, better at distilling complex neuroscience into clear mental models.
2. Go to the source: Dive deep into the textbook, Principles of Neural Science, chapters 23 & 24 on intermediate and high-level visual processing. The textbook language was dense, so I used AI to distil difficult passages into something I could actually absorb in an hour.
3. Write by hand: Handwritten notes in an A5 notebook with a fountain pen (black ink, fine nib). No AI, no typing. ~13 pages of rather small handwriting by the end. This forces real understanding, not just pattern matching.
4. Publish: Upload the handwritten notes to AI to structure them for the website. The writing and the understanding are mine; this entire website section was designed and built by agentic models (entirely vibe coded).
Day 1 Sat, Mar 21

How does the brain begin processing vision?

Discovered that the brain splits visual processing into two parallel streams right from the retina: the dorsal stream (fast, spatial, action-oriented) fed by M-cells, and the ventral stream (slow, detailed, recognition-oriented) fed by P-cells. The dorsal stream operates in real time with no memory, while the ventral stream builds meaning hierarchically.

Read more →
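
To make that trade-off concrete in computer-vision terms, here is a toy NumPy sketch of the two channels. It is illustrative only: the downsampling factor, the frame window, and the function names are made-up stand-ins, not models of real M- and P-cell responses.

```python
import numpy as np

def dorsal_like(frames: np.ndarray) -> np.ndarray:
    """Crude magnocellular-style channel: coarse in space, sharp in time.
    Downsample each frame 4x, then respond only to frame-to-frame change."""
    coarse = frames[:, ::4, ::4]             # throw away fine spatial detail
    return np.abs(np.diff(coarse, axis=0))   # temporal difference: "what moved?"

def ventral_like(frames: np.ndarray) -> np.ndarray:
    """Crude parvocellular-style channel: sharp in space, sluggish in time.
    Average over the frame window, keeping appearance but blurring motion."""
    return frames.mean(axis=0)               # temporal average: "what is there?"

video = np.random.rand(8, 64, 64)        # toy input: 8 frames of 64x64 video
motion_map = dorsal_like(video)          # (7, 16, 16): fast, coarse, memoryless
detail_map = ventral_like(video)         # (64, 64): slow, detailed
```

The point of the contrast: the dorsal-like channel only ever sees change, and the ventral-like channel only ever sees structure.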
Day 2 Sun, Mar 22

What mechanisms power each visual stream?

Mapped out the specific mechanisms at each stage: motion energy and optical flow in the dorsal stream, hierarchical assembly and population coding in the ventral stream. Drew architectural diagrams for both pathways. Key insight: the ventral stream has more feedback connections running downward than feedforward ones running upward. The brain predicts what it should see.

Read more →
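
To ground the optical-flow piece, here is a minimal gradient-based flow estimate (Lucas-Kanade style) in NumPy. This is a sketch, not the dorsal stream's algorithm: real motion-energy models use banks of oriented spatiotemporal filters, and the patch size and test image here are arbitrary.

```python
import numpy as np

def lucas_kanade_patch(prev: np.ndarray, curr: np.ndarray) -> np.ndarray:
    """Estimate one (vx, vy) for a small patch by solving the brightness-
    constancy equation Ix*vx + Iy*vy + It = 0 in the least-squares sense."""
    Iy, Ix = np.gradient(prev)                        # spatial gradients
    It = curr - prev                                  # temporal gradient
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)    # N x 2 system
    b = -It.ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v                                          # [vx, vy]

patch = np.random.rand(16, 16)
shifted = np.roll(patch, 1, axis=1)                   # shift one pixel right
print(lucas_kanade_patch(patch, shifted))             # ~[1, 0], up to edge effects
```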
Day 3 Mon, Mar 23

How do neurons detect and assemble edges?

Dove into V1's simple and complex cells, the brain's edge detectors that work like CNN kernels. Discovered that edge detection moved closer to the cortex through evolution: mice detect edges before the cortex, cats use simple cells, but primates skip straight to complex cells. This gave us more flexibility because the cortex is plastic and connected to everything.

Read more →
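
Since the notes lean on the CNN-kernel analogy, here is a toy version of it: a Gabor patch standing in for a simple cell's receptive field, and the classic energy model (a squared quadrature pair) standing in for a complex cell. The size and frequency are arbitrary, and real V1 is far messier.

```python
import numpy as np

def gabor(size: int, theta: float, freq: float, phase: float) -> np.ndarray:
    """Oriented Gabor patch: the textbook model of a V1 simple cell's
    receptive field (and the hand-crafted ancestor of a CNN kernel)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)        # preferred orientation
    envelope = np.exp(-(x**2 + y**2) / (2 * (size / 4) ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr + phase)

def simple_cell(img: np.ndarray, kernel: np.ndarray) -> float:
    """Phase-sensitive: fires only if the edge sits in the right place."""
    return float((img * kernel).sum())

def complex_cell(img: np.ndarray, theta: float, freq: float = 0.1) -> float:
    """Energy model: pool a quadrature (cos/sin) pair so the response keeps
    orientation selectivity but drops sensitivity to exact edge position."""
    even = simple_cell(img, gabor(img.shape[0], theta, freq, 0.0))
    odd = simple_cell(img, gabor(img.shape[0], theta, freq, np.pi / 2))
    return even**2 + odd**2

img = np.zeros((21, 21)); img[:, 11:] = 1.0           # a vertical edge
print(complex_cell(img, 0.0))                         # strong: preferred orientation
print(complex_cell(img, np.pi / 2))                   # weak: orthogonal orientation
```

The squared quadrature pair is why complex cells tolerate position shifts: squaring and summing throws away phase, much as pooling in a CNN throws away location.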
Day 4 Tue, Mar 24

How do neurons use context from their neighbours?

Learned that neurons don't work alone. They communicate with neighbours via horizontal connections. Contextual modulation helps distinguish real edges from noise, and inhibitory surround (end-stopping) separates object edges from background. Also discovered that without tiny involuntary eye movements (microsaccades), vision would literally fade to nothing.

Read more →
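
A quick sketch of the surround idea, using a difference-of-Gaussians filter as a crude stand-in for excitatory-centre/inhibitory-surround circuitry (the sigmas are arbitrary):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround(img: np.ndarray, sigma_c: float = 1.0,
                    sigma_s: float = 3.0) -> np.ndarray:
    """Difference of Gaussians: a small excitatory centre minus a wider
    inhibitory surround. Uniform regions cancel to zero; edges survive."""
    return gaussian_filter(img, sigma_c) - gaussian_filter(img, sigma_s)

flat = np.ones((32, 32))                              # featureless field
edge = np.zeros((32, 32)); edge[:, 16:] = 1.0         # vertical edge

print(np.abs(center_surround(flat)).max())            # 0.0: nothing to report
print(np.abs(center_surround(edge)).max())            # large near column 16
```

The flat image producing zero output is the fading story in miniature: these responses are driven by contrast and change, so a perfectly stabilised, uniform input eventually carries no signal at all.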
Day 5 Thu, Mar 26

How does the brain perceive depth?

Deep dive into depth perception: three types of depth neurons, amodal completion (the brain fills in what's hidden), Kanizsa triangles (seeing edges that don't exist), disparity capture, and Da Vinci stereopsis where the absence of information itself becomes a depth signal. Also covered border ownership, brightness perception, pop-out vs serial search, and attention as the final gate.

Read more →
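
To make disparity concrete: it is just how far a feature shifts between the two eyes' images, with nearer surfaces shifting more. A toy block-matching sketch in NumPy (the patch size, search range, and shift convention are arbitrary choices, not a model of disparity-tuned neurons):

```python
import numpy as np

def disparity_at(left: np.ndarray, right: np.ndarray, row: int, col: int,
                 patch: int = 5, max_d: int = 10) -> int:
    """Find how far the patch at (row, col) in the left image has shifted
    in the right image by testing each candidate shift."""
    half = patch // 2
    ref = left[row - half:row + half + 1, col - half:col + half + 1]
    best_d, best_err = 0, np.inf
    for d in range(max_d + 1):
        cand = right[row - half:row + half + 1,
                     col - d - half:col - d + half + 1]
        err = np.sum((ref - cand) ** 2)               # sum of squared differences
        if err < best_err:
            best_d, best_err = d, err
    return best_d

left = np.random.rand(32, 32)
right = np.roll(left, -3, axis=1)                     # right view: shifted 3 px left
print(disparity_at(left, right, 16, 16))              # 3
```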
Day 6 Fri, Mar 27

How does seeing become knowing?

Entered high-level visual processing. V4 computes colour constancy and begins invariance; posterior IT builds complete object representations using overlapping columns and population coding (like vector embeddings); anterior IT connects perception to meaning. The agnosias revealed the architecture: damage different regions and different abilities break, whether perception, naming, face identity, or entire categories.

Read more →
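
The vector-embedding analogy is easy to make literal. In this toy sketch, an object is a pattern of activity across many broadly tuned units, and similarity is the angle between patterns. Everything here (unit count, Gaussian tuning, the 5-dimensional feature space) is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_units, n_features = 50, 5               # 50 "IT-like" units, toy feature space
preferred = rng.normal(size=(n_units, n_features))

def population_response(stimulus: np.ndarray) -> np.ndarray:
    """Each unit fires by how close the stimulus is to its preferred feature
    combination: identity lives in the whole pattern, not in any one unit."""
    dist = np.linalg.norm(preferred - stimulus, axis=1)
    return np.exp(-dist**2 / 2)           # broad Gaussian tuning

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

dog = rng.normal(size=n_features)
dog_again = dog + 0.1 * rng.normal(size=n_features)   # slightly different view
cat = rng.normal(size=n_features)

print(cosine(population_response(dog), population_response(dog_again)))  # near 1
print(cosine(population_response(dog), population_response(cat)))        # lower
```

Nearby views of the same object land on nearby population vectors, which is exactly the property that makes recognition robust to viewpoint.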
Day 7 Sat, Mar 28

The complete architecture — how vision flows through the brain

Consolidated everything into two comprehensive architectural diagrams: the full dorsal stream (retina to motor action) and the full ventral stream (retina to contextualised memory). Synthesised all mechanisms, connected the dots between streams, and reflected on what the brain teaches us about computer vision.

Read more →

So, how does the brain process what the eye sees?

It splits the work into two parallel streams, right from the retina. The dorsal stream is fast and memoryless: it tracks motion, builds spatial maps, and guides your hands and eyes in real time. It never "sees" objects. The ventral stream is slow and hierarchical: it detects edges, assembles contours, extracts shapes, builds object representations, and connects them to meaning and memory.

At every stage, the brain uses relative perception (never absolute), fills in gaps it can't see, predicts what should come next, and resolves ambiguity through context. Perception is not passive recording. It is active construction. The brain doesn't see the world as it is. It builds a model of what the world should be, and checks it against incoming data. When the model is good enough, that's what you perceive.
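
That loop, predict, compare, update, fits in a few lines. Here is a minimal sketch in the spirit of predictive coding (the linear Rao-Ballard flavour); the basis, learning rate, and iteration count are toy assumptions, and the cortex is obviously not literally running this gradient descent.

```python
import numpy as np

rng = np.random.default_rng(1)

# The "brain's" generative model: inputs are built from a few known
# basis patterns (a stand-in for learned visual structure).
basis = rng.normal(size=(4, 16))              # 4 hidden causes, 16-pixel input
true_causes = np.array([1.0, 0.0, -0.5, 0.0])
sensory_input = true_causes @ basis           # what actually hits the retina

estimate = np.zeros(4)                        # current belief about the causes
for step in range(500):
    prediction = estimate @ basis             # top-down: what I expect to see
    error = sensory_input - prediction        # bottom-up: prediction error
    estimate += 0.02 * (basis @ error)        # nudge belief to shrink the error

print(np.round(estimate, 2))                  # ~[1.0, 0.0, -0.5, 0.0]
```

When the error term goes quiet, the estimate is the percept: the model is good enough.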

The whole journey, from photon hitting the retina to a fully contextualised, emotionally-tagged memory, takes roughly 150 milliseconds.