DNA: The Double Helix — Structure, Base Pairing, and Replication

Inside every cell sits two metres of DNA coiled into a space six micrometres across. How the double helix works, why complementary base pairing matters, and how one molecule copies itself with extraordinary fidelity — with Mukherjee on the moment the mechanism of heredity became visible.

Last updated May 26, 2026

TL;DR

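DNA stores information in four bases (A, T, G, C). Chargaff's rules (A matches T, G matches C) and the Watson-Crick double helix explain how the two strands pair. Replication is semi-conservative: helicase unwinds the helix and DNA polymerase builds a new strand on each template, with an overall error rate of roughly 1 in 10^9 bases copied. Includes Mukherjee's perspective on the 1953 discovery.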

Nearly every one of your roughly 37 trillion cells holds about two metres of DNA, coiled so tightly that it fits inside a nucleus about six micrometres across. That DNA, deoxyribonucleic acid, is the master archive of everything the cell needs to build, run, and replicate itself. Understanding DNA begins with understanding its physical shape, because the shape is the mechanism.

The four-letter alphabet

DNA is built from four chemical units called nucleotides, each consisting of a sugar (deoxyribose), a phosphate group, and one of four nitrogen-containing bases: adenine (A), thymine (T), guanine (G), and cytosine (C). The bases are the letters of the genetic code. Their sequence along the DNA strand is the information.

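Complementary base pairing: A pairs with T (2 hydrogen bonds) and G pairs with C (3 hydrogen bonds). Erwin Chargaff reported in 1950 that the amount of A in DNA matches the amount of T, and the amount of G matches the amount of C; Watson and Crick used that symmetry to deduce the double-helix structure in 1953.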
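The pairing rules are simple enough to write down as a lookup table. The following is a minimal Python sketch, not a bioinformatics tool: the function names and the example sequence are invented for illustration. It builds the partner of a strand, counts the hydrogen bonds holding the pair together, and shows why Chargaff's symmetry falls out of the pairing rule.

```python
from collections import Counter

# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, indexed by either base of the pair.
BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the partner bases, position by position."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction.

    The two strands run antiparallel, so the partner is read backwards.
    """
    return complement(strand)[::-1]


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds between a strand and its partner."""
    return sum(BONDS[base] for base in strand)


def duplex_base_counts(strand: str) -> Counter:
    """Count each base across both strands of the double helix."""
    return Counter(strand + complement(strand))


example = "GGGCAT"  # hypothetical sequence, chosen only for illustration
print(complement(example))          # CCCGTA
print(reverse_complement(example))  # ATGCCC
print(hydrogen_bonds(example))      # 16
print(duplex_base_counts(example))  # A and T both 2, G and C both 4
```

Because every A on one strand faces a T on the other, and every G faces a C, the totals across the duplex must match, whatever the sequence.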

The double helix

James Watson and Francis Crick's 1953 paper in Nature — building on X-ray crystallography data produced by Rosalind Franklin and Maurice Wilkins — described DNA's three-dimensional structure: two strands wound around each other in a right-handed spiral, with the sugar-phosphate backbones on the outside and the bases pointing inward, held together by the hydrogen bonds between complementary pairs.

"Watson and Crick's double helix wasn't just some pretty shape. It was a mechanism. The moment you saw it, you understood how information could be copied."
— Siddhartha Mukherjee, from lectures on The Gene

The genius of the structure is that it explains itself: each strand is the template for rebuilding the other. Before a cell divides, the helix unzips, and each single strand serves as the pattern from which a new complementary strand is synthesised. The result is two identical double helices from one.

The double helix: two antiparallel strands wound around a central axis. During replication the strands separate and each serves as a template — producing two identical copies, each with one original and one new strand (semi-conservative replication).
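Semi-conservative replication can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Strand` type, the `made_in_round` label, and the example sequence are all hypothetical, the two strands are stored position-aligned rather than in their true antiparallel reading directions, and the enzymes (helicase, DNA polymerase) are reduced to a single lookup.

```python
from typing import NamedTuple

# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


class Strand(NamedTuple):
    sequence: str
    made_in_round: int  # 0 = original strand; n = synthesised in round n


def synthesise_partner(template: Strand, round_number: int) -> Strand:
    """Build a new complementary strand on an existing template."""
    new_sequence = "".join(PAIRS[base] for base in template.sequence)
    return Strand(new_sequence, round_number)


def replicate(duplex, round_number):
    """Separate the two strands and pair each with a freshly made partner."""
    first, second = duplex
    return (
        (first, synthesise_partner(first, round_number)),
        (synthesise_partner(second, round_number), second),
    )


parent = (Strand("ATGC", 0), Strand("TACG", 0))
daughter_a, daughter_b = replicate(parent, 1)

print(daughter_a)  # original ATGC strand paired with a new TACG strand
print(daughter_b)  # new ATGC strand paired with the original TACG strand
```

Both daughters carry the same sequences as the parent, and each keeps exactly one strand labelled round 0, which is what "semi-conservative" means.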

From molecule to information

The sequence of bases along a DNA strand is analogous to a text written in a four-letter alphabet. The order of letters encodes instructions: which amino acids to assemble into proteins, when to activate or silence a gene, and how to regulate the cell's entire metabolic programme. Three consecutive bases (a codon) specify one amino acid; the full set of three-letter words translates into the proteins that build and operate every living cell.
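Reading a sequence three bases at a time can be illustrated with a short Python sketch. Two simplifications are assumed here: the table holds only a handful of the 64 codons of the standard genetic code, and the DNA is translated directly, skipping the messenger RNA intermediate that real cells use. The function name and the example sequence are hypothetical.

```python
# A small subset of the standard genetic code, written as DNA codons.
# The full table has 64 entries; unknown codons are reported as "?".
CODON_TABLE = {
    "ATG": "Met",  # also the usual start codon
    "TTT": "Phe", "TTC": "Phe",
    "GGC": "Gly", "GGA": "Gly",
    "GCA": "Ala",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def translate(dna: str) -> list[str]:
    """Read codons left to right and stop at the first stop codon."""
    amino_acids = []
    # Step through complete codons only; leftover bases are ignored.
    for start in range(0, len(dna) - 2, 3):
        codon = dna[start:start + 3]
        amino_acid = CODON_TABLE.get(codon, "?")
        if amino_acid == "Stop":
            break
        amino_acids.append(amino_acid)
    return amino_acids


print(translate("ATGTTTGGCAAATAA"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Shifting the starting position by one base would change every codon that follows, which is why the reading frame matters as much as the sequence itself.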

The human genome contains approximately 3.2 billion base pairs — enough text, if printed, to fill several thousand books. Of that sequence, only about 1.5% encodes proteins directly. The rest was once dismissed as "junk DNA" but is now understood to include regulatory regions, structural elements, and sequences whose function is still being mapped. As Mukherjee observed, the discovery of the gene's physical structure was only the beginning of understanding what genes actually do.
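The headline numbers can be checked with simple arithmetic. The Python sketch below uses the two figures given above, plus two assumptions that are not stated in this article: a spacing of about 0.34 nanometres between consecutive base pairs along the helix, and two copies of the genome per cell (one from each parent).

```python
# Figures from the text.
BASE_PAIRS_PER_GENOME = 3.2e9   # one copy of the human genome
PROTEIN_CODING_FRACTION = 0.015  # about 1.5%

# Assumptions not stated in the text.
RISE_PER_BASE_PAIR_M = 0.34e-9  # ~0.34 nm between stacked base pairs
GENOME_COPIES_PER_CELL = 2      # most human cells carry two copies

length_per_copy_m = BASE_PAIRS_PER_GENOME * RISE_PER_BASE_PAIR_M
length_per_cell_m = GENOME_COPIES_PER_CELL * length_per_copy_m
coding_base_pairs = BASE_PAIRS_PER_GENOME * PROTEIN_CODING_FRACTION

print(f"DNA per genome copy: {length_per_copy_m:.2f} m")   # 1.09 m
print(f"DNA per cell:        {length_per_cell_m:.2f} m")   # 2.18 m
print(f"Protein-coding:      {coding_base_pairs:.1e} bp")  # 4.8e+07 bp
```

Under those assumptions the total comes out close to the two metres per cell quoted earlier, and the protein-coding portion amounts to roughly 48 million base pairs.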

Why the structure matters

Watson and Crick's double helix was not merely a molecular discovery — it was the answer to the most fundamental biological question: how is hereditary information stored and copied with enough fidelity to transmit across generations, yet with enough variation to allow evolution? The answer is encoded in the molecule's shape. The complementary base-pairing rule ensures accurate copying; the four-base alphabet allows virtually unlimited informational complexity; and the antiparallel strand orientation provides the directionality that the replication machinery requires.
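To put a number on "virtually unlimited": a strand of n bases can take 4^n different sequences, and each base carries at most 2 bits of information. The Python sketch below works through a few cases; the function names are invented for illustration, and the storage figure is a raw upper bound that ignores redundancy in real genomes.

```python
import math


def possible_sequences(length: int) -> int:
    """Number of distinct sequences of the given length over A, T, G, C."""
    return 4 ** length


def information_bits(length: int) -> float:
    """Upper bound on information content: log2(4) = 2 bits per base."""
    return length * math.log2(4)


print(possible_sequences(3))   # 64 (the number of possible codons)
print(possible_sequences(10))  # 1048576

# Raw capacity of 3.2 billion bases, expressed in megabytes.
megabytes = information_bits(3_200_000_000) / 8 / 1e6
print(megabytes)  # 800.0
```

Even a stretch of 100 bases has more possible sequences (4^100, about 1.6 x 10^60) than could ever be sampled in practice.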

"Here was Watson, here's this is where Watson stood up and said let's do this. It was a visual tour, as it were, of history — the moment the mechanism of heredity became visible."
— Siddhartha Mukherjee
