How Humans Speak—and Why Chimps Don’t

We humans think we’re pretty special as far as animals go, and one thing that makes us unique is our ability to communicate super complex ideas through language and, ultimately, speech itself. Human speech is an important part of everyday life but not one that should be taken for granted. 

Let’s look at the anatomy and evolution of speech and why humans can talk while other primates like chimpanzees can’t. 

But before we dive in, let’s make sure we’ve got our definitions straight: speech is the production of sounds to communicate, while language is the system of words we use to communicate with each other. Speech is spoken language, but language can exist without speech (like the written language you’re reading right now). 

Got it? Great! 

Speech anatomy 

First, let’s do a quick overview of the anatomical structures that make speech happen. 

During exhalation, air travels from the lungs into the trachea, also known as the windpipe. At the top of the trachea is the larynx, and the larynx contains bands of tissue known as vocal cords or vocal folds. 

The trachea and larynx. Image from Visible Body Suite

Contracting the muscles in the larynx allows us to manipulate the vocal cords. By altering the tension of the vocal cords and the amount of space between them (aka the glottis), we can control the pitch, volume, and tonal quality of the voice. 

There are nine muscles in the larynx that affect the vocal cords and glottis: 

  • The vocalis increases the thickness of the vocal cords
  • The thyroarytenoid shortens and relaxes the vocal cords
  • The thyroepiglottic depresses the epiglottis
  • The cricothyroid lengthens and stretches the vocal cords
  • The lateral cricoarytenoid closes the glottis 
  • The oblique arytenoid narrows the laryngeal inlet
  • The posterior cricoarytenoid separates the vocal folds
  • The transverse arytenoid closes the posterior glottis
  • Last but not least, the aryepiglottic muscle depresses the epiglottis and closes off the larynx during swallowing

The larynx isn’t the only structure that impacts speech—it’s time to talk about the tongue. By coming into contact with the oropharyngeal wall, soft palate, and hard palate, the tongue can impede airflow. 


The tongue, hard palate, and soft palate. GIF from Visible Body Suite

Vowel sounds are made by changing the shape and size of the space the air passes through—tongue height, position, and roundness of the lips all contribute to the vowel sound we make. 

Consonants, on the other hand, involve the stopping and releasing of air; airflow can be impeded at the lips, teeth, alveolar ridge, hard palate, soft palate, uvula, oropharyngeal wall, epiglottis, and glottis. 

At birth, humans actually have a vocal tract similar to that of nonhuman primates. As the infant develops, the roof of their mouth flexes, the tongue moves lower into the pharynx, and the larynx descends. 

However, speech doesn’t start in the lungs or the larynx; it starts in the brain. The brain has to remember the sequence of speech sounds before a word can be spoken, so it generates a mental representation of those sounds. That mental picture is turned into motor commands that the brain sends to the muscles to alter airflow and make the correct sounds. 

Broca's area (black arrow) and Wernicke's area (red arrow). Image from Visible Body Suite

There are three main parts of the brain concerned with speech: 

  • Wernicke’s area is associated with comprehension, that is, the processing of words and their meaning
  • Broca’s area plans out the sentences you’re about to say
  • The motor cortex is responsible for muscle movements; it controls all those muscles you need to form words 

What makes humans different

Humans have the structural and neurological infrastructure to make and make sense of words, but what was the evolutionary process for speech? The answer is more than a little complicated. In fact, experts can’t even agree on when humans began speaking: estimates range from two million to 50,000 years ago. In 2019, a hotly debated paper went as far as to posit that 25 million years ago, human ancestors had the physical capability of (some) speech, even if they didn’t have the cognitive capacity to use language. 

We do know that about 100,000 years ago, our mouths began to protrude less, and our tongues developed more flexibility, but it’s a huge challenge to understand the evolution of human speech—after all, speech and language aren’t something we can observe in the fossil record, and it’s difficult to tell how the vocal tract has changed over time. The larynx is made of soft tissues that do not fossilize, and brain tissues are long gone by the time we discover fossils. In fact, there is only one bone in the vocal tract, the hyoid bone. 

Since humans are the only species to communicate with this much complexity, many researchers focus on what makes us different from our closest relatives: our fellow primates. There are many differences between humans and other primates with regard to speech. For example, humans have far better breath control, and many primate species have vocal membranes that make sounds harder to control.


Photo by Alex Surd via Pexels.

The most popular theory of speech evolution is that changes in throat anatomy first allowed modern humans to speak. Compared to other primates and our early ancestors, humans’ larynxes are located much lower, and it’s thought that this anatomical difference allows us to make more complicated vowel sounds. 

On the other hand, recent research suggests that other modern primates have “speech-ready” vocal tract anatomy but lack the brainpower to make speech happen. It’s been found that primates that make more sounds (like chimpanzees and bonobos, which communicate with about 40 different sounds) have larger cortical association areas, the areas of the brain that allow for voluntary control of behavior. This suggests that the brain has more of an influence on speech than the lower larynx theory claims. 

Since anatomy capable of speech was present in the ancestors of Homo sapiens and is present in other primates, many researchers are examining genes like FOXP2, the so-called “language gene.” 

Let’s zoom in to the DNA level. Most animals have the FOXP2 gene; it’s important for song and mimicry in birds and echolocation in bats. At amino acid positions 303 and 325, humans have different amino acids than other primates, making our FOXP2 allele different from that of our closest relatives.

We know that Neanderthals and Denisovans both had the same special FOXP2 sequence, but modern humans may be unique in that they have another gene mutation that impacts the regulation of FOXP2 gene expression. 

The FOXP2 gene controls at least 116 other genes, regulates some muscle movement, and is involved in brain development, including areas of the brain that are involved in vocalization. FOXP2-related speech and language disorder occurs when the FOXP2 gene is disrupted. People with this disorder experience apraxia of speech, which means they have difficulty with the motor planning and production of speech. 

When the gene was first discovered, many thought that it was the key to understanding the evolution of human language, but despite its “language gene” moniker, FOXP2 isn’t a one-stop shop for all your verbal communication needs. Further research has uncovered other gene mutations that also contribute to speech and early human brain development. 

Speech requires an immense amount of coordination between multiple muscles and multiple parts of the brain, so it’s unlikely there is just one reason why humans are capable of speech. It’s a complicated puzzle, and it’s one that will yield exciting research for years to come.

