The Rules We Speak By

In many contexts in life there are clear rules that govern speaking. In a classroom, you raise your hand if you have something to say. On the witness stand, you answer questions clearly and concisely. Behind a podium, you deliver a complete address with a beginning, middle, and end. Someone with no understanding of these rules would be unable to participate adequately in these aspects of social life; they would jam up the gears of the speech-production machines that are the school, the courtroom, and the auditorium.

But casual conversation is also a sort of machine, and by the time we reach adolescence we are so adept at the rules governing its operation that we don’t even realize we’re following rules. These rules are why we aren’t constantly talking over one another, for example, and why our conversations don’t just fall apart after a misunderstanding.

The branch of sociolinguistics concerned with these rules is called Conversation Analysis (CA), and it traces its roots back to sociologists Harvey Sacks, Emanuel Schegloff, and Gail Jefferson. At times CA can look like a painstakingly minute breakdown of obvious phenomena, and when I first learned about it I dismissed it as boring academic nonsense. After that, though, I found myself looking at routine social interactions through new eyes. As is often the case, the obvious phenomena that CA studies turn out, upon closer inspection, to be structures made up of unexamined and surprising parts. In this post, I’ll look at two interesting objects of study within CA that provide a glimpse into the complexity underlying the seeming simplicity of casual conversation.


The following has most likely happened to you: a friend or acquaintance is talking, and you have something to say in response, but you have to wait a few minutes to get a word in edgewise. Finally you spot an appropriate moment and interject whatever it is you have to say. But that “appropriate moment” didn’t necessarily come because your interlocutor had finished speaking; if you hadn’t acted, he or she probably would have continued. And before that moment came, it wasn’t literally true that you couldn’t have gotten a word in edgewise; if you’d really wanted to you could have interrupted, of course, though at the cost of seeming rude.

This interaction is only possible because you and your interlocutor are following a nearly identical set of rules, which Sacks, Schegloff, and Jefferson set out in a 1974 paper. They referred to the “appropriate moment” in the example above as a “transition relevance place,” or TRP. One of three things can happen when a TRP is reached:

  1. A participant designated by the current speaker can begin speaking, thus claiming the next turn in the conversation. This happens if, for example, the current speaker asks someone else a question; if the rules are followed that person will respond, and under normal circumstances it would be rude for anyone else to interject before the designated person.
  2. If the current speaker does not designate a successor, anyone may self-select and begin speaking, claiming the next turn for themselves.
  3. If the current speaker does not designate a successor and no one else self-selects, the current speaker can self-select and continue talking.

As you can see, there is a sort of hierarchy here; 1 takes precedence over 2, which takes precedence over 3. Taken together they produce a normal, unexceptional change in speakers, much like a peaceful political transfer of power. But laying these rules out explicitly allows us to talk about familiar phenomena in new ways. A rhetorical question, for example, can be defined as a question that does not create a TRP. An interruption is an abrupt transition in turns that isn’t triggered by a TRP.
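For readers who think in code, the three rules and their order of precedence can be sketched as a small function. This is only an illustration, not anything from the 1974 paper: the function name, its parameters, and the simplification that the first listener in the list wins a tie among self-selectors are all my own assumptions.

```python
from typing import Optional, Sequence


def next_speaker(current: str,
                 designated: Optional[str],
                 self_selectors: Sequence[str]) -> str:
    """Pick who talks after a TRP, trying the three rules in priority order.

    All names here are hypothetical; this is a toy model of the rules
    described above, not an implementation from the CA literature.
    """
    # Rule 1: a participant designated by the current speaker takes the turn.
    if designated is not None:
        return designated
    # Rule 2: otherwise, a listener who self-selects claims the turn.
    # Assumption: the first name in the list is whoever started first.
    for person in self_selectors:
        if person != current:
            return person
    # Rule 3: otherwise, the current speaker self-selects and continues.
    return current


# A question addressed to Bob outranks Cy's attempt to jump in.
print(next_speaker("Ann", designated="Bob", self_selectors=["Cy"]))
# Nobody is designated, so the first self-selector gets the floor.
print(next_speaker("Ann", designated=None, self_selectors=["Cy", "Bob"]))
# Nobody is designated and nobody jumps in, so Ann keeps talking.
print(next_speaker("Ann", designated=None, self_selectors=[]))
```

In these terms, an interruption is a change of speaker that happens without this function ever being called, because no TRP was reached.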

So how do we know when a TRP is occurring? In other words, how are turns defined, and when do we know they’ve ended? It’s certainly not just a function of the length of the turn; a turn can be a sentence or phrase, an entire lecture, or just a grunt. A turn is also not, in general, something that can end suddenly without warning; we can tell when someone’s about to give us a chance to reply, and in fact a 1986 paper by Thomas Wilson and Don Zimmerman found that people can tell that a pause is about to be reached before the completion of a turn. We can do this because we process language one word at a time, rather than waiting for a phrase or sentence to be completed before we try to comprehend it. (This, incidentally, also explains the phenomenon of garden path sentences.)

So if it’s not anything as simple as fixed-length turns or the silence after a turn ends, how do we know when we’re reaching a TRP? A 1983 study by Bengt Oreström looked at five linguistic features to see how they correlated with turn-taking (i.e., a TRP in which the speaker changes) in a large sample of two-person conversations(1). That paper found a high level of correlation (about 95%) with each of three features: completion of a prosodic sequence (i.e., the end of an intonational unit, like the final upswing in pitch that accompanies a question), completion of a syntactic sequence (like the end of a sentence), and completion of a semantic sequence (like the end of an anecdote). The other two linguistic features Oreström looked at, a decrease in volume and a silent pause, correlated with turn-taking only about 40% of the time. So, to sum up, we can tell that a TRP is about to happen because of intonational, grammatical, and logical clues in the speaker’s speech, which usually coincide.

The regulation of turn-taking in conversations conducted over the phone is inherently harder because we have fewer cues available about the intentions and reactions of our interlocutors; we can’t see them opening their mouths, for example, or indicating through facial expressions that they have something they’d like to say. All we have is a voice, and we may have to pay closer attention to the timing of our own speech in order to avoid derailing the conversation. For example, if we want to make the general noises of assent and support that are often appropriate in conversation (“right,” “yeah,” “uh huh”), we have to fit them carefully into the natural sub-TRP pauses so they’ll be audible but not intrusive. If we mistime them, it might sound like we’re interrupting, and our interlocutor might cede the floor to us when we’re not trying to claim it; alternatively, the noises might simply go unnoticed or drown out an important word we need to hear.


Speaking of which, another fruitful field of study within CA consists of what’s called conversational repair strategies, which speakers employ when something has gone awry. Perhaps the listener has missed or misheard something, or perhaps the speaker has misspoken or realizes that the listener has misunderstood. The most common repair strategy is simply self-correction: the speaker realizes that he or she has said something wrong or omitted something and, without giving up the floor, corrects the mistake.

But sometimes the listener is the one who detects the breakdown in communication and initiates the repair. Most simply this can take the form of a question: “What?” or “Where did you say he went?”, for example. The listener can also provide a possible correction: for example, “Fifth Street? Do you mean Fifth Avenue?” Sometimes we can be even more explicit, as for example in this strip from the brilliant webcomic Achewood, wherein one character, Ray, who has obtained the high-tech helicopter from the ’80s TV show Airwolf, calls another character, Téodor:

Ray: Hey Téodor! You got any problems or anything?

Téodor: Ray? What are you doing up this early?

Ray: I have Airwolf. I’m just seein’ if there’s anything I can do to help my guys.

Téodor: I’m not sure what you mean. I don’t know if what you’re saying means anything.

Ray: I have Airwolf. This is not code language. I am flying Airwolf because I own Airwolf. Nothing else I could say would make more sense given what I own and what I am doing at this moment.

In this exchange, Téodor initially assumes that “I have Airwolf” is meant in a non-literal sense and communicates to Ray that a misunderstanding has taken place. Ray understands the source of Téodor’s confusion and clarifies that the sentence in question was meant literally.

Normally in face-to-face conversations our repair strategies are not so obvious; the breakdown in communication is usually dealt with almost instantaneously and the matter is immediately forgotten. Repair can be more difficult in conversations over the phone, and even more difficult in conversations via text message or instant message. I can personally think of a number of times when a misunderstanding in an instant message conversation has left me so confused about what went wrong that I couldn’t tell which repair strategy to employ. This can lead to awkward situations, especially if the interlocutor isn’t someone I know well, and it shows how adroitly we usually make use of repair strategies to smooth the path of conversation.


There is a category of computer science problems that deal with different agents (programs, servers, computers, etc.) communicating with each other to determine the allocation of resources. Some of the more colorful of these problems include the dining philosophers problem, the sleeping barber problem, and the cigarette smokers problem. When looked at outside the context of computer science, some of these problems make no sense; why can’t the dining philosophers just talk to each other and figure out how to share their forks? The answer, of course, is that in computer science there are strict rules governing their interactions, and the best solution must be found within the constraints of those rules.
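To make "a solution within the constraints of the rules" concrete, here is a minimal sketch of the dining philosophers problem in Python. It uses one well-known approach, in which every philosopher always picks up the lower-numbered fork first; that choice, the function name `dine`, and its parameters are my own assumptions for illustration. The philosophers never talk to each other; the fixed ordering alone keeps them from all grabbing one fork and waiting forever for the other.

```python
import threading
from typing import List


def dine(n_philosophers: int = 5, meals_each: int = 3) -> List[int]:
    """Run the simulation and return how many meals each philosopher ate.

    Hypothetical sketch: assumes at least two philosophers, with fork i
    sitting between philosopher i and philosopher i - 1.
    """
    forks = [threading.Lock() for _ in range(n_philosophers)]
    meals = [0] * n_philosophers

    def philosopher(i: int) -> None:
        left, right = i, (i + 1) % n_philosophers
        # The rule everyone follows: always take the lower-numbered fork
        # first, so a circular chain of waiting philosophers cannot form.
        first, second = min(left, right), max(left, right)
        for _ in range(meals_each):
            with forks[first]:
                with forks[second]:
                    # Only thread i writes to meals[i], so no extra lock
                    # is needed for the counter.
                    meals[i] += 1

    threads = [threading.Thread(target=philosopher, args=(i,))
               for i in range(n_philosophers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return meals


print(dine())
```

If each philosopher instead took the left fork first, all five could pick up one fork at the same moment and then wait indefinitely for a neighbor to let go.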

Rules also govern human interactions, although they may not be as strict as those governing computer interactions. We may not realize it, but in our daily lives we solve hundreds of problems comparable to the ones about sleeping barbers and cigarette smokers: how do we get away from this person who won’t stop talking? How do we communicate our confusion to the person we’ve misunderstood? Fortunately most of us are experts at problems like these, and we never think about the complexity we’re navigating unless we happen to be studying CA.

(1) A more recent study found similar results for a wider range of conversations.

Cross-posted at Empire Avenue.

