Steve Blank: What is missing in Zoom reminds us of what it means to be human


In the past month, billions of people have been unwitting participants in the largest involuntary social experiment ever: testing how well video conferencing replaces face-to-face communication.

While we have discovered that in many cases this is possible, more importantly we have discovered that, regardless of bandwidth and video resolution, these applications lack the signals that humans rely on when they communicate. Although we may be spending just as much time in meetings, we find that we are less productive, social interactions are less satisfying, and distance learning is less effective. And we are frustrated that we don't know why.

Here is why videoconferencing apps don't capture the complexity of human interaction.


All of us who are staying at home have used video conferencing apps for virtual business meetings, virtual coffees with friends, family reunions, online courses, and so on. And while the technology allows us to do business, see friends, and share information one-to-one and one-to-many from our homes, something is missing. It's just not the same as being together at the conference room table, in the classroom, or at the local cafe. And it seems more exhausting. Why?

What is missing?
It turns out that today's video conferencing technology does not emulate the way people interact with each other in person. Every one of these video apps has overlooked half a century of research into how people communicate.

Meeting place
In the physical world, space and context give you cues and reinforcement. Are you meeting in a 47th-floor conference room with a magnificent view? Are you surrounded by other lively conversations in a cafe, or sitting with classmates in a lecture hall? With people working from home, you can't tell where the meeting is or how important the location or setting is. In a videoconference, all the contextual cues are homogenized. You look the same whether you are playing poker or making a sales call, in a suit or without pants. (And with videoconferencing, people see your private space. Now you need to check whether there's something embarrassing lying around. Or your kids are yelling and interrupting meetings. It's tiring to keep trying to separate business and family life.)

In the real world, you don't just teleport into a meeting. Video conferencing skips the transitions of entering a building, finding the room, and sitting down. The same transitions are missing when you leave a video conference. There is no entry or exit. The conference simply ends.

Physical contact
Second, most business and social gatherings start with physical contact, such as a handshake or a hug. There is something about this first physical interaction that communicates trust and connection through touch. In business meetings, there is also the formal ritual of exchanging business cards. These are all preambles that set the tone for the meeting that follows.

Meeting space background
In person, we visually take in a lot more information than we get from just looking at someone's face. If we are in a business meeting, we will scan the room, quickly shifting our gaze. We can see what's on the desks or hanging on the walls, what's on the shelves or in the cubicles. If we are in a conference or classroom, we will see who we are sitting next to and notice what they are wearing, carrying, reading, and so on. We can see the relationships between people and notice deference, hierarchy, sideways glances, and other subtle cues. And we use all of this to build context and make assumptions, often unconsciously, about personalities, positions, social status, and hierarchy.

Looking in a mirror while having a meeting
Before you meet in person, you might quickly check your appearance, but you certainly don't hold up a mirror in the middle of a meeting to keep checking how you look. Yet, by putting as much focus on us as on the other participants, most video applications seem designed to make us self-conscious and to distract us from watching whoever is talking.

Non-verbal cues
More importantly, researchers have known for at least fifty years that at least half of the way we communicate is through nonverbal signals. In conversation, we watch each other's hands, follow their gestures, and focus on their facial expressions and tone of voice. We make eye contact and notice whether they return it. And we constantly track their body language (posture, body orientation, how they stand or sit, etc.).

In a group meeting, it's not just a matter of following the speaker's cues. Often it is the sideways glances, the eye rolls, and the shrugs among our peers and the other participants that provide direction and nuance to the content of a meeting. On a computer screen, all of this interaction between people is lost.

The sum of these nonverbal signals is the background (again, often unconscious) of every conversation.

But video conferencing apps offer just a fixed view from a single camera. Everyone is relegated to a one-dimensional square on the screen. It is the equivalent of attending a meeting with your head in a vise, wearing blinders, while strapped to a chair.

Are olfactory cues another missing piece?
There is another set of communication cues that we might be missing on video. Scientists have discovered that in animals, including mammals and primates, communication involves not only words, gestures, body language, and facial expressions, but also scents, through the exchange of chemicals and hormones called pheromones. These are not smells that are consciously registered, but they are nevertheless picked up by our olfactory system. Pheromones send signals to the brain about sexual status, danger, and social organization. It is hypothesized that scents and pheromones control some of our social behaviors and regulate hormone levels. Could these olfactory cues be an additional element of what we miss when we try to communicate by video? If so, emulating these cues digitally will be a real challenge.

Why Zoom and video conferencing are exhausting
If you have spent long stretches on video for social or professional meetings during the pandemic, you have probably found it exhausting. And if you use video for learning, you may have noticed that it reduces your ability to process and retain information.

We are exhausted because of the additional cognitive processing (a fancy term for having to think consciously) needed to fill in the missing 50% of the conversation that we normally receive from nonverbal and olfactory cues. It is the accumulation of all these missing signals that causes mental fatigue.

Turning winners into losers
And there's one more thing that makes video apps fall short. Although they save a lot of time on initial meetings and screening prospects, salespeople find it difficult to close complex deals over video. Even taking the economy into account, the reason is that in person, great salespeople can "read" a meeting. For example, they can tell when someone who nodded their head in agreement actually meant "no way." Or they can pick up the "tell me more" signal when someone leans forward. On Zoom, all of these cues have disappeared. As a result, deals that should be easy to close will take longer, and those that are difficult will not happen. You invest the same or more time to get the meetings, but you're frustrated that little to no progress is being made. It is a productivity killer for sales.

In social situations, reading body language can help us sense that a friend who smiles and says that everything is fine is actually having trouble in their personal life. The absence of these physical signals, together with the loss of physical contact, can lead to greater distance between us and our family and friends. Video can bridge the distance, but it lacks the empathy communicated by a hug.

An opportunity for innovators to take videoconferencing to the next level
This science experiment of a billion people replacing face-to-face communication with digital has convinced me of several things:

  1. The current generation of videoconferencing apps ignores the way humans communicate
    • They do not help us pick up non-verbal communication signals: touch, gestures, posture, glances, smells, etc.
    • Their makers haven't done their homework to understand the importance of each of these cues and how they interact with one another. (What is the rank order of importance of each cue?)
    • They also do not know which of these cues matter in different contexts. For example, what are the right cues to signal empathy in social settings, sincerity, trustworthiness, and rapport in business settings, or attention and understanding in educational settings?
  2. There is a real opportunity for a new generation of videoconferencing applications to fill these gaps. These new products will begin to solve problems such as: How will you shake hands? Exchange business cards? Get a sense of the surroundings and setting? Notice the non-verbal cues?
  3. There are already startups offering emotion detection and analysis software that measures speech patterns and facial cues to infer feelings and levels of attention. None of these tools is currently integrated into widely used video conferencing applications. And none of them is yet sensitive to the context of particular meeting types. Perhaps an augmented reality overlay showing non-verbal cues to professional users could be a powerful first addition.

Lessons learned

  • Today's video conferencing applications are a one-dimensional technical solution to the complexity of human interaction
    • Without the missing nonverbal cues, business is less productive, social interactions are less satisfying, and distance learning is less effective
  • It is possible for someone to create the next generation of videoconferencing applications capable of recognizing key cues in the appropriate context
    • This time, with psychologists and cognitive researchers leading the team
