Know How Visually Literate Your Viewers Are
by Robert Gershon
When the viewer begins to watch a program, he makes an unconscious decision
to believe that, to some extent, it is real. That doesn't mean that he will
think the two-dimensional screen is three-dimensional space, or that cartoon
characters are real people, or even that actors are the characters they portray.
It means that, for the time he dedicates to the program, he will choose to live
partially within the world the television sets up for him, and within those
limitations, to accept the items, facts, and situations of that world as somehow
relevant to his own life. Just as we have developed certain methods of understanding
the world from living in it (e.g., distant items appear smaller than close ones,
particular sounds come from particular objects, etc.), we have developed methods
of understanding the video world from watching TV (e.g., a dissolve means time
passes, etc.).
These methods of understanding are the conventions of the video medium; they
organize our perceptions the same way grammar and syntax organize our perceptions
of the words we read. And just as we are unconscious of this ordering process
when we read, so are we unconscious of the many organizational principles involved
in video when we view. If we are to produce video that takes advantage of these
conventions and allows viewers to enter the world of the program most easily,
we must be conscious of the various organizing principles and their meaning--of
the grammar and syntax of video.
The world of video, like our real world, is one of time and space. Particular
principles apply to the organization of space--what we see within the frame,
and of time--how the program progresses from frame to frame. The conventions
of video space are, not surprisingly, those of the camera: its shot, angle,
and framing. Given shots elicit particular responses in an audience. Close-ups
indicate intensity--the closer, the more intense. As shots widen out, the level
of intensity falls. This comes as no surprise, yet as directors, we must remember
to use appropriate shots at appropriate times. A wide-angle shot during an emotional
moment leaves our viewers with a sense of missing something; an extreme close-up (ECU)
during a passive period leaves the audience puzzled. Angle shots denote power relationships.
A low angle makes the subject appear powerful; a high angle makes it
vulnerable. This, again, is elementary video, but it is somehow hard to keep
in mind when we place our talent in an easy chair and the camera operator sets
the camera at a comfortable eye level. The resulting high angle will be subtle,
but it will be there. Less obvious viewer expectations are encountered in framing.
The "rule of thirds" suggests that the major features within a shot
be placed one-third of the way into the frame. A camera operator's first impulse
is to center the basic subject of a shot. While this is often the best method
of rendering an object, equally often, subtle framing adjustment to place important
features of the subject around the third-of-the-screen areas creates a more
interesting and less static mood. The most important use of this "rule"
occurs in shots of people. The area of greatest viewer attention is the subject's
eyes. So, while the subject may be to the right, left, or in the center, depending
on other circumstances, a good rule of thumb is to place the subject's eyes
about a third of the way down the screen. This rule applies for almost all shots,
from long shots to ECUs. Using this rule of thumb usually helps to avoid another
pitfall--cutting the head off at the neck, making the face look like an orange
on a table. Any time we cut off a subject at a natural point of division-- the
neck, the exact top of the head, the waist, the bottom of the feet, etc.--we
tend to isolate that image and allow the viewer to think it can exist in itself.
When we add a bit of the shoulder or hips, the viewer is likely to fill in the
rest of the body. When we add a bit of space above the head, the viewer is likely
to fill in background. When we clip the top of the head, the viewer mentally
fills it in so that the shot will appear more natural and full.
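As a rough worked example of the eye-placement rule of thumb above, the sketch below computes where the one-third lines fall in a frame. The function names and the 720 x 480 frame size are illustrative assumptions, not part of the original discussion, and the result is a starting point for framing rather than a fixed requirement.

```python
def thirds_lines(width, height):
    """Return the vertical and horizontal rule-of-thirds lines for a frame.

    Coordinates are in pixels, measured from the top-left corner.
    """
    xs = (width / 3, 2 * width / 3)    # two vertical lines
    ys = (height / 3, 2 * height / 3)  # two horizontal lines
    return xs, ys


def eye_line(height):
    """Suggested height for the subject's eyes: a third of the way down."""
    return height / 3


# Hypothetical 720 x 480 frame, chosen only for illustration.
xs, ys = thirds_lines(720, 480)
print("vertical thirds at x =", xs)
print("horizontal thirds at y =", ys)
print("place the eyes near y =", eye_line(480))
```

Whether the subject sits left, right, or center, the eye height stays near the upper third line; only the horizontal position changes with the other circumstances of the shot.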
The placement of figures, particularly head shots, angularly and laterally on
the screen also creates particular relationships between subject and viewer.
The closer the camera is to a full face shot with the subject looking straight
out of the screen, the more direct his or her relationship with the viewer will
seem. This apparently simple observation is based on a complex relationship
among program, talent, and viewer. When the subject faces the audience directly,
the viewer no longer merely observes the "world" of the program. That "world"
now speaks to the viewer directly. This has advantages and disadvantages. Among
the former is the efficiency of transmitting abstract verbal information. The
"talking head" is perceived differently from other on-camera people. The
speaker's language is more precise and flawless than that of the people we meet
every day, or those whom we meet in interviews. The speaker is the voice of
the program itself and expresses what is real and true within its world. The
disadvantage of this type of presentation is the lack of sensory information.
The picture and vocal quality of the "talking head" are secondary
to the words spoken. The viewer gains little from either the visuals or the
vocal patterns. The opposite situation is the presentation of a conversation.
Here, the subjects rarely look at the camera and therefore rarely create a direct
relationship with the viewer, who becomes basically an observer. The subjects,
however, do appear as people who have distinct existences outside of the program.
The viewer, therefore, is informed by their conversation on the one hand, and
on the other, is also informed about them by the way they look and speak. These
two situations represent opposite ends of a continuum. In the middle lie intermediate
types of relationships between subject and viewer. Most common among these is
the subject who looks in the general direction of the camera, but not directly
out of the frame at the viewer. The subject speaks to someone just off-camera.
We do not see this person, however, so he does not interpose himself between
the subject and the audience. As a result, the subject communicates more directly
with the audience than would be the case if someone else were also in the frame.
The tone of this technique is more conversational, however, than it is when the subject
addresses the audience directly. In this format, the talent presents information
as a participant in the material of the show, not as its official voice. The
direction in which a subject faces also has a bearing on the viewer's perception.
People in this culture, where we read from left to right, tend to perceive that
direction as the "correct" one and the reverse somehow vaguely awry.
A person who speaks toward the right tends to elicit a bit more confidence than
one who is facing left. While it is almost always imperative to provide stable,
continuous shots, the TV-literate viewer is mindful of the reasons for violating
this dictum. The degree of steadiness of the camera shot affects the viewer
beyond the obvious annoyance of a jumpy picture. Most viewers have seen hand-held
camera work in exciting, perilous, and (most importantly) real circumstances.
The effect of such footage now is to suggest such a situation.
The conventions of video time are those of action and editing--of movement
within the shot, and the assembly of shots. Linguist Richard Ohmann writes,
"To state something is first to create imbalance, curiosity, where previously
there was nothing, and then to bring about a new balance." The same goes
for camera shots. Knowing he is immersed in a linear, continuous process, the
viewer constantly (if not totally consciously) wonders how the shot will end
and what shot will follow--in other words, what more is there to see? Understanding
this propensity, a director can open a shot out of focus and focus in to show
the subject. A shot can begin so tight that it indicates only an interesting
form, and then pull out to reveal its context. The camera can open on a wide
cover and zoom in to pick out the important element. While too many of any type
of shot can easily be irritating, the shot that opens on a question is more
than a cute trick. It is a method of involving viewers in the program by making
them invest energy and interest in wondering why an image is there and what
will follow. Camera moves can have a symbolic significance as well. While most
are used to keep a moving subject in the picture, the choice of how and when
all but the most automatic movements are made conveys meaning to the viewer.
The most obvious are the dramatic moves like swish pans and quick zooms. The
former indicates an instant question--where are we going? The latter is something
like an exclamation point--it emphasizes an image. Both could be replaced by
cuts to the item at the end of the shot. But the importance here is not so
much that final image as the manner of getting there--the momentary confusion
you cause the viewer that makes him take part in the discovery of the final
image. Movement as simple as panning to keep us abreast of a walking subject
has elements of "video grammar." If the pan is slower than the subject so that
the talent moves forward in the frame, the viewer feels there is progress. If
the pan is faster, and the subject loses ground in the frame, the viewer can
feel the walker's frustration. Similarly, movement toward and away from the
camera can be sped up or slowed down by use of short or long lenses. A runner
shot in telephoto will seem to struggle in vain in an attempt to reach the camera;
a normal walking gait will appear as a lunge forward if shot in wide angle.
Zooms and dollies are frequently used for the same purpose--to gain a closer
view of the subject. The two have very different subsidiary effects in attaining
that goal, however. The zoom works primarily in two dimensions. As more and
more of the target subject fills the frame, images on the periphery merely disappear
off the edge. The zoom tends to isolate the subject because it enlarges
background objects as much as it does the subject, thereby eliminating most
of the environment. The telephoto image compounds this isolation by throwing
the background out of focus. The dolly is more three-dimensional. The camera
passes peripheral objects and, as they are lost from the frame, they appear
to remain behind the camera, leaving us with the mystery-film feeling that they
are still part of the world of the program. The dolly increases the size of
the subject much more than it does that of the background, leaving more of the
relationship of subject and environment in the frame.
Another set of conventions concerning video time is even more challenging. Film
and television programs follow what McLuhan calls a "logic of lineality"--whenever
the shot changes, the new shot appears to fit into the program merely because
it follows the preceding one. As we mentioned earlier, viewers who have chosen
to live partly within the world of the program will attempt to make that world
follow a logical pattern. In doing so, they assume that whatever follows must
belong there, if not chronologically then because of some other logical pattern
in the form of the program. In a famous experiment, Lev Kuleshov, the Russian
filmmaker of the '20s, took a single shot of an impassive actor and juxtaposed
it with three images: a plate of soup, a knife, and a dead child. Viewers of
the different versions had no trouble identifying the actor's emotions and praised
him for his realistic portrayal of hunger, fear, and sorrow. Editing, then,
depends on an extraordinarily strong acceptance of its conventions by the audience.
Thus, unless the viewer is for some reason to be sensitized to this process of
juxtaposition (as in a classical montage sequence), it is crucial that edits
do not call attention to themselves and thereby disturb the viewer's acceptance
of the sequence. From this necessity have come such editing principles as cutting
on action, the "30° rule," and spacing camera angles over an area less
than 180° (i.e., on one side of the eyeline). The principle of cutting on action uses subject
movement to strengthen the connection between the two shots. Not only is the
viewer less likely to notice changes in the exact position of an object if it
is moving, but the movement itself becomes an element which is carried on from
shot to shot.
The "30° rule" stipulates that the camera angle be changed at least
30° in succeeding shots of the same subject if the sequence is to be easy to
perceive as continuous. The viewer is conditioned to expect a change in visual
information with a change in shots so that successive shots may build up to
a full perception of the subject. If the subject remains the same, a change
in angle is necessary to deliver additional visual information. A concurrent
change in the size of the subject within the frame adds further new information.
In non-fiction video, interview footage often must be shot from a single camera
position allowing only changes in framing. Furthermore, it is not uncommon for
the most logical edit in terms of verbal information to require juxtaposing
images of similar size and angle. The easiest way around such a dilemma, assuming
the footage is already shot, is to insert a piece of cutaway footage--a three-quarter
rear angle shot of the subject, possibly with an interviewer in the background
listening, or wild footage of some aspect of the topic being discussed. The
more easily accomplished audio edit behind the cutaway in no way alerts the
viewer to the change. Interestingly enough, the necessary use of same-angle
edits in non-fiction television has caused the development of a new convention.
Viewers seem to associate this type of interview so strongly with news and non-fiction
that commercials using the consumer interview format frequently edit in this
manner to stimulate in the viewer the association with real subjects developed through
years of watching non-fiction. The spacing of camera angles on one side of a
180° axis, or eyeline, is crucial to a viewer's sense of continuity. Use of this
method assures the viewer that any unmoving subject will continue to be oriented
in the same direction. If a cover shot shows two people facing each other, every
camera angle must keep the same person on the same side of the screen. Crossing
the eyeline with the camera between shots will result in an image where the
subjects are reversed. Similarly, a series of shots of a moving vehicle must
be shot with the vehicle always moving the same way. If it is necessary to shoot
from the other side across the eyeline, a neutral shot--from directly in front
or back of the vehicle--can ease the transition for the viewer.
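The two angle rules discussed above can be sketched as simple geometry on a top-down floor plan. This is only an illustration under stated assumptions: positions are (x, y) points on a flat studio floor, the eyeline is the straight line through the two subjects, and the function names are hypothetical rather than taken from any production tool.

```python
import math


def angle_change_deg(subject, cam_a, cam_b):
    """Angle in degrees between two camera positions, as seen from the subject."""
    ax, ay = cam_a[0] - subject[0], cam_a[1] - subject[1]
    bx, by = cam_b[0] - subject[0], cam_b[1] - subject[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return abs(math.degrees(math.atan2(cross, dot)))


def side_of_eyeline(person_a, person_b, cam):
    """Return +1 or -1 for the side of the eyeline the camera is on, 0 if on it."""
    cross = ((person_b[0] - person_a[0]) * (cam[1] - person_a[1])
             - (person_b[1] - person_a[1]) * (cam[0] - person_a[0]))
    return (cross > 0) - (cross < 0)


def crosses_eyeline(person_a, person_b, cam_1, cam_2):
    """True if the two camera positions lie on opposite sides of the eyeline."""
    return side_of_eyeline(person_a, person_b, cam_1) * \
        side_of_eyeline(person_a, person_b, cam_2) < 0


# Illustrative floor plan: two people facing each other along the x axis.
left_person, right_person = (-1.0, 0.0), (1.0, 0.0)
cam_1, cam_2, cam_3 = (0.0, -5.0), (3.0, -4.0), (0.0, 5.0)

# 30° check between two successive shots of the same subject at the origin.
change = angle_change_deg((0.0, 0.0), (0.0, -5.0), (5.0, -5.0))
print("angle change:", change, "degrees; at least 30:", change >= 30)

# 180° check: cam_2 stays on the same side as cam_1, cam_3 crosses the eyeline.
print("cam_1 to cam_2 crosses eyeline:",
      crosses_eyeline(left_person, right_person, cam_1, cam_2))
print("cam_1 to cam_3 crosses eyeline:",
      crosses_eyeline(left_person, right_person, cam_1, cam_3))
```

A camera placed exactly on the eyeline returns 0 from `side_of_eyeline`, which corresponds loosely to the neutral shot described above: it belongs to neither side, so a cut through it is not reported as a crossing.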
In print, punctuation and paragraphing indicate the pacing of the word structure.
In video, the type of transition used between shots indicates temporal and spatial
relationships between one shot and the next. Cuts are obviously the most instantaneous.
Because they usually do least to call attention to a change, they indicate the
least change in the world of the program. Dissolves tend to make the audience
feel there has been a change in time or location. The relatively slow replacement
of one scene with the next one acts like a change in paragraph, indicating to
the viewer the end of one series of thoughts and the beginning of another. Wipes
and effect wipes work similarly, although they give an added strength to the
feeling that the new scene is replacing, rather than simply following, the previous
one. Because of this strength, they tend to call a great deal of attention to
the transition, making their continued use somewhat overpowering. Fade-outs
and fade-ins indicate changes in thought of even greater magnitude than those
suggested by the dissolve or wipe, since one scene completely disappears before
the next appears. They are akin to changes in chapter or section in written
material. These transitions all depend on convention for their effect. Some
editing techniques, however, depend on these conventions to suggest one relationship
while the shots involved show another. The result is a jolt to the audience
because the convention is violated. While cuts are generally used to minimize
attention to the transition, the jump cut makes blatant changes in time and
condenses the material in such a way as to shock the viewer with its rapidity.
Similarly, montage--a quick succession of cuts--uses the transition between
a number of shots to make the speed and variety of successive images as much
the subject of the sequence as is the content of each shot. Montage overwhelms
the viewer with the complexity of a situation within the world of the program,
showing the need for attention to an enormous number of items in an obviously
inadequate period of time. Matched dissolves, causing a section of one shot
to blend onto a similar section of the next, show a change in time or place,
but also indicate a strong relationship between the matched elements in each
shot. Thus, while the general pattern of thought is changed, as we have seen,
specific subjects bridge that change and are seen as common to both.
Video conventions are continually changing to create new methods of dealing
with viewer expectations. For years only non-fiction used same-angle cuts, but
their use in commercials will change viewer belief that they express reality.
While viewers used to expect a shock at the end of a swish pan after a detective
enters a room, now they are jarred by the revelation of an empty corner instead.
No "video grammar" can anticipate a complete set of viewer expectations
for very long. Rather, as video professionals who expect to use video to its
fullest extent, we must be aware of the current range of video literacy, and
must monitor changing uses of visual convention if we are to communicate--and
be able to continue communicating--effectively in the medium.
Last updated 8-4-02