The Impact of Videos Presenting Speakers’ Gestures and Facial Clues on Iranian EFL Learners’ Listening Comprehension

The current research explored the effect of videos presenting speakers’ gestures and facial clues on Iranian EFL learners’ listening comprehension. It was carried out at Ayandeh English Institute with 60 advanced female learners aged 17 to 30, using a quasi-experimental research design. The researcher administered a TOEFL test to check the homogeneity of the participants in both general English language proficiency and listening comprehension ability. Four intact classes were randomly assigned to two groups. Once the two groups were found to be homogeneous, they received two different listening comprehension treatments over 10 sessions: the audio-visual group watched videos showing the speaker’s gestures and facial clues, while the audio-only group could only listen to the speaker’s voice, with no additional clues. In each session, the participants answered questions related to the video. At the end of the treatment, both groups took the listening comprehension section of the Longman TOEFL test as the post-test. A t-test comparing the mean scores of the two groups showed that the mean score of the audio-visual group was significantly higher than that of the audio-only group. The results suggest that foreign language pedagogy, especially for adult English learners, would benefit from videos presenting speakers’ gestures and facial clues.

believe gestures and facial cues can facilitate and improve face-to-face interactions involving L2 learners. They state that access to visual cues such as gestures and lip movements facilitates ESL students' listening comprehension. Their findings indicate that learners prefer visual cues (video-recorded materials) to audio-only cues because they comprehend the lecture better with them.
In spite of the importance of listening comprehension in communication and in learning a foreign language (in this case, English), little attention (at least in our country, Iran) has been paid to testing and applying different strategies and techniques for improving this skill. One useful strategy that may be under-used in Iranian English classes is showing students videos in which the speakers' gestures and facial clues serve as additional sources of information to help them improve their listening comprehension. Because of teachers' or institute administrators' inadequate knowledge, or insufficient facilities, learners cannot enjoy the benefits of this technique, and most students still have problems comprehending native speakers' words. Also, most of the current strategies focus on production, and little attention has been given to the process. Teachers simply expect their students to understand native speakers' utterances, but the process of achieving this goal is neglected.
The goal of this study is to address the difficulty that advanced Iranian EFL students have with listening comprehension, and to examine the effectiveness of video-taped materials presenting the speakers' non-verbal behaviors (specifically gestures and facial cues) in improving listening comprehension.

Research question
This study seeks to answer the following question: Q: Does applying videos presenting speakers' gestures and facial clues have any significant effect on EFL learners' listening comprehension?

Statement of the hypothesis
H0: Applying videos presenting speakers' gestures and facial clues has no significant effect on EFL learners' listening comprehension.

Listening comprehension
Listening plays a crucial role in learning a foreign language. It is, in fact, a problem-solving skill. Mendelsohn (1994) defines listening comprehension as "The ability to understand the spoken language" (p. 19). Oxford (1993) describes this process as "perception of sounds, comprehension of meaning-bearing words, phrases, clauses, sentences and connected discourses" (p. 206).

Listening Strategies
According to Bloom (1970), there are some strategies that fluent L2 and even L1 listeners use to process speech. He states that, to help learners become successful L2 listeners, teachers can teach them the following strategies:
· Getting the background information they need to know something about what they will hear,
· Making predictions about what they will hear,
· Ignoring information in the speech they don't need,
· Noticing if they are not comprehending what they hear,
· Checking their comprehension often as they listen, and,
· If they are speaking with another person, making an appropriate response to keep the conversation going.

Non-verbal Communication
Charles Darwin's book The Expression of the Emotions in Man and Animals (1872) is the first study of non-verbal communication, in which it is argued that all mammals show emotion reliably in their faces.
Darwin holds that facial expressions of emotion are similar across all human beings, in spite of their varying cultural backgrounds. Hogan (2008, pp. 192-194) states that "cultural differences in body language and non-verbal behaviors typically show up in the areas of eye contact, touch, gesture and territorial space".
He refers to body language as "Body position, gestures, eye contact and movements of the body" and defines non-verbal communication as follows:
Nonverbal communication includes those things of body language [such as facial expressions or hands and arms movement] but also includes how people dress, social norms on dress and behavior, the jewelry people wear, the tattoos people reveal, the distance people stand from each other, the way people use time, the way people use space…even the tone and pitch of people's voices. (p. 2)
2.2.1 Non-verbal behavior's categories
2.2.1.1 Proxemics: This refers to the study of how people use and perceive the physical space around them, which influences the way a message is interpreted. Hogan (2008, p. 49) defines proxemics as "the study of personal space and how humans use distance in general".
2.2.1.2 Chronemics: According to Burgoon, Buller, and Woodall (1994) and Hickson (1985), this category refers to the study of the use of time in nonverbal communication. The way we perceive time, structure our time and react to time is a powerful communication tool and helps set the stage for communication. Time perceptions include punctuality and willingness to wait, the speed of speech and how long people are willing to listen. The timing and frequency of an action, as well as the tempo and rhythm of communications within an interaction, contribute to the interpretation of nonverbal messages.
2.2.1.3 Kinesics: Kinesics, which refers to movement and body position, is another category and consists of gesture and posture. Posture can be used to determine a participant's degree of attention or involvement, the difference in status between communicators, and the level of fondness a person has for the other communicator. Knapp and Hall (2007, p. 9) investigated the effect of posture on interpersonal relationships and suggest that "mirror-image congruent postures, where one person's left side is parallel to the other person's right side, leads to favorable perception of communicators and positive speech; a person who displays a forward lean or a decrease in a backwards lean also signify positive sentiment during communication".
2.2.1.4 Gestures: A gesture is a non-vocal bodily movement intended to express meaning. Gestures may be articulated with the hands, arms or body, and also include movements of the head, face and eyes, such as winking, nodding, or rolling one's eyes. In this study, however, face movements are considered facial expressions, and the researcher treats them as a category separate from gestures. Different gesture categories have been recognized by different researchers (Allen, 2000; Gullberg, 2006; McNeill, 1992; Kusanagi, 2005). The most familiar ones are the so-called emblems or quotable gestures. These are conventional, culture-specific gestures that can be used as replacements for words, such as the hand-wave used in the US for "hello" and "goodbye".
2.2.1.5 Haptics: This is the last category, which refers to the study of touching as nonverbal communication. Touches that can be defined as communication include handshakes, holding hands, kissing (cheek, lips and hand), back slapping, high fives, a pat on the shoulder, and brushing an arm. Touching of oneself may include licking, picking, holding, and scratching. Knapp and Hall (2007, p. 9) refer to these behaviors as "adapter" or "tells" and believe that they may send messages that reveal the intentions or feelings of a communicator.

Facial Clues
Human expressions are good indicators of true individual emotions and internal feelings. James (2009, p. 117) notes that the human face consists of many muscles and that "these muscles combine to create a complex range of emotional messages". Hogan (2008, pp. 4, 165) considers facial expressions as "human's universal language all over the world". Rachel, Blais, Scheepers, Schyns, and Caldara (2009) define facial expression as:
...One or more motions or positions of the muscles in the skin. These movements convey the emotional state of the individual to observers. Facial expressions are a form of nonverbal communication. They are a primary means of conveying social information among humans, but also occur in most other mammals and some other animal species. Facial expressions and their significance in the perceiver can, to some extent, vary between cultures. (pp. 543-548)
Facial expressions are also a form of kinesics used to nonverbally transmit messages. According to Knapp and Hall (2007, p. 260):
The face is rich in communicative potential. It is the primary site for communication of emotional states, it reflects interpersonal attitudes; it provides nonverbal feedback on the comments of others; and some scholars say it is the primary source of information next to human speech. For these reasons, and because of the face's visibility, we pay a great deal of attention to the messages we receive from the faces of others.
Facial expressions, such as gazing, smiling, wrinkling our nose when disgusted, baring our teeth and narrowing our eyes when enraged, and staring wide-eyed when frightened, serve different functions in communication and can contribute to better understanding on the listener's part. Darwin (1872) argues that:
People make use of facial expressions because they are signs of serviceable associated habits, behaviors that earlier in our evolutionary history had specific and direct functions. For a species that attacked by biting, baring the teeth was a necessary prelude to an attack; wrinkling the nose reduced the inhalation of foul odors; and so forth. (p. 74)
He also believes humans do these things because "over the course of their evolutionary history such behaviors have acquired communicative value: they provide others with external evidence of an individual's internal state" (p. 80). Krauss et al. (1996) state that making use of such information generated evolutionary pressure to select sign behaviors, thereby schematizing them and, in Tinbergen's phrase, "emancipating them" from their original biological function.

Video-taped materials
Given the importance of listening comprehension and the effects of non-verbal behaviors in general, and gestures in particular, on listening, it is worth knowing other researchers' views on using videos that present speakers' gestures as instruments for improving listening comprehension. Celce-Murcia (2002, p. 52) states the rationale for applying media in language classrooms as follows:
· 1. Media plays an important role in learners' daily lives outside the classroom, so its application in the classroom environment can motivate learners and make the learning process happen more quickly.
· 2. Audio-visual or video-taped materials create a contextualized situation in which meaning, content and instruction or guidance are all provided and can be exploited.
· 3. Media can expose learners to authentic language, and relate the somewhat artificial atmosphere of the classroom to the real outside world.
· 4. Using media in classrooms can respond to different learning styles and address the needs of visual as well as auditory learners.
· 5. Media can serve as additional and firsthand sources of input that decrease the danger of learners' dependency on their teachers' dialect or idiolect.
· 6. According to schema theory, new information can be accessed by scanning our memory for related knowledge, so media can help learners enhance their use of prior background knowledge in the learning process.
· 7. Media present the lesson in a time-efficient and concise way, allowing students to process the information more readily.
Moreover, Harmer (2001) believes in the suitability and effectiveness of using audio-visual materials in English classes. One of the reasons he offers is that learners get to see the language in use. He also believes that "videos provide students with paralanguage which help them to interpret the text more deeply". In addition, familiarity with the target language culture, the way people in those countries dress, the kinds of food they eat, or the typical body language they use, for instance, to invite someone out, are other advantages offered by videos. (p. 17)
Regarding the effectiveness of using new technology and video-taped materials for improving listening comprehension among EFL learners, Nunan (2005) believes that in many aspects technology has become as effective as humans in delivering content for L2 listening classrooms. Wagner (2006, p. 74) believes that:
Depending on the purpose of the test, the inclusion of nonverbal components of spoken communication through the use of video texts in L2 listening test tasks might be advantageous, since not only would the tasks more closely simulate the characteristics of authentic spoken language, but the inclusion of the visual channel in presenting the spoken input might lead to construct more relevant variance in the assessments, allowing for more valid inferences to be made from the results of those assessments.

Participants
A total of 60 advanced EFL learners, all female adults ranging from 17 to 30 years old, took part in this research. They formed 4 classes of 15 students on average, for 60 learners in total. As this is a small-scale study, 60 participants seemed a reasonable and manageable number.

Instrumentation
In order to pursue this study, the researcher used the following instruments:
· The Longman TOEFL test (2007) as a homogeneity test,
· A DVD containing ten short video clips showing on-the-street interviews, chosen from Summit TV,
· Copies of the questions related to each interview, taken from the Summit video book, covering different question types (gap-fill, true or false, matching, multiple-choice, and sentence completion) across the ten sessions of treatment, and
· The listening comprehension section of the Longman TOEFL test as the post-test.

Procedure
To achieve the purpose of the present study, the following steps were taken during the research process.
Due to the regulations of the Ayandeh Institute, where the researcher ran her study, the researcher did not have the luxury of randomly selecting the subjects and assigning them individually to two groups. Therefore, she employed convenience sampling to choose the participants. The students selected for the study came from four intact advanced classes of the institute.
Of these four classes, the two morning classes, with 16 and 14 learners (one from 8:30 a.m. to 10:15 a.m. and the other from 10:30 a.m. to 12:15 p.m.), and the two afternoon classes, with 17 and 13 learners (one from 3:45 p.m. to 5:30 p.m. and the other from 5:45 p.m. to 7:30 p.m.), were randomly assigned as the control and experimental groups, respectively. Each group thus contained 30 students. The researcher herself taught the experimental group (the two classes with 17 and 13 learners), and a teacher at the language school with almost 5 years of experience taught the control group (the two classes with 16 and 14 learners). The classes were held three days a week, on even days, for seventeen sessions.
To make sure that the participants of both groups were homogeneous in their general English language proficiency on the one hand and their listening comprehension on the other, the Longman TOEFL test (2007) was administered to the participants of these classes.
Based on the results, there was no significant difference between the mean scores of the two groups on the Longman TOEFL test. Thus, it was concluded that the two groups were homogeneous in terms of their general English language proficiency prior to the administration of the treatments.
Further, there was no significant difference between the mean scores of the two groups on the listening comprehension section either. Thus, it was concluded that the two groups were also homogeneous in terms of their listening comprehension ability prior to the administration of any treatment.
Once the two groups were found to be homogeneous, they received two different listening comprehension treatments: the audio-visual group watched videos showing the speaker's gestures and facial clues, while the audio-only group could only listen to the speaker's voice, with no additional clues.
At the beginning of each session, the participants of both groups received a warm-up on the theme of the interviews they were going to work on in that session. Then, any new or problematic vocabulary in the interviews was pre-taught. Next, copies of the questions related to each interview, taken from the Summit video book, were handed out to all participants. Different kinds of questions were assigned for each interview. In some sessions the students answered gap-fill, true or false and matching exercises, while in others the exercises designed for the interview were multiple-choice questions or sentence completion.
The participants of both groups watched or listened to the interviews as many times as they wished in order to complete the task. On average, they needed to watch or listen three to five times. After completing the task, the participants of both groups received feedback on their answers from the teacher. In both groups, the teacher paused the interview after each question and reviewed the correct answer with the participants. Each time, the overall practice took twenty to thirty minutes. The only difference between the two groups was exposure to the interviewee's gestures and facial clues. The audio-visual group had access to the speaker's nonverbal behaviors in addition to verbal behaviors, while the audio-only group could only hear the interviewee's voice, with no additional clues to draw on. It should be noted, however, that because the interviews and the related questions were exactly the same in both groups, the questions were designed so that no nonverbal behaviors (gestures and facial clues) were needed to answer them. The questions practiced in class in each session of treatment and the TOEFL test administered as the homogeneity test are included in appendices 1 and 2.
The interviews used in both groups were exactly the same, and their topics were as follows: Personality Type, Saving Money, Urban Life vs. Suburban Life, Advertising, and News Sources.
Finally, at the end of the instructional period (in the eleventh session), both groups took the listening comprehension section of the Longman TOEFL test as the post-test. The mean scores of the two groups were compared to see whether using video-taped materials presenting speakers' gestures and facial clues made any significant difference to EFL learners' listening comprehension.

Data Analysis
In order to test the null hypothesis of this study, the following statistical analyses were carried out. At the beginning of the study, the mean scores of the two groups were calculated to check the homogeneity of the participants in the experimental and control groups in both general English language proficiency and listening comprehension ability prior to the administration of any treatment.
On the Longman TOEFL test as a whole, the mean scores for the audio-visual and audio-only groups were 64.40 and 63.87, respectively. Based on these results, it was concluded that there was no significant difference between the mean scores of the two groups, and therefore that the two groups were homogeneous in terms of their general English language proficiency prior to the administration of the treatments. On the listening comprehension section, the mean scores for the audio-visual and audio-only groups were 65.77 and 65.20, respectively. Here too there was no significant difference between the mean scores of the two groups, so it was concluded that the two groups were homogeneous in terms of their listening comprehension ability prior to the administration of any treatment.
Also, an independent t-test was run on the means of the two groups on the listening comprehension section of the TOEFL test as the post-test, to determine whether applying videos presenting speakers' gestures and facial clues has any significant effect on EFL learners' listening comprehension.
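The independent t-test named in this section can be sketched in a few lines. The following is only a minimal illustration, assuming the pooled-variance (Student) form of the test; the function name `independent_t` and the score lists are hypothetical and are not the study's data.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for Student's independent-samples t-test (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_error
    return t, df


# Hypothetical post-test scores, invented for illustration only.
audio_visual = [68, 72, 65, 70, 74, 69]
audio_only = [63, 66, 61, 67, 64, 62]

t, df = independent_t(audio_visual, audio_only)
print(f"t({df}) = {t:.2f}")
```

With 30 learners in each group, the degrees of freedom would be 58, for which the two-tailed critical value at the .05 level is roughly 2.00; an absolute t value above that would count as a significant difference.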

Data Analysis for the Listening Comprehension Section of the Longman TOEFL test as the Post-test
All participants in both groups were tested at the end of the study using the listening comprehension section of the same Longman TOEFL test they had taken at the outset, to see the impact of applying videos presenting speakers' gestures and facial clues on EFL learners' listening comprehension. An independent t-test was run to compare the mean scores of the audio-visual and audio-only groups on the listening comprehension post-test. The descriptive statistics for the two groups are displayed in the following tables. Running a t-test requires meeting two assumptions: normal distribution of scores and homogeneity of variances. As shown in Table 4.11, the scores of both groups were normally distributed because, for each group, the ratio of the skewness statistic to its standard error was within the range of plus and minus 1.96. So, the first assumption was met.
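The skewness-ratio check described above can be illustrated as follows. This is a sketch under stated assumptions: it uses the adjusted Fisher-Pearson skewness and the standard error formula commonly reported by statistical packages, and the function names are hypothetical rather than taken from the study.

```python
import math
from statistics import mean, stdev


def skewness(scores):
    """Adjusted Fisher-Pearson sample skewness."""
    n = len(scores)
    m, s = mean(scores), stdev(scores)
    cubed = sum(((x - m) / s) ** 3 for x in scores)
    return n / ((n - 1) * (n - 2)) * cubed


def skewness_std_error(n):
    """Standard error of the sample skewness; it depends only on sample size."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def looks_normal(scores, cutoff=1.96):
    """True when |skewness / standard error| stays within the cutoff."""
    ratio = skewness(scores) / skewness_std_error(len(scores))
    return abs(ratio) <= cutoff


# For a group of 30 learners the standard error is about 0.43.
print(round(skewness_std_error(30), 3))
```

Because the standard error depends only on the group size, two groups of 30 share the same denominator, and only the skewness statistic itself differs between them.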
It should be noted that the two groups were also homogeneous in terms of their variances. As displayed in Table 4.12, the Levene F of 3.38 had a probability of .071. Since this probability was higher than the significance level of .05, it could be concluded that the two groups had homogeneous variances on the post-test of listening comprehension.
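To make the Levene statistic concrete, the sketch below computes it for two groups. It assumes the mean-centered version of the test (absolute deviations from each group's mean, followed by a one-way ANOVA on those deviations); the function name `levene_f` and the sample scores are hypothetical, not the study's data.

```python
from statistics import mean


def levene_f(group_a, group_b):
    """Return (F, (df_between, df_within)) for a mean-centered Levene test."""
    groups = [group_a, group_b]
    # Step 1: replace each score by its absolute distance from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    # Step 2: run a one-way ANOVA on those distances.
    total_n = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / total_n
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (between / df_between) / (within / df_within)
    return f_stat, (df_between, df_within)


# Hypothetical scores, invented for illustration only.
f_stat, dfs = levene_f([68, 72, 65, 70, 74, 69], [63, 66, 61, 67, 64, 62])
print(f"Levene F{dfs} = {f_stat:.2f}")
```

A small F means the two groups spread around their means to a similar degree. With 1 and 58 degrees of freedom, the critical F at the .05 level is about 4.0, so an F of 3.38 falls below it, which agrees with the reported probability of .071.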

Results and Discussion
The post-test data analysis revealed a significant difference between the mean scores of the two groups on the listening comprehension section of the TOEFL test. Since the audio-visual group outperformed the audio-only group on the listening comprehension post-test, it could be concluded that applying videos presenting speakers' gestures and facial clues has a significant impact on EFL learners' listening comprehension.
The researcher speculated that this result could be due to the participants of the audio-visual group being more actively involved in the process of listening comprehension during the sessions, since they were equipped with extra sources of information.

Conclusion
To ensure the homogeneity of the participants of the two groups, i.e. the audio-visual group and the audio-only group, in terms of their general English language proficiency on the one hand and their listening comprehension on the other, a Longman TOEFL test (2007) was administered to the participants of both groups. Based on the results, it was concluded that the two groups were homogeneous in terms of both general English language proficiency and listening comprehension ability prior to the administration of any treatment.
After this conclusion was reached, the two groups received two different treatments. The audio-visual group watched videos showing the speaker's gestures and facial clues, while the audio-only group could only listen to the speaker's voice, with no additional clues. Next, the participants in both groups answered the same set of questions related to each interview (on average 12 questions for each session of treatment), which were then reviewed and corrected in class with the teacher's assistance.
Finally, at the end of the instructional period, both groups took the listening comprehension section of the Longman TOEFL test as the post-test. An independent t-test was run to compare the mean scores of the audio-visual and audio-only groups on this post-test. Based on the results, it could be concluded that there was a significant difference between the mean scores of the two groups, with the audio-visual group performing better. Thus, the null hypothesis, 'Applying videos presenting speakers' gestures and facial clues has no significant effect on EFL learners' listening comprehension', was rejected.
To sum up, the present study indicated that using video-taped materials presenting speakers' gestures and facial clues makes a significant difference to EFL learners' listening comprehension. In this study, audio-visual materials proved more effective than audio-only ones.

Implications of the Study
The results of this study offer some guidance for English instructors teaching listening comprehension. They can benefit from using video-taped materials to improve their students' listening comprehension.
The results could also have significant implications for syllabus designers, materials developers, and those preparing listening aids. They can achieve better results by carefully including appropriate audio-visual materials when designing syllabuses, developing materials, and preparing listening aids.