The Effect of Automatic Speech Recognition EyeSpeak Software on Iraqi Students' English Pronunciation: A Pilot Study

Technology such as computer-assisted language learning (CALL) is used in teaching and learning in foreign language classrooms, where it is most needed. One promising emerging technology that supports language learning is automatic speech recognition (ASR). Integrating such technology into the classroom, especially in pronunciation instruction, is important in helping students achieve correct pronunciation. In Iraq, English is a foreign language, and it is not surprising that learners make many pronunciation mistakes. One factor contributing to these mistakes is the difference between the Arabic and English phonetic systems. Thus, the transfer of sounds from the mother tongue (Arabic) to the target language (English) is one barrier for Arab learners. The purpose of this study is to investigate the effectiveness of the ASR software EyeSpeak in improving the pronunciation of Iraqi learners of English. An experimental research project with a pretest-posttest design is conducted over a one-month period in the Department of English at Al-Turath University College in Baghdad, Iraq. The ten participants are randomly selected first-year college students enrolled in a pronunciation class that uses traditional teaching methods together with EyeSpeak. The findings show that using EyeSpeak leads to a significant improvement in the students' English pronunciation, as is evident from the test scores they achieve after using the software.


Introduction
The increasing demand for technology in everyday life serves many purposes, one of which is facilitating teaching and learning in education. Technology is especially necessary in the foreign language classroom, where there is limited time for exposure to and practice of the target language. Language learners therefore need to listen to and practice the target language more often in a stress-free environment. An important factor in learning any language is being able to speak it intelligibly. In helping students achieve this, classroom-based technology such as automatic speech recognition (ASR) has a positive impact on learners' outcomes and performance in English pronunciation (Chapelle, 2001). It provides authentic material, such as native speakers' pronunciation of the target language, and allows students to listen to and practice their pronunciation in an enjoyable setting; it also gives each learner immediate correction and feedback, which is difficult to achieve in a class with a large number of students.
The use of automatic speech recognition (ASR) is now an important element in teaching pronunciation. Many studies recommend it as essential to the process because of the advantages it offers learners (Chapelle, 2001; Butler-Pascoe & Wiburg, 2003; Neri, Cucchiarini, & Strik, 2001; Harless, Zier, & Duncan, 1999; Kim, 2006; Pennington, 1996; McCrocklin, 2014). For instance, ASR technology gives the instructor the opportunity to discover each learner's pronunciation problems. It also gives each student the chance to practice pronunciation, identify mistakes and receive feedback based on native-speaker pronunciation. It provides a stress-free environment that encourages students to speak the target language and motivates them to participate (Morley, 1991). All of these advantages assist students in learning English pronunciation and ultimately help them improve their pronunciation and overall oral skills. Clear and accurate pronunciation leads to better understanding and makes communication easier, whereas poor pronunciation can mislead the listener and make comprehension difficult (Eskenazi, 1999).
Learners' poor pronunciation is a barrier to speaking the target language effectively; learners therefore devote much attention to mastering pronunciation (Fraser, 1999). There are several reasons for poor pronunciation, including mother tongue interference and phonetic system differences (Flege, 1995). This study concentrates on the differences between the Arabic and English phonetic systems, especially those English sounds that do not exist in the Arabic sound system. While the Arabic phonetic system includes 32 consonant sounds and 8 vowel sounds, the English system has 24 consonant sounds and 22 vowel sounds (Abdou et al., 2014, p. 371). Such differences between the mother tongue and the target language are considered problems for learners, especially in learning pronunciation (Bell, 1995). Interference from the mother tongue (Arabic) in the foreign language (English) is considered one of the factors that cause problems for Arab learners in general, including Iraqi learners, making it difficult for them to master and produce accurate English pronunciation. This issue hinders communication in the target language and discourages learners from practicing and speaking English. Therefore, this study investigates whether pronunciation training software can help learners improve their English pronunciation despite the differences between the Arabic and English phonetic systems. The main issue relates to the English sounds that do not exist in the Arabic phonetic system. Teachers should make students aware of those differences and help them overcome their pronunciation errors; a successful learner requires proper training in those differences.
According to Kenworthy (1987, p. 4), six factors affect pronunciation accuracy. The first is the difference between the phonetic systems of the mother tongue and the target language. The second is the learner's age, as younger learners learn faster than adults. The third is the amount of exposure to the target language. The fourth is the learner's phonetic ability, which allows the learner to discriminate between sounds. The fifth is the learner's attitude and identity, and the sixth is the learner's motivation and desire to produce good pronunciation. Three of these factors can be addressed using ASR software, which can provide the help students need to learn pronunciation successfully: (a) it provides students with training exercises and drills that make them aware of the sound differences between the mother tongue and the target language; (b) it offers learners exposure to the target language; and (c) it provides practice activities, correction and feedback that enable students to discriminate between sounds. Thus, incorporating computer-assisted language learning (CALL) software in the classroom can improve students' pronunciation: ASR software provides authentic materials and activities in which students listen to native speakers, identifies students' pronunciation problems and provides correction and feedback. This assists the students' learning process, leads them to produce accurate English pronunciation and helps them become independent learners. Students practising and completing drills by themselves also saves the teacher time (Kenworthy, 1987).

The Purpose of the Study
The purpose of the study is to investigate the effectiveness of computer-assisted ASR software in teaching English pronunciation to Iraqi students in the Department of English Language at Al-Turath University College, Baghdad, Iraq. Integrating the ASR software EyeSpeak may help students improve their English pronunciation. Taking advantage of the software's features, such as drills, correction and feedback, may help students reduce pronunciation errors related to the transfer of Arabic sounds to their English speech production. In addition, using ASR software in the teaching and learning environment provides learners with examples of authentic pronunciation by native English speakers. In this study, EyeSpeak is used for one month; improvement in students' pronunciation is measured by administering a pretest and a posttest to evaluate their pronunciation proficiency before and after using the software.

Literature Review
Many studies in the field of language teaching and technology attempt to find the most effective aids for improving students' pronunciation, and the aim of any teacher is to help students produce accurate, intelligible and native-like pronunciation. The traditional way of teaching pronunciation usually concentrates on comparing the students' speech production with native-speaker pronunciation (Molholt, 1988). However, correcting all of the students' mistakes in a traditional class is difficult because it is time consuming. Therefore, many teachers try to use technology aids to assist learning, as giving correction and feedback to students is essential to ensure their pronunciation accuracy. When the learner's pronunciation is accurate, it allows for spontaneous conversation and makes the flow of communication easier (Pennington, 1996).

Teaching Pronunciation and Theories in Language Learning
Several studies demonstrate the effectiveness of pronunciation training on students' pronunciation performance. Morley (1994) recommends paying more attention to pronunciation instruction as a new trend in teaching pronunciation, while Derwing, Munro and Wiebe (1998) state that instruction in segmental accuracy and general oral habits leads to enhanced pronunciation. In addition, new instructional plans should consider not only language forms and functions, but also learner involvement and learner training techniques. Involvement in learning leads students to become active learners in the sense that they can develop and modify their speech production. Teachers, for their part, must be aware of the possibilities that technology offers learners, as this can increase their understanding of language learning.
Having students practice pronunciation is important in helping them improve their speaking production and increase their self-confidence, and can make them less hesitant to speak the target language. Students' self-esteem plays a significant role in improving their English pronunciation because learning pronunciation is not only a matter of exposure to native speakers, but also relates to practising the target language themselves (Kenworthy, 1987).
Many studies address the variables that lead to successful pronunciation learning. Vitanova and Miller (2002) indicate that teachers may provide learning strategies that raise awareness of the differences between the phonetic systems of the mother tongue and the target language, which will help students master the target language pronunciation. A study by Oxford (1986b) highlights the important role of the teacher in providing learning strategies to enhance students' pronunciation performance.

The Importance of Using Pronunciation Training in the Classroom
In teaching English as a Foreign Language (EFL), the most important aspect teachers should focus on is helping students master pronunciation. Students' poor pronunciation leads to communication failure and to learners suffering from low self-esteem and stress (Morley, 1998).
Using automatic speech recognition (ASR) can help students identify the differences between the sounds of their mother tongue and those of the target language (Richards, 2015). The students are able to listen to the native-speaker pronunciation model, compare it with their own pronunciation, practice and receive immediate private feedback, which encourages them to engage in repetitive practice (Richards, 2015). One of the factors that cause Arab learners to face difficulty in learning English pronunciation is the influence of their mother tongue on the target language: interference from Arabic affects Arab learners' pronunciation of English. Similarly, various researchers (Celce-Murcia, Brinton, & Goodwin, 1996; Pennington, 1994) note that the mother tongue can greatly influence the target language, which affects students' pronunciation in terms of intonation and the production of vowel or consonant sounds.
Therefore, it is important that teachers of pronunciation integrate technology aids to help students improve. The teacher can give individual students the opportunity to practice their pronunciation by using ASR, which helps them overcome their mistakes by offering individual practice, correction and feedback; this is difficult to accomplish in traditional classes because it is time consuming. Moreover, integrating ASR is particularly necessary in a foreign language classroom, as it allows individual students to gain awareness of their pronunciation issues. With the teacher's guidance and the support of ASR software, each student can overcome pronunciation issues related to the influence of the mother tongue on the target foreign language. ASR gives students the opportunity to listen to native speakers and practice the drills needed to improve their target language pronunciation.

The Usefulness of Pronunciation Training Feedback
In traditional classes, the teacher corrects the students' pronunciation mistakes immediately, a technique based on the audio-lingual teaching method. However, in the early 1970s, according to the communicative approach, it was felt that teachers should not correct students' errors immediately. Krashen (1985) explains that immediate correction may make students feel uncomfortable, lose their self-confidence and refuse to participate in further activities. Carroll and Swain (1993) review various types of correction and feedback used to improve students' English performance, and show that second language learners who receive feedback perform better than students who do not have their errors corrected. As there is no recognised effective measure of the best way to correct students' pronunciation, feedback is considered the most successful approach, even though students first need to understand the feedback in order to make the changes required to improve their pronunciation. Al-Qudah (2012) mentions that the aim of teaching pronunciation in the foreign language classroom should be for students to produce native-like pronunciation, and that instruction should focus on sound production and on knowing the place of articulation (p. 202).
In traditional pronunciation classes, it is usually difficult for the teacher to address every student's English pronunciation performance and problems. Thus, technology applications are widely used in language teaching because of their benefits for individual learners. Using automatic speech recognition (ASR) in the pronunciation classroom can provide correction and feedback for each student in a private, stress-free environment, as each learner works independently with the software. Therefore, many teachers in foreign language classrooms use technology in teaching English pronunciation (Neri, Strik, & Cucchiarini, 2006). Automatic speech recognition, which addresses individual learners and can identify their pronunciation errors, is commonly used as an aid in teaching pronunciation (Truong, Neri, de Wet, Cucchiarini, & Strik, 2005), and ASR software with internet features is considered most effective in improving the teaching and learning of pronunciation. There are different types of ASR software, some CD-based and others internet-based; we use EyeSpeak in this study, which is an internet-based program. According to Witt (2012), EyeSpeak has a useful feature in that it provides students with feedback in an interesting way, giving details of tongue position as compared with the native-speaker model. It gives students feedback in the form of an animated sound wave, phonetic transcription and animated sound production, and provides them with an overall score for their pronunciation performance. Moreover, it provides students with details about segmental features, including consonant and vowel phonemes, and suprasegmental features, including timing, pitch and loudness. In addition to these features, which increase teachers' interest in using the software in pronunciation classes, the software introduces subject materials in a colourful and interesting environment that attracts students' attention (Hişmanoğlu, 2010).

Methodology
In recent decades, research has addressed the implementation of automatic speech recognition (ASR) EyeSpeak software in the foreign language classroom because it can help in achieving native-like pronunciation. The EyeSpeak software can detect and diagnose students' errors and provide automatic feedback, which raises students' awareness of their pronunciation errors. Therefore, the aim of the study is to investigate the effect of using the software on the pronunciation performance of Iraqi EFL students, and to determine whether teaching pronunciation using ASR is more efficient than using traditional methods. To test this, we conduct experimental research in this pilot study to answer the following research question: Is there any effect of using automatic speech recognition (ASR) EyeSpeak software on Iraqi students' English pronunciation?
In an experiment with one group, we utilise a pretest-posttest method to investigate students' pronunciation performance by considering differences in test scores before and after the use of EyeSpeak in teaching English pronunciation. We conduct the pilot study at a private university in Baghdad, Iraq; the participants are ten first-year college students, randomly selected from the Department of English, aged between sixteen and twenty-one.

Instrument
The pronunciation teaching material is the textbook Better Pronunciation by J. D. O'Connor (2003), as assigned by Iraq's Ministry of Higher Education. This book introduces the pronunciation of English to students at the intermediate and advanced levels. It explains how the speech organs work, and considers separate sounds before blending them into words, rhythm patterns and intonation.
The pilot study adopts EyeSpeak software as a multimedia pronunciation-teaching tool that includes speech recognition, online pronunciation features and sound-distinction training. EyeSpeak includes drills, practice, speech recording with playback capabilities, sound and graphic articulatory displays and animated and visual pronunciation feedback. These features enable evaluation and provide visual feedback on English sounds (vowels and consonants) for learners of English as a foreign language (EFL). The software can be a valuable pronunciation tool for beginner to intermediate EFL learners, helping them distinguish between English sounds and produce English phonemes accurately. Several features of EyeSpeak have the potential to assist EFL learners in resolving their segmental pronunciation problems, and may help them improve their pronunciation. These features include showing animated speech organs (sound production), viewing the sound production waveform, listening to native speakers for comparison with the students' own pronunciation and listening to minimal pairs containing target sounds with transcription.
The software has two user profiles, one for the teacher and one for the students. The purpose of the teacher profile is to monitor students' progress; it enables the teacher to observe each learner's skill level and score. The student profile consists of five sections: home, lesson, speech, dictionary and fun.
The focus of the pilot study is on learning pronunciation, and the speech section assists students in learning and practising English pronunciation. To measure any improvement in English pronunciation due to using EyeSpeak software, we use an achievement test. We administer a pretest and a posttest to measure whether there is a significant difference between students' scores before exposure (pretest) and after exposure (posttest) to EyeSpeak. The main goal is to measure students' improvement in pronouncing the English consonant sounds not found in the Arabic phonetic system (/p/, /v/, /tʃ/, /ʒ/, /ŋ/). The test consists of a written part and an oral part. The test questions are adapted from those in English Pronunciation in Use by Marks (2007), and the test also uses content from Better Pronunciation by O'Connor (2003). On the written test, the students have to answer all 11 transcription and multiple-choice questions, which are intended to measure specific sounds. In the 48-word oral test, each student must read four pairs of words that contain two consonant sounds. The content of the test is verified by two senior lecturers. The analysis technique used in this study is a paired-samples t-test, which measures differences between the pretest and posttest scores.
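The paired-samples t-test named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function name `paired_t_test` and the example scores are hypothetical, and the effect size shown is the paired-differences form of Cohen's d (mean difference divided by the standard deviation of the differences), which may differ from the definition used in Table 1.

```python
import math
import statistics


def paired_t_test(pre, post):
    """Return (t, df, d_z) for paired scores, using differences pre - post.

    t   : paired-samples t statistic
    df  : degrees of freedom (n - 1)
    d_z : mean difference divided by the sample SD of the differences
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    n = len(pre)
    if n < 2:
        raise ValueError("at least two paired observations are required")

    diffs = [a - b for a, b in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1, mean_diff / sd_diff


# Hypothetical scores for ten students (NOT the study's data).
pre_scores = [30, 35, 28, 40, 33, 31, 36, 29, 34, 35]
post_scores = [41, 47, 40, 50, 45, 40, 48, 42, 44, 46]

t_stat, df, d_z = paired_t_test(pre_scores, post_scores)
print(f"t({df}) = {t_stat:.2f}, d_z = {d_z:.2f}")
```

Because the differences are taken as pretest minus posttest, a negative t indicates that posttest scores are higher. With ten participants the test has nine degrees of freedom.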

Data Collection Procedure
This four-week pilot study began on January 10 and ended on February 10, 2016. There were three 45-minute classes per week for the pronunciation subject. Each week, the students attended one class in the sound lab using EyeSpeak software. The students took a 90-minute pretest on January 10 at 10 a.m. to measure their pronunciation proficiency level before using the EyeSpeak software, with the test directed by their pronunciation class teacher. In the next day's lab-based pronunciation class, the lecturer explained in detail how to use the EyeSpeak software's sections and categories. The lecturer verified that all students understood how to use the software before the actual class session started. In the following days, students attended lab-based pronunciation classes. Each student had a computer, an EyeSpeak account and his or her own headset. Students had to log in to their accounts each time they came to the lab; the lecturer then introduced them to the sounds they had to work on that day using particular EyeSpeak activities. At 10 a.m. on February 10, at the end of the study, the students took a 90-minute posttest.

Data Analysis
A paired-samples t-test determined whether the difference between the pretest and posttest was significantly different from zero, and a Shapiro-Wilk test determined whether the differences could have been produced by a normal distribution (Razali & Wah, 2011). The result of the Shapiro-Wilk test is not significant (W = 0.88, p = .123), which suggests that the deviations from normality are explainable by random chance; thus, normality can be assumed. Levene's test was applied to assess whether the homogeneity of variance assumption was met (Levene, 1960); this assumption requires that the variance of the dependent variable be approximately equal across the sets of scores being compared. The result of Levene's test is not significant (F(1, 20) = 1.93, p = .180), indicating that the assumption of homogeneity of variance is met.
The result of the paired-samples t-test is significant (t(9) = -11.22, p < .001), suggesting that the true difference between the pretest and posttest means is different from zero. The mean of the pretest (M = 33.10) is significantly lower than that of the posttest (M = 44.21). Table 1 presents the results of the paired-samples t-test; Figure 1 presents the pretest and posttest means.
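For reference, the paired-samples statistic reported above has the standard form shown below, where the differences are taken as pretest minus posttest. The worked values are a back-calculation from the rounded figures in the text (n = 10, M = 33.10 and 44.21, t = -11.22), so they are approximate; the symbols and the paired-differences effect size d_z are assumptions introduced here for illustration, and the Cohen's d in Table 1 may be defined differently.

```latex
% Paired-samples t statistic on the differences d_i = pre_i - post_i
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad \mathrm{df} = n - 1 = 9

% Mean difference from the reported means
\bar{d} = 33.10 - 44.21 = -11.11

% Standard deviation of the differences implied by the reported t
s_d = \frac{\bar{d}\,\sqrt{n}}{t} = \frac{(-11.11)(3.162)}{-11.22} \approx 3.13

% Paired-differences effect size (assumed definition)
d_z = \frac{\bar{d}}{s_d} = \frac{t}{\sqrt{n}} \approx -3.55
```

The negative sign only reflects the direction of subtraction: posttest scores are higher than pretest scores.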

The Findings
The findings of this study indicate a significant improvement in students' pronunciation after using EyeSpeak software for a one-month period. The study focuses on the English sounds absent from the phonetic system of the students' mother tongue (Arabic), and the students' test scores reveal an improvement in these sounds. The findings show a great improvement in the sounds /p/, /v/, /ʃ/, a slight improvement in /ʒ/, /tʃ/, /dʒ/ and less improvement in /ŋ/. A similar study by Mohsin (2012) reveals significant improvement in students' pronunciation (individual sounds) after using CALL-based ASR software. Moreover, the pilot study confirms that using ASR EyeSpeak software did improve Iraqi students' pronunciation of English sounds absent from their mother tongue (Arabic). Therefore, we recommend implementing CALL ASR in teaching pronunciation, given that it leads to student improvement. Furthermore, a study by Eskenazi, Tomokiyo and Wang (2000) reveals that using CALL pronunciation training software is valuable in improving students' pronunciation of difficult English sounds. In contrast, a study by Witt (2012) reveals that EyeSpeak software may not help students understand the feedback, as it gives a score only at the word level and not at the phoneme level. Other studies (Kim, 2006; Neri, Cucchiarini, & Strik, 2008) affirm that ASR is effective in teaching pronunciation, especially for non-native speakers, as it significantly improves students' pronunciation.

Conclusion
The objective of this study is to help Iraqi college students improve their English pronunciation by focusing on the English consonant sounds not found in the Arabic phonetic system. Differences between the phonetic systems affect the students' pronunciation of English because they transfer their Arabic pronunciation to their English pronunciation. Analysis of the study's results reveals a significant improvement in students' English pronunciation in the posttest compared with their pretest scores. This difference indicates that the use of EyeSpeak software in the pronunciation class helps students in their learning process and leads them to produce more accurate English pronunciation.
Thus, the use of EyeSpeak software in a pronunciation class can improve students' English pronunciation and help them learn more quickly and recognise their errors. Therefore, we recommend the use of automatic speech recognition (ASR) EyeSpeak software as a tool to support teaching English, especially for EFL learners. Integrating EyeSpeak software can augment the students' ability to learn and increase their level of understanding.

Figure 1. The means of the pronunciation pretest (A) and posttest (B)

Table 1. Paired-samples t-test for the difference between pretest and posttest
Note. Degrees of freedom for the t-statistic = 9. d represents Cohen's d.