A Corpus-based Study of EFL Learners’ Errors in IELTS Essay Writing

The present study analyzed different types of errors in EFL learners’ IELTS essays. To determine the major types of errors, a corpus of 70 IELTS examinees’ essays was collected, and the errors were extracted and categorized qualitatively into 13 categories using a researcher-developed error-coding scheme. The frequency of each error type was then calculated using descriptive statistics, and the most common errors committed by the EFL learners in IELTS essays were identified. The results indicated that the two most frequent error types were related to word choice and verb form. Pedagogically, the results suggest that analyzing EFL learners’ writing errors is a useful basis for instruction, including the creation of teaching materials that are in line with learners’ linguistic strengths and weaknesses.


Introduction
English has the status of a foreign language for the majority of Iranian learners. However, learning English as a foreign language (EFL) is not an easy task. According to Brown (2000), in order to become proficient in the English language, learners have to be adequately exposed to all four basic skills, namely listening, speaking, reading, and writing. Among the four language skills, writing is the most difficult one since it requires a higher level of productive language control (Celce-Murcia & Olshtain, 2000). Writing is a complex task, and writing in a foreign language is more complex still, as it requires adequate command of the language to meet all the requirements for a written text to be comprehensible. Despite these challenges, L2 writing is widely assessed in international EFL/ESL testing systems such as TOEFL and IELTS (Askarzadeh Torghabeh & Yazdanmehr, 2010). In fact, mastery of writing skills is seen as one of the most important factors contributing to success in the IELTS test.
For around fifteen years, learner corpus-based research has relied on a methodological framework adopted from earlier approaches to second language acquisition (SLA) and from corpus linguistics. From SLA, it has incorporated the methodologies of Contrastive Analysis (CA) and Error Analysis (EA), which, enhanced by the advantages of computerized corpus research, have given rise to a powerful device for the qualitative and quantitative study of foreign language learning. From corpus linguistics, it has adopted a quantitative approach to data and its mechanisms of analysis (Díaz-Negrillo, 2006).
Understanding learners' errors is important for language teachers, researchers, and learners (Corder, 1967). Learner corpora have been employed mostly to provide information on learners' common errors, although they can also be used for pedagogical purposes. Learner corpora constitute a new resource for SLA and foreign language teaching (FLT) specialists, and they are most useful when all errors in the corpus have been annotated with a standardized system of error tags (Granger, 2003).
Given the importance of L2 writing in IELTS essays, it is useful to examine the major errors that this group of learners typically makes. This study, in particular, aimed to show that a learner corpus can assist in developing pedagogical materials that are appropriate for specific learners. Therefore, the goal of this study is to determine the major types of errors in IELTS essay writing by EFL learners in order to find practical implications for improving IELTS essay writing instruction.

Error analysis (EA)
Error analysis (EA) is a method used to compile the errors that appear in learner language, determine whether those errors are systematic, and explain what caused them. Corder (1967), the 'Father' of Error Analysis, defined EA as a procedure used by both researchers and teachers which involves collecting samples of learner language, identifying the errors in the sample, describing these errors, classifying them according to their nature and causes, and evaluating their seriousness. The purpose of Error Analysis is, in fact, to find "what the learner knows and does not know" and to "ultimately enable the teacher to supply him not just with the information that his hypothesis is wrong, but also, importantly, with the right sort of information or data for him to form a more adequate concept of a rule in the target language" (Corder, 1974, p. 170). Mitchell and Myles (2004) subscribe to the view that language errors are normal and inevitable features of learning. They claim that the study of errors can reveal the developing system of L2 learners' language. This system, in their view, is dynamic and open to change and to the resetting of parameters. The same view is supported by Stark (2001), who further explained that language teachers should view students' errors positively and should not regard them as the learners' failure to grasp rules and structures. L2 learners' errors should be viewed as a normal, inevitable, and natural part of the learning process (Stark, 2001). Corder (1974) devised a five-step model for Error Analysis, which was employed in the present study. The steps in this model are as follows:
Step 1: Collection or selection of a sample of learner language (a corpus of language which can be written or oral).
Step 2: Identification of errors.
Step 3: Description of errors, which includes a grammatical analysis of each error and its sources.
Step 4: Explanation of the different types of errors, which is the ultimate object of error analysis.
Step 5: Evaluation of the errors that are collected.
The study of errors, therefore, is beneficial for both learners and teachers (Richards, Platt, & Platt, 1996). By tapping into problematic areas, EA can offer a clear and reliable picture of students' knowledge of the target language. This may encourage language teachers to treat learners' errors as effective indicators of their learning problems.

Corpus-based Error Analysis (CEA)
Since the mid-1980s, corpus linguistics has been increasingly recognized as a powerful methodology in language teaching and learning (Conrad, 2000; Granger, Dagneaux, Meunier, & Paquot, 2002; Römer, 2011; Sinclair, 2004). "A corpus is a collection of texts which is used for linguistic analysis. These texts are generally assumed to be representative of a given language, dialect, or other subset of a language" (Girgin, 2011, p. 1). A large and carefully gathered corpus can be a very useful resource for understanding how different languages are learned. It can also help to improve the learning process (Pravec, 2002).
The 1960s and 1970s were the heyday of language error analysis. According to Kotsyuk (2015), language teachers and researchers at that time investigated non-electronic collections of texts. This traditional error analysis had numerous drawbacks. Only certain types and a limited number of errors could be recorded and extracted, and the rest of the material was not taken into account. Modern learner corpora, or computer-based corpora, in contrast, have a whole set of essential advantages. Using computers and the internet, a much larger amount of data can be analyzed, and the analysis is more accurate and time efficient. The cutting-edge technology provides the possibility "not only to fix language errors, but to make conclusions as to learners' speech in general and look critically at existing teaching methods, syllabuses, and teaching materials" (Kotsyuk, 2015, p. 391). Dagneaux, Denness, and Granger (1998) propose that Computer-aided Error Analysis (CEA) is a new approach to the analysis of language learner errors which is hoped to give new impetus to Error Analysis research and re-establish it as an important area of study. Díaz-Negrillo (2006) links the advent of CEA to EA, suggesting that "CEA finds its origins in the methodology of EA, which enjoyed both popularity and severe attack during the 70s" (p. 84). The emergence of computer learner corpora in the early 1990s enabled researchers and educators to carry out computer-aided error analysis. CEA was developed to overcome most of the weaknesses of traditional EA, and in Abe's (2007) terms, it has an advantage in the storing and processing of enormous amounts of information about various aspects of learner language.
According to Díaz-Negrillo (2006), learner corpora are used to investigate computerized learner language so as to gain insights into foreign/second language education. One of the most important approaches that can be applied to this type of research is CEA, which, in general terms, consists of the study of learner errors as contained in a learner corpus (Díaz-Negrillo, 2006). Jichun (2015) stated that "Corpus-based Error Analysis makes it possible for research workers not only to analyze what is wrong but also to describe what is right. Linguists can observe the language produced by EFL learners in contrast to that uttered by native speakers" (p. 255).
By studying learners' errors, the difficulties involved in learning a foreign language can be predicted. Teachers can therefore be informed of the areas that are most difficult for students, so that they can pay special attention to them and emphasize them (Kotsyuk, 2015). By analyzing errors in learners' production, it is possible to distinguish the strategies that foreign language learners employ in the learning process. By identifying and classifying errors in corpora, it is possible to design instructional materials that are more locally oriented for learners of a specific mother tongue in a specific context (Hou, 2016). The present study aimed to investigate different types of errors in EFL learners' writing. To achieve this aim, the following research question was explored:
• What are the most frequent types of errors committed by EFL learners in IELTS essay writing?

The learner corpus
The learner corpus chosen for the study consisted of IELTS essay writing samples from the accredited IELTS website (IELTSblog.com). This website was originally created by Simone Braverman in 2005; over time IELTS-Blog.com became a large resource. It is filled with information about recent IELTS exams, along with tips and advice provided by writers and contributors from all over the world. It also has abundant links to other helpful IELTS resources. The website contains about 70 IELTS essay writing samples from band 5 to band 8. On the homepage, the user can choose from these samples, which are accompanied by corrections.

The coding scheme
This study adopted the coding scheme from Dagneaux et al. (1998), Chuang and Nesi (2006), and Hou (2016). Dagneaux et al. (1998) and Chuang and Nesi (2006) used a hierarchical error tagging system in which error tags consisted of one major category code and a series of sub-codes. There were seven major category codes: formal, grammatical, lexical-grammatical, lexical, register, word redundant/word missing/word order, and style. The major category codes in Hou's (2016) study were article error, preposition error, noun error, verb form, word form, spelling, punctuation, word misuse, insertion, and deletion.
The coding system in this study was developed from the learner corpus by integrating these systems. It contained 13 major category codes: Insertion, Preposition error, Deletion, Sentence structure error, Word choice, Spelling errors, Confusing sentences and unclear expressions, Punctuation, Article error, Noun error, Verb form, Word form, and Linking words (connective word, definitive statement, and linking word errors). Table 1 shows the classification of errors that was used in this study as the coding scheme.
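
For readers who wish to apply a comparable scheme computationally, the 13 major category codes can be represented as a fixed set of tags attached to each annotated error. The following is only an illustrative Python sketch and not the tooling used in this study; the names `ErrorCategory` and `AnnotatedError` are hypothetical, and the example record is a verb form error quoted in the Results section.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorCategory(Enum):
    """The 13 major category codes named in the coding scheme."""
    INSERTION = "Insertion"
    PREPOSITION = "Preposition error"
    DELETION = "Deletion"
    SENTENCE_STRUCTURE = "Sentence structure error"
    WORD_CHOICE = "Word choice"
    SPELLING = "Spelling errors"
    CONFUSING_UNCLEAR = "Confusing sentences and unclear expressions"
    PUNCTUATION = "Punctuation"
    ARTICLE = "Article error"
    NOUN = "Noun error"
    VERB_FORM = "Verb form"
    WORD_FORM = "Word form"
    LINKING_WORDS = "Linking words"


@dataclass(frozen=True)
class AnnotatedError:
    """One tagged error: the erroneous form, its correction, and its code."""
    erroneous: str
    corrected: str
    category: ErrorCategory


# Hypothetical record built from a learner sentence quoted in the Results.
example = AnnotatedError("can solved", "can solve", ErrorCategory.VERB_FORM)
print(len(ErrorCategory), example.category.value)
```

Using a closed enumeration rather than free-text labels means that a mistyped category name fails immediately instead of silently creating a fourteenth category.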

Procedure
In order to analyze the corpus, a three-stage process was followed. First, the texts were turned into a list of word forms in the order of their first occurrence, with the frequency of each noted; second, the word forms were sorted alphabetically; and third, they were sorted by frequency. The errors covered major writing variables, including lexical knowledge, grammar, discourse (cohesion and coherence), mechanics (punctuation and spelling), and content richness. Errors were categorized and classified based on the error coding scheme. The learner corpus was developed by examining about 70 IELTS essays, accompanied by corrections and raters' comments, obtained from the accredited website.
Table 2 provides a sample of errors, their corrected forms, and their coding. These are examples of errors that have been identified from the IELTS essays written by EFL learners.
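
The three-stage word-list process described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function name `word_list`, the lowercasing step, and the simple alphabetic tokenizer are all assumptions, and the two input sentences are learner sentences quoted in the Results section.

```python
import re
from collections import Counter


def word_list(texts):
    """Return (word form, frequency) pairs in order of first occurrence."""
    counts = Counter()
    for text in texts:
        # Assumed tokenizer: lowercase alphabetic tokens, with straight
        # apostrophes allowed inside a word.
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))
    # Counter keeps keys in insertion order, i.e. order of first occurrence.
    return list(counts.items())


essays = [
    "However, the government can solved these problems in many ways.",
    "If the nations wants to be progressive it is very important that "
    "the people are more educated and progressive.",
]

# Stage 1: word forms in order of first occurrence, with frequencies.
first_occurrence = word_list(essays)
# Stage 2: the same list sorted alphabetically.
alphabetical = sorted(first_occurrence)
# Stage 3: sorted by descending frequency (ties broken alphabetically).
by_frequency = sorted(first_occurrence, key=lambda item: (-item[1], item[0]))

print(by_frequency[:3])
```

A word list of this kind only locates candidate forms; assigning each error to a category in the coding scheme remains a manual, qualitative step.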

Design
The method used in this study was qualitative. The study followed a descriptive design in that it computed the frequency of the various errors committed by IELTS test-takers. According to Macmillan and Schumacher (1993, p. 35), a study using a descriptive design "simply describes an existing phenomenon by using numbers to characterize individual or group." They further argued that in descriptive research there is no manipulation of subjects; the researcher measures things as they are.

Results
Errors were categorized and classified based on the error coding scheme presented in Table 1. Afterwards, the frequency of each error type was calculated to identify the most common errors committed by the learners in IELTS essays. The frequency of the errors in the IELTS examinees' performance, together with the percentage and cumulative percentage of the errors, is reported in Table 3. The errors in each category are arranged from the most frequent to the least frequent. In total, 589 errors were observed in the writing performance of the IELTS examinees. As Table 3 shows, the most frequently observed error category was word choice (n = 144), which accounted for about 24.4% of the errors. Verb form errors occurred 104 times, accounting for 17.7% of all errors. Spelling errors (n = 73) and noun errors (n = 53) were the next two most frequent categories. These four error types were the ones found most often in the data. Punctuation (n = 14) and linking words (n = 7) were the least frequently committed errors. The rest of the errors are listed in the table.
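
The percentages and cumulative percentages reported above follow directly from the counts and the total of 589 errors. The short Python sketch below recomputes them for the four most frequent categories only, since the counts for the remaining categories appear only in Table 3; the names `TOTAL_ERRORS`, `percentage`, and `frequency_table` are hypothetical.

```python
TOTAL_ERRORS = 589  # total number of errors reported in the Results

# Only the four most frequent categories, with the counts stated in the text.
top_four_counts = {
    "Word choice": 144,
    "Verb form": 104,
    "Spelling": 73,
    "Noun": 53,
}


def percentage(count, total=TOTAL_ERRORS):
    """Share of all errors, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def frequency_table(counts, total=TOTAL_ERRORS):
    """Rows of (category, n, percent, cumulative percent), most frequent first."""
    rows = []
    running = 0
    for category, n in sorted(counts.items(), key=lambda item: -item[1]):
        running += n
        rows.append((category, n, percentage(n, total), percentage(running, total)))
    return rows


for row in frequency_table(top_four_counts):
    print(row)
```

Run on these counts, the sketch gives 24.4% for word choice and 17.7% for verb form, matching the figures in the text, and shows that the four most frequent categories together account for 63.5% of the 589 errors.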
Here are some examples of word choice errors (the most frequent errors in the learner corpus) that were extracted from the learner corpus:
(1) Students can get better understanding of the destination* (host) countries.
(2) In order to defense* (sustain) our life, the governments every countries should tackle this issue.
(3) Every person learns something new according to* (depending on) their age, experience, knowledge and education.
(4) The positive things that this globalization process have brought must sensible* (compensate) us for the negative sides.
From Table 3, it is apparent that word choice errors are the most frequent errors in the learner corpus. Therefore, it can be concluded that the EFL learners do not have sufficient mastery of vocabulary. As Flowerdew (2003) mentioned, learners' problems are not so much incorrect English or bad English, but rather insufficient English. They made many mistakes in word choice.
For verb form errors (the second most frequent errors in the learner corpus), the following are examples that were extracted from the learner corpus.
(1) Moreover it should be considerate* (considered) that the social effects of talking about money and finance in a socially diverse school class can be harmful for some students.
(2) However, the government can solved* (solve) these problems in many ways.
(4) If the nations wants* (want) to be progressive it is very important that the people are more educated and progressive.
For spelling errors (the third most frequent errors in the learner corpus), the following are examples that were extracted from the learner corpus.
(1) Not only this, but also by giving importance to education, the nations can get rid of problems like illiteracy* (illiteracy), poverty, unemployment and population growth that delay the progress of a nation.
(2) The Internet is a convenient* (convenient) way of getting information, as long as your mobile phone is connected or you possess a laptop.

Discussion
The current study aimed at identifying the common errors in EFL learners' IELTS writing tests and categorizing them according to error type. According to Corder (1981), errors are important in three ways. Firstly, teachers can be informed of the learners' progress and the areas needing more practice. Secondly, errors provide researchers with evidence of the strategies and processes of language learning. Finally, they are beneficial to learners when they use errors as devices for further learning.
According to Ellis (2008), classification of errors helps us in recognizing and analyzing learners' language problems at any stage of their development. Since conclusions from single experiments cannot be generalized, "it is important to set up longitudinal researches that would help practitioners gain insights into longstanding effects" (Cotos, 2014, p. 219). In fact, more empirical studies are needed to determine whether or not using learner corpora is effective in improving second and foreign language writing. Based on the analysis of students' errors in writing, the present researchers identified 13 error categories and calculated the frequency of each error type to identify the most common errors committed by learners in IELTS essays.
The results of this study are in line with Nesselhauf's (2004) claim that with learner corpora, many aspects can be investigated at the same time, and more general questions, such as the relative frequency of different types of errors, can be addressed.
The findings are similar to those of Hou (2016) and Kobayashi (2014), who found practical implications for improving students' writing by exploring lexical and grammatical error patterns produced by EFL learners. Based on the identified error patterns, it can reasonably be argued that EFL teachers and learners need to notice where the problems arise and work to eliminate the errors by understanding their possible sources.
These results agree with Jichun's (2015) study, which suggested that linguistic errors were the most distinctive errors among Chinese EFL learners. Participants in Nzama's (2015) study most frequently committed errors in the use of auxiliaries, tenses, concords, articles, prepositions, pronouns, plurals, mother tongue interference, infinitives, and auxiliaries with past tense. In addition, Agustina and Juning's (2015) study showed that three types of errors occurred most frequently: a) misformation in the use of tense forms, b) omission of noun/verb inflections, and c) clauses that contained unnecessary phrases. Another study by Dagneaux et al. (1996) showed that the three most frequent grammatical errors committed by learners were verb forms, article errors, and preposition errors.
The researchers concluded that those errors were committed by the students because of their limited vocabulary, as they just wrote down the words they knew without concern for whether the words or their meanings were appropriate. These results may seem slightly different from those of the present study because those studies used different categorizations and taxonomies in analyzing the errors.
In a similar study, Sarfraz (2011) examined the errors in a corpus of 50 English essays written by 50 EFL/ESL learners. The researcher followed Ellis's (1994) procedural analysis of errors: collection of a sample of learner language, identification of errors, description of errors, explanation of errors, and evaluation of errors. The results indicated that the percentage of interlanguage errors was higher than that of errors resulting from the interference of the learners' mother tongue.
In an Iranian setting, Omidipour (2014) investigated the writing skills of 40 adult Persian-speaking learners. The results of the study showed that most errors in participants' compositions resulted from inadequate lexical knowledge, misuse of prepositions and pronouns, seriously misspelled lexical items, and faulty lexical choice. These results closely support the results of the present study, which found that word choice, verb form, spelling, and noun errors, in that order, were the errors most frequently made by EFL learners. Furthermore, punctuation and linking word errors were the least frequently committed errors by the IELTS examinees.
One possible explanation for the above-mentioned results is that, in the current educational system in Iran, most teachers and administrators follow traditional approaches to teaching L2 grammar and writing and are unfamiliar with, or unwilling to employ, recent corpus-based approaches to teaching L2 writing. Despite all the innovations in teaching and learning, many educators still follow deductive approaches to teaching grammatical rules and do not pay sufficient attention to pragmatic approaches and to language use rather than usage.
Another reason can be associated with L1 interference and learners' unfamiliarity with the pragmatic aspects of the target language. L1 interference can lead learners to an inappropriate choice and use of nouns and verbs, which makes their essays unintelligible and unacceptable. Moreover, learners usually memorize words without considering the contexts in which they are appropriately used. The frequency of spelling errors can also be explained by learners' inattention to the spelling of words and their close attention to word meaning when memorizing them.

Conclusion
Based on the findings, it can be concluded that there is a substantial difference in the frequencies of different types of errors. As reported, most of the errors EFL students committed in their IELTS essay writing were related to finding appropriate words (word choice) and to verb form. This result offers the important insight that students have difficulty choosing natural words and correct verb forms while writing in English. They simply used the words they knew without considering the appropriateness of the words or their meanings. Iranian EFL teachers should consider this in their L2 writing instruction. Learners should also be wary of the vocabulary items they choose for their compositions and be careful when using words with which they are unfamiliar.
Noun errors and spelling errors were among the most frequently committed errors after word choice and verb form errors, as presented in the Results section. It has been reported that some L2 learners frequently commit glaring errors, whether in the use of nouns or simply in the spelling of words. It can be concluded here that Iranian EFL teachers need to address these areas more efficiently and that students need to attend to them more earnestly in their writing process. Preposition errors, sentence structure errors, and word form errors are next in the list of most frequently committed errors and need to be addressed in L2 writing classes.
It is also worth mentioning that punctuation and linking word errors were the least frequently committed errors by the IELTS examinees. Considering the importance of punctuation and linking words in essay writing, it is suggested that these areas not be ignored in language teaching classes. Other less frequent errors were deletion errors, confusing sentences and unclear expressions, article errors, and insertion errors.
Examining learner errors could be an initial step in familiarizing teachers with learners' language, but it is only a first stage in uncovering the different nuances that learning an L2 involves. L2 writing instructors' knowledge of these errors and the frequency with which they occur can help them prioritize the problematic areas and attend to them more efficiently. Through prioritizing, learners' linguistic needs can be more easily spotted and their writing problems can be dealt with promptly. The findings also suggest that incorporating learners' common errors and difficulties, as uncovered from a learner corpus, into instruction can be of great help to EFL learners in noticing potentially problematic features and overcoming their learning difficulties. Future researchers are encouraged to undertake a more comprehensive analysis of the sources of errors, their interpretation in context, and their dependence on learners' proficiency level and native language in the Iranian EFL setting.

Table 1. The Coding Scheme of the Present Study

Table 2. Sample Error Correction and Coding

Table 3. Frequency and Percentage of Errors Committed in IELTS Essays