Linguistic Features of Humor in Academic Writing

A corpus of 313 freshman college essays was analyzed to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using factor analysis to provide an overall Humor component score for each essay in the corpus. Human raters also scored the essays for overall writing quality, and these scores correlated (r = .195) with the Humor component scores. Correlations between the Humor component scores and linguistic features were examined. To investigate the potential for linguistic features to predict the Humor component scores, a regression analysis was conducted; it identified four linguistic indices that accounted for approximately 17.5% of the variance in humor scores. These indices were related to text descriptiveness (i.e., more adjective and adverb use), lower cohesion (i.e., less paragraph-to-paragraph similarity), and lexical sophistication (i.e., lower word frequency). The findings suggest that humor can be partially predicted by linguistic features in the text. Furthermore, the small but significant correlation between the humor and essay quality scores suggests a positive relation between humor and writing quality.


Introduction
Academic writing and humor would seem an unlikely pairing, especially in contexts of higher education, where students are often ranked and sorted into classes based on diagnostic essays and SAT scores and where academic writing can have serious consequences for students' futures. Traditional advice for academic writing in the United States exhorts writers to compose with clarity and cohesion (e.g., American Psychological Association, 2010) and to respond to the social needs of the audience and surrounding contexts (Palmquist, 2010). Humor, on the other hand, relies on semantic incongruity, linguistic ambiguity, and the violation of pragmatic maxims (Attardo & Raskin, 1991). Thus, traditional advice may compel college writers to avoid humor, because being funny would demonstrate a purposeful lack of clarity and cohesion and disrespect the desires of the audience (i.e., teachers and professors) who tend to expect adherence to academic writing norms.
In contrast to academic writing, everyday language is replete with examples of play and humor. Creativity in language is an important method of communication employed not just by the literary and lyrical, but also by everyday people in everyday speech (Cook, 2000). Indeed, humor has many psychological and social benefits that can work to aid communication between interlocutors (Martin, 2007). Although humor may not serve the immediate rhetorical goals of academic writing, evidence of humor in academic writing would be reflective of this general tendency to be creative and playful when communicating. However, because no studies have investigated the potential role that humor might play in academic writing, the forms and functions of humor in academic writing remain relatively unknown.
As an initial investigation into this topic, our study investigates a corpus of college student academic writing that has been rated for writing quality, creativity, and humor. We take a computational approach to investigate these relations. Specifically, we use correlational and regression analyses to examine relations between linguistic features and humor ratings and the relation between humor and essay quality. Our study addresses the following research questions:
1. Are humor ratings related to ratings of essay quality?
2. Do linguistic features of academic writing (e.g., lexical, rhetorical, cohesive) correlate with ratings of humor in academic writing?
3. What amount of variance in essay humor ratings is accounted for by these linguistic features?
Flourishing Creativity & Literacy commonly used when assessing SAT essays. Collection and background of this corpus are further described in Crossley and McNamara (2011).

Human ratings
Two separate pairs of trained raters scored each essay using a rubric designed to assess either essay quality (holistically) or essay creativity (analytically). The holistic quality rubric was designed using a standardized rubric associated with the essay portion of the SAT test. The analytic creativity rubric contained seven subscales related to idea generation and style. Four subscales were related to idea generation (fluency, flexibility, originality, and elaboration), whereas three subscales were related to style (humor, metaphor and simile, and word play). Each subscale was rated on a scale of 1-6, with raters informed that the distance between each value on the scale was equal. The two rubrics are included in Appendix C. Raters possessed either master's or doctoral degrees in English, and all had at least two years of experience teaching writing at the university level. Each pair of raters first trained with a rubric using a practice set of 20 essays (not included in the MSU corpus) until they reached an inter-rater reliability of at least r = .60 for the analytic scores and r = .70 for the holistic scores (holistic scores generally reach a higher consensus and thus have a higher threshold). The raters then scored the remainder of the 313 essays independently. After the scoring was completed, differences between the raters' scores were calculated. If the difference was greater than two points for any subscale, the two raters adjudicated their scores, and the average score between the two raters was computed for each subscale. For the creativity rubric, this process brought most adjudicated scores down to a difference of two or less, but some scores remained at a difference of two or more. Correlations and Kappas for the raters' scores after adjudication are reported below.
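The rating procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the ratings are hypothetical, and the study's actual reliability figures are the ones reported in Table 1. The sketch computes a Pearson correlation between two raters, flags score pairs that differ by more than two points for adjudication, and averages the two raters' scores.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def needs_adjudication(score_a, score_b, max_gap=2):
    """True when two ratings differ by more than max_gap points."""
    return abs(score_a - score_b) > max_gap

def final_score(score_a, score_b):
    """Average of the two raters' scores for one subscale."""
    return (score_a + score_b) / 2

# Hypothetical ratings on the 1-6 scale (not data from the study).
rater_a = [1, 2, 5, 3, 6, 2]
rater_b = [2, 2, 1, 3, 5, 1]

# Indices of essays whose two ratings are more than two points apart.
flagged = [i for i, (a, b) in enumerate(zip(rater_a, rater_b))
           if needs_adjudication(a, b)]
reliability = pearson_r(rater_a, rater_b)
```

In this toy data only the third essay (index 2) would be sent back to the raters for adjudication.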

Linguistic variables
The indices we extracted from TAALES, TAACO, and WAT were pre-selected based on perceived and known links between humor and linguistic features. TAALES is a text analysis tool designed to measure the overall lexical sophistication of a text and includes over 150 different lexical measurements related to lexical frequency, lexical range, psycholinguistic word information, and academic language. TAACO measures the cohesion properties of a text by incorporating over 150 indices related to word overlap, type-token ratios, and use of connectives, as well as local (sentence-to-sentence) and global (paragraph-to-paragraph) measures of cohesion within a text. WAT is a text analysis tool designed to assess overall writing quality and includes a variety of writing-specific lexical, rhetorical, and cohesion indices. Specifically, WAT reports the incidence of certain lexical categories indicative of rhetorical style. These include exemplification, hedges, amplifiers, downtoners, copular verbs, and private and public verbs. WAT also uses latent semantic analysis (LSA; Landauer et al., 2007) to measure cohesion by calculating the semantic overlap (i.e., conceptually related words and phrases across a text) between sentences and paragraphs. In addition, WAT reports on a variety of indices related to lexical sophistication, key word use, and n-grams. The indices selected from these three tools are discussed below.

Basic text properties
We selected basic properties of the text, such as number of words per text, number of total lemmas per text, number of total word types per text, and average sentences per text because the length of the essay may be related to a greater probability for humor to be expressed. Basic text descriptive indices were calculated using WAT.

Grammatical and semantic word properties
We included the WAT word part of speech (POS) type indices related to incidences of pronoun types, verb types, adverbs, adjectives, and nouns because previous humor studies have shown that humor exhibits unique semantic features, such as human-centric language (Mihalcea et al., 2010) and descriptiveness (Reyes et al., 2012). Additionally, we included word indices from WAT designed to measure the overall incidences of negative or positive words in each text based on a number of investigations that have identified negative semantic meanings or polarity as indicative of humor (e.g., Campbell & Katz, 2012; Reyes et al., 2012).

Textual cohesion
We used indices related to semantic overlap, lexical diversity, and givenness reported by TAACO and WAT to capture textual cohesion in student essays based on previous results showing greater semantic distance of shared topics and themes in humorous texts (Mihalcea & Strapparava, 2006; Mihalcea et al., 2010). Because incongruity is widely recognized as an element of humor (Martin, 2007), we hypothesize that greater semantic distance between words, higher lexical diversity, and relatively less givenness within a text may be more predictive of humor.
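To make the idea of paragraph-to-paragraph semantic overlap concrete, the following sketch computes cosine similarity between adjacent paragraphs. It is a deliberately simplified stand-in: it uses raw bag-of-words vectors rather than latent semantic analysis, so it captures only literal word overlap and is not the measure computed by TAACO or WAT. All function names and example sentences are hypothetical.

```python
from collections import Counter
from math import sqrt
import re

def bag_of_words(text):
    """Lower-cased word counts for one paragraph."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def mean_adjacent_paragraph_similarity(paragraphs):
    """Average similarity of each paragraph to the one that follows it."""
    vectors = [bag_of_words(p) for p in paragraphs]
    sims = [cosine(vectors[i], vectors[i + 1])
            for i in range(len(vectors) - 1)]
    return sum(sims) / len(sims) if sims else 0.0

# Hypothetical two-paragraph "essays": one stays on topic, one shifts.
on_topic = ["Heroes help people.", "Heroes help people every day."]
shifted = ["Heroes help people.", "Batman litters soda cans."]

on_topic_score = mean_adjacent_paragraph_similarity(on_topic)
shifted_score = mean_adjacent_paragraph_similarity(shifted)
```

Under the hypothesis stated above, an essay that shifts topic between paragraphs would receive the lower score; an LSA-based index would additionally credit conceptually related words that do not match literally.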

Rhetorical devices
While all of these texts were written under the purview of an academic genre, we presume that student essays containing humor will contain fewer overt markers of academic writing. One way to measure this is through the frequency of rhetorical devices commonly associated with academic writing. Thus, we included indices that calculate the use of classic rhetorical phrases used to conclude an essay (e.g., "In closing") or to state a concluding opinion, such as "I think…" or "I believe…". These indices were calculated using WAT.

Word frequency
Measurements of word frequency indicate how often a particular word is used in a given corpus. Word frequency is typically provided for single words. In addition, frequency can also be calculated for n-grams (i.e., two or more words that frequently pattern together). While few studies have explicitly used word frequency measures in automated assessments of humor (cf. Reyes & Rosso, 2012, who included n-gram frequency in a computational model to predict ironic texts), success in the computational generation of humor has relied on the exploitation of simple, unambiguous lexical items in order to generate riddles, puns, and one-liner jokes (Ritchie, 2004). Therefore, we predict that humor in student essays will involve relatively frequent words. Word frequency measures were obtained with TAALES and WAT.
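As a rough illustration of what a content-word frequency index measures, the sketch below averages reference-corpus frequencies over the content words of a text. Everything in it is an assumption made for illustration: the frequency values are invented (real norms would come from a reference list such as Kucera-Francis), the function-word list is tiny, and TAALES computes its indices in its own way.

```python
import re

# Invented frequencies per million words, for illustration only.
TOY_FREQUENCIES = {
    "people": 800.0,
    "world": 700.0,
    "hydrant": 2.0,
    "vigilante": 1.0,
}

# A deliberately tiny, hypothetical function-word list.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "in", "near"}

def mean_content_word_frequency(text, freqs, function_words):
    """Mean reference frequency of the content words found in freqs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    values = [freqs[t] for t in tokens
              if t not in function_words and t in freqs]
    return sum(values) / len(values) if values else None

common = mean_content_word_frequency(
    "The people of the world", TOY_FREQUENCIES, FUNCTION_WORDS)
rare = mean_content_word_frequency(
    "The vigilante near the hydrant", TOY_FREQUENCIES, FUNCTION_WORDS)
```

A lower mean indicates that the text draws on less common vocabulary. Words missing from the reference list are skipped here, which is one of several possible design choices.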

Psycholinguistic properties of words
Several indices indicative of the psycholinguistic properties of words were included. These included word familiarity, imageability, concreteness, meaningfulness, and age of acquisition. To our knowledge, only one previous study of humor has considered this range of word properties: Skalicky and Crossley (2015) found that satire included more concrete words than did non-satire. Further evidence of the potential importance of these indices comes from studies in ironic and figurative language processing, which demonstrate that word salience (i.e., concreteness, familiarity) is crucial for ironic interpretations (Cronk & Schweigert, 1992). Because humor and irony are closely related (Simpson, 2003), we examine whether humorous texts include more familiar, imageable, concrete, and meaningful words. All measures of essays' psycholinguistic properties were calculated using TAALES and WAT.

Statistical Analysis
An exploratory factor analysis was conducted to examine relations between the seven analytic creativity subscales obtained from human raters and to develop weighted component scores based on co-occurrence factors in the ratings found in the creativity rubric (see Results section below). Results from that factor analysis revealed two factors: a Creativity component score and a Humor component score. Because the current study is primarily concerned with how humor is manifested linguistically in academic writing, only the Humor component score was analyzed further in this study. This Humor score was used as a dependent variable in a regression analysis to examine the potential for linguistic variables to explain humor in academic writing. For the selected variables described above, we first removed non-normally distributed indices. We then conducted correlations between the Humor component score and the remaining indices to assess which indices reported a meaningful and significant relation (p < .05 and r ≥ .10, the latter indicating at least a small effect size) with the Humor component.
Correlations amongst the indices that demonstrated a meaningful and significant relation were then checked for instances of multicollinearity. If any two indices were highly collinear (r > .90), only the index with the strongest relation to the Humor component score was retained. Finally, we discarded any of the remaining indices that we were unable to justify theoretically for inclusion. The remaining indices (n = 24) were entered as predictor variables into a stepwise multiple regression in order to explain the variance in the Humor component scores. Before carrying out the regression analysis, we divided the student essays into training and test sets using a 67/33 split (67% training, 33% test; Witten et al., 2011), which allowed for cross-validation of the regression model. If a model derived from a training set predicts the outcome variable in the test set at a similar accuracy rate as the training set, the regression model can be considered stable. We first obtained a model from the essays comprising the training set. We then applied that model to the test set to assess its predictive power and overall generalizability.
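The screening steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the index names, and the correlation values are hypothetical, and the random split shown need not reproduce the training and test set sizes used in the study.

```python
import random

def screen_indices(candidates, min_r=0.10, alpha=0.05):
    """Keep indices whose correlation with the outcome is at least
    min_r in absolute value and significant at alpha."""
    return {name: r for name, (r, p) in candidates.items()
            if abs(r) >= min_r and p < alpha}

def prune_collinear(outcome_corr, pair_corr, threshold=0.90):
    """For each highly collinear pair, drop the index with the weaker
    correlation to the outcome variable."""
    kept = set(outcome_corr)
    for (a, b), r in pair_corr.items():
        if a in kept and b in kept and abs(r) > threshold:
            weaker = a if abs(outcome_corr[a]) < abs(outcome_corr[b]) else b
            kept.discard(weaker)
    return sorted(kept)

def split_train_test(items, train_share=0.67, seed=0):
    """Shuffle the items and cut them into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (r, p) pairs for three candidate indices.
candidates = {
    "adverbs": (0.29, 0.001),
    "adverbs_alt": (0.20, 0.010),
    "nouns": (-0.15, 0.020),
    "pronouns": (0.04, 0.500),
}
screened = screen_indices(candidates)
retained = prune_collinear(
    screened,
    {("adverbs", "adverbs_alt"): 0.95, ("adverbs", "nouns"): -0.30},
)
train_ids, test_ids = split_train_test(range(313))
```

In this toy example "pronouns" fails the screening step and "adverbs_alt" is dropped because it is nearly redundant with "adverbs" but more weakly related to the outcome.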

Scoring subscales
An exploratory factor analysis was conducted using the human scores on the creativity rubric to investigate potential subscales for the ratings. A Bartlett's test of sphericity was statistically significant (p < .001), and the Kaiser-Meyer-Olkin measure of sampling adequacy reported .693, indicating underlying structures. The scree plot suggested the extraction of two factors, which was also supported by the percent of variance explained by the initial eigenvalues between the second and third factors. The principal axis factoring using a varimax rotation also identified two factors.
The items that loaded onto the first factor, which we labeled Creativity, were fluency, flexibility, elaboration, originality, and metaphor. The items that loaded onto the second factor, which we labeled Humor, were humor and word play. All items loaded onto their respective factors with loadings > .500 (see Table 2). The Creativity and Humor subscales were both calculated by weighting the items based on their Eigen weights in the factors and averaging these weighted scores across the items for each factor. For this study, we only focus on the Humor subscale, which was used in a subsequent regression analysis, along with the previously discussed linguistic variables, in order to examine the potential for language features to predict the presence of humor in the essays.
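One plausible reading of the weighting step is sketched below: each item score is multiplied by its weight on the factor, and the weighted scores are averaged. The weights and ratings here are hypothetical placeholders (the actual values belong to Table 2), and the exact computation used in the study may differ.

```python
def component_score(item_scores, weights):
    """Weight each item score by its factor weight and average the results."""
    weighted = [item_scores[item] * weight for item, weight in weights.items()]
    return sum(weighted) / len(weighted)

# Hypothetical weights for the two items on the Humor factor.
humor_weights = {"humor": 0.8, "word_play": 0.7}

# Hypothetical averaged ratings for one essay on the 1-6 scale.
essay_ratings = {"humor": 3.0, "word_play": 2.0, "fluency": 4.0}

humor_component = component_score(essay_ratings, humor_weights)
```

Only the items named in the weight dictionary contribute, so the fluency rating is ignored when computing the Humor component.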

Humor component scores and essay quality
The average humor component score for the essays was M = 1.96 (SD = 0.59). The average essay quality was M = 3.29 (SD = 0.98). The correlation between the Humor component scores and the holistic essay quality scores was r(313) = .195, p < .001, indicating a small (yet significant) relation (Cohen, 1992).
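The size of this relation can be checked with the standard t test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below assumes that n is the 313 essays in the corpus; the function names are hypothetical.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for a Pearson correlation r based on n observations."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def shared_variance(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r ** 2

t_value = t_from_r(0.195, 313)    # roughly 3.5 with 311 degrees of freedom
overlap = shared_variance(0.195)  # roughly 0.038
```

A t value of about 3.5 on 311 degrees of freedom is consistent with the reported p < .001, while the shared variance of under 4% shows why the relation is described as small.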

Correlations between humor component scores and linguistic indices
As an initial step to identify indices that best predict essays' Humor scores, we discarded indices that were non-normally distributed, were not theoretically related to humor, or did not demonstrate a significant correlation with the Humor component score (r ≥ .10, p < .05). The 34 remaining indices were then checked for multicollinearity. If any two indices were highly collinear (r > .90), the index with the weakest relation to the Humor component score was removed. This resulted in the removal of 10 additional indices, leaving a total of 24 linguistic indices. Correlations between these 24 indices and the Humor component score are displayed in Table 3.
The correlations between the Humor component score and the linguistic indices are generally weak. Collectively, however, they tell a coherent story. They indicate that the essays scored as more humorous are longer, are more descriptive (i.e., more adverbs, more adjectives, more infinitives, greater negativity, more verbs, fewer nouns, greater concreteness), use more distinctive, sophisticated language (i.e., more unique bigrams, less frequent content words), and are less cohesive (i.e., lower semantic similarity, greater lexical diversity, less overlap, fewer connectives, lower givenness, fewer conclusion words). Hence, on their own, the correlations provide some insight into the linguistic nature of the humor scores.

Regression analysis to predict humor component scores
When the stepwise regression model derived from the training set was applied to the test set, it yielded r = .419, R² = .175, indicating that the four predictor variables explained 17.5% of the variance in the Humor component score for the 115 essays in the test set, and that the model can therefore be considered stable.
Of the four significant predictor variables, three were reported by WAT (Incidence of adverbs, Incidence of adjective predicates, Semantic similarity: paragraph-to-paragraph) and one was reported by TAALES (Word frequency content words: Kucera-Francis). The first two of these variables were positive predictors of the Humor component score, meaning that as they increased, so did the Humor component score. The final two were negative predictors, meaning that as their scores decreased, the Humor component score increased. In other words, more adverbs and adjective predicates resulted in higher Humor component scores, whereas lower semantic similarity between paragraphs and lower word frequency resulted in higher Humor component scores. As such, the regression tells a similar story to the correlations: the essays with more humor were more descriptive (i.e., more adverbs, more adjectives), used more distinctive, sophisticated language (i.e., less frequent content words), and were less cohesive (i.e., lower semantic similarity).

Discussion
This study analyzed a corpus of undergraduate essays in order to better understand the linguistic forms and features of humor in student academic writing. In addition, we examined the relations between judgments of humor and essay quality. Because humor and creativity serve important roles in communication (Cook, 2000; Martin, 2007), it is important to understand the manner in which humor functions in academic writing, and whether or not humor and essay quality are linked. In general, our results indicate that four linguistic features are predictive of humor in academic writing. We also found a small but positive link between humor and essay quality. Our final model selected four linguistic indices which successfully accounted for 17.5% of the variance in Humor scores, suggesting that higher incidences of adverbs and adjective predicates and lower paragraph-to-paragraph semantic similarity and word frequency account for just under one fifth of the variance in the Humor score component. In the remainder of this section, we will discuss these indices in detail and provide examples from essays that loaded the highest into the Humor component score.
The index that contributed the most to the regression model was incidence of adverbs (8.7%), which loaded positively into the model, meaning that essays with higher Humor scores tended to contain higher numbers of adverbs. The following excerpt comes from an essay that received a Humor component score of 5.45 and essay quality score of 2.5 (both scales ranged from 1-6). The author was responding to a prompt on the nature of heroes and celebrities. Adverbs have been italicized for ease of identification:
"Anyway, heroes are cool because they don't even care what you think. They will just wake up and silently think to themselves, "Yep, it's time to be awesome today." They don't even exclaim that in their heads because that would be so unnecessary and foolish. Alternatively, celebrities wake up all scared and unsure of themselves hoping that the world will approve of them because, "I'm not sure I'll be awesome today...hope everything goes smoothly today and I don't crash my car into a fire hydrant while sneaking away at 3 a.m. to cheat on my wife!...Cause man that would stink."
This example demonstrates the author's frequent use of adverbs to modify verbs ("goes smoothly"), adjectives ("all scared"), and entire clauses ("Alternatively, …"). Semantically, adverbs are typically employed to express degree, convey attitudes, or modify actions (Biber et al., 2002). In this particular excerpt (and in other essays in the corpus), the adverbs function to qualify elements of the narrative characterization of heroes and celebrities in a manner that intensifies the actions described. The effect of such purposefully exaggerated narration is both comical and vernacular in tone. In this regard, the narration in the above excerpt is more descriptive, and mirrors spoken language, rather than academic registers.
The second-strongest index in our model was paragraph-to-paragraph semantic similarity, which added 4.4% more to our model's R² value. This index is a measure of cohesion that uses latent semantic analysis to calculate the semantic similarity among paragraphs within an essay. This index loaded negatively into our model, meaning essays with higher Humor scores tended to have lower paragraph-to-paragraph semantic similarity. In other words, funnier essays were more likely to contain paragraphs whose topics were semantically inconsistent relative to surrounding paragraphs' topics. Of the four indices in the model, semantic similarity has a direct relation to incongruity models of humor (Martin, 2007), as disruptions in the semantic cohesion of an essay may signal to a reader that a section of the essay should not be interpreted as academic writing but instead as a humorous aside.
As an example, the same essay quoted above demonstrated a lack of paragraph-to-paragraph similarity in paragraphs three and four of the essay (see Appendix A for the full essay).
"Heroes set out to decide whether or not they approve of the world. If not, they change it by any means necessary without resorting to celebrity-like tactics because that would so totally defeat the purpose of their heroic deeds. If everyone looked up to heroes then the world would have many fewer celebrities in the future. Everyone would become all modest, smart, strong, self-reliant and wise. That's a nice idea but if you think for a minute, fewer celebrities means fewer fools to laugh at which means fewer examples of what not to do. Normal people learn from their mistakes while wise people learn from the mistakes of fools.
Heroes may not always be popular with the law. Batman, for example, was constantly hunted by the police for being a vigilante and for littering. It is a little known fact that batman does not pick up his soda cans. This just goes to show that even though heroes have the best interest of the world in mind, they may not always be perfect themselves."
The discussion of Batman (a fictional comic book superhero) aligns well with the thesis and topic of this essay, but the author's decision to include this example of Batman's fictional misdemeanors to support the topic sentence of the paragraph is in stark contrast to the previous paragraph, which argued that heroes are distinct from celebrities using very straightforward and academic vocabulary. As a result, the above excerpt contains relatively anomalous lexical choices (e.g., "soda cans" and "littering") compared to the previous paragraph. In general, this essay is marked by the author's shifts between content topics and writing styles from paragraph to paragraph. Our model suggests that the humor in this essay may have thus been signaled in part by a lack of semantic cohesion between paragraphs.
Word frequency of content words was the third-strongest contributor in our model, explaining 3.2% of the total variance in Humor component scores. This index loaded negatively, suggesting that essays with higher Humor scores tended to have lower content word frequency. Content word is used here to refer to a noun, lexical verb, adjective, or adverb (as opposed to a function word, which typically expresses a grammatical relation, e.g., prepositions). Content words with relatively low word frequency in the essay quoted above were tactics, royal, transgression, hydrant, and vigilante, among others. Recall that the frequency of words is a measure of their relative use in language, meaning that less frequent words are less commonly encountered, and also more distinctive. Of course, infrequently encountered words are not inherently humorous. Rather, we would argue that authors who tend to use humor are using more distinctive language, and as such, are more likely to exhibit rich vocabularies, or lexical sophistication, for which the use of low-frequency language is a strong indicator.
Adjective predicates were the fourth and final significant contributor to the Humor component score in our model, explaining 1.8% of the variance in the overall model. This index loaded positively, suggesting that essays with higher Humor component scores contained a higher number of adjective predicates. Adjective predicates are single- or multiple-word adjective phrases that modify the subject of a sentence. As opposed to attributive adjectives (which almost always precede a noun phrase in English), adjective predicates are part of the main verb phrase in a clause and are typically preceded by a copular verb (e.g., be, seems, appears). Thus, in the sentence "The dog is brown," brown is an adjective predicate. The following sentence illustrates the use of adjective predicates from the essay quoted above: "Everyone would become all modest, smart, strong, self-reliant and wise." Here we see a string of adjective predicates (italicized above) following the copular verb become. Adjective predicates are further distinct from attributive adjectives in that their occurrence after the main verb makes them more likely to express new information about the subject of a sentence than previously given information (Chafe, 1976). In this regard, adjective predicates are syntactically poised to redefine a topic, rather than to merely modify it. One interpretation of the ability of adjective predicates to predict humor is that humorous academic texts are more likely to redefine their topic matter in a manner that is comical, surprising, or deprecating. Evidence of this tendency can be found in the example essay and throughout humorous essays in our corpus as a whole.
These findings have several implications. First, the linguistic features that emerged as significant in this study differ from those seen in other computational studies of humor (e.g., Carvalho et al., 2009; Mihalcea & Strapparava, 2005, 2006; Reyes et al., 2012; Skalicky & Crossley, 2015). This is not surprising, given that the humor analyzed here was markedly different from previously studied humor, such as one-liners, humorous quotes, or ironic tweets, and agrees with observations that feature sets from one descriptive study of humor may not match others (Reyes et al., 2010). Secondly, it may be that authors who employ humor in academic writing do so cautiously, aware of the exhortations to write concisely, directly, and to remain on point (e.g., American Psychological Association, 2010; Palmquist, 2010). As a result, linguistic features typical of academic writing remain dominant, even in more humorous essays. For example, the essay quoted above, despite receiving the highest Humor component score, still contains both the typical rhetorical organization of an academic essay, including opening, body, and concluding paragraphs, and a paragraph structure that includes both topic and concluding sentences. Furthermore, the primary function of humor in this essay was to provide humorous examples that served to support the author's overall thesis. Therefore, the essay demonstrates that it is possible to use humor to support the larger rhetorical demands of academic writing, although the low essay quality score for our exemplar essay demonstrates that it will not always be successful.
Moreover, despite wordplay's connection to the manipulation of linguistic forms and semantic meanings of words (Cook, 2000), measurements that might have captured linguistic features such as repetition, alliteration, and ambiguity did not account for a large percentage of the Humor component score. This suggests that for both wordplay and humor, raters may attend to other features of the texts not measurable by the text tools employed in this study. These may be larger rhetorical or pragma-linguistic devices, such as genre conventions or the voice of the author (Devitt, Reiff, & Bawarshi, 2004). Furthermore, actual incidences of explicit humor were relatively rare. No essays attempted humor through canned jokes or puns. Instead, humor was typically signaled through sarcasm, derisive comments about the subject matter, or fantastical descriptions of fictional characters. In other words, having a high Humor component score did not necessarily mean that the essay included jokes or attempted to be explicitly funny, but rather, that the raters perceived some elements of wordplay or humor in the essay that created a tone more accurately described as playful, whimsical, or wry.
Importantly, though, our results found a small positive correlation between essay quality and humor ratings (r = .195). This suggests that humor may be a contributing factor to holistic ratings of essay quality. In order to illustrate the positive correlation between humor and essay quality, we briefly discuss another essay from the corpus, which had an essay quality score of 6 and a humor score of 4 (see Appendix B for full essay). The prompt for this essay asked students to discuss the inherent tension between a desire to be unique and the reality that it is difficult to make truly unique contributions to the world. In this essay, the student employed irony, wordplay, and negative sarcastic evaluation. The student began the essay by stating that unoriginality is inevitable, and pointed out the irony inherent in constant recycling of styles in the fashion industry: "However, no matter how much effort the designers for Versace put into a gown, it is almost guarunteed that Chanel produced nearly the same dress twenty years ago."
However, when the student turned to focus on the context of a local university and town, a number of negative evaluations through sarcasm (which may result in humor depending on the reader) were apparent:
"More immediate examples of this principle can be seen on campus at [name of university]. One cannot turn a corner without seeing girls in Nike running shorts. These particular shorts were designed for exercising, not for sitting in class. It is a trend that was sponned by a sorority, probably as a joke, and unfortunately caught on to the point where it is the norm for girls here in [name of town] to walk around in gym shorts all day long. It would be understandable if they intended to work out after class, but from the looks of most of them they do not do much in the way of exercise. The fraternity trend is Ralph Lauren Polo shirts. Fraternity boys have a polo shirt in every color: long-sleeved, short-sleeved, no-sleeved. These shirts cost well over eighty dollars, so their parents are probably not happy that these shirts are the only acceptable form of clothing for fraternities."
In this example, the student opens with a jab targeted at other students who wear exercise clothes for purposes other than exercising, before implying that these same people are in need of exercise. The author then turns their ire towards fraternity styles, using a parallel play on the hyphenated adjectival "-sleeved" to joke that some polo shirts have no sleeves. The author ends the paragraph with the observation that parents must be upset over the high cost of this style.
What is interesting about this paragraph is that it serves two functions: to add support for the overall argument using examples, while at the same time mocking members of the author's local community. Both of the exemplar essays use humor as a means to support their claims. The difference between this essay and the previous essay is primarily in the humorous example that is used. In the first essay, the fictional superhero Batman is discussed, whereas in this essay, the author targets real members of the local community. It may be that the function of humor in this second essay worked to build rapport between the essay rater and author (a recognized function of humor; Martin, 2007), especially if the essay rater shared similar feelings towards members of fraternities or those who wear exercise clothing outside of a gymnasium. However, the humor in this essay was also more congruent with the rest of the writing, unlike the first example, and the author was better able to cloak the humor behind the typical diction of academic writing. It may be, then, that humor does have a place in academic writing, but only if students employ it carefully and subtly.

Conclusion
In this study, we have demonstrated the ability to predict a portion of the variance in raters' perceptions of humor and wordplay in academic writing. This task is challenging because academic writing is not a genre in which humor would be expected to occur. Nonetheless, we have offered initial evidence suggesting that humor or wordplay in academic writing may be signaled via descriptive language, such as adverbs and adjective predicates, along with a lack of semantic cohesion between paragraphs and the use of more sophisticated words. We have also demonstrated a small yet significant relation between the use of humor in academic essays and human perceptions of essay quality, one that warrants further investigation.
To our knowledge, no student is expressly instructed to be funny in academic writing. Yet, as this analysis demonstrates, student attempts at humor in academic writing do occur. While we have identified some of the linguistic forms and functions of humor in student essays, further research is needed to investigate the attested relation between essay quality and humor. The features identified here can also be used in future studies examining a wider range of contexts and writing proficiency in order to contribute to a better understanding of how humor functions in academic writing.
Correlations do not address the question of which of those features in the Humor component scores influence judgments made by human raters. To address this question, a stepwise regression was conducted to assess which of the 24 indices collectively explained the variance in the Humor component score. The regression model, F(4, 193) = 10.650, p < .001, r = .425, R² = .181, demonstrated that four predictor variables explained 18% of the variance for the 198 essays in the training set (see Table 4).
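The reported regression statistics can be cross-checked with two textbook identities: R² is the square of the multiple correlation r, and F = (R² / k) / ((1 - R²) / (n - k - 1)) for k predictors and n cases. The sketch below is only an arithmetic check with hypothetical function names; it uses the values reported in this paper, including the per-index contributions of 8.7%, 4.4%, 3.2%, and 1.8% given in the Discussion.

```python
def r_squared(r):
    """Variance explained, given the multiple correlation r."""
    return r ** 2

def f_statistic(r2, k, n):
    """F statistic for a regression with k predictors and n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

train_r2 = r_squared(0.425)                # about 0.181
test_r2 = r_squared(0.419)                 # about 0.176
f_value = f_statistic(0.181, k=4, n=198)   # about 10.66, df = (4, 193)

# Contribution of each index as reported in the Discussion.
step_gains = [0.087, 0.044, 0.032, 0.018]
total_gain = sum(step_gains)               # about 0.181
```

The recomputed values agree with the reported figures to within rounding, and the four per-index contributions sum to the training-set R² of .181.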

Table 1. Inter-rater reliability for essay scores.

Table 2. Factor analysis: Eigen loadings for components.

Table 3. Correlations between humor component score and computational indices.

Table 4. Stepwise regression analysis and significance values for linguistic indices predicting humor component scores.
Note: B = unstandardized coefficient; β = standardized coefficient; S.E. = standard error. Estimated constant term is 2.720; all t significant at p < .05.