Towards an Automated Approach to Text Summarization

Until the last quarter of the 20th century, language teaching and learning was influenced by two tendencies: the comparative linguistics studies of the 19th century and the theories and methods of descriptive linguistics (Crystal 1994). A new perspective now stems from recent advances in information technology, which have revolutionized language pedagogy to the point that most areas of language study have been profoundly affected. In phonetics, for instance, a new generation of instrumentation is gaining currency in auditory, acoustic and articulatory research (Crystal 1994).

In graphology, image scanners enable large quantities of text to be processed quickly, and image-enhancing techniques magnify obscure graphic patterns in old manuscripts. In grammar, huge corpora of spoken and written language make it possible to study structures in unprecedented detail and variety (McEnery & Wilson, 1994). Similarly, discourse analysis both motivates and benefits from research in human-computer interaction. Other areas affected include ESL research, sociolinguistics, child language acquisition studies and corpus linguistics (Meskil & Rangelova 1997; Beauvois 1998).

In this paper, we highlight the shortcomings of the traditional approach to summary writing, which is rooted in the descriptive tradition of epistemology, and illustrate a language technology tool called TOPICALIZER as a complementary language teaching strategy for language teachers.

Shortcomings of the Traditional Approach to Text Summarization

The traditional approach to summary writing in a second language class, once students have read a text, involves a number of tasks. These tasks include reading the text more than once, making notes on the main points and expanding on the notes.

Chambers and Brigham (1989) argue that pedagogic defects exist in the traditional approach to the teaching of precis or summary, and they advocate a paradigm shift from the traditional praxis to a new approach called the Deletion Technique Approach (DTA). DTA is an approach to text summarization that involves “taking the original text and simply discarding any nonessential sentences, clauses, phrases or words” (Chambers and Brigham 1989). The procedure outlined for using the approach involves five (5) major steps:

a) Reading the passage and deleting all elaborations of the topic sentences.
b) Deleting all unnecessary clauses and phrases.
c) Deleting all extraneous lexis.
d) Replacing the remaining words with the students' own expression.
e) Writing a fair copy.

The pedagogic import of the deletion approach is predicated on its ability to be divided into sub-skills that can be graded and practiced independently by learners. A further rationale is that it is a systematic and regularized approach that gives teachers and students the opportunity to apply the same principles to any passage.

With the deletion technique approach, students are active participants, in contrast to their passivity when the traditional approach is adopted (Chambers & Brigham, 1989; Lopez). From the foregoing, it is clear that DTA is an elaboration of the traditional approach, in which students follow a set of instructions: a) read the passage; b) make notes on the main points; c) expand the notes. A similar insight on the categorization of intra-sentential relationships is used in the development of the deletion technique methodology (Hoey, 1983).

When the deletion technique approach was tested to verify its efficacy and subsequent application, it was observed that students performed below expectation because of a lack of shared cultural content and of interest in the text (Kamai & Batrobass, 2012).

An Experiment on Teaching Summary Writing

Method

The subject, instrument and procedure are described below.

Subject

The subjects were two hundred and five (205) advanced-level non-native speakers of English in their second year of schooling. Students were assigned to one treatment condition.

Instrument

Materials for the test were two reading texts adapted from the ordinary level examination question paper (NECO, 2000). The test calls for an understanding of lexical items and intra-sentential relationships, followed by the reproduction of the text in the students’ own words. The texts were judged to have a fair level of syntactic and semantic difficulty and shared cultural content. Treatment effects were examined using simple percentages.

Procedure

Students were taught precis using the deletion technique approach (DTA) for a semester. The five steps outlined in the approach were carefully followed.

Diverse examples were used relative to the demands of the technique.

Results

Table 1 shows the result of a summary writing test given to two hundred and five (205) advanced-level students after they had been taught using the deletion technique approach.

Table 1. Internal frequency distribution of scores of students in precis

S/no   Class interval   Frequency
1.     0-5              15
2.     6-10             10
3.     11-15            13
4.     16-20            15
5.     21-25            5
6.     26-30            5
7.     31-35            5
8.     36-40            135
9.     41-50            2
10.    51-60            0

As Table 1 indicates, two (2) students got C, one hundred and thirty-five (135) got E and seventy-four (74) got F.

The mean is 25, the median is 38 and the mode is 40. Because only a few students reached the highest scores (45-50), the distribution is skewed. The median, alongside the mode, is therefore probably the most accurate indicator of achievement in precis. What this implies is that, as with the traditional approach, underachievement is prevalent among ESL students taught with the deletion technique approach.

Furthermore, the deletion technique is not without its shortcomings. In the course of applying this technique in the classroom, it was observed that: a) some students do not know what to delete and what not to delete; b) most students lack an understanding of lexical items and intra-sentential relationships; and c) some students have problems comprehending the text. In view of this, underachievement in precis among ESL students in Nigeria is not due to the shortcoming(s) of a particular praxis but rather to other factors, such as the intellectual ability of learners and the learning strategies adopted; these are possible variables that can result in underachievement.
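As a rough cross-check on the summary statistics for Table 1, the sketch below estimates the mean, the median class and the modal class directly from the grouped frequencies. It assumes class midpoints stand in for individual scores, so the mean it produces is only an estimate and need not equal a mean computed from the raw scores, which are not reported here. The function names are illustrative.

```python
# Table 1 as (lower bound, upper bound, frequency) for each class interval.
TABLE_1 = [
    (0, 5, 15), (6, 10, 10), (11, 15, 13), (16, 20, 15), (21, 25, 5),
    (26, 30, 5), (31, 35, 5), (36, 40, 135), (41, 50, 2), (51, 60, 0),
]


def grouped_mean(table):
    """Estimate the mean by weighting each class midpoint by its frequency."""
    total = sum(freq for _, _, freq in table)
    weighted = sum((lo + hi) / 2 * freq for lo, hi, freq in table)
    return weighted / total


def median_class(table):
    """Return the class interval in which the cumulative frequency reaches half the total."""
    total = sum(freq for _, _, freq in table)
    cumulative = 0
    for lo, hi, freq in table:
        cumulative += freq
        if cumulative >= total / 2:
            return lo, hi
    raise ValueError("empty table")


def modal_class(table):
    """Return the class interval with the highest frequency."""
    lo, hi, _ = max(table, key=lambda row: row[2])
    return lo, hi


if __name__ == "__main__":
    print("students:", sum(freq for _, _, freq in TABLE_1))
    print("estimated mean:", round(grouped_mean(TABLE_1), 2))
    print("median class:", median_class(TABLE_1))
    print("modal class:", modal_class(TABLE_1))
```

Both the median class and the modal class fall in the 36-40 interval, which agrees with the reported median of 38 and mode of 40.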

This not only contradicts the strength claimed for DTA but also re-echoes Krashen’s success-achievement generating principles (see Krashen, 1982). This article therefore speculates that, in Nigerian schools, underachievement in precis is not due to the adoption of a particular praxis, as the proponents of DTA suggest. The findings also imply that achievement in text summarization does not depend on the particular praxis adopted; rather, factors such as the intellectual abilities of learners, their knowledge of the world and the learning strategy adopted are possible causes of underachievement.

Language Technology as a Complementary Teaching Strategy

THE TOPICALIZER

Topicalizer is a language technology tool for topic extraction, textual analysis and abstract generation (Wilsmansanni, 2003). This technology is important to the language learner as a complementary learning strategy that can augment the role of the teacher in the language learning process. In what follows, we illustrate how the TOPICALIZER can be used to teach summary writing. Text A is an abstract of a research article. The sample text has a readability index of 10.1 on the Gunning-Fog Index, while its Automated Readability Index is 7.25. Its readability according to the Coleman-Liau Index is 20.07, and its average readability is 12.74. The abstract will be used as a reference text to teach students summary writing.

TEXT A

THE PURPOSE OF THE STUDY WAS TO VERIFY THE EFFICACY OF COOPERATIVE LEARNING STRATEGY IN MAXIMIZING LANGUAGE LEARNING. THE STUDY, SPECIFICALLY, EXAMINED THE RELATIONSHIP BETWEEN COOPERATIVE LEARNING AND STUDENTS’ PERFORMANCE. BASED ON THE RESULT OF ONE HUNDRED AND EIGHTY (180) STUDENTS FROM TWO SCHOOLS.
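The three readability indices quoted for Text A follow widely published formulas, which can be sketched as below. This is a minimal illustration, not Topicalizer's own implementation: the tokenization rules and the vowel-group syllable counter are assumptions, so the values it prints for Text A will not necessarily match the figures the tool reports.

```python
import re

# Text A, taken from the abstract above and set in sentence case.
TEXT_A = (
    "The purpose of the study was to verify the efficacy of cooperative "
    "learning strategy in maximizing language learning. The study, "
    "specifically, examined the relationship between cooperative learning "
    "and students' performance. Based on the result of one hundred and "
    "eighty (180) students from two schools."
)


def text_counts(text):
    """Return (sentences, words, characters); assumes a non-empty text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", text)
    chars = sum(len(w) for w in words)
    return len(sentences), len(words), chars


def syllables(word):
    """Assumed heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def automated_readability_index(text):
    n_sent, n_words, n_chars = text_counts(text)
    return 4.71 * n_chars / n_words + 0.5 * n_words / n_sent - 21.43


def coleman_liau_index(text):
    n_sent, n_words, n_chars = text_counts(text)
    letters_per_100 = 100 * n_chars / n_words
    sentences_per_100 = 100 * n_sent / n_words
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def gunning_fog_index(text):
    n_sent, n_words, _ = text_counts(text)
    words = re.findall(r"[A-Za-z0-9]+", text)
    # "Complex" words are taken to be those with three or more syllables.
    complex_words = sum(1 for w in words if syllables(w) >= 3)
    return 0.4 * (n_words / n_sent + 100 * complex_words / n_words)


if __name__ == "__main__":
    for name, index in [
        ("Gunning-Fog", gunning_fog_index),
        ("Automated Readability Index", automated_readability_index),
        ("Coleman-Liau", coleman_liau_index),
    ]:
        print(f"{name}: {index(TEXT_A):.2f}")
```

In all three indices a higher value indicates a text that is harder to read, which is how the scores can guide a teacher in choosing a reference text for a given class.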


Using Topicalizer requires a computer and the internet as teaching aids. Where these are not available, the teacher can show students a previous online session using a projector, as can be seen below. The teacher distributes to all students a text that has been converted into running text and entered as plain text into Box A. A language for the text analysis is then selected in Box B. The next step is to click on the command prompt (Box C), which automatically initiates the textual analysis.

The following results come from the analysis of the abstract.

[Figure: Architecture of Topicalizer, showing Box A, Box B (Language) and Box C]

DATA GENERATED BY THE TOPICALIZER BASED ON TEXT A

Analysis for text
Language: English, character set: utf-8
________________________________________
Lexical analysis
Number of words (tokens): 11
Number of distinct words (types): 8
Average number of words per sentence: 0
Average number of words per paragraph: 0
Lexical density: 0.73
Average number of characters per word: 6.09
Average number of syllables per word: 2.45
Longest word: ‘cooperative’ (11 characters)
Shortest word: ‘text’ (4 characters)
Ten most frequent words: learning (2), efficacy (1), language (1), plain (1), maximising (1), text (1), cooperative (1), enter (1)
Most frequent words: learning (2)
________________________________________
Phrasal analysis
Ten most frequent two-word phrases: plain text (1), maximising language (1), enter plain (1), text efficacy (1), language learning (1), cooperative learning (1)
Ten most frequent three-word phrases: plain text efficacy (1), enter plain text (1), maximising language learning (1)
Ten most frequent two-word phrases, including stop words: text efficacy (1), plain text (1), of cooperative (1), maximising language (1), enter plain (1), in maximising (1), learning in (1), language learning (1), cooperative learning (1), efficacy of (1)
Ten most frequent three-word phrases, including stop words: learning in maximizing (1), text efficacy of (1), maximising language learning (1), enter plain text (1), cooperative learning in (1), of cooperative learning (1), efficacy of cooperative (1), plain text efficacy (1), in maximising language (1)
Most frequent two-, three-, four- and five-word phrases (with and without stop words): no entries
________________________________________
Textual analysis
Number of paragraphs: 1
Number of sentences: 0
Average number of sentences per paragraph: 0.0
Longest sentence: ” (0 words)
Shortest sentence: ” (1000 words)
Readability according to Gunning-Fog Index (the higher, the harder to read): 10.1
Readability according to Automated Readability Index (the higher, the harder to read): 7.25
Readability according to Coleman-Liau Index (the higher, the harder to read): 20.07
Average readability (the higher, the harder to read): 12.74
Suggested keywords: learning in maximizing; Learning; Text; maximising language learning; cooperative learning in; Cooperative; maximising language; efficacy of cooperative; enter plain; in maximising language; plain text; Efficacy; text efficacy of; language learning; of cooperative learning; plain text efficacy; Language; Plain; cooperative learning; text efficacy; Maximizing; enter plain text; Enter
________________________________________

APPLICATION OF TOPICALIZER IN THE LANGUAGE CLASS

Based on the computerized data generated, the language teacher can teach the following aspects of summary writing: the identification of keywords and the formation of titles. The data generated by the lexical and phrasal analysis can be used to identify the keyword(s) and to form a title for Text A. For instance, students will be asked to identify the most frequently used words and their frequency.
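The lexical and phrasal counts in the Topicalizer output can be reproduced with a few lines of code, which may help teachers see where the keyword lists come from. The sketch below is not Topicalizer's implementation. It rests on three assumptions: the input string is pieced together from the phrases listed in the output, the stop-word list contains only "of" and "in", and lexical density is taken as distinct content words divided by all tokens, which is what the reported 8 types, 11 tokens and 0.73 suggest.

```python
import re
from collections import Counter

# Assumption: input reconstructed from the phrases listed in the output.
SAMPLE = ("Enter plain text Efficacy of cooperative learning "
          "in maximising language learning")

# Assumption: a two-word stop list; a real tool would use a much longer one.
STOP_WORDS = {"of", "in"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """All runs of n consecutive tokens, joined into phrases."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def content_runs(tokens):
    """Split the token sequence into stretches that contain no stop words."""
    runs, current = [], []
    for tok in tokens:
        if tok in STOP_WORDS:
            if current:
                runs.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        runs.append(current)
    return runs


def content_ngrams(tokens, n):
    """Phrases of n words that do not cross a stop word."""
    return [g for run in content_runs(tokens) for g in ngrams(run, n)]


tokens = tokenize(SAMPLE)
content = [t for t in tokens if t not in STOP_WORDS]
word_freq = Counter(content)
lexical_density = len(set(content)) / len(tokens)

if __name__ == "__main__":
    print("tokens:", len(tokens), "types:", len(set(content)))
    print("lexical density:", round(lexical_density, 2))
    print("most frequent words:", word_freq.most_common(3))
    print("two-word phrases:", content_ngrams(tokens, 2))
    print("three-word phrases:", content_ngrams(tokens, 3))
    print("two-word phrases, including stop words:", ngrams(tokens, 2))
```

Under these assumptions the counts agree with the figures in the output: "learning" is the only word that occurs twice, and the phrase lists with and without stop words contain the same entries.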

The words coloured in green are: efficacy, cooperative, language, learning and maximising.

Lexical analysis
Ten most frequent words: learning (2), efficacy (1), language (1), maximising (1), cooperative (1)
Most frequent words: learning (2)

The second step is to ask students to identify, from the phrasal analysis generated by the computer, the most frequently used two-word phrases, three-word phrases, four-word phrases and so on. Examples generated by the computer include the phrases in yellow.

Ten most frequent three-word phrases, including stop words: cooperative learning in; of cooperative learning; efficacy of cooperative; in maximising language

The last step is to ask students to form titles for Text A from the keywords, based on the lexical and phrasal analysis generated by the computer. Finally, the teacher compares the students’ versions of the title with the one generated by the computer and the one printed in the journal article.

Conclusion

This paper has illustrated how a language teacher can use a language technology tool called Topicalizer to teach summary writing. The discussion was premised on the role electronic media can play in language pedagogy.

Although the role of the mass media is paramount in developed parts of the world, language teachers in this part of the world can also experiment with it in their language classes.