
 

 

Title: Construction of linefeed insertion rules for lecture transcript and their evaluation

 

Authors: Masaki Murata, Tomohiro Ohno, Shigeki Matsubara

 

Addresses:
Department of Systems and Social Informatics, Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku 464-8601, Japan.
Graduate School of International Development, Nagoya University, Furo-cho, Chikusa-ku 464-8601, Japan.
Information Technology Center, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan

 

Abstract: A captioning system that supports the real-time understanding of monologue speech, such as lectures and commentaries, is needed. Because sentences in monologues tend to be long, a single sentence is often displayed across multiple lines on the screen. In that case, linefeeds must be inserted into the text so that it is easy to read. This paper proposes a rule-based technique for inserting linefeeds into Japanese spoken monologue sentences as an elemental technique for generating readable captions. Our method inserts linefeeds into a sentence by applying rules based on morphemes, dependencies and clause boundaries. We established the rules by closely examining a corpus annotated with linefeeds. An experiment using a Japanese monologue corpus has shown the effectiveness of our rules.
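
The abstract does not list the rules themselves, so the following is only a minimal sketch of what rule-based linefeed insertion over clause boundaries and dependencies could look like. Everything in it is an assumption made for illustration, not the authors' rule set: the `Bunsetsu` record, its two boolean flags, the `insert_linefeeds` function, the `max_chars` limit, and the two toy rules. It assumes that morphological, dependency and clause-boundary analysis has already been done upstream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bunsetsu:
    """Hypothetical pre-analysed phrase unit (not the paper's data format)."""
    text: str               # surface string of the unit
    clause_boundary: bool   # assumed flag: a clause ends right after this unit
    depends_on_next: bool   # assumed flag: this unit modifies the very next unit


def insert_linefeeds(units: List[Bunsetsu], max_chars: int = 20) -> List[str]:
    """Split a sentence into caption lines using two illustrative rules."""
    lines: List[str] = []
    current = ""
    for i, unit in enumerate(units):
        # Length constraint: start a new line if adding this unit would
        # exceed max_chars. In this sketch it overrides the rules below.
        if current and len(current) + len(unit.text) > max_chars:
            lines.append(current)
            current = ""
        current += unit.text

        is_last = i == len(units) - 1
        # Toy rule 1: prefer a linefeed right after a clause boundary.
        # Toy rule 2: never break between a unit and the adjacent unit
        # it modifies.
        if not is_last and unit.clause_boundary and not unit.depends_on_next:
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return lines


# Hand-annotated example sentence (annotations are made up for the demo).
units = [
    Bunsetsu("私が", False, False),
    Bunsetsu("調べたところ", True, False),
    Bunsetsu("この", False, True),
    Bunsetsu("方法は", False, False),
    Bunsetsu("有効でした", True, False),
]
print("\n".join(insert_linefeeds(units, max_chars=10)))
# 私が調べたところ
# この方法は有効でした
```

In this sketch the clause boundary after 「調べたところ」 triggers the linefeed, while the length limit acts only as a fallback. A real rule set would presumably also consult morpheme-level information, which is omitted here.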

 

Keywords: spoken language; sentence analysis; real-time captioning; clause boundaries; speech corpus; linefeed insertion rules; monologue speech; lectures; commentaries; linefeeds; Japanese language; spoken monologues; readable captions; morphemes; dependencies.

 

DOI: 10.1504/IJKWI.2010.034189

 

Int. J. of Knowledge and Web Intelligence, 2010 Vol.1, No.3/4, pp.227 - 242

 

Available online: 17 Jul 2010

 

 
