Research on born-digital image text extraction based on conditional random field
by Zhang Jian; Cheng RenHong; Wang Kai; Zhao Hong
International Journal of High Performance Systems Architecture (IJHPSA), Vol. 5, No. 1, 2014

Abstract: As the number of digital videos and images in e-mails and web pages grows rapidly, text extraction from images has become more important than ever. Born-digital images are generated directly on a computer, and the text they contain is important for understanding their semantics. Although many methods for extracting text from natural scene images have been proposed in recent years, text detection and extraction from born-digital images remains a challenge. This paper proposes a novel method for segmenting text connected components (CCs) from a born-digital image. First, binarisation based on wavelet theory is applied to the given image to obtain all candidate text CCs. Second, the extracted CCs are classified with a conditional random field (CRF), a probabilistic graphical model widely used in natural language processing, to label the text CCs. Experimental results show that the proposed method can effectively extract text from born-digital images.

Online publication date: Sat, 12-Jul-2014
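
Illustrative sketch: the first stage described in the abstract (binarise the image, then collect candidate CCs) can be pictured with the minimal Python example below. It is not the authors' implementation. The abstract does not give the details of the wavelet-based binarisation, so a fixed global threshold is swapped in as a stand-in, and the CRF classification stage is not implemented; the sketch stops at computing simple per-component features that such a classifier might take as input. The function names (`binarise`, `connected_components`, `cc_features`), the threshold value, and the choice of features are all assumptions made for illustration.

```python
from collections import deque


def binarise(gray, threshold=128):
    """Mark pixels darker than `threshold` as foreground (1).

    Assumption: a fixed global threshold stands in for the paper's
    wavelet-based binarisation, which the abstract does not detail.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(binary):
    """Return candidate CCs as lists of (row, col) pixels (8-connectivity)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def cc_features(pixels):
    """Hypothetical per-CC features a later classifier could consume."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    box_h = max(ys) - min(ys) + 1
    box_w = max(xs) - min(xs) + 1
    return {
        "width": box_w,
        "height": box_h,
        "aspect_ratio": box_w / box_h,
        "fill": len(pixels) / (box_w * box_h),  # share of bounding box covered
    }
```

Calling `connected_components(binarise(gray))` on a greyscale image given as a list of rows yields the candidate components, and `cc_features` summarises each one. In the method the abstract describes, a CRF would then label each candidate as text or non-text.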
