Title: Document stream classification based on transfer learning using latent topics

Authors: Masato Shirai; Jianquan Liu; Takao Miura; Yi-Cheng Chen

Addresses: Research Center for Micro-Nano Technology, Hosei University, Tokyo, Japan ' System Platform Research Laboratories, NEC Corporation, Japan; Graduate School of Science and Engineering, Hosei University, Japan ' Dept. of Advanced Science, Hosei University, Japan ' Dept. of Computer Science and Information Engineering, Tamkang University, Taiwan

Abstract: In this investigation, we propose a framework for document stream classification based on transfer learning through a latent intermediate domain. In a document stream, word frequencies change dramatically as themes shift. To classify a document stream, we capture new features and modify the classification criteria as the stream progresses. Transfer learning utilises knowledge extracted from a source domain to analyse a target domain. We extract latent topics from unlabelled documents using a topic model. Our approach connects the domains through these latent topics to classify documents, and captures changes in features by updating the intermediate domain over the course of the stream.

Keywords: transfer learning; topic model; document stream classification.

DOI: 10.1504/IJBDI.2018.088290

International Journal of Big Data Intelligence, 2018 Vol.5 No.1/2, pp.105 - 113

Received: 22 Apr 2016
Accepted: 01 Dec 2016

Published online: 01 Dec 2017
