Title: Big data analytics: an improved method for large-scale fabrics detection based on feature importance analysis from cascaded representation

Authors: Ming-Hu Wu; Song Cai; Chun-Yan Zeng; Zhi-Feng Wang; Nan Zhao; Li Zhu; Juan Wang

Addresses (all authors): Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Wuhan, China; Department of Digital Media Technology, Central China Normal University, Wuhan, China

Abstract: To address the curse of dimensionality and data imbalance in large-scale fabric data, this paper proposes a fabric image classification method based on feature fusion and feature selection. First, a representation learning model based on transfer learning is established to extract semantic features from fabric images. Then, the features generated by the different models are cascaded (concatenated) so that they complement one another. Next, extremely randomised trees (Extra-Trees) are used to analyse the importance of the cascaded representation and to select features, reducing the computation time of the classification model on big, high-dimensional data. Finally, a multilayer perceptron classifies the selected features. Experimental results demonstrate that the method detects fabrics with high accuracy, and that feature importance analysis effectively accelerates detection when the model processes big data.
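
The pipeline described in the abstract (cascade, rank by Extra-Trees importance, select, classify with a multilayer perceptron) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, and it substitutes random synthetic arrays for the features that pretrained networks would produce. The feature sizes, the number of trees, the `top_k` cut-off and the MLP layer size are all hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption) for semantic features that two different
# pretrained models would extract from the same fabric images.
n_samples, n_classes = 300, 3
labels = rng.integers(0, n_classes, size=n_samples)
feats_a = rng.normal(size=(n_samples, 64))
feats_b = rng.normal(size=(n_samples, 32))
# Make a few dimensions of each block class-dependent so that there is
# something for the importance analysis to find.
feats_a[:, :4] += 2.0 * labels[:, None]
feats_b[:, :4] += 2.0 * labels[:, None]

# Step 1: cascade (concatenate) the two representations per image.
cascaded = np.hstack([feats_a, feats_b])

X_train, X_test, y_train, y_test = train_test_split(
    cascaded, labels, test_size=0.3, random_state=0, stratify=labels
)

# Step 2: fit Extra-Trees and read the per-feature importance scores.
forest = ExtraTreesClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
importances = forest.feature_importances_

# Step 3: keep only the top_k most important columns (top_k is hypothetical).
top_k = 16
keep = np.argsort(importances)[::-1][:top_k]
X_train_sel = X_train[:, keep]
X_test_sel = X_test[:, keep]

# Step 4: classify the selected features with a multilayer perceptron.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
mlp.fit(X_train_sel, y_train)
accuracy = mlp.score(X_test_sel, y_test)

print(f"cascaded dims: {cascaded.shape[1]}, selected dims: {len(keep)}")
print(f"test accuracy on synthetic data: {accuracy:.3f}")
```

Because the classifier sees only `top_k` columns instead of the full cascaded vector, its input layer shrinks accordingly, which is the kind of saving the abstract attributes to feature importance analysis. The accuracy printed here reflects only the synthetic data and says nothing about the paper's results.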

Keywords: big data; representation learning; feature fusion; feature selection.

DOI: 10.1504/IJGUC.2021.112483

International Journal of Grid and Utility Computing, 2021 Vol.12 No.1, pp.81 - 93

Received: 14 Jan 2020
Accepted: 01 May 2020

Published online: 19 Jan 2021
