Title: DataTalk-V: time series visualisation for internet of things based on clustering and dimension reduction for an IoT platform
Authors: Jiun-Yi Lin; Yun-Wei Lin; Yi-Bing Lin
Addresses: Digital Technology Transformation Office, China Medical University Hospital, No. 2, Yude Rd., North Dist., Taichung City 404327, Taiwan; College of Artificial Intelligence, National Yang Ming Chiao Tung University, No. 301, Sec. 2, Gaofa 3rd Rd., Guiren Dist., Tainan City 711, Taiwan; Department of Computer Science, National Yang Ming Chiao Tung University, 1001 University Road, Hsinchu, 300, Taiwan
Abstract: Making sense of ever-growing collections of time-series data is challenging. Data mining approaches have been developed to process and analyse such data and to extract valuable insights and knowledge from it. Dimension reduction (DR) is commonly employed for this purpose. Selecting appropriate hyperparameter values and measuring visualisation quality are critical to ensuring that a DR visualisation is useful. To enhance DR further, we propose integrating it with pseudo labels generated by clustering techniques. This paper presents DataTalk Visualisation (DataTalk-V), an algorithm for visualising time-series data. DataTalk-V automatically clusters high-dimensional data and selects the hyperparameters of the DR method, producing two-dimensional data. DataTalk-V is built on IoTtalk, an IoT application development platform, and uses a cost function within Bayesian optimisation to optimise the DR hyperparameters. We demonstrate that the two-dimensional data produced by DataTalk-V not only facilitates data visualisation but also enhances the prediction accuracy of the k-nearest neighbours (k-NN) algorithm. We also apply the DR model generated by DataTalk-V to analyse the sensitivity of features from soil samples, and show that it successfully predicts the correlation of these features with their respective machine learning models.
Keywords: Bayesian optimisation; clustering; dimension reduction; DR; hyperparameter tuning; time series; visualisation; k-nearest neighbours; k-NN.
DOI: 10.1504/IJSNET.2024.136688
International Journal of Sensor Networks, 2024 Vol.44 No.2, pp.63-73
Received: 17 Aug 2023
Accepted: 22 Aug 2023
Published online: 16 Feb 2024
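
The hyperparameter selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual cost function or DR method, so everything here is an assumption made for illustration: the score is the leave-one-out k-NN agreement between a two-dimensional embedding and the clustering pseudo labels, the candidate embeddings are hand-written toy data keyed by a hypothetical hyperparameter value, and an exhaustive search over the candidates stands in for Bayesian optimisation. The names `knn_agreement` and `select_embedding` are hypothetical and do not come from DataTalk-V.

```python
import math
from collections import Counter


def knn_agreement(points, pseudo_labels, k=3):
    """Leave-one-out k-NN accuracy of 2-D points against pseudo labels.

    Assumed stand-in for a visualisation-quality score: a higher value
    means points with the same pseudo label sit closer together.
    """
    hits = 0
    for i, p in enumerate(points):
        # Distance from point i to every other point, nearest first.
        others = sorted(
            (math.dist(p, q), pseudo_labels[j])
            for j, q in enumerate(points)
            if j != i
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in others[:k])
        predicted = votes.most_common(1)[0][0]
        hits += predicted == pseudo_labels[i]
    return hits / len(points)


def select_embedding(candidates, pseudo_labels, k=3):
    """Return (hyperparameter, score) for the best-scoring candidate.

    Exhaustive search over the given candidates; it stands in for
    Bayesian optimisation, which would propose values adaptively.
    """
    scored = {h: knn_agreement(pts, pseudo_labels, k) for h, pts in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Toy data: pseudo labels as a clustering step might produce them.
labels = [0, 0, 0, 1, 1, 1]

# Hypothetical embeddings for two hyperparameter values (5 and 30).
candidates = {
    5: [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)],  # clusters separated
    30: [(0, 0), (2, 0), (4, 0), (1, 0), (3, 0), (5, 0)],       # clusters interleaved
}

best = select_embedding(candidates, labels)
print(best)  # (5, 1.0)
```

In this toy case the embedding whose pseudo-label groups are well separated scores 1.0 and the interleaved one scores 0.0, so the first hyperparameter value is selected. A real pipeline would generate each candidate by running the DR method with the proposed hyperparameters rather than reading it from a fixed dictionary.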