Title: Multi-level spatial attention network for image data segmentation

Authors: Jun Guo; Zhixiong Jiang; Dingchao Jiang

Addresses: School of Software, Quanzhou University of Information Engineering, Quanzhou, Fujian, China; School of Business, Shanghai Dianji University, Shanghai, China; School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China

Abstract: Deep learning models for semantic image segmentation rely on hierarchical architectures to extract features, which causes a loss of contextual and spatial information. In this paper, a new attention-based network, MSANet, built on an encoder-decoder structure, is proposed for image data segmentation to aggregate contextual features from different levels and reconstruct spatial characteristics efficiently. To model long-range spatial dependencies among features, a multi-level spatial attention module (MSAM) is presented that processes multi-level features in the encoder network and captures global contextual information. In this way, the model learns multi-level spatial dependencies between features through the MSAM and hierarchical representations of the input image through the stacked convolutional layers, making it more capable of producing accurate segmentation results. The proposed network is evaluated on the PASCAL VOC 2012 and Cityscapes datasets. Results show that it outperforms U-Net, FCNs, and DeepLabv3.
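The abstract gives no implementation details for the MSAM. As a rough illustration of the general position-wise spatial attention mechanism it describes (pairwise affinities between all spatial locations, used to aggregate features globally), the sketch below uses NumPy; all function names, projection matrices, and shapes are assumptions for illustration, not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat, wq, wk, wv):
    """Generic position-attention sketch.

    feat: (C, H, W) feature map; wq/wk/wv: (C, C) projection matrices
    (hypothetical parameters, stand-ins for learned 1x1 convolutions).
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                       # flatten spatial dims: (C, N)
    q, k, v = wq @ x, wk @ x, wv @ x                 # query/key/value projections
    attn = softmax(q.T @ k / np.sqrt(c), axis=-1)    # (N, N) affinity between positions
    out = v @ attn.T                                 # aggregate values over all positions
    return feat + out.reshape(c, h, w)               # residual connection

rng = np.random.default_rng(0)
c, h, w = 4, 3, 3
f = rng.standard_normal((c, h, w))
y = spatial_attention(f, *(rng.standard_normal((c, c)) for _ in range(3)))
print(y.shape)  # (4, 3, 3)
```

Because every output position attends to every input position, such a module can capture long-range dependencies that stacked local convolutions miss; a multi-level variant would apply this across encoder features at several resolutions.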

Keywords: deep learning; semantic segmentation; big data.

DOI: 10.1504/IJES.2021.116134

International Journal of Embedded Systems, 2021 Vol.14 No.3, pp.289 - 299

Received: 05 Dec 2020
Accepted: 11 Jan 2021

Published online: 12 Jul 2021