Title: Semantic search for Japanese sentences based on sentence embedding

Authors: Yoshihiro Adachi; Minoru Uehara

Addresses: Research Institute of Industrial Technology, Toyo University, Japan; Graduate School of Information Sciences and Arts, Toyo University, Japan

Abstract: Semantic search represents the meaning of a sentence as a numerical vector, called an embedding, and retrieves sentences according to the similarity of their embeddings. For semantic search, the ability to process queries combined with logical operators is essential. We developed a technique for processing such queries that draws on fuzzy set operations. In this technique, AND and OR between queries take the minimum and maximum, respectively, of the similarities in the search results. A NOT operation on a query subtracts the similarity to the query from 1 and scales the result to obtain the desired search results. We also devised an example-based semantic search method in which the user specifies positive example sentences, which should be included in the search results, and negative example sentences, which should not; the method then returns results that match the user's intention as closely as possible.
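
The logical operators described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each query has already been scored against every sentence (for example by cosine similarity between embeddings), and the function names, the sentence IDs, and the `scale` parameter of the NOT operation are hypothetical. The abstract says only that the NOT result is scaled, so the scaling factor is left as a free parameter here.

```python
from typing import Dict, List, Tuple

# Sentence ID -> similarity between that sentence and one query.
# How the similarities are computed is outside this sketch.
Scores = Dict[str, float]


def fuzzy_and(a: Scores, b: Scores) -> Scores:
    """AND of two queries: per-sentence minimum of the two similarities."""
    return {k: min(a[k], b[k]) for k in a if k in b}


def fuzzy_or(a: Scores, b: Scores) -> Scores:
    """OR of two queries: per-sentence maximum of the two similarities."""
    return {k: max(a[k], b[k]) for k in a if k in b}


def fuzzy_not(a: Scores, scale: float = 1.0) -> Scores:
    """NOT of a query: (1 - similarity), multiplied by a scaling factor.

    The value of `scale` is an assumption of this sketch; the abstract
    does not state how the result is scaled.
    """
    return {k: scale * (1.0 - v) for k, v in a.items()}


def top_k(scores: Scores, k: int) -> List[Tuple[str, float]]:
    """Return the k sentences with the highest combined similarity."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Made-up similarities of three sentences to two hypothetical queries.
sim_q1: Scores = {"s1": 0.9, "s2": 0.4, "s3": 0.7}
sim_q2: Scores = {"s1": 0.2, "s2": 0.8, "s3": 0.6}

both = fuzzy_and(sim_q1, sim_q2)                    # q1 AND q2
either = fuzzy_or(sim_q1, sim_q2)                   # q1 OR q2
q1_not_q2 = fuzzy_and(sim_q1, fuzzy_not(sim_q2))    # q1 AND NOT q2

best_both = top_k(both, 1)[0][0]
best_q1_not_q2 = top_k(q1_not_q2, 1)[0][0]
```

With these made-up scores, the sentence ranked first for "q1 AND q2" is the one that is moderately similar to both queries, whereas "q1 AND NOT q2" favours the sentence that is close to the first query and far from the second.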

Keywords: semantic search; digital document; sentence embedding; query processing; logical operators; example-based search; dimensionality reduction.

DOI: 10.1504/IJWGS.2025.147144

International Journal of Web and Grid Services, 2025 Vol.21 No.2, pp.199 - 220

Received: 30 Sep 2024
Accepted: 18 Nov 2024

Published online: 10 Jul 2025
