Authors: Satyen Abrol; Latifur Khan; Fahad T. Bin Muhaya
Addresses: Department of Computer Science, Erik Jonsson School of Engineering and Computer Science, University of Texas at Dallas, P.O. Box 830688, EC 31, 75083-0688, Richardson, TX, USA; Department of Computer Science, Erik Jonsson School of Engineering and Computer Science, University of Texas at Dallas, P.O. Box 830688, EC 31, 75083-0688, Richardson, TX, USA; Management Information System Department, College of Business Administration, King Saud University, P.O. Box 2459, Riyadh, Saudi Arabia
Abstract: People searching for housing have different needs (e.g., the elderly look for safe, quiet neighbourhoods, while students look for affordable apartments close to their university or school). Craigslist, for example, currently has no map view, which makes apartment searching long and laborious. This creates a need for software that is significantly superior to current web search tools. We demonstrate the development of a tool, MapIt, which plots Craigslist apartment listings on Google Maps. MapIt then integrates this functionality with information gathered by location-based extraction from various web sources, such as the city police blotter, which makes apartment searching simpler and faster and helps the user make a better decision. The paper also discusses the challenges faced in the development process: the raw and unstructured nature of the documents, the existence of geo/non-geo and geo-geo ambiguities, and our approach to identifying the location of an apartment from informal text (geo-parsing and geo-tagging of content) to ensure maximum coverage of the listings.
Keywords: information retrieval; text mining; natural language processing; gazetteers; geo-parsing; disambiguation; geo-tagging; housing location; knowledge discovery; apartment searching; apartments; house hunting.
International Journal of Data Mining, Modelling and Management, 2013 Vol.5 No.1, pp.57 - 75
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 07 Feb 2013