2006
Siddiqui, Tanveer J; Tiwary, Uma Shanker
A hybrid model to improve relevance in document retrieval
Journal Article
Journal of Digital Information Management, 4, pp. 73-81, 2006, (Cited By (since 1996): 2; Export Date: 10 July 2014).
Tags: CG-based retrieval model, High precision information retrieval model, Intelligent retrieval, Two-stage retrieval model

@article{36,
  title = {A hybrid model to improve relevance in document retrieval},
  author = {Tanveer J. Siddiqui and Uma Shanker Tiwary},
  url = {http://www.scopus.com/inward/record.url?eid=2-s2.0-33748252460&partnerID=40&md5=28fb109a6982827d4057e4947164916d},
  year = {2006},
  date = {2006-00-01},
  journal = {Journal of Digital Information Management},
  volume = {4},
  pages = {73--81},
  abstract = {In information retrieval community a lot of work is focused on increasing efficiency by capturing statistical features. The other dominant approach is to improve the relevance by capturing the semantic and contextual information which is invariably inefficient. Generally the two approaches are assumed to be diametrically opposite. In this paper we have tried to combine the two approaches by proposing a hybrid information retrieval model. The model works in two stages. The first stage is a statistical model and the second stage is based on semantics. We have first downsized the document collection for a given query using vector model and then used a conceptual graph (CG) based representation to rank the documents. Our main objective is to investigate the use of conceptual graphs as a precision tool in the second stage. The use of CGs brings semantic in the ranking process resulting in improved relevance. Three experiments have been conducted to demonstrate the feasibility and usefulness of our model. A test run is made on CACM-3204 collection. We observed 34.8% increase in precision for a subset of CACM queries. The second experiment is performed on a test collection specifically designed to test the strength of our model in situation where the same terms are being used in different context. Improved relevance has been observed in this case also. The application of this approach on results retrieved from LYCOS shown significant improvement. The proposed model is both efficient, scalable and domain independent.},
  note = {Cited By (since 1996): 2; Export Date: 10 July 2014},
  keywords = {CG-based retrieval model, High precision information retrieval model, Intelligent retrieval, Two-stage retrieval model},
  pubstate = {published},
  tppubtype = {article}
}