2010 RelationalRetrievalUsingaCombin

From GM-RKB

Subject Headings: Random Walk with Restart, Entity Relation Graph, Link Prediction, Learned Similarity Measure.

Notes

Cited By

Quotes

Author Keywords

Abstract

Scientific literature with rich metadata can be represented as a labeled directed graph. This graph representation enables a number of scientific tasks such as ad hoc retrieval or named entity recognition (NER) to be formulated as typed proximity queries in the graph. One popular proximity measure is called Random Walk with Restart (RWR), and much work has been done on the supervised learning of RWR measures by associating each edge label with a parameter. In this paper, we describe a novel learnable proximity measure which instead uses one weight per edge label sequence: proximity is defined by a weighted combination of simple "path experts", each corresponding to following a particular sequence of labeled edges. Experiments on eight tasks in two subdomains of biology show that the new learning method significantly outperforms the RWR model (both trained and untrained). We also extend the method to support two additional types of experts to model intrinsic properties of entities: query-independent experts, which generalize the PageRank measure, and popular entity experts which allow rankings to be adjusted for particular entities that are especially important.
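The "path expert" idea in the abstract can be illustrated with a minimal sketch. It assumes a toy graph, hypothetical edge labels (`cites`, `written_by`, `wrote`), and hand-picked weights; in the paper the weights are learned, and nothing below reproduces the authors' implementation. Each expert follows one fixed sequence of edge labels, choosing uniformly among matching edges at each step, and the final proximity is the weighted sum of the resulting distributions.

```python
from collections import defaultdict

# Toy labeled directed graph: (node, edge_label) -> list of target nodes.
# Node names, labels, and weights are all hypothetical, for illustration only.
EDGES = {
    ("q_paper", "cites"): ["p1", "p2"],
    ("q_paper", "written_by"): ["alice"],
    ("alice", "wrote"): ["p2", "p3"],
    ("p1", "cites"): ["p3"],
    ("p2", "cites"): ["p3"],
}

def path_walk(source, path):
    """Distribution over nodes reached from `source` by following the
    edge labels in `path`, picking uniformly among matching edges."""
    dist = {source: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            targets = EDGES.get((node, label), [])
            for target in targets:
                nxt[target] += prob / len(targets)
        dist = dict(nxt)  # mass at nodes with no matching edge is dropped
    return dist

def score(source, weighted_paths):
    """Weighted combination of path experts: one weight per label sequence."""
    total = defaultdict(float)
    for path, weight in weighted_paths:
        for node, prob in path_walk(source, path).items():
            total[node] += weight * prob
    return dict(total)

# One weight per edge label sequence (assumed values, not learned here).
WEIGHTED_PATHS = [
    (("cites",), 1.0),
    (("cites", "cites"), 0.25),
    (("written_by", "wrote"), 2.0),
]

scores = score("q_paper", WEIGHTED_PATHS)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With these assumed weights, the path through the shared author contributes most, so a paper reachable both by citation and by authorship ranks first.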

5 Conclusion and future work

We proposed a novel method for learning a weighted combination of path-constrained random walkers, which is able to discover and leverage complex path features of relational retrieval data. We also evaluated the impact of using query-independent path features and popular entity features, which can model per-entity characteristics. Our experiments on several recommendation and retrieval tasks involving scientific publications show that the proposed method can significantly outperform traditional models based on random walk with restart.
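For contrast, the random-walk-with-restart baseline referred to here can be sketched by power iteration. This is a minimal, untrained version under stated assumptions: the toy adjacency list, the restart probability of 0.2, and the choice to send dead-end mass back to the source are all illustrative, and edge labels are ignored (the supervised variants described in the abstract attach a parameter to each edge label).

```python
def random_walk_with_restart(adj, source, restart=0.2, iters=100):
    """Approximate the RWR distribution from `source` by power iteration.

    adj maps every node to a list of out-neighbors (each neighbor must
    also be a key of adj). `restart` is an assumed restart probability.
    """
    nodes = list(adj)
    p = {n: 0.0 for n in nodes}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        nxt[source] += restart  # restart step: jump back to the source
        for node, mass in p.items():
            out = adj[node]
            if out:
                for neighbor in out:
                    nxt[neighbor] += (1 - restart) * mass / len(out)
            else:
                # Dead end: return the walking mass to the source (an assumption).
                nxt[source] += (1 - restart) * mass
        p = nxt
    return p

# Hypothetical toy graph: q links to a and b, both of which link to c.
ADJ = {"q": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
rwr = random_walk_with_restart(ADJ, "q")
print(sorted(rwr, key=rwr.get, reverse=True))
```

Unlike the path-constrained walkers, this measure has no notion of which label sequence led to a node, which is the limitation the proposed method addresses.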

We are very interested in generalizing from simple relations to hyper-relations, which are mappings from possibly more than one source type. For example, there is much incentive to express the AND relation (Balmin et al. 2004): consider the task of finding papers that are both written by a certain author and recent. However, model complexity will be a major concern, so an efficient structure selection algorithm is essential to making such a system practical.

Furthermore, we are interested in algorithms that introduce new entities and edges to the graph. This could be useful for improving retrieval quality or efficiency. For example, new entities can represent subtopics of research interest, and new links can represent memberships of words, authors, or papers in these subtopics. In this way, a model might be able to replace some of the long paths seen in the experiments with relatively shorter and more effective paths associated with the introduced structures.

References

Author: William W. Cohen, Ni Lao
Title: Relational Retrieval Using a Combination of Path-constrained Random Walks
DOI: 10.1007/s10994-010-5205-8
Year: 2010