2014 MultimodalDistributionalSemanti


Subject Headings: Semantic Relatedness; Marco-Elia-Nam (MEN) Semantic Relatedness Benchmark; MEN Word Relatedness Score; MEN Word Relatedness Dataset.

Notes

Cited By

Quotes

Abstract

Distributional semantic models derive computational representations of word meaning from the patterns of co-occurrence of words in text. Such models have been a success story of computational linguistics, being able to provide reliable estimates of semantic relatedness for the many semantic tasks requiring them. However, distributional models extract meaning information exclusively from text, which is an extremely impoverished basis compared to the rich perceptual sources that ground human semantic knowledge. We address the lack of perceptual grounding of distributional models by exploiting computer vision techniques that automatically identify discrete “visual words” in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of images it is associated with. We propose a flexible architecture to integrate text- and image-based distributional information, and we show in a set of empirical tests that our integrated model is superior to the purely text-based approach, and it provides somewhat complementary semantic information with respect to the latter.
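The fusion strategy sketched in the abstract (extending a word's text-based co-occurrence vector with co-occurrence counts over "visual words" extracted from associated images, then estimating relatedness over the combined representation) can be illustrated with a minimal sketch. The mixing weight `alpha`, the toy counts, and the function names below are illustrative assumptions rather than the authors' implementation; the paper's actual feature extraction and combination details are described in Sections 3 and 4.

```python
import numpy as np

def combine(text_vec, image_vec, alpha=0.5):
    """Weighted concatenation of L2-normalized text- and image-based vectors.

    `alpha` is an illustrative mixing parameter (not necessarily the value
    used in the paper); each channel is normalized so neither dominates.
    """
    t = text_vec / np.linalg.norm(text_vec)
    v = image_vec / np.linalg.norm(image_vec)
    return np.concatenate([alpha * t, (1.0 - alpha) * v])

def cosine(a, b):
    """Cosine similarity, a standard relatedness estimate for such vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy, made-up co-occurrence counts for two words.
text = {"moon":  np.array([3.0, 0.0, 1.0]),   # counts with textual context words
        "night": np.array([2.0, 1.0, 1.0])}
img  = {"moon":  np.array([5.0, 1.0]),        # counts with image-derived "visual words"
        "night": np.array([4.0, 2.0])}

m = combine(text["moon"], img["moon"])
n = combine(text["night"], img["night"])
print(round(cosine(m, n), 3))
```

Setting `alpha` to 1.0 recovers a purely text-based model, so the same code also serves to compare the multimodal representation against the text-only baseline discussed in the abstract.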

1. Introduction

2. Background and Related Work

3. A Framework for Multimodal Distributional Semantics

4. Implementation Details

5. Experiments

6. Conclusion

Acknowledgements

We thank Jasper Uijlings for his valuable suggestions about the image analysis pipeline. A lot of code and many ideas came from Giang Binh Tran, and we owe Gemma Boleda many further ideas and useful comments. Peter Turney kindly shared the abstractness score list we used in Section 5.2.3, and Yair Neuman generously helped with a preliminary analysis of the impact of abstractness on our multimodal models. Mirella Lapata kindly made the WordSim353 set used in the experiments of Feng and Lapata (2010) available to us. We thank the JAIR associate editor and reviewers for helpful suggestions and constructive criticism.

References

BibTeX

@article{2014_MultimodalDistributionalSemanti,
  author    = {Elia Bruni and
               Nam-Khanh Tran and
               Marco Baroni},
  title     = {Multimodal Distributional Semantics},
  journal   = {Journal of Artificial Intelligence Research},
  volume    = {49},
  pages     = {1--47},
  year      = {2014},
  url       = {https://doi.org/10.1613/jair.4135},
  doi       = {10.1613/jair.4135},
}

