Closing the Gap between Language and Vision

Country: N/A

City: N/A

Abstracts due: 31.01.2017

Dates: 31.01.17 - 31.01.17

Field of science: Philology

Organizers: Journal of Natural Language Engineering


Research involving both language and vision computing spans a large variety of disciplines and applications, and goes back at least two decades. More recently, the big data era has produced a multitude of tasks in which vision and language are inherently linked. The explosive growth of visual and textual data, both online and in private repositories owned by diverse institutions and companies, has led to urgent requirements in terms of search, processing and management of digital content. Effective solutions for accessing or mining such data depend on the connection between visual and textual content being made interpretable, and hence on bridging the semantic gap between vision and language.

One perspective has been the integrated modelling of language and vision, with approaches ranging from structured, cognitive modelling at one end of the spectrum to unsupervised machine learning at the other. State-of-the-art results in many areas are currently being produced at the latter end, in particular by deep learning approaches.

Another perspective is exploring how knowledge about language can help with predominantly visual tasks, and vice versa. Visual interpretation can be aided by text associated with images/videos and knowledge about the world learned from language. On the NLP side, images can help ground language in the physical world, allowing us to develop models for semantics. Words and pictures are often naturally linked online and in the real world, and each modality can provide reinforcing information to aid the other.

Visual recognition methods are now reaching a level of maturity where commercial deployment is becoming feasible for an increasingly wide range of applications. At the same time, recent years have witnessed a marked increase in research focusing on the language and vision area, intensifying in particular in the past five years. It can be argued that the language computing and vision computing fields now overlap, for the first time, to form the beginnings of a genuinely interdisciplinary research field. This is the perfect moment for a special issue with an emphasis on the applications for which language and vision research is now producing high-quality solutions. A carefully chosen, representative selection of in-depth reports on the best current research in language and vision will provide the perfect snapshot of the state of the art in a field that has just experienced five extraordinarily active and productive years.


While there has been research involving both language and vision for some time, it was not until about five years ago that it began to gel into an interdisciplinary research field. Early indicators were (i) the funding of the UK EPSRC Network on Vision and Language in 2010 and of the European COST Action on Integrating Vision and Language in 2013; and (ii) the organisation of the first workshops dedicated to the topic of vision and language, including the 1st Workshop on Vision and Language in 2011 and the NIPS 2011 Workshop on Integrating Language and Vision.

Since then the subject area has grown rapidly; there has been a proliferation of language and vision workshops (there were at least seven in 2015 and 2016); and in 2015 all the major NLP conferences, ACL’15, EMNLP’15 and NAACL’15, introduced the subject area of Language and Vision for the first time. This expansion has stimulated a lot of exciting new research on a wide variety of language and vision topics, a good proportion of which is now reaching a level of maturity where journal articles are the most appropriate form of publication.

At present, much new Language and Vision research has been published mainly, if not exclusively, in conference and workshop proceedings. The time is right for a journal special issue to provide an overview of cutting-edge work from this new generation of research, through a carefully selected, representative collection of in-depth reports on the best mature research in the new interdisciplinary Language and Vision field.

We invite the submission of contributions reporting completed research on any topics related to the above, including but not limited to the following:

    Image and video labelling and annotation
    Image and video description
    Computational modelling of human vision and language
    Image and video retrieval
    Multimodal human-computer communication
    Text-to-image generation
    Language-driven animation
    Facial animation for speech
    Assistive methodologies

