Real language is underspecified, vague, and ambiguous. Indeed, past work (Zipf, 1949; Piantadosi, 2012) has suggested that ambiguity may be an inextricable feature of natural language, resulting from competing communicative pressures. Resolving the meaning of language is a never-ending process of making inferences based on implicit knowledge. For example, we know that ``the girl saw the man with the telescope'' is ambiguous and could refer to two situations, while ``the girl saw the man with the hamburger'' is not, or that ``near'' refers to very different distances in ``the house near the airport'' and ``the ant near the crumb''. Being able to capture this kind of knowledge is central to building systems with a human-like understanding of language, as well as to providing a full account of natural language itself.
While underspecified, ambiguous, and implicit language rarely poses a problem for human speakers, it can challenge even the best models. For example, despite the major recent successes in NLP brought by large language models (LLMs), it is not clear that these models capture ambiguous language in a human-like fashion (Liu, 2023; Stengel-Eskin, 2023). The same has been argued for multimodal NLP: Pezzelle (2023), for example, showed that CLIPScore is sensitive to underspecified captions. Tackling these kinds of linguistic phenomena represents a new frontier in NLP research, enabled by major progress on more clear-cut tasks.
Past work on underspecified language has pursued several directions. Some semantic representations explicitly encode underspecification (Copestake, 2005; Bos, 2004).
Other work has begun to recognize that perfect annotator agreement is often unrealistic, especially when using categorical labels for tasks like natural language inference (Chen, 2020; Nie, 2020; Pavlick, 2019). This workshop hopes to attract work that embraces disagreement between annotators as a source of signal about underspecification and ambiguity.

In order to resolve the meaning of underspecified and ambiguous language, we often draw on additional modalities and on information acquired through embodied experience (Bisk, 2020). For example, ``the girl saw the man with the telescope'' becomes unambiguous if paired with an image of a man holding a telescope. In contrast, NLP typically considers language in isolation, removed from the context in which it naturally occurs. This workshop will highlight multimodal inputs, especially visual ones, as sources of information for resolving underspecification. These inputs can themselves pose additional challenges, e.g. through ambiguous images or videos (Bhattacharya, 2019; Sanders, 2022).
The goal of the third edition of the workshop is to continue to encourage progress on processing implicit, underspecified, and ambiguous language, with a strong focus on annotation ambiguity, multimodality, and pragmatics. As in the first two editions, we will accept theoretical and practical contributions (long, short, and non-archival) on all aspects of the workshop topic.
We welcome submissions related to, but not limited to, the following topics:
Please submit your papers at https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/UnImplicit
Please use the EACL style templates.
If you have any questions, please email us at unimplicitworkshop -AT- gmail.com.