"There is no cure for curiosity" — R.W. Emerson

About me

Who am I?

Yannick Versley

I am a Senior Tech Lead Manager at eBay CoreAI, leading a team working on LLM pretraining and LLM customization. Previously, I worked as a Senior Applied Scientist at Amazon, first at Alexa AI and subsequently at the AGI org (Amazon Nova models). Before that, I developed improvement processes and machine learning modeling for chatbots at OttoGroup data.works as a Senior Data Scientist, and did consulting, architecture, and implementation work around natural language processing as a Senior Cognitive Expert at IBM Services.

I was a visiting professor ("Professurvertretung") at the Institute for Computational Linguistics of the Ruprecht-Karls-Universität Heidelberg from Winter 2013/2014 to Winter 2015/2016.

I co-edited a handbook on Anaphora Resolution published by Springer.

Research Interests

What, Why and How?

My research focuses on adapting and specializing foundation models for domain-specific applications, particularly in e-commerce and enterprise settings. I am interested in how domain knowledge and task-specific requirements shape the way we customize large language models and vision-language models — including vocabulary adaptation, efficient fine-tuning techniques, and architecture choices that balance performance with practical deployment constraints.

A central theme in my work is making advanced language understanding capabilities accessible and efficient: developing techniques for model customization that don't require the computational resources available only to a handful of large LLM-focused labs. This includes work on vocabulary optimization for domain-specific deployment, making use of the subspace structure of the modeling space, and efficient instruction tuning approaches that can adapt general-purpose foundation models to specialized domains while maintaining interpretability and performance.

I continue to contribute to the research community through reviewing for journals including Computational Linguistics and Natural Language Engineering, and have served on program committees for major conferences including ACL (2008–2016, 2020–present), EMNLP (2011–2019, 2021–present), and EACL (2012–2023). I've also acted as an external reviewer for funding agencies including DFG (Germany), NWO (Netherlands), and ERC (European Research Council).

Selected Publications

I selected most of them

For a full list, see Google Scholar or SemanticScholar.

Preprints

C Herold, M Kozielski, N Santavas, Y Versley, and S Khadivi (2025) Vocabulary Customization for Efficient Domain-Specific LLM Deployment. [arXiv:2509.26124]

Conference & Workshop Papers

C Herold, M Kozielski, T Bazazo, P Petrushkov, Y Versley, SH Hashemi, and S Khadivi (2025) Domain Adaptation of Foundation LLMs for E-Commerce. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track).
N Shivagunde, M Kulkarni, G Karamanolakis, JGM FitzGerald, Y Versley, S Soltan, V Cevher, J Lu, and A Rumshisky (2024) Approximations May Be All You Need: Towards Pre-Training LLMs with Low-Rank Decomposition and Optimizers. NeurIPS 2024 Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV).
A Rosenbaum, S Soltan, W Hamza, Y Versley, and M Boese (2022) LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging. Proceedings of the 29th International Conference on Computational Linguistics.
A Abujabal, CD Bovi, S Ryu, T Gojayev, F Triefenbach, and Y Versley (2021) Continuous Model Improvement for Language Understanding with Machine Translation. Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics.
S Loáiciga, S Stymne, P Nakov, C Hardmeier, J Tiedemann, M Cettolo, and Y Versley (2017) Findings of the 2017 DiscoMT Shared Task on Cross-Lingual Pronoun Prediction. Proceedings of the Third Workshop on Discourse in Machine Translation, 1–16.
J Sikos, Y Versley, and A Frank (2016) Implicit Semantic Roles in a Multilingual Setting. Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics.
L Guillou, C Hardmeier, P Nakov, S Stymne, J Tiedemann, and Y Versley (2016) Findings of the 2016 WMT Shared Task on Cross-Lingual Pronoun Prediction. Proceedings of the First Conference on Machine Translation: Volume 2.
L Brandt, D Grimm, M Zhou, and Y Versley (2016) ICL-HD at SemEval-2016 Task 8: Meaning Representation Parsing – Augmenting AMR Parsing with a Preposition Semantic Role Labeling Neural Network. Proceedings of the 10th International Workshop on Semantic Evaluation.
A Kirilin, F Krauss, and Y Versley (2016) ICL-HD at SemEval-2016 Task 10: Improving the Detection of Minimal Semantic Units and Their Meanings with an Ontology and Word Embeddings. Proceedings of the 10th International Workshop on Semantic Evaluation.
Y Versley (2016) Discontinuity (Re)²-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing. Proceedings of the Workshop on Discontinuous Structures in Natural Language Processing (DiscoNLP), NAACL-HLT 2016.
Y Versley and J Steen (2016) Detecting Annotation Scheme Variation in Out-of-Domain Treebanks. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016).
M Haas and Y Versley (2015) Subsentential Sentiment on a Shoestring: A Crosslingual Analysis of Compositional Classification. Proceedings of NAACL-HLT 2015.
Y Versley (2014) Experiments with Easy-First Nonprojective Constituent Parsing. Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages (SPMRL-SANCL 2014).
Y Versley (2013) SFS-TUE: Compound Paraphrasing with a Language Model and Discriminative Reranking. Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013).
Y Versley (2013) Graph-Based Classification of Explicit and Implicit Discourse Relations. International Conference on Computational Semantics (IWCS 2013).
Y Versley (2012) Supervised Learning of German Qualia Relations. ACL 2012 Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages.
Y Versley and Y Panchenko (2012) Not Just Bigger: Towards Better-Quality Web Corpora. Seventh Web-as-Corpus Workshop at WWW2012 (WAC7).
Y Versley (2011) Multilabel Tagging of Discourse Relations in Ambiguous Temporal Connectives. Recent Advances in Natural Language Processing (RANLP 2011).
Y Versley (2011) Towards Finer-Grained Tagging of Discourse Connectives. DGfS Workshop Beyond Semantics.
Y Versley (2010) Discovery of Ambiguous and Unambiguous Discourse Connectives via Annotation Projection. Workshop on the Annotation and Exploitation of Parallel Corpora (AEPC).
Y Versley, K Beck, E Hinrichs, and H Telljohann (2010) A Syntax-First Approach to High-Quality Morphological Analysis and Lemma Disambiguation for the TüBa-D/Z Treebank. 9th Conference on Treebanks and Linguistic Theories (TLT9).
Y Versley and I Rehbein (2009) Scalable Discriminative Parsing for German. International Conference on Parsing Technology (IWPT'09).
Y Versley (2008) Decorrelation and Shallow Semantic Patterns for Distributional Clustering of Nouns and Verbs. ESSLLI'08 Workshop on Distributional Lexical Semantics.
Y Versley, A Moschitti, M Poesio, and X Yang (2008) Coreference Systems Based on Kernel Methods. Proceedings of Coling 2008.
Y Versley, SP Ponzetto, M Poesio, V Eidelman, A Jern, J Smith, X Yang, and A Moschitti (2008) BART: A Modular Toolkit for Coreference Resolution. Proceedings of LREC 2008.
Y Versley, SP Ponzetto, M Poesio, V Eidelman, A Jern, J Smith, X Yang, and A Moschitti (2008) BART: A Modular Toolkit for Coreference Resolution. ACL 2008 System Demonstrations.
Y Versley (2007) Antecedent Selection Techniques for High-Recall Coreference Resolution. Proceedings of EMNLP-CoNLL 2007.
Y Versley (2007) Using the Web to Resolve Coreferent Bridging in German Newspaper Text. GLDV-Frühjahrstagung 2007.
Y Versley and H Zinsmeister (2006) From Surface Dependencies towards Deeper Semantic Representations. Fifth Workshop on Treebanks and Linguistic Theories (TLT 2006).
Y Versley (2006) A Constraint-Based Approach to Noun Phrase Coreference Resolution in German Newspaper Text. Konferenz zur Verarbeitung Natürlicher Sprache (KONVENS 2006).
Y Versley (2006) Disagreement Dissected: Vagueness as a Source of Ambiguity in Nominal (Co-)Reference. ESSLLI 2006 Workshop on Ambiguity in Anaphora.
Y Versley (2005) Parser Evaluation across Text Types. Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005).
F Schilder, Y Versley, and Ch Habel (2004) Extracting Spatial Information: Grounding, Classifying and Linking Spatial Expressions. Workshop on Geographic Information Retrieval, 27th Annual International ACM SIGIR Conference.
F Schilder, Ch Habel, and Y Versley (2003) Temporal Information Extraction and Question Answering: Deriving Answers for When-Questions. Questions and Answers: Theoretical and Applied Perspectives (2nd CologNet-ElsNet Symposium).

Journal Articles

Y Versley (2013) A Graph-Based Approach for Implicit Discourse Relations. CLIN Journal 3, 148–173.
H Telljohann, Y Versley, K Beck, E Hinrichs, and T Zastrow (2013) STTS als Part-of-Speech-Tagset in Tübinger Baumbanken. Journal for Language Technology and Computational Linguistics 28(1), 1–16.
Y Versley and A Gastel (2013) Linguistic Tests for Discourse Relations in the TüBa-D/Z Treebank of German. Dialogue & Discourse 4(2), 142–173.
Y Versley (2008) Vagueness and Referential Ambiguity in a Large-Scale Annotated Corpus. Journal on Research in Language and Computation 6(3–4), 333–353.

Books & Edited Volumes

M Poesio, R Stuckardt, and Y Versley (2016) Anaphora Resolution: Algorithms, Resources, and Applications. Springer.
S Featherston and Y Versley (2016) Quantitative Approaches to Grammar and Grammatical Change: Perspectives from Germanic. Walter de Gruyter GmbH & Co KG.

Book Chapters

Y Versley, M Poesio, and S Ponzetto (2016) Using Lexical and Encyclopedic Knowledge. In: Anaphora Resolution: Algorithms, Resources, and Applications, 393–429.
Y Versley and A Björkelund (2016) Off-the-Shelf Tools. In: Anaphora Resolution: Algorithms, Resources, and Applications, 237–266.
M Poesio, R Stuckardt, and Y Versley (2016) Challenges and Directions of Further Research. In: Anaphora Resolution: Algorithms, Resources, and Applications, 487–500.
M Poesio, R Stuckardt, Y Versley, and R Vieira (2016) Early Approaches to Anaphora Resolution: Theoretically Inspired and Heuristic-Based. In: Anaphora Resolution: Algorithms, Resources, and Applications, 55–94.
M Poesio, S Pradhan, M Recasens, K Rodriguez, and Y Versley (2016) Annotated Corpora and Annotation Tools. In: Anaphora Resolution: Algorithms, Resources, and Applications, 97–140.

Theses

Y Versley (2010) Resolving Coreferent Bridging in German Newspaper Text. PhD Thesis, Seminar für Sprachwissenschaft, Universität Tübingen.
Y Versley (2004) Tagging kausaler Relationen. Diplomarbeit, Fachbereich Informatik, Universität Hamburg. Also published as: Tagging kausaler Relationen: Grundlagen kausaler Ereignisrelationen und aktuelle Probleme; VDM Verlag Dr. Müller. ISBN 978-3-8364-3259-7