About Me

I am a Staff Research Engineer at Google DeepMind working in a group led by Yasemin Altun and Srini Narayanan and contributing to Gemini. I focus on the multimodal understanding and generation capabilities of LLMs in the domain of non-natural images. In the past, I worked on the internationalization of NLU models, and query/dialog parsing with structured information.

I received my Ph.D. from the University of Trento (Italy) in April 2018, under the supervision of Prof. Alessandro Moschitti. My research interests included kernel and deep learning methods for Question Answering (QA).

After my M.Sc. in computer science at the University of Trento, I worked on tree kernel methods for passage reranking, textual similarity, and crossword solving. From an engineering perspective, I focused on the Apache UIMA framework (also used in IBM Watson) for the development of NLP pipelines.

In 2013/2014, I interned at the Qatar Computing Research Institute, where I started the development of the Iyas QA system, worked on machine translation, and participated in the SemEval 2015 competition in the Community Question Answering track, for English and Arabic. The same year, I started the doctoral course and consolidated the work on automatic crossword solving.

In 2015, I won a Google European Doctoral Fellowship, which funded my studies (44 fellowships assigned worldwide, 15 in EMEA, only 1 assigned to an Italian university). That summer, I interned at Google Zurich, where I worked with Katja Filippova and Enrique Alfonseca on sentence compression using neural networks. In summer 2016, I was at Google London working with Bernd Bohnet, Ryan McDonald and Michael Collins on an easy-first neural approach to dependency parsing. In summer 2017, I was back at Google Zurich, working on adversarial methods for text generation with Aliaksei Severyn.

Publications

These are my publications. See also my Google Scholar profile.

Gemini Team. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities. Tech Report 2025. Core Contributor.

Sarthak Jauhari, Massimo Nicosia, Aditya Ranjan, Ankush Chatterjee and Rahul Goel. Translation and Transliteration Based Data Augmentation for Multilingual Semantic Parsing. ECAI 2024.

Gemini Team. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. Tech Report 2024. Contributor.

Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, Sebastian Ruder. mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations. EMNLP 2023 (Findings). Paper on arXiv.

Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev, Partha Talukdar. XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages. EMNLP 2023 (Findings). Paper and Github repo.

Massimo Nicosia and Francesco Piccinno. Evaluating Byte and Wordpiece Level Models for Massively Multilingual Semantic Parsing. MMNLU-22 co-located with EMNLP 2022. Paper on arXiv.

Massimo Nicosia, Zhongdi Qu, Yasemin Altun. Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data. EMNLP 2021. Paper on arXiv.

Thomas Mueller, Francesco Piccinno, Massimo Nicosia, Peter Shaw, Yasemin Altun. Answering Conversational Questions on Structured Data without Logical Forms. EMNLP 2019. Paper on arXiv.

Massimo Nicosia, Alessandro Moschitti. Semantic Linking in Convolutional Neural Networks for Answer Sentence Selection. EMNLP 2018.

Massimo Nicosia, Alessandro Moschitti. Accurate Sentence Matching with Hybrid Siamese Networks. CIKM 2017.

Massimo Nicosia, Alessandro Moschitti. Lazada Product Title Quality Challenge: Constructing Features for a Diversified Ensemble of Classifiers (Honorable Mention Winner). CIKM AnalytiCup 2017.

Massimo Nicosia, Alessandro Moschitti. Learning Contextual Embeddings for Structural Semantic Similarity using Categorical Information. CoNLL 2017.

Kateryna Tymoshenko, Alessandro Moschitti, Massimo Nicosia and Aliaksei Severyn. RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations. ACL 2017.

Alessandro Moschitti, Kateryna Tymoshenko, Panos Alexopoulos, Andrew Walker, Massimo Nicosia, Guido Vetere, Alessandro Faraotti, Marco Monti, Jeff Z. Pan, Honghan Wu and Yuting Zhao. Question Answering and Knowledge Graphs, in Exploiting Linked Data and Knowledge Graphs in Large Organisations. Springer International Publishing 2017.

Massimo Nicosia, Alessandro Moschitti. Crossword Puzzle Resolution in Italian using Distributional Models for Clue Similarity. IIR 2016.

Aliaksei Severyn, Massimo Nicosia, Gianni Barlacchi, Alessandro Moschitti. Distributional Neural Networks for Automatic Resolution of Crossword Puzzles. ACL 2015.

Gianni Barlacchi, Massimo Nicosia, Alessandro Moschitti. SACRY: Syntax-based Automatic Crossword puzzle Resolution sYstem. ACL 2015.

Massimo Nicosia, Gianni Barlacchi, Alessandro Moschitti. Learning to Rank Aggregated Answers for Crossword Puzzles. ECIR 2015.

Massimo Nicosia, Simone Filice, Alberto Barrón-Cedeño, Iman Saleh, Hamdy Mubarak, Wei Gao, Preslav Nakov, Alessandro Moschitti, Giovanni Da San Martino, Lluís Màrquez, Shafiq Joty, Walid Magdy. QCRI: Answer Selection for Community Question Answering – Experiments for Arabic and English. SemEval-2015 at NAACL 2015.

Gianni Barlacchi, Massimo Nicosia, Alessandro Moschitti. A Retrieval Model for Automatic Resolution of Crossword Puzzles in Italian Language. CLiC-it 2014 (Distinguished Young Paper).

Francisco Guzmán, Shafiq Joty, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov and Massimo Nicosia. Learning to Differentiate Better from Worse Translations. EMNLP 2014.

Gianni Barlacchi, Massimo Nicosia, Alessandro Moschitti. Learning to Rank Answer Candidates for Automatic Resolution of Crossword Puzzles. CoNLL 2014.

Aliaksei Severyn, Massimo Nicosia, Alessandro Moschitti. Building Structures from Classifiers for Passage Reranking. CIKM 2013.

Aliaksei Severyn, Massimo Nicosia, Alessandro Moschitti. Learning Adaptable Patterns for Passage Reranking. CoNLL 2013.

Aliaksei Severyn, Massimo Nicosia, Alessandro Moschitti. Learning Semantic Textual Similarity with Structural Representations. ACL 2013.

Aliaksei Severyn, Massimo Nicosia, Alessandro Moschitti. iKernels-Core: Tree Kernel Learning for Textual Similarity. *SEM 2013.

Honors and Awards

Assistant Tech Impact Award
H1 2023 Award, for impactful contribution to the Google Assistant.

Google European Doctoral Fellowship 2015
Fellowship in Statistical Natural Language Processing.

MMNLU 2022
1st Place in the Zero-Shot track of the Massive MMNLU-22 Multilingual Semantic Parsing Competition organized by Amazon.

CIKM AnalytiCup 2017
Semi-finalist (top 10) at the CIKM 2017 Lazada Product Title Quality Challenge. Honorable mention and travel award for the quality of the code and documentation.

ICT Days 2018 Best Poster Award
Prize awarded by Seac S.p.A. at the PhD Poster Competition organized by the University of Trento.

Distinguished Young Paper CLiC-it 2014
A Retrieval Model for Automatic Resolution of Crossword Puzzles in Italian Language.

Merit Scholarship from the University of Trento
Prize for finishing the Master's course on time and with excellent results.

Academic Activities

Reviewer
ACL Rolling Review (2022, 2021), EMNLP (2023, 2022, 2021, 2020, 2017), ACL (2023, 2021, 2020, 2019, 2018, 2017, 2015), EACL (2023, 2022), IJCNLP-AACL (2023), EACL SRW (2021), CoNLL (2023, 2022, 2021, 2020, 2019, 2018), AAAI (2023, 2020, 2019, 2018, 2015), COLING (2022), ECIR (2020), NAACL HLT (2020, 2018, 2016, 2015), SIGIR (2018, 2017, 2016, 2014), CLiC-it (2018, 2017), IJCoL (2018), WWW (2017), IJCAI (2016), IEEE BigData Congress (2016), BioASQ workshop at CLEF (2015), IEEE SMC (2015).

Teaching
Advanced Natural Language Processing and Information Retrieval, University of Trento, Italy.

Summer schools
Machine Learning Summer School (MLSS) 2016 in Cadiz, Spain.