Michael Wray

Assistant Professor of Computer Vision

I am a lecturer/Assistant Professor of Computer Vision at the School of Computer Science at the University of Bristol. My research interests are in multi-modal video understanding, particularly for egocentric videos — focusing on how both vision and language can be tied together towards tasks such as cross-modal retrieval, grounding and captioning. I am part of MaVi and ViLab.

Email: michael (dot) wray (at) bristol (dot) ac (dot) uk

News

June 2025 - New CVPRW Paper. Our paper "Video, How Do Your Tokens Merge?" will be presented at the eLVM workshop at CVPR2025. More info: Website, arXiv.
June 2025 - New Survey Paper on arXiv. Our paper "Leveraging Auxiliary Information in Text-to-Video Retrieval: A Review" is now on arXiv.
April 2025 - New Paper on arXiv. Our paper "Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval" is now on arXiv.
February 2025 - Two Papers Accepted at CVPR 2025. Our papers HD-EPIC and ShowHowTo were accepted at CVPR2025 and will be presented in Nashville in June!
February 2025 - Area Chair at NeurIPS 2025. Honoured to be an Area Chair for NeurIPS 2025.
February 2025 - New Paper on arXiv. Our paper "Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval" is now on arXiv, webpage.
February 2025 - New Paper on arXiv. Our paper "HD-EPIC: A Highly-Detailed Egocentric Video Dataset" is now on arXiv, webpage.
December 2024 - New Paper on arXiv. Our paper "ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions" is now on arXiv, webpage.

For a full list of News, click here.

Research

Short list of recent Research Projects, click here for a full list.

Video, How Do Your Tokens Merge?
Sam Pollard, Michael Wray
CVPRW, 2025
[Website] [arXiv] [Code]

Leveraging Auxiliary Information in Text-to-Video Retrieval: A Review
Adriano Fragomeni, Dima Damen, Michael Wray
arXiv, 2025
[arXiv]

Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval
Adriano Fragomeni, Dima Damen, Michael Wray
arXiv, 2025
[arXiv]

HD-EPIC: A Highly-Detailed Egocentric Video Dataset
Toby Perrett, Ahmad Darkhalil, Saptarshi Sinha, Omar Emara, Sam Pollard, Kranti Parida, Kaiting Liu, Prajwal Gatti, Siddhant Bansal, Kevin Flanagan, Jacob Chalk, Zhifan Zhu, Rhodri Guerrier, Fahd Abdelazim, Bin Zhu, Davide Moltisanti, Michael Wray, Hazel Doughty, Dima Damen
CVPR, 2025
[Webpage] [arXiv] [Annotations] [Videos]

Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval
Kevin Flanagan, Dima Damen, Michael Wray
WACV, 2025
[Webpage] [arXiv] [Code]

ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
Tomáš Souček, Prajwal Gatti, Michael Wray, Ivan Laptev, Dima Damen, Josef Sivic
CVPR, 2025
[Webpage] [arXiv] [Code]

For a full list of Research projects, click here.

Short Bio

Michael is a lecturer in Computer Vision at the School of Computer Science at the University of Bristol. He finished his PhD, titled "Verbs and Me: an Investigation into Verbs as Labels for Action Recognition in Video Understanding", in 2019 under the supervision of Professor Dima Damen. Afterwards, he stayed in the same lab as a Post-Doc working on Vision and Language and the collection of the Ego4D Dataset. Michael has led the organisation of the EPIC workshop series from 2021 onwards, is an organiser of the Ego4D workshop series, and is an ELLIS member.

Teaching

Applied Deep Learning 22/23, 23/24, 24/25. Webpage.
Computer Systems A 22/23, 23/24, 24/25. Webpage.
Individual Projects 22/23, 23/24, 24/25. Webpage.

People

Current

Adriano Fragomeni: PhD, 2020–Current (w/ Dima Damen)
Kevin Flanagan: PhD, 2021–Current (w/ Dima Damen)
Shijia Feng: PhD, 2022–Current (w/ Walterio Mayol Cuevas)
Sam Pollard: MEng, PhD, 2023–Current
Beth Pearson: PhD, 2023–Current (w/ Martha Lewis)
Fahd Abdelazim: PhD, 2024–Current
Alyssa Boisse: MEng, 2025
Amr Khaled Mohamed El-Sawy: BSc, 2025
Jacob Seaborn: MEng, 2025
Aleksandra Walusiak: MEng, 2025

Past

Richa Banthia: MEng, 2024
Alex Elwood: MEng, 2024
Moise Guran: MEng, 2024
Rahat Mittal: BSc, 2024
Bence Szarka: MEng, 2024
Lee Tancock: MEng, 2023
Zac Woodford: MEng, 2023
Benjamin Gutierrez Serafin: MSc, 2020
Pei Huang: MSc, 2016

Misc.

Presentations

BMVA Summer School - Egocentric Vision Lecture. 2022, 2023, 2024.
VIViD Research Seminar, Durham University - Fine Grained Video Understanding from a Personal Perspective. 2024.
Advancements in Time Series Analysis for Computer Vision: Techniques, Applications, and Challenges - Unlocking the Temporal Dimension from the Egocentric Perspective. 2024.
Video Understanding Symposium 2022 - Do we still need Classification for Video Understanding? 2022.
BMVA Symposium: Robotics Meets Semantics - Towards an Unequivocal Representation of Actions. 2018.
EPIC@ECCV2016 - SEMBED: Semantic Embedding of Egocentric Action Videos. 2016.

Workshop Organiser

WINVU: CVPR2024
EPIC@: ICCV2021, CVPR2021, ECCV2020
Joint Ego4D+EPIC@: CVPR2023, CVPR2022
Ego4D@: ECCV2022

Area Chair

CVPR: 2025, 2024, 2023
ECCV: 2024
NeurIPS: 2025, 2024

Associate Editor

IET Computer Vision 2024–Current
ToMM Special Issue on Text-Multimedia Retrieval 2024

Outstanding Reviewer

BMVC2024
ECCV2022
ICCV2021
CVPR2021
BMVC2020

Reviewing Duties

Conferences

CVPR: 2022, 2021, 2020, 2019
NeurIPS: 2023
NeurIPS D&B Track: 2024, 2023
ICCV: 2025, 2023, 2021
ECCV: 2022
ACCV: 2024, 2022, 2020
BMVC: 2024, 2023, 2022, 2021, 2020, 2019
WACV: 2024, 2023, 2022, 2021

Journals

TPAMI
IJCV
TCSVT
Pattern Recognition
TOMM
