Overview

SLoMO brings together researchers working on story-level understanding of long-form, edited videos, particularly movies and TV episodes. Building on the success of the first edition at ICCV 2025, this second edition keeps the in-person format of invited talks plus a competition, and broadens the discussion to long-form edited media understanding and to how deeper movie comprehension can benefit generative models.

Compared to conventional video tasks, this setting demands: (i) modeling long-range narrative dependencies; (ii) reasoning over complex character relationships; and (iii) understanding editing patterns and cinematography.

  • Invited Talks showcase recent advances in Audio Description (AD), movie understanding, and accessibility for visually impaired audiences.
    We address the following key open questions:
    • How can we ensure fair evaluation of large vision-language models, particularly with respect to knowledge leakage from movie data?
    • How can movie understanding advance accessibility in edited media?
    • How can modeling movie structure and cinematography benefit both story-level understanding and movie generation?
  • SLoMO Competition evaluates narrative-level reasoning through two complementary tracks: Movie Question Answering (MovieQA) over full story arcs, and Audio Description (AD) Generation producing coherent, story-aware narrations to enhance accessibility for visually impaired audiences.

Schedule

The workshop is a half-day session held at ECCV 2026 in Malmö, Sweden, on the afternoon of September 8, 2026. Each invited talk is 25 min plus 5 min discussion. The schedule below is tentative and subject to change.

  1. 01:00 – 01:10 pm (10 min)
    Opening Remarks
  2. 01:10 – 01:40 pm (30 min)
    Invited Talk 1: Prof. Bernard Ghanem (KAUST)
  3. 01:40 – 02:10 pm (30 min)
    Invited Talk 2: Prof. Anna Rohrbach (TU Darmstadt · hessian.AI)
  4. 02:10 – 03:10 pm (60 min)
    SLoMO Competition: result announcements & winners' presentations
  5. 03:10 – 03:30 pm (20 min)
    Coffee Break
  6. 03:30 – 04:00 pm (30 min)
    Invited Talk 3: Dr. Fabian Caba Heilbron (Adobe Research)
  7. 04:00 – 04:30 pm (30 min)
    Invited Talk 4: Dr. Piotr Mirowski (Google DeepMind)
  8. 04:30 – 05:00 pm (30 min)
    Discussion & Closing Remarks

Invited Speakers

SLoMO Competition

The SLoMO Competition advances story-level video understanding through two complementary tasks: Movie Question Answering (MovieQA) and Audio Description (AD) Generation. The competition is designed to evaluate long-form, narrative-level reasoning in video-language models, going beyond clip-level perception toward coherent story understanding.

Datasets

  • Short-Films 20K (SF20K) — a large-scale, publicly available collection of 20,143 self-contained short films totalling 3,684 h. Each film lasts 5–40 min (~11 min on average) across diverse genres, with both automatically generated and manually curated QA pairs.
  • Condensed Movie Dataset (CMD-AD) — short clips from over 1,432 online movies with professionally annotated Audio Descriptions from AudioVault, temporally aligned with the clips.

Tracks

  • SLoMO QA Track: Movie Question Answering (MovieQA)
    Dataset: SF20K (19,071 train / 50 public test / 45 private test movies).
    Metric: LLM-QA-Eval, in which gpt-4.1-nano compares ground-truth and predicted answers and assigns a binary correctness label; the final score is the percentage of correct answers.
  • SLoMO AD Track: Audio Description (AD) Generation
    Datasets: CMD-AD (1,332 train / 98 public test / 100 private test movies) + SF20K (zero-shot, 17 public test / 45 private test movies).
    Metric: a single AD score, a weighted combination of ADQA (a question-answering measure of how well the ADs convey the story's visual content, leveraging gemini-3.0-flash) and CIDEr (n-gram agreement with reference ADs). The exact weights are not disclosed.
    An AD Duration Check (conformance to the expected length of audio descriptions) is performed for each AD as a qualifying check and does not contribute to the overall score.
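
As a rough illustration of how the two track scores are aggregated, here is a minimal Python sketch. It is not the official evaluation code. The function names qa_accuracy and ad_score are hypothetical, the LLM judge call is not shown (its binary labels are passed in as booleans), ADQA and CIDEr are assumed to be on a comparable scale, and the default weight of 0.5 is only a placeholder because the real weights are not disclosed. The AD Duration Check is not modelled, since it does not contribute to the score.

```python
from typing import Sequence


def qa_accuracy(labels: Sequence[bool]) -> float:
    """QA track: percentage of answers the judge marked correct.

    `labels` holds one binary correctness label per question
    (True = the predicted answer matches the ground truth).
    """
    if not labels:
        raise ValueError("at least one judged answer is required")
    return 100.0 * sum(labels) / len(labels)


def ad_score(adqa: float, cider: float, w_adqa: float = 0.5) -> float:
    """AD track: weighted combination of ADQA and CIDEr.

    `w_adqa` is a hypothetical placeholder; the actual weights
    used by the competition are not disclosed.
    """
    if not 0.0 <= w_adqa <= 1.0:
        raise ValueError("w_adqa must lie in [0, 1]")
    return w_adqa * adqa + (1.0 - w_adqa) * cider


if __name__ == "__main__":
    # Three of four answers judged correct.
    print(qa_accuracy([True, False, True, True]))
    # Equal weighting of illustrative ADQA and CIDEr values.
    print(ad_score(adqa=0.8, cider=0.4))
```

Because the weights are unknown, the placeholder weighting may rank submissions differently from the leaderboard; the sketch only shows the shape of the computation.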

Important Dates

  • 29 Jun, 2026: Competition server launches with the public test set
  • 1 Aug, 2026: Competition server launches with the private test set
  • 25 Aug, 2026: Submission deadline for leaderboard ranking
  • 1 Sep, 2026: Final rankings announced
  • 8 Sep, 2026: Workshop at ECCV 2026 with winners' presentations

Organizers