ENACT
Foundation Models Meet Embodied Agents @ CVPR 2026

ENACT Challenge

Evaluating Embodied Cognition with Egocentric Interaction World Modeling


We are organizing the ENACT Challenge, a benchmark designed to evaluate embodied cognition through egocentric interaction world modeling. Participants will be ranked by Pairwise Accuracy and Task Accuracy on a held-out test set.

Challenge Overview

Goal

Given egocentric observations of embodied interactions, predict the correct outcomes for world modeling tasks. Models will be evaluated on their ability to understand and reason about embodied cognition in interactive environments.

What You Do

  1. Train / fine-tune on the ENACT training set.
  2. Develop and validate on the ENACT validation set.
  3. Run inference on the held-out test set (to be released) and submit predictions via EvalAI.

Data Splits

  • 📚 Train: ENACT_train.jsonl (Public)
  • 🔬 Validation: ENACT_val.jsonl (Public)
  • 🏆 Test (Held-out): Final evaluation set (Coming Soon)

Dataset: Data can be found at huggingface.co/datasets/MLL-Lab/ENACT

Format & loading: Please refer to the official instructions in the ENACT repository.
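
For quick inspection, here is a minimal sketch of reading one of the JSONL splits with the Python standard library. The local filename and the printed field inspection are assumptions; the loading instructions in the ENACT repository are authoritative.

import json

# Minimal sketch: read a locally downloaded split line by line.
# The filename and record schema are assumptions; follow the official
# instructions in the ENACT repository for the supported loading path.
records = []
with open("ENACT_train.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if line:
            records.append(json.loads(line))

print(f"Loaded {len(records)} training samples")
print(sorted(records[0].keys()))  # inspect the available fields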

Evaluation

  • Primary Metrics (an illustrative computation sketch follows this list):
    • Pairwise Accuracy: Measures the model's ability to correctly compare and rank pairs of interactions.
    • Task Accuracy: Measures the model's ability to correctly predict task outcomes.
  • Ranking: Teams are ranked by a weighted combination of Pairwise Accuracy and Task Accuracy.
  • (Optional) We may additionally report accuracy by task category and interaction type for detailed analysis.
  • Tie-break: Higher Task Accuracy, then earlier submission time.
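
As a rough illustration only, the sketch below computes a per-sample task accuracy and a pairwise accuracy under the assumption that each test sample carries a pair identifier and that a pair counts as correct only when both of its samples are predicted correctly. The official evaluation script in the ENACT repository defines the actual metrics.

from collections import defaultdict

def task_accuracy(preds, labels):
    # Fraction of samples whose predicted outcome matches the reference label.
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def pairwise_accuracy(preds, labels, pair_ids):
    # Assumed definition: a pair is correct only if both member samples are correct.
    # This grouping is an illustration; the official script is authoritative.
    pairs = defaultdict(list)
    for pid, p, y in zip(pair_ids, preds, labels):
        pairs[pid].append(p == y)
    return sum(all(hits) for hits in pairs.values()) / len(pairs)

# Toy example with two pairs of samples.
preds    = ["A", "B", "A", "C"]
labels   = ["A", "B", "B", "C"]
pair_ids = [0, 0, 1, 1]
print(task_accuracy(preds, labels))                # 0.75
print(pairwise_accuracy(preds, labels, pair_ids))  # 0.5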

Challenge Leaderboard

Performance of submitted methods on the held-out test set.

Rank  Team / Method    Overall  Pairwise Acc.  Task Acc.
-     Random Baseline  -        -              -

Challenge submissions are coming soon; the leaderboard will be updated after the test set is released and submissions are evaluated.

Submission

Submission File Format (JSONL)

Submit a single .jsonl file with one JSON object per line, containing:

  • sample_id (string): Unique identifier for the test sample
  • prediction (string or integer): Your model's prediction
Example submission format
{"sample_id":"enact_000001","prediction":"A"}
{"sample_id":"enact_000002","prediction":"B"}

Requirements

  • Provide exactly one prediction for each sample_id in the test set (a simple self-check is sketched below).
  • Duplicate IDs: will result in an invalid submission.
  • Missing IDs: count as incorrect / invalid submission.
  • (Recommended) You may gzip the file to reduce its size: predictions.jsonl.gz
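
Before submitting, you may want to verify your file against these requirements locally. The following is a sketch of such a checker, not the official validator; expected_ids would come from the released test set, and gzipped files are handled as well.

import gzip
import json

def check_submission(path, expected_ids):
    # Verify one prediction per expected sample_id, with no duplicates or extras.
    opener = gzip.open if path.endswith(".gz") else open
    seen = set()
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            sid = row["sample_id"]
            if sid in seen:
                raise ValueError(f"Duplicate sample_id: {sid}")
            seen.add(sid)
    missing = set(expected_ids) - seen
    extra = seen - set(expected_ids)
    if missing or extra:
        raise ValueError(f"{len(missing)} missing and {len(extra)} unexpected sample_ids")
    print("Submission file passes the basic checks.")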

How to Submit

  1. Download the held-out test set (coming soon).
  2. Generate your predictions.jsonl following the required format.
  3. Name the file as TeamName_MethodName.jsonl (or .jsonl.gz); a packaging sketch follows these steps.
  4. Submit via the EvalAI platform (link coming soon).
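
For step 3, a small sketch of compressing and renaming the prediction file; "TeamName" and "MethodName" are placeholders for your own team and method names.

import gzip
import shutil

# Compress predictions.jsonl into a gzipped file named per the convention above.
with open("predictions.jsonl", "rb") as src:
    with gzip.open("TeamName_MethodName.jsonl.gz", "wb") as dst:
        shutil.copyfileobj(src, dst)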

Submission Limit: Up to 5 submissions per team; best submission counts
Deadline: TBD
Results Announcement: TBD

Rules

  • External data / models / APIs: Allowed with disclosure
    Teams may use external datasets, pre-trained models, or APIs, but must clearly disclose all external resources used in their submission.
  • Human-in-the-loop labeling on test: Disallowed
    Participants must not attempt to obtain test labels or manipulate the evaluation.
  • Verification: Top teams may be asked to provide a brief method description and reproducibility details.
  • Team size: No limit, but each team may only submit under one team name.

Baselines & Starter Kit

Baselines, data loaders, and evaluation scripts are available in the official ENACT repository:

github.com/mll-lab-nu/ENACT

Getting Started: Check out our baseline implementations and starter code to quickly get up and running with the ENACT dataset.

Contact

For questions, please reach out via: