ENACT Challenge
Evaluating Embodied Cognition with Egocentric Interaction World Modeling
We are organizing the ENACT Challenge, a benchmark designed to evaluate embodied cognition through egocentric interaction world modeling. Participants will be ranked by Pairwise Accuracy and Task Accuracy on a held-out test set.
Quick Links
- Dataset & Code: https://github.com/mll-lab-nu/ENACT
- Challenge Contact: qinengw@u.northwestern.edu
- Submission Portal: EvalAI (Coming Soon)
Challenge Overview
Goal
Given egocentric observations of embodied interactions, predict the correct outcomes for world modeling tasks. Models will be evaluated on their ability to understand and reason about embodied cognition in interactive environments.
What You Do
- Train / fine-tune on the ENACT training set.
- Develop and validate on the ENACT validation set.
- Run inference on the held-out test set (to be released) and submit predictions via EvalAI.
Data Splits
| Split | File |
|---|---|
| Train | ENACT_train.jsonl |
| Validation | ENACT_val.jsonl |
| Test (Held-out) | Final evaluation set (Coming Soon) |

Dataset: Data can be found at huggingface.co/datasets/MLL-Lab/ENACT
Format & loading: Please refer to the official instructions in the ENACT repository.
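If you just want to inspect the JSONL splits directly, the sketch below is one possible way to fetch them from the Hugging Face Hub. It assumes the huggingface_hub Python package and that ENACT_train.jsonl / ENACT_val.jsonl sit at the root of the MLL-Lab/ENACT dataset repo; defer to the repository instructions if the layout differs.

```python
# Minimal loading sketch (not the official loader). Assumes the files sit at
# the root of the MLL-Lab/ENACT dataset repo on the Hugging Face Hub.
from huggingface_hub import hf_hub_download
import json

def load_enact_split(filename="ENACT_val.jsonl"):
    """Download one JSONL split from the ENACT dataset repo and parse it."""
    path = hf_hub_download(
        repo_id="MLL-Lab/ENACT",
        filename=filename,      # assumed file location; check the repo if this fails
        repo_type="dataset",
    )
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

val_samples = load_enact_split("ENACT_val.jsonl")
print(len(val_samples), "validation samples")
```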
Evaluation
- Primary Metrics:
  - Pairwise Accuracy: Measures the model's ability to correctly compare and rank pairs of interactions.
  - Task Accuracy: Measures the model's ability to correctly predict task outcomes.
- Ranking: Teams are ranked by a weighted combination of Pairwise Accuracy and Task Accuracy (see the scoring sketch after this list).
- (Optional) We may additionally report accuracy by task category and interaction type for detailed analysis.
- Tie-break: Higher Task Accuracy, then earlier submission time.
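The official weights and metric definitions live in the ENACT evaluation scripts; the sketch below only illustrates how an overall score could be combined from the two primary metrics. The equal 0.5/0.5 weighting and the function names are assumptions, not the announced formula.

```python
# Illustrative scoring sketch only; the official evaluation is in the ENACT repo.
def task_accuracy(predictions: dict, labels: dict) -> float:
    """Fraction of test samples whose predicted outcome matches the label."""
    correct = sum(predictions.get(sid) == label for sid, label in labels.items())
    return correct / len(labels)

def overall_score(pairwise_acc: float, task_acc: float,
                  w_pairwise: float = 0.5, w_task: float = 0.5) -> float:
    """Weighted combination of the two primary metrics (weights assumed equal here)."""
    return w_pairwise * pairwise_acc + w_task * task_acc

# Example: a run with 0.72 pairwise accuracy and 0.65 task accuracy.
print(overall_score(0.72, 0.65))  # 0.685 under the assumed equal weights
```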
Challenge Leaderboard
Performance of submitted methods on the held-out test set.
| Rank | Team / Method | Overall | Pairwise Acc. | Task Acc. |
|---|---|---|---|---|
| - | Random Baseline | - | - | - |

Challenge submissions coming soon.
Submission
Submission File Format (JSONL)
Submit a single .jsonl file with one JSON object per line, containing:
- sample_id (string): Unique identifier for the test sample
- prediction (string or integer): Your model's prediction
{"sample_id":"enact_000001","prediction":"A"}
{"sample_id":"enact_000002","prediction":"B"} Requirements
- Provide exactly one prediction for each sample_id in the test set.
- Duplicate IDs will result in an invalid submission.
- Missing IDs count as incorrect / invalid submission.
- (Recommended) You may gzip the file to reduce its size: predictions.jsonl.gz (a helper sketch follows this list).
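The helper below is a hypothetical convenience sketch, not part of the starter kit: it writes predictions in the required JSONL format, checks the basic submission rules (one prediction per test sample_id, no missing or unknown IDs), and gzips the output when the filename ends in .gz.

```python
# Hypothetical submission writer/validator sketch; adapt to your own pipeline.
import gzip
import json

def write_submission(predictions: dict, test_ids: list, path: str = "predictions.jsonl.gz"):
    """predictions maps sample_id -> prediction (string or integer)."""
    test_id_set = set(test_ids)
    missing = [sid for sid in test_ids if sid not in predictions]
    extra = [sid for sid in predictions if sid not in test_id_set]
    if missing:
        raise ValueError(f"Missing predictions for {len(missing)} sample_ids, e.g. {missing[:3]}")
    if extra:
        raise ValueError(f"Predictions for unknown sample_ids, e.g. {extra[:3]}")

    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt", encoding="utf-8") as f:
        for sid in test_ids:
            f.write(json.dumps({"sample_id": sid, "prediction": predictions[sid]}) + "\n")

# Example usage with placeholder IDs:
write_submission({"enact_000001": "A", "enact_000002": "B"},
                 ["enact_000001", "enact_000002"])
```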
How to Submit
1. Download the held-out test set (coming soon).
2. Generate your predictions.jsonl following the required format.
3. Name the file TeamName_MethodName.jsonl (or .jsonl.gz).
4. Submit via the EvalAI platform (link coming soon).
Rules
- External data / models / APIs: Allowed with disclosure. Teams may use external datasets, pre-trained models, or APIs, but must clearly disclose all external resources used in their submission.
- Human-in-the-loop labeling on test: Disallowed
- Participants must not attempt to obtain test labels or manipulate evaluation.
- Verification: Top teams may be asked to provide a brief method description and reproducibility details.
- Team size: No limit on team size, but each team may only submit under one team name.
Baselines & Starter Kit
Baselines, data loaders, and evaluation scripts are available in the official ENACT repository:
github.com/mll-lab-nu/ENACT
Getting Started: Check out our baseline implementations and starter code to quickly get up and running with the ENACT dataset.
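For orientation only, here is a toy random-choice baseline sketch. It is not the official baseline from the repository; the "choices" field name and the A/B fallback options are assumptions based on the example predictions above, so adapt it to the actual sample schema.

```python
# Toy random baseline sketch (not the official baseline in the ENACT repo).
import random

def random_baseline(test_samples, seed=0):
    """Pick a uniformly random answer for each sample."""
    rng = random.Random(seed)
    preds = {}
    for sample in test_samples:
        options = sample.get("choices", ["A", "B"])  # assumed field name and fallback
        preds[sample["sample_id"]] = rng.choice(options)
    return preds

# Example with two dummy samples:
dummy = [{"sample_id": "enact_000001", "choices": ["A", "B"]},
         {"sample_id": "enact_000002", "choices": ["A", "B"]}]
print(random_baseline(dummy))
```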
Contact
For questions, please reach out via qinengw@u.northwestern.edu.