FlagEval-Embodied Verse

Welcome to FlagEval-Embodied Verse! FlagEval-Embodied Verse aims to track, rank, and evaluate embodied large models through the FlagEval embodied toolchain. FlagEvalMM provides the multimodal evaluation framework, Embodied Verse builds a capability system on top of high-quality evaluation datasets for embodied intelligence, and the Leaderboard tracks and presents the overall capabilities of different embodied large models in real time.

61.18, 42.38, 68.21, 84.59, 59.87, 78.74, 51.78, 47.81, 79.33, 42.85, 56.25, Mistral-Small-3.1-24B-Instruct-2503