Finance Agent Benchmark
L3
OOD — Specialized Tools
TaskPoolSet
Financial specialist tasks that rely on EDGAR filing retrieval, web parsing, and domain-specific financial APIs. Solving them requires understanding financial documents and invoking specialized tools in patterns not seen during training.
Tool Pool: EDGAR / Web / Parse / Retrieve
Toolset: Uniform
Protocol: OpenAI Function Calling
Environment: Stateless
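To make the protocol and environment settings concrete, here is a minimal sketch of how a tool might be declared in the OpenAI function-calling format and executed statelessly. The tool name `edgar_search`, its parameters, and the stub handler are hypothetical illustrations, not the benchmark's actual tool definitions; the handler returns a placeholder rather than contacting EDGAR.

```python
import json

# Hypothetical tool schema in the OpenAI function-calling format.
# The name and parameters are assumptions for illustration only.
EDGAR_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "edgar_search",
        "description": "Search SEC EDGAR filings for a company.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {"type": "string", "description": "Company ticker symbol."},
                "form_type": {"type": "string", "description": "Filing form type, e.g. 10-K."},
            },
            "required": ["ticker", "form_type"],
        },
    },
}


def edgar_search(ticker: str, form_type: str) -> dict:
    """Stub handler: returns a placeholder instead of calling EDGAR."""
    return {"ticker": ticker.upper(), "form_type": form_type, "filings": []}


# Registry mapping tool names to handlers (hypothetical).
HANDLERS = {"edgar_search": edgar_search}


def dispatch(tool_call: dict) -> str:
    """Execute one tool call and return its result as a JSON string.

    Stateless: the result depends only on the call's name and arguments,
    not on any earlier call.
    """
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    if name not in HANDLERS:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(HANDLERS[name](**args))
```

In this sketch the model would receive `EDGAR_SEARCH_TOOL` in its tool list, emit a tool call whose `arguments` field is a JSON string, and get back the string returned by `dispatch`. Because no state is kept between calls, each call can be replayed or evaluated in isolation.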
Performance (Success Rate %)
Base
Best 8B Baseline
DIVE (SFT)
DIVE (RL)