AllenAI
Olmo 3 32B Think (free)
Rank: #55
Score Overview
Overall Score: 42.0%
Accuracy: 40.0%
Syntax Valid: 100.0%
Score Dimensions
Correctness: 47
Best Practices: 41
Performance: 42
Clarity: 42
Performance by Complexity
Basic: 100.0%
Intermediate: 29.3%
Advanced: 34.4%
Tasks Correct: 12 / 30
Avg Response Time: 204852ms
Task Breakdown
Task | Category | Score | Time
Total Sales Amount (task-001) | aggregation | 100% | 17652ms
Count of Customers (task-002) | aggregation | 100% | 127534ms
Average Unit Price (task-003) | aggregation | 100% | 29960ms
Distinct Product Count (task-004) | aggregation | 100% | 37634ms
Total Order Quantity (task-005) | aggregation | 100% | 22561ms
Year-to-Date Sales (task-006, Query Failed) | time intelligence | 10% | 224379ms
Previous Year Sales (task-007, Query Failed) | time intelligence | 10% | 67792ms
Sales by Category Filter (task-008) | filtering | 100% | 42834ms
Year-over-Year Growth Percentage (task-009, Query Failed) | calculation | 10% | 247633ms
Running Total with CALCULATE and FILTER (task-010) | iterator | 30% | 349649ms
Sales Summary by Category (task-011) | table manipulation | 100% | 74643ms
Product List with Renamed Columns (task-012, Query Failed) | table manipulation | 10% | 62003ms
Union of High-Value Transactions (task-013, Query Failed) | table manipulation | 10% | 81464ms
Year-Category Analysis Matrix (task-014, Query Failed) | table manipulation | 10% | 67918ms
Product Percentage of Category Total (task-015, Query Failed) | context transition | 10% | 196240ms
Virtual Relationship with TREATAS (task-016) | context transition | 100% | 238198ms
Granularity-Aware Measure with VALUES (task-017, Query Failed) | context transition | 10% | 120736ms
Running Count with EARLIER (task-018, Query Failed) | context transition | 10% | 301156ms
Multiple Filter Conditions (task-019) | filtering | 100% | 207800ms
Percentage of Total with ALLEXCEPT (task-020) | filtering | 100% | 174931ms
Filter Intersection with KEEPFILTERS (task-021) | filtering | 100% | 424610ms
Product Ranking with RANKX (task-022, Query Failed) | iterator | 10% | 237478ms
Top 5 Products with TOPN (task-023, Query Failed) | table manipulation | 10% | 220023ms
90th Percentile Order Value (task-024, Query Failed) | iterator | 10% | 131163ms
Handle Missing Data with BLANK (task-025) | calculation | 100% | 36339ms
Safe Ratio with Cascading Fallbacks (task-026, Query Failed) | calculation | 10% | 537975ms
Safe Year-over-Year with Missing Data (task-027, Query Failed) | time intelligence | 10% | 441310ms
3-Month Rolling Average (task-028, Query Failed) | time intelligence | 10% | 260266ms
Same Month Previous Year Comparison (task-029, Query Failed) | time intelligence | 10% | 657519ms
Fiscal Year-to-Date (July Start) (task-030, Query Failed) | time intelligence | 10% | 506152ms