Anthropic
Claude 3 Haiku
Rank: #56
Score Overview
Overall Score: 36.7%
Accuracy: 33.3%
Syntax Valid: 100.0%
Score Dimensions
Correctness: 41
Best Practices: 37
Performance: 39
Clarity: 40
Performance by Complexity
Basic: 84.7%
Intermediate: 30.5%
Advanced: 26.3%
Tasks Correct: 10 / 30
Avg Response Time: 1345 ms
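The Accuracy figure matches the Tasks Correct count: 10 of 30 tasks is about 33.3%. The short Python sketch below reproduces that arithmetic. It assumes accuracy is simply the fraction of fully correct tasks, which the page does not state explicitly, and the function name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(tasks_correct: int, total_tasks: int) -> float:
    """Return the share of fully correct tasks as a percentage, rounded to one decimal.

    Assumption: accuracy = tasks correct / total tasks. The leaderboard page
    does not document its formula; this only checks that the numbers agree.
    """
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    return round(100 * tasks_correct / total_tasks, 1)


# Values taken from the page: 10 of 30 tasks correct.
accuracy_pct = accuracy_percent(10, 30)
print(f"Accuracy: {accuracy_pct}%")
```

The Overall Score (36.7%) is a separate figure; its weighting is not given on the page, so it is not reproduced here.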
Task Breakdown
| Task | ID | Category | Score | Time | Status |
| --- | --- | --- | --- | --- | --- |
| Total Sales Amount | task-001 | aggregation | 100% | 894 ms | |
| Count of Customers | task-002 | aggregation | 100% | 892 ms | |
| Average Unit Price | task-003 | aggregation | 10% | 1016 ms | Query Failed |
| Distinct Product Count | task-004 | aggregation | 100% | 1000 ms | |
| Total Order Quantity | task-005 | aggregation | 100% | 1227 ms | |
| Year-to-Date Sales | task-006 | time intelligence | 10% | 1017 ms | Query Failed |
| Previous Year Sales | task-007 | time intelligence | 10% | 1151 ms | Query Failed |
| Sales by Category Filter | task-008 | filtering | 100% | 1627 ms | |
| Year-over-Year Growth Percentage | task-009 | calculation | 10% | 2278 ms | Query Failed |
| Running Total with CALCULATE and FILTER | task-010 | iterator | 10% | 1163 ms | Query Failed |
| Sales Summary by Category | task-011 | table manipulation | 10% | 1239 ms | Query Failed |
| Product List with Renamed Columns | task-012 | table manipulation | 100% | 1095 ms | |
| Union of High-Value Transactions | task-013 | table manipulation | 10% | 936 ms | Query Failed |
| Year-Category Analysis Matrix | task-014 | table manipulation | 10% | 833 ms | Query Failed |
| Product Percentage of Category Total | task-015 | context transition | 10% | 1585 ms | Query Failed |
| Virtual Relationship with TREATAS | task-016 | context transition | 10% | 1390 ms | Query Failed |
| Granularity-Aware Measure with VALUES | task-017 | context transition | 10% | 1657 ms | Query Failed |
| Running Count with EARLIER | task-018 | context transition | 30% | 2527 ms | |
| Multiple Filter Conditions | task-019 | filtering | 10% | 1305 ms | Query Failed |
| Percentage of Total with ALLEXCEPT | task-020 | filtering | 10% | 1123 ms | Query Failed |
| Filter Intersection with KEEPFILTERS | task-021 | filtering | 100% | 1099 ms | |
| Product Ranking with RANKX | task-022 | iterator | 100% | 1084 ms | |
| Top 5 Products with TOPN | task-023 | table manipulation | 30% | 1561 ms | |
| 90th Percentile Order Value | task-024 | iterator | 100% | 929 ms | |
| Handle Missing Data with BLANK | task-025 | calculation | 100% | 1065 ms | |
| Safe Ratio with Cascading Fallbacks | task-026 | calculation | 10% | 2054 ms | Query Failed |
| Safe Year-over-Year with Missing Data | task-027 | time intelligence | 10% | 1981 ms | Query Failed |
| 3-Month Rolling Average | task-028 | time intelligence | 10% | 1764 ms | Query Failed |
| Same Month Previous Year Comparison | task-029 | time intelligence | 10% | 1522 ms | Query Failed |
| Fiscal Year-to-Date (July Start) | task-030 | time intelligence | 10% | 1341 ms | Query Failed |