Current Model
Test Configuration
Select the transformer model to use for testing
Tip: The default country changes randomly on each page load
Higher values provide more comprehensive error statistics but take longer
Advanced Configuration
Configure exp(x) range and parameters for different methods
Configure the exp(x) range for different method types. Single-table methods are limited by int16 precision.
Recommendation: Use 6-10 for balanced coverage. Single tables struggle with wide ranges, so use DIGmax for xmax > 10.
Recommendation: Use 20-40 for robust handling of diverse attention patterns. DIGmax excels at wide ranges, so a higher value adds safety without major accuracy loss.
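The int16 limit on single tables can be illustrated with a minimal sketch. It assumes the table stores exp(-x) for x in [0, xmax], scaled so that exp(0) maps to 32767; both the function name quantize_exp_neg and this scaling are assumptions made for illustration, not taken from the tool itself.

```python
import math

INT16_MAX = 32767  # largest positive value a signed 16-bit entry can hold


def quantize_exp_neg(x, scale=INT16_MAX):
    """Hypothetical helper: quantize exp(-x) to a non-negative int16 count.

    Assumes exp(0) = 1.0 is mapped to `scale`; the real tool may scale differently.
    """
    return round(math.exp(-x) * scale)


def relative_step(x, scale=INT16_MAX):
    """Relative size of one quantization step at x (larger means coarser)."""
    return 1.0 / (math.exp(-x) * scale)
```

Under these assumptions, quantize_exp_neg(0.0) is 32767, quantize_exp_neg(10.0) is 1, and quantize_exp_neg(11.5) is 0, so a single table loses nearly all resolution once x passes roughly 10.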
Tradeoff: Higher orders (6-7) improve accuracy for large x but require more multiply-accumulate operations and risk numerical overflow.
Lower orders (2-3) are faster but only accurate for small x values.
Recommendation: Order 5 balances accuracy and computational cost. Use Order 3-4 for ultra-low-power, Order 6-7 for research comparisons.
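The order tradeoff can be seen in a small floating-point sketch of a truncated Taylor series for exp(x), evaluated with Horner's scheme. The name taylor_exp is hypothetical, and the tool's own fixed-point implementation is not shown in this page, so treat this only as an illustration of how accuracy depends on order and on the size of x.

```python
import math


def taylor_exp(x, order):
    """Truncated Taylor series 1 + x + x^2/2! + ... + x^order/order!.

    Horner form: one multiply (and one divide by a constant) per order.
    """
    result = 1.0
    for k in range(order, 0, -1):
        result = 1.0 + x * result / k
    return result


def abs_error(x, order):
    """Absolute error of the truncated series against math.exp."""
    return abs(taylor_exp(x, order) - math.exp(x))
```

For x = 2, the error shrinks from about 1.06 at order 3 to about 0.12 at order 5, while for x = 0.1 even order 3 is already accurate to better than 1e-5.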
Tradeoff: More tables provide finer-grained range adaptation, improving accuracy across diverse attention patterns but increasing memory footprint.
Fewer tables reduce memory but force precision compromises.
Recommendation: 6 tables (~3KB) is optimal for embedded systems. Use 8-12 for best accuracy, 3-4 for extreme memory constraints.
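The memory figures can be checked with a short calculation. The sketch below assumes 256 int16 entries per table, which is an assumption chosen because it reproduces the ~3KB figure for 6 tables; the tool's actual table size is not stated here.

```python
def table_memory_bytes(num_tables, entries_per_table=256, bytes_per_entry=2):
    """Total lookup-table footprint in bytes.

    entries_per_table=256 and bytes_per_entry=2 (int16) are assumed defaults.
    """
    return num_tables * entries_per_table * bytes_per_entry


def table_memory_kib(num_tables, **kwargs):
    """Same footprint expressed in KiB."""
    return table_memory_bytes(num_tables, **kwargs) / 1024
```

With these defaults, 3 tables take 1.5 KiB, 6 tables take 3 KiB, and 12 tables take 6 KiB.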
Tradeoff: Linear distribution lacks exponential adaptation, requiring many more tables (256+) to match the accuracy of 6 logarithmic tables.
Fewer tables severely degrade precision. More tables approach logarithmic accuracy but waste memory.
Note: Exponential/log distribution is generally superior. Use linear only for controlled benchmarking or specific hardware constraints.
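The difference between the two distributions can be sketched by comparing where each places the upper bound of every table's range. The helper names and the power-of-two spacing are assumptions for illustration; the tool's actual DIGmax range selection is not described in this page.

```python
import bisect


def linear_edges(xmax, num_tables):
    """Upper range bounds spaced evenly between 0 and xmax."""
    return [xmax * (i + 1) / num_tables for i in range(num_tables)]


def log_edges(xmax, num_tables):
    """Upper range bounds that double from table to table, ending at xmax.

    Power-of-two spacing is an assumed choice of exponential distribution.
    """
    return [xmax / 2 ** (num_tables - 1 - i) for i in range(num_tables)]


def select_table(x, edges):
    """Index of the first table whose upper bound covers x (clamped to the last)."""
    return min(bisect.bisect_left(edges, x), len(edges) - 1)
```

For xmax = 32 and 6 tables, log_edges gives bounds of 1, 2, 4, 8, 16, and 32, so small inputs get narrow, finely resolved ranges. linear_edges places its first bound above 5, so every input below that shares one coarse table.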
Select Implementations to Test
Click cards to select/deselect implementations. Each will run separately and results will be compared.