- Disappointing on the NPU. I've found this is an area where industry-wide improvement is necessary. People talk tokens/sec, model sizes, which formats are supported... but I rarely see an objective accuracy comparison. I repeatedly see claims that AI models are resilient to errors and reduced precision, which is what allows 1-bit quantization and whatnot.
But at a certain point I guess it just breaks? And they need an objective "I gave these tokens, I got out those tokens" test. But I guess that would need an objective gold-standard ground truth, which is maybe hard to come by.
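
One way to sidestep the missing gold standard, as a rough sketch: treat the full-precision output of the same model (same prompt, greedy decoding) as the reference, and check how far the NPU/quantized output agrees with it token by token. The function name `token_agreement` and the token IDs below are made up for illustration; they don't come from any particular runtime.

```python
from typing import Optional, Sequence


def token_agreement(reference: Sequence[int], candidate: Sequence[int]) -> dict:
    """Compare two token-ID sequences position by position.

    `reference` is assumed to be the full-precision output and `candidate`
    the reduced-precision output for the same prompt (both hypothetical).
    """
    shorter = min(len(reference), len(candidate))
    longer = max(len(reference), len(candidate))

    # Count positions where both sequences produced the same token.
    matches = sum(1 for r, c in zip(reference, candidate) if r == c)

    # Index of the first position where the two outputs disagree.
    first_divergence: Optional[int] = next(
        (i for i, (r, c) in enumerate(zip(reference, candidate)) if r != c),
        None,
    )
    # If one output is a strict prefix of the other, they diverge
    # where the shorter one ends.
    if first_divergence is None and len(reference) != len(candidate):
        first_divergence = shorter

    return {
        # Missing positions in the shorter sequence count as mismatches.
        "match_rate": matches / longer if longer else 1.0,
        "first_divergence": first_divergence,
    }


# Made-up token IDs, purely for illustration.
reference_tokens = [101, 7592, 2088, 2003, 2307, 102]
npu_tokens = [101, 7592, 2088, 2001, 2307, 102]

result = token_agreement(reference_tokens, npu_tokens)
print(result)
```

Worth keeping in mind: once autoregressive generation diverges, everything after that point is conditioned on different context, so `first_divergence` is probably the more telling number and `match_rate` is only a coarse summary. It also only measures agreement with the reference, not whether the reference itself was right.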