CoachEval – Evaluating AI model performance on agentic coaching in endurance sports

As part of developing my app for AI-powered training planning and coaching in endurance sports, I’ve recently spent a lot of time looking into LLM benchmarks and setting up my own evaluations. What data do I feed into the LLM? How does it respond in specific situations? What tools do I provide to it? Are the training plans it generates good and effective? It is a very exciting task. This week, Strava launched its own MCP server. As a result, many athletes will use it to evaluate their training as well as to create training plans. I think benchmarks are becoming increasingly important for measuring and improving the coaching skills of LLMs.

The idea: CoachEval

Stay tuned!

Interested in what I’m doing? Contact me at mail@united-in-pace.com
