Hello 👋, I’m Lena Shakurova, Conversational AI Advisor & CEO and Founder of parslabs.org and chatbotly.co
LinkedIn: https://www.linkedin.com/in/lena-shakurova/
🌎 Amsterdam
📌 If you want another tool added, send me an email at [email protected]
<aside> 🕑
Last updated on 16.06.2025
I will try to keep this resource updated as I learn about new eval tools. New tools will be marked “WIP” until I have time to test them.
</aside>
<aside> 👌
If you need extra help with LLM evals, I offer audits, consultations, and full setup support to help your team build a proper LLM evaluation framework, so you can release to production with more confidence.
Send me a DM on LinkedIn or book a free intro call and I’ll send you more details about our LLM eval setup service :)
</aside>
Build with evidence, monitor what matters
To know if you're improving, you must measure.
During development, evaluation shows whether prompt changes or model tweaks help or harm. Guessing slows you down.
But LLM evals are hard. Change one word in a prompt, does it help? Add a new instruction, do past use cases still pass?
LLMs are non-deterministic: the same input might produce five different outputs. How do you decide what's “good” or “bad”? Do you rely on human judgment, unit tests, or automatic scoring? And how do you catch silent regressions when nothing breaks, but quality slips?
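One way to handle non-determinism is to sample the same prompt several times and report a pass rate instead of a single pass/fail. The sketch below is only an illustration under assumptions: `fake_model` is a hypothetical stand-in for a real LLM call, and `mentions_three_days` is a made-up check for one made-up use case.

```python
import random
from typing import Callable


def fake_model(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call: returns one of several phrasings."""
    return rng.choice([
        "Your order ships in 3 days.",
        "It will ship in 3 days.",
        "Shipping takes about three days.",
        "I can't help with that.",
    ])


def pass_rate(prompt: str, check: Callable[[str], bool],
              n_samples: int = 20, seed: int = 0) -> float:
    """Sample the model n_samples times and return the fraction of outputs that pass."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    passed = sum(check(fake_model(prompt, rng)) for _ in range(n_samples))
    return passed / n_samples


def mentions_three_days(output: str) -> bool:
    """Illustrative check: the answer must state the delivery time in some form."""
    text = output.lower()
    return "3 days" in text or "three days" in text


rate = pass_rate("When will my order ship?", mentions_three_days)
print(f"pass rate: {rate:.0%}")
```

Tracking this number before and after a prompt change gives a rough signal of whether the change helped, even when individual outputs differ from run to run.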
In production, monitoring becomes critical. You need alerts when something fails, like the bot refusing basic tasks or drifting off-topic. Test sets help prevent this. Cover edge cases, simulate unexpected inputs: incomplete data, foreign languages, or hostile users.
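As a rough illustration of the alerting idea, the sketch below keeps a rolling window of recent responses and fires when the share of refusals crosses a threshold. The class name `RefusalMonitor`, the refusal phrases, the window size, and the threshold are all assumptions chosen for the example, not values taken from any specific tool.

```python
from collections import deque

# Assumed refusal phrases; a real setup would tune these or use a classifier.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i'm unable to")


class RefusalMonitor:
    """Tracks the refusal rate over the most recent `window` responses."""

    def __init__(self, window: int = 50, threshold: float = 0.2):
        self.recent = deque(maxlen=window)  # oldest entries drop off automatically
        self.threshold = threshold

    def refusal_rate(self) -> float:
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def record(self, response: str) -> bool:
        """Store one response; return True if the alert should fire."""
        text = response.lower()
        self.recent.append(any(marker in text for marker in REFUSAL_MARKERS))
        return self.refusal_rate() > self.threshold


monitor = RefusalMonitor(window=5, threshold=0.4)
responses = [
    "Sure, here is your invoice.",
    "I can't help with that.",
    "Done!",
    "I cannot help with this request.",
    "I'm unable to do that.",
]
alerts = [monitor.record(r) for r in responses]
print(alerts)
```

With a window this small the alert is noisy; in practice a larger window and a minimum sample count before alerting would likely be more useful.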
This page lists tools to evaluate, test, and monitor LLMs through every stage of development and deployment.
<aside> 💡
Check the “Gallery” view for screenshots and the “Open Source” view to see all open source tools.
</aside>
No-code tools for LLM evaluation
<aside> 👌
If you need extra help with LLM evals, I offer audits, consultations, and full setup support, all focused on helping your team release to production with more confidence.
📧 Send me a DM on LinkedIn or book a free intro call and I’ll send you more details about our LLM eval setup service :)
</aside>