Sumanth (@Sumanth_077) “Pytest for AI Agents! (100% open-source and runs locally) Building agents with L” — TopicDigg

Sumanth
@Sumanth_077
Simplifying LLMs, RAG, Machine Learning & AI Agents for you! • ML Developer Advocate • Shipping Open Source AI apps
Joined July 2021
870 Following    76.6K Followers
Pytest for AI Agents! (100% open-source and runs locally)

Building agents with LangChain means chaining LLMs, tools, and retrieval steps together. Each component can fail differently. The output changes with every run. Traditional unit tests don't work here because there's no deterministic value to assert against.

DeepEval's LangChain integration brings Pytest to this problem. You write test files the same way you write any Pytest test. Loop through your evaluation dataset, run your agent, assert against LLM metrics. Same workflow you already know.

The tracing works through a CallbackHandler you pass directly to your LangChain agent. It captures the full execution trace - inputs, outputs, tool calls, LLM spans - and maps them to test cases automatically.

Testing works at two levels. End-to-end testing evaluates the whole agent on task completion. Component-level testing attaches metrics to individual LLMs and tools within your chain, so you know exactly which component failed when a test breaks.

Plugs into CI/CD with a single command. Add it to your GitHub Actions workflow and every push triggers your agent test suite before anything ships.

Key capabilities:
• Native Pytest integration with parametrize and assert_test
• LangChain CallbackHandler for automatic trace capture
• End-to-end and component-level evaluation
• Metrics: TaskCompletion, AnswerRelevancy, Hallucination, and more
• Parallel test execution across multiple processes
• CI/CD integration via GitHub Actions
• Results dashboard on Confident AI

100% open source. Runs entirely on your machine.

I've shared the link in the replies!
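The workflow described above (loop over a dataset, run the agent, assert on a metric score rather than an exact string, and use the trace to find the failing component) can be sketched in plain Python. This is only an illustration of the pattern, not DeepEval's or LangChain's API: `TraceRecorder`, `toy_agent`, `keyword_score`, and `evaluate` are hypothetical names, the "agent" is a stub with no real LLM, and a keyword-overlap score stands in for LLM-based metrics such as TaskCompletion or AnswerRelevancy.

```python
from dataclasses import dataclass, field

# Everything here is a hypothetical stand-in written for illustration;
# none of these names come from DeepEval or LangChain.


@dataclass
class Span:
    component: str  # which step produced this record, e.g. "retriever" or "llm"
    output: str


@dataclass
class TraceRecorder:
    """Plays the role of a tracing callback: collects one span per component call."""
    spans: list = field(default_factory=list)

    def record(self, component: str, output: str) -> None:
        self.spans.append(Span(component, output))


DOCS = {"france": "Paris is the capital of France."}  # toy knowledge base


def toy_agent(question: str, recorder: TraceRecorder) -> str:
    # Retrieval step: return the first document whose key appears in the question.
    context = next((text for key, text in DOCS.items() if key in question.lower()), "")
    recorder.record("retriever", context)
    # "LLM" step: a fake model that echoes the context or admits ignorance.
    answer = context or "I don't know."
    recorder.record("llm", answer)
    return answer


def keyword_score(text: str, keywords: list) -> float:
    """Toy metric in [0, 1]: fraction of expected keywords found in the text."""
    if not keywords:
        return 1.0
    hits = sum(kw.lower() in text.lower() for kw in keywords)
    return hits / len(keywords)


def evaluate(question: str, keywords: list, threshold: float = 0.5):
    """Return (end-to-end score, names of components scoring below threshold)."""
    recorder = TraceRecorder()
    answer = toy_agent(question, recorder)
    end_to_end = keyword_score(answer, keywords)  # end-to-end level
    failed = [s.component for s in recorder.spans  # component level
              if keyword_score(s.output, keywords) < threshold]
    return end_to_end, failed


DATASET = [
    ("What is the capital of France?", ["Paris"]),
    ("What is the capital of Peru?", ["Lima"]),
]

if __name__ == "__main__":
    for question, keywords in DATASET:
        score, failed = evaluate(question, keywords)
        print(f"{question!r}: score={score:.2f}, failed components={failed}")
```

In a real test file the loop over `DATASET` would presumably become a parametrized Pytest test and the threshold check an assertion, which is the role the post assigns to parametrize and assert_test; the exact signatures should be taken from DeepEval's documentation rather than from this sketch.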