Show HN: High-performance GenAI engine now open source

19 points by fryz 8 hours ago | 12 comments

serguei
We've been ramping up our GenAI usage for the last ~month at Upsolve and it's becoming a huge pain. There are already a million solutions for observability out there, but I like that this one is open source and can detect hallucinations.
Thanks for open sourcing and sharing, excited to try this out!!
- fryz
  Yeah thanks for the feedback.
  We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, and with a product that can be deployed and managed easily and reliably in customer-managed environments.
  A lot of the products today have similar evaluations and metrics, but they either offer a SaaS solution or require some onerous integration into your application stack.
  Because we started with the enterprise first, our goal was to get to value as quickly and easily as possible (to avoid shoulder-surfing over Zoom calls because we don't have access to the service), and we think this plays out well in our product.
madeleinelane
Love this. More transparency + better tooling is exactly what AI needs right now. Excited to give it a try.
Gabriel_h
Interesting, AI needs much better guardrails and monitoring!
kacperek0
Cool, I'm running a few GenAI automations, but they're rather unsupervised. So I'm gonna try it and check how they're doing.
iabouhashish
Very excited to be trying this out! The examples look very useful, and I'm excited to tie it in with other open source solutions.
Lupita___
Thanks for sharing! This looks perfect for teams getting started with monitoring for all model types -- excited to try it out!
pierniki
Yoo! Hopefully no more "oops our AI just leaked the system prompt" moments thanks to these guardrails!
jdbtech
Looks great! How does the system detect hallucinations?
- fryz
  Yeah great question
  We based our hallucination detection on claim-by-claim "groundedness", which evaluates whether each claim in the LLM response is backed by the provided context (e.g. message history, tool calls, retrieved context from a vector DB, etc.).
  We split the response into multiple claims, determine whether each claim needs to be evaluated (e.g. it isn't just boilerplate), and then check whether the claim is referenced in the context.
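  To make the three steps above concrete, here is a rough sketch in Python. It is not the project's code: the function names, the boilerplate list, and the 0.8 threshold are all made up for illustration, and plain token overlap stands in for whatever model-based check a real system would use.

```python
import re

# Hypothetical boilerplate openers and stopwords, chosen only for this sketch.
BOILERPLATE = ("sure", "i hope this helps", "let me know")
STOPWORDS = {"the", "a", "an", "is", "are", "was", "were",
             "of", "in", "on", "to", "and", "it", "that", "this"}


def split_claims(response):
    """Step 1: split the response into sentence-level claims."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def tokens(text):
    """Lowercased content words, used as a crude proxy for meaning."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def needs_evaluation(claim):
    """Step 2: skip boilerplate and claims with too little content."""
    lowered = claim.lower()
    if any(lowered.startswith(b) for b in BOILERPLATE):
        return False
    return len(tokens(claim)) >= 3


def is_grounded(claim, context, threshold=0.8):
    """Step 3: a claim counts as grounded if one context chunk covers
    at least `threshold` of its content words (assumed heuristic)."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return False
    for chunk in context:
        overlap = len(claim_tokens & tokens(chunk)) / len(claim_tokens)
        if overlap >= threshold:
            return True
    return False


def check_response(response, context):
    """Return (claim, verdict) pairs; verdict is None for skipped claims."""
    results = []
    for claim in split_claims(response):
        if not needs_evaluation(claim):
            results.append((claim, None))
        else:
            results.append((claim, is_grounded(claim, context)))
    return results


context = ["The Eiffel Tower is located in Paris and was completed in 1889."]
response = ("Sure, happy to help! The Eiffel Tower was completed in 1889. "
            "It is 500 meters tall.")
for claim, verdict in check_response(response, context):
    print(verdict, "|", claim)
```

  In this toy run the greeting is skipped, the 1889 claim is marked grounded, and the height claim is flagged because nothing in the context supports it. Token overlap misses paraphrases and negations, so treat it only as a placeholder for the real grounding check.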
- iabouhashish
  [dead]
vparekh1995
Excited to get hands on with this. I've had too many sleepless nights trying to figure out how to track when my agents were hallucinating.
cipherchain111
Very cool!
saintjcob
[dead]