  • Interesting, a lot of this is consistent with our experience!

    We've been adding linters, including semantic analysis, to various AI talk2db connectors in loiue.ai; they help steer models better than normal thin MCPs do. We've had to hand-roll this to make it easy for folks to use Splunk, Kusto, etc., so it's cool to see something like this for SQL - we were hoping for precisely that! Our semantic analysis is driven by what's in the DB (schema, ...) crossed with policy controls configurable by team admins, which this does out of the box for SQL.

    • That's exactly the use case I built agent runtime mode for: AI agents generating SQL need a policy layer between intention and execution. The rules engine is designed to be extensible for precisely that reason, and it can be enhanced with DB metadata via the catalog integration.

      Would love to compare notes on how you're handling the non-SQL side. Feel free to reach out!

      hello@lexega.com.

  • It's a cool workaround for the problem that preserves the problem. An alternative is to write the query in a more reasonable language like https://prql-lang.org/, which has a representation closer to the semantic meaning and mostly avoids the big diff in the first place.
  • Interesting note about the left join in the CTE being converted into an inner join. Didn't know that.
    • Yeah, it's one of those things that is hard to catch unless you've been bitten by it before and know to look for it. Analytics teams at scale are at much higher risk of this sneaking in, which is where automatic blocking with Lexega helps. No one wants to explain to leadership, months down the road, that their dashboards were wrong because of such a subtle SQL bug.
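
      For anyone who hasn't hit it: the usual trigger is a WHERE filter in the outer query on a column from the NULL-extended side of the join. A minimal sketch (table and column names made up for illustration):

        WITH user_orders AS (
          SELECT u.id, u.name, o.status, o.amount
          FROM users u
          LEFT JOIN orders o ON o.user_id = u.id
        )
        SELECT id, name, amount
        FROM user_orders
        WHERE status = 'completed';
        -- Users with no orders carry a NULL status, so the filter drops
        -- them: the LEFT JOIN now behaves exactly like an INNER JOIN.
        -- Moving the condition into the JOIN's ON clause (or adding
        -- OR status IS NULL) preserves the outer join.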
  • Using the term "signals" was a bit confusing to me. But this looks like a SQL linting tool?

    Seems like it's doing something similar to sqlfluff lint, even supporting the same dialects.

    Also the GitHub link in the docs section leads to a 404.

    • Great question! sqlfluff catches real things like "= NULL" bugs, implicit cross joins, unused CTEs, and SELECT * - it's a genuinely useful code quality tool, and its dialect coverage is extensive. Lexega's dialect implementations focus on depth over breadth.
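
      (For anyone unfamiliar with the "= NULL" bug: comparing against NULL with = evaluates to unknown rather than true, so the predicate never matches anything; you need IS NULL. A toy example with a made-up table:)

        -- Silently returns zero rows, even when some rows have a NULL deleted_at:
        SELECT * FROM accounts WHERE deleted_at = NULL;

        -- What was almost certainly intended:
        SELECT * FROM accounts WHERE deleted_at IS NULL;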

      Lexega is asking a different question, though. sqlfluff asks "is this SQL well-written?", while Lexega asks "is this SQL dangerous?". It parses queries into a full AST and emits signals (categorized AST components) that can be matched against YAML rules to block PRs, or to block execution in agent runtime mode: DELETE without WHERE, GRANT to PUBLIC, PII exposure without masking, DDL that drops tables in production pipelines. The output isn't "fix your style"; it's "this query violates an organizational risk policy and shouldn't be allowed to hit production".
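
      Concretely, the sort of statements the risk rules are meant to catch (toy examples, not taken from any real ruleset):

        -- Full-table delete: no WHERE clause
        DELETE FROM payments;

        -- Privilege escalation: world-readable grant
        GRANT SELECT ON customers TO PUBLIC;

        -- Destructive DDL against a production table
        DROP TABLE analytics.daily_revenue;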

      Think code quality vs. risk analysis. Both useful, different jobs.

      Good call on the GitHub link - I need to fix that.