• > If you are a junior developer, “learn SQL properly” is the most valuable 40 hours you can spend. Not a tutorial. Not an ORM. Actual SQL: joins, subqueries, window functions, query plans. That investment pays you back at every job, in every stack, for decades

    This is the power of low-level reasoning.

    Today, even for junior developers, even if they have AI that solves syntax problems, SQL teaches you to reason and approach problems logically, without any wrapper masking low-level logic.

    It's something like the letters of the alphabet that form concepts: why should they change?

    • The breakthrough for me was thinking in terms of sets and not in terms of "how would I do this imperatively."

      If you see SQL where someone wrote a SELECT and is then using a cursor to loop through those results and do other queries, you've found the person who is still thinking imperatively.
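A minimal sketch of the two styles described above, using Python's built-in `sqlite3` as a stand-in engine. The `customers` and `orders` tables and both function names are made up for illustration; they are not from the article or the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total INTEGER NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edsger');
    INSERT INTO orders (customer_id, total) VALUES (1, 10), (1, 15), (2, 7);
""")


def totals_imperative(conn):
    """Cursor-style: walk one result set and issue a follow-up query per row."""
    customers = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    result = {}
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        result[name] = row[0]
    return result


def totals_set_based(conn):
    """Set-style: describe the whole result once and let the engine do the work."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()
    return dict(rows)
```

Both functions return the same mapping; the first issues one query per customer, the second issues a single query regardless of how many customers there are.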

      • SQL teaches Set based thinking.

        Prolog teaches logical constraint based thinking.

        ML/Lisp/etc teach functional thinking.

        There is a lot of use in learning these other things beyond the standard imperative thinking from C/Python/Java/etc. Since some problems reduce their complexity significantly in one form or another.

      • > If you see SQL where someone wrote a SELECT and is then using a cursor to loop through those results and do other queries, you've found the person who is still thinking imperatively.

        To a first approximation, yes. But 'client-side joins' can be a valuable tool when the database engine won't cooperate. For some queries and some engines, you can do a select with a join to get everything you need in one query, but a select to get a list of ids followed by a union of selects (not an IN query) to get details for each id will have the results at the client sooner, with less load on the database, at some potential loss of consistency. Your client needs to be within a reasonable round trip of the server or the two queries approach won't get the answer faster.

        • I'm not sure I follow. Fetching the data in one query should almost always be faster and more efficient - the same work is being done regardless, except now you have the additional overhead involved with a second query (network and connection overhead, parsing etc.)
          • Depends on how the database engine sets up the join. I've seen queries where the join ends up spilling into a temp table and it takes a lot longer to do it that way. IIRC, this can happen especially if you've asked the database engine to sort things.

            Same with UNION vs IN. If you union 10 queries for one row each by id, it'll hit the index every time; but if you do it with IN, maybe it decides to do an index scan which takes longer.

            You could say well the database engine is broken if client-side join works better than database-side join, and sure it probably is, but given the choice of fix the database engine or do a client-side join, I know which one is feasible in the short term.

            • In that case, I'd suggest using EXPLAIN and CREATE INDEX on whatever requires full table scans.

              Worst case, you could CREATE MATERIALIZED VIEW whatever would've happened in that temp table.

              At the end of the day, your single query will be faster.
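A rough sketch of that EXPLAIN-then-index workflow, assuming SQLite through Python's `sqlite3` module. SQLite spells it `EXPLAIN QUERY PLAN`, and other engines print very different plans; the `events` table and index name are hypothetical. Materialized views are not shown because SQLite does not have them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, kind) VALUES (?, ?)",
    [(i % 100, "click") for i in range(1000)],
)


def plan(conn, sql):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


QUERY = "SELECT id FROM events WHERE user_id = 42"

# Without an index on user_id, the plan reports a scan of the whole table.
plan_before = plan(conn, QUERY)

# After adding the index, the plan reports a search using that index.
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
plan_after = plan(conn, QUERY)

print(plan_before)
print(plan_after)
```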

            • > database engine sets up the join

              Every major database uses nested loop, hash, or sort-merge joins. If the data is small enough to fetch in a second query, I don't see any scenario where it would make a query spill to disk when it otherwise wouldn't have (except those pesky OR's).

              Most JOIN issues can be resolved by composing using sub-queries/CTE's to defer a join to a smaller intermediary result, and the only time this generally can't be done is if you need a predicate on the joined table.

              > you've asked the database engine to sort things

              I've only found this to be true when there's an OR condition where the single query ends up doing a bitmap against every row, in which case a UNION will be faster since it's just appending two already-index-ordered streams.

              > Same with UNION vs IN

              A union is almost always going to be worse - most query planners cannot optimize across queries in a UNION, so each query is going to have separate ops vs. a single op with the IN. The only case I've seen this be true is for multi-column predicates with an OR clause against a composite index. I just tested the former in both Postgres and MSSQL and the UNION query cost was 2x the IN clause.

              > maybe it decides to do an index scan which takes longer

              This has not been my experience unless the table statistics are bad.
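For anyone who wants to compare the two forms on their own engine, here is a small sketch that builds both queries for the same list of ids and checks that they return the same rows. It uses SQLite via Python's `sqlite3`, so the plans it prints say nothing about Postgres or MSSQL costs; the `items` table is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO items (id, label) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 101)],
)

ids = [3, 17, 42]

# Form 1: a single statement with an IN list.
placeholders = ", ".join("?" for _ in ids)
in_sql = f"SELECT id, label FROM items WHERE id IN ({placeholders})"

# Form 2: one single-row SELECT per id, glued together with UNION ALL.
union_sql = " UNION ALL ".join(
    "SELECT id, label FROM items WHERE id = ?" for _ in ids
)

in_rows = sorted(conn.execute(in_sql, ids).fetchall())
union_rows = sorted(conn.execute(union_sql, ids).fetchall())


def plan(sql, params):
    """Return the 'detail' column of SQLite's query plan for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


in_plan = plan(in_sql, ids)
union_plan = plan(union_sql, ids)

print(in_plan)
print(union_plan)
```

Which form is cheaper depends on the engine, the statistics, and the data, so the plan output is the thing to look at rather than the query text.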

      • How is a cursor different from a JDBC call that fetches all records after executing a query? What advantages does the cursor have that a recordset does not?
      • A thousand times this. I've resolved performance issues in so many stored procedures written by programmers who don't grasp set theory and reach for the CURSOR early and often.

        The quote that comes to mind: "His pattern indicates two-dimensional thinking."

        • Also from that film comes the quote:

          > I've done far worse than kill you. I've hurt you. And I wish to go on hurting you.

          Which I believe is a paraphrase of the Oracle Master Agreement.

        • I guess I think in 2.5D because I’ll often sketch an imperative, dumb query and then refine it once I know I’ve got the data I need. But I have a hard time starting with elegance. I use a baseball bat to shape the garbage can into the desired form
          • You'd have a future in modern sculpture. ;-) Okay, sorry sculptors! I like the brute force of the bat/garbage symbolism that seems so fitting in this day and age.
        • Fascinating.
    • >, SQL teaches you [...] Without any wrapper masking low-level logic.

      I understand the point you're trying to make, and yes, it does seem like SQL is "low-level" from the perspective of a wrapper like an ORM or a GUI db browser tool with menus for filtering data.

      But it's also worth remembering that SQL itself is a high-level wrapper that hides the lower-level C/C++ code of the db engine that has the loops that iterate through b-trees, 8k data pages, memory blocks of the buffer cache, etc.

      And C/C++ itself is a high-level wrapper that hides the logic in lower-level Linux o/s system calls that manages RAM and disk i/o.

      And Linux itself is a high-level wrapper that hides low-level device drivers like SATA/SSD memory-mapped IO ... and so on and so on.

      Depending on the type of app, you can ignore all the lower levels and just work at the abstraction level of higher-level wrappers.

    • I find SQL a very thick "wrapper masking low-level logic". Think of the query planning, the index-maintaining, the upholding of guarantees, the writing-to-disk and caching that you are all not doing by using a RDBMS!

      I'd say SQL is a very high level language.

      "SQL teaches you to reason and approach problems logically" -- I kind of agree here. It teaches relational data mgmt. I think it is better to attack most software design challenges at a higher level, and --once settled at that level-- consider how to "serialize" those solutions to an RDBMS (if that's the tech that you've chosen for persistence; still a very solid choice after 50+ years!).

      • Yes, I think the right wording is something like "the power of understanding the concepts" or "having the right mental model" rather than "low level".
        • And this I think is best not done in SQL/ the relational-data paradigm. It's better to understand the problem in terms that do not tie you in to a specific technology. And once you have a clear picture of what needs to be built, then choose persistence tech; if that happens to be SQL, you can then translate your solution to SQL.

          In my experience, SQL sorely misses sum-types. So I need to find a way to serialize the sum-types of my domain model to SQL.
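One common way to do that serialization is a discriminator column plus a CHECK constraint that ties each variant to its own columns. The sketch below is only an illustration under assumed names: the `Card` and `BankTransfer` variants, the `payments` table, and the `save`/`load` helpers are all hypothetical, and SQLite stands in for whatever RDBMS is actually used.

```python
import sqlite3
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Card:
    last4: str


@dataclass(frozen=True)
class BankTransfer:
    iban: str


# The sum type from the domain model: a payment is exactly one of these.
Payment = Union[Card, BankTransfer]

SCHEMA = """
CREATE TABLE payments (
    id    INTEGER PRIMARY KEY,
    kind  TEXT NOT NULL CHECK (kind IN ('card', 'bank_transfer')),
    last4 TEXT,
    iban  TEXT,
    -- Each variant may only populate its own column.
    CHECK (
        (kind = 'card' AND last4 IS NOT NULL AND iban IS NULL)
        OR (kind = 'bank_transfer' AND iban IS NOT NULL AND last4 IS NULL)
    )
)
"""


def save(conn, payment: Payment) -> int:
    """Flatten a variant into (kind, last4, iban) and insert it."""
    if isinstance(payment, Card):
        row = ("card", payment.last4, None)
    elif isinstance(payment, BankTransfer):
        row = ("bank_transfer", None, payment.iban)
    else:
        raise TypeError(f"unsupported payment type: {type(payment).__name__}")
    cur = conn.execute(
        "INSERT INTO payments (kind, last4, iban) VALUES (?, ?, ?)", row
    )
    return cur.lastrowid


def load(conn, payment_id: int) -> Payment:
    """Rebuild the variant from the discriminator column."""
    kind, last4, iban = conn.execute(
        "SELECT kind, last4, iban FROM payments WHERE id = ?", (payment_id,)
    ).fetchone()
    return Card(last4) if kind == "card" else BankTransfer(iban)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
```

The CHECK constraint is what keeps the table from holding rows that no variant could have produced; without it the database would happily store a "card" with an IBAN.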

      • “Fundamental” rather than “low-level”. Which also matches the article picture.
        • I was reacting to the parent post. And F and LL are very different. I'd say F is a more subjective metric.
          • I was reacting to the parent post as well, suggesting that they should have used “fundamental” rather than “low-level”, and that “fundamental” would also match the article picture.
            • Dunno. I think the pic is useless. SQL is not in the foundation of all those langs.

              SQL is still very useful after all these years: that's the point that anyone will agree on.

              Not low level. Not "fundamental" (by most definitions I can think of).

    • > Not a tutorial. Not an ORM. Actual SQL

      ah, this is an Ai article

      • Darn. I write exactly like this. You need to consider that people write in different ways, and the LLM is choosing from among the different styles.
        • I knew a lady who was named Isis at birth.

          She stopped using that name.

        • I heard it mostly writes in a style associated with low-class people from Kenya.
          • I read the same piece you did (if it was on HN, anyway), and it described highly-educated-in-Kenya people. Nothing low-class implied. I suspect (though I may be wrong about this) that lower-class Kenyans aren't likely to be literate in English.
      • So AI also thinks people somehow get away only with ORM without understanding SQL?

        Funny thing is, that is always the argument of "anti-ORM" people.

        I have yet to see someone in the wild actually argue that you don’t need to understand SQL and that an ORM will suffice. I also have yet to find devs who can’t do a simple join: joins and index usage are not black magic, and they are still required to use an ORM properly.

        • > I have yet to see someone actually argue that you don’t need to understand SQL and ORM will suffice

          Well that's because decades of bitter experience has told us all that object graphs rarely map cleanly to sets of relationships.

          However, I do think that must have been the original idea as tools such as Hibernate tried so hard to obscure the underlying SQL and database. As a result all Hibernate objects have their own particular identity requirements which only made sense to a developer that knows what's going on under the hood.

          • I would still like some kind of proof.

            Like an early article having headline "ORM will replace SQL knowledge".

            I have been a professional dev for 15 years and a hobbyist for 20, and I might have missed something. But the only thing I remember is "anti-ORM" people nagging that "one should really know SQL"; I never heard anyone say "don't learn SQL", except maybe during the NoSQL hype, but no one else.

      • Can we stop calling specific literary devices as automatically AI?

        Yes, LLMs overuse that pattern. But it's a valid rhetorical device used for many, many years by human authors. Quite often too, especially in philosophical writing and fantasy novels.

        I'll give you that it wasn't often used in blogs or tech articles, but LLMs have been around long enough to have influenced human writing in other domains without the entirety of the content itself being LLM generated.

        But it's called out so often I swear people online will go read some classics and accuse them of being AI generated.

        • The amount of em dashes in Nietzsche; the amount of semicolons in Hegel
        • I just assume anyone posting it, at this point, doesn't read, doesn't write, or simply isn't clever enough to say anything that's actually worth listening to. Pure noise that won't go away because it makes the teenagers feel validated in how mad they are about AI.
      • It's always AI until proven otherwise... even then I'm skeptical.
        • Ellipses where a comma would suffice? Definitely AI. Not even a good model.
        • John23832, you are still an AI to me too.
      • Did you not see the banner image?
    • I'm actually learning SQL now and finding it HIGHLY enjoyable.

      I'm by no means a senior dev, but I don't know if I fit in the box of a junior either.

      Regardless, SQL is proving enjoyable. But I really like logic, so it fits.

    • A couple of sites worth checking out to level up, both by Markus Winand:

      https://modern-sql.com/

      https://use-the-index-luke.com/

    • That’s not low-level reasoning, but rather low-resolution thinking. SQL is a high level language. It’s so high level that you barely need to express how the computer should do things, but only declare what you want to have.
    • The whys will vary, but letters of alphabets do change indeed.
  • >The Only Programming Language Built on Mathematics, Not Fashion

    As a modern array language D4M is the natural successor for SQL [1].

    D4M is based on mathematics like SQL, specifically associative array algebra, but unlike SQL it is not relational. It's more generic since it caters to most modern data abstractions including spreadsheets, database tables, matrices, and graphs [2].

    You could already achieve 100M database inserts per second with D4M and Accumulo back in 2014, more than a decade ago [3].

    [1] D4M: Dynamic Distributed Dimensional Data Model:

    https://d4m.mit.edu/

    [2] Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs:

    https://direct.mit.edu/books/monograph/5691/Mathematics-of-B...

    [3] Achieving 100M database inserts per second using Apache Accumulo and D4M (2017 - 46 comments):

    https://news.ycombinator.com/item?id=13465141

    • There is no SQL successor: SQL is here to stay.

      Applying the Lindy effect [1]: after half a century of SQL we can expect it to survive for at least as long.

      Disruption/displacement of SQL is like attempting to replace email. It's not going to happen. At best an alternative technology can carve out a small niche (and there's nothing wrong with that).

      [1]: https://en.wikipedia.org/wiki/Lindy_effect

      • That wikipedia article was super interesting, I'd never heard of the Lindy Effect before. A bit difficult to wrap my noggin around but really fascinating to think about.
        • Read the books of Nassim Taleb. They are full of this kind of interesting stuff. Sadly, he blocked me on twitter back when I asked him why he had a paid subscription for a self-described communist hardcore-Putinist hardcore-Antisemite :/
      • Never heard of the Lindy effect either, learn something new from this site every day haha
        • It was made famous by Nassim Nicholas Taleb in Incerto.
    • The power of SQL is not because it is "based on mathematics" - it's because anyone (really, anyone, even with the most basic English skills) could understand it quickly enough to start using it productively with not much technical knowledge. Business analysts, managers of all sorts, manual QA people could grasp the basics in a minute and more complex queries within a few hours. It is very user-friendly and such tools win over anything else. Each time I see an overengineered/overcomplicated solution that is hard to read/understand - I know it's only "good luck" to the creators.
    • The only one? As opposed to ... Haskell, LISP/Scheme in the original SICP version, and proof assistant languages like Lean.
    • First impressions assuming the goal is to replace the incumbent SQL. Haven't seen the language yet.

        * D4M rolls off the tongue
        * Make me buy a book to see the language.
    • i'm probably pretty dumb but i don't really get what's so mathematical about sql.
    • Sounds interesting, but how can I use it to talk with an Oracle/MySQL/PostgreSQL database?
    • This seems like just another NoSQL db, but with fancier words.
    • I feel you missed the point of the article :)
      • I feel like the point of the article was "hey chatgpt write me an article about SQL"
  • I’d say the most impactful thing is not to learn SQL, but set theory.

    Well-written SQL is about thinking in sets. I cannot tell you how many poorly written procedural stored procedures I’ve replaced with a single performant SQL query over the years.

    This is because the most impressive part of the SQL ecosystem is the DBMS engine’s query plan. Though, yes, you have to know how to influence it.

    I find ORMs also tend to keep devs thinking procedurally.

    Yes learn SQL! But don’t just learn the syntax. Learn the underlying mathematical models and ways of thinking that SQL supports implementing.
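As a small illustration of replacing a procedural routine with one set-based statement, here is a sketch using SQLite through Python's `sqlite3`. The `accounts` and `invoices` tables and the "cancel open invoices of inactive accounts" task are invented for the example.

```python
import sqlite3


def make_db():
    """Build a fresh in-memory database with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, active INTEGER NOT NULL);
        CREATE TABLE invoices (
            id INTEGER PRIMARY KEY,
            account_id INTEGER NOT NULL REFERENCES accounts(id),
            status TEXT NOT NULL
        );
        INSERT INTO accounts (id, active) VALUES (1, 1), (2, 0), (3, 0);
        INSERT INTO invoices (account_id, status) VALUES
            (1, 'open'), (2, 'open'), (2, 'paid'), (3, 'open');
    """)
    return conn


def cancel_procedural(conn):
    """Procedural style: fetch the inactive accounts, then update per account."""
    inactive = conn.execute("SELECT id FROM accounts WHERE active = 0").fetchall()
    for (account_id,) in inactive:
        conn.execute(
            "UPDATE invoices SET status = 'cancelled' "
            "WHERE account_id = ? AND status = 'open'",
            (account_id,),
        )


def cancel_set_based(conn):
    """Set style: one statement describing every row that should change."""
    conn.execute("""
        UPDATE invoices
        SET status = 'cancelled'
        WHERE status = 'open'
          AND account_id IN (SELECT id FROM accounts WHERE active = 0)
    """)


def statuses(conn):
    return conn.execute("SELECT id, status FROM invoices ORDER BY id").fetchall()
```

Both versions leave the table in the same state; the set-based one gives the planner the whole problem at once instead of one account at a time.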

    • I think I had a pretty good understanding of set theory through programming, and although I tried to get into the more mathematical side of it, I found most things to be 1. trivial (because I was used to thinking in lists, sets, hashmaps, etc.), or 2. irrelevant to me. The latter was a shame, because I've always liked the abstraction of math, but I couldn't help but feel I wasn't learning anything actionable when learning more about sets (though maybe I'm wrong).

      What had a big impact on me was the relational model, specifically after reading Richard Fabian's Data-oriented Design book [1]. I had watched Mike Acton's famous Data-oriented Design talk [2], then Andrew Kelley's talk [3] where he explains speedups in the Zig compiler using DoD principles (largely using methods from Acton's talk), but Fabian's book tied these concepts to database normalization and the relational model.

      Most DoD advice is very "exercise left to the reader", because it's about matching a specific problem, but using the relational model and considering your data's primary and foreign key relations can be really powerful. I just wish more of that power was exposed through regular programming language interfaces, rather than having to pull in and marshal data through a DB. I might have to try C# and Linq.

      1. https://www.dataorienteddesign.com/dodbook/

      2. https://www.youtube.com/watch?v=rX0ItVEVjHc

      3. https://www.youtube.com/watch?v=IroPQ150F6c

  • Learn SQL (because it's basically the only option) but much more importantly, learn databases. Know why atomicity, consistency, idempotency, and durability matter. Understand the wire protocol and the client-server model. Do relational data modeling; think beyond databases as a dumb store. Join. Know when to normalize. Internalize indexing strategies. Think deeply about what work belongs on the database server (work that can leverage relational set theory) and what work stays in the application. Once you figure the true capabilities of databases, SQL as the language interface is a side note - about as important as the leather on your steering wheel.
    • I agree with everything except this:

      > Understand the wire protocol and the client-server model.

      I'm a DBRE, and have locally compiled MySQL with debug symbols to step through something with gdb. I have yet to examine the wire protocol for MySQL or Postgres beyond a brief read from docs. I'm not saying it's not useful in some circumstances, but I can't think of a reason why a developer would ever need to know it.

      • I take "understand the wire protocol" to mean, perhaps, understand nature and shape of the communication between the client and the server.

        Here's an "understand the wire protocol" story:

        Years ago I dealt with a Customer's thick-client Win32 ERP application that back-ended into Oracle. They wanted to use it across a VPN. They naively thought bandwidth would be the major potential showstopper and did measurement of bandwidth usage on their own. The app didn't throw around many bits so the Customer declared the test a success and spent the money on the VPN solution.

        After the VPN was implemented they were unhappy w/ the performance of the app and asked me to take a look.

        The application was built with individual SQL queries "bound" to many of the UI controls. Depending on what the user was doing, displaying a dialog might require 20+ round-trips to the Oracle server. The developers just assumed LAN latency and gave no thought to minimizing round trips.

        Web devs are used to dealing with RTT latency today, but this was another time. The devs had no sense of how the wire protocol worked and ended-up making users like my Customer hamstrung into "solutions" like Remote Desktop / VDI.
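A back-of-the-envelope version of that story, as a sketch: if the queries behind a dialog run one after another, the wait grows with the number of round trips times the round-trip time, almost regardless of bandwidth. The 0.5 ms LAN and 80 ms VPN round-trip times and the per-query server times below are assumed numbers for illustration, not measurements from the original system.

```python
def dialog_latency_ms(round_trips: int, rtt_ms: float,
                      server_ms_per_query: float = 1.0) -> float:
    """Estimate the time to populate a dialog when queries run sequentially.

    Each query pays one network round trip plus the server's execution time.
    """
    return round_trips * (rtt_ms + server_ms_per_query)


LAN_RTT_MS = 0.5   # assumed round-trip time on a local network
VPN_RTT_MS = 80.0  # assumed round-trip time across the VPN

# 20 small queries, as in the dialog described above.
lan = dialog_latency_ms(20, LAN_RTT_MS)
vpn = dialog_latency_ms(20, VPN_RTT_MS)

# The same data fetched in one heavier query (assumed 20 ms of server time).
batched = dialog_latency_ms(1, VPN_RTT_MS, server_ms_per_query=20.0)

print(f"LAN, 20 round trips: {lan} ms")
print(f"VPN, 20 round trips: {vpn} ms")
print(f"VPN, 1 round trip:   {batched} ms")
```

Under these assumptions the dialog goes from 30 ms on the LAN to 1620 ms over the VPN, while a single batched query would take about 100 ms.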

  • But you cannot learn it “once”. You need to continuously use it (over those 30 years, for example) to be proficient at it. It’s not that you can study sql for let’s say, 30 days, and never touch/study it again.

    I did “learn” sql at uni… but had to study it again at every company I worked for (different problems triggered different solutions). I’m still learning it.

    • That hasn't matched my experience. I read a fundamentals book on it, fully; a practical one (think it was T-SQL fundamentals, which was 90% ANSI sql). I did the problems, some were hard. Then it's just kind of stuck around in my head, now nearly 15 years later. I use it often and am continuously shocked to understand it better than some of my colleagues, still, since I rarely use it. It also seems to have infected how I think, such that I'm often thinking in terms of SQL (or I guess, set theory really) when I'm reasoning about data and processing it. That's likely why it sticks, it's just not that far removed from the operations that are happening (at the basic level), and then also not that complex. You aren't making new abstractions or layers with it, it's a pretty limited set of features ultimately, and generally speaking it changes little if at all over time. It's great in that way, especially in this everything-changing-constantly industry.
  • - I recently read that most programmers’ SQL knowledge is outdated by 20 years and it’s true for me. There are quite a lot of features in most DBs that feel very "new" to me.

    - Comparing SQL to React weakens the argument. SQL is the language, React is a piece of software. You certainly can run 30 year old JS today in modern browsers.

  • I think the premise of this article is ridiculous.

    Yes, SQL is based around relational algebra, but all programming languages are built on a theoretical foundation.

    And SQL is very much a "fad" language - it just somehow managed to stick around. The goal was not some sort of mathematical purity, but rather to build a natural language data interface (sounds like something currently very hyped?) and it failed spectacularly at that goal.

    It is so far from natural language that English speakers with statistical understanding won't be able to read it, but it is also inconsistent enough in its grammar design that it is unreasonably difficult to learn and needs large refactoring every time you want to query into the result of a query.

    To continue my rant: Sometimes '=' is an identity test, sometimes it is `==`. Sometimes groups are called groups, sometimes they are partitions.

    When creating a CTE, you put the name before "AS", but when creating a column, you put the name after "AS".
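To make the two naming orders concrete, here is a tiny query run through SQLite via Python's `sqlite3`; the `sales` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 10), ("north", 5), ("south", 7)],
)

rows = conn.execute("""
    WITH region_totals AS (              -- CTE: the name comes BEFORE "AS"
        SELECT region,
               SUM(amount) AS total      -- column alias: the name comes AFTER "AS"
        FROM sales
        GROUP BY region
    )
    SELECT region, total
    FROM region_totals
    ORDER BY region
""").fetchall()

print(rows)
```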

    SQL is great because it is everywhere and it is definitely good enough, but it is not something great, that transcends other programming languages.

    • I think this is a tad too contrarian. I think SQL does (did?) a great job of giving common-language terms for fundamental data table operations, which in turn creates a shared mental model that everyone can use to describe what can and should be done to tables.

      I agree with you in terms of syntax, though, it leaves something to be desired. But learning SQL was a pretty fundamental step in my journey to becoming a data scientist. It helped form the basis for how I reason about tabular data.

    • >Yes, SQL is based around relational algebra, but all programming languages are built on a theoretical foundation.

      While true on some level, I don't think this is a very useful statement. The importance of mathematical foundations lies in the extent to which they constrain the features of a programming language.

      That extent is not the same for all languages. Many programming languages do not appear to be constrained by anything other than some pragmatic hunch of their designers plus the theoretical limits of computability.

      SQL is a mess. The author acknowledged that. But the relational model and relational algebra are more serious attempts at creating a small but expressive theory than many of our mainstream programming languages.

    • > And SQL is very much a "fad" language - it just somehow managed to stick around.

      Probably because - despite it not being perfect - the only people who have been able to do it better are very slight variations of it like LINQ and Logica and GoogleSQL etc.

      • KQL is a great language for quickly analyzing data, too. I've grown to love it.
      • Edgar F. Codd did it better when he designed Alpha. It is his work that we still talk and dream about to this day. SQL won because it was the first implementation to be network-connected, which was far more important to industry than language.
    • I found PRQL[1] to be a good fix for nearly everything I don't like about SQL.

      But then it's only a query lang (DDL you still do in SQL then I guess).

      Bottom line for me now is that I don't write much of my SQL by hand. AI does a much better job at it. I just read it back and point out mistakes and/or inefficiencies.

      1: https://prql-lang.org/

      • This looks somewhat similar to LINQ in how it orders clauses.
    • SQL may not be perfect, but I think the biggest pain points come from databases themselves. Any changes to the table structure (like a simple column rename) breaks all your queries. Data versioning not supported. Documentation not supported out of the box...
    • > it just somehow managed to stick around.

      We're kind of stuck with it, unless someone does for SQL what Kotlin tried to do for Java. I wonder what it would even look like, or if the real answer is to take the WASM spec, and make one for SQL itself, so you can write queries in any language, compile them to "WASM-DB" or whatever, then those get converted over to standard SQL, until databases support "WASM-DB" or whatever language.

      Would love to see what something like this could look like and if it would be worthwhile? For me WASM opens us up to not having to write front-end JS and being able to do front-end and back-end both in your native programming language (like Blazor does for C#).

    • Well put. SQL gets put in this category of foundational technology that has passed the test of time when in reality it’s more like an example of path dependence.
    • > Sometimes '=' is an identity test, sometimes it is `==`.

      Eh... Where did you find `==` used in SQL?

  • I've been slowly transitioning from using an ORM to just plain SQL. It's so much simpler. Less magic, more explicitness, and more control. Also, much better performance. I think the thing is to construct your model around the different queries you need to perform. In many cases, especially a CRUD-type situation, you'll end up with 10-20 different SQL queries, and that's it.
    • Once you break free of ORM’s I find the code so much simpler to maintain.

      Here’s the query(typically multiple different subqueries and return types), here’s the params, give me all the data back and something like Dapper in .net is an absolute godsend to convert it.

      • The code is simple to maintain until the database changes. Then you will experience the pain of SQL
        • I’ve experienced what you’ve mentioned before, ORM or not you have issues if you’re shuffling the schema around.

          Cool.

          • It would be great if there was a compiler that would check your SQL queries against the schema, and you could even refactor a column name to update both the schema and the queries.
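A poor man's version of that check can be sketched by preparing every query against a scratch copy of the schema and collecting whatever the engine rejects. This assumes SQLite via Python's `sqlite3`, where running `EXPLAIN` on a statement makes the engine resolve table and column names without executing the query. The schema, the query names, and `check_queries` are hypothetical, and the sketch does no refactoring; queries with bound parameters would also need dummy bindings, which are left out here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
"""

# Hypothetical application queries, keyed by name.
QUERIES = {
    "all_emails": "SELECT id, email FROM users ORDER BY id",
    "user_names": "SELECT id, full_name FROM users",  # refers to a stale column
}


def check_queries(schema, queries):
    """Return {query_name: error_message} for queries the schema rejects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    errors = {}
    for name, sql in queries.items():
        try:
            # EXPLAIN compiles the statement, which validates names,
            # but does not run the underlying query.
            conn.execute("EXPLAIN " + sql)
        except sqlite3.Error as exc:
            errors[name] = str(exc)
    conn.close()
    return errors


errors = check_queries(SCHEMA, QUERIES)
print(errors)
```

Run in CI against the current migrations, something like this catches renamed or dropped columns before the queries fail in production.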
        • Data schema changes are difficult, almost regardless of technology - it's been an issue for me from relational dbs to OpenAPIs. gRPC is easier as long as you obey the migration rules, but those impose tight restrictions on what you can change
        • By the time you really need to change your database, updating your queries will be the easiest part, compared to reviewing the semantic changes and the data migration.
  • That's true. SQL knowledge is one of the few skills that didn't age.

    1. C language.

    2. *nix tools (shell and friends).

    3. SQL.

    4. Basic IPv4 networking.

    These things I learned around 20 years ago, they didn't change much and they are useful for me to this day.

    • I was a fan of Seven Languages in Seven Weeks [1] because it exposed you to different paradigms which you could then try to apply where they made sense on whatever tools you were using or building: prototype-based, fault-tolerant, functional, logical. Very fun book when used right.

      The point being that sometimes the tools themselves don't need to survive because you take the lessons from one thing to another (e.g. move semantics and rust/modern c++)

      [1] - https://pragprog.com/titles/btlang/seven-languages-in-seven-...

    • Well, IPv4 is obsolete.
      • I'll believe that when I see tech companies bring up IPv6 integration before I do when setting up networking things.
  • The comparison with JavaScript as an exemplary imperative language is silly. You can find examples of C code from 40 years ago that still work perfectly with modern compilers. Like C, SQL is a technology that has far outlived its usefulness, though, for very different reasons. SQL was not designed for application development, and every attempt to integrate it into higher level programs (ORM, fluent query builders, raw strings, macros/preprocessors) comes with unpleasant rough edges. The best thing young developers can do is read a book like "Designing Data-intensive Applications" and learn how the fundamental technology behind databases works. Learning relational modelling is great, but learning SQL itself, unless you actively have to work with it, is a waste of time.
    • > Learning relational modelling is great, but learning SQL itself, unless you actively have to work with it, is a waste of time.

      What a wild statement.

      SQL is one of the most useful tools ever developed, as evidenced by the fact that it's BY FAR the most widely used programming language in the world.

      The idea that it is an ABSOLUTE LAST RESORT that should never be used unless you ABSOLUTELY HAVE TO is insane...

      • Even wilder:

        > SQL was not designed for application development, and every attempt to integrate it into higher level programs (ORM, fluent query builders, raw strings, macros/preprocessors) comes with unpleasant rough edges.

        Forgetting that interactive SQL queries, and to an even greater degree the underlying databases, are applications.

        • Anyone who hates SQL typically is completely clueless how hard it is to actually do what SQL does correctly and efficiently.

          SQL makes doing something that is VERY hard stupidly easy.

        • The issue is that most of these tools try to reinvent the wheel. Instead of using sql, you use 'fluent builders' or whatever, and they have their own tricks and caveats.

          Sqlc is the best thing I've personally used because it produces models and repositories based on plain sql queries.

          • It's much easier to reinvent one of the most used tools in the world as a far worse version than better.

            It's almost as if it was easy to reinvent something highly used that someone would've already done it...

    • The point is that you will actively have to work with it. If you are designing data intensive applications without sql you are the exception, not the rule.
  • I also feel like giving a shout-out to the PDF version of the official PostgreSQL manual. It was one of the most enjoyable and engaging tech books I've ever come across, and seems like pretty much the gold standard in what official documentation can look like. It took me a while to figure this out though, because the UX of the standard HTML version of the manual is pretty clunky.
  • > Edgar Codd formalised relational algebra in 1970. SQL sits on top of it as a declarative interface. You describe what you want. The database engine decides how to get it. The engine improves every year. Your query stays the same.

    Although SQL is of course not relational algebra (and others like Datalog and D4M are better), it's still cool. It inspired kSQL like Lil uses https://beyondloom.com/decker/lil.html#lilthequerylanguage , which inspired the code I'm most proud of: https://codeberg.org/veqq/declarative-dsls A common query language, a common idiom, for many data structures (arrays, hashmaps, dataframes) is liberating, permitting you to e.g. solve sudoku, make mandelbrot sets or calculate primes directly:

        (def n 40) # to reach primes up to, left is sqr of n, right n/2, then multiply them for rows
        (def composites
        (df/select :from (range 2 (+ 1 (math/floor (math/sqrt n))))
                   :cross (range 2 (+ 1 (/ n 2)))
                   :where |(<= (* ($ :value_left) ($ :value_right)) n)
                   [[:value_left :value_right] :value
                    |(* ($ :value_left) ($ :value_right))]))
        (df/select :from (range 2 (+ 1 n)) :exclude composites)
    
    Or e.g.

        (import declarative-dsls/dataframes :as df)
        (def people (df/dataframe :name :age :job))
        (df/dataframe? people)
        
        (df/insert! {:name "Bob" :age 30 :job "Developer"} :into people)
        (df/insert! {:name "Alice" :age 27 :job "Sales"} :into people)
        (df/update! :set {:job "Engineer"}
                 :where |(= ($ :job) "Developer")
                 :from people)
        
        (df/save-csv people "people.csv" :sep "\\t")
        (def people2 (df/load-csv "people.csv" :sep "\\t"))
        
        (-> people2
           df/dataframe->rows
           df/rows->dataframe
           df/print-as-table)
    
    The tests file has many such things (like the sudoku solver) and even datalog and minikanren implemented on top of this!
    • Datalog is the dream. But SQL with a good query builder like Clojure's honeysql is not so bad.

      That and SQLite seems to be able to scale to almost any problem, is disgustingly fast and with litestream incredibly resilient.

  • My first job in 1994 was using RDB, very soon updated to RDB-Oracle 6 that was very compliant with my book about SQL-92. I learned SQL during my studies, so my queries based on joins naturally performed well, unlike the nested SELECT statements in the legacy code. I learned to tune performance. After that, I encountered Oracle with its share of non-standard syntax and behavior. Now, when I build a backend (flask or fastapi), my CLAUDE.md contains "use sqlite without sqlalchemy" so that I can understand how data is accessed and check that there is no hidden query inside a loop. When the SQL queries generated by AI become unreadable, it’s a sign that the system is no longer functioning properly. In my view, SQL is a good way to maintain control over an application that is largely generated by AI.
  • > "JavaScript and its ecosystem is an environment where browser wars, framework trends, and open-source maintainer preferences reshaped every few years."

    In general, I hate frameworks, and not just JavaScript frameworks. Firstly, for the reasons she describes; they do change quite frequently and break stuff. Secondly, I don't see it saving any more time than using several nice libraries. Not only do you have to learn JavaScript, Java, C#, etc. but you have to learn the framework syntax as well. I will, obviously, use a framework at work when I have to, but for my personal projects, I try to "hand roll" as much as I can with vanilla languages.

    • This is why I like things like React and Astro: for the most part it is just JavaScript in the end, other than JSX, which is already familiar from using HTML and has applications beyond React.
  • > The Only Programming Language Built on Mathematics, Not Fashion

    Had to reread the title again since I thought I opened a different article about TLA+.

    As for SQL, if you're referring to DBMS systems, here's what E.F. Codd, inventor of relational algebra, had to say about them and the departure from his work: https://thaumatorium.com/articles/the-papers-of-ef-the-coddf...

  • I’ve always felt that SQL is somewhat easy to grasp for basic queries, but gets complex and difficult for even moderate to higher complexity use cases. My eyes glaze over when I read long stored procedures that someone else has written. Any recommended resources to go from beginner/beginner-intermediate to advanced?
    • I think advanced SQL authoring is generally simple to understand, and that's the larger learning curve!

      I find those big stored procedures usually fall into two categories: logic that should be in the DB, but should be decomposed (staging tables, other SPs, etc.), in which case they can be understood in chunks; or logic that shouldn't be in the DB but has been shoved in there, in which case there's more of an ideological debate, but I generally prefer to pull it out and run it in the application layer. (The latter is pretty much, IMO, the things that you do after you've gotten the data at the right grain, when you are massaging it to a particular form/presentation format; performance is often the final arbiter here though.)

    • My advice is: don't write complicated SQL.

      The best thing I learned about SQL is that it can do an awful lot of clever stuff but that the vast majority of the time you really don't need it. Learn the basics. Shrug the rest off.

      • This is the correct way. Much like any other kind of code, if you find yourself doing something "clever" it's time to think about whether you're really going down the right path.
      • What do you consider to be clever SQL?
    • The way to learn advanced SQL is to challenge yourself to find a set oriented solution and avoid procedural code. The more unreasonable it feels, the more you learn.

      If the solution you find is longer and not much faster than the procedural alternative, you throw it away and fall back on procedural code.

      Stored procedures are not advanced SQL. Most of them are not SQL at all. There are a few legitimate reasons for using SPs such as reducing roundtrips to the database and writing little pure functions for use in SQL statements.

      But many uses of SPs are just laziness or a symptom of organisational dysfunction.

    • I feel like stored procedures and co cross over into the realm of application programming, and while I can't speak from experience (so take this with a huuuuge grain of salt), this is where things break down. It feels like adding logic / basic programming to JSON/YAML, which are data/config languages primarily.

      I think stored procedures - or anything that goes beyond storing / looking up data - had a place when a database had multiple different clients, but with modern day systems that's less likely to be an issue.

  • I thought this was about me.

    I learned SQL 30 years ago and, well, pretty much stopped.

    Through the years, as we added ORMs and used the databases more for base dumb storage, and as I ventured away from report writing to focus more on just services, workflow and CRUD, I never really learned modern SQL. I've just been able to muddle through with the SQL I learned long ago.

    To the article's other point, I've been doing Java for almost 30 years, and while we have a much more modern language, the fundamentals of years ago are still sound and used every day.

  • > Learn SQL Once, Use It for 30 Years.

    Well, during those 30 years you'd have to learn a few additions that were added to the SQL standard, like "OVER / PARTITION BY", and maybe a few non-standard extensions like "QUALIFY".

    There have been 11 modifications to the SQL standard since the initial SQL-86 ANSI standard of 1986.
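
    As a rough illustration of the "OVER / PARTITION BY" addition, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table and its rows are made up for the example, and it assumes the bundled SQLite is 3.25 or newer (the first release with window functions). SQLite does not support QUALIFY as far as I know, so the last query emulates it with a subquery.

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    # Hypothetical table, invented purely for this sketch.
    conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("north", "ann", 100), ("north", "bob", 300),
         ("south", "cat", 200), ("south", "dan", 50)],
    )

    # Window functions add per-partition values without collapsing rows,
    # which a plain GROUP BY cannot do.
    rows = conn.execute(
        """
        SELECT region, rep, amount,
               SUM(amount) OVER (PARTITION BY region) AS region_total,
               RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region
        FROM sales
        ORDER BY region, rank_in_region
        """
    ).fetchall()

    # Emulating QUALIFY rank_in_region = 1: filter on the window result
    # from an outer query.
    top_reps = conn.execute(
        """
        SELECT region, rep
        FROM (
            SELECT region, rep,
                   RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS r
            FROM sales
        )
        WHERE r = 1
        ORDER BY region
        """
    ).fetchall()

    for row in rows:
        print(row)
    print(top_reps)
    ```

    On engines that do have QUALIFY, the outer query collapses into a single clause, which is most of its appeal.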

    • Very worthwhile additions in most cases IMO, and well worth the "continued education" required.
  • I've been using SQL for almost 40 years and I know for sure I will be using it for the next 20. Relational databases are about data sets, which are used everywhere, and SQL is the math behind them.

    With AI it's even easier as it creates all the DB schema for me, all the dummy data to test my apps and all the complex queries needed.

  • Not quite as simple as learning it once. SQL evolves like other languages, across vendor implementations.

    The ClickHouse and DuckDB dialects, for example, extend the language with analytic options not found in ANSI SQL, T-SQL, PL/pgSQL, etc. DuckDB's QoL enhancements are greatly missed when not available.

    • Which SQL-specific QoL enhancements do you miss? I was really excited to use DuckDB for things like structs and enums, but after a while I just went back to regular SQL and used it for its other features.
  • SQL and Bash/sh/zsh are the only two languages I learned at the start of my career that have stayed consistently useful ever since.

    (I never got very good at Bash but just the REPL terminal basics have served me very well.)

  • Strong agreement. Also applies to regular expressions.
    • Eh... yes but, I can write a sql query and come back a year later and understand it at first glance, or maybe if it's really complicated, in a minute or two.

      Writing a regex involves always looking up the syntax, and being unable to read it an hour later without having to carefully dissect it. I say this with 20+ YOE and I think I am better than the average bear with them.

      That said, knowing what they can do is important and very useful. It's just the implementation that never quite sticks.

      • > Writing a regex involves always looking up the syntax

        No. I mean, I believe it does for you. I use it often enough that it's always fresh. On the other hand, I always need to look up the syntax of CREATE TABLE. My experience can be described the same as yours.

  • At my first job in 1997, I learned SQL & Relational DB design with the book "Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design"
    • As a student I used to work as a network administrator in the summer breaks. The place had a very nice library of technical books. I had the pleasure to read "Database Design for Mere Mortals" there, in the hours when work was slow.
  • > If you are a junior developer, “learn SQL properly” is the most valuable 40 hours you can spend.

    "Learn SQL Properly" is referenced as if it were a book or a course, but it seems to be a hallucination? I can't find any reference to this online.

    • It's ai written. The prompt likely contained the phrase hence the quotes
  • I remember that when I was learning SQL, I felt like I was gaining a new way, or perspective, to think about other things in life.

    I think you end up exercising how to structure your thoughts.

  • I learned SQL around 20 years ago, and in all this time I've felt it was just a poorly designed language. It was always infuriating to write because of its verbose nature. Keywords were split into two words. I'm still shocked it's not "GROUPBY". There is no composition and modularization of logic, so queries become massive expressions.

    I know I'm in the minority in places like this, but I've spent all my life using ORMs, and never once regretted it. And I'm the kind of person that actually likes low-level C from time to time. SQL just feels like a poor abstraction layer: either go higher or lower.

    • It’s a good abstraction layer, and a fundamentally good/efficient model of organization and data management. It’s a horrible language, has a meaningless standards doc, some of the worst debugging tooling of any modern system, and generally any tooling outside of the RDBMS engine itself is 20 years stale.

      The only difficult part in arguing this is that RDBMS != SQL != RelationalAlgebra, and it’s very often forgotten

    • Are we really a minority? Feels like a lot of people just suffer in silence.

      Even though I dislike SQL on many levels, I would be hard pressed to find a better, widely supported alternative. I gave up writing portable SQL and just target PostgreSQL now.

  • Well-designed purpose-built tools stand the test of time. When you need a hammer you need a hammer. I learned to swing a hammer a very long time ago, and that skill has stayed with me on modern-day hammers - I didn't have to learn the New Way Of Hammering Things.
  • I’ve been using Postgres for over 6 years (since I started), and I honestly think it’s one of the best investments you can make as a developer
  • Alternatives come and go, SQL stays.

    It's not that I like or dislike SQL, it is just that it has such raw power and mature tooling/resources, I wonder what an alternative could even offer me.

    It's like C. It does such a great job at being structured assembly that it is hard to displace it for similar reasons.

  • the main reason i dont like sql is the way it splits your query into parts that run in a different order and you can only have one of each. thats why you need things like ctes. if it was a more "functional" language with features like let bindings it would be easier to understand (and maybe to optimize):

      from customers as c
      let orders := all(orders where customer_id = c.id)
      select c.name, count(orders), avg(orders.price)
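
    For comparison, a sketch of how that pseudo-query is usually spelled in today's SQL, with a CTE standing in for the let binding. It runs through Python's built-in sqlite3 module; the `customers` and `orders` tables and their rows are assumptions made up for the example.

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data, invented for this sketch.
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, price REAL);
        INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO orders (customer_id, price) VALUES (1, 10.0), (1, 30.0), (2, 5.0);
        """
    )

    # The CTE plays the role of "let orders := ...": it names an intermediate
    # result that the final SELECT then joins against.
    rows = conn.execute(
        """
        WITH order_stats AS (
            SELECT customer_id,
                   COUNT(*)   AS n_orders,
                   AVG(price) AS avg_price
            FROM orders
            GROUP BY customer_id
        )
        SELECT c.name, s.n_orders, s.avg_price
        FROM customers AS c
        JOIN order_stats AS s ON s.customer_id = c.id
        ORDER BY c.name
        """
    ).fetchall()

    print(rows)
    ```

    Note the inner JOIN drops customers with no orders; a LEFT JOIN would keep them, with NULLs in the aggregate columns.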
  • One of the best things that happened to me is my boss giving me a crash course in advanced SQL at my first job. In the database we used at work, he gave me increasingly difficult questions to answer with queries.

    It was a great foundation and has served me well to this day.

  • Additionally learn stored procedures.

    Helps simplify complex SQL queries and no need to waste network traffic on data that client side is never going to use, and waste CPU cycles processing it.

    Yes, what about database portability?

    I am in my 50s and it only mattered on a single project, which was anyway a middleware for application servers.

    • > Additionally learn stored procedures.

      For sure, but have a solid grounding in set theory to go with it.

      I've dealt with so many poorly-performing stored procedures that ended up being written as iteration over a CURSOR when they could have been done with sets. Programmers who don't grok set theory reach for iterative constructs which, while they work fine, are an impedance mismatch with SQL.
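
      To make the contrast concrete, a small sketch with Python's built-in sqlite3 module. The `accounts` table, the 1000 threshold, and the function names are all made up for illustration; the row-by-row loop stands in for a CURSOR-style procedure.

      ```python
      import sqlite3

      def make_db():
          # Hypothetical table, invented for this sketch.
          conn = sqlite3.connect(":memory:")
          conn.execute(
              "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, tier TEXT)"
          )
          conn.executemany(
              "INSERT INTO accounts (balance, tier) VALUES (?, 'basic')",
              [(b,) for b in (50, 1500, 20000, 999)],
          )
          return conn

      def promote_row_by_row(conn):
          # Imperative style: fetch every row, decide in a loop, and issue
          # one UPDATE per matching row.
          for acc_id, balance in conn.execute("SELECT id, balance FROM accounts").fetchall():
              if balance >= 1000:
                  conn.execute("UPDATE accounts SET tier = 'gold' WHERE id = ?", (acc_id,))

      def promote_set_based(conn):
          # Set-based style: describe the set of rows once and let the
          # engine do the iteration.
          conn.execute("UPDATE accounts SET tier = 'gold' WHERE balance >= 1000")

      def tiers(conn):
          return [r[0] for r in conn.execute("SELECT tier FROM accounts ORDER BY id")]

      a, b = make_db(), make_db()
      promote_row_by_row(a)
      promote_set_based(b)
      print(tiers(a), tiers(b))
      ```

      Both versions end in the same state; the second is one statement the optimizer can plan as a whole instead of N statements it has to treat separately.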

      • At least in that case you can refactor the stored proc to be more performant without pushing application changes.
      • Agreed, however that applies to SQL in general.

        I have seen DBAs make wonders without changing queries, only by adding the right set of indexes.
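
        A minimal sketch of that effect, using Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN. The `orders` table and the index name are assumptions for the example, and the exact plan wording varies between SQLite versions, so treat the printed text as indicative only.

        ```python
        import sqlite3

        conn = sqlite3.connect(":memory:")
        # Hypothetical table, invented for this sketch.
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
        )
        conn.executemany(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
            [(i % 100, float(i)) for i in range(1000)],
        )

        # The query text never changes between the two plans below.
        QUERY = "SELECT total FROM orders WHERE customer_id = ?"

        def plan(conn):
            # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
            rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
            return " | ".join(row[-1] for row in rows)

        before = plan(conn)
        conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
        after = plan(conn)

        print("before:", before)  # a full table scan
        print("after: ", after)   # a search that uses idx_orders_customer
        ```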

    • I moved from stored procedures to dbt. I find it easier to maintain and it helps me with version control, testing, and docs. Plus, since I deal with data pipelines a lot, I get other goodies like lineage and auto DDL.
  • This is standard advice I give to any IT consultants (incl some that didn't ask for any lol). Cos I see too many of them evolving into purchasing clerks and postmen, far removed from the tech and operations.

    Regex, SQL, Basic linux command line tools, awk. More as job demands.

  • > Now try this experiment with the JavaScript ecosystem.

    Wow, JavaScript ecosystem is bad!

    > Now try this experiment with [scrubbed] JavaScript [scrubbed].

    Wow, JavaScript is great!

  • For me SQL has long been the gateway to the world of development. I work in the UK non-profit sector and traditionally this kind of technical knowledge is rare, so for any team I've worked with I've built learning pathways that start with SQL before pushing out into Python, Linux, and other things. We're not exactly at the bleeding edge of current technologies, but SQL has consistently proved to be a great jumping-off point for novices who have even a passing interest in computing.
  • … and you’ll have to look up the syntax every year.
    • Yeah, I actually had to learn it three times because it didn't stick until I was actually using it daily.
  • I frankly can't agree more.

    Learn SQL, learn the normal forms at least up to 3NF, profit.

    I took an Oracle SQL class in High School in the very early 2000s and frankly it set me up for my career. Despite never having touched actual Oracle SQL, I've become the go-to guy at every job I've had for optimizing queries and reviewing designs.

    I read a book on how MySQL actually worked under the hood in the late aughts and it really went a long way towards the effort.

    It's really not as hard as the complexity of modern ORM tooling likes to make it seem. That scares people away. It's an elegant language for a more elegant age.

    I went to a talk like 10 years ago about how SQL will be displaced by Hadoop/MapReduce in the next 5 years. I posted on Twitter about it at the time like we'll see if that happens. Spoilers, it didn't. I can't even think of the last time I've heard someone invoke the name of Hadoop

  • I fully agree with the title (with reservation for "dialects"), and I believe the same can be said for JSON and Markdown, among possibly others.
  • This is so true. I’ve been making a good living for the last 15 years because I learned SQL as a junior DBA.
  • Same can be said for learning an OO programming language or a procedural programming language. I learned C++ at school and started using Java on my first job. I forgot how to work correctly with pointers but I have tried multiple languages (using the same paradigms) and managed to build working software
  • Honorable mention to C89 for also being supported forever and ever
  • I've played once with codesignal to pass SQL chapters and it really helped to advance querying skills.
  • Yes yes, all fine and good. I love SQL, think it is incredible how relevant it still is as a high level language, and think it should be the basis of almost every data storage and retrieval system out there. And I love that the foundation is relational algebra, which is an extremely useful abstraction for data management.

    But for the love of god, get rid of the ternary logic. It is only mathematically sound to the extent that mathematicians are masochists and will try to formalize anything regardless of how painful it is for normies. Boolean logic is good enough and doesn't feel like an exercise in retroactive continuity.
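
    For anyone who hasn't been bitten yet, a small sketch of the kind of surprise three-valued logic produces, run through Python's built-in sqlite3 module. The `users` and `blocked` tables are made up for the example.

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")

    # NULL = NULL is unknown (NULL), not true; IS NULL is the two-valued test.
    null_eq, null_is = conn.execute("SELECT NULL = NULL, NULL IS NULL").fetchone()

    # Hypothetical tables, invented for this sketch.
    conn.execute("CREATE TABLE users (id INTEGER)")
    conn.execute("CREATE TABLE blocked (user_id INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO blocked VALUES (?)", [(1,), (None,)])

    # A single NULL in the subquery makes "id NOT IN (...)" unknown for every
    # non-matching id, so no rows come back at all.
    not_in = conn.execute(
        "SELECT id FROM users WHERE id NOT IN (SELECT user_id FROM blocked) ORDER BY id"
    ).fetchall()

    # NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
    not_exists = conn.execute(
        """
        SELECT id FROM users AS u
        WHERE NOT EXISTS (SELECT 1 FROM blocked AS b WHERE b.user_id = u.id)
        ORDER BY id
        """
    ).fetchall()

    print(null_eq, null_is)
    print(not_in, not_exists)
    ```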

  • Just, for god's sake, move SELECT after GROUP BY, I beg you.
    • Current structure makes sense to me.

        SELECT .... what do I want
        FROM .... where is it 
        WHERE .... what filters do I want to apply
        GROUP BY .... how do I want it aggregated
      
      Maybe it's just that I'm so used to it. I could see FROM being first, that would actually make a little more sense to me.
      • > I could see FROM being first, that would actually make a little more sense to me.

        Personally, I disagree. In English, an imperative statement like "move that chair from the dining room to the living room" is generally verb-first (with respect to location, anyway). SQL's flow has always made perfect sense to me.

      • A pretty common request is to lift the FROM up before the select, like the below. I'm pretty fine with status quo since my mind is usually "hmm what do I need to get" first, then I figure out how to get it, but some engines (duckdb, I think?) support both so everyone gets their cake.

        What people often want: <where to get data from> <what I want from it> <how it's filtered> <how it's grouped> <how it's filtered post group>

  • Agreed. SQL has been one of the most stable and useful skills I have.

    Rivalled only by Linux, shell scripts, and Cron!

  • May be of interest: A Critique of Modern SQL And A Proposal Towards A Simple and Expressive Query Language.

    It is a critique of modern SQL and a suggestion for "SaneQL":

    "SaneQL features a straightforward and consistent syntax, which improves its learnability and ease of implementation. Additionally, it provides extensibility, with the added ability to define new operators that integrate seamlessly with the existing built-in ones. Unlike most data frame APIs and NoSQL query languages, SaneQL fully embraces the core principles behind SQL, especially multiset semantics."

    https://www.cidrdb.org/cidr2024/papers/p48-neumann.pdf

    A colleague of mine is working on an implementation of these ideas:

    https://github.com/wvlet/wvlet

  • I resisted learning SQL for years. When I finally got around to learning it, I kicked myself for not learning it sooner. Sure, it has its share of jank and every DB has its own flavor, but at the end of the day you can do insane amounts of data processing using an easy to learn, although sometimes difficult to master, query language.
  • Learn Cobol once, use it for 100 years :-D
  • I agree with the gist, but I’m so sick of SQL nowadays — or rather, so sick of writing business logic in SQL, that I’d rather not work with it for the rest of my life.

    It’s ironic that I’m pretty good with it.

  • I don’t think it’s rational to flatten data. If an item contains an array of sub items which in turn contains an array of subitems, that item belongs in one place not three tables.

    I know this view isn’t popular, but I’ve happily used Linux, Python, virtualisation, node and Rust when they were laughed at and I’m not particularly concerned.

  • > Now try this experiment with the JavaScript ecosystem. Take a React component from 2015. React.createClass, mixins, componentWillMount. It doesn't just look old, it throws TypeError: React.createClass is not a function the moment it loads. You rewrite it from scratch to ship it today. Ten years passed, and the framework cycled through three different mental models in that time.

    Pretty absurd comparison. SQL is a language, React is not. SQL has been around for over 30 years, React has not.

    This is what I refer to as React Derangement Syndrome.

    > JavaScript is an imperative language that browser wars, framework trends, and open-source maintainer preferences reshaped every few years. It rewards you for keeping up.

    > SQL rewards you for sitting still.

    Again, this is apples and oranges. These technologies are in far different places in their history. JavaScript that worked 20 years ago still works today.

    You can write an article about how great SQL is without having to bring React up. I promise it's possible.

  • Having been working with computers professionally for almost 40 years now I've seen quite a lot of things come and go. I'm not convinced that LLMs will stick around for that long although they're currently doing better than "fuzzy logic", which is what it used to be called when they could run on 68HC11s ;-)

    You know what has stuck around though?

    Thumping great Unix boxes running SQL databases.

    Yes, there's a lot wrong with the whole concept, but everything else is in some important way worse.

  • was this article secretly written by regexp
  • > JavaScript is an imperative language that browser wars, framework trends, and open-source maintainer preferences reshaped every few years. It rewards you for keeping up. > Take a React component from 2015

    Javascript is actually fully backwards-compatible, to not break the Web. Any javascript from 10 years ago works in the browser. This is good but also a bit of a burden, since the language can only expand but not shrink. React is a library, and like all libraries it has breaking versions. Not understanding the basic difference between the two kinda undermines the credibility of the article.

    Also, in a similar way, core, ANSI SQL is largely backwards compatible, but all the SQL dialects linked to various DBMS implementation are generally incompatible. Obviously that's not mentioned in the article.

    > Not a tutorial. Not an ORM. Actual SQL: joins, subqueries, window functions, query plans.

    Not text written by a human. Not a style that a real writer would ever use. Actual AI slop: Short sentences. Incorrect facts. Not X, Y.

    • > Not a tutorial. Not an ORM. Actual SQL: joins, subqueries, window functions, query plans.

      My brain absolutely checks out when I read this stuff now.

      Not to mention that query plans are absolutely not "actual SQL".

    • An article laser-targeted at HN's front page, making tantalizingly negative and easily disprovable claims about Javascript? Perish the thought.
  • > SQL is the only programming language

    SQL is not a programming language. You do not write programs in SQL. It's a declarative language (or set-of-sublanguages).

    > a working developer can learn once and use for 30 years without rewriting their mental model.

    There is any number of long-living languages which satisfy this.

    Plus, SQL is not even really a single language, because the spec changes, and is huge, and few people know it fully; and the dialects have non-trivial differences; and if you switch DBMSes, you often switch SQL dialect. In that sense, it is very much like other programming languages which evolve, like C++ or Fortran or even C.

  • It's broken expression to update karma brings downvotes

      UPDATE users
      SET karma = 9001
      WHERE name='notlibrary';
  • Everyone knows SQL already. The harder parts that pay off are schema design, knowing how to interact with your DB in code, and knowing all the ins and outs of whatever DBMS you're using.
    • I would emphasize the importance of batching and set operations. This is where I think many developers lose track of the rabbit, because you don't have much control over either of these things via ORMs. You have to get your hands dirty with raw command text.

      The value of this stuff is difficult to overstate. Batching allows for you to rapidly load the RDBMS. The first few times you test, it will probably go so fast you won't believe it loaded anything at all. Set operations allow for you to bring this newly loaded data to visibility in production tables nearly instantly. Your OLAP & OLTP workloads should be dominating the compute. ETL ops (loading/set ops) should be a ghost in terms of cpu time and memory. None of this is vendor specific knowledge. Every major engine has a reasonable way to bulk load and perform quick merging of records.
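
      A minimal sketch of the batch-then-merge pattern, using Python's built-in sqlite3 module. The `events` and `staging` tables are assumptions made up for the example, and INSERT OR REPLACE is SQLite's spelling of the merge step; other engines have their own (MERGE, ON CONFLICT, bulk-copy tools, and so on).

      ```python
      import sqlite3

      conn = sqlite3.connect(":memory:")
      # Hypothetical production and staging tables, invented for this sketch.
      conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
      conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, payload TEXT)")
      conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, "old-1"), (2, "old-2")])

      # 10,000 incoming rows; id 2 overlaps with an existing row.
      incoming = [(i, f"new-{i}") for i in range(2, 10_002)]

      with conn:  # one transaction around the whole load, committed on exit
          # Batching: one prepared statement reused for every row, instead of
          # one independently committed INSERT per row.
          conn.executemany("INSERT INTO staging VALUES (?, ?)", incoming)
          # Set operation: a single statement merges the staged rows into the
          # production table, replacing rows whose primary key already exists.
          conn.execute(
              "INSERT OR REPLACE INTO events (id, payload) SELECT id, payload FROM staging"
          )
          conn.execute("DELETE FROM staging")

      total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
      print(total)
      ```

      The point is the shape rather than the specific statements: load in bulk into a scratch table, then publish with one set-based statement.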

      • > I would emphasize the importance of batching and set operations.

        Please, preach your gospel more loudly and frequently. It always feels like people complain about RDBMSs being slow because they run insert queries one at a time.

      • There's always a bulk insert, but I wouldn't say every engine has always had a reasonable way to bulk load truly large data... parquet really helped with interop but before that when your best option was a CSV and bcp life was not fun.
      • Well yeah they should've banned ORMs in the Geneva Convention. Quickest way to irreversibly ruin your schema design and backend code.
  • I refuse to learn SQL. I'm not a computer, I'll let them deal with that.
    • SELECT excuse FROM ignorance ORDER BY snobbery DESC LIMIT 1;
      • Love it! I am speaking as someone who has used SQL for over two decades with very good success. I find it extremely logical and a good fit for my mental model. Long live SQL!!
    • What's your job title?