- An anecdote I like to tell:
I once participated in implementing a system as a monolith, and later on handled the rewrite to microservices, to 'future-proof' the system.
The nice thing is that I have the Jira tickets for both projects, so I have actual hard proof: the microservice version absolutely didn't go smoother, take less time, or use fewer dev hours.
You can really match up a lot of it feature-by-feature and it'll be plainly visible that the microservice version of the feature took longer and had more bugs.
And IMO this is the best-case scenario for microservices. The 'good thing' about microservices is that once you have the interfaces, you can start coding. This makes these projects look more productive, at least initially.
But the issue is that, more often than not, the quality of the specs ranges from not great to awful. I've seen projects where Team A and Team B basically coded their services against wildly different interfaces, and it was only discovered in the final stretch that the two parts didn't meet.
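To make that failure mode concrete, here's a hedged sketch (the field names are invented): a tiny machine-checked contract at the boundary surfaces the divergence the first time the payloads meet, instead of in the final integration stretch.

```python
# Hypothetical illustration: Team A's client expects snake_case fields,
# Team B's service ships camelCase. A shared, executable contract check
# turns the mismatch into an immediate, visible failure.

REQUIRED_FIELDS = {"user_id", "amount"}  # the contract as Team A read the spec

def contract_violations(payload: dict) -> list:
    """Return the required fields missing from a payload (empty list = conforms)."""
    return sorted(REQUIRED_FIELDS - payload.keys())

# Team B built against a subtly different interface:
team_b_payload = {"userId": 42, "amount": 10}

print(contract_violations(team_b_payload))  # → ['user_id']
```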
- I've noticed that there's another problem with microservices as well. People tend to tie microservices and multi-repo into the same strategy.
Multi-repo appears to make teams faster (builds are faster! fewer merge conflicts!) but, like microservices, it pushes complexity into the ether. Things like updating service contracts, library updates, etc. all become more complicated.
- I think the real sin is just cutting against the grain on your service and library boundaries.
It's not that hard to version and deploy multiple services and libraries. If you need the flexibility of that separation, it can very much be worth it.
But if you separate them and still treat them like you're in a mono whatever and you cut corners on keeping your separation clean and clear, you're going to have a bad time.
Either pattern has its advantages. It's best to remember that they're just patterns, and you should be doing one or the other for a reason.
- In my experience, while monorepo makes shared libraries between services easier, that inevitably turns into "Hey, why not just put a copy of my application inside yours?"
- Oddly, I would say that this often exposes complexity? Not that that is a valid reason to go all in on it. But some things, like updating service contracts, are complicated. Indeed, anything that makes it look like many services all deployed in unison is almost certainly hiding one hell of a failure case.
- Having gone from multi-repo to monorepo recently, I'd say the opposite. A multi-repo lets you do those things incrementally. A monorepo forces you to do them in one go.
- Distributed systems are always more complex than equivalent monolithic ones. Luckily, it looks like most engineers now understand that microservices mostly make sense for big companies where the biggest issue is distributing work between lots and lots of developers in a sensible way.
- I participated in something kind of the opposite of that: multiple microservices in independent repos, but with intertwining dependencies on each other. Adding features meant shotgun surgery across at least 3 repos/services, sometimes more.
- > While small microservices are certainly simpler to reason about, I worry that this pushes complexity into the interconnections between services
100% true in retrospect.
- I've found a lot of bugs in software in my career and basically none of them were at a single spot in a codebase. They've all been where two remote spots (or, even more frequently, spots in two different codebases) interact. This was true even before microservices were a thing.
- There's a stat, I think quoted in "Code Complete" by McConnell, that says the number of bugs in a system strongly correlates with the number of coders. The conclusion is that as the # of coders goes up, the # of lines of communication between them grows quadratically (n coders have n(n-1)/2 pairs), and it's the lines of (mis)communication that lead to bugs.
This:
1. explains Brooks' assertion that adding coders to a late project makes it later
2. emphasises the importance of clearly defining interfaces between components, interfaces being the "paths of communication" between the coders of those components.
So your assertion is well founded.
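The arithmetic behind that stat is worth seeing once; a few lines make the growth rate vivid:

```python
# Each pair of coders is one potential line of (mis)communication,
# so n coders have n*(n-1)/2 pairs -- quadratic growth.
def communication_paths(n: int) -> int:
    return n * (n - 1) // 2

for n in (2, 5, 10, 50):
    print(n, "coders:", communication_paths(n), "paths")
# 2 coders: 1 path; 5: 10; 10: 45; 50: 1225
```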
- Look at Mr. Never seen an off by one error over here. I realize that putting a remote service call in between functionality adds complexity, but this is just so laughably hyperbolic.
- I've found fencepost errors but AFAICR it was always somebody calling somebody else's code with the wrong semantics.
- I mean, microservices split the code into smaller chunks, but now lots of little pieces communicate over the network, and unless you are using some form of RPC, these communication channels are not typed, and there's a lot more that can go wrong (packets dropped, DNS not resolving). Plus you could update one microservice and not update its dependents. I think a lot of people jumped on the hype without realising it's a trade-off.
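A minimal sketch of the untyped-channel hazard (the message shape and field names are made up): explicitly versioning every payload at least turns a silent mismatch between an updated service and a stale dependent into a loud one.

```python
import json

# Hypothetical wire format: the producer was updated to schema v2, but a
# dependent may still send v1 messages. Without a version check, the
# mismatch shows up as mysterious downstream bugs instead of a clear error.

def decode_order(raw: bytes) -> dict:
    msg = json.loads(raw)
    if msg.get("schema_version") != 2:
        raise ValueError(f"unsupported schema_version: {msg.get('schema_version')}")
    return {"order_id": msg["order_id"], "total_cents": msg["total_cents"]}

ok = json.dumps({"schema_version": 2, "order_id": "A1", "total_cents": 500}).encode()
print(decode_order(ok))

stale = json.dumps({"schema_version": 1, "order_id": "A1", "total": 5.0}).encode()
try:
    decode_order(stale)
except ValueError as e:
    print(e)  # fails loudly instead of mis-parsing the old shape
```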
- I work at a largish org (where microservices make sense, but there are monoliths too) and the scary bits are unowned functionality. We lean into a platform, but being a business it isn't purely generic like, say, AWS; it knows about the business. Some features are distributed across dozens of services. It is a skill hunting down who to blame for a problem. Not blame, ask for help, of course ;)
- This same logic can be used to argue against the "keep all functions at 10 or so lines of code" mantra that a lot of folks try and push.
Which is not to say it isn't valid.
- I'm always going to say that if you have third-party integrations where you call out to other organizations' services, that will be the thing that breaks down the most. You have to armor the heck out of it and plan for contingencies, and yes, that includes when third party is <Famous Company Where Surely Nothing Ever Goes Wrong>.
Microservices are just a slightly more reliable version of that, since you can hassle the author as coworker instead of via harried FCWSNEGW support mouse.
- I dream of a SQL like engine for distributed systems where you can declaratively say "svc A uses the results of B & C where C depends on D."
Then the engine would find the best way to resolve the graph and fetch the results. You could still add your imperative logic on top of the fetched results, but you don't concern yourself with the minutiae of resilience patterns and how to traverse the dependency graph.
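Python's stdlib can already express the dependency-resolution half of that dream; what's missing in practice is the cost-based fetch planning. A sketch, using the service names from the example:

```python
from graphlib import TopologicalSorter

# Declare only the dependencies: "A uses the results of B & C; C depends on D".
deps = {"A": {"B", "C"}, "C": {"D"}, "B": set(), "D": set()}

# The engine resolves a valid fetch order (leaves first); actual fetching,
# batching, and resilience policies would layer on top of this.
order = list(TopologicalSorter(deps).static_order())
print(order)  # D and B come before C, and C before A
```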
- Isn't this a common architecture in CQRS systems?
Commands go to specific microservices with local state persisted in a small DB; queries go to a global aggregation system.
- You could build something like this using Mangle datalog. The Go implementation supports extension predicates that you can use for "federated querying", with filter pushdown. Or you could model your dependency graph and then query the paths and do something custom.
You could also build a fancier federated querying system that combines the two, taking a Mangle query and analyzing and rewriting it. For that you're on your own though - I prefer developers hand-crafting something that fits their needs over a big convoluted framework that tries to be all things to all people.
- I think SQL alone is great if you didn't drink the microservice kool-aid. You can model dependencies between pieces of data, and the engine will enforce them (and the resulting correct code will probably be faster than what you could do otherwise).
Then you can run A,B,C and D from a consistent snapshot of data and get correct results.
The only thing microservices let you do is scale stateless compute, which is (architecturally) trivial to scale without microservices.
I do not believe there has been any serious server app that has had a better solution to data consistency than SQL.
All these 'webscale' solutions I've seen basically throw out all the consistency guarantees of SQL for speed. But once you need to make sure that different pieces of data are actually consistent, you're basically forced to reimplement transactions, joins, locks, etc.
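A toy version of the point, using stdlib sqlite3 (the table and invariant are invented): a CHECK constraint plus one transaction gives all-or-nothing consistency that the application never has to hand-roll.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY,"
            " balance INTEGER CHECK (balance >= 0))")
con.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
con.commit()

try:
    with con:  # one atomic transaction: both updates apply, or neither
        con.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'a'")
        con.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'b'")
except sqlite3.IntegrityError:
    pass  # overdraft rejected by the engine; the transfer rolled back whole

print(dict(con.execute("SELECT name, balance FROM accounts")))  # → {'a': 100, 'b': 0}
```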
- Datomic?
- It's a weird notion of a distributed object, even in 2014. I think I would never consider calling the methods of a distributed object directly with something like RPC, but would instead replicate the objects with a replication protocol and then use the replicas locally.
- "The consequence of this difference is that your guidelines for APIs are different. In process calls can be fine-grained, if you want 100 product prices and availabilities, you can happily make 100 calls to your product price function and another 100 for the availabilities."
While this is true, in fact for efficiency reasons it's often better to treat even local dispatch like it's "network" -- chasing pointers and doing things one at a time in a loop is far less efficient on a modern architecture than doing things in bulk and vectorized.
Non uniform memory hierarchies, caches, branch predictors, SIMD, and now GPUs, etc. all tend to reward working with data in batches.
If I were to think of a "pure" model of computation that unified remote and local it would be to treat the entire machine in terms of the relational data model, not objects. To treat all data manipulation and decisions like a query.
And to ideally in fact have the same concept of a query optimizer / planner that a DBMS has, which is able to make decisions on how to proceed based on the cost of the storage model, the indexes, etc. because it has a bigger picture of what the programmer is trying to accomplish.
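In that spirit, even an in-process API can be shaped for bulk rather than call-per-item (the function names here are hypothetical); the bulk form is the shape a planner, a cache, or SIMD can actually exploit:

```python
PRICES = {f"sku-{i}": 100 + i for i in range(1000)}

def get_price(sku: str) -> int:
    """Fine-grained: one lookup per call, 100 calls for 100 products."""
    return PRICES[sku]

def get_prices(skus: list) -> list:
    """Bulk: one call over the whole batch -- the shape that rewards caches,
    vectorization, and (remotely) a single network round trip."""
    return [PRICES[s] for s in skus]

wanted = [f"sku-{i}" for i in range(100)]
assert [get_price(s) for s in wanted] == get_prices(wanted)  # same results
```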
- 99% of systems out there are not truly microservices but SOA (fat services). A microservice is something that sends emails, transforms images, encodes video, and so on. Most real services are 100x bigger than that.
Secondly, if you are not doing event sourcing from the get-go, doing distributed systems is stupid beyond imagination.
When you do event sourcing, you can do CQRS and therefore have zero need for some humongous database that scales ad infinitum and costs an arm and a leg.
- yeah definitely agree.
Generally, the microservices that I've seen work well are the type of thing you could decide to "buy" in the build-vs-buy debate - like you say, stuff that's either "fire and forget" or stuff where you only care about a fixed output produced, not the guts of how it's done.
Anything that depends on your core business logic within the service (if customer type X, do custom process Y) is probably not going to be as clean a fit for microservices as you'd think, especially with an emergent design.
- AI has also changed the dynamics around this. Splitting things into smaller components now has a dev advantage because AI programs better with a smaller scope
- A separated component does not necessarily mean a microservice. It could be its own process, its own module, or even just its own function, which is fine. But microservices bring their own problems.
- > AI has also changed the dynamics around this. Splitting things into smaller components now has a dev advantage because AI programs better with a smaller scope
This is not AI specific, nothing new, and also precisely why microservices are a good solution to some problems: they reduce a team's cognitive load (if architected properly; caveats, team topologies, etc., etc.)
- A lot of this first law was specifically coupled to how these systems often hid that distributed objects were distributed. In the past 10 years, async has become far more commonplace, and it makes the distributed boundary much less like a secret special anomaly that you wouldn't otherwise deal with and far more like just another type of async code.
I still thoroughly want to see capnproto or capnweb make third-party handoff happen, so we can do distributed systems where we tell microservice-b to use the results from microservice-a to run its compute, without needing to proxy those results through ourselves. Oh, to dream.
- Async fixes one problem with microservices. It does not fix the unexpected latency swings, the network timeouts and errors, the service disruptions when the microservice is unavailable, etc.
- or the mismatch between request and response when using HTTP, or the overhead of using RPCs to protect against the previous scenario, or the issue of updating one microservice and not updating all the dependents
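None of which async itself handles; every remote call still ends up wrapped in something like this sketch (the flaky dependency is simulated, and the parameters are illustrative):

```python
import time

def call_with_retries(call, attempts=3, base_delay=0.01):
    """Bounded retries with exponential backoff on timeouts --
    the boilerplate that async alone doesn't remove."""
    for i in range(attempts):
        try:
            return call()
        except TimeoutError:
            if i == attempts - 1:
                raise  # out of retries: surface the disruption to the caller
            time.sleep(base_delay * 2 ** i)

# Simulated dependency that times out twice, then recovers.
state = {"fails": 2}
def flaky():
    if state["fails"] > 0:
        state["fails"] -= 1
        raise TimeoutError("upstream timed out")
    return "ok"

print(call_with_retries(flaky))  # → ok
```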