Understanding Service Discovery

Why service discovery?

Let’s assume we have X applications deployed across Y hosts, and they need to communicate with each other. How can we ensure that our components can freely communicate using the proper hosts and ports?

In an ideal world, we would like to be able to answer two simple questions:

  • How do we know which services are running?
  • How can we identify services in the network (host, port)?

The basic approach is a static configuration with a simple mapping of where each application is deployed. However, in modern cloud-based infrastructure (which we see more and more often these days), maintaining it can be really painful. We can regenerate our config after each deployment, but that does not give us the flexibility we are looking for.
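To make the static approach concrete, here is a minimal sketch. The service names, addresses and the resolve_static helper are made up for illustration; the point is only that every redeployment to a different host means editing this map by hand.

```python
# Hypothetical static mapping: service name -> (host, port).
# Every entry has to be updated manually whenever a service moves.
STATIC_SERVICES = {
    "orders": ("10.0.0.11", 8080),
    "payments": ("10.0.0.12", 8081),
}


def resolve_static(name):
    """Return (host, port) for a service, or raise if it is not in the map."""
    try:
        return STATIC_SERVICES[name]
    except KeyError:
        raise LookupError(f"no static entry for service {name!r}") from None
```

A caller would use resolve_static("orders") to get the address, and would keep getting the old one until somebody regenerates the config.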

This is where service discovery tools come in: they can help with these tasks. There are many types of service discovery: passive and active, agent-based and agentless. Which is the best discovery method? The answer depends on your needs.

Agentless vs Agent service discovery pattern

With the agentless approach, there is no agent on discoverable hosts, so there is no need to manage an agent on every one of them. This means minimal overhead when deploying a service and less configuration management. However, it can be difficult to record the state of a server without an agent in place.

The opposite is the agent discovery pattern. On every host you want to be discoverable in your network, you need to deploy an agent that handles registering the host in a catalog called the service registry. From there, you can query the service registry to find out which hosts run which services and connect to them accordingly.
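The agent pattern can be sketched roughly as follows. This is an in-memory toy and not how any particular tool implements it: ServiceRegistry, Agent and their methods are hypothetical names, and a real registry would be a separate networked service.

```python
from collections import defaultdict


class ServiceRegistry:
    """In-memory stand-in for the service registry (catalog)."""

    def __init__(self):
        # service name -> set of (host, port) pairs
        self._services = defaultdict(set)

    def register(self, service, host, port):
        self._services[service].add((host, port))

    def deregister(self, service, host, port):
        self._services[service].discard((host, port))

    def lookup(self, service):
        """Return all known (host, port) pairs for a service, sorted."""
        return sorted(self._services.get(service, ()))


class Agent:
    """Hypothetical per-host agent that reports local services to the registry."""

    def __init__(self, host, registry):
        self.host = host
        self.registry = registry

    def announce(self, service, port):
        self.registry.register(service, self.host, port)


registry = ServiceRegistry()
Agent("10.0.0.11", registry).announce("orders", 8080)
Agent("10.0.0.12", registry).announce("orders", 8080)
endpoints = registry.lookup("orders")
```

Here each host runs its own Agent, and consumers only ever talk to the registry to find out where "orders" lives.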

Active vs Passive service discovery pattern

The difference between these two lies in where decisions are made. The active service discovery pattern moves the decision to the application: it chooses which backend to use and detects failures itself. Passive patterns separate decision making from services: applications cannot decide which backend to connect to.
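As a rough illustration of the active pattern, the sketch below lets the application pick a backend and remember which ones have failed. ActiveClient and its methods are hypothetical names, and random selection is just one possible strategy. In the passive pattern this logic would live outside the application, for example in a proxy that the application always calls at a fixed address.

```python
import random


class ActiveClient:
    """Client that chooses a backend itself and tracks failures (active pattern)."""

    def __init__(self, backends, rng=None):
        self.backends = list(backends)
        self.failed = set()
        self.rng = rng or random.Random()

    def pick(self):
        """Return one backend that has not been marked as failed."""
        healthy = [b for b in self.backends if b not in self.failed]
        if not healthy:
            raise RuntimeError("no healthy backends left")
        return self.rng.choice(healthy)

    def mark_failed(self, backend):
        """Record a failure so that this backend is skipped from now on."""
        self.failed.add(backend)
```

The application would call pick() before each request and mark_failed() whenever a connection attempt fails.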

Existing service discovery solutions

Apache Zookeeper

Pros

  • Mature technology
  • Feature rich
  • Apache Foundation support

Cons

  • Latency-dependent
  • No multi-datacenter support
  • Client number limit

CoreOS etcd

Pros

  • Data persistence
  • CoreOS out-of-the-box integration

Cons

  • Young project.

Consul by Hashicorp

Pros

  • Easy to set up
  • Hashicorp!
  • Multi-datacenter support
  • Gossip protocols
  • Consensus protocols (Raft)
  • DNS or HTTP based service discovery

Cons

  • Immature. Even younger than etcd.
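To give an idea of what HTTP-based discovery with Consul looks like, here is a small sketch built around the catalog endpoint /v1/catalog/service/<name>. The helper names are hypothetical, the agent address assumes a local agent on the default port 8500, and the sample body is a hand-written, trimmed-down response rather than real output.

```python
import json


def consul_catalog_url(service, agent="http://127.0.0.1:8500"):
    """Build the catalog lookup URL (assumes a local agent on port 8500)."""
    return f"{agent}/v1/catalog/service/{service}"


def parse_catalog_response(body):
    """Turn a catalog response body (JSON text) into (host, port) pairs."""
    endpoints = []
    for entry in json.loads(body):
        # ServiceAddress may be empty; fall back to the node's Address.
        host = entry.get("ServiceAddress") or entry["Address"]
        endpoints.append((host, entry["ServicePort"]))
    return endpoints


# Hand-written sample body with only the fields used above.
sample_body = json.dumps([
    {"Address": "10.0.0.11", "ServiceAddress": "", "ServicePort": 8080},
    {"Address": "10.0.0.12", "ServiceAddress": "10.1.0.12", "ServicePort": 8080},
])
endpoints = parse_catalog_response(sample_body)
```

In a real setup the body would come from an HTTP GET on consul_catalog_url("orders"), for example with urllib.request.urlopen.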

Comparison

                          Zookeeper            Consul                 Etcd
Service Discovery         HTTP API             HTTP API + DNS         HTTP API
Healthchecking            TCP                  HTTP API               Up to application
Key/Value store           Strong consistency   3 consistency modes    Strong consistency
Multidatacenter support
ACL
Language                  Java                 Go                     Go


Tab. 1. Comparison of production ready service discovery solutions

Other resources

If you want to read more, here are some other great resources on this topic: