I strongly oppose passing data between services through cache!

[Figure: service-A writes data into a shared cache; service-B reads the data out]

Moving data requires a carrier, and the database and the cache are the most common data storage carriers.

As shown above:

  • Service-A puts the data into cache;
  • Service-B reads data from the cache;

The benefits of cache as a data storage medium are:

  • Cache reads and writes are very fast;
  • Service-A and service-B are physically decoupled;

So here comes the question:

  • Have you ever encountered this kind of "data transfer between services through cache" architecture design?
  • Is this architecture design good or bad? Why?

Regarding this architectural design, I would like to share my personal views.

Do you support this architectural design?

Let me state the conclusion first: I am clearly against "transferring data between services through cache".

Why oppose?

There are three core reasons.

First point: In the data pipeline scenario, MQ is more suitable than cache.

If the cache is used purely as a pipeline for data communication between two services, where service-A produces data and service-B (and possibly service-C, service-D, etc.) subscribes to it, then MQ is a better fit than cache:

(1) MQ is a common component for logical and physical decoupling on the Internet. It supports both 1:1 and 1:many modes and is a very mature data channel.

(2) A shared cache couples service-A/B/C/D together: every party has to agree on the key format, the cache's IP address, and so on;

(3) MQ supports push, while a cache can only be pulled, so delivery is not real-time and incurs a delay;

(4) MQ naturally supports clustering and high availability, while a cache may not;

(5) MQ supports persisting data to disk, while a cache keeps data in memory, which is volatile. Some caches do support persistence, but the principle of Internet technology selection is to let professional software do professional things: nginx for reverse proxying, the database for persistence, the cache for caching, and MQ for data channels;

In summary, MQ is more suitable than cache in data pipeline scenarios.
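The push-vs-pull difference above can be sketched in-process. This is only an illustrative sketch: a plain dict stands in for a shared cache (e.g. Redis) and `queue.Queue` stands in for a real broker (Kafka, RabbitMQ); all key names and values are made up.

```python
import queue
import threading

# --- Cache as a pipeline: service-B must poll, so delivery is delayed ---
cache = {}  # stand-in for a shared cache; both sides must agree on the key

def service_a_writes_cache():
    cache["agreed:key"] = "order-123"  # A and B are coupled by this key name

def service_b_polls_cache():
    # service-B only sees the data on its next poll cycle (pull model)
    return cache.get("agreed:key")

# --- MQ as a pipeline: the consumer is woken as soon as a message arrives ---
mq = queue.Queue()  # stand-in for a real message broker

def service_a_publishes():
    mq.put("order-123")

def service_b_subscribes(received):
    # blocks until a message arrives: event-driven, not poll-driven
    received.append(mq.get(timeout=1))

service_a_writes_cache()
assert service_b_polls_cache() == "order-123"

received = []
consumer = threading.Thread(target=service_b_subscribes, args=(received,))
consumer.start()          # consumer is already waiting before the publish
service_a_publishes()
consumer.join()
assert received == ["order-123"]
```

The cache side only delivers data when service-B happens to poll, while the queue side hands the message over the moment it is produced, which is the real-time advantage the article describes.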

Second point: In the data co-management scenario, two (or more) services reading and writing a cache instance at the same time will lead to coupling.

If it is not a data pipeline, but two (or more) services co-manage data in a cache and read and write at the same time, it is also not recommended. These services will be coupled together because of this cache:

(1) Every party has to agree on the key format, the cache's IP address, and so on, which is coupling in itself;

(2) If two services happen to write the same key, their writes overwrite each other, leading to data inconsistency;

(3) Different services have different business models, data volumes, and concurrency levels, and sharing one cache lets them interfere with each other. For example, if service-A has a large data volume and occupies most of the cache memory, it will squeeze service-B's hot data out of the cache, causing cache misses. Likewise, if service-A has high concurrency and occupies most of the cache connections, service-B may be unable to obtain a connection, causing service anomalies.

In summary, in the scenario of data co-management, it is not recommended to couple multiple services in one cache instance. Vertical splitting and instance decoupling are required.
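The key-collision problem and the vertical split can be sketched with plain dicts standing in for cache instances; the key `"user:42"` and the field values are illustrative only.

```python
# Two services sharing one cache instance: the same key silently collides.
shared_cache = {}
shared_cache["user:42"] = {"name": "Alice"}            # written by service-A
shared_cache["user:42"] = {"last_login": "2024-01-01"} # service-B overwrites A

# Vertical split: each service owns its own instance, so there is no key
# contention, and A's data volume can no longer evict B's hot keys.
cache_a = {}
cache_b = {}
cache_a["user:42"] = {"name": "Alice"}
cache_b["user:42"] = {"last_login": "2024-01-01"}
assert cache_a["user:42"]["name"] == "Alice"  # A's data survives B's writes
```

With real caches, "vertical split" means giving each service its own instance (or at minimum a strictly owned key namespace), so neither memory pressure nor connection exhaustion in one service spills into another.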

Third point: In the data access scenario, two (or more) services need to read and write the same data.

Under service-oriented design principles, data is private to its service (which is, in essence, decoupling):

(1) The service layer shields data consumers from the complexity of the underlying storage engine, database sharding, and table sharding;

(2) No consumer may bypass the service to read or write its backend data directly;

If another service needs the data, it should call the RPC interface the service provides, rather than reading or writing the backend storage directly, whether that storage is a cache or a database.
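The principle of data privacy can be sketched as follows. `UserService`, its in-memory "db" and "cache", and the user record are all hypothetical stand-ins: the class models an RPC-exposed service, and callers use only its interface.

```python
# User data is private to UserService; callers go through its interface
# (a stand-in for an RPC stub), never through its cache or database directly.
class UserService:
    def __init__(self):
        self._db = {42: {"name": "Alice"}}  # private storage, hidden from callers
        self._cache = {}                    # private cache, also hidden

    def get_user(self, uid):
        # The service encapsulates its own cache-aside logic, sharding, etc.
        if uid not in self._cache:
            self._cache[uid] = self._db.get(uid)
        return self._cache[uid]

user_service = UserService()
# service-B calls the interface; it cannot (and should not) touch _cache or _db.
assert user_service.get_user(42)["name"] == "Alice"
```

Because the cache is an internal detail of `UserService`, it can be resized, invalidated, or replaced without any coordination with its callers, which is exactly the decoupling the article argues for.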

In summary

  • In data pipeline scenarios, MQ is more suitable than cache;
  • Multiple services should not share a cache instance, but should be vertically split and decoupled;
  • In a service-oriented architecture, you should not bypass the service to read the cache/db on the backend, but access it through the RPC interface;

[This article is an original article by 51CTO columnist "58 Shen Jian". Please contact the original author for reprinting.]
