Distributed, high concurrency, multithreading: when these three terms come up, do you assume that distributed = high concurrency = multithreading? When an interviewer asks how to design a system for high concurrency, or how a distributed system solves consistency problems, do you get confused? Many people do confuse the three when they first meet them. They assume that a "distributed high-concurrency system" is simply one that a large number of users can access at the same time, and that multithreading is what gives the system its concurrency. In fact, the three tend to appear together, but each has a different focus.
What is distributed?
Distribution is more of a concept than a single technology: it is an approach to overcoming the capacity and performance bottlenecks of a single physical server. The field covers many problems, including distributed file systems, distributed caches, distributed databases, and distributed computing. At various technical levels, names such as Hadoop, Zookeeper, and MQ are all related to distribution. Conceptually, there are two forms of distributed implementation:
Horizontal expansion: when one machine cannot handle the traffic, add more machines and spread the traffic evenly across them, so that every machine provides the same service.
Vertical splitting: when the front end issues several kinds of requests and one machine cannot handle them all, send each kind of request to a different machine. For example, machine A handles remaining-ticket queries and machine B handles payments.
What is high concurrency?
Compared with distribution, high concurrency is framed as a problem to be solved: it describes how many requests a system must handle at the same time. For example, an online live-streaming service may be watched by tens of thousands of people simultaneously. High concurrency can be addressed with distributed technology, which divides the concurrent traffic among different physical servers. There are many other optimizations as well: for example, using a cache system, or serving all static content from a CDN. Multithreading can also be used to get the most out of a single server.
What is multithreading?
Multithreading is the technique of executing multiple threads concurrently, implemented in software or hardware. It is mainly about how the CPU schedules multiple threads so that they appear to run simultaneously (on a single core, they actually take turns).
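To see how the three ideas relate, here is a minimal Java sketch, not a production design. The class `RoutingSketch`, the server names `A1`..`B2`, and the request types `ticket-query` and `payment` are all hypothetical and chosen only for illustration. Vertical splitting is modeled as one server group per request type, horizontal expansion as round-robin within a group, and concurrent users as tasks on a thread pool.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RoutingSketch {
    // Vertical splitting: each kind of request has its own group of machines.
    // Server names are hypothetical placeholders, not real hosts.
    private static final Map<String, List<String>> GROUPS = Map.of(
            "ticket-query", List.of("A1", "A2", "A3"),
            "payment", List.of("B1", "B2"));

    // One round-robin cursor per request type. AtomicInteger keeps the
    // increment safe when many threads call route() at the same time.
    private static final Map<String, AtomicInteger> NEXT = new ConcurrentHashMap<>();

    // How many requests each server has received so far.
    public static final Map<String, AtomicInteger> HITS = new ConcurrentHashMap<>();

    // Horizontal expansion: spread requests of one type evenly over its group.
    public static String route(String type) {
        List<String> servers = GROUPS.get(type);
        if (servers == null) {
            throw new IllegalArgumentException("unknown request type: " + type);
        }
        int turn = NEXT.computeIfAbsent(type, k -> new AtomicInteger()).getAndIncrement();
        String server = servers.get(Math.floorMod(turn, servers.size()));
        HITS.computeIfAbsent(server, k -> new AtomicInteger()).incrementAndGet();
        return server;
    }

    // High concurrency, simulated: a thread pool issues many requests at once.
    public static void simulate(int threads, int requestsPerType) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requestsPerType; i++) {
            pool.execute(() -> route("ticket-query"));
            pool.execute(() -> route("payment"));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        simulate(8, 600);
        System.out.println(HITS);
    }
}
```

Under these assumptions, 600 requests of each type leave every `A` server with 200 requests and every `B` server with 300, regardless of how the threads interleave. If the cursor were a plain `int` incremented with `++`, concurrent threads could read the same value and the split would no longer be guaranteed to be even.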
Among these concepts, multithreading addresses the most clearly defined problem, and its approach is relatively simple. The biggest problem you will run into is thread safety. In Java, you need a solid understanding of the JVM memory model, instruction reordering, and related topics to write high-quality multithreaded code. To summarize:
Distributed and high-concurrency systems involve a large number of concepts and knowledge points. Without systematic study, it is easy to confuse them, which leads to difficulties in interviews and in day-to-day work.