Getting Started with Dubbo - Distributed and Clustered

What is Dubbo

Dubbo is a high performance, lightweight, open source Java RPC framework that provides three core capabilities: Interface-oriented remote method calls, smart fault tolerance and load balancing, and automatic service registration and discovery.

What is RPC

RPC stands for Remote Procedure Call.

A procedure is a piece of executable code, and a remote call means that we can run that code in another process, or even on another machine, and still receive its return value. By this definition, requesting an HTTP address to fetch data is also an RPC, but doing it by hand is cumbersome: the data must first be packaged into an HTTP request, a request library must be invoked, and the text result must then be converted back into objects. Both execution efficiency and development efficiency are lower than with a dedicated RPC framework.

The goal of RPC is to make calling a service on a remote server as easy as calling a local method.

Why RPC is needed

RPC is the cornerstone of distributed architecture. A distributed architecture splits the modules of one system into separate subsystems deployed on different servers, so RPC is required for the subsystems to call each other.

In short, a distributed system cannot do without RPC, and it is in distributed systems that RPC delivers its core value.

Implementation principle of RPC

Without question, the underlying layer uses sockets for network communication, but how can a method on another machine be called directly? The flow is as follows:

1) The service consumer (client) invokes the service as if it were a local call;

2) client stub is responsible for assembling methods, parameters, etc. into a message body capable of network transmission after receiving the call;

3) client stub finds the service address and sends the message to the server;

4) server stub decodes the message after it receives it;

5) server stub calls local services based on decoding results;

6) The local service executes and returns the result to the server stub;

7) server stub packages the returned results into messages and sends them to consumers;

8) client stub receives messages and decodes them;

9) Service consumers get the final result.

Of course, when a parameter or return value is a Java object, it also needs to be serialized and deserialized.
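The nine steps above can be sketched in plain Java. This is a minimal in-process illustration under stated assumptions, not how Dubbo itself is implemented: the names RpcStubSketch, GreetingService and Request are hypothetical, Java's built-in serialization stands in for the wire format, and a direct method call stands in for the socket transfer between the two stubs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class RpcStubSketch {

    /** Service contract shared by consumer and provider (hypothetical). */
    public interface GreetingService {
        String greet(String name);
    }

    /** Message body assembled by the client stub: method name, parameter types, arguments. */
    static class Request implements Serializable {
        final String methodName;
        final Class<?>[] parameterTypes;
        final Object[] arguments;

        Request(String methodName, Class<?>[] parameterTypes, Object[] arguments) {
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.arguments = arguments;
        }
    }

    /** Serializes an object into bytes that could be sent over the network. */
    static byte[] encode(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    /** Deserializes bytes received from the network back into an object. */
    static Object decode(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** Server stub (steps 4 to 7): decode the message, call the local service, encode the result. */
    static byte[] serverStub(Class<?> serviceInterface, Object localService, byte[] requestBytes)
            throws Exception {
        Request request = (Request) decode(requestBytes);
        Method method = serviceInterface.getMethod(request.methodName, request.parameterTypes);
        Object result = method.invoke(localService, request.arguments);
        return encode(result);
    }

    /** Client stub (steps 2, 3 and 8): a dynamic proxy that turns each call into a message. */
    public static <T> T clientStub(Class<T> serviceInterface, Object remoteService) {
        Object proxy = Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface},
                (self, method, arguments) -> {
                    byte[] requestBytes = encode(
                            new Request(method.getName(), method.getParameterTypes(), arguments));
                    // Assumption: a real framework would send these bytes over a socket here.
                    byte[] responseBytes = serverStub(serviceInterface, remoteService, requestBytes);
                    return decode(responseBytes);
                });
        return serviceInterface.cast(proxy);
    }

    public static void main(String[] args) {
        GreetingService provider = name -> "hello: " + name;
        GreetingService consumerSide = clientStub(GreetingService.class, provider);
        // Steps 1 and 9: the consumer calls the proxy like a local object and gets the result.
        System.out.println(consumerSide.greet("jerry"));
    }
}
```

The consumer only ever sees the GreetingService interface, which is why a shared interface is all it needs in order to obtain a proxy object.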

Distributed and Clustered


A cluster architecture replicates the same processing logic (deploys copies of the same code) to create a set of services with identical functionality. Each service in the cluster can complete a user's request independently, so the nodes rarely need to communicate with each other and do not need RPC.


A distributed architecture splits a system into separate subsystems and deploys them on different machines.

When a task is processed, it is divided into several subtasks that are distributed to different subsystems, each of which handles only part of the task. A complete task usually contains several processing steps. For example, when a user purchases a product, the system first creates an order and then updates the inventory; if the inventory service is provided by another server, this is where RPC comes in.
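The purchase example can be sketched as follows. This is an illustrative sketch only: OrderService, InventoryService and InMemoryInventoryService are hypothetical names, and the inventory subsystem is simulated in memory. In a distributed deployment the InventoryService implementation would live on another server and be reached through an RPC proxy, while the order code would stay the same.

```java
import java.util.HashMap;
import java.util.Map;

public class OrderFlowSketch {

    /** Contract of the inventory subsystem; the order subsystem depends only on this. */
    public interface InventoryService {
        boolean deduct(String sku, int quantity);
    }

    /** In-memory stand-in for the inventory subsystem (assumption: normally a remote service). */
    public static class InMemoryInventoryService implements InventoryService {
        private final Map<String, Integer> stock = new HashMap<>();

        public InMemoryInventoryService(Map<String, Integer> initialStock) {
            stock.putAll(initialStock);
        }

        @Override
        public boolean deduct(String sku, int quantity) {
            int available = stock.getOrDefault(sku, 0);
            if (quantity <= 0 || available < quantity) {
                return false;
            }
            stock.put(sku, available - quantity);
            return true;
        }

        public int remaining(String sku) {
            return stock.getOrDefault(sku, 0);
        }
    }

    /** Order subsystem: creates the order, then delegates the stock change to the inventory subsystem. */
    public static class OrderService {
        private final InventoryService inventory;
        private int nextOrderId = 1;

        public OrderService(InventoryService inventory) {
            this.inventory = inventory;
        }

        /** Returns the new order id, or -1 when the inventory subsystem rejects the deduction. */
        public int purchase(String sku, int quantity) {
            int orderId = nextOrderId++;
            if (!inventory.deduct(sku, quantity)) {
                return -1;
            }
            return orderId;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("book", 2);
        InMemoryInventoryService inventory = new InMemoryInventoryService(stock);
        OrderService orders = new OrderService(inventory);
        System.out.println(orders.purchase("book", 2));
        System.out.println(inventory.remaining("book"));
    }
}
```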

Distributed and clustered systems differ fundamentally in their underlying architecture, so converting a system that was originally clustered into a distributed one requires extensive changes. If high concurrency is expected later in the system's life, it is better to adopt a distributed architecture early in the project.

Is distribution necessary?

Advantages and disadvantages of a distributed architecture:

Advantages:

  • Faster computation, because tasks that were originally serial can run in parallel (provided they do not depend on each other)
  • Higher availability, because the system is spread across different computing nodes and the failure of one node does not have a significant impact on the whole system
  • Each subsystem runs independently, which greatly reduces coupling and improves the scalability and maintainability of each subsystem
  • Because of modularity, system modules are more reusable (at the system level)
  • Technology choices are open and diverse: a subsystem can be developed in another language or on another platform
  • More efficient use of hardware resources

Disadvantages:

  • Longer response time, because RPC is required
  • More complex system architecture and more cumbersome maintenance
  • Service management and scheduling are required
  • Testing and debugging are more complex
  • Common modules cannot be reused (at the code level)

It should be emphasized that distribution and clustering are not mutually exclusive: in high-concurrency scenarios, nodes under heavy load can themselves be deployed as clusters.

Distributed and Micro Services:

Distributed systems are loosely coupled systems composed of multiple processors connected by communication lines; the term is a broad concept.

Microservices are also structurally distributed; they emphasize the complete independence and decoupling of each function.

RPC and microservices are at the same level: a distributed system can be implemented with either RPC or microservices.

System Architecture Evolution:

Whether RPC or microservices are used, SOA is the end point of this evolution for handling massive concurrent access.

Why do I need Dubbo:

To quote the official documentation:

Before services are adopted at scale, applications may simply expose and reference remote services through tools such as RMI or Hessian, call them by configuring the service URL, and load-balance through hardware such as F5.

As the number of services grows, managing service URL configuration becomes very difficult, and the single-point pressure on the F5 hardware load balancer increases. A service registry is then needed to register and discover services dynamically and make their locations transparent. By obtaining the list of provider addresses on the consumer side, soft load balancing and failover can be implemented, which reduces the dependence on the F5 hardware load balancer and saves part of the cost.

As service dependencies become more complex, it is no longer clear which application must start before which, and even architects cannot fully describe the architectural relationships between applications. A dependency diagram between applications then needs to be drawn automatically to help architects sort out these relationships.

Then, as call volume grows, the question of service capacity is exposed. How many machines does this service need? When should machines be added? To answer these questions, the first step is to collect the daily call volume and response time of each service as a reference for capacity planning. The second step is to adjust weights dynamically: in production, increase the weight of one machine and record how the response time changes until it reaches the threshold, record the number of visits at that point, and then multiply that number by the number of machines to derive the total capacity.
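The capacity calculation described in the last paragraph amounts to simple arithmetic. The sketch below works through it with made-up numbers; the class name, method names, sample values and the 100 ms threshold are all assumptions for illustration and do not come from Dubbo.

```java
public class CapacityEstimate {

    /**
     * Walks through the samples recorded while the weight of one machine was increased
     * and returns the last request volume whose response time stayed within the threshold.
     */
    public static long loadAtThreshold(long[] requests, double[] responseMillis, double thresholdMillis) {
        if (requests.length != responseMillis.length) {
            throw new IllegalArgumentException("sample arrays must have the same length");
        }
        long sustained = 0;
        for (int i = 0; i < requests.length; i++) {
            if (responseMillis[i] > thresholdMillis) {
                break; // the machine is past its limit, stop here
            }
            sustained = requests[i];
        }
        return sustained;
    }

    /** Total capacity = volume one machine sustains at the threshold x number of machines. */
    public static long totalCapacity(long perMachineLoad, int machineCount) {
        if (perMachineLoad < 0 || machineCount < 0) {
            throw new IllegalArgumentException("values must not be negative");
        }
        return Math.multiplyExact(perMachineLoad, (long) machineCount);
    }

    public static void main(String[] args) {
        // Hypothetical samples: request volume and response time as the weight goes up.
        long[] requests = {400, 800, 1200, 1600};
        double[] responseMillis = {20, 35, 80, 150};
        long perMachine = loadAtThreshold(requests, responseMillis, 100);
        System.out.println(perMachine);                   // 1200
        System.out.println(totalCapacity(perMachine, 8)); // 9600
    }
}
```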

Simply put, Dubbo not only implements RPC but also provides a complete set of governance solutions for distributed services, including:

  • Service registration and discovery
  • Load balancing
  • Traffic scheduling
  • Visual tools for service governance and maintenance

Architecture and service invocation process

Give an example


Node        Role description
Provider    Service provider that exposes services
Consumer    Service consumer that invokes remote services
Registry    Registry for service registration and discovery
Monitor     Monitoring center that counts service calls and call times
Container   Container in which the service runs

Call procedure:

  1. The service container is responsible for starting, loading, and running the service provider.
  2. Service providers register their services with the registry at startup.
  3. When service consumers start up, they subscribe to the registry for the services they need.
  4. The registry returns a list of service provider addresses to the consumer, and if there is a change, the registry will push the change data to the consumer based on a long connection.
  5. The service consumer selects a provider from the address list using a soft load balancing algorithm, and if the call fails, it selects another provider and calls again.
  6. Service consumers and providers accumulate call counts and call times in memory and send the statistics to the monitoring center once a minute.
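Steps 2 to 5 of the call procedure can be simulated in a few lines of plain Java. This is a simplified sketch and not Dubbo's implementation: Registry and Consumer are hypothetical classes, the addresses are made up, round-robin selection is used only because it is deterministic, and neither the push of address changes nor the monitoring center is modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

public class RegistrySketch {

    /** Registry: maps a service name to the addresses of its providers. */
    public static class Registry {
        private final Map<String, List<String>> providers = new ConcurrentHashMap<>();

        /** Step 2: a provider registers its address at startup. */
        public void register(String service, String address) {
            providers.computeIfAbsent(service, key -> new CopyOnWriteArrayList<>()).add(address);
        }

        /** Step 4: the registry returns the provider address list to a consumer. */
        public List<String> lookup(String service) {
            return new ArrayList<>(providers.getOrDefault(service, Collections.emptyList()));
        }
    }

    /** Consumer: holds the address list and picks a provider for each call. */
    public static class Consumer {
        private final List<String> addresses;
        private int next = 0;

        /** Step 3: the consumer asks the registry for the service it needs. */
        public Consumer(Registry registry, String service) {
            this.addresses = registry.lookup(service);
        }

        /** Step 5: round-robin selection, moving on to the next provider when a call fails. */
        public String invoke(Function<String, String> call) {
            RuntimeException lastFailure = null;
            for (int attempt = 0; attempt < addresses.size(); attempt++) {
                String address = addresses.get(next);
                next = (next + 1) % addresses.size();
                try {
                    return call.apply(address);
                } catch (RuntimeException failure) {
                    lastFailure = failure; // remember the error and try another provider
                }
            }
            throw new IllegalStateException("no provider available", lastFailure);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("HelloService", "10.0.0.1:20880");
        registry.register("HelloService", "10.0.0.2:20880");

        Consumer consumer = new Consumer(registry, "HelloService");
        // The first provider is simulated as down, so the call fails over to the second one.
        String reply = consumer.invoke(address -> {
            if (address.startsWith("10.0.0.1")) {
                throw new IllegalStateException("provider down");
            }
            return "hello from " + address;
        });
        System.out.println(reply);
    }
}
```

Because the consumer keeps the address list locally, the registry is not involved in the call itself; it only needs to be reachable when the list is fetched or changes.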

Hello Dubbo

1. Create a Maven project named DubboDemo

2. Create a provider module under the current project

3. Add dependencies to the provider


4. Dubbo publishes services in units of interfaces, so we need to create a service interface. The consumer side needs the same interface in order to generate proxy objects. To avoid duplicating this shared code, we can create a new module for it and let both the provider and the consumer depend on that module to obtain the interface.

Create the interface in the common module:

package com.yyh.service;

public interface HelloService {
    String helloMan(String name);
}

Add a dependency on the newly created module in the pom


5. Create the implementation class in the provider

package com.yyh.service.impl;

import com.yyh.service.HelloService;

public class HelloServiceImpl implements HelloService {
    @Override
    public String helloMan(String name) {
        return "hello: " + name;
    }
}

6. Write the provider configuration file provider.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns=""
       xmlns:xsi="" xmlns:dubbo="">
    <!--Unique name of the current application within the whole system, used to compute dependencies-->
    <dubbo:application name="hello-service">
      <dubbo:parameter key="qos.enable" value="true"/>
    </dubbo:application>
    <!--Address of the registry to which the service is exposed; N/A means no registry is used-->
    <dubbo:registry address="N/A"/>
    <!--Protocol used to publish the service; other options include webservice, Thrift, Hessian, http-->
    <dubbo:protocol name="dubbo" port="20880"/>
    <!--Service publishing configuration: the service interface to expose-->
    <dubbo:service interface="com.yyh.service.HelloService" ref="helloService"/>
    <!--Bean definition-->
    <bean id="helloService" class="com.yyh.service.impl.HelloServiceImpl"/>
</beans>

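7. Start the service

import java.io.IOException;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Runner {
    public static void main(String[] args) throws IOException {
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext("classpath:provider.xml");
        System.out.println("press any key to exit");
        System.in.read(); // block here so the provider keeps running
    }
}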

8. For debugging purposes we can provide a log4j configuration file in the resources directory

log4j.appender.console.Target = System.out
log4j.appender.console.layout.ConversionPattern = %-d{yyyy-MM-dd HH:mm:ss} [ %t:%r]-[%p] %m%n

9. Create the consumer module

Add the same dependencies to its pom

10. Create the configuration file consumer.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns=""
       xmlns:xsi="" xmlns:dubbo="">
    <!--Application name-->
    <dubbo:application name="hello-consumer"/>
    <!--Registry-->
    <dubbo:registry address="N/A"/>
    <!--Create a proxy object and put it into the container-->
    <dubbo:reference id="helloService" interface="com.yyh.service.HelloService"/>
</beans>


11. Run the test:

import com.yyh.service.HelloService;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Runner {
    public static void main(String[] args) {
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext("classpath:consumer.xml");
        HelloService helloService = (HelloService) context.getBean("helloService");
        String jerry = helloService.helloMan("jerry");
        // Print the value returned by the remote call
        System.out.println(jerry);
    }
}

If hello: jerry is printed, the call to the service succeeded.

Tags: Java Dubbo log4j xml Apache

Posted on Sat, 07 Mar 2020 12:06:16 -0500 by mattcairns