Distributed storage ShardingSphere

Previously, we split databases and tables ourselves, without any middleware. This chapter introduces ShardingSphere, a middleware for database and table sharding. It comprises three open source distributed database middleware solutions:

Sharding-JDBC [client agent], the main subject of this article: it is positioned as an enhanced JDBC driver. In short, it completes the routing and sharding work on the application side. When our business code operates the database, it connects through Sharding-JDBC, which performs the core sharding actions such as SQL parsing, routing, execution, and result merging. It works on the client side.

Sharding-Proxy [server agent]: in short, where our application used to connect directly to the database, after Sharding-Proxy is introduced the application connects to the proxy instead, and the proxy processes and forwards requests to MySQL. The advantage is that users do not need to be aware of sharding at all; it looks like normal access to MySQL.

Sharding-Sidecar: a cloud-native database agent targeting Kubernetes. It has not been officially released yet.

Key concepts of Sharding-JDBC

  • [logical table]: a logical table can be understood like a view in the database: a virtual table. It can map to one physical table or be composed of multiple physical tables, which can even come from different data sources. For example, if we define a logical table t_order, operations against t_order are mapped to the actual physical tables according to the sharding rules.
    • spring.shardingsphere.rules.sharding.tables.t_order.actual-data-nodes=ds-$->{0..1}.t_order_$->{0..1}
  • [broadcast table]: a broadcast table is also called a global table; it exists redundantly in every database to avoid cross-database queries. Basic data such as provinces or dictionaries are typical examples. To avoid cross-database joins against this basic data after sharding, the data is synchronized to every database node; such a table is called a broadcast table.
    • # Broadcast table whose primary node is ds0
      spring.shardingsphere.sharding.broadcast-tables=t_config
      spring.shardingsphere.sharding.tables.t_config.actual-data-nodes=ds$->{0}.t_config
  • [bound table]: some of our tables have a logical primary/foreign key relationship, and cross-database joins on them are troublesome. Binding keeps related rows in the same database: for example, the order with order_id = 1001 lives in node1, and all of its detail rows are also placed in node1, so the join still happens within one database.
    • #Binding table rules. Multiple groups of binding rules are configured in the form of array
    • spring.shardingsphere.rules.sharding.binding-tables=t_order,t_order_item
    • If there are multiple binding table rules, they can be declared as arrays
    • spring.shardingsphere.rules.sharding.binding-tables[0]= # binding table rule list
    • spring.shardingsphere.rules.sharding.binding-tables[x]= # binding table rule list

Using Sharding-JDBC [plain Java]

As with any third-party library, we just need to add the Maven dependency. Here we focus on the configuration: when sharding, we need to design the database-splitting rules, the table-splitting rules, and the primary key generation algorithm, and configure all of them in Sharding-JDBC.
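For reference, the dependency looks roughly like this (the 5.0.x coordinates are an assumption; match the version to your project):

```xml
<!-- coordinates assumed: ShardingSphere 5.0.x JDBC core -->
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>shardingsphere-jdbc-core</artifactId>
    <version>5.0.0</version>
</dependency>
```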

private static Map<String, DataSource> createDataSourceMap(){
    //Represents the real data sources: logical library names mapped to real databases
    Map<String,DataSource> dataSourceMap=new HashMap<>();
    dataSourceMap.put("ds0", DataSourceUtil.createDataSource("ds0"));
    dataSourceMap.put("ds1", DataSourceUtil.createDataSource("ds1"));
    return dataSourceMap;
}

//Create the sharding rule. For both database and table we must configure:
// * the partition (sharding) key
// * the sharding algorithm
// * a globally unique id strategy
//Below, the database is chosen by taking user_id modulo the database count,
//and the table is chosen by taking order_id modulo the table count.
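The mod-based routing described in the comments above can be sketched standalone (`InlineModDemo` and `route` are illustrative names, not ShardingSphere API; they just compute what an inline expression like `ds${user_id % 2}` would):

```java
public class InlineModDemo {
    // route a user_id to a logical database name by taking it modulo 2
    public static String route(long userId) {
        return "ds" + (userId % 2);
    }

    public static void main(String[] args) {
        System.out.println(route(1001L)); // ds1
        System.out.println(route(4L));    // ds0
    }
}
```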

private static ShardingRuleConfiguration createShardingRuleConfiguration(){
    ShardingRuleConfiguration configuration=new ShardingRuleConfiguration();
    //Add the correspondence between the logical table and the real tables to the sharding rule configuration
    configuration.getTables().add(getOrderTableRuleConfiguration());

    //Set the database sharding rule: take user_id modulo 2 to find the real database;
    //ds refers to our logical library prefix
    configuration.setDefaultDatabaseShardingStrategy(
            new StandardShardingStrategyConfiguration("user_id","db-inline"));
    Properties properties=new Properties();
    properties.setProperty("algorithm-expression","ds${user_id % 2}");
    configuration.getShardingAlgorithms()
            .put("db-inline",new ShardingSphereAlgorithmConfiguration("INLINE",properties));

    //Set the table sharding rule (horizontal splitting of data): the sharding key is
    //order_id, and we use the INLINE algorithm
    configuration.setDefaultTableShardingStrategy(
            new StandardShardingStrategyConfiguration("order_id","table-inline"));
    Properties props=new Properties();
    props.setProperty("algorithm-expression","t_order_${order_id % 2}");
    configuration.getShardingAlgorithms()
            .put("table-inline",new ShardingSphereAlgorithmConfiguration("INLINE",props));

    //Set the primary key generation strategy, e.g.:
    // * UUID
    // * Snowflake algorithm
    Properties idProperties=new Properties();
    configuration.getKeyGenerators().put("snowflake",
            new ShardingSphereAlgorithmConfiguration("SNOWFLAKE",idProperties));
    return configuration;
}
//Configure the logical table and its key generation strategy
private static ShardingTableRuleConfiguration getOrderTableRuleConfiguration(){
    //Configure the relationship between the logical table and the real tables
    ShardingTableRuleConfiguration tableRuleConfiguration=
            new ShardingTableRuleConfiguration("t_order","ds${0..1}.t_order_${0..1}");
    tableRuleConfiguration.setKeyGenerateStrategy(new KeyGenerateStrategyConfiguration("order_id","snowflake"));
    return tableRuleConfiguration;
}

//Use ShardingSphere to create a data source: we pass the real data sources
//plus the database/table sharding rule configuration
public static DataSource getDatasource() throws SQLException {
    return ShardingSphereDataSourceFactory.createDataSource(
            createDataSourceMap(),
            Collections.singleton(createShardingRuleConfiguration()),
            new Properties());
}

Once the configuration is written, calling the getDatasource method above returns a DataSource, and Sharding-JDBC processes statements according to the configuration above.

Code for creating data source

public class DataSourceUtil {

    private static final String HOST = "";

    private static final int PORT = 3306;

    private static final String USER_NAME = "root";

    private static final String PASSWORD = "123456";

    public static DataSource createDataSource(final String dataSourceName) {
        HikariDataSource result = new HikariDataSource();
        result.setJdbcUrl(String.format("jdbc:mysql://%s:%s/%s?serverTimezone=UTC&useSSL=false&useUnicode=true&characterEncoding=UTF-8", HOST, PORT, dataSourceName));
        result.setUsername(USER_NAME);
        result.setPassword(PASSWORD);
        return result;
    }
}

During testing, the data source is obtained through getDatasource. We always write SQL against the logical tables; Sharding-JDBC intercepts the SQL and routes it automatically according to the configured rules.


Using Sharding-JDBC [Spring Boot]

In Spring Boot, as long as the relevant jars are imported, it automatically intercepts SQL. Let's look at the common sharding algorithms it provides.

  • [automatic sharding algorithms]: we only need to configure them and then operate on the configured logical tables or libraries in Spring Boot to get the corresponding effect. It provides just a few.
    • Sharding by data volume: we define the data range, and routing happens automatically during operation. For example, with a total range of 600 rows and a volume of 200 rows per table, each table stores only 200 rows.
    • server.port=8080
      # Volume-range-based sharding strategy (algorithm name t-order-volume-range is illustrative)
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.type=VOLUME_RANGE
      # Lower and upper bounds of the data storage range
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.props.range-lower=0
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.props.range-upper=600
      # The interval of each range is 200
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.props.sharding-volume=200
    • [sharding by partition boundary]: we can set 0-1000 as an interval (stored in the first table), 1000-20000 as an interval (stored in the second table), and so on up to 300000-infinity (stored in the last table).
    • spring.shardingsphere.rules.sharding.tables.t_order_boundary_range.actual-data-nodes=ds-0.t_order_boundary_range_$->{0..3}
      # Boundaries at 1000, 20000 and 300000 split the key space into four intervals,
      # the last being 300000 - infinity
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-boundary-range.type=BOUNDARY_RANGE
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-boundary-range.props.sharding-ranges=1000,20000,300000
    • [sharding by time interval]: for example, one table per year.
    • # Configure 12 tables
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.type=AUTO_INTERVAL
      # Start of the time range
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.props.datetime-lower=2010-01-01 23:59:59
      # End of the time range
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.props.datetime-upper=2021-01-01 23:59:59
      # Interval length: one year, expressed in seconds
      spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.props.sharding-seconds=31536000
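The interval arithmetic behind time-based sharding can be sketched as follows (a simplification that ignores leap years; `IntervalRoutingDemo` and `tableIndex` are illustrative names, not ShardingSphere API):

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class IntervalRoutingDemo {
    // lower bound of the time range, matching datetime-lower above
    static final LocalDateTime LOWER = LocalDateTime.of(2010, 1, 1, 23, 59, 59);
    // one year expressed in seconds, matching sharding-seconds above
    static final long SHARDING_SECONDS = 365L * 24 * 3600;

    // index of the table a timestamp is routed to: elapsed seconds / interval length
    public static long tableIndex(LocalDateTime value) {
        long elapsed = ChronoUnit.SECONDS.between(LOWER, value);
        return elapsed / SHARDING_SECONDS;
    }

    public static void main(String[] args) {
        System.out.println(tableIndex(LocalDateTime.of(2010, 6, 1, 0, 0))); // 0
        System.out.println(tableIndex(LocalDateTime.of(2012, 6, 1, 0, 0))); // 2
    }
}
```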
  • [custom sharding algorithm]: if the sharding algorithms it provides do not meet our needs, we can implement our own. Under the hood, this is SPI.
  • [SPI]: a mechanism for extending frameworks. Common frameworks and middleware such as Dubbo and Spring Boot leave room for users to plug in their own behavior, because no framework can meet every user's requirements; Sharding-JDBC is the same. So what do we need to do?
    • In the project's META-INF/services/ directory, create a file whose name is the fully qualified name of the interface; its content is the service class that implements the interface. (Because we are extending its interface, the file name we create must match the interface's fully qualified name, and the content is the fully qualified name of our implementation class.)
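For the custom algorithm below, the registration file might look like this (the interface's fully qualified name is taken from ShardingSphere 5.x and should be checked against the version in use; the package com.example.sharding is hypothetical):

```
# file: META-INF/services/org.apache.shardingsphere.sharding.spi.ShardingAlgorithm
com.example.sharding.StandardModTableShardAlgorithm
```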

    • Of course, we must also point the Spring Boot configuration at our own algorithm:

    • spring.shardingsphere.rules.sharding.sharding-algorithms.standard-mod.props.algorithm-class-name=Path to custom algorithm
    • Then we implement our own sharding algorithm:
    • public class StandardModTableShardAlgorithm implements StandardShardingAlgorithm<Long> {

          private Properties props = new Properties();

          /**
           * Handles sharding for = and IN.
           * @param collection the collection of candidate target table names
           * @param preciseShardingValue the sharding value plus logical-table information
           * @return the target table name, chosen by taking order_id mod the table count
           */
          @Override
          public String doSharding(Collection<String> collection, PreciseShardingValue<Long> preciseShardingValue) {
              //derive the target suffix from order_id, e.g. "t_order_3".endsWith("3")
              String suffix = String.valueOf(preciseShardingValue.getValue() % collection.size());
              for (String name : collection) {
                  if (name.endsWith(suffix)) {
                      return name;
                  }
              }
              throw new UnsupportedOperationException();
          }

          /**
           * Handles sharding for BETWEEN AND. If no RangeShardingAlgorithm is configured,
           * BETWEEN AND in SQL is routed to the full set of databases instead.
           */
          @Override
          public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Long> rangeShardingValue) {
              Collection<String> result = new LinkedHashSet<>(collection.size());
              for (long i = rangeShardingValue.getValueRange().lowerEndpoint(); i <= rangeShardingValue.getValueRange().upperEndpoint(); i++) {
                  String suffix = String.valueOf(i % collection.size());
                  for (String name : collection) {
                      if (name.endsWith(suffix)) {
                          result.add(name);
                      }
                  }
              }
              return result;
          }

          //method called when the algorithm object is initialized
          @Override
          public void init() {
          }

          //the type name this algorithm is registered under in configuration
          @Override
          public String getType() {
              return "STANDARD_MOD";
          }

          @Override
          public Properties getProps() {
              return this.props;
          }

          //receives the sharding-related properties from configuration
          @Override
          public void setProps(Properties properties) {
              this.props = properties;
          }
      }
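The precise-sharding branch above can be exercised standalone, without ShardingSphere classes (`ModRoutingDemo` is an illustrative name; the routine mirrors the suffix-matching loop in `doSharding`):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class ModRoutingDemo {
    // pick the target table whose suffix equals orderId mod the table count
    public static String doSharding(Collection<String> tables, long orderId) {
        String suffix = String.valueOf(orderId % tables.size());
        for (String name : tables) {
            if (name.endsWith(suffix)) {
                return name;
            }
        }
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("t_order_0", "t_order_1", "t_order_2", "t_order_3");
        System.out.println(doSharding(tables, 7L)); // t_order_3
    }
}
```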
  • Writing our own SPI shows how it is implemented under the hood:
    • Create a new project, something like a tool library that provides parsing. Write an SPI loading class that uses a container to store all the parsers we provide, and that, when the class is loaded, also loads every implementation class registered via SPI.
      • public class ParserManager {

            //Store all implementation classes, keyed by type
            private final static ConcurrentHashMap<String, Parser> registeredParser = new ConcurrentHashMap<String, Parser>();

            static {
                // Load our own built-in implementation classes
                initDefaultStrategy();
                // Load all classes that implement the interface via SPI
                loadInitialParser();
            }

            //SPI loading
            private static void loadInitialParser() {
                //ServiceLoader is the SPI API: it loads the Parser implementation
                //classes listed under the META-INF/services directory
                ServiceLoader<Parser> parserServiceLoader = ServiceLoader.load(Parser.class);
                for (Parser parser : parserServiceLoader) {
                    registeredParser.put(parser.getType(), parser);
                }
            }

            private static void initDefaultStrategy() {
                Parser jsonParser = new JsonParser();
                Parser xmlParser = new XmlParser();
                registeredParser.put(jsonParser.getType(), jsonParser);
                registeredParser.put(xmlParser.getType(), xmlParser);
            }

            public static Parser getParser(String key) {
                return registeredParser.get(key);
            }
        }
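For completeness, here is a minimal sketch of the Parser interface and the built-in implementations that ParserManager assumes (the names come from the example above; the method bodies are hypothetical):

```java
import java.io.File;

// the SPI contract: parse a file and report which format this parser handles
interface Parser {
    String parser(File file) throws Exception;

    String getType();
}

class JsonParser implements Parser {
    @Override
    public String parser(File file) {
        return "I'm based on json analysis";
    }

    @Override
    public String getType() {
        return "json";
    }
}

class XmlParser implements Parser {
    @Override
    public String parser(File file) {
        return "I'm based on xml analysis";
    }

    @Override
    public String getType() {
        return "xml";
    }
}
```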
    • Then package the project with Maven, import the resulting jar into another project, and use the functions in the jar:
      • public class ParserController {

            public String parser(@PathVariable("type") String type) {
                try {
                    return ParserManager.getParser(type).parser(new File(""));
                } catch (Exception e) {
                    return "This parsing method is not supported";
                }
            }
        }

    • Create a file with the name of the interface, and write the path of our own implementation class in it
    • Then we extend its original functionality ourselves:
      • public class WordParser implements Parser {

            public String parser(File file) throws Exception {
                return "I'm based on word analysis";
            }

            public String getType() {
                return "word";
            }
        }
    • In this way we can plug in our own functionality without changing the original framework.

Tags: Distribution

Posted on Thu, 04 Nov 2021 06:45:39 -0400 by eth0g