Distributed storage ShardingSphere
Previously, we implemented database and table sharding by hand, without any middleware. This chapter introduces ShardingSphere, a middleware for sharding databases and tables. It offers three open-source distributed database middleware solutions:
Sharding-JDBC [client side]: this is what this article mainly covers. It is positioned as an enhanced JDBC driver. In short, it completes the routing and sharding work related to database and table splitting on the application side. When our business code operates the database, it connects through the Sharding-JDBC data source, so the core actions of sharding, such as SQL parsing, routing, execution and result merging, are done by it. It works on the client.
Sharding-Proxy [server side]: in short, our application used to connect directly to the database; after Sharding-Proxy is introduced, the application connects to Sharding-Proxy instead, which processes the requests and forwards them to MySQL. The advantage of this approach is that users do not need to be aware of the sharding at all; it looks like normal access to MySQL.
Sharding-Sidecar: a cloud-native database proxy aimed at Kubernetes. It has not been officially released yet.
Key concepts of Sharding-JDBC
- [logical table]: a logical table can be understood as a view in the database; it is a virtual table. It can be mapped to one physical table or composed of multiple physical tables, which can even come from different data sources. For example, we define a logical table t_order; when we operate on t_order, it is mapped to the actual physical tables according to the sharding rules.
- spring.shardingsphere.rules.sharding.tables.t_order.actual-data-nodes=ds-$->{0..1}.t_order_$->{0..1}
- [broadcast table]: also called a global table, it exists redundantly in every database to avoid cross-database queries, for example provinces, dictionaries and other basic data. To avoid cross-database joins against these basic tables after sharding, their data is synchronized to every database node; such a table is called a broadcast table.
- # Broadcast table whose primary node is ds0
- spring.shardingsphere.sharding.broadcast-tables=t_config
- spring.shardingsphere.sharding.tables.t_config.actual-data-nodes=ds$->{0}.t_config
- [binding table]: some of our tables have a logical primary/foreign-key relationship, and cross-database join queries on them are troublesome. We can bind them so that related rows land in the same database: for example, the order with order_id = 1001 is in node1 and all of its detail rows are also placed in node1, so the join query still happens inside one database.
- # Binding table rules. Multiple groups of binding rules can be configured as an array
- spring.shardingsphere.rules.sharding.binding-tables=t_order,t_order_item
- If there are multiple binding table rules, they can be declared as arrays
- spring.shardingsphere.rules.sharding.binding-tables[0]= # binding table rule list
- spring.shardingsphere.rules.sharding.binding-tables[x]= # binding table rule list
Using Sharding-JDBC [plain Java API]
Like any third-party library, we only need to add the Maven dependency, so here we focus on the configuration. When sharding databases and tables, we need to design the database sharding rules, the table sharding rules and the primary-key generation algorithm, and configure all of them when using ShardingSphere.
private static Map<String, DataSource> createDataSourceMap() {
    // The real data sources: logical name -> real database
    Map<String, DataSource> dataSourceMap = new HashMap<>();
    dataSourceMap.put("ds0", DataSourceUtil.createDataSource("shard01"));
    dataSourceMap.put("ds1", DataSourceUtil.createDataSource("shard02"));
    return dataSourceMap;
}

// Create the sharding rules:
// * a database sharding strategy
// * a table sharding strategy
// * the sharding column must be configured
// * the sharding algorithm must be configured
// * a globally unique id strategy
// Here we shard databases by user_id modulo and tables by order_id modulo
private static ShardingRuleConfiguration createShardingRuleConfiguration() {
    ShardingRuleConfiguration configuration = new ShardingRuleConfiguration();
    // Add the mapping between the logical table and the real tables to the sharding rule
    configuration.getTables().add(getOrderTableRuleConfiguration());
    // Set the database sharding strategy
    configuration.setDefaultDatabaseShardingStrategy(
            new StandardShardingStrategyConfiguration("user_id", "db-inline"));
    Properties properties = new Properties();
    // Take user_id modulo 2 to find the real database; ds refers to our logical data source name
    properties.setProperty("algorithm-expression", "ds${user_id%2}");
    // Register the database sharding algorithm (INLINE)
    configuration.getShardingAlgorithms()
            .put("db-inline", new ShardingSphereAlgorithmConfiguration("INLINE", properties));
    // Set the table sharding strategy (horizontal split): sharding column plus algorithm; we use the INLINE algorithm
    configuration.setDefaultTableShardingStrategy(
            new StandardShardingStrategyConfiguration("order_id", "order-inline"));
    // Register the table sharding algorithm
    Properties props = new Properties();
    props.setProperty("algorithm-expression", "t_order_${order_id%2}");
    configuration.getShardingAlgorithms()
            .put("order-inline", new ShardingSphereAlgorithmConfiguration("INLINE", props));
    // Set the primary key generation strategy
    // * UUID
    // * Snowflake algorithm
    Properties idProperties = new Properties();
    idProperties.setProperty("worker-id", "123");
    configuration.getKeyGenerators()
            .put("snowflake", new ShardingSphereAlgorithmConfiguration("SNOWFLAKE", idProperties));
    return configuration;
}

// Configure the logical table, the real tables and the key generation strategy
private static ShardingTableRuleConfiguration getOrderTableRuleConfiguration() {
    // Mapping between the logical table and the real tables
    ShardingTableRuleConfiguration tableRuleConfiguration =
            new ShardingTableRuleConfiguration("t_order", "ds${0..1}.t_order_${0..1}");
    tableRuleConfiguration.setKeyGenerateStrategy(
            new KeyGenerateStrategyConfiguration("order_id", "snowflake"));
    return tableRuleConfiguration;
}

// Create the data source through ShardingSphere: we pass the real data sources
// and the sharding rule configuration
public static DataSource getDatasource() throws SQLException {
    return ShardingSphereDataSourceFactory.createDataSource(
            createDataSourceMap(),
            Collections.singleton(createShardingRuleConfiguration()),
            new Properties());
}
After the configuration is written, we just call the getDatasource method above to obtain a DataSource; ShardingSphere will then process SQL according to the configuration above.
The code for creating a real data source:
public class DataSourceUtil {

    private static final String HOST = "127.0.0.1";
    private static final int PORT = 3306;
    private static final String USER_NAME = "root";
    private static final String PASSWORD = "123456";

    public static DataSource createDataSource(final String dataSourceName) {
        HikariDataSource result = new HikariDataSource();
        result.setDriverClassName("com.mysql.jdbc.Driver");
        result.setJdbcUrl(String.format("jdbc:mysql://%s:%s/%s?serverTimezone=UTC&useSSL=false&useUnicode=true&characterEncoding=UTF-8", HOST, PORT, dataSourceName));
        result.setUsername(USER_NAME);
        result.setPassword(PASSWORD);
        return result;
    }
}
During the test, the data source is obtained through getDatasource. When writing SQL we always use the logical tables; ShardingSphere intercepts our SQL and routes it automatically according to the configured rules, as in the sketch below.
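A minimal usage sketch, assuming it lives in the same class as the getDatasource() method above and that t_order has user_id and status columns (the real schema is not shown in this article; java.sql.Connection and java.sql.PreparedStatement imports are required):

// Usage sketch: insert through the logical table and let Sharding-JDBC route it
public static void insertOrderDemo() throws Exception {
    DataSource dataSource = getDatasource();
    try (Connection connection = dataSource.getConnection();
         PreparedStatement statement = connection.prepareStatement(
                 // The SQL targets the logical table t_order; Sharding-JDBC rewrites and
                 // routes it to ds0/ds1 and t_order_0/t_order_1 according to the rules above
                 "INSERT INTO t_order (user_id, status) VALUES (?, ?)")) {
        statement.setLong(1, 1001L);   // user_id % 2 decides the target database
        statement.setString(2, "NEW"); // order_id is filled in by the snowflake key generator
        statement.executeUpdate();
    }
}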
Using Sharding-JDBC [Spring Boot]
In Spring Boot, as long as the relevant starter jars are imported, it automatically intercepts the SQL. Let's look at the common sharding algorithms it provides.
- [automatic sharding algorithms]: we only need to configure them and then operate the configured logical tables or databases in Spring Boot to get the corresponding effect. Only a few are provided out of the box.
- Sharding by data volume (VOLUME_RANGE): we define the value range of the sharding column and the size of each slice, and routing happens automatically. For example, in the configuration below a total range of 600 is split into slices of 200, so each table only holds one slice of 200. A usage sketch follows the configuration.
-
server.port=8080
spring.mvc.view.prefix=classpath:/templates/
spring.mvc.view.suffix=.html

spring.shardingsphere.datasource.names=ds-0
spring.shardingsphere.datasource.common.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.common.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds-0.username=root
spring.shardingsphere.datasource.ds-0.password=123456
spring.shardingsphere.datasource.ds-0.jdbc-url=jdbc:mysql://127.0.0.1:3306/shard01?serverTimezone=UTC&useSSL=false&useUnicode=true&characterEncoding=UTF-8

spring.shardingsphere.rules.sharding.tables.t_order_volume_range.actual-data-nodes=ds-0.t_order_volume_range_$->{0..2}
spring.shardingsphere.rules.sharding.tables.t_order_volume_range.table-strategy.standard.sharding-column=user_id
spring.shardingsphere.rules.sharding.tables.t_order_volume_range.table-strategy.standard.sharding-algorithm-name=t-order-volume-range
spring.shardingsphere.rules.sharding.tables.t_order_volume_range.key-generate-strategy.column=order_id
spring.shardingsphere.rules.sharding.tables.t_order_volume_range.key-generate-strategy.key-generator-name=snowflake

# Range-based (volume) sharding algorithm
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.type=VOLUME_RANGE
# Lower bound of the sharding range
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.props.range-lower=200
# Upper bound of the sharding range
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.props.range-upper=600
# The interval of each range is 200
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-volume-range.props.sharding-volume=200

spring.shardingsphere.rules.sharding.key-generators.snowflake.type=SNOWFLAKE
spring.shardingsphere.rules.sharding.key-generators.snowflake.props.worker-id=123
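For reference, a minimal sketch of writing to this logical table from Spring Boot, assuming spring-boot-starter-jdbc is on the classpath (so a JdbcTemplate backed by the auto-configured sharded DataSource is available) and assuming t_order_volume_range has user_id and status columns:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderVolumeRangeService {

    private final JdbcTemplate jdbcTemplate;

    public OrderVolumeRangeService(JdbcTemplate jdbcTemplate) {
        // The JdbcTemplate is backed by the sharded DataSource configured above
        this.jdbcTemplate = jdbcTemplate;
    }

    public void createOrder(long userId, String status) {
        // The SQL uses the logical table; rows are routed to t_order_volume_range_0..2
        // by the VOLUME_RANGE algorithm on user_id
        jdbcTemplate.update(
                "INSERT INTO t_order_volume_range (user_id, status) VALUES (?, ?)",
                userId, status);
    }
}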
- [sharding by boundary range (BOUNDARY_RANGE)]: we set explicit boundaries; with the boundaries 1000, 20000 and 300000 below, values under 1000 go to the first table, 1000-20000 to the second, 20000-300000 to the third, and everything from 300000 up to the fourth.
-
spring.shardingsphere.rules.sharding.tables.t_order_boundary_range.actual-data-nodes=ds-0.t_order_boundary_range_$->{0..3}
spring.shardingsphere.rules.sharding.tables.t_order_boundary_range.table-strategy.standard.sharding-column=user_id
spring.shardingsphere.rules.sharding.tables.t_order_boundary_range.table-strategy.standard.sharding-algorithm-name=t-order-boundary-range
spring.shardingsphere.rules.sharding.tables.t_order_boundary_range.key-generate-strategy.column=order_id
spring.shardingsphere.rules.sharding.tables.t_order_boundary_range.key-generate-strategy.key-generator-name=snowflake

spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-boundary-range.type=BOUNDARY_RANGE
# Intervals: < 1000, 1000-20000, 20000-300000, >= 300000
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-boundary-range.props.sharding-ranges=1000,20000,300000

spring.shardingsphere.rules.sharding.key-generators.snowflake.type=SNOWFLAKE
spring.shardingsphere.rules.sharding.key-generators.snowflake.props.worker-id=123
- [sharding by time interval (AUTO_INTERVAL)]: for example, one table per year.
-
# Tables t_order_interval_0 ~ t_order_interval_12
spring.shardingsphere.rules.sharding.tables.t_order_interval.actual-data-nodes=ds-0.t_order_interval_$->{0..12}
spring.shardingsphere.rules.sharding.tables.t_order_interval.table-strategy.standard.sharding-column=create_time
spring.shardingsphere.rules.sharding.tables.t_order_interval.table-strategy.standard.sharding-algorithm-name=t-order-auto-interval
spring.shardingsphere.rules.sharding.tables.t_order_interval.key-generate-strategy.column=order_id
spring.shardingsphere.rules.sharding.tables.t_order_interval.key-generate-strategy.key-generator-name=snowflake

spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.type=AUTO_INTERVAL
# Lower bound of the time range
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.props.datetime-lower=2010-01-01 23:59:59
# Upper bound of the time range
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.props.datetime-upper=2021-01-01 23:59:59
# One year, in seconds
spring.shardingsphere.rules.sharding.sharding-algorithms.t-order-auto-interval.props.sharding-seconds=31536000

spring.shardingsphere.rules.sharding.key-generators.snowflake.type=SNOWFLAKE
spring.shardingsphere.rules.sharding.key-generators.snowflake.props.worker-id=123
- [custom sharding algorithm]: if the built-in sharding algorithms do not meet our needs, we can implement our own. This is actually done through SPI.
- [SPI]: a mechanism that frameworks use to let users extend them. Common frameworks and middleware such as Dubbo and Spring Boot all leave extension points, because they cannot cover every user's requirements, and ShardingSphere is no different. So what do we need to do?
-
In the project's META-INF/services/ directory, create a file named after the fully qualified name of the interface; the content of the file is the fully qualified name of the service class that implements the interface. (Because we are extending ShardingSphere's interface, the file name must match that interface's fully qualified name, and the content is the fully qualified name of our implementation class.)
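For example, a sketch assuming ShardingSphere 5.x, where the sharding algorithm SPI interface is org.apache.shardingsphere.sharding.spi.ShardingAlgorithm, and assuming a hypothetical com.example package for our implementation class:

File name: META-INF/services/org.apache.shardingsphere.sharding.spi.ShardingAlgorithm
File content: com.example.sharding.StandardModTableShardAlgorithm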
-
-
Of course, we also need to reference our custom algorithm in the Spring Boot configuration:
-
# Reference the SPI-registered custom algorithm by the type returned from its getType() method
spring.shardingsphere.rules.sharding.sharding-algorithms.standard-mod.type=STANDARD_MOD
- Then we implement our own sharding algorithm:
-
public class StandardModTableShardAlgorithm implements StandardShardingAlgorithm<Long> {

    private Properties props = new Properties();

    /**
     * Handles sharding for = and IN conditions.
     * @param collection the candidate target table names
     * @param preciseShardingValue the sharding value plus logical table information
     * @return the target table chosen by order_id modulo
     */
    @Override
    public String doSharding(Collection<String> collection, PreciseShardingValue<Long> preciseShardingValue) {
        for (String name : collection) {
            // Route by order_id % 4; e.g. order_id % 4 = 3 matches "t_order_3".endsWith("3")
            if (name.endsWith(String.valueOf(preciseShardingValue.getValue() % 4))) {
                return name;
            }
        }
        throw new UnsupportedOperationException();
    }

    /**
     * Handles sharding for BETWEEN AND conditions. If no RangeShardingAlgorithm is configured,
     * BETWEEN AND in SQL is routed to all databases and tables.
     */
    @Override
    public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Long> rangeShardingValue) {
        Collection<String> result = new LinkedHashSet<>(collection.size());
        for (Long i = rangeShardingValue.getValueRange().lowerEndpoint(); i <= rangeShardingValue.getValueRange().upperEndpoint(); i++) {
            for (String name : collection) {
                if (name.endsWith(String.valueOf(i % 4))) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    /**
     * Called when the algorithm object is initialized.
     */
    @Override
    public void init() {
    }

    /**
     * The type used to reference this algorithm in the sharding-algorithms configuration.
     */
    @Override
    public String getType() {
        return "STANDARD_MOD";
    }

    @Override
    public Properties getProps() {
        return this.props;
    }

    /**
     * Receives the properties configured for this algorithm.
     */
    @Override
    public void setProps(Properties properties) {
        this.props = properties;
    }
}
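To actually route a table through this algorithm, a sketch of the table configuration could look like the following; the table name t_order_standard_mod is hypothetical, the four real tables match the modulo-4 logic above, and the algorithm name standard-mod matches the configuration shown earlier:

spring.shardingsphere.rules.sharding.tables.t_order_standard_mod.actual-data-nodes=ds-0.t_order_standard_mod_$->{0..3}
spring.shardingsphere.rules.sharding.tables.t_order_standard_mod.table-strategy.standard.sharding-column=order_id
spring.shardingsphere.rules.sharding.tables.t_order_standard_mod.table-strategy.standard.sharding-algorithm-name=standard-mod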
-
- By writing our own SPI, we can see how it is implemented under the hood.
- Create a new project that provides parsing tool classes and write an SPI loading class. In this class we use a container to store all the parsers we provide, and when the class is loaded we also load every implementation of the parser interface discovered via SPI.
-
public class ParserManager {

    // Stores all registered implementation classes
    private final static ConcurrentHashMap<String, Parser> registeredParser = new ConcurrentHashMap<String, Parser>();

    static {
        // Load all implementations of the interface discovered via SPI
        loadInitialParser();
        // Register our own built-in implementations
        initDefaultStrategy();
    }

    // SPI loading: the ServiceLoader API reads the Parser implementation classes
    // declared under the META-INF/services directory
    private static void loadInitialParser() {
        ServiceLoader<Parser> parserServiceLoader = ServiceLoader.load(Parser.class);
        for (Parser parser : parserServiceLoader) {
            registeredParser.put(parser.getType(), parser);
        }
    }

    private static void initDefaultStrategy() {
        Parser jsonParser = new JsonParser();
        Parser xmlParser = new XmlParser();
        registeredParser.put(jsonParser.getType(), jsonParser);
        registeredParser.put(xmlParser.getType(), xmlParser);
    }

    public static Parser getParser(String key) {
        return registeredParser.get(key);
    }
}
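The Parser interface itself is not shown here; a minimal sketch of the extension point that ParserManager assumes could be:

import java.io.File;

public interface Parser {

    // Parse the given file and return the result as a string
    String parser(File file) throws Exception;

    // Key identifying this parser, e.g. "json", "xml" or "word"
    String getType();
}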
-
- Then package the project with Maven, import the resulting jar into another project, and use the functionality it provides:
-
public class ParserController {

    @GetMapping("/{type}")
    public String parser(@PathVariable("type") String type) {
        try {
            return ParserManager.getParser(type).parser(new File(""));
        } catch (Exception e) {
            e.printStackTrace();
        }
        return "This parsing method is not supported";
    }
}
-
- Create a file named after the interface, and write the fully qualified name of our own implementation class into it (an example follows below).
-
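For example (the package names are hypothetical, since the original does not show them), if the interface is com.example.parser.Parser and our implementation is com.example.word.WordParser:

File name: META-INF/services/com.example.parser.Parser
File content: com.example.word.WordParser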
- Then we extend the original functionality with our own implementation:
-
public class WordParser implements Parser {

    @Override
    public String parser(File file) throws Exception {
        return "I'm based on word analysis";
    }

    @Override
    public String getType() {
        return "word";
    }
}
-
- Then we can use our own functions without changing the original framework.