ShardingSphere Source Analysis (6) - Merge Engine

Official Introduction

The official documentation is at:
https://shardingsphere.apache.org/document/current/cn/features/sharding/principle/merge/

Result merge is the process of combining the data result sets obtained from the individual data nodes into a single result set and returning it correctly to the requesting client.

By function, ShardingSphere supports five types of result merge: traversal, sorting, grouping, paging, and aggregation. These can be combined and are not mutually exclusive. By structure, merges are divided into stream merge, memory merge, and decorator merge. Stream merge and memory merge are mutually exclusive, while decorator merge can do further processing on top of either of them.

  • Traversal Merge
    This is the simplest form of merge. It only needs to combine the data result sets into a singly linked list. After the current result set in the list has been traversed, the list advances by one element and traversal continues with the next result set.

  • Sort Merge
    Because the SQL contains an ORDER BY clause, each data result set is already sorted, so only the values that the result sets' current cursors point to need to be ordered. This is equivalent to merging multiple sorted arrays, and merge sort is the most appropriate algorithm for this scenario.

  • Group Merge
    Group merge is the most complex case. It is divided into streaming group merge and memory group merge. Streaming group merge requires the SQL's ORDER BY items to match the GROUP BY items in both column and sort direction (ASC or DESC); otherwise only memory group merge can guarantee correct results.

  • Aggregate Merge
    Aggregation functions are handled the same way in streaming group merge and memory group merge. Besides grouped SQL, ungrouped SQL can also use aggregation functions. Aggregate merge is therefore a capability layered on top of the merge types described above, that is, the decorator pattern. Aggregation functions fall into three types: comparison, accumulation, and average.

  • Paging Merge
    Any of the merge types above may also need paging. Paging is likewise a decorator added on top of the other merge types; ShardingSphere uses the decorator pattern to add paging capability to data result sets. Paging merge is responsible for filtering out rows that do not need to be returned.

The overall structure of the merge engine is shown in the following figure.

Debug

Run examples/shardingsphere-jdbc-example/sharding-example/sharding-raw-jdbc-example/src/main/java/org/apache/shardingsphere/example/sharding/raw/jdbc/YamlRangeConfigurationExampleMain.java

Executing a query SQL statement leads into the merge engine

// ShardingSpherePreparedStatement.java
public ResultSet executeQuery() throws SQLException {
    ...
    List<QueryResult> queryResults = this.executeQuery0();
    // Merge the query results
    MergedResult mergedResult = this.mergeQuery(queryResults);
    ...
}

First, MergeEngine is initialized and loads its result process engines through SPI

// MergeEngine.java
public MergeEngine(DatabaseType databaseType, ShardingSphereSchema schema, ConfigurationProperties props, Collection<ShardingSphereRule> rules) {
    this.databaseType = databaseType;
    this.schema = schema;
    this.props = props;
    this.engines = OrderedSPIRegistry.getRegisteredServices(rules, ResultProcessEngine.class);
}

Then the merge method is called

// MergeEngine.java
public MergedResult merge(List<QueryResult> queryResults, SQLStatementContext<?> sqlStatementContext) throws SQLException {
    // Try to merge the per-node query results
    Optional<MergedResult> mergedResult = this.executeMerge(queryResults, sqlStatementContext);
    // Decorate the merged result if there is one; otherwise decorate the first query result directly
    Optional<MergedResult> result = mergedResult.isPresent()
            ? Optional.of(this.decorate((MergedResult) mergedResult.get(), sqlStatementContext))
            : this.decorate((QueryResult) queryResults.get(0), sqlStatementContext);
    // If nothing applied, pass the first query result through unchanged
    return (MergedResult) result.orElseGet(() -> new TransparentMergedResult((QueryResult) queryResults.get(0)));
}

For a query statement, the call ends up in ShardingDQLResultMerger

// ShardingDQLResultMerger.java
public MergedResult merge(List<QueryResult> queryResults, SQLStatementContext<?> sqlStatementContext, ShardingSphereSchema schema) throws SQLException {
    if (1 == queryResults.size()) {
        return new IteratorStreamMergedResult(queryResults);
    } else {
        Map<String, Integer> columnLabelIndexMap = this.getColumnLabelIndexMap((QueryResult)queryResults.get(0));
        SelectStatementContext selectStatementContext = (SelectStatementContext)sqlStatementContext;
        selectStatementContext.setIndexes(columnLabelIndexMap);
        // Decide which kind of merge to perform
        MergedResult mergedResult = this.build(queryResults, selectStatementContext, columnLabelIndexMap, schema);
        return this.decorate(queryResults, selectStatementContext, mergedResult);
    }
}

Here a different merge is chosen depending on whether the SQL contains GROUP BY, DISTINCT, or ORDER BY

// ShardingDQLResultMerger.java
private MergedResult build(List<QueryResult> queryResults, SelectStatementContext selectStatementContext, Map<String, Integer> columnLabelIndexMap, ShardingSphereSchema schema) throws SQLException {
    if (this.isNeedProcessGroupBy(selectStatementContext)) {
        return this.getGroupByMergedResult(queryResults, selectStatementContext, columnLabelIndexMap, schema);
    } else if (this.isNeedProcessDistinctRow(selectStatementContext)) {
        this.setGroupByForDistinctRow(selectStatementContext);
        return this.getGroupByMergedResult(queryResults, selectStatementContext, columnLabelIndexMap, schema);
    } else {
        return (MergedResult)(this.isNeedProcessOrderBy(selectStatementContext) ? new OrderByStreamMergedResult(queryResults, selectStatementContext, schema) : new IteratorStreamMergedResult(queryResults));
    }
}
Here the logic SQL is SELECT * FROM t_order
Actual SQL: ds_0 ::: SELECT * FROM t_order ORDER BY order_id ASC


So the query above goes through sort merge
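
To make the idea of sort merge concrete, here is a minimal sketch of merging several already-sorted result sets by comparing only the values under each cursor. It is a simplified illustration, not the code of OrderByStreamMergedResult: the use of java.util.PriorityQueue, the Integer rows, and the names SortMergeSketch and Cursor are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: rows are plain Integers sorted ascending within each result set.
class SortMergeSketch {

    // A cursor over one sorted result set, ordered by the value it currently points to.
    static final class Cursor implements Comparable<Cursor> {
        final Iterator<Integer> rows;
        int current;

        Cursor(Iterator<Integer> rows) {
            this.rows = rows;
            this.current = rows.next();
        }

        @Override
        public int compareTo(Cursor other) {
            return Integer.compare(current, other.current);
        }
    }

    static List<Integer> merge(List<List<Integer>> sortedResultSets) {
        PriorityQueue<Cursor> queue = new PriorityQueue<>();
        for (List<Integer> each : sortedResultSets) {
            // Empty result sets have no current value, so they never enter the queue.
            if (!each.isEmpty()) {
                queue.offer(new Cursor(each.iterator()));
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            // The head of the queue is the cursor pointing at the smallest value.
            Cursor smallest = queue.poll();
            merged.add(smallest.current);
            // Advance that cursor and put it back if it still has rows.
            if (smallest.rows.hasNext()) {
                smallest.current = smallest.rows.next();
                queue.offer(smallest);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> shards = Arrays.asList(
                Arrays.asList(1, 4, 7), Arrays.asList(2, 5), Arrays.asList(3, 6, 8));
        System.out.println(merge(shards)); // prints [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

Only one row per result set is held in the queue at any time, which is why this kind of merge can work as a stream instead of loading every row into memory.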

The result of the final merge is as follows

Next we change the query to an avg statement. The logic SQL and actual SQL are as follows

Logic SQL: SELECT avg(user_id) FROM t_order_item 
SQLStatement: MySQLSelectStatement(limit=Optional.empty, lock=Optional.empty, window=Optional.empty) 
Actual SQL: ds_0 ::: SELECT avg(user_id) , COUNT(user_id) AS AVG_DERIVED_COUNT_0 , SUM(user_id) AS AVG_DERIVED_SUM_0 FROM t_order_item 
Actual SQL: ds_1 ::: SELECT avg(user_id) , COUNT(user_id) AS AVG_DERIVED_COUNT_0 , SUM(user_id) AS AVG_DERIVED_SUM_0 FROM t_order_item 
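
The rewritten SQL above adds COUNT and SUM columns because per-node averages cannot simply be averaged again when the nodes hold different numbers of rows. The sketch below shows the arithmetic only, under stated assumptions: the class AvgMergeSketch, the Partial type, the sample numbers, and the scale of 4 decimal places are all hypothetical and are not taken from ShardingSphere.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: merges per-node COUNT and SUM values into one overall average.
class AvgMergeSketch {

    // One node's partial result, i.e. its AVG_DERIVED_COUNT_0 and AVG_DERIVED_SUM_0 values.
    static final class Partial {
        final long count;
        final BigDecimal sum;

        Partial(long count, BigDecimal sum) {
            this.count = count;
            this.sum = sum;
        }
    }

    static BigDecimal mergeAvg(List<Partial> partials) {
        long totalCount = 0;
        BigDecimal totalSum = BigDecimal.ZERO;
        for (Partial each : partials) {
            totalCount += each.count;
            totalSum = totalSum.add(each.sum);
        }
        if (totalCount == 0) {
            // AVG over zero rows is NULL in SQL; null stands in for that here.
            return null;
        }
        // The scale of 4 and HALF_UP rounding are arbitrary choices for this sketch.
        return totalSum.divide(BigDecimal.valueOf(totalCount), 4, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Made-up data: ds_0 has 3 rows summing to 30 (avg 10), ds_1 has 1 row summing to 50 (avg 50).
        List<Partial> partials = Arrays.asList(
                new Partial(3, new BigDecimal("30")), new Partial(1, new BigDecimal("50")));
        // Averaging the two averages would give 30, but the true average is 80 / 4.
        System.out.println(mergeAvg(partials)); // prints 20.0000
    }
}
```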

This query goes through group merge

However, because the queried columns no longer match what the example code expects, an error is thrown when the example sets values on the result object.

Let's modify the example again, this time in
examples/example-core/example-raw-jdbc/src/main/java/org/apache/shardingsphere/example/core/jdbc/repository/OrderItemRepositoryImpl.java

public List<OrderItem> selectAll() throws SQLException {
    String sql = "SELECT * FROM t_order_item group by status";
    return getOrderItems(sql);
}

Then run mvn install

Run YamlRangeConfigurationExampleMain again

When t_order_item is queried, the code takes the group merge path.
The following code then decides between streaming group merge and memory group merge

// ShardingDQLResultMerger.java
private MergedResult getGroupByMergedResult(List<QueryResult> queryResults, SelectStatementContext selectStatementContext, Map<String, Integer> columnLabelIndexMap, ShardingSphereSchema schema) throws SQLException {
    return (MergedResult)(selectStatementContext.isSameGroupByAndOrderByItems() ? new GroupByStreamMergedResult(columnLabelIndexMap, queryResults, selectStatementContext, schema) : new GroupByMemoryMergedResult(queryResults, selectStatementContext, schema));
}

Our SQL takes the streaming group merge path
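
The difference between the two branches can be illustrated with a small sketch. It assumes a hypothetical query such as SELECT status, COUNT(*) ... GROUP BY status, whose per-node rows are (status, count) pairs; the names GroupMergeSketch, Row, streamMerge, and memoryMerge are made up for this example and do not mirror GroupByStreamMergedResult or GroupByMemoryMergedResult.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch: each row is a (group key, partial count) pair from one data node.
class GroupMergeSketch {

    static final class Row {
        final String key;
        final long count;

        Row(String key, long count) {
            this.key = key;
            this.count = count;
        }

        @Override
        public String toString() {
            return key + "=" + count;
        }
    }

    // Streaming: every input must already be sorted by key, so equal keys arrive back to back.
    static List<Row> streamMerge(List<List<Row>> sortedShards) {
        // Queue entries are {shard index, row index}, ordered by the key they point to.
        PriorityQueue<int[]> queue = new PriorityQueue<>(
                Comparator.comparing((int[] c) -> sortedShards.get(c[0]).get(c[1]).key));
        for (int i = 0; i < sortedShards.size(); i++) {
            if (!sortedShards.get(i).isEmpty()) {
                queue.offer(new int[] {i, 0});
            }
        }
        List<Row> merged = new ArrayList<>();
        String currentKey = null;
        long currentCount = 0;
        while (!queue.isEmpty()) {
            int[] cursor = queue.poll();
            Row row = sortedShards.get(cursor[0]).get(cursor[1]);
            // A new key means the previous group is complete and can be emitted.
            if (currentKey != null && !currentKey.equals(row.key)) {
                merged.add(new Row(currentKey, currentCount));
                currentCount = 0;
            }
            currentKey = row.key;
            currentCount += row.count;
            if (cursor[1] + 1 < sortedShards.get(cursor[0]).size()) {
                queue.offer(new int[] {cursor[0], cursor[1] + 1});
            }
        }
        if (currentKey != null) {
            merged.add(new Row(currentKey, currentCount));
        }
        return merged;
    }

    // Memory: inputs may be in any order, so all groups are accumulated before anything is emitted.
    static List<Row> memoryMerge(List<List<Row>> shards) {
        Map<String, Long> totals = new TreeMap<>();
        for (List<Row> shard : shards) {
            for (Row row : shard) {
                totals.merge(row.key, row.count, Long::sum);
            }
        }
        List<Row> merged = new ArrayList<>();
        totals.forEach((key, count) -> merged.add(new Row(key, count)));
        return merged;
    }

    public static void main(String[] args) {
        List<List<Row>> sorted = Arrays.asList(
                Arrays.asList(new Row("FINISHED", 1), new Row("INIT", 2)),
                Arrays.asList(new Row("FINISHED", 2), new Row("INIT", 3)));
        List<List<Row>> unsorted = Arrays.asList(
                Arrays.asList(new Row("INIT", 2), new Row("FINISHED", 1)),
                Arrays.asList(new Row("FINISHED", 2), new Row("INIT", 3)));
        System.out.println(streamMerge(sorted));   // prints [FINISHED=3, INIT=5]
        System.out.println(memoryMerge(unsorted)); // prints [FINISHED=3, INIT=5]
    }
}
```

The streaming version keeps only one open group at a time, but it is only correct when every input is ordered by the grouping key, which matches the condition checked by isSameGroupByAndOrderByItems above.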

The final results of the group merge are as follows

Summary

If a query does not include the sharding key (the key used for database and table sharding), it is routed to multiple data nodes and its results have to be merged, so it is better for query statements to include the sharding key.

Tags: Java Database shardingsphere

Posted on Fri, 03 Sep 2021 16:50:14 -0400 by SnowControl