Read TiDB source code with problems: Power BI Desktop connects TiDB with MySQL driver and reports an error

It is often said that reading source code is the only way for every excellent development engineer, but in the face of complex systems like TiDB, source code reading is a very huge project. For some TiDB users, starting from their daily problems, reading the source code is a good entry point. Therefore, we planned a series of articles on reading the source code with problems.

This is the second article in this series. Taking a Power BI Desktop with abnormal performance on TiDB as an example, this paper introduces the process from problem discovery and positioning to problem solving through issuing and writing PR in the open source community, and makes trouble shooting from the perspective of code implementation, hoping to help you better understand the TiDB source code.

First, let's recreate the failed scenario (TiDB 5.1.1 on MacOS) and build a simple table with only one field:

CREATE TABLE test(name VARCHAR(1) PRIMARY KEY);

Yes on MySQL, but not on TiDB. An error is reported

DataSource.Error: An error happened while reading data from the provider: 'Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints.'
Details:

DataSourceKind=MySql
DataSourcePath=localhost:4000;test

Look at the SQL of the last run on the general log TiDB:

select COLUMN_NAME, ORDINAL_POSITION, IS_NULLABLE, DATA_TYPE, case when NUMERIC_PRECISION is null then null when DATA_TYPE in ('FLOAT', 'DOUBLE') then 2 else 10 end AS NUMERIC_PRECISION_RADIX, NUMERIC_PRECISION, NUMERIC_SCALE,            CHARACTER_MAXIMUM_LENGTH, COLUMN_DEFAULT, COLUMN_COMMENT AS DESCRIPTION, COLUMN_TYPE  from INFORMATION_SCHEMA.COLUMNS  where table_schema = 'test' and table_name = 'test';

We use tiup to start a TiDB cluster, and use tiup client to execute the command. The tiup client will also report an error:

error: mysql: sql: Scan error on column index 4, name "NUMERIC_PRECISION_RADIX": converting NULL to int64 is unsupported

Let's focus on solving the problem of this statement. Let's first look at what the error reported by the tiup client means. The tiup client uses the golang xo/usql library, but we cannot find the corresponding error information in the xo/usql library. The grep converting keyword returns very limited and irrelevant content. Let's take another look at the mysql driver of xo/usql, which refers to go SQL driver / MySQL. Download its code and grep converting. Only one piece of information in the changelog is returned, and the error rate is not in this library. Browse the code in go SQL driver / MySQL and find that it depends on database/sql. Let's take a look at the content of database/sql. database/sql is the standard library of golang, so we need to download the source code of golang. In the database directory of golang, grep converting quickly finds the content that matches the error message:

go/src/database/sql/convert.go

case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
        if src == nil {
                return fmt.Errorf("converting NULL to %s is unsupported", dv.Kind())
        }
        s := asString(src)
        i64, err := strconv.ParseInt(s, 10, dv.Type().Bits())
        if err != nil {
                err = strconvErr(err)
                return fmt.Errorf("converting driver.Value type %T (%q) to a %s: %v", src, s, dv.Kind(), err)
        }
        dv.SetInt(i64)
        return nil

Let's trace this fragment again to see how the types here come from. Finally, we will return to go SQL driver / MySQL:

mysql/fields.go

        case fieldTypeLongLong:
                if mf.flags&flagNotNULL != 0 {
                        if mf.flags&flagUnsigned != 0 {
                                return scanTypeUint64
                        }
                        return scanTypeInt64
                }
                return scanTypeNullInt

This part of the code is in the return body of the parsing statement column definition , convert to a type in golang. We can use MySQL -- host 127.0.0.1 -- port 4000 - U root -- column type info to connect and view the column metadata returned by the SQL in question:

MySQL

Field 5: `NUMERIC_PRECISION_RADIX`
Catalog: `def`
Database: `` 
Table: ``
Org_table: ``
Type: LONGLONG
Collation: binary (63)
Length: 3
Max_length: 0
Decimals: 0
Flags: BINARY NUM

TiDB

Field 5: `NUMERIC_PRECISION_RADIX`
Catalog: `def`
Database: ``
Table: ``
Org_table: ``
Type: LONGLONG
Collation: binary (63)
Length: 2
Max_length: 0
Decimals: 0
Flags: NOT_NULL BINARY NUM

It is obvious that the numeric in the error message of the tiup client_ PRECISION_ There is an obvious problem with the column definition of the radix field on TiDB. This field is marked not in the return body of TiDB_ NULL, which is obviously unreasonable, because the field can obviously be NULL, which is also reflected in the return value of MySQL. Therefore, xo/usql reported an error when processing the return body. Here, we have found out why the client side reported an error. Next, we need to find out why TiDB returned an error column definition.

Through TiDB Dev Guide, we can know the general execution process of a DQL statement in TiDB. We look down from the server/conn.go#clientConn.Run of the entry and pass through server/conn.go#clientConn.dispatch, server/conn.go#clientConn.handleQuery, server/conn.go#clientConn.handleStmt and server/driver_tidb.go#TiDBContext.ExecuteStmt,session/session.go#session.ExecuteStmt,executor/compiler.go#Compiler.Compile,planner/optimize.go#Optimize,planner/optimize.go#optimize,planner/core/planbuilder.go#PlanBuilder.Build,planner/core/logical_plan_builder.go#PlanBuilder.buildSelect. In buildSelect, we can see a series of processing of query statements by TiDB planner, and then we can go to planner/core/expression_rewriter.go#PlanBuilder.rewriteWithPreprocess and planner/core/expression_rewriter.go#PlanBuilder.rewriteExprNode. In rewriteExprNode, the problematic field will be numeric_ PRECISION_ Finally, the CASE expression will be parsed in expression / builtin_ In control.go#casewhenfunctionclass.getfunction, we finally come to the place where the column definition returned by the CASE expression is calculated (this depends on traversing the AST parsed by the compiler):

    for i := 1; i < l; i += 2 {       
        fieldTps = append(fieldTps, args[i].GetType())
        decimal = mathutil.Max(decimal, args[i].GetType().Decimal)
        if args[i].GetType().Flen == -1 {
            flen = -1
        } else if flen != -1 {
            flen = mathutil.Max(flen, args[i].GetType().Flen)
        }
        isBinaryStr = isBinaryStr || types.IsBinaryStr(args[i].GetType())
        isBinaryFlag = isBinaryFlag || !types.IsNonBinaryStr(args[i].GetType())
    }
    if l%2 == 1 {
        fieldTps = append(fieldTps, args[l-1].GetType())
        decimal = mathutil.Max(decimal, args[l-1].GetType().Decimal)
        if args[l-1].GetType().Flen == -1 {
            flen = -1
        } else if flen != -1 {
            flen = mathutil.Max(flen, args[l-1].GetType().Flen)
        }
        isBinaryStr = isBinaryStr || types.IsBinaryStr(args[l-1].GetType())
        isBinaryFlag = isBinaryFlag || !types.IsNonBinaryStr(args[l-1].GetType())
    }


    fieldTp := types.AggFieldType(fieldTps)
    // Here we turn off NotNullFlag. Because if all when-clauses are false,
    // the result of case-when expr is NULL.
    types.SetTypeFlag(&fieldTp.Flag, mysql.NotNullFlag, false)
    tp := fieldTp.EvalType()


    if tp == types.ETInt {
        decimal = 0
    }
    fieldTp.Decimal, fieldTp.Flen = decimal, flen
    if fieldTp.EvalType().IsStringKind() && !isBinaryStr {
        fieldTp.Charset, fieldTp.Collate = DeriveCollationFromExprs(ctx, args...)
        if fieldTp.Charset == charset.CharsetBin && fieldTp.Collate == charset.CollationBin {
            // When args are Json and Numerical type(eg. Int), the fieldTp is String.
            // Both their charset/collation is binary, but the String need a default charset/collation.
            fieldTp.Charset, fieldTp.Collate = charset.GetDefaultCharsetAndCollate()
        }
    } else {
        fieldTp.Charset, fieldTp.Collate = charset.CharsetBin, charset.CollationBin
    }
    if isBinaryFlag {
        fieldTp.Flag |= mysql.BinaryFlag
    }
    // Set retType to BINARY(0) if all arguments are of type NULL.
    if fieldTp.Tp == mysql.TypeNull {
        fieldTp.Flen, fieldTp.Decimal = 0, types.UnspecifiedLength
        types.SetBinChsClnFlag(fieldTp)
    }

Looking at the code for calculating the column definition flag above, we can find that no matter what the CASE expression is, not_ The null flag bit must be set to false, so the problem does not appear here! At this time, we can only look back along the above code path to see if the column definition generated above has been modified in the future. Finally, in server/conn.go#clientConn.handleStmt, it is found that it calls server/conn.go#clientConn.writeResultSet, and then successively calls server/conn.go#clientConn.writeChunks, server/conn.go#clientConn.writeColumnInfo, server/column.go#ColumnInfo.Dump and server/column.go#dumpFlag. In dumpFlag, The previously generated column definition flag has been modified:

func dumpFlag(tp byte, flag uint16) uint16 {
    switch tp {
    case mysql.TypeSet:
        return flag | uint16(mysql.SetFlag)
    case mysql.TypeEnum:
        return flag | uint16(mysql.EnumFlag)
    default:
        if mysql.HasBinaryFlag(uint(flag)) {
            return flag | uint16(mysql.NotNullFlag)
        }
        return flag
    }
}

Finally, we found the reason why TiDB returned the wrong column definition! In fact, this bug has been fixed in the latest version 5.2.0 of TiDB: *: fix some problems related to notNullFlag by wjhuang2016 · Pull Request #27697 · pingcap/tidb.

Finally, in the process of reading the code above, we'd better see what the AST parsed by TiDB looks like, so that we won't be confused in the process of traversing the AST at last. In TiDB dev guide parser chapter Explain how to debug parser, parser/quickstart.md at master · pingcap/parser There is also an AST generated by sample output in, but simply outputting is basically useless. We can use davecgh / go speed to directly output the node generated by parser, so as to obtain an understandable tree:

package main

import (
        "fmt"
        "github.com/pingcap/parser"
        "github.com/pingcap/parser/ast"
        _ "github.com/pingcap/parser/test_driver"
        "github.com/davecgh/go-spew/spew"
)

func parse(sql string) (*ast.StmtNode, error) {
        p := parser.New()
        stmtNodes, _, err := p.Parse(sql, "", "")
        if err != nil {
                return nil, err
        }
        return &stmtNodes[0], nil
}

func main() {
        spew.Config.Indent = "    "
        astNode, err := parse("SELECT a, b FROM t")
        if err != nil {
                fmt.Printf("parse error: %v\n", err.Error())
                return
        }
        fmt.Printf("%s\n", spew.Sdump(*astNode))
}
(*ast.SelectStmt)(0x140001dac30)({
    dmlNode: (ast.dmlNode) {
        stmtNode: (ast.stmtNode) {
            node: (ast.node) {
                text: (string) (len=18) "SELECT a, b FROM t"
            }
        }
    },
    resultSetNode: (ast.resultSetNode) {
        resultFields: ([]*ast.ResultField) <nil>
    },
    SelectStmtOpts: (*ast.SelectStmtOpts)(0x14000115bc0)({
        Distinct: (bool) false,
        SQLBigResult: (bool) false,
        SQLBufferResult: (bool) false,
        SQLCache: (bool) true,
        SQLSmallResult: (bool) false,
        CalcFoundRows: (bool) false,
        StraightJoin: (bool) false,
        Priority: (mysql.PriorityEnum) 0,
        TableHints: ([]*ast.TableOptimizerHint) <nil>
    }),
    Distinct: (bool) false,
    From: (*ast.TableRefsClause)(0x140001223c0)({
        node: (ast.node) {
            text: (string) ""
        },
        TableRefs: (*ast.Join)(0x14000254100)({
            node: (ast.node) {
                text: (string) ""
            },
            resultSetNode: (ast.resultSetNode) {
                resultFields: ([]*ast.ResultField) <nil>
            },
            Left: (*ast.TableSource)(0x14000156480)({
                node: (ast.node) {
                    text: (string) ""
                },
                Source: (*ast.TableName)(0x1400013a370)({
                    node: (ast.node) {
                        text: (string) ""
                    },
                    resultSetNode: (ast.resultSetNode) {
                        resultFields: ([]*ast.ResultField) <nil>
                    },
                    Schema: (model.CIStr) ,
                    Name: (model.CIStr) t,
                    DBInfo: (*model.DBInfo)(<nil>),
                    TableInfo: (*model.TableInfo)(<nil>),
                    IndexHints: ([]*ast.IndexHint) <nil>,
                    PartitionNames: ([]model.CIStr) {
                    }
                }),
                AsName: (model.CIStr)
            }),
            Right: (ast.ResultSetNode) <nil>,
            Tp: (ast.JoinType) 0,
            On: (*ast.OnCondition)(<nil>),
            Using: ([]*ast.ColumnName) <nil>,
            NaturalJoin: (bool) false,
            StraightJoin: (bool) false
        })
    }),
    Where: (ast.ExprNode) <nil>,
    Fields: (*ast.FieldList)(0x14000115bf0)({
        node: (ast.node) {
            text: (string) ""
        },
        Fields: ([]*ast.SelectField) (len=2 cap=2) {
            (*ast.SelectField)(0x140001367e0)({
                node: (ast.node) {
                    text: (string) (len=1) "a"
                },
                Offset: (int) 7,
                WildCard: (*ast.WildCardField)(<nil>),
                Expr: (*ast.ColumnNameExpr)(0x14000254000)({
                    exprNode: (ast.exprNode) {
                        node: (ast.node) {
                            text: (string) ""
                        },
                        Type: (types.FieldType) unspecified,
                        flag: (uint64) 8
                    },
                    Name: (*ast.ColumnName)(0x1400017dc70)(a),
                    Refer: (*ast.ResultField)(<nil>)
                }),
                AsName: (model.CIStr) ,
                Auxiliary: (bool) false
            }),
            (*ast.SelectField)(0x14000136840)({
                node: (ast.node) {
                    text: (string) (len=1) "b"
                },
                Offset: (int) 10,
                WildCard: (*ast.WildCardField)(<nil>),
                Expr: (*ast.ColumnNameExpr)(0x14000254080)({
                    exprNode: (ast.exprNode) {
                        node: (ast.node) {
                            text: (string) ""
                        },
                        Type: (types.FieldType) unspecified,
                        flag: (uint64) 8
                    },
                    Name: (*ast.ColumnName)(0x1400017dce0)(b),
                    Refer: (*ast.ResultField)(<nil>)
                }),
                AsName: (model.CIStr) ,
                Auxiliary: (bool) false
            })
        }
    }),
    GroupBy: (*ast.GroupByClause)(<nil>),
    Having: (*ast.HavingClause)(<nil>),
    WindowSpecs: ([]ast.WindowSpec) <nil>,
    OrderBy: (*ast.OrderByClause)(<nil>),
    Limit: (*ast.Limit)(<nil>),
    LockTp: (ast.SelectLockType) none,
    TableHints: ([]*ast.TableOptimizerHint) <nil>,
    IsAfterUnionDistinct: (bool) false,
    IsInBraces: (bool) false,
    QueryBlockOffset: (int) 0,
    SelectIntoOpt: (*ast.SelectIntoOption)(<nil>)
})

Tags: Database

Posted on Fri, 03 Dec 2021 02:28:23 -0500 by coffeecup