[localization series] Introduction to Godson platform instruction set and supporting features of existing processors

DATE: 2021.10.27

1. Reprint reference

Introduction to Godson platform instruction set and supporting features of existing processors

2. Introduction to Godson platform instruction set

  • LoongISA incorporates part of the MIPS instruction set family: the complete MIPS64 Release 2 instruction set, plus the MSA vector module, the DSP module and the VZ virtualization module from MIPS64 Release 5.
    Floating point in LoongISA 1.0 complies with the IEEE 754-1985 standard and uses the Legacy NaN definition of QNaN/SNaN. Its MADD instruction, however, is implemented as a fused multiply-add as defined in IEEE 754-2008, which differs from MIPS64 Release 2.
    LoongISA 2.0 fully complies with the IEEE 754-2008 standard, uses the NaN2008 definition of QNaN/SNaN specified there, and implements the floating-point unit according to MIPS64 Release 5.
  • LoongMMI (MMI) is the Godson multimedia extension instruction set (MMI stands for MultiMedia Instructions). It is aimed at multimedia acceleration: it is already used in Godson's FFmpeg media codec library, and the GCC community also supports an optimization option for it. Using MMI is reported to roughly double multimedia codec performance.
  • LoongEXT (LEXT) is the Godson general-purpose extension instruction set. The latest version of LoongEXT is 3.0. It is divided into LoongEXT32 and LoongEXT64 according to instruction length. Support for LoongEXT has been submitted to the GCC community, and its optimization options can be selected at compile time.
  • LoongVZ (LVZP for short) is Godson's extension of the VZ virtualization module in MIPS64 Release 5. It is used in the KVM, QEMU and libvirt libraries maintained by Godson Zhongke (Loongson Technology).
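
The two floating-point differences described in the first bullet can be illustrated without Godson hardware. The following is a minimal Python sketch, not Godson code: the function names are made up for illustration, and it assumes the usual single-precision layout, in which the legacy MIPS convention marks a quiet NaN with the top fraction bit clear and NaN2008 marks it with that bit set. The second half contrasts a multiply-add that rounds once (fused) with one that rounds after the multiply and again after the add.

```python
from fractions import Fraction

QUIET_BIT = 1 << 22      # most significant fraction bit of a binary32 value
EXP_MASK = 0x7F800000    # an all-ones exponent marks Inf or NaN
FRAC_MASK = 0x007FFFFF


def classify_nan(bits, nan2008):
    """Classify a 32-bit pattern as 'qnan', 'snan' or 'not-nan'.

    nan2008=False models the legacy MIPS convention (top bit clear = quiet);
    nan2008=True models the IEEE 754-2008 convention (top bit set = quiet).
    """
    if (bits & EXP_MASK) != EXP_MASK or (bits & FRAC_MASK) == 0:
        return "not-nan"
    top_bit_set = bool(bits & QUIET_BIT)
    return "qnan" if top_bit_set == nan2008 else "snan"


def madd_unfused(a, b, c):
    """Round after the multiply, then again after the add."""
    return a * b + c


def madd_fused(a, b, c):
    """Compute a*b+c exactly, then round once to double precision."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


if __name__ == "__main__":
    # The same bit pattern is quiet under NaN2008 and signaling under legacy.
    print(classify_nan(0x7FC00000, nan2008=True))
    print(classify_nan(0x7FC00000, nan2008=False))
    # Inputs chosen so that a*b is not exactly representable as a double.
    a, b, c = 1.0 + 2.0**-27, 1.0 - 2.0**-27, -1.0
    print(madd_unfused(a, b, c), madd_fused(a, b, c))
```

With these inputs the unfused version loses the low-order bits of the product before the add, while the fused version keeps them. This is the kind of result difference to expect when the same code runs with and without a fused MADD.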

3. Compile parameters

Specify the CPU type: -march=loongson3a, gs464, gs464e or gs264e
Use the optimization options -O2 or -O3
To compile mips64r2 n64 object files, add the parameters "-mips64r2 -mabi=64"
Instruction set msa: -mmsa
Instruction set msa2: -mmsa2
Instruction set mipsfpu: -mmipsfpu
Instruction set Loongson MMI: -mloongson-mmi
Instruction set Loongson CAM: -mloongson-cam
Instruction set Loongson EXT: -mloongson-ext
Instruction set Loongson EXT2: -mloongson-ext2
Instruction set Loongson EXT3: -mloongson-ext3
Instruction set Loongson AMO: -mloongson-amo
Instruction set Loongson CSR: -mloongson-csr
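
As an illustration of how these options combine on one command line, here is a small Python sketch that assembles a GCC argument list. The helper name gcc_args, the compiler name "gcc" and the file names are hypothetical. The flag spellings are the ones listed above with the stray spaces removed, and only a few are included; whether a given flag is accepted depends on the toolchain in use.

```python
# Flag spellings follow the list above; availability depends on the toolchain.
EXTENSION_FLAGS = {
    "msa": "-mmsa",
    "mmi": "-mloongson-mmi",
    "ext": "-mloongson-ext",
    "ext2": "-mloongson-ext2",
}


def gcc_args(source, output, march="loongson3a", opt="-O2",
             extensions=(), n64=False):
    """Return a GCC argument list that compiles `source` to an object file."""
    args = ["gcc", "-march=" + march, opt]
    if n64:
        # MIPS64 Release 2 code with the n64 ABI, as described above.
        args += ["-mips64r2", "-mabi=64"]
    for name in extensions:
        if name not in EXTENSION_FLAGS:
            raise ValueError("unknown extension: " + name)
        args.append(EXTENSION_FLAGS[name])
    # -c stops after compilation so that an object file is produced.
    args += ["-c", source, "-o", output]
    return args


if __name__ == "__main__":
    print(" ".join(gcc_args("codec.c", "codec.o",
                            extensions=["mmi"], n64=True)))
```

Building the list in one place makes it easy to keep the CPU type, ABI and extension flags consistent across the files of a project.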
Instruction set description:

virt            Recognize the virtualization ASE instructions.

loongson-mmi    Recognize the Loongson MultiMedia extensions Instructions (MMI) ASE instructions.

loongson-cam    Recognize the Loongson Content Address Memory (CAM) instructions.

loongson-ext    Recognize the Loongson EXTensions (EXT) instructions.

loongson-ext2   Recognize the Loongson EXTensions R2 (EXT2) instructions.

loongson-ext3   Recognize the Loongson EXTend R3 (EXT3) ASE instructions.

loongson-amo    Recognize the Loongson Atomic Memory Operation (AMO) ASE instructions.

loongson-csr    Recognize the Loongson Ctrl Status Register (CSR) ASE instructions.

gpr-names=ABI   Print GPR names according to specified ABI. Default: based on binary being disassembled.

fpr-names=ABI   Print FPR names according to specified ABI. Default: numeric.

cp0-names=ARCH  Print CP0 register names according to specified architecture. Default: based on binary being disassembled.

hwr-names=ARCH  Print HWR names according to specified architecture. Default: based on binary being disassembled.

reg-names=ABI   Print GPR and FPR names according to specified ABI.

reg-names=ARCH  Print CP0 register and HWR names according to specified architecture.
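
The descriptions above read like the MIPS-specific options that GNU objdump accepts through its -M switch, where several options can be joined with commas. Assuming that is the case, a disassembly command could be assembled as in the sketch below. The helper name and the binary path are hypothetical, and some option names (for example loongson-ext3) may exist only in the Godson-maintained binutils.

```python
def objdump_args(binary, options=()):
    """Return an objdump command that disassembles `binary`.

    `options` are disassembler option strings such as "loongson-mmi" or
    "gpr-names=64"; they are joined with commas and passed through -M.
    """
    if not options:
        return ["objdump", "-d", binary]
    return ["objdump", "-d", "-M", ",".join(options), binary]


if __name__ == "__main__":
    print(" ".join(objdump_args("codec.o", ["loongson-mmi", "loongson-ext"])))
```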

Notes:

binutils 2.32 supports mmi, ext, ext2, cam and other instruction sets.

The binutils 2.24 maintained by Godson supports all of these instruction sets. The GCC built-in functions for msa2 are currently supported only by the compiler maintained by Godson.

4. Find hotspot functions for code optimization

On the x86 platform, Intel VTune is used for hotspot function analysis; on the Godson platform, OProfile is used. The hotspot functions are then rewritten in assembly, choosing the acceleration instructions that match the characteristics of the application.

Alternatively, use perf top -p <pid> to view the hotspot functions of a running process.
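
For a non-interactive variant of the same idea, perf can record samples from a running process and print a report afterwards. The sketch below only builds the two command lines. The function names, the 10-second default and the perf.data path are assumptions, and perf must be installed and allowed to attach to the process.

```python
def perf_record_args(pid, seconds=10, data_file="perf.data"):
    """Command that samples process `pid` for `seconds` seconds."""
    return ["perf", "record", "-p", str(pid), "-o", data_file,
            "--", "sleep", str(seconds)]


def perf_report_args(data_file="perf.data"):
    """Command that prints the recorded samples grouped by symbol."""
    return ["perf", "report", "-i", data_file, "--stdio", "--sort", "symbol"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # side effects; pass the lists to subprocess.run() to execute them.
    print(" ".join(perf_record_args(1234)))
    print(" ".join(perf_report_args()))
```

The functions that account for the largest share of samples in the report are the candidates for rewriting with the extension instructions.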

At present, Godson fully supports MIPS64R2. See the following table for other extension instructions:


Posted on Tue, 26 Oct 2021 23:16:05 -0400 by nicko