Binary, hexadecimal, big end, small end

Use of hexadecimal

In the development process, it is common to write files. If the content is text, you can open it with a notepad software to check whether the content is correct. If you write an audio file, you should use the audio software to check. If it is a video file, you should use the video software to check... The corresponding files should be viewed with the corresponding software, but for development, it is necessary to view the binary of the file. For example, when we write an audio converter and find that the converted audio cannot be played, we need to view the binary of the audio file to analyze where the data is wrong, and when we look at the binary, 0101010100111 is determined by few people, because we mainly look at binary to see the value corresponding to binary. Its value can be octal, hexadecimal and hexadecimal. Among them, hexadecimal F represents 15, which is the maximum value that can be represented by 4 bits, FF represents 255, which is the maximum value that can be represented by 8 bits, Eight bits are exactly one byte, and the computer is stored in bytes, so using hexadecimal, every two hexadecimal values can represent the value of one byte. If using hexadecimal, 255 is the maximum value of one byte, 256 needs two bytes. 255 and 256 are three digits, one is represented by one byte, and the other needs two bytes, Therefore, it is troublesome to understand binary in decimal system, while it is more convenient to use hexadecimal system. For example, there are the following binary systems:

0111 1111   0000 0001   0000 1111   1111 1111

Gary added some spaces to let you see that the separation of every 4 bits and every 8 bits is cleared. There are 3 bytes in total, and its corresponding hexadecimal values are as follows:

  • 0111 > 0x7
  • 1111 > 0xF
  • 0000 > 0x0
  • 0001 > 0x1
  • 0000 > 0x0
  • 1111 > 0xF
  • 1111 > 0xF
  • 1111 > 0xF

The connection is 0x7F010FFF. You can use the calculator provided by Windows to enter the corresponding binary to view the results, as follows:


It can be seen that when we input binary, it will automatically get the corresponding hexadecimal, hexadecimal and octal values. The hexadecimal value is 0x7F010FFF and the corresponding hexadecimal value is 2130776063. It can be seen that the hexadecimal is very long and the hexadecimal is relatively short, so it is more convenient to write hexadecimal, and every two bits of hexadecimal correspond to a byte, It is also convenient for us to check each byte. For example, I want to check the value of each byte of the integer 2130776063. We know that an int type accounts for 4 bytes. We can use the displacement operation to obtain the value of each byte, as follows:

fun main() {
    val value: Int = 2130776063
    val bytes = ByteArray(4)
    bytes[0] = (value ushr 24).toByte()
    bytes[1] = (value ushr 16).toByte()
    bytes[2] = (value ushr 8).toByte()
    bytes[3] = (value ushr 0).toByte()
    bytes.forEachIndexed { index, byte ->
        // Convert byte to an equivalent positive int value in case byte is negative
        val byteValue = byte.toInt() and 255
        println("bytes[${index}] = $byteValue")
    }
}

The output results are as follows:

bytes[0] = 127
bytes[1] = 1
bytes[2] = 15
bytes[3] = 255

As an example above, we check the value corresponding to each byte of integer 2130776063 in hexadecimal, but after reading the print result, we don't know whether it is right or not, and we can't be sure whether there is a problem with the code. If we use hexadecimal, the result is easy to check. The hexadecimal of integer 2130776063 is 0x7F010FFF, and the code is as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bytes = ByteArray(4)
    bytes[0] = (value ushr 24).toByte()
    bytes[1] = (value ushr 16).toByte()
    bytes[2] = (value ushr 8).toByte()
    bytes[3] = (value ushr 0).toByte()
    bytes.forEachIndexed { index, byte ->
        // Convert byte to an equivalent positive int value in case byte is negative
        val byteValue = byte.toInt() and 0xFF
        println("bytes[${index}] = ${Integer.toHexString(byteValue)}")
    }
}

The output results are as follows:

bytes[0] = 7f
bytes[1] = 1
bytes[2] = f
bytes[3] = ff

We say that every two bits in hexadecimal correspond to a byte, then 0x7F010FFF can be divided into 7F, 01, 0F and FF, which correspond to the above print results. In this way, we can easily conclude that the code we write to obtain each byte of an int integer is correct, that is, our code can correctly obtain each byte of int.

Use hexadecimal to understand the write function of the output stream

The write(int b) function of the output stream is commonly used. How many bytes does it write out? Examples are as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bos = ByteArrayOutputStream()
    bos.write(value)
    val bytes = bos.toByteArray()
    printBytesWithHex(bytes)
}

/** Print byte array in hexadecimal mode */
fun printBytesWithHex(bytes: ByteArray) {
    bytes.forEachIndexed { index, byte ->
        println("bytes[${index}] = 0x${byteToHex(byte)}")
    }
}

/** Convert the word byte to hexadecimal representation, such as FF and 0F  */
fun byteToHex(byte: Byte): String {
    // 2 indicates that the total length is two bits, 0 indicates that if the length is not enough, it is supplemented with 0, x indicates that it is represented in hexadecimal, and the upper 0xFF is to prevent the byte from being converted to int. the previous 1 is cleared to 0, and only one low byte is reserved
    return String.format("%02x", byte.toInt() and 0xFF).uppercase(Locale.getDefault())
}

The operation results are as follows:

bytes[0] = 0xFF

It can be concluded from the results that write(int b) writes out the contents of the lowest byte of the int value, and the other three bytes are not written out.

What if I just want to write all four bytes of an int to a file and save it? In this case, you can use DataOutputStream, for example:

fun main() {
    val value: Int = 0x7F010FFF
    val bos = ByteArrayOutputStream()
    val dos = DataOutputStream(bos)
    dos.writeInt(value)
    val bytes = bos.toByteArray()
    printBytesWithHex(bytes)
}

The operation results are as follows:

bytes[0] = 0x7F
bytes[1] = 0x01
bytes[2] = 0x0F
bytes[3] = 0xFF

You can see that the writeInt() function of DataOutputStream is nothing more than writing out the four bytes of an int value in turn. It's no magic. It also solves this. In fact, we can take out the four bytes of an int value through the bit operator, and then write out an int value only by using an ordinary OutputStream.

Previously, when I used DataOutputStream, I accidentally used ObjectOutputStream. There was a problem because the int written out was wrong when it was read back, as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bos = ByteArrayOutputStream()
    val oos = ObjectOutputStream(bos)
    oos.writeInt(value)
    val bytes = bos.toByteArray()
    printBytesWithHex(bytes)

    val bis = ByteArrayInputStream(bytes)
    val ooi = ObjectInputStream(bis)
    val intValue = ooi.readInt()
    println("intValue = 0x$intValue")
}

fun intToHex(value: Int): String {
    return String.format("%08x", value.toLong() and 0xFFFFFFFF).uppercase(Locale.getDefault())
}

The operation results are as follows:

bytes[0] = 0xAC
bytes[1] = 0xED
bytes[2] = 0x00
bytes[3] = 0x05
Exception in thread "main" java.io.EOFException
	at java.base/java.io.DataInputStream.readInt(DataInputStream.java:397)
	at java.base/java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:3393)
	at java.base/java.io.ObjectInputStream.readInt(ObjectInputStream.java:1110)
	at cn.android666.kotlin.BinaryDemoKt.main(BinaryDemo.kt:16)
	at cn.android666.kotlin.BinaryDemoKt.main(BinaryDemo.kt)

It can be seen that the int value written out through ObjectOutputStream is not the content of the four bytes of the integer 0x7F010FFF at all. What are the four bytes printed? Moreover, an exception was thrown behind, and the int values we read were not printed. It's very simple to read and write one by one. Why did you throw an exception and can't see the contents of the int values we wrote? Check the JDK document and find that ObjectOutputStream is used to write objects. Does it write an int as an object? It's possible, but it's not unusual to write and read one by one? After thinking for several times, I guess the ObjectInputStream hasn't written the data to our memory stream (ByteArrayOutputStream)? Therefore, we first close the ObjectInputStream stream to ensure that it writes all data, and then obtain data from ByteArrayOutputStream, as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bos = ByteArrayOutputStream()
    val oos = ObjectOutputStream(bos)
    oos.writeInt(value)
    oos.close()
    val bytes = bos.toByteArray()
    printBytesWithHex(bytes)

    val bis = ByteArrayInputStream(bytes)
    val ooi = ObjectInputStream(bis)
    val intValue = ooi.readInt()
    println("intValue = 0x${intToHex(intValue)}")
}

The operation results are as follows:

bytes[0] = 0xAC
bytes[1] = 0xED
bytes[2] = 0x00
bytes[3] = 0x05
bytes[4] = 0x77
bytes[5] = 0x04
bytes[6] = 0x7F
bytes[7] = 0x01
bytes[8] = 0x0F
bytes[9] = 0xFF
intValue = 0x7F010FFF

OK, this time we see that ObjectOutputStream has written out a total of 10 bytes. The last four bytes are the contents of our integer 0x7F010FFF. As for what the previous bytes are, I don't care about it.

Here we have learned an experience: when using wrapper stream to wrap ByteArrayOutputStream, we must first close the wrapper stream and then obtain data from ByteArrayOutputStream, because when the wrapper stream is closed, we will ensure that all data is written to ByteArrayOutputStream.

Large end, small end

Big end and small end are generally used to describe the order in which shaping data is saved when saving to a file. For example, we take out the 4 bytes of an int value, and the 4 bytes respectively save the high-order to low-order contents of the int, as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bytes = ByteArray(4)
    bytes[0] = (value ushr 24).toByte() // Remove the highest 7F
    bytes[1] = (value ushr 16).toByte() // Remove 01
    bytes[2] = (value ushr 8).toByte()  // Remove 0F
    bytes[3] = (value ushr 0).toByte()  // Remove the lowest FF
    bytes.forEachIndexed { index, byte ->
        println("bytes[${index}] = 0x${byteToHex(byte)}")
    }
}

The operation results are as follows:

bytes[0] = 0x7F
bytes[1] = 0x01
bytes[2] = 0x0F
bytes[3] = 0xFF

Here, we use the order normally understood by people to obtain data from the high bit of int and obtain 4 bytes from the high bit to the low bit.

Small end: when saving, first save the high bit of int and then the low bit. An example is as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bytes = ByteArray(4)
    bytes[0] = (value ushr 24).toByte() // Remove the highest 7F
    bytes[1] = (value ushr 16).toByte() // Remove 01
    bytes[2] = (value ushr 8).toByte()  // Remove 0F
    bytes[3] = (value ushr 0).toByte()  // Remove the lowest FF

    val bos = ByteArrayOutputStream()
    
    // Save int value in small end mode
    bos.write(bytes[0].toInt())
    bos.write(bytes[1].toInt())
    bos.write(bytes[2].toInt())
    bos.write(bytes[3].toInt())

    bos.toByteArray().forEachIndexed { index, byte ->
        println("bytes[${index}] = 0x${byteToHex(byte)}")
    }
}

The operation results are as follows:

bytes[0] = 0x7F
bytes[1] = 0x01
bytes[2] = 0x0F
bytes[3] = 0xFF

It can be seen that the results saved in the small end mode are the same as our normal logical understanding, that is, the data in the small end mode is easier to view.

Big end: when saving, first save the low bit of int and then the high bit. An example is as follows:

fun main() {
    val value: Int = 0x7F010FFF
    val bytes = ByteArray(4)
    bytes[0] = (value ushr 24).toByte() // Remove the highest 7F
    bytes[1] = (value ushr 16).toByte() // Remove 01
    bytes[2] = (value ushr 8).toByte()  // Remove 0F
    bytes[3] = (value ushr 0).toByte()  // Remove the lowest FF

    val bos = ByteArrayOutputStream()

    // Save int value in small end mode
    bos.write(bytes[3].toInt())
    bos.write(bytes[2].toInt())
    bos.write(bytes[1].toInt())
    bos.write(bytes[0].toInt())

    bos.toByteArray().forEachIndexed { index, byte ->
        println("bytes[${index}] = 0x${byteToHex(byte)}")
    }
}

The operation results are as follows:

bytes[0] = 0xFF
bytes[1] = 0x0F
bytes[2] = 0x01
bytes[3] = 0x7F

It can be seen that the big end result is not easy to understand, because you have to combine the data upside down and then calculate the result. For example, we have to spell 0x7F010FFF instead of 0xFF0F017F. For example, when we view a bmp file, we view the binary content in hexadecimal mode, as shown below:

As shown in the figure above, the position marked in red is the position required by the bmp file format to save the file size, and it is required to be saved in the big end mode. If we don't know the big end and only know that the data in that position represents the file size, we will spell such a hexadecimal number: 0x4e00000, and its corresponding decimal system is 1308622848, which is wrong, This is a very small file. It can't be so big. If the data is assembled according to the big end, it is 0x0000004E, and its corresponding decimal system is 78, that is, the size of the bmp file is 78 bytes, which is correct!

Summary:

  • Small end: when saving, first save the high bit of int and then the low bit (the way understood by normal people)
  • Big end: when saving, first save the low bit of int and then the high bit (reverse thinking is required)

Tags: Java kotlin

Posted on Fri, 19 Nov 2021 04:01:22 -0500 by kriek