Understanding boxing and unboxing in C#

Contents

1. Understanding boxing

2. Understanding unboxing

3. Generated IL code

4. Practical application

5. Summary

1. Understanding boxing

Simply put, boxing stores the data of a value type in a variable of a reference type.

Suppose you create a local variable of type int in a method and then want to treat that value as a reference type. Doing so boxes the value, as shown below:

static void SimpleBox() 
{ 
  int myInt = 25; 
  // Boxing operation 
  object boxedInt = myInt; 
}
Specifically, boxing is what happens when a value type is assigned to a variable of type Object. When you box a value, CoreCLR allocates a new Object on the managed heap and copies the value into that Object instance. What you get back is a reference to the newly allocated Object.
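A minimal sketch of the copy semantics described above: because boxing copies the value into a new heap object, later changes to the original variable do not affect the box, and boxing the same value twice yields two separate objects. The names BoxCopyDemo, BoxThenMutate and BoxesAreDistinct are made up for this illustration.

```csharp
using System;

public static class BoxCopyDemo
{
    // Boxes a value, then changes the original variable.
    // Returns the value still held inside the box.
    public static int BoxThenMutate(int original)
    {
        object boxed = original;   // boxing: the value is copied into a new heap object
        original += 100;           // changes only the local variable, not the boxed copy
        return (int)boxed;         // unboxing: copies the boxed value back out
    }

    // Boxes the same value twice and reports whether the two boxes are different objects.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;
        object second = value;
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxThenMutate(25));     // prints 25
        Console.WriteLine(BoxesAreDistinct(25));  // prints True
    }
}
```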

2. Understanding unboxing

Conversely, converting the value held by an Object reference back to the corresponding value type (typically a stack-based local variable) is called unboxing.

Syntactically, an unboxing operation looks like an ordinary cast. Its semantics, however, are completely different. CoreCLR first verifies that the target type is the same as the boxed type. If it is, the value is copied back into the stack-based local variable.

For example, the following unboxing operation succeeds because the underlying type of boxedInt is int:

static void SimpleBoxUnbox() 
{ 
  int myInt = 25; 
  // Boxing operation 
  object boxedInt = myInt; 
  // Unboxing operation 
  int unboxedInt = (int)boxedInt; 
} 
Remember that, unlike a typical type conversion, unboxing must target the exact type that was boxed. If you try to unbox a value into the wrong type, an InvalidCastException is thrown.

To be safe, if you cannot guarantee the type behind an Object reference, it is best to wrap the unboxing operation in try/catch logic, even though this is cumbersome. Consider the following code, which throws an exception because it tries to unbox a boxed int into a long:

static void SimpleBoxUnbox() 
{ 
  int myInt = 25; 
  // Boxing operation 
  object boxedInt = myInt; 
  // Unboxing to the wrong data type triggers a runtime exception 
  try 
  { 
    long unboxedLong = (long)boxedInt; 
  } 
  catch (InvalidCastException ex) 
  { 
    Console.WriteLine(ex.Message); 
  } 
} 
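
As an alternative to try/catch, the runtime type can be checked before unboxing. The sketch below assumes C# 7 or later (for the `is int value` type pattern); the class SafeUnboxDemo and the helper ToLongIfInt are hypothetical names used only for illustration. It also shows that a boxed int can become a long if it is first unboxed to int and then converted.

```csharp
using System;

public static class SafeUnboxDemo
{
    // Returns the boxed value as a long, or null if the object is not a boxed int.
    public static long? ToLongIfInt(object boxed)
    {
        // The type pattern checks the runtime type and unboxes in one step,
        // so no InvalidCastException is thrown for a mismatched type.
        if (boxed is int value)
        {
            return value; // implicit int-to-long widening of the unboxed value
        }
        return null;
    }

    public static void Main()
    {
        object boxedInt = 25;

        // Unbox to the exact boxed type first, then convert the value.
        long viaTwoCasts = (long)(int)boxedInt;
        Console.WriteLine(viaTwoCasts);                        // prints 25

        Console.WriteLine(ToLongIfInt(boxedInt));              // prints 25
        Console.WriteLine(ToLongIfInt("not an int") is null);  // prints True
    }
}
```

The pattern-matching form avoids the cost and noise of exception handling when a type mismatch is an expected case rather than a bug.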

3. Generated IL code

When the C# compiler encounters boxing / unboxing syntax, it emits IL code containing the corresponding instructions. If you use ildasm.exe to inspect the compiled assembly, you will see the box and unbox.any instructions that perform the boxing and unboxing operations:

.method assembly hidebysig static 
    void  '<<Main>$>g__SimpleBoxUnbox|0_0'() cil managed 
{ 
  .maxstack  1 
  .locals init (int32 V_0, object V_1, int32 V_2) 
    IL_0000:  nop 
    IL_0001:  ldc.i4.s   25 
    IL_0003:  stloc.0 
    IL_0004:  ldloc.0 
    IL_0005:  box        [System.Runtime]System.Int32 
    IL_000a:  stloc.1 
    IL_000b:  ldloc.1 
    IL_000c:  unbox.any  [System.Runtime]System.Int32 
    IL_0011:  stloc.2 
    IL_0012:  ret 
  } // end of method '<Program>$'::'<<Main>$>g__SimpleBoxUnbox|0_0' 

At first glance, boxing / unboxing looks like a language feature of little use, more academic than practical. After all, you rarely need to store a local value type in a local Object variable. In fact, the boxing / unboxing mechanism is quite useful, because it lets you treat everything as an Object while CoreCLR takes care of the memory-related details for you.

4. Practical application

Let's look at a practical application of boxing / unboxing. We use the ArrayList class (in System.Collections) as an example and store a batch of stack-allocated integers in it. The relevant members of the ArrayList class are listed below:

public class ArrayList : IList, ICloneable 
{ 
  ... 
  public virtual int Add(object? value); 
  public virtual void Insert(int index, object? value); 
  public virtual void Remove(object? obj); 
  public virtual object? this[int index] { get; set; } 
} 

 

Note that the ArrayList members above operate on Object data. ArrayList is designed to work with objects (which can represent any type), that is, data allocated on the managed heap. Consider the following code:

static void WorkWithArrayList() 
{ 
  // When passed to a method that requires an object parameter, the value type is automatically boxed 
  ArrayList myInts = new ArrayList(); 
  myInts.Add(10); 
} 
Although you pass the numeric value directly to a method that requires an Object parameter, the stack-allocated data is boxed automatically. If you want to retrieve an item from the ArrayList through the indexer, you must use a cast to unbox the heap-allocated Object into a stack-allocated integer, because the ArrayList indexer returns Object, not int.

static void WorkWithArrayList() 
{ 
  // When passed to a method that requires an object parameter, the value type is automatically boxed 
  ArrayList myInts = new ArrayList(); 
  myInts.Add(10); 
  // Unboxing occurs when the object is converted back to stack-based data 
  int i = (int)myInts[0]; 
  // Boxed again, because WriteLine() requires an object parameter 
  Console.WriteLine("Value of your int: {0}", i); 
} 

Before ArrayList.Add() is called, the stack-allocated int value is boxed so that it can be passed to a method with an Object parameter. When the Object is retrieved from the ArrayList, it is unboxed into an int by the cast. Finally, when the value is passed to Console.WriteLine(), it is boxed again, because that parameter of the method is of type Object.
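
The final boxing step in that sequence can be avoided. A minimal sketch: calling ToString() on the int first produces a string, which is already a reference type, so nothing needs to be boxed when it is passed as the Object argument. The class name FormatWithoutBoxingDemo and the method Describe are hypothetical.

```csharp
using System;

public static class FormatWithoutBoxingDemo
{
    // Formats the message without boxing the int argument.
    public static string Describe(int i)
    {
        // i.ToString() is called directly on the int and returns a string,
        // so the format call receives a reference type and no box is created for i.
        return string.Format("Value of your int: {0}", i.ToString());
    }

    public static void Main()
    {
        Console.WriteLine(Describe(10)); // prints "Value of your int: 10"
    }
}
```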

5. Summary

From the programmer's point of view, boxing and unboxing are very convenient: we do not have to copy data between value types and reference types in memory by hand.

However, the transfers between stack and heap memory behind boxing and unboxing also have a performance cost. The following steps are required to box and unbox a simple integer:

A new object is allocated on the managed heap;

The value on the stack is copied into the object on the managed heap;

When unboxing, the value stored in the heap object is copied back to the stack;

The unused object on the heap is eventually reclaimed by the GC.

In many cases boxing and unboxing have no significant effect on performance. But if a collection such as ArrayList holds thousands of items and your program operates on them frequently, the performance impact becomes noticeable.

Therefore, we should try to avoid boxing and unboxing when programming. In the ArrayList example above, if all elements of the collection have the same type, you should use a generic collection type instead, such as List<T> or LinkedList<T>.
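
As a rough sketch of that advice, the two methods below compute the same sum, once with ArrayList (each element is boxed on Add and unboxed on read) and once with List<int> (elements are stored as plain int values). The class and method names are made up for illustration, and no timing figures are claimed here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class GenericListDemo
{
    // Sums 0..count-1 using the non-generic ArrayList.
    public static long SumWithArrayList(int count)
    {
        ArrayList items = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            items.Add(i);          // each int is boxed into a new heap object
        }

        long sum = 0;
        foreach (object item in items)
        {
            sum += (int)item;      // each element is unboxed with a cast
        }
        return sum;
    }

    // Sums 0..count-1 using the generic List<int>.
    public static long SumWithList(int count)
    {
        List<int> items = new List<int>();
        for (int i = 0; i < count; i++)
        {
            items.Add(i);          // stored directly as an int, no boxing
        }

        long sum = 0;
        foreach (int item in items)
        {
            sum += item;           // no cast and no unboxing
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithArrayList(1000)); // prints 499500
        Console.WriteLine(SumWithList(1000));      // prints 499500
    }
}
```

Besides avoiding the allocations, List<int> is type-safe: adding a non-int value is rejected by the compiler instead of failing at runtime with an InvalidCastException.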

Tags: C#

Posted on Mon, 29 Nov 2021 01:35:40 -0500 by slashpine