iOS Runtime Basics Explained

Preface

This article describes in detail the underlying data structures of the main objects in the Objective-C runtime, classes and metaclasses, message sending and forwarding, dynamic methods, and related techniques, based on Apple's open-source code. It is a long article and reflects my personal understanding. I wrote it mainly as a record for my own learning; the writing is rough and my skills are limited, so if you find errors or anything inappropriate, I would appreciate your comments!

What is runtime

The Objective-C runtime is a runtime library that supports the dynamic features of the Objective-C language, and every Objective-C application links against it. Apple's documentation puts it this way:

The Objective-C runtime is a runtime library that provides support for the dynamic properties of the Objective-C language, and as such is linked to by all Objective-C apps. Objective-C runtime library support functions are implemented in the shared library found at /usr/lib/libobjc.A.dylib.

Three key concepts

Throughout the rest of this article, keep three very important concepts in mind: Class, SEL and IMP. I list them here first; later sections look at their internal structure and how they relate to each other.

typedef struct objc_class *Class;

typedef struct objc_object {
    Class isa;
} *id;

typedef struct objc_selector *SEL;

typedef id (*IMP)(id, SEL, ...);

One. Data structures of the main objects

1 objc_object

objc_object is the underlying representation of an instance object: a structure with a private isa member that points to the object's class object.

struct objc_object {
private:
    isa_t isa;
...
// isa-related operations
// Weak references, associated objects, memory management, and so on
// are all in this structure; it is too long to reproduce in full
};

2 objc_class

objc_class inherits from objc_object (so it also has an isa pointer) and represents a class object; underneath, it is still a structure. The isa pointer inside objc_class points to the class's metaclass object. The superclass member points to the parent class object (for NSObject, superclass is nil), cache is the method cache, and bits stores the class's data: member variables, properties, methods and so on.

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    
    class_rw_t *data() { 
        return bits.data();
    }
    ...
    // Class-related data operations are all in this structure; they are not reproduced in full here
};

2.1 cache_t

cache_t is the method cache. When a message is sent, the runtime first uses a hash lookup to check whether this structure already holds the method being called; if it does, the cached function is executed right away, which makes message sending faster.

The method cache is a textbook application of the principle of locality: a method that was called recently is likely to be called again soon.

Essentially, it is a hash table that can grow incrementally, and it maintains an array of bucket_t structures.

struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
public:
    struct bucket_t *buckets();
    ...
};

bucket_t stores the mapping between a method cache key and an IMP (an untyped function pointer). To look something up in the cache, the runtime uses the key to find the matching bucket_t and then reads the IMP address from that bucket_t to execute the function.

struct bucket_t {
private:
    cache_key_t _key;
    IMP _imp;

public:
    inline cache_key_t key() const { return _key; }
    inline IMP imp() const { return (IMP)_imp; }
    inline void setKey(cache_key_t newKey) { _key = newKey; }
    inline void setImp(IMP newImp) { _imp = newImp; }

    void set(cache_key_t newKey, IMP newImp);
};
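To make the bucket lookup a little more concrete, here is a rough sketch in plain C++. It is a toy model under my own assumptions, not the runtime's code: ToyCache, Bucket, Key and Imp are made-up names, a key of 0 marks an empty slot, collisions are resolved by linear probing, and when the table would become more than 3/4 full it is doubled and the old entries are simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Key = std::uintptr_t;   // stands in for cache_key_t (hypothetical)
using Imp = int (*)();        // stands in for IMP (hypothetical)

struct Bucket {
    Key key = 0;              // 0 marks an empty slot in this sketch
    Imp imp = nullptr;
};

class ToyCache {
    std::vector<Bucket> buckets_;
    std::size_t mask_;        // capacity - 1; capacity is a power of two
    std::size_t occupied_ = 0;

public:
    explicit ToyCache(std::size_t capacity = 4)
        : buckets_(capacity), mask_(capacity - 1) {
        assert(capacity >= 2 && (capacity & mask_) == 0);
    }

    // Hash the key with the mask, then probe linearly until a hit or an empty slot.
    Imp find(Key key) const {
        std::size_t i = key & mask_;
        for (std::size_t n = 0; n <= mask_; ++n, i = (i + 1) & mask_) {
            if (buckets_[i].key == key) return buckets_[i].imp;
            if (buckets_[i].key == 0) return nullptr;
        }
        return nullptr;
    }

    void insert(Key key, Imp imp) {
        assert(key != 0);
        // Keep the table at most 3/4 full: double it and start over empty.
        if ((occupied_ + 1) * 4 > (mask_ + 1) * 3) {
            buckets_.assign((mask_ + 1) * 2, Bucket{});
            mask_ = buckets_.size() - 1;
            occupied_ = 0;
        }
        std::size_t i = key & mask_;
        while (buckets_[i].key != 0 && buckets_[i].key != key) i = (i + 1) & mask_;
        if (buckets_[i].key == 0) ++occupied_;
        buckets_[i] = Bucket{key, imp};
    }

    std::size_t capacity() const { return mask_ + 1; }
    std::size_t occupied() const { return occupied_; }
};
```

The point of the sketch is only the shape of the lookup: one mask operation plus a short probe, which is why a cache hit is so much cheaper than walking method lists.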

2.2 class_data_bits_t

  • The class_data_bits_t structure mainly wraps class_rw_t
  • class_rw_t in turn wraps class_ro_t

struct class_rw_t {
    // Partial code of class_rw_t
    uint32_t flags;
    uint32_t version;
    // Points to a read-only structure that stores the initial content of the class
    const class_ro_t *ro;
    /*
    Three readable and writable two-dimensional arrays. They hold the class's
    initial content as well as anything added at run time
    */
    method_array_t methods;         // Method lists
    property_array_t properties;    // Property lists
    protocol_array_t protocols;     // Protocol lists
    // First subclass
    Class firstSubclass;
    // Next sibling class
    Class nextSiblingClass;
};

class_ro_t structure

struct class_ro_t {
    // Partial code of class_ro_t
    const char * name;
    // class_ro_t stores the methods, properties, protocols, etc. that are fixed at compile time
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;
    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;

    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};

Note that class_ro_t stores what is fixed for the class at compile time, whereas class_rw_t holds both that compile-time content (it merges in the contents of class_ro_t) and whatever is added to the class dynamically at run time, such as methods, properties and protocols. The original post illustrated the relationship between these structures with a diagram.
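The split between the read-only and the read-write part can be sketched in a few lines of C++. This is only a toy model: ToyRO, ToyRW, attach and firstListContaining are names I made up, methods are reduced to plain strings, and the choice to put later lists in front of the base list is my understanding of how the runtime attaches category lists, stated here as an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using MethodList = std::vector<std::string>;   // one list of method names

struct ToyRO {                    // compile-time content, never modified
    std::string name;
    MethodList baseMethods;
};

struct ToyRW {                    // run-time content
    const ToyRO *ro;
    std::vector<MethodList> methods;   // "two-dimensional": a list of lists

    explicit ToyRW(const ToyRO *r) : ro(r) {
        assert(r != nullptr);
        methods.push_back(r->baseMethods);   // merge in the compile-time methods
    }

    // Lists attached later (for example by a category) go in front of older ones.
    void attach(const MethodList &list) { methods.insert(methods.begin(), list); }

    // Index of the first list that contains sel, or -1 if none does.
    int firstListContaining(const std::string &sel) const {
        for (std::size_t i = 0; i < methods.size(); ++i) {
            const MethodList &l = methods[i];
            if (std::find(l.begin(), l.end(), sel) != l.end()) return static_cast<int>(i);
        }
        return -1;
    }
};
```

Because a search walks the lists front to back, a method attached later shadows a base method with the same name, while the read-only part stays untouched.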

3 isa

Before the arm64 architecture, the isa pointer simply stored the address of the class or metaclass object. Starting with arm64, isa was optimized into a non-pointer isa: a union whose bit fields store, in addition to the class or metaclass address, extra information such as has_assoc, which records whether the object has associated objects.

union isa_t  {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }
    Class cls;
    uintptr_t bits;
    struct {
        // Tag bit: 0 for a plain pointer isa, 1 for a non-pointer isa
        uintptr_t indexed           : 1;
        // Whether the object has associated objects
        uintptr_t has_assoc         : 1;
        // Whether the object has a C++ destructor
        uintptr_t has_cxx_dtor      : 1;
        // Memory address of the object's class or metaclass
        uintptr_t shiftcls          : 33; // MACH_VM_MAX_ADDRESS 0x1000000000
        // Used to determine whether the object has been initialized
        uintptr_t magic             : 6;
        // Whether the object is, or has been, pointed to by a weak reference
        uintptr_t weakly_referenced : 1;
        // Whether the object is currently being deallocated
        uintptr_t deallocating      : 1;
        // Whether the reference count overflows into a side table:
        // when the count is larger than what isa can store,
        // the rest is kept in a SideTable hash table
        uintptr_t has_sidetable_rc  : 1;
        // Extra reference count value
        uintptr_t extra_rc          : 19;
    };
};
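Since the bit fields above add up to exactly 64 bits, the class address can be recovered from a non-pointer isa with a single mask. The following C++ sketch shows the idea; it assumes the arm64 layout listed above with fields allocated from the lowest bit upwards, and decodeIsa, encodeIsa, DecodedIsa and field are hypothetical helpers of mine, not runtime functions. The mask value 0x0000000ffffffff8 covers bits 3 to 35, i.e. the shiftcls field.

```cpp
#include <cassert>
#include <cstdint>

// Bits 3..35 hold shiftcls in the layout shown above.
constexpr std::uint64_t kIsaMask = 0x0000000ffffffff8ULL;

// Extracts `width` bits starting at bit `shift`.
constexpr std::uint64_t field(std::uint64_t bits, unsigned shift, unsigned width) {
    return (bits >> shift) & ((std::uint64_t{1} << width) - 1);
}

struct DecodedIsa {
    bool nonpointer;
    bool hasAssoc;
    bool hasCxxDtor;
    std::uint64_t classAddress;
    bool weaklyReferenced;
    bool deallocating;
    bool hasSidetableRc;
    std::uint64_t extraRc;
};

inline DecodedIsa decodeIsa(std::uint64_t bits) {
    DecodedIsa d{};
    d.nonpointer       = field(bits, 0, 1) != 0;
    d.hasAssoc         = field(bits, 1, 1) != 0;
    d.hasCxxDtor       = field(bits, 2, 1) != 0;
    d.classAddress     = bits & kIsaMask;      // address is kept in place, low 3 bits are flags
    d.weaklyReferenced = field(bits, 42, 1) != 0;
    d.deallocating     = field(bits, 43, 1) != 0;
    d.hasSidetableRc   = field(bits, 44, 1) != 0;
    d.extraRc          = field(bits, 45, 19);
    return d;
}

// Builds a non-pointer isa value for illustration only.
inline std::uint64_t encodeIsa(std::uint64_t classAddress, bool hasAssoc, std::uint64_t extraRc) {
    assert((classAddress & ~kIsaMask) == 0);            // 8-byte aligned, below 2^36
    assert(extraRc < (std::uint64_t{1} << 19));
    return std::uint64_t{1}
         | (static_cast<std::uint64_t>(hasAssoc) << 1)
         | classAddress
         | (extraRc << 45);
}
```

Storing the flags in the otherwise unused low and high bits is what lets one 64-bit word carry both the class pointer and the inline reference count.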

Note:

  • If the isa belongs to an instance object, it points to the class object of the instance object
  • If isa belongs to a class object, it points to a metaclass object of the class object

4 method_t

method_t is the underlying data structure of a method; it wraps a function. Apple's documentation describes methods in more detail.

struct method_t {
    SEL name;			// Method name (selector)
    const char *types;	// Type encoding of the return value and parameters
    IMP imp;			// Untyped function pointer to the method body
    struct SortBySELAddress :
        public std::binary_function<const method_t&,
                                    const method_t&, bool> {
        bool operator() (const method_t& lhs,
                         const method_t& rhs)
        { return lhs.name < rhs.name; }
    };
};

4.1 The four elements of a function

  • Name
  • Return value
  • Parameters
  • Function Body

4.2 types

Apple uses type encodings to describe a method's return type and parameter types: the compiler encodes them as a character string, which the Objective-C runtime uses internally to assist message dispatch.
The string lists the return type first (position 0), followed by parameter 1, parameter 2, parameter 3 and so on. The return type comes first because a function has exactly one return value (unlike, say, Go, which supports multiple return values), whereas there can be any number of parameters.

For a method with no return value and no parameters, the types value is "v@:"

- (void)method {
    // In "v@:":
    // v is the return type, meaning the method returns void
    // @ is the first parameter, an object of type id; it is the receiver itself (self) and is always present
    // : is the second parameter, a SEL; it is the method selector and is also always present
}
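Reading such a string by hand gets tedious, so here is a small C++ sketch that turns an encoding into readable type names. It is deliberately incomplete: describeTypeCode and decodeTypes are hypothetical helpers, only a handful of the simple one-character codes from Apple's type encoding table are covered, and digits (the frame offsets that appear in encodings such as "v16@0:8") are just skipped.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Maps a single type code to a readable name (small subset only).
inline std::string describeTypeCode(char code) {
    switch (code) {
        case 'v': return "void";
        case '@': return "id";
        case ':': return "SEL";
        case '#': return "Class";
        case 'c': return "char";
        case 'i': return "int";
        case 'q': return "long long";
        case 'f': return "float";
        case 'd': return "double";
        case 'B': return "bool";
        case '*': return "char *";
        default:  return "unknown";
    }
}

// Position 0 of the result is the return type; the rest are the parameters.
inline std::vector<std::string> decodeTypes(const std::string &types) {
    assert(!types.empty());   // there is always at least a return type
    std::vector<std::string> out;
    for (char ch : types) {
        if (std::isdigit(static_cast<unsigned char>(ch))) continue;  // skip offsets
        out.push_back(describeTypeCode(ch));
    }
    return out;
}
```

For "v@:" this yields void, id, SEL, which matches the comments in the method above: a void return value followed by the two fixed parameters self and the selector.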

5 Diagram of the relationships between the data structures

Two. Instance objects, class objects and metaclass objects

A diagram drawn by an expert (much respect) illustrates the relationship among the three. Apple's website has a similar description, but I don't think it is any clearer than that diagram.

  • The isa pointer of an instance object points to its class object
  • The isa pointer of a class object points to its metaclass object
  • The isa pointer of any metaclass object points to the root metaclass object
  • The superclass pointer of a class object points to its parent class object, and the superclass pointer of the root class is nil
  • The superclass pointer of a metaclass object points to the metaclass object of its parent class, and the superclass pointer of the root metaclass points to the root class
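The five rules above can be checked with a small model. The C++ sketch below is only an illustration under my own assumptions: ToyClass, ClassPair and wire are made-up names, and a class and its metaclass are simply stored side by side in one struct (which must not be copied or moved after wiring, because the pointers refer to its members).

```cpp
#include <cassert>
#include <string>

struct ToyClass {
    std::string name;
    ToyClass *isa = nullptr;
    ToyClass *superclass = nullptr;
    bool isMeta = false;
};

// A class object together with its metaclass object.
struct ClassPair {
    ToyClass cls;
    ToyClass meta;
};

// Wires the isa and superclass pointers of `pair`.
// Pass parent == nullptr for the root class.
inline void wire(ClassPair &pair, const std::string &name, ClassPair *parent) {
    assert(parent == nullptr || parent->meta.isa != nullptr);  // parent already wired

    pair.cls.name = name;
    pair.meta.name = name + " (meta)";
    pair.meta.isMeta = true;

    // Every metaclass's isa is the root metaclass; the root metaclass points to itself.
    ToyClass *rootMeta = parent ? parent->meta.isa : &pair.meta;

    pair.cls.isa = &pair.meta;                                   // class -> its metaclass
    pair.meta.isa = rootMeta;                                    // metaclass -> root metaclass
    pair.cls.superclass = parent ? &parent->cls : nullptr;       // root class -> nil
    pair.meta.superclass = parent ? &parent->meta : &pair.cls;   // root metaclass -> root class
}
```

Wiring NSObject, then Person, then Student this way reproduces the well-known diagram: Student's metaclass has the root metaclass as its isa and Person's metaclass as its superclass.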

In Objective-C the root class is NSObject. An instance object is an objc_object structure and a class object is an objc_class structure; as mentioned above, objc_class inherits from objc_object, so a class object also has an isa pointer.

typedef struct objc_object {
    Class isa;
} *id;

From the underlying data structures we can see that the class object stores the instance method list, the member variable list and so on, while the metaclass object stores the class method list and so on.

1 How an instance method call is looked up

When an instance object calls an instance method:

  • First, the runtime follows the object's isa pointer to its class object and searches the class object's method list for an implementation of the method being called.
  • If none is found, it follows the class object's superclass pointer to the parent class object and searches the parent's method list for an implementation with the same name.
  • This repeats up the chain to the root class object; as soon as any step finds an implementation, that function is called.
  • If no implementation is found even in the root class, the runtime calls one of the two methods below (depending on whether a class method or an instance method was called) and then continues with the system's forwarding process.

+ (BOOL)resolveClassMethod:(SEL)sel OBJC_AVAILABLE(10.5, 2.0, 9.0, 1.0, 2.0);
+ (BOOL)resolveInstanceMethod:(SEL)sel OBJC_AVAILABLE(10.5, 2.0, 9.0, 1.0, 2.0);

See below for specific messaging processes

2 self and super

self is a hidden parameter of the method. It points to the instance object that received the message, and a method call through self starts the lookup from the receiver's own class.

OBJC_EXPORT void objc_msgSend(void /* id self, SEL op, ... */ )
    __OSX_AVAILABLE_STARTING(__MAC_10_0, __IPHONE_2_0);

super is essentially a compiler keyword. It only tells the runtime to start looking for the method implementation in the parent class instead of in the current class.

OBJC_EXPORT id objc_msgSendSuper(struct objc_super *super, SEL op, ...)
    __OSX_AVAILABLE_STARTING(__MAC_10_0, __IPHONE_2_0);

Three. Message sending and message forwarding

During development we often run into the error unrecognized selector sent to instance xx.
Roughly speaking, it means you called a method that does not exist. Let's take a closer look at why this exception is thrown.
This exception is exactly the topic of this section. In Objective-C's message mechanism, if no IMP can be found for a message while it is being sent, the runtime triggers message forwarding, and the system's default implementation of message forwarding is to throw the exception above. Message sending and message forwarding are described separately below.

Objective-C is a dynamic language, and its method calls are not statically bound like function calls in C, where the function to call is decided at compile time (calling a C function that has no implementation is an error at build time). In Objective-C the decision is made at run time: the runtime library performs a series of lookups to decide which function to call, which is more flexible. We can even change a method's implementation at run time, which is similar to the currently popular "hot update" techniques. This lookup process is Objective-C's message mechanism.

1 Messaging process

In Objective-C, a method call actually sends a message to an object. In the compiled output we can see that it has been turned into a function call:

// Parameter 1: always self, parameter 2: always the SEL, followed by parameter 3, parameter 4, ...
OBJC_EXPORT void objc_msgSend(void /* id self, SEL op, ... */ )
    __OSX_AVAILABLE_STARTING(__MAC_10_0, __IPHONE_2_0);

As the code shows, the message-sending function has two fixed parameters. The first is the receiver of the message, which defaults to the current object, self. The second is a SEL, which is essentially a method selector. (Read the runtime documentation and you will find that almost every method call involves a selector!) A method call can be written as [receiver selector], so what exactly is this selector? Unfortunately, this one line is all I could find in the runtime source code provided by Apple and GNU:

typedef struct objc_selector *SEL;

However, Apple explains that a method selector is a C string that has been mapped. From the various materials I have read, a selector is the method name as a C string.

Method selectors are used to represent the name of a method at runtime. A method selector is a C string that has been registered (or "mapped") with the Objective-C runtime. Selectors generated by the compiler are automatically mapped by the runtime when the class is loaded.
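The word "registered" in the quote above is worth a tiny illustration. One way to picture it (and this is only my mental model, not the runtime's implementation) is a global table that stores each method name once and hands back the same pointer every time, so two selectors can be compared with a plain pointer comparison. ToySEL and registerName are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

using ToySEL = const char *;   // stands in for SEL in this sketch

// Returns a unique pointer for each distinct method name.
// Elements of std::unordered_set never move, so the pointer stays valid.
inline ToySEL registerName(const std::string &name) {
    static std::unordered_set<std::string> table;
    assert(!name.empty());
    return table.insert(name).first->c_str();
}
```

With such a table, registering "test" twice gives the same pointer, while "test" and "test:" give different ones. That is the property code such as sel == @selector(test) relies on.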

After all that, what does objc_msgSend() have to do with the runtime? (The translation of [receiver selector] into a call to objc_msgSend() happens at compile time.) How does objc_msgSend() then proceed at run time?

  • First, the receiver's class is found through the receiver's isa pointer;
  • Next, the runtime checks whether the selector is present in the class's method cache;
  • If it is found in the cache, the corresponding IMP (the value) is executed directly, using the selector as the key;
  • Otherwise, the runtime searches the class's method list for the selector;
  • If the selector is not found there, the search continues in the superclass (parent class);
  • Finally, if the selector is found, the corresponding IMP (the method implementation) is executed directly;
  • Otherwise, the system enters the default message forwarding mechanism.
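The steps in this list can be condensed into a short C++ sketch. It is a simplified model and rests on assumptions of mine: ToyClass and lookUpImp are made-up names, method lists and the cache are plain std::map objects keyed by the selector name, only the receiver's own class cache is consulted, and a null result stands for "enter message forwarding".

```cpp
#include <cassert>
#include <map>
#include <string>

using Imp = int (*)();   // stands in for IMP (hypothetical)

struct ToyClass {
    ToyClass *superclass = nullptr;
    std::map<std::string, Imp> methods;   // the class's own method list
    std::map<std::string, Imp> cache;     // the class's method cache
};

// Returns the implementation for sel, or nullptr when forwarding would start.
inline Imp lookUpImp(ToyClass &receiverClass, const std::string &sel) {
    assert(!sel.empty());

    // 1. Cache of the receiver's class: selector is the key, IMP the value.
    auto hit = receiverClass.cache.find(sel);
    if (hit != receiverClass.cache.end()) return hit->second;

    // 2. Method list of the class, then of each superclass in turn.
    for (ToyClass *c = &receiverClass; c != nullptr; c = c->superclass) {
        auto it = c->methods.find(sel);
        if (it != c->methods.end()) {
            receiverClass.cache[sel] = it->second;   // remember it for next time
            return it->second;
        }
    }

    // 3. Nothing found anywhere in the chain.
    return nullptr;
}
```

Note that the result is cached in the receiver's class even when the implementation lives in a superclass, so the second call to an inherited method is as fast as a call to a method of the class itself.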

The original post illustrated the above process with a diagram.

Sometimes we send a message through super. The principle is the same: after compilation, a call to objc_msgSendSuper() is generated.

OBJC_EXPORT id objc_msgSendSuper(struct objc_super *super, SEL op, ...)
    __OSX_AVAILABLE_STARTING(__MAC_10_0, __IPHONE_2_0);

Inside the objc_super structure, the message receiver is still the current instance object. The difference is that a call through self starts the lookup from the class object of the current object, whereas a call through super starts from the parent class of that class object.

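struct objc_super {
    /// Specifies an instance of a class.
    __unsafe_unretained id receiver;
    /// Specifies the particular superclass of the instance to message.
    __unsafe_unretained Class super_class;
};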
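The difference between self and super is easy to get wrong, so here is a toy model in C++. The names ToyClass, ToyObject, ToySuper, msgSend and msgSendSuper are mine and only mimic the shape of the real functions. The sketch assumes what the runtime headers declare: besides the receiver, the full objc_super also carries a super_class field naming the class where the search starts.

```cpp
#include <cassert>
#include <map>
#include <string>

struct ToyObject;
using Imp = std::string (*)(ToyObject *self);   // every implementation receives self

struct ToyClass {
    std::string name;
    const ToyClass *superclass = nullptr;
    std::map<std::string, Imp> methods;
};

struct ToyObject { const ToyClass *isa; };

struct ToySuper {
    ToyObject *receiver;           // still the current object
    const ToyClass *super_class;   // first class to search
};

inline Imp findImp(const ToyClass *start, const std::string &sel) {
    for (const ToyClass *c = start; c != nullptr; c = c->superclass) {
        auto it = c->methods.find(sel);
        if (it != c->methods.end()) return it->second;
    }
    return nullptr;
}

// self: the search starts at the receiver's own class.
inline std::string msgSend(ToyObject *self, const std::string &sel) {
    assert(self && self->isa);
    Imp imp = findImp(self->isa, sel);
    return imp ? imp(self) : std::string("unrecognized selector");
}

// super: the search starts at super_class, but self is unchanged.
inline std::string msgSendSuper(const ToySuper &sup, const std::string &sel) {
    assert(sup.receiver && sup.super_class);
    Imp imp = findImp(sup.super_class, sel);
    return imp ? imp(sup.receiver) : std::string("unrecognized selector");
}
```

This also explains a classic interview question: a method that is implemented only in the root class and reports the class of self gives the same answer through self and through super, because in both cases the same receiver is passed to the same implementation.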

2 Message forwarding process

As mentioned in the section on message sending, if the receiver cannot find an IMP for the selector, the system's default message forwarding process begins. By default, the system handles forwarding by throwing the unrecognized selector sent to instance xx exception, which ends the whole process. To avoid this, we need to add an implementation to the receiver dynamically at run time when the selector cannot be found.
Fortunately, although the default behaviour is to throw an exception, before it does so the system gives us several chances to handle the message: dynamic method resolution, receiver redirection and message redirection. The process is as follows:

2.1 Message Dynamic Resolution

While handling message forwarding, the runtime calls one of the following two methods, depending on whether the receiver is a class or an instance. By overriding them and adding the missing method dynamically inside, we can avoid the crash.

// Class method not found: override this method to add a class method implementation
+ (BOOL)resolveClassMethod:(SEL)sel OBJC_AVAILABLE(10.5, 2.0, 9.0, 1.0, 2.0);
// Instance method not found: override this method to add an instance method implementation
+ (BOOL)resolveInstanceMethod:(SEL)sel OBJC_AVAILABLE(10.5, 2.0, 9.0, 1.0, 2.0);

Let's take an instance method as an example:

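// Function used as the implementation of the dynamically added method.
// Every IMP receives self and _cmd as its first two parameters.
void testImp(id self, SEL _cmd) {
    NSLog(@"test invoke");
}

// The instance method test has no implementation in this class
+ (BOOL)resolveInstanceMethod:(SEL)sel {
    // Check whether this is the test method
    if (sel == @selector(test)) {
        NSLog(@"resolveInstanceMethod:");
        // Dynamically add an implementation for test
        class_addMethod(self, sel, (IMP)testImp, "v@:");
        // Tell the runtime that the selector has been resolved
        return YES;
    }

    return [super resolveInstanceMethod:sel];
}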

2.2 Message Receiver Redirection

If the message is not handled in resolveInstanceMethod: (that is, no implementation was added and NO is returned), the system gives us a second chance by calling forwardingTargetForSelector:. Its return value is of type id and tells the system which object should receive and handle the message instead (an instance object for an instance method call, a class object for a class method call). If we return a new receiver, the message is resent to that receiver.
Again, an example:

- (id)forwardingTargetForSelector:(SEL)aSelector {
    if (aSelector == @selector(test)) {
        NSLog(@"forwardingTargetForSelector:");
        // Redirect, let ForwardObj object act as receiver, receive and process this message
        return [[ForwardObj alloc] init];
    }
    
    return [super forwardingTargetForSelector:aSelector];
}

2.3 Message Redirection

If we do not take the second chance, that is, the object we return is nil or self, the system gives us one last chance to avoid the crash: message redirection. It calls methodSignatureForSelector:, whose return value is a method signature, and then calls forwardInvocation: with an NSInvocation built from that signature.
One more example:

// Describe the parameter types and return type by returning a method signature
- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector {
    if (aSelector == @selector(test)) {
        NSLog(@"methodSignatureForSelector:");
        // v: the return type is void
        // @: id, the receiver (self)
        // :: SEL, the method selector, here @selector(test)
        return [NSMethodSignature signatureWithObjCTypes:"v@:"];
    }

    return [super methodSignatureForSelector:aSelector];
}

// Message redirection
- (void)forwardInvocation:(NSInvocation *)anInvocation {
    NSLog(@"forwardInvocation:");
    ForwardObj *obj = [[ForwardObj alloc] init];
    if ([obj respondsToSelector:anInvocation.selector]) {
        // If obj responds to the selector, forward the message to obj
        [anInvocation invokeWithTarget:obj];
    } else {
        // Otherwise raise the exception: no implementation of the method was found
        [self doesNotRecognizeSelector:anInvocation.selector];
    }
}
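Putting the three stages together, the overall order of events can be sketched in C++. This is a toy dispatcher under assumptions of my own, not how the runtime is written: ToyObject, Handler and send are made-up names, the hooks are std::function members that mirror the roles of resolveInstanceMethod:, forwardingTargetForSelector:, methodSignatureForSelector: and forwardInvocation:, and a C++ exception plays the part of the unrecognized selector exception.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

struct ToyObject;
using Handler = std::function<std::string()>;

struct ToyObject {
    std::map<std::string, Handler> methods;
    // Stage 1: may add a method and return true.
    std::function<bool(ToyObject &, const std::string &)> resolveMethod;
    // Stage 2: may return another object to receive the message.
    std::function<ToyObject *(const std::string &)> forwardingTarget;
    // Stage 3: signature check followed by full forwarding.
    std::function<bool(const std::string &)> hasSignatureFor;
    std::function<std::string(const std::string &)> forwardInvocation;
};

inline std::string send(ToyObject &obj, const std::string &sel) {
    assert(!sel.empty());

    // Normal message sending: the method exists.
    auto it = obj.methods.find(sel);
    if (it != obj.methods.end()) return it->second();

    // Stage 1: dynamic method resolution, then look the method up again.
    if (obj.resolveMethod && obj.resolveMethod(obj, sel)) {
        it = obj.methods.find(sel);
        if (it != obj.methods.end()) return it->second();
    }

    // Stage 2: receiver redirection (nil or self means "not handled").
    if (obj.forwardingTarget) {
        ToyObject *target = obj.forwardingTarget(sel);
        if (target != nullptr && target != &obj) return send(*target, sel);
    }

    // Stage 3: message redirection.
    if (obj.hasSignatureFor && obj.hasSignatureFor(sel) && obj.forwardInvocation) {
        return obj.forwardInvocation(sel);
    }

    // Default behaviour: nobody handled the message.
    throw std::runtime_error("unrecognized selector: " + sel);
}
```

The order matters: each stage is more flexible but also more expensive than the one before, so it makes sense to handle a missing method at the earliest stage that fits the situation.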

Four. Dynamic methods

1 Dynamic Add Method

We already added a method dynamically in the section on message forwarding. The underlying implementation looks like this:

// Underlying implementation of adding a method dynamically
// cls: the class to add the method to
// name: the name of the method to add (method selector)
// imp: the IMP, an untyped function pointer
// types: the type encoding of the return value and parameters
BOOL class_addMethod(Class cls, SEL name, IMP imp, const char *types) {
    if (!cls) return NO;

    rwlock_writer_t lock(runtimeLock);
    return ! addMethod(cls, name, imp, types ?: "", NO);
}

2 Dynamic Method Resolution

The dynamic resolution step in message forwarding is one kind of dynamic method resolution. Now let's look at another kind: @dynamic.
As the word suggests, dynamic means dynamic, as in dynamic runtime, dynamic methods, dynamic resolution and dynamic language.
For a property marked with @dynamic, the compiler does not generate the getter and setter methods. Instead:

  • With dynamic binding, the implementation is deferred to run time, that is, deciding which function to call is postponed until run time
  • Static languages, by contrast, decide which function to call at compile time

Please credit the author and link to the original when reprinting!
Reference material:
GNU
NS type encoding
Runtime Programming Guide
Objective-C Runtime
Objective-C programming
iOS Development:'Runtime'Details (1) Basic Knowledge

Tags: encoding C Attribute hot update

Posted on Thu, 05 Mar 2020 22:04:57 -0500 by rofl90