Binding of data members
Here's a question:
extern float x;

class Point3d {
public:
    Point3d(float, float, float);
    // Question: which x does this return?
    float X() const { return x; }
    void X(float new_x) { x = new_x; }
    // ...
private:
    float x, y, z;
};
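Which x does Point3d::X() return: the member inside the class, or the extern declared on the first line? Most people would say the class member, and today that is correct. The member is inside the class, but it is not declared before the function that uses it, and early compilers bound a name at the point where it was first seen, so the x in the inline body would have bound to the global one. To avoid this, a defensive coding style was common in those days: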
- Declare all data members at the beginning of the class to ensure correct binding:
class Point3d {
private:
    float x, y, z;
public:
    float X() const { return x; }
    // ...
};
- Put all inline function definitions outside the class declaration:
class Point3d {
public:
    Point3d(float, float, float);
    float X() const;
    void X(float new_x);
    // ...
};

// Defined outside the class
inline float Point3d::X() const { return x; }
// ...
This style is no longer necessary. The body of a member function is not analyzed until the whole class declaration has been seen, so names used inside an inline function body are bound only after the closing brace of the class. So now:
extern int x;

class Point3d {
public:
    // Analysis of the function body is delayed until
    // after the closing brace of the class
    float X() const { return x; }
    // ...
private:
    float x;
};
// The body is analyzed here, so x binds to the member
// and the earlier problem disappears
Unfortunately, this delay does not apply to the parameter list of a member function. Names that appear in the parameter list (and the return type) are bound when they are first encountered:
typedef int length;

class Point3d {
public:
    // Both uses of length here are bound to the global typedef (int)
    void mumble(length val) { _val = val; }
    length mumble() { return _val; }
    // ...
private:
    // This later declaration of length makes the earlier uses illegal
    typedef float length;
    length _val;
    // ...
};
The defensive style is still needed here: put nested typedefs at the beginning of the class, before any member function that uses them.
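The binding rule for function bodies can be checked with a small sketch. The global value 42, the single-argument constructor and the helper names globalX and binding_demo are made up for illustration; only the shape of the class follows the example above.

```cpp
#include <cassert>

int x = 42;  // global x, declared before the class (hypothetical value)

class Point3d {
public:
    explicit Point3d(float x0) : x(x0) {}

    // The body is analyzed after the whole class has been seen,
    // so x binds to the member declared further down, not to ::x.
    float X() const { return x; }

    // The global is still reachable with an explicit scope qualifier.
    static int globalX() { return ::x; }

private:
    float x;
};

// Hypothetical helper that exercises both lookups.
inline void binding_demo() {
    Point3d p(1.5f);
    assert(p.X() == 1.5f);             // the member
    assert(Point3d::globalX() == 42);  // the global
}
```

Calling binding_demo() from main should pass both checks on a conforming compiler.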
Layout of data members
The C++ standard only requires that, within the same access section, members declared later have higher addresses in the object. Members therefore do not have to be contiguous: something may be placed between them, such as padding bytes for alignment. The compiler may also synthesize data members for internal use, such as the vptr. The vptr is usually placed after all explicitly declared members, but some compilers place it before them.
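A rough way to observe this rule is offsetof on a simple struct. Sample and layout_demo are hypothetical names, and the exact offsets and the amount of padding depend on the platform; the sketch only checks the ordering.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct; all members share one access section.
struct Sample {
    char   a;
    int    b;
    char   c;
    double d;
};

// Later members sit at higher offsets, whatever padding is inserted.
static_assert(offsetof(Sample, a) < offsetof(Sample, b), "a before b");
static_assert(offsetof(Sample, b) < offsetof(Sample, c), "b before c");
static_assert(offsetof(Sample, c) < offsetof(Sample, d), "c before d");

// Bytes in the object that belong to no member (alignment padding).
inline std::size_t padding_bytes() {
    return sizeof(Sample)
         - (sizeof(char) + sizeof(int) + sizeof(char) + sizeof(double));
}

inline void layout_demo() {
    // The object is at least as large as the sum of its members.
    assert(sizeof(Sample) >= sizeof(char) + sizeof(int) + sizeof(char) + sizeof(double));
}
```

On typical platforms padding_bytes() is greater than zero, because b and d must start at aligned addresses.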
Access to data members
Leave a question first and answer it later:
Point3d origin, *pt = &origin;
origin.x = 0.0;
pt->x = 0.0;
// Q: what is the difference between the two accesses?
Static data member
Static data members are treated as global variables that are visible only within the scope of the class. Each static data member has exactly one instance, stored in the data segment of the program. Because a static member is not part of any class object, it does not have to be accessed through an object.
Taking the address of a static data member yields an ordinary pointer to its data type, not a pointer to class member. Again, this is because the static member is not contained in a class object:
// chunkSize is a static const int
&Point3d::chunkSize;
// yields a const int*
If two classes each declare a static data member with the same name, placing both in the program's data segment would cause a name conflict. The compiler solves this by implicitly encoding the name of each static data member (name mangling) so that each gets a unique program identifier.
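The difference in pointer types can be verified at compile time. This is a minimal sketch: the public members, the value 250 and the helper static_member_demo are assumptions made for illustration.

```cpp
#include <cassert>
#include <type_traits>

class Point3d {
public:
    static const int chunkSize;  // declared inside the class
    float x, y, z;
};

// Defined once, outside any object (the value is hypothetical).
const int Point3d::chunkSize = 250;

// Address of a static member: an ordinary pointer.
static_assert(std::is_same<decltype(&Point3d::chunkSize), const int*>::value,
              "static member yields const int*");

// Address of a non-static member: a pointer to class member.
static_assert(std::is_same<decltype(&Point3d::x), float Point3d::*>::value,
              "non-static member yields float Point3d::*");

// Hypothetical helper: every object refers to the same single instance.
inline void static_member_demo() {
    Point3d a, b;
    assert(&a.chunkSize == &b.chunkSize);
    assert(&a.chunkSize == &Point3d::chunkSize);
    assert(Point3d::chunkSize == 250);
}
```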
Non static data member
Non-static data members are stored directly in each class object, so, unlike static members, they cannot be accessed without an object. When a member function uses a non-static data member, the access goes through an implicit object, the this pointer.
To access a non-static data member, the compiler adds the offset of the member to the starting address of the class object:
origin._y = 0.0;
// Here &origin._y equals &origin + (&Point3d::_y - 1)
Why the - 1? In the implementation described here, the value of a pointer to data member is the member's offset plus 1. The extra 1 lets the compilation system distinguish a pointer to the first data member (offset 0) from a pointer to data member that points to nothing (NULL).
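The "start address plus offset" computation can be written out by hand. This sketch uses the plain byte offset reported by offsetof, so the +1 bias described above does not appear; address_of_y and offset_demo are hypothetical names, and Point3d is reduced to a struct with three public floats.

```cpp
#include <cassert>
#include <cstddef>

struct Point3d {
    float _x, _y, _z;
};

// Hand-written version of what the compiler does for origin._y:
// starting address of the object plus the byte offset of the member.
inline float* address_of_y(Point3d& p) {
    char* base = reinterpret_cast<char*>(&p);
    return reinterpret_cast<float*>(base + offsetof(Point3d, _y));
}

inline void offset_demo() {
    Point3d origin = {1.0f, 2.0f, 3.0f};
    assert(address_of_y(origin) == &origin._y);  // same address
    *address_of_y(origin) = 0.0f;                // same effect as origin._y = 0.0
    assert(origin._y == 0.0f);
}
```

Because the offset is a compile-time constant, this access costs no more than accessing a member of a C struct.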
Back to the initial question, here is the answer:
Point3d origin, *pt = &origin;
origin.x = 0.0;
pt->x = 0.0;
// Q: what is the difference between the two accesses?
The two differ significantly when Point3d is a derived class with a virtual base class in its inheritance hierarchy and x is a member inherited from that virtual base. In that case we cannot know at compile time which class type pt actually points to, and therefore which offset to add, so resolving the access must be delayed until run time. Access through origin has no such problem, because the type of origin is known exactly.
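The situation can be sketched with a small hierarchy. Point, Colored3d, reset_x, read_x and virtual_base_demo are hypothetical names introduced only for this illustration; how the run-time lookup is implemented is up to the compiler.

```cpp
#include <cassert>

struct Point { float x = 0.0f; };                    // virtual base that holds x
struct Point3d : virtual Point { float z = 0.0f; };
struct Colored3d : Point3d { int color = 0; };       // hypothetical further-derived class

// pt may address a complete Point3d or the Point3d part of a Colored3d.
// The position of the shared Point subobject can differ between the two,
// so the location of x is found at run time.
inline void reset_x(Point3d* pt) { pt->x = 0.0f; }

inline float read_x(const Point3d* pt) { return pt->x; }

inline void virtual_base_demo() {
    Point3d origin;
    origin.x = 5.0f;   // exact type known: location fixed at compile time
    Colored3d c;
    c.x = 7.0f;
    reset_x(&origin);
    reset_x(&c);       // same function, different dynamic type
    assert(origin.x == 0.0f);
    assert(c.x == 0.0f);
}
```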
Inheritance and data members
In C++, a derived class object is the sum of its own members and those of its base class(es). The relative order of the derived class part and the base class part is not specified, so in theory the compiler is free to arrange them. In practice, most compilers place the base class members before the derived class members.
If there are 2D and 3D classes:
class Point2d {
public:
    // ...
private:
    float x, y;
};

class Point3d {
public:
    // ...
private:
    float x, y, z;
};
What is the difference between writing the two classes independently and relating them by inheritance, with 3d inheriting from 2d (and 2d possibly inheriting from a 1d class)? The following sections look at this case by case.
Inheritance without polymorphism
Programmers may want 2D and 3D objects to share the same representation while still using the instance specific to each type. That is what inheritance provides: if 3D inherits from 2D, the two share both the data and the operations on that data. In general, concrete inheritance adds no space or access-time overhead.
class Point2d {
public:
    Point2d(float x = 0.0, float y = 0.0) : _x(x), _y(y) {}
    float x() const { return _x; }
    float y() const { return _y; }
    void setX(float newX) { _x = newX; }
    void setY(float newY) { _y = newY; }
    void operator+=(const Point2d &rhs) {
        _x += rhs.x();
        _y += rhs.y();
    }
protected:
    float _x, _y;
};

class Point3d : public Point2d {
public:
    Point3d(float x = 0.0, float y = 0.0, float z = 0.0)
        : Point2d(x, y), _z(z) {}
    float z() const { return _z; }
    void setZ(float newZ) { _z = newZ; }
    void operator+=(const Point3d &rhs) {
        Point2d::operator+=(rhs);
        _z += rhs.z();
    }
protected:
    float _z;
};
The advantage of this design is locality: the code that manages x and y lives in 2d, and the code for z lives in 3d. It also expresses the close relationship between the two classes. But there are disadvantages too:
- An inexperienced designer may end up making twice as many function calls for the same operation. For example, if the constructor and the overloaded += above were not inline (or the compiler could not inline them), initializing or adding a Point3d would cost a Point2d call plus a Point3d call. So choosing which functions to make inline is important.
- Dividing a class into multiple layers may expand the required space
Let's talk about the second point:
Suppose you have a class
class Concrete {
public:
    // ...
private:
    int val;          // 4 bytes
    char c1, c2, c3;  // 3 bytes in total
};                    // plus 1 byte of alignment padding: 8 bytes

// Now split this class into a three-level inheritance chain
class Concrete1 {
    // ...
private:
    int val;
    char c1;
};

class Concrete2 : public Concrete1 {
    // ...
private:
    char c2;
};

class Concrete3 : public Concrete2 {
    // ...
private:
    char c3;
};
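Can you guess the total size of Concrete3? Is it still 8 bytes, like Concrete before it was split? Under the layout described here, a Concrete3 object takes 16 bytes. Concrete1 holds 5 bytes of data and is padded to 8. How big is Concrete2? You might expect it to take the 5 bytes of Concrete1, add its own 1 byte, and pad with 2 bytes. It does not: Concrete2 inherits the full 8 bytes of the padded Concrete1, adds its own 1 byte after them, and is then padded to 12 bytes. By the same reasoning Concrete3 becomes 16 bytes. The exact numbers depend on the compiler and ABI; some ABIs reuse the tail padding of a base class.
Why keep the padding of the base class? Because a base class subobject must keep its integrity inside the derived object. If c2 of Concrete2 were packed into the padding bytes of Concrete1, then copying a Concrete1 object onto the Concrete1 subobject of a Concrete2, which copies all 8 bytes, would overwrite c2.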
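The sizes can be inspected directly with sizeof. This is a sketch for experimenting, not a statement of what any particular compiler prints: the helper names print_concrete_sizes and check_concrete_sizes are hypothetical, and the checks only cover relations that hold regardless of whether the ABI reuses tail padding.

```cpp
#include <cassert>
#include <cstdio>

class Concrete {
private:
    int val;
    char c1, c2, c3;
};

class Concrete1 {
private:
    int val;
    char c1;
};

class Concrete2 : public Concrete1 {
private:
    char c2;
};

class Concrete3 : public Concrete2 {
private:
    char c3;
};

// Prints the four sizes so they can be compared across compilers.
inline void print_concrete_sizes() {
    std::printf("Concrete  %zu\n", sizeof(Concrete));
    std::printf("Concrete1 %zu\n", sizeof(Concrete1));
    std::printf("Concrete2 %zu\n", sizeof(Concrete2));
    std::printf("Concrete3 %zu\n", sizeof(Concrete3));
}

// Relations that hold whichever layout strategy is used.
inline void check_concrete_sizes() {
    assert(sizeof(Concrete1) >= sizeof(int) + sizeof(char));
    assert(sizeof(Concrete2) >= sizeof(Concrete1));
    assert(sizeof(Concrete3) >= sizeof(Concrete2));
    assert(sizeof(Concrete3) >= sizeof(Concrete));
}
```

A compiler that preserves base class padding reports a larger Concrete3 than Concrete; one that reuses the padding may report the same size for both.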
Adding polymorphism
class Point2d {
public:
    Point2d(float x = 0.0, float y = 0.0) : _x(x), _y(y) {}
    float x() const { return _x; }
    float y() const { return _y; }
    void setX(float newX) { _x = newX; }
    void setY(float newY) { _y = newY; }
    virtual float z() const { return 0.0; }
    virtual void setZ(float) {}
    virtual void operator+=(const Point2d &rhs) {
        _x += rhs.x();
        _y += rhs.y();
    }
protected:
    float _x, _y;
};
Virtual functions are added here to support polymorphism. For example, a function can now take a Point2d reference parameter and be passed a 3d argument, and operations such as adding coordinates will work on all three coordinates. Unfortunately, this flexibility has a price. It puts an additional space and time burden on Point2d:
- A vtbl is introduced to hold the addresses of all virtual functions, plus one or two slots to support RTTI
- A vptr is added to each class object, pointing to the corresponding vtbl
- The constructor is augmented to set the vptr to the vtbl of the class
- The destructor is augmented to reset the vptr to the vtbl of the class being destroyed
There is also the question of where to place the vptr: at the head or at the tail of the object. Placing the vptr at the end preserves the layout of the base class C struct, so the object can still be used in C code. (In Figure 3.2a, the class on the left has no virtual functions; the one on the right does, and therefore has a vptr.)
Later, with the arrival of virtual inheritance and abstract base classes, some implementations began placing the vptr at the beginning of the object, which makes it more convenient to call virtual functions through pointers to members under multiple inheritance. The price is the loss of compatibility with C.
Figure 3.3 shows the layout after the derived class inherits the vptr of the base class, with the vptr placed at the end.
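The space cost of the vptr and the polymorphic call it enables can be seen in a short sketch. Plain2d, Poly2d, Poly3d, z_of and vptr_demo are hypothetical names; the size comparison reflects how compilers commonly implement virtual functions and is not something the language itself promises.

```cpp
#include <cassert>

struct Plain2d {                 // no virtual functions: no vptr
    float _x, _y;
};

struct Poly2d {                  // virtual functions: one vptr per object
    virtual ~Poly2d() {}
    virtual float z() const { return 0.0f; }
    float _x = 0.0f, _y = 0.0f;
};

struct Poly3d : Poly2d {
    float z() const override { return _z; }
    float _z = 3.0f;
};

// Takes a base class reference; the call is dispatched through the vtbl.
inline float z_of(const Poly2d& p) { return p.z(); }

inline void vptr_demo() {
    assert(sizeof(Poly2d) > sizeof(Plain2d));  // room for the vptr (typical)
    Poly2d p2;
    Poly3d p3;
    assert(z_of(p2) == 0.0f);
    assert(z_of(p3) == 3.0f);                  // Poly3d::z is selected at run time
}
```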
Multiple inheritance
Single inheritance provides a natural form of polymorphism. As Figures 3.2a and 3.3 above show, the base class object and the derived class object start at the same address. If a base class pointer is set to point to a derived class object, the compiler does not need to modify the address; the conversion happens naturally.
But look at Figure 3.2b. The derived class has a vptr, placed at the beginning of the object, while the base class has no virtual functions and so no vptr. The two no longer start at the same address, which breaks the natural polymorphism. In this case the compiler has to adjust the address when converting a derived class object to the base class type.
class Point2d {
public:
    // ... has virtual functions, so there is a vptr
protected:
    float _x, _y;
};

class Point3d : public Point2d {
public:
    // ...
protected:
    float _z;
};

class Vertex {
public:
    // ... has virtual functions, so there is a vptr
protected:
    Vertex *next;
};

class Vertex3d : public Point3d, public Vertex {
public:
    // ...
protected:
    float mumble;
};
For a multiply derived object, assigning its address to a pointer to the leftmost base class in the inheritance list (Point3d in the code above) works exactly as in single inheritance, because both start at the same address. Assigning it to a pointer to a second or later base class requires adjusting the address by adding the size of the base class subobject(s) that come before it:
Vertex3d v3d;
Vertex *pv;
Point2d *p2d;
Point3d *p3d;

pv = &v3d;
// This assignment is internally converted into:
// pv = (Vertex*)(((char*)&v3d) + sizeof(Point3d));
Because a multiply derived object has the same address as its leftmost base class subobject, accessing members of the leftmost base class costs nothing extra. What about members of a second or later base class? There is no extra cost there either: the position of every member is fixed at compile time, so accessing one is a simple offset computation.
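The address adjustment can be observed by comparing pointer values. In this sketch the classes from above are given virtual destructors and default member initializers so that they compile; vertex_differs_from_point3d and adjustment_demo are hypothetical names.

```cpp
#include <cassert>

class Point2d {
public:
    virtual ~Point2d() {}
protected:
    float _x = 0.0f, _y = 0.0f;
};

class Point3d : public Point2d {
protected:
    float _z = 0.0f;
};

class Vertex {
public:
    virtual ~Vertex() {}
protected:
    Vertex* next = nullptr;
};

class Vertex3d : public Point3d, public Vertex {
protected:
    float mumble = 0.0f;
};

// True when the two base class subobjects live at different addresses.
inline bool vertex_differs_from_point3d(Vertex3d& v3d) {
    Point3d* p3d = &v3d;   // first base class
    Vertex*  pv  = &v3d;   // second base class: the compiler adjusts the address
    return static_cast<void*>(pv) != static_cast<void*>(p3d);
}

inline void adjustment_demo() {
    Vertex3d v3d;
    assert(vertex_differs_from_point3d(v3d));
    Vertex* pv = &v3d;
    // Converting back undoes the adjustment.
    assert(static_cast<Vertex3d*>(pv) == &v3d);
}
```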
Virtual inheritance
To be understood
Pointer to data member
class Point3d {
public:
    virtual ~Point3d();
    // ...
private:
    static Point3d origin;
    float x, y, z;
};
Now a Point3d object has three members x, y, z and a vptr. The static member origin is placed outside the object, and vptr may be placed at the beginning or the end of the object. What does it mean to take the address of a data member?
&Point3d::z;
This expression yields the offset of z within the class object, which is at least the combined size of x and y. Because we do not know whether the vptr is at the head or the tail, the value is either 8 or 12 (on a 32-bit machine the vptr takes 4 bytes, and so does a float). In the implementation described here, however, the value actually obtained is always 1 more. As mentioned before, this is to distinguish a pointer that points to the first data member from one that points to no member. So before the value is really used, 1 is subtracted from it.
Now it is easy to tell &Point3d::z from &origin.z. The former yields the offset of z within the class; the latter yields the address of a data member bound to a specific object, that is, the real address of the member in memory. Subtracting the offset value of z from the latter and adding 1 gives the starting address of origin.
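The two expressions also have different types, which a short sketch can show. The members are made public here so they can be named from outside the class, and member_pointer_demo is a hypothetical helper; how the compiler encodes the pointer-to-member value internally (with or without the +1) is not visible at this level.

```cpp
#include <cassert>
#include <type_traits>

struct Point3d {
    virtual ~Point3d() {}
    float x, y, z;
};

inline void member_pointer_demo() {
    Point3d origin;
    origin.x = 1.0f;
    origin.y = 2.0f;
    origin.z = 3.0f;

    float Point3d::*pm = &Point3d::z;  // identifies z inside any Point3d
    float* pz = &origin.z;             // real address of z inside origin

    static_assert(!std::is_same<decltype(pm), decltype(pz)>::value,
                  "different kinds of pointer");

    // Binding pm to an object reaches the same member.
    assert(origin.*pm == 3.0f);
    assert(&(origin.*pm) == pz);

    // A pointer to a real member never compares equal to the null
    // pointer to member, the distinction the +1 was meant to provide.
    float Point3d::*px = &Point3d::x;
    float Point3d::*none = nullptr;
    assert(px != none);
}
```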
Consider the following:
struct Base1 { int val1; };
struct Base2 { int val2; };
struct Derived : Base1, Base2 { /* ... */ };

void func1(int Derived::*dmp, Derived *pd) {
    // The first parameter is expected to be a pointer to a member of Derived
    pd->*dmp;
}

void func2(Derived *pd) {
    // bmp holds 1: the offset of val2 within Base2 (0) plus 1
    int Base2::*bmp = &Base2::val2;
    // A pointer to a base class member is passed here.
    // Used unadjusted inside func1, pd->*dmp would reach val1!
    // Within Derived, val2 is at offset 4, stored as (4 + 1), i.e. 5.
    func1(bmp, pd);
}
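To solve this problem, the value must be adjusted when bmp is passed to func1. The programmer does not write this; the compiler transforms the call internally into something like func1(bmp + sizeof(Base1), pd), while making sure that a null bmp stays null.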
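That the conversion from a pointer to a Base2 member to a pointer to a Derived member lands on the right member can be checked as follows. In this sketch func1 and func2 return the value so the result is observable, and conversion_demo is a hypothetical helper.

```cpp
#include <cassert>

struct Base1 { int val1; };
struct Base2 { int val2; };
struct Derived : Base1, Base2 {};

// Expects a pointer to a member of Derived.
inline int func1(int Derived::*dmp, Derived* pd) {
    return pd->*dmp;
}

inline int func2(Derived* pd) {
    int Base2::*bmp = &Base2::val2;
    // Implicit conversion int Base2::* -> int Derived::*;
    // the compiler adjusts for the Base1 part that precedes Base2.
    return func1(bmp, pd);
}

inline void conversion_demo() {
    Derived d;
    d.val1 = 1;
    d.val2 = 2;
    assert(func2(&d) == 2);                   // val2, not val1
    assert(func1(&Base1::val1, &d) == 1);     // first base: no shift needed
}
```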