Space transformations in Unity

1 Model space - world space - view space - clip space

Modeling is done in model space: a model's vertex coordinates are expressed in model space.  
When the model is placed into the world, its position must be expressed in world-space coordinates, so each point on the model has to be transformed from model space into world space.  
The transformation from model space to world space is called the model transformation.  
In a Unity shader, the model-space vertex coordinates are supplied by the Renderer as input to the vertex shader, under the POSITION semantic. For example:

struct appdata
{
    float4 vertex : POSITION;
};

The images we see are rendered through a camera. To make subsequent clipping and projection easier, the model is next converted from world space into view space (also called observation or camera space): the coordinate system whose origin is the camera's position and whose axes are the camera's local axes.  
The transformation from world space to view space is called the view transformation.

Once coordinates are in view space, clipping directly against the camera's view frustum is awkward (testing against the frustum's slanted boundary planes is expensive), so coordinates are first transformed into clip space. The idea of the clip-space transformation is to scale the frustum so that the near and far clipping planes become squares, with the w component of each coordinate encoding the clip range; a vertex is then inside the frustum exactly when -w ≤ x, y, z ≤ w, which requires only simple comparisons of x, y and z against w.  
The transformation from view space to clip space is called the projection transformation.  
Note that despite its name, the projection transformation does not perform an actual projection.
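The "compare against w" test can be checked outside the shader. Below is a minimal Python/NumPy sketch, using a standard OpenGL-style perspective matrix (the field-of-view, aspect and plane values are arbitrary illustrative choices, not values from the article):

```python
import numpy as np

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective projection matrix (column-vector convention)."""
    t = 1.0 / np.tan(np.radians(fov_y_deg) / 2.0)
    return np.array([
        [t / aspect, 0.0,  0.0,                          0.0],
        [0.0,        t,    0.0,                          0.0],
        [0.0,        0.0, (far + near) / (near - far),   2 * far * near / (near - far)],
        [0.0,        0.0, -1.0,                          0.0],
    ])

def inside_frustum(clip):
    """A vertex is inside the frustum exactly when -w <= x, y, z <= w."""
    x, y, z, w = clip
    return all(-w <= c <= w for c in (x, y, z))

P = perspective(60.0, 16 / 9, 0.3, 1000.0)

# A point 5 units in front of the camera (view space looks down -z in OpenGL)
print(inside_frustum(P @ np.array([0.0, 0.0, -5.0, 1.0])))  # True

# A point behind the camera fails the comparison against w
print(inside_frustum(P @ np.array([0.0, 0.0, 2.0, 1.0])))   # False
```

The test never intersects a plane equation; after the projection matrix is applied, six comparisons against w are enough.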

In vertex shaders, the transformation matrices between model, view and clip space are usually accessed through the following built-ins:

UNITY_MATRIX_M    unity_ObjectToWorld                        Model matrix
UNITY_MATRIX_V    unity_MatrixV                              View matrix
UNITY_MATRIX_P    glstate_matrix_projection                  Projection matrix
UNITY_MATRIX_VP   unity_MatrixVP                             View-projection matrix
UNITY_MATRIX_MV   mul(unity_MatrixV, unity_ObjectToWorld)    Model-view matrix
UNITY_MATRIX_MVP  mul(unity_MatrixVP, unity_ObjectToWorld)   Model-view-projection matrix

Note that in recent versions (the author uses Unity 5.6.3p4), Unity officially recommends the UnityObjectToClipPos helper for converting a point from model space to clip space. Internally it is defined as:

mul(UNITY_MATRIX_VP, mul(unity_ObjectToWorld, float4(pos, 1.0)));

This is more efficient than the older form:

mul(UNITY_MATRIX_MVP, appdata.vertex);
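Why the two forms agree, and why the first is cheaper, follows from associativity: VP · (M · v) = (VP · M) · v, but the left side is two matrix-vector products while the right side needs a full matrix-matrix product first. A small NumPy sketch with arbitrary random matrices (not Unity's actual matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
M  = rng.standard_normal((4, 4))   # stand-in for unity_ObjectToWorld
VP = rng.standard_normal((4, 4))   # stand-in for UNITY_MATRIX_VP
v  = rng.standard_normal(4)        # a vertex position (w would be 1 in practice)

# UnityObjectToClipPos style: two matrix-vector products
clip_a = VP @ (M @ v)

# UNITY_MATRIX_MVP style: a matrix-matrix product first, then matrix-vector
clip_b = (VP @ M) @ v

print(np.allclose(clip_a, clip_b))  # True: same result, different cost
```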

The vertex shader's output position is therefore a coordinate in clip space.

2 Screen space

After clipping, coordinates in clip space are projected into screen space.  
This takes two steps:  
1. Homogeneous division  
Divide x, y and z by the w component to obtain normalized device coordinates (NDC). After this step, all visible points lie inside a cube with x, y and z each in [-1, 1] (in the OpenGL convention; in Direct3D z ranges over [0, 1]).  
2. Screen mapping  
The NDC x and y are mapped against the screen's pixel width and height to obtain the screen-space xy coordinates. Although the screen itself has no depth, screen-space coordinates still carry the z (depth) value, which can be used for depth testing or other processing.
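The two steps above can be sketched in plain Python (assuming the OpenGL-style [-1, 1] NDC cube and the usual "map [-1, 1] onto [0, size]" screen mapping):

```python
def clip_to_screen(clip, width, height):
    """Clip space -> screen space, in the two steps described above."""
    x, y, z, w = clip
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w    # step 1: homogeneous division
    sx = (ndc_x * 0.5 + 0.5) * width             # step 2: screen mapping of x
    sy = (ndc_y * 0.5 + 0.5) * height            #         and y to pixels
    return sx, sy, ndc_z                         # z survives for depth testing

# A clip-space point dead centre of the view, halfway into the depth range,
# on a 400 x 300 screen:
print(clip_to_screen((0.0, 0.0, 1.0, 2.0), 400, 300))  # (200.0, 150.0, 0.5)
```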

The transformation from clip space to screen space is carried out for us by the underlying hardware. It is also important to remember that the interpolation of vertex outputs from clip space to screen space is performed directly by the hardware.

As a result, the coordinates the pixel shader receives under the SV_POSITION semantic are of little direct use here.  
If you want the screen coordinates at this point, you can use VPOS or WPOS; in Unity the two are synonymous. Using either semantic requires

#pragma target 3.0

The xy of VPOS holds the pixel coordinates in screen space. Note that these are pixel-centre coordinates, not integers: for a 400 × 300 screen, x ranges over [0.5, 399.5] and y over [0.5, 299.5]. z ranges over [0, 1], where 0 is the near clipping plane and 1 is the far clipping plane. The range of w depends on the camera's projection type: for a perspective projection it is [1/near, 1/far]; for an orthographic projection it is always 1.  
Also note that if VPOS or WPOS is used as a fragment-shader input, SV_POSITION cannot be used as an input at the same time, so the vertex and fragment shaders need to be written as follows:

Shader "Unlit/Screen Position"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
    }
    SubShader
    {
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #pragma target 3.0

            // note: no SV_POSITION in this struct
            struct v2f {
                float2 uv : TEXCOORD0;
            };

            v2f vert (
                float4 vertex : POSITION, // vertex position input
                float2 uv : TEXCOORD0, // texture coordinate input
                out float4 outpos : SV_POSITION // clip space position output
                )
            {
                v2f o;
                o.uv = uv;
                outpos = UnityObjectToClipPos(vertex);
                return o;
            }

            sampler2D _MainTex;

            fixed4 frag (v2f i, UNITY_VPOS_TYPE screenPos : VPOS) : SV_Target
            {
                // screenPos.xy will contain pixel coordinates.
                // use them to implement a checkerboard pattern that skips rendering
                // 4x4 blocks of pixels

                // checker value will be negative for 4x4 blocks of pixels
                // in a checkerboard pattern
                screenPos.xy = floor(screenPos.xy * 0.25) * 0.5;
                float checker = -frac(screenPos.r + screenPos.g);

                // clip HLSL instruction stops rendering a pixel if value is negative
                clip(checker);

                // for pixels that were kept, read the texture and output it
                fixed4 c = tex2D (_MainTex, i.uv);
                return c;
            }
            ENDCG
        }
    }
}

Common built-in variables related to the camera and screen are listed in Unity's documentation on built-in shader variables.


3 Viewport space

Readers familiar with OpenGL programming will know the glViewport call, which is what defines the viewport. Viewport space maps the screen coordinates into the range (0, 0) to (1, 1).  
If you want viewport-space coordinates, either of the following two methods works.

fixed4 frag(float4 sp : VPOS) : SV_Target {
    return fixed4(sp.xy / _ScreenParams.xy, 0.0, 1.0);
}

Here the screen resolution is stored in _ScreenParams.
Another method is as follows:

struct vertOut {
    float4 pos : SV_POSITION;
    float4 srcPos : TEXCOORD0;
};

vertOut vert(appdata_base v) {
    vertOut o;
    o.pos = UnityObjectToClipPos(v.vertex);
    o.srcPos = ComputeScreenPos(o.pos);
    return o;
}

fixed4 frag(vertOut i) : SV_Target {
    float2 wcoord = i.srcPos.xy / i.srcPos.w;
    return fixed4(wcoord, 0.0, 1.0);
}

Note that ComputeScreenPos does not perform the perspective division itself: hardware interpolation across the triangle is linear, and the (non-linear) division by w is only correct if it happens after that interpolation, so the divide must be done manually in the fragment shader.
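Why the order matters can be seen with a simplified 1-D sketch in plain Python (made-up vertex values for illustration): interpolating x and w first and dividing afterwards gives a different answer than dividing at each vertex and interpolating the quotients.

```python
def lerp(a, b, t):
    return a + (b - a) * t

# Two vertices' pre-division values, ComputeScreenPos-style: (x, w) pairs
x0, w0 = 0.0, 1.0    # near vertex
x1, w1 = 8.0, 4.0    # far vertex

t = 0.5  # halfway between the vertices in the interpolated attribute

# Correct: interpolate x and w linearly, divide afterwards (in the fragment shader)
correct = lerp(x0, x1, t) / lerp(w0, w1, t)   # 4.0 / 2.5 = 1.6

# Wrong: divide per vertex, then interpolate the already-divided results
wrong = lerp(x0 / w0, x1 / w1, t)             # lerp(0.0, 2.0, 0.5) = 1.0

print(correct, wrong)  # 1.6 vs 1.0: the divide cannot be moved before interpolation
```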


Tags: Unity

Posted on Fri, 03 Dec 2021 05:35:48 -0500 by Hopps