Ray tracing in practice: low-discrepancy sequences and importance sampling to accelerate convergence!

Preface

Use low-discrepancy sequences and importance sampling to accelerate the convergence of ray tracing! The figure below was rendered with only 1000 spp, and no noise can be found. Keep in mind that the example images in previous blogs started at 4000 spp:

Let's see what we've done:

  1. Low-discrepancy sequence noise reduction:
  2. Importance sampling noise reduction:
  3. HDR map importance sampling (from left to right: original image, samples, and result):
  4. Resampling:

In the last blog, Microfacet Theory and Disney BRDF, Strictly Following Physics!, we studied microfacet theory, implemented the complete Disney principled BRDF from the published formulas, and achieved physically based rendering

At this stage, however, we compute ray tracing through uniform hemisphere sampling. Although uniform sampling gives good results for diffuse reflection, it struggles to converge for specular BRDFs, especially when the roughness is very low, because we waste a lot of samples:

As shown in the figure, of the 10 samples above only 3 are actually useful; the efficiency is very low. In this blog we will strengthen the sampling strategy with some mathematical tools:

  1. On the samples: replace the pseudo-random sequence with a spatially more uniform low-discrepancy sequence
  2. On the distribution: use the importance sampling strategy from probability theory to accelerate the convergence of the rendering equation

Without further ado, let's start

1. Introduction to low-discrepancy sequences

The rendering equation is a difficult integral:

Usually we use Monte Carlo sampling to estimate such difficult integrals. Estimating the integral of a function requires the following steps:

  1. Select any sample $x$ in the integration domain
  2. Compute the probability $pdf(x)$ of selecting that sample
  3. Compute the function value $f(x)$ at the sample $x$
  4. The contribution of this sample is $\frac{f(x)}{pdf(x)}$

Here is a conclusion: the more uniformly the samples are distributed over the sample space, the more accurate the integral estimate. The reason is that uniform samples do not miss any corner of the integrand, so the estimate is more accurate

Take $y=x$ as an example: select 5 samples and assume $pdf(x)$ is the constant $\frac{1}{1.0 - 0.0} = 1$. The estimate obtained from the evenly spaced samples 0.2, 0.4, 0.6, 0.8, 1.0 is more accurate than the one obtained from the clustered samples 0.1, 0.7, 0.8, 0.9, 1.0:
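To make the four steps concrete, here is a minimal C++ sketch of this exact comparison (the integrand, sample values and constant pdf come from the text above; the helper names are mine):

#include <cstdio>

// Integrand y = x on [0, 1]; the exact integral is 0.5
float f(float x) { return x; }

// Monte Carlo estimate: average the contributions f(x) / pdf(x), with pdf(x) = 1
float estimate(const float* xs, int n) {
    float sum = 0.0f;
    for (int i = 0; i < n; i++) sum += f(xs[i]) / 1.0f;
    return sum / n;
}

int main() {
    float uniform[5]   = {0.2f, 0.4f, 0.6f, 0.8f, 1.0f};
    float clustered[5] = {0.1f, 0.7f, 0.8f, 0.9f, 1.0f};
    printf("uniform:   %f\n", estimate(uniform, 5));   // 0.60, closer to 0.5
    printf("clustered: %f\n", estimate(clustered, 5)); // 0.70, farther from 0.5
    return 0;
}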

Selecting uniform samples is very important for the quality of the estimate, so we introduce the discrepancy to measure the uniformity of a set of samples $X$: the lower the discrepancy, the more uniform the samples. For a point set $X$ of $N$ samples in a sample space of volume $V$, take any subspace of volume $v$ containing $n$ of the points; then:

$$Disc(X) = \left| \frac{n/N}{v/V} - 1.0 \right|$$

For example, a 100 ml cup of water contains 3000 water molecules. If you scoop out 10 ml with a spoon, you should get 300 water molecules; that would mean the distribution of the water is perfectly uniform and there is no discrepancy
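Here is a minimal sketch of this definition in 1D (the sample space is [0, 1] so V = 1; the point sets and the test interval are made up for illustration):

#include <cstdio>
#include <cmath>

// Discrepancy of a 1D point set over the sub-interval [a, b] of [0, 1]:
// Disc = |(n/N) / (v/V) - 1|, with V = 1 and v = b - a
float disc(const float* x, int N, float a, float b) {
    int n = 0;
    for (int i = 0; i < N; i++)
        if (a <= x[i] && x[i] <= b) n++;
    float v = b - a;
    return fabsf((float(n) / N) / v - 1.0f);
}

int main() {
    float even[5]      = {0.1f, 0.3f, 0.5f, 0.7f, 0.9f};
    float clustered[5] = {0.1f, 0.15f, 0.2f, 0.85f, 0.9f};
    printf("even:      %f\n", disc(even, 5, 0.0f, 0.4f));      // 0.0
    printf("clustered: %f\n", disc(clustered, 5, 0.0f, 0.4f)); // 0.5
    return 0;
}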

More intuitively, below are points generated on a 2D plane using a relatively uniform low-discrepancy sequence and an ordinary pseudo-random sequence. At 256 samples it is easy to see that the low-discrepancy sequence fills the space more evenly:

The low-discrepancy sequence covers the sample area uniformly, while the pseudo-random numbers leave many holes

Note:
Monte Carlo integration is unbiased, but with a small sample count some corners of the integration domain are hard to reach; they are still covered once the sample count grows.
A low-discrepancy sequence takes care of the whole region evenly even with very few samples, so it converges faster at low sample counts

2. sobol sequence and its implementation

I read the theory behind the sobol sequence three times and still don't fully understand it. After all, it's mathematics, made by monsters... Fortunately I finally managed to work out the meaning of the parameters and symbols, and was able to piece together a code implementation

Here is the paper on the sobol principle, and here is a good set of notes that helps with the code implementation. Lastly, here is how to compute the primitive-polynomial parameters of the generating matrices

The original paper is fairly involved, so let's pick out the important parts, summarize them briefly, and then move on to the code implementation

2.1. Generating matrix and recurrence formula

The sobol sequence is recursive: the i-th number of the sequence depends on the (i-1)-th. In addition, sobol sequences require a generating matrix $v$. It is a binary matrix, usually stored as a one-dimensional array:

$$v =\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 9 \\ 6 \end{bmatrix} = \begin{bmatrix} \vec v_1 \\ \vec v_2 \\ \vec v_3 \\ \vec v_4 \end{bmatrix}$$

The sobol expression is based on the bitwise XOR operation, where $x_{i,j}$ is the j-th bit of the i-th sobol number, $i_j$ is the j-th bit of the integer $i$, and $v_{i,j}$ is row $i$, column $j$ of the generating matrix. The j-th bit of the i-th sobol number is then:

$$x_{i,j}= i_1 v_{1,j} \oplus i_2 v_{2,j} \oplus i_3 v_{3,j} \cdots$$

Every bit of every $x_i$ must be XORed against all the rows $v_i$, which is somewhat expensive. But there is a special case: if the binary representations of $i$ and $i-1$ differ in only 1 bit, for example $5=(101)_2$ and $7=(111)_2$, then $x_i$ can be obtained from $x_{i-1}$ by XORing in the row for the differing bit, which gives us the recurrence:

$$x_{i,j}=x_{i-1,j}\oplus v_{c_{i-1},j}$$

where $c_{i-1}$ is the index of the bit in which $i$ and $i-1$ differ. When an integer goes from 0 to $i$ one bit flip at a time, it needs $k$ changes, where $k$ is the number of 1s in the binary representation of $i$. Take 0 to 11 (binary 1011) as an example:

  1. Change bit 0: 0000 becomes 0001, $c_0=0$
  2. Change bit 1: 0001 becomes 0011, $c_1=1$
  3. Change bit 3: 0011 becomes 1011, $c_2=3$

Expanding the recurrence:

$$x_{i,j} = x_{i-1,j}\oplus v_{c_{i-1},j} = (x_{i-2,j}\oplus v_{c_{i-2},j})\oplus v_{c_{i-1},j} = \cdots$$

In addition, we define $x_{0,j}=0$, and XOR with 0 does not change the result, so it finally becomes:

$$x_{i,j}= v_{c_{i-1},j}\oplus v_{c_{i-2},j} \oplus \cdots \oplus v_{c_1,j} \oplus v_{c_0,j}$$

This computes only bit $j$ of $x_i$. For the whole $x_i$, note that XOR is commutative (a^b = b^a) and works bitwise, so suppose our generating matrix is a 32 x 32 binary matrix represented as a uint array of length 32 whose entries are 32-bit rows, i.e.:

uint v[32] = {v1, v2, ..., v32};

Treating each $v_i$ as a row vector, one integer XOR operation applies the XOR to all 32 bits of $x_i$ at once! So the recurrence becomes:

$$x_{i}= v_{c_{i-1}}\oplus v_{c_{i-2}} \oplus \cdots \oplus v_{c_1} \oplus v_{c_0}$$

If every pair $i$ and $i-1$ in the sequence $0 \sim i$ differed in only one binary bit, we would only need to scan the bits of $i$: whenever bit $j$ is 1, XOR in row $j$ of the generating matrix. The code for the i-th sobol number is:

// The i th sobol number
// v is the generating matrix of a certain dimension
float sobol(unsigned int v[32], unsigned int i) {
    unsigned int result = 0;

    for(unsigned int j = 0; i; i >>= 1, j++)
        if(i & 1)
            result ^= v[j];

    return result * (1.0f/(float)0xFFFFFFFF);
}

Unfortunately, not all consecutive natural numbers $i$ and $i-1$ in $0 \sim n$ differ in only one binary bit. So gray code is introduced. Gray code is an out-of-order mapping in which the binary representations of gray(i-1) and gray(i) differ in exactly one bit

So we can use gray(i) in place of $i$ and then apply the recursive version of the sobol formula to generate the i-th sobol number. The gray code expression is:

$$gray(i)=i \oplus \left\lfloor \frac{i}{2} \right\rfloor$$

The brackets denote rounding down; in a programming language this is just an integer division by two (or a right shift by one)
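A quick sanity check (illustrative code, not from the original post) that consecutive gray codes differ in exactly one bit:

#include <cstdio>

typedef unsigned int uint;

uint grayCode(uint i) { return i ^ (i >> 1); }

int main() {
    // gray(i) ^ gray(i-1) is always a single power of two
    for (uint i = 1; i < 8; i++) {
        uint g = grayCode(i);
        printf("i=%u gray=%u changed bit mask=%u\n", i, g, g ^ grayCode(i - 1));
    }
    return 0;
}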

2.2. Generation of Sobol sequence

Because sobol is a sequence of points in N-dimensional space, each dimension has its own generating matrix. The following code uses precomputed 3 x 32 x 32 generating matrices (the dimension index starts from 1) and can generate at most $2^{32}$ sobol points in three-dimensional space. The 2D scatter plots above were also generated by this program:

#include <bits/stdc++.h>

using namespace std;

typedef unsigned int uint;

// V[d] represents the generation matrix of dimension d 
uint V[4][32] = { 
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	// dimension 1 
	2147483648, 1073741824, 536870912, 268435456, 134217728, 67108864, 33554432, 16777216, 8388608, 4194304, 2097152, 1048576, 524288, 262144, 131072, 65536, 32768, 16384, 8192, 4096, 2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1,
	// dimension 2
	2147483648, 3221225472, 2684354560, 4026531840, 2281701376, 3422552064, 2852126720, 4278190080, 2155872256, 3233808384, 2694840320, 4042260480, 2290614272, 3435921408, 2863267840, 4294901760, 2147516416, 3221274624, 2684395520, 4026593280, 2281736192, 3422604288, 2852170240, 4278255360, 2155905152, 3233857728, 2694881440, 4042322160, 2290649224, 3435973836, 2863311530, 4294967295,
	// dimension 3
	2147483648, 3221225472, 1610612736, 2415919104, 3892314112, 1543503872, 2382364672, 3305111552, 1753219072, 2629828608, 3999268864, 1435500544, 2154299392, 3231449088, 1626210304, 2421489664, 3900735488, 1556135936, 2388680704, 3314585600, 1751705600, 2627492864, 4008611328, 1431684352, 2147543168, 3221249216, 1610649184, 2415969680, 3892340840, 1543543964, 2382425838, 3305133397
};

// Gray code 
uint grayCode(uint i) {
	return i ^ (i>>1);
}

// Generate the ith sobol number of dimension dimension
float sobol(unsigned int vectors[][32], unsigned int dimension, unsigned int i) {
    unsigned int result = 0;

    for(unsigned int j = 0; i; i >>= 1, j++)
        if(i & 1)
            result ^= vectors[dimension][j];

    return result * (1.0f/(float)0xFFFFFFFF);
}

int main() {
	
	// Generate sobol sequence of dimension D 
	int N = 20;
	for(int i=0; i<N; i++) {
		float x = sobol(V, 1, grayCode(i)); cout<<x<<" ";
		float y = sobol(V, 2, grayCode(i)); cout<<y<<" ";
		float z = sobol(V, 3, grayCode(i)); cout<<z<<" ";	
		cout<<endl;
	}

	return 0;
}

The code follows the sobol implementation of the open-source renderer blender; here you can view that part of blender's source. You can also compare against the routines on the official sobol website. On the left is the routine's output under linux, on the right the output of our program:

Well, the result is correct!

2.3. Calculation of generating matrix

Simply put, it is, emmm. The generating matrix is represented as an array, where $v_{k,j}$ is the element at row $k$, column $j$. Each row comes from a recurrence on the direction numbers $m_k$, driven by a primitive polynomial of degree $s$ whose coefficient bits are packed into an integer $a$, where $a_j$ denotes the j-th bit of $a$. The formula is as follows:
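The formula image is missing here; reconstructed from the generate_M_k code below (this is the standard direction-number recurrence from Joe and Kuo's notes):

$$m_k = 2 a_1 m_{k-1} \oplus 2^2 a_2 m_{k-2} \oplus \cdots \oplus 2^{s-1} a_{s-1} m_{k-s+1} \oplus 2^s m_{k-s} \oplus m_{k-s}$$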

But because our v matrices are stored as 32-bit integers, we can multiply both sides of the equation by $2^{32}$, and then divide the final sobol number by $2^{32}$ to map it back to a uniform float in 0 ~ 1. So what the final v matrix actually stores is $v \cdot 2^{32}$, which gives:

$$v_{k}=m_{k} \cdot 2^{32-k}$$

Then it is just brute-forcing the formula. Below is the code I wrote. From the input parameters (which can be downloaded from the official sobol website) it generates the generating matrix for the corresponding dimension, but only 32-bit is supported. The binary generating matrix of each dimension is 32 x 32, compressed into a 32-entry array. Taking the third dimension as an example, the parameters are entered as follows:

The routine that generates the generating matrix of dimension D is as follows:

#include <bits/stdc++.h>

using namespace std;

typedef unsigned int uint;

float sobol(unsigned int vectors[][32], unsigned int dimension, unsigned int i) {
    unsigned int result = 0;

    for(unsigned int j = 0; i; i >>= 1, j++)
        if(i & 1)
            result ^= vectors[dimension][j];

    return result * (1.0f/(float)0xFFFFFFFF);
}

// Gray code 
uint grayCode(uint i) {
	return i ^ (i>>1);
}

// Generate the kth direction number M_k 
void generate_M_k(vector<uint>& M, uint S, uint k, uint a[]) {
	unsigned int M_k = 0;
	for(int i=1; i<=S-1; i++) {
		M_k ^= (1<<i) * a[i] * M[k-i];
	}
	M_k ^= ((1<<S) * M[k-S]) ^ M[k-S];
	M.push_back(M_k);
}

int main() {
	
	// parameter 
	int D = 3;
	int S = 2;
	int A = 1;
	vector<uint> M = {0, 1, 3};	// Subscript starts with 1  
	
	
	// Generate polynomials based on the value of A 
	uint a[S];	// The subscript starts from 1 and there are S-1 elements in total 
	uint A_ = A;
	uint i = S-1;
	do {
		a[i--] = A_ & 1;
		A_ = A_>>1;
	} while(A_);
	cout<<"Polynomial array a[1] ~ a[S-1]"<<endl;
	for(int i=1; i<S; i++) {
		cout<<a[i]<<" ";
	}
	cout<<endl<<endl;
	
	
	// Generation direction number 
	for(int k=S+1; k<=32; k++) {
		generate_M_k(M, S, k, a);
	}
	cout<<"Direction number array m[1] ~ m[32]"<<endl;
	for(int i=1; i<=32; i++) {
		cout<<M[i]<<" ";
	}
	cout<<endl<<endl;
	
	
	// Convert m to v and write the generation matrix of dimension d 
	uint V[33][32];
	for(int k=1; k<=32; k++) {
		V[D][k-1] = M[k] * (1<<(32-k));
	}
	cout<<"Generating matrix v[1] ~ v[32]"<<endl;
	for(int k=1; k<=32; k++) {
		cout<<V[D][k-1]<<", ";
	}
	cout<<endl<<endl;
	
	
	// Generate sobol sequence of dimension D 
	cout<<"Dimension is "<<D<<" of sobol sequence"<<endl;
	int N = 30;
	for(int i=0; i<N; i++) {
		float x_i = sobol(V, D, grayCode(i));
		cout<<x_i<<endl;
	}
	

	return 0;
}

The results can be compared with the standard sobol routines. Taking the third dimension as an example, the result is correct:

Note:
My code does not show how to obtain the generating matrix of the first dimension. In fact it is:
v_i = 2^(32-i)
Namely:
V[1][] = {2147483648, 1073741824, 536870912 ... 4, 2, 1}
The sequence obtained with this matrix (which is in fact the identity matrix over the bits), again feeding in gray(i), is the one-dimensional sobol sequence
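As a quick check (illustrative code), this identity matrix plus gray code does reproduce the familiar 1D sobol ordering:

#include <cstdio>

typedef unsigned int uint;

uint grayCode(uint i) { return i ^ (i >> 1); }

// Dimension-1 sobol: row j of the generating matrix is just the bit 2^(31-j)
float sobol1D(uint i) {
    uint result = 0;
    for (uint j = 0; i; i >>= 1, j++)
        if (i & 1)
            result ^= 1u << (31 - j);
    return result * (1.0f / (float)0xFFFFFFFF);
}

int main() {
    for (uint i = 0; i < 8; i++)
        printf("%f\n", sobol1D(grayCode(i))); // 0, 0.5, 0.75, 0.25, 0.375, ...
    return 0;
}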

3. Using the sobol sequence in the shader

So far we can generate an N-dimensional sobol sequence. The numbers $0 \sim n$ of each dimension form a separate quasi-random sequence, and because sobol uniformly fills space, a few dimensions are enough to cover the integration domain

One question remains: for N D-dimensional sobol vectors, N numbers per dimension, how do we pick numbers for each frame, each pixel, and each bounce of a ray?

  • For frame i, we select the i-th sobol vector
  • Suppose each bounce requires two random numbers. For the first bounce of frame i, use the numbers in dimensions 1 and 2 of the i-th sobol vector; for the second bounce, dimensions 3 and 4, and so on

Note:
The website above provides parameters for at most 21201 sobol dimensions. If you followed the discussion above, you can see why, with 8 random numbers consumed per bounce, blender's maximum light path depth is 21201 ÷ 8 (the figure below is from the blender documentation):

Using the sobol sequence in the shader is also very simple. Our hemisphere sampling requires two random variables, $\phi$ and $\theta$, so we only need two sobol dimensions per bounce. Assuming we support a maximum of 4 bounces, we need sobol generating matrices for up to 8 dimensions. Add to the shader:

// sobol generating matrix of 1 ~ 8 dimensions
const uint V[8*32] = {
    2147483648, 1073741824, 536870912, 268435456, 134217728, 67108864, 33554432, 16777216, 8388608, 4194304, 2097152, 1048576, 524288, 262144, 131072, 65536, 32768, 16384, 8192, 4096, 2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1,
    2147483648, 3221225472, 2684354560, 4026531840, 2281701376, 3422552064, 2852126720, 4278190080, 2155872256, 3233808384, 2694840320, 4042260480, 2290614272, 3435921408, 2863267840, 4294901760, 2147516416, 3221274624, 2684395520, 4026593280, 2281736192, 3422604288, 2852170240, 4278255360, 2155905152, 3233857728, 2694881440, 4042322160, 2290649224, 3435973836, 2863311530, 4294967295,
    2147483648, 3221225472, 1610612736, 2415919104, 3892314112, 1543503872, 2382364672, 3305111552, 1753219072, 2629828608, 3999268864, 1435500544, 2154299392, 3231449088, 1626210304, 2421489664, 3900735488, 1556135936, 2388680704, 3314585600, 1751705600, 2627492864, 4008611328, 1431684352, 2147543168, 3221249216, 1610649184, 2415969680, 3892340840, 1543543964, 2382425838, 3305133397,
    2147483648, 3221225472, 536870912, 1342177280, 4160749568, 1946157056, 2717908992, 2466250752, 3632267264, 624951296, 1507852288, 3872391168, 2013790208, 3020685312, 2181169152, 3271884800, 546275328, 1363623936, 4226424832, 1977167872, 2693105664, 2437829632, 3689389568, 635137280, 1484783744, 3846176960, 2044723232, 3067084880, 2148008184, 3222012020, 537002146, 1342505107,
    2147483648, 1073741824, 536870912, 2952790016, 4160749568, 3690987520, 2046820352, 2634022912, 1518338048, 801112064, 2707423232, 4038066176, 3666345984, 1875116032, 2170683392, 1085997056, 579305472, 3016343552, 4217741312, 3719483392, 2013407232, 2617981952, 1510979072, 755882752, 2726789248, 4090085440, 3680870432, 1840435376, 2147625208, 1074478300, 537900666, 2953698205,
    2147483648, 1073741824, 1610612736, 805306368, 2818572288, 335544320, 2113929216, 3472883712, 2290089984, 3829399552, 3059744768, 1127219200, 3089629184, 4199809024, 3567124480, 1891565568, 394297344, 3988799488, 920674304, 4193267712, 2950604800, 3977188352, 3250028032, 129093376, 2231568512, 2963678272, 4281226848, 432124720, 803643432, 1633613396, 2672665246, 3170194367,
    2147483648, 3221225472, 2684354560, 3489660928, 1476395008, 2483027968, 1040187392, 3808428032, 3196059648, 599785472, 505413632, 4077912064, 1182269440, 1736704000, 2017853440, 2221342720, 3329785856, 2810494976, 3628507136, 1416089600, 2658719744, 864310272, 3863387648, 3076993792, 553150080, 272922560, 4167467040, 1148698640, 1719673080, 2009075780, 2149644390, 3222291575,
    2147483648, 1073741824, 2684354560, 1342177280, 2281701376, 1946157056, 436207616, 2566914048, 2625634304, 3208642560, 2720006144, 2098200576, 111673344, 2354315264, 3464626176, 4027383808, 2886631424, 3770826752, 1691164672, 3357462528, 1993345024, 3752330240, 873073152, 2870150400, 1700563072, 87021376, 1097028000, 1222351248, 1560027592, 2977959924, 23268898, 437609937
};

// Gray code 
uint grayCode(uint i) {
	return i ^ (i>>1);
}

// Generate the ith sobol number of dimension d
float sobol(uint d, uint i) {
    uint result = 0;
    uint offset = d * 32;
    for(uint j = 0; i; i >>= 1, j++) 
        if(i & 1)
            result ^= V[j+offset];

    return float(result) * (1.0f/float(0xFFFFFFFFU));
}

// Generate the two-dimensional random vector required for the b rebound of frame i
vec2 sobolVec2(uint i, uint b) {
    float u = sobol(b*2, grayCode(i));
    float v = sobol(b*2+1, grayCode(i));
    return vec2(u, v);
}

For frame i we take the i-th sobol vector, and for bounce b we take the numbers of dimensions $[b*2, b*2+1]$. Pass in the frame counter uniform frameCounter, then generate points on the 2D plane. Looking at the result, the sobol points evenly fill the whole space:

Of course, you can also try different dimensions, and you will find that their filling patterns are different

Take the 2D sobol vector as the random variables $\xi_1$ and $\xi_2$, and modify the hemisphere-vector generation code in pathTracing as follows, where bounce is the bounce index, starting from 0:

vec3 SampleHemisphere(float xi_1, float xi_2) {
    float z = xi_1;
    float r = max(0.0, sqrt(1.0 - z*z));
    float phi = 2.0 * PI * xi_2;
    return vec3(r * cos(phi), r * sin(phi), z);
}

...

vec3 pathTracing(HitResult hit, int maxBounce) {
	for(int bounce=0; bounce<maxBounce; bounce++) {
		...	
		
		vec2 uv = sobolVec2(frameCounter+1, bounce);
		vec3 L = SampleHemisphere(uv.x, uv.y);
		L = toNormalHemisphere(L, hit.normal);	
		
		...
	}
}

Don't forget to project L back to the normal hemisphere:

// Project the vector v onto the normal hemisphere of N
vec3 toNormalHemisphere(vec3 v, vec3 N) {
    vec3 helper = vec3(1, 0, 0);
    if(abs(N.x)>0.999) helper = vec3(0, 0, 1);
    vec3 tangent = normalize(cross(N, helper));
    vec3 bitangent = normalize(cross(N, tangent));
    return v.x * tangent + v.y * bitangent + v.z * N;
}

Now let's look at the result. The left image shows the first 200 or so iterations; it looks terrible, full of ghostly mirror-like artifacts. The right image, after 4000 iterations, is excellent; no noise can be seen, showing the power of the low-discrepancy sequence:

Why do these "reflection" patterns appear before convergence? Because all pixels use the same sequence for their ray bounces, the surface behaves like a smooth mirror, which is of course no good. One solution is to generate a 2D random vector seeded by the pixel coordinates and add it to the frame's sobol sample, wrapping any value that leaves $[0,1]$ back around by taking it modulo 1. This strategy is called Cranley-Patterson rotation. It is simple to implement and works well:

uint wang_hash(inout uint seed) {
    seed = uint(seed ^ uint(61)) ^ uint(seed >> uint(16));
    seed *= uint(9);
    seed = seed ^ (seed >> 4);
    seed *= uint(0x27d4eb2d);
    seed = seed ^ (seed >> 15);
    return seed;
}

...

vec2 CranleyPattersonRotation(vec2 p) {
    uint pseed = uint(
        uint((pix.x * 0.5 + 0.5) * width)  * uint(1973) + 
        uint((pix.y * 0.5 + 0.5) * height) * uint(9277) + 
        uint(114514/1919) * uint(26699)) | uint(1);
    
    float u = float(wang_hash(pseed)) / 4294967296.0;
    float v = float(wang_hash(pseed)) / 4294967296.0;

    p.x += u;
    if(p.x>1) p.x -= 1;
    if(p.x<0) p.x += 1;

    p.y += v;
    if(p.y>1) p.y -= 1;
    if(p.y<0) p.y += 1;

    return p;
}

Here pix is the pixel coordinate in the NDC coordinate system (range ±1), and width and height are the screen resolution; we use the pixel coordinates as the random seed. Looking at the result again at 50 spp: the pseudo-random result is on the left, sobol + Cranley-Patterson rotation on the right:

The sobol sequence results are clearly smoother; note the ground in the right image

Note:
Multiple bounces within the same frame must use sobol numbers from different dimensions, otherwise strange bugs appear when several bounces share the same direction. The figure below is a visual effect I produced by accident. A very... stream-of-consciousness bug, like a teapot from another world flashing light that does not belong to this dimension

4. Introduction to importance sampling

So far we have only improved how the random numbers are generated; in essence we are still sampling with a uniform distribution. Probability theory tells us that, compared with sampling every location equally, sampling more often where the function value is large gives a better fit. This strategy is called importance sampling:

In fact, "sampling more times where the function value is large" is only an intuitive description. The mathematical expression is: if the probability density function of random variables p d f ( x ) pdf(x) pdf(x) can be proportional to the function value of the estimated function f ( x ) f(x) f(x), then the estimated result is the best

For instance, if $f(1)=1, f(2)=7, f(3)=2$, then the probabilities of x taking 1, 2 and 3 should be 0.1, 0.7 and 0.2 respectively
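A tiny sketch of this example (my own illustration): estimating the sum $f(1)+f(2)+f(3)=10$ by Monte Carlo. With the proportional pdf, every contribution $\frac{f(x)}{pdf(x)}$ equals exactly 10, so the variance drops to zero:

#include <cstdio>
#include <cstdlib>

// f over the discrete domain {1, 2, 3}; the exact sum is 10
float f(int x) { return x == 1 ? 1.0f : (x == 2 ? 7.0f : 2.0f); }

int main() {
    srand(42);
    const int N = 10000;
    float uniformSum = 0.0f, importanceSum = 0.0f;
    for (int i = 0; i < N; i++) {
        float xi = rand() / (float)RAND_MAX;

        // uniform: pdf(x) = 1/3 for every x
        int xu = 1 + (int)(xi * 2.999f);
        uniformSum += f(xu) / (1.0f / 3.0f);

        // importance: pdf proportional to f, i.e. 0.1, 0.7, 0.2
        int xm = (xi < 0.1f) ? 1 : (xi < 0.8f ? 2 : 3);
        float pdf = (xm == 1) ? 0.1f : (xm == 2 ? 0.7f : 0.2f);
        importanceSum += f(xm) / pdf; // always exactly 10
    }
    printf("uniform:    %f\n", uniformSum / N);    // noisy estimate of 10
    printf("importance: %f\n", importanceSum / N); // exactly 10
    return 0;
}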

Because the BRDF consists of different lobes, the importance sampling for ray tracing splits into the following parts:

  1. diffuse importance sampling
  2. Specular importance sampling
  3. clearcoat importance sampling
  4. HDR importance sampling

For path tracing that supports importance sampling, you need to implement these steps:

  1. Randomly sample a direction L according to the surface material
  2. Get the BRDF value in the direction L
  3. Get the probability density of sampling the direction L
  4. Compute the contribution of this path

5. Diffuse importance sampling

For the diffuse lobe of Disney BRDF we roughly take $f_d = \frac{baseColor}{\pi} \cdot \cos(w_i)$, i.e. the N dot L of Lambert diffuse. Integrating over the hemisphere then gives the normalization factor $c=\frac{1}{\pi}$, which yields:

$$pdf(x) \sim \frac{\cos(w_i)}{\pi}$$

The derivation is in PBRT 13.6, 2D Sampling with Multidimensional Transformations. For generating such samples, PBRT gives us a simple trick:

  1. Generate two random numbers $\xi_1, \xi_2$
  2. Map the random numbers to a 2D point xy on the uniform unit disk
  3. Project the 2D point up onto the z-axis hemisphere to obtain the 3D point xyz

This method is called Malley's Method; we will not explore the principle behind it. "We are more interested in the application of the law than in exploring the law itself" (a line from the original science-fiction short story The Lamp God on forum A). Unlike that unrealistic lamp-god observer and its three wishes, here is the code for cosine-weighted importance sampling; again remember to project the random vector onto the normal hemisphere:

// Cosine weighted normal hemisphere sampling
vec3 SampleCosineHemisphere(float xi_1, float xi_2, vec3 N) {
    // The xy disk is uniformly sampled and projected onto the z hemisphere
    float r = sqrt(xi_1);
    float theta = xi_2 * 2.0 * PI;
    float x = r * cos(theta);
    float y = r * sin(theta);
    float z = sqrt(1.0 - x*x - y*y);

    // Projection from z hemisphere to normal hemisphere
    vec3 L = toNormalHemisphere(vec3(x, y, z), N);
    return L;
}

Here xi_1, xi_2 are two random variables, which means importance sampling can also consume samples from the sobol sequence. First sample a direction, then obtain the corresponding BRDF value and pdf:

// xi_1, xi_2 are 2 random numbers

vec3 L = SampleCosineHemisphere(xi_1, xi_2, N);
vec3 f_r = ...
float pdf = NdotL / PI;

Let's see the effect at 5 spp:

Uniform sampling on the left, cosine-weighted hemisphere sampling on the right. Importance sampling has less noise and is smoother

Note:
The test image uses pseudo-random numbers instead of sobol numbers, because the latter has a noise-reduction effect of its own.
So to control the variables, only pseudo-random numbers are used here

6. BRDF importance sampling

This part mainly covers importance sampling of specular reflection. Specular reflection is approximated by the GTR function; the lower the roughness, the sharper its peak. The large BRDF values appear near the mirror-reflection direction, which makes them hard to fit with ordinary uniform sampling. Even though we use the sobol sequence, noise is unavoidable:

If you remember the complex expression of GTR2 in Disney BRDF, you will not want to derive a matching distribution by hand:

Fortunately, Disney derived it for us in the paper: to generate a distribution conforming to the GTR function, we take the probability density of the half-angle vector h to be $pdf(\theta_h)=D(\theta_h)\cos\theta_h$, because the distribution is normalized that way. In addition, for importance sampling our pdf must be expressed with respect to $\theta_l$, the outgoing light direction, so there is one extra conversion step:
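The conversion figure is missing from the original here. The standard change of variables from the half-vector to the outgoing direction (the same factor that appears in the pdf_specular code in section 8.2) is:

$$pdf(\theta_l) = \frac{pdf(\theta_h)}{4\,(\omega_l \cdot \omega_h)} = \frac{D(\theta_h)\cos\theta_h}{4\,(\omega_l \cdot \omega_h)}$$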

At this point, we can determine the process of generating importance sampling direction:

  1. First roll two uniformly distributed random numbers $\xi_1, \xi_2$
  2. From $\xi_1, \xi_2$ compute the hemispherical coordinates $\theta_h, \phi_h$ to determine the half-angle vector H
  3. Project the half-angle vector H onto the normal hemisphere of the hit point
  4. Treat the half-angle vector H as a mirror normal and reflect the incident direction V about it to obtain the outgoing direction L

Namely:

The most difficult step is step 2, because we want to generate a random variable distributed according to:

$$pdf(\theta_h)\sim D(\theta_h)\cos\theta_h$$

This distribution governs the polar angle $\theta_h$. As for the azimuth $\phi_h$, since we importance-sample the isotropic case, it is independent of $\theta_h$ and we can take it uniformly. So no derivation is needed: we simply copy the formula from the paper. For GTR2's $\theta_h$ we have:
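The formula image is missing here; reconstructed from the shader code below (and matching the Disney paper's isotropic GTR2 sampling), it is:

$$\phi_h = 2\pi\,\xi_1, \qquad \cos\theta_h = \sqrt{\frac{1-\xi_2}{1+(\alpha^2-1)\,\xi_2}}$$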

Note that we still compute in the z-up geometric coordinate system, rather than the y-up standard OpenGL coordinates:

Generating vectors with the formula above, you can see that the lower the roughness, the more the vectors cluster around the mirror-reflection direction. Left and right use roughness 0.3 and 0.1 respectively. Write a python script to try the effect:

With the principle understood, the corresponding shader code is as follows. We first compute H, then reflect V about H to obtain L:

// GTR2 importance sampling
vec3 SampleGTR2(float xi_1, float xi_2, vec3 V, vec3 N, float alpha) {
    
    float phi_h = 2.0 * PI * xi_1;
    float sin_phi_h = sin(phi_h);
    float cos_phi_h = cos(phi_h);

    float cos_theta_h = sqrt((1.0-xi_2)/(1.0+(alpha*alpha-1.0)*xi_2));
    float sin_theta_h = sqrt(max(0.0, 1.0 - cos_theta_h * cos_theta_h));

    // The normal vector of the "micro plane" is sampled as the half angle vector h of the specular reflection 
    vec3 H = vec3(sin_theta_h*cos_phi_h, sin_theta_h*sin_phi_h, cos_theta_h);
    H = toNormalHemisphere(H, N);   // Project to the true normal hemisphere

    // Calculates the direction of reflected light from the micronormal
    vec3 L = reflect(-V, H);

    return L;
}

Note that this is importance sampling for isotropic GTR2. In addition, alpha is the square of the roughness; in the ray tracing main loop we apply this mapping when passing it in. Don't forget to clamp it to 0.001 to keep the roughness from being too small:

float alpha = max(0.001, sqr(hit.material.roughness));

Assuming all samples come from specular reflection, we directly replace the cosine hemisphere sampling above with BRDF sampling, then look at the result:

At metallic 1.0, roughness 0.2 and 50 spp, the peak is hard to sample because the roughness is so low: the left image barely forms a reflection, while the right forms a perfect one

Note:
Because the pdf appears as a divisor, we have to handle the case where it equals 0.
Otherwise a careless divide-by-zero will take down your fragment shader, and black spots will appear on the screen; they will never disappear until you close the program

7. Sampling of varnish

Disney BRDF uses GTR1 to fit the specular lobe of the varnish (clearcoat), and Disney's paper likewise gives the importance-sampled $\theta_h$ and $\phi_h$ values for isotropic GTR1:
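The formula image is missing here; reconstructed from the shader code below, the isotropic GTR1 sampling is:

$$\phi_h = 2\pi\,\xi_1, \qquad \cos\theta_h = \sqrt{\frac{1-(\alpha^2)^{1-\xi_2}}{1-\alpha^2}}$$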

So just do the same as for GTR2; only the cos_theta line changes:

// GTR1 importance sampling
vec3 SampleGTR1(float xi_1, float xi_2, vec3 V, vec3 N, float alpha) {
    ...
    
    float cos_theta_h = sqrt((1.0-pow(alpha*alpha, 1.0-xi_2))/(1.0-alpha*alpha));
    ...
}

When evaluating the BRDF, note that the varnish roughness alpha is computed as follows; it depends on the clearcoatGloss parameter and is independent of the surface roughness:

float alpha = mix(0.1, 0.001, hit.material.clearcoatGloss);

Take a look at the effect when there is only varnish:

Great. There's no noise at all

8. Mix sampling according to radiance

Previously we sampled the three parts of the BRDF with separate distributions. Now we want to combine the three sampled results into the final one. The problem: in one sample we cannot sample all three BRDF lobes at once. Shooting three rays per bounce would be far too expensive, and most materials do not have all three lobes anyway, so a lot of sampling would be wasted

8.1. Obtain sampling direction L

A heuristic is to allocate the number of samples according to the energy contribution of the three BRDF terms. For example, with a budget of 100 samples, if diffuse, specular and varnish account for 0.6, 0.3 and 0.1 of the final brightness respectively, allocate 60, 30 and 10 samples to them

According to the expression of Disney BRDF:

diffuse * (1.0 - material.metallic) + specular + clearcoat;

we can roughly estimate the proportion of each term. Note that here we treat the brightness of specular reflection as always 1.0, while the brightness of the varnish is controlled by the material's clearcoat parameter:

// Radiometric statistics
float r_diffuse = (1.0 - material.metallic);
float r_specular = 1.0;
float r_clearcoat = 0.25 * material.clearcoat;
float r_sum = r_diffuse + r_specular + r_clearcoat;

// Calculate probability based on radiance
float p_diffuse = r_diffuse / r_sum;
float p_specular = r_specular / r_sum;
float p_clearcoat = r_clearcoat / r_sum;

Looking only at diffuse and specular, you will find we have:

$$p_{diffuse} = \frac{1.0 - metallic}{2.0 - metallic}, \qquad p_{specular} = \frac{1.0}{2.0 - metallic}$$

Taking metallic as x and plotting the diffuse ratio against x, you will find that this nearly coincides with the commonly used empirical formula:

float diffuseRatio = 0.5 * (1.0 - Metallic);
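A quick numeric comparison (my own sketch) of the radiance-derived ratio against the empirical formula; the two agree at metallic 0 and 1 and stay close in between:

#include <cstdio>

int main() {
    for (float m = 0.0f; m <= 1.001f; m += 0.25f) {
        float radianceRatio  = (1.0f - m) / (2.0f - m); // from the radiance argument
        float empiricalRatio = 0.5f * (1.0f - m);       // common empirical formula
        printf("metallic=%.2f  radiance=%.3f  empirical=%.3f\n",
               m, radianceRatio, empiricalRatio);
    }
    return 0;
}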

There are few detailed explanations of the diffuseRatio formula online; it is usually given directly as an empirical rule. I think it can be explained through this idea of sampling according to radiance:

We roll a third random number $\xi_3$ and select a lobe according to the radiance contributions of the three BRDFs. Assuming diffuse, specular and varnish contribute 0.6, 0.3 and 0.1 respectively, a random number falling in $[0, 0.6]$ selects diffuse, in $[0.6, 0.9]$ selects specular, and in $[0.9, 1.0]$ selects varnish:

// Sample one of the three BRDF lobes according to the radiance distribution
vec3 SampleBRDF(float xi_1, float xi_2, float xi_3, vec3 V, vec3 N, in Material material) {
    float alpha_GTR1 = mix(0.1, 0.001, material.clearcoatGloss);
    float alpha_GTR2 = max(0.001, sqr(material.roughness));
    
    // Radiometric statistics
    float r_diffuse = (1.0 - material.metallic);
    float r_specular = 1.0;
    float r_clearcoat = 0.25 * material.clearcoat;
    float r_sum = r_diffuse + r_specular + r_clearcoat;

    // Calculate probability based on radiance
    float p_diffuse = r_diffuse / r_sum;
    float p_specular = r_specular / r_sum;
    float p_clearcoat = r_clearcoat / r_sum;

    // Sampling according to probability
    float rd = xi_3;

    // diffuse reflection
    if(rd <= p_diffuse) {
        return SampleCosineHemisphere(xi_1, xi_2, N);
    } 
    // Specular reflection
    else if(p_diffuse < rd && rd <= p_diffuse + p_specular) {    
        return SampleGTR2(xi_1, xi_2, V, N, alpha_GTR2);
    } 
    // varnish
    else if(p_diffuse + p_specular < rd) {
        return SampleGTR1(xi_1, xi_2, V, N, alpha_GTR1);
    }
    return vec3(0, 1, 0);
}

8.2. Obtain the probability density in direction L

When sampling a diffuse path we should use the diffuse pdf to weight that path's contribution. But compared with keeping the three pdfs separate, a better strategy is to blend the three pdfs with the same selection probabilities. As with sampling: first compute the radiance shares, normalize to get the probability of each part, then blend the three different pdfs by those probabilities and return:

// Obtain the probability density of BRDF in the L direction
float BRDF_Pdf(vec3 V, vec3 N, vec3 L, in Material material) {
    float NdotL = dot(N, L);
    float NdotV = dot(N, V);
    if(NdotL < 0 || NdotV < 0) return 0;

    vec3 H = normalize(L + V);
    float NdotH = dot(N, H);
    float LdotH = dot(L, H);
     
    // Specular - isotropic
    float alpha = max(0.001, sqr(material.roughness));
    float Ds = GTR2(NdotH, alpha); 
    float Dr = GTR1(NdotH, mix(0.1, 0.001, material.clearcoatGloss));   // varnish

    // Compute the probability densities of the three BRDFs separately
    float pdf_diffuse = NdotL / PI;
    float pdf_specular = Ds * NdotH / (4.0 * dot(L, H));
    float pdf_clearcoat = Dr * NdotH / (4.0 * dot(L, H));

    // Radiometric statistics
    float r_diffuse = (1.0 - material.metallic);
    float r_specular = 1.0;
    float r_clearcoat = 0.25 * material.clearcoat;
    float r_sum = r_diffuse + r_specular + r_clearcoat;

    // The probability of selecting a sampling method is calculated according to the radiance
    float p_diffuse = r_diffuse / r_sum;
    float p_specular = r_specular / r_sum;
    float p_clearcoat = r_clearcoat / r_sum;

    // Mix pdf according to probability
    float pdf = p_diffuse   * pdf_diffuse 
              + p_specular  * pdf_specular
              + p_clearcoat * pdf_clearcoat;

    pdf = max(1e-10, pdf);
    return pdf;
}

Then change the main path tracing code. First obtain a direction L, then evaluate the BRDF and pdf in that direction, and finally accumulate the contribution of this path. hdrColor fetches the color of the hdr map in direction L:

// Path tracking -- importance sampling version
vec3 pathTracingImportanceSampling(HitResult hit, int maxBounce) {

    vec3 Lo = vec3(0);      // Final color
    vec3 history = vec3(1); // Recursive accumulated color

    for(int bounce=0; bounce<maxBounce; bounce++) {
        vec3 V = -hit.viewDir;
        vec3 N = hit.normal;       
     
        // Get 3 random numbers        
        float xi_1, xi_2, xi_3 = ...

        // A direction L is obtained by sampling BRDF
        vec3 L = SampleBRDF(xi_1, xi_2, xi_3, V, N, hit.material); 
        float NdotL = dot(N, L);
        if(NdotL <= 0.0) break;

        // Emit light
        Ray randomRay;
        randomRay.startPoint = hit.hitPoint;
        randomRay.direction = L;
        HitResult newHit = hitBVH(randomRay);

        // Obtain the BRDF value and probability density in the L direction
        vec3 f_r = BRDF_Evaluate(V, N, L, hit.material);
        float pdf_brdf = BRDF_Pdf(V, N, L, hit.material);
        if(pdf_brdf <= 0.0) break;

        // Miss        
        if(!newHit.isHit) {
            vec3 color = hdrColor(L);
            Lo += history * color * f_r * NdotL / pdf_brdf;
            break;
        }
        
        // Hit light accumulation color
        vec3 Le = newHit.material.emissive;
        Lo += history * Le * f_r * NdotL / pdf_brdf;             

        // Recursion (step)
        hit = newHit;
        history *= f_r * NdotL / pdf_brdf;   // Cumulative color
    }
    
    return Lo;
}

Let's see the result: roughness 0.1 and metallic 0, so there is both diffuse and specular reflection. At the same sample count the diffuse part looks similar, while on the specular part importance sampling wins a crushing victory over the noise:

Great, here is another one with more iterations:

This image can almost pass for the real thing, and it is also my favorite one

9. HDR environment map importance sampling

So far, we have used this environment map:

It is not that other maps are not pretty; rather, other maps have many small high-frequency light sources. As shown in the figure, each street lamp covers only a few hundred bright pixels:

Uniformly sampling such an HDR map means ray tracing occasionally accumulates a huge brightness during a diffuse bounce, producing noise:

It is natural to use the brightness of the HDR map as the probability density function. We therefore generate samples that follow the brightness distribution of the HDR map and use them to sample it, avoiding the noise!

Note:
Understanding what follows requires some basis in probability theory; at least know what density and distribution functions are.
But trust me, it's really not difficult, really

9.1. Generate samples with brightness as pdf

Reviewing probability theory: to generate samples that follow a given probability density function $pdf(x)$, in the continuous case we find an analytic solution for the cumulative distribution function $cdf(x)$ and then take its inverse $cdf^{-1}(x)$. By the definition of the distribution function:

$$cdf(x) = P\{ X \le x\}=\xi$$

Take the inverse function on both sides to get:

$$cdf^{-1}(cdf(x)) = cdf^{-1}(\xi)= x$$

We roll a random number $\xi$ and plug it into $cdf^{-1}(\xi)$ to obtain a sample $x$ that follows the $pdf(x)$ distribution. This is exactly how the distribution of $\theta_h$ for Disney BRDF importance sampling was obtained
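A small worked example (not from the original text): take $pdf(x) = 2x$ on $[0, 1]$. Then

$$cdf(x) = \int_0^x 2t\,dt = x^2, \qquad x = cdf^{-1}(\xi) = \sqrt{\xi}$$

so taking the square root of a uniform random number yields samples whose density grows linearly in x, exactly as $pdf(x)=2x$ demands.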

That was a bit abstract, so consider the discrete case: suppose the random variable x takes the three values 5, 6 and 7 with probabilities 0.2, 0.3 and 0.5 respectively, that is

$$pdf(5)=0.2, \quad pdf(6)=0.3, \quad pdf(7)=0.5$$

We obtain the distribution function of the random variable x through a prefix-sum over the array:

x      | 5   | 6   | 7
cdf(x) | 0.2 | 0.5 | 1.0

Now roll a random number $\xi$ from 0 to 1; there are three cases:

  • If $\xi \in [0, 0.2]$, then x takes 5
  • If $\xi \in (0.2, 0.5]$, then x takes 6
  • If $\xi \in (0.5, 1.0]$, then x takes 7

Suppose our $\xi$ comes out as 0.34. We perform a lower-bound operation on the discrete cdf table, i.e. find the first cdf entry greater than or equal to our $\xi$. Binary search also works here, because the cdf is non-decreasing

Here we find $cdf(6)=0.5 > 0.34$, which immediately gives $\xi \in (0.2, 0.5]$, so x takes 6

So far we have mastered generating one-dimensional discrete random variables with an arbitrary distribution
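Here is a minimal C++ sketch of this discrete inversion with std::lower_bound, using the example values above (the same call the cache precomputation in section 9.2 uses):

#include <cstdio>
#include <algorithm>
#include <vector>

int main() {
    std::vector<int>   value = {5, 6, 7};
    std::vector<float> cdf   = {0.2f, 0.5f, 1.0f}; // prefix sums of the pdf

    float xi = 0.34f;
    // first entry with cdf >= xi --> cdf(6) = 0.5, so x takes 6
    int idx = std::lower_bound(cdf.begin(), cdf.end(), xi) - cdf.begin();
    printf("xi = %f  ->  x = %d\n", xi, value[idx]);
    return 0;
}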

For an HDR map, the probability density function is its brightness, and the random variable is two-dimensional, which requires the 2D joint distribution and the marginal distributions. First sum over each column to obtain the one-dimensional marginal probability density of X:

Using the lower-bound method from the 1D discrete case above, we generate a sample $x$ conforming to the marginal density of $X$, then select column $x$. Suppose here we selected the column $x=2$; within it we get the joint probability density under the condition $x=2$, i.e. $f_{XY}(x, y)$, as shown in the figure:

For the two-dimensional distribution, by the definition of conditional density we have:

$$f_{XY}(x, y) = f_{Y \mid X}(y) \cdot f_X(x)$$

And we just selected $x=2$ with probability 0.35, so $f_X(x)=0.35$. Dividing the joint density $f_{XY}(x, y)$ by the marginal $f_X(x)=0.35$ immediately gives the conditional probability density of $Y$ under the condition $X=x$, i.e. $f_{Y \mid X}(y)$, as shown in the figure:

Then, again using the lower-bound method from the 1D discrete case, we generate a sample $y$ conforming to the $f_{Y \mid X}(y)$ distribution. Together with the $x=2$ taken earlier, we finally obtain a 2D sample (x, y), and this sample follows the 2D joint probability density!

ok, that was a lot. Overall it breaks down into these steps:

  1. Normalize the HDR map to probability density
  2. Sum each column to obtain the marginal probability density of X, $f_X(x)$, and prefix-sum it to compute $F_X(x)$
  3. Divide all elements of each column by that column's $f_X(x)$ to obtain the conditional probability density $f_{Y \mid X}(y)$
  4. Prefix-sum each $f_{Y \mid X}(y)$ to obtain the distribution function $F_{Y \mid X}(y)$

Then the random variables x and y are generated:

  1. Roll 2 random numbers $\xi_1, \xi_2$
  2. Use $\xi_1$ with a lower bound on $F_X(x)$ to get x
  3. Select column x, and use $\xi_2$ with a lower bound on that column's conditional distribution $F_{Y \mid X}(y)$ to get y

However, a lower bound per sample seems a waste of time. We can instead preset some $\xi_1, \xi_2$, usually letting them sweep the whole $[0,1]^2$ plane, precompute the (x, y) corresponding to every $\xi_1, \xi_2$, and store the results in an hdrCache texture

The cache texture, with height rows and width columns, is defined so that $cache[i][j]$ holds the sample value (x, y) corresponding to $\xi_1=\frac{i}{height}, \xi_2=\frac{j}{width}$. Each time, the shader rolls two random numbers $\mu_1, \mu_2$, uses them as texture coordinates, and directly reads the cache in O(1) time to obtain the sample (x, y)

Using the original image resolution as the cache resolution, a simple python simulation (code here) gives the following results:

There seems to be no problem at all, so let's implement it in c++ and the shader!

9.2. Precomputing the hdr cache in c++

We use HDR Image Loader, a lightweight library that returns the floating-point values of an HDR image as a float array. Its result structure is defined as follows:

Its float array is stored row by row, so we can use:

for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        float R = cols[3 * (i * width + j)];
        float G = cols[3 * (i * width + j) + 1];
        float B = cols[3 * (i * width + j) + 2];
    }
}

to access the RGB components of pixel (i, j), where i is the row and j the column. We need to pass three numbers to the shader: the xy sample coordinates from the cache and the probability density pdf at that position, so a single RGB32F texture is exactly enough!

Below is the function that generates the HDR cache texture from the raw HDR pixel values. The pdf and cdf computations are the same as above. The returned float array can be passed directly to OpenGL as a texture. The RG channels of the cache texture hold the (x, y) obtained when the random numbers $\xi_1, \xi_2$ are used as texture coordinates, and the B channel holds the probability density pdf of the pixel:

// Calculate cache information related to HDR maps
float* calculateHdrCache(float* HDR, int width, int height) {

    float lumSum = 0.0;

    // Initialize the probability density pdf of row h and column w and count the total brightness
    std::vector<std::vector<float>> pdf(height);
    for (auto& line : pdf) line.resize(width);
    for (int i = 0; i < height; i++) {
        for (int j = 0; j < width; j++) {
            float R = HDR[3 * (i * width + j)];
            float G = HDR[3 * (i * width + j) + 1];
            float B = HDR[3 * (i * width + j) + 2];
            float lum = 0.2 * R + 0.7 * G + 0.1 * B;
            pdf[i][j] = lum;
            lumSum += lum;
        }
    }

    // Probability density normalization
    for (int i = 0; i < height; i++)
        for (int j = 0; j < width; j++)
            pdf[i][j] /= lumSum;
    
    // Accumulate each column to get the edge probability density of x
    std::vector<float> pdf_x_margin;
    pdf_x_margin.resize(width);
    for (int j = 0; j < width; j++) 
        for (int i = 0; i < height; i++) 
            pdf_x_margin[j] += pdf[i][j];

    // Calculate the edge distribution function of x
    std::vector<float> cdf_x_margin = pdf_x_margin;
    for (int i = 1; i < width; i++)
        cdf_x_margin[i] += cdf_x_margin[i - 1];

    // Calculate the conditional probability density function of y under X=x
    std::vector<std::vector<float>> pdf_y_condiciton = pdf;
    for (int j = 0; j < width; j++) 
        for (int i = 0; i < height; i++) 
            pdf_y_condiciton[i][j] /= pdf_x_margin[j];

    // Calculate the conditional probability distribution function of y under X=x
    std::vector<std::vector<float>> cdf_y_condiciton = pdf_y_condiciton;
    for (int j = 0; j < width; j++)
        for (int i = 1; i < height; i++)
            cdf_y_condiciton[i][j] += cdf_y_condiciton[i - 1][j];

    // Convert cdf_y_condiciton to column-major storage
    // cdf_y_condiciton[i] is the conditional probability distribution function of Y given X=i
    std::vector<std::vector<float>> temp = cdf_y_condiciton;
    cdf_y_condiciton = std::vector<std::vector<float>>(width);
    for (auto& line : cdf_y_condiciton) line.resize(height);
    for (int j = 0; j < width; j++)
        for (int i = 0; i < height; i++)
            cdf_y_condiciton[j][i] = temp[i][j];
    
    // Enumerate all xi_1, xi_2 and precompute the samples (x, y)
    // sample_x[i][j] is the x of (x,y) when xi_1 = i/height, xi_2 = j/width
    // sample_y[i][j] is the y of (x,y) when xi_1 = i/height, xi_2 = j/width
    // sample_p[i][j] is the probability density at pixel (i, j)
    std::vector<std::vector<float>> sample_x(height);
    for (auto& line : sample_x) line.resize(width);
    std::vector<std::vector<float>> sample_y(height);
    for (auto& line : sample_y) line.resize(width);
    std::vector<std::vector<float>> sample_p(height);
    for (auto& line : sample_p) line.resize(width);
    for (int j = 0; j < width; j++) {
        for (int i = 0; i < height; i++) {
            float xi_1 = float(i) / height;
            float xi_2 = float(j) / width;
            
            // Use xi_1 with lower_bound on cdf_x_margin to get sample x
            int x = std::lower_bound(cdf_x_margin.begin(), cdf_x_margin.end(), xi_1) - cdf_x_margin.begin();
            // Use xi_2 with lower_bound to get sample y given X = x
            int y = std::lower_bound(cdf_y_condiciton[x].begin(), cdf_y_condiciton[x].end(), xi_2) - cdf_y_condiciton[x].begin();

            // Store the probability density corresponding to the texture coordinates xy and xy positions
            sample_x[i][j] = float(x) / width;
            sample_y[i][j] = float(y) / height;
            sample_p[i][j] = pdf[i][j];
        }
    }
        
    // Integrate results into textures
    // R. Channel g stores samples (x,y) and channel B stores pdf(i, j)
    float* cache = new float[width * height * 3];
    //for (int i = 0; i < width * height * 3; i++) cache[i] = 0.0;

    for (int i = 0; i < height; i++) {
        for (int j = 0; j < width; j++) {
            cache[3 * (i * width + j)] = sample_x[i][j];        // R
            cache[3 * (i * width + j) + 1] = sample_y[i][j];    // G
            cache[3 * (i * width + j) + 2] = sample_p[i][j];    // B
        }
    }   

    return cache;
}

Admittedly, the code above is dull and rigid, but I have to admit that python's numpy handles multi-dimensional arrays far more elegantly than c++, which is completely outclassed here by py

Then change the HDR texture generation code to:

// Generate texture
GLuint hdrMap = ...
GLuint hdrCache = ...

// hdr panorama
HDRLoaderResult hdrRes;
bool r = HDRLoader::load("./HDR/circus_arena_4k.hdr", hdrRes);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB32F, hdrRes.width, hdrRes.height, 0, GL_RGB, GL_FLOAT, hdrRes.cols);

// hdr importance sampling cache
float* cache = calculateHdrCache(hdrRes.cols, hdrRes.width, hdrRes.height);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB32F, hdrRes.width, hdrRes.height, 0, GL_RGB, GL_FLOAT, cache);

Then in the shader we take 100 sobol numbers from dimensions 1 and 2 as the random numbers $\xi_1, \xi_2$ (real random numbers also work) and fetch the corresponding (x, y) sample points. Let's see the effect:

On the left is the original HDR image, in the middle its probability density (multiplied by 10000 to make it visible), and on the right the sample points we generated. You can see the samples land on the bright light sources, such as the lamps and the central stage. The effect is great!

9.3. Sampling hdr maps in shaders

Importance sampling of the hdr map is usually used to compute direct lighting. We first importance-sample a 2D point (x, y), then convert xy through spherical coordinates into a 3D direction L and shoot a ray to test for a hit. If the ray misses, we fetch the hdr map value in direction L and compute the contribution of the path

Spherical coordinates and hdr map coordinates are defined as follows: the range of $\varphi$ is ±π and the range of $\theta$ is ±0.5π. Note that this differs slightly from the conventional definition:

So roll two random numbers, sample the hdr cache to obtain the 2D point coordinates (x, y), and convert them to a 3D direction. The code is as follows:

// Sampling precomputed HDR cache
vec3 SampleHdr(float xi_1, float xi_2) {
    vec2 xy = texture2D(hdrCache, vec2(xi_1, xi_2)).rg; // x, y
    xy.y = 1.0 - xy.y; // flip y

    // Get angle
    float phi = 2.0 * PI * (xy.x - 0.5);    // [-pi ~ pi]
    float theta = PI * (xy.y - 0.5);        // [-pi/2 ~ pi/2]   

    // Spherical coordinate calculation direction
    vec3 L = vec3(cos(theta)*cos(phi), sin(theta), cos(theta)*sin(phi));

    return L;
}

With the exit direction, we also need to know the pdf of the hdr map in that direction; the B channel of the hdr cache stores this value. For a 3D direction $L \in [-1, 1]^3$ we convert it to a 2D texture coordinate $uv \in [0, 1]^2$, then sample cache.b to obtain the pdf. Pay attention to the y flip of the texture.

It is worth noting that the pdf obtained by sampling pixels directly is defined over the whole 2D image, while our estimator ultimately integrates over the hemisphere around the normal, so the integration domain must be converted. The 2D texture is wrapped directly onto the sphere:

First, normalize by the areas of the sphere and of the 2D image; then each area element of the image, when wrapped onto the sphere, must be multiplied by a $\sin(\theta)$ projection factor. This is easy to understand intuitively: the closer to the two poles of the sphere, the more the image is squeezed, as shown in the figure:

So the final conversion factor is:

$$\frac{wh}{2 \pi^2 \sin(\theta)}$$
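Here $w, h$ are the image dimensions and $\theta$ is the conventional polar angle. The factor is just the three changes of variable chained together, a brief derivation:

$$p(\omega) = p_{img}(i, j) \cdot \underbrace{wh}_{\text{pixel} \to [0,1]^2} \cdot \underbrace{\frac{1}{2\pi \cdot \pi}}_{(u,v) \to (\varphi,\, \theta)} \cdot \underbrace{\frac{1}{\sin\theta}}_{d\varphi\, d\theta \to d\omega} = p_{img}(i, j) \cdot \frac{wh}{2\pi^2 \sin\theta}$$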

Therefore the code that obtains the pdf of the hdr map from a direction vector is as follows. toSphericalCoord flips y when returning the texture coordinate, so we must flip back when computing theta; also note that since our $\theta$ is a latitude, the $\sin$ of the conventional polar angle equals the cosine of our $\theta$. In addition, hdrResolution needs to be passed in from C++ through a uniform variable:

// Convert the 3D vector v to the texture coordinate uv of HDR map
vec2 toSphericalCoord(vec3 v) {
    vec2 uv = vec2(atan(v.z, v.x), asin(v.y));
    uv /= vec2(2.0 * PI, PI);
    uv += 0.5;
    uv.y = 1.0 - uv.y;
    return uv;
}

// Input the light direction L to obtain the probability density of the HDR map in that direction
// hdr resolution is 4096 x 2048 --> hdrResolution = 4096
float hdrPdf(vec3 L, int hdrResolution) {
    vec2 uv = toSphericalCoord(normalize(L));   // direction vector to uv texture coordinate

    float pdf = texture2D(hdrCache, uv).b;      // sample the probability density
    float theta = PI * (0.5 - uv.y);            // theta range [-pi/2 ~ pi/2] (latitude)
    float sin_theta = max(cos(theta), 1e-10);   // sin(polar angle) = cos(latitude); clamp to avoid division by zero at the poles

    // conversion factor between the image domain and the sphere; w * h = hdrResolution * hdrResolution / 2 for a 2:1 panorama
    float p_convert = float(hdrResolution * hdrResolution / 2) / (2.0 * PI * PI * sin_theta);  
    
    return pdf * p_convert;
}

At the beginning of the main loop, add the code that importance-samples the hdr map directly. First sample the hdr cache to get an exit direction L, then do a visibility test; if L does not hit any object, accumulate the contribution of this light path. Note that the pdf of the hdr map is used here instead of the pdf of the BRDF, because the sample was drawn according to the distribution of hdr brightness:

// HDR environment map importance sampling    
Ray hdrTestRay;
hdrTestRay.startPoint = hit.hitPoint;
hdrTestRay.direction = SampleHdr(rand(), rand());

// Conduct an intersection test to determine whether there is occlusion
if(dot(N, hdrTestRay.direction) > 0.0) { // skip directions below the surface at point p, where N dot L < 0            
    HitResult hdrHit = hitBVH(hdrTestRay);
    
    // Sky light accumulates brightness only when there is no occlusion
    if(!hdrHit.isHit) {
        vec3 L = hdrTestRay.direction;
        vec3 color = hdrColor(L);
        float pdf_light = hdrPdf(L, hdrResolution);
        vec3 f_r = BRDF_Evaluate(V, N, L, hit.material);
        
        Lo += history * color * f_r * dot(N, L) / pdf_light;	// Cumulative brightness
    }
}
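The term accumulated above is the standard Monte Carlo estimator of direct lighting, written under the light-source pdf:

$$L_o \mathrel{+}= \text{history} \cdot \frac{f_r(V, L)\; L_{hdr}(L)\; (N \cdot L)}{pdf_{light}(L)}$$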

Compare the results with a single bounce, i.e. direct illumination only. Because the light source is small, the uniform sampling on the left can hardly hit it and is therefore very noisy, while the hdr importance sampling on the right hits the light source effectively and casts a correct shadow:

10. Multiple importance sampling

Directly sampling the hdr light source solves the problem that small light sources are hard to hit with uniform sampling. A defect remains, however: if the hdr map contains large light sources and the roughness of the bouncing surface is very low, the BRDF is nearly a delta function, with a value almost only in the perfect specular reflection direction and zero elsewhere. As a result, the directions L drawn by sampling a large-area light source rarely land where the BRDF has any value.

As shown in the figure, the red ray is the ideal specular reflection and only the red direction carries BRDF value, while the blue rays are sampled from the large area light source, so a correct estimate is hard to obtain:

As a result, a smooth plane cannot effectively sample the specular reflection of a large-area light source, which produces noise. The following figure is the result of sampling the light source directly and nothing else:

Here is the classic reference picture. The roughness of the plates increases from top to bottom. Directly sampling the BRDF performs well at low roughness, but as soon as the roughness rises the rays scatter and rarely hit a small light source. Directly sampling the light source performs well when the roughness is high and the light source area is large, but as the roughness drops the peak of the BRDF becomes sharper and the sampled directions rarely land where the BRDF has value:

The two sampling strategies are complementary, so their results need to be mixed according to some strategy. A natural idea is to mix them according to the ratio of the two sampling probability densities: since importance sampling draws samples according to luminance contribution, a larger pdf at a point means this path contributes more and is more trustworthy. A common heuristic computes the final weights from the squares of the pdfs (this is the power heuristic with exponent 2):

$$w_{light} = \frac{pdf_{light}^2}{pdf_{light}^2 + pdf_{brdf}^2}$$

$$w_{brdf} = \frac{pdf_{brdf}^2}{pdf_{light}^2 + pdf_{brdf}^2}$$

In other words, diffuse lobes are covered mainly by light source sampling and specular lobes mainly by BRDF sampling; equivalently, rough surfaces rely mostly on light importance sampling, smooth surfaces mostly on BRDF importance sampling. The mixing weight is obtained from the two probability densities with a simple function; note that the first parameter is the numerator:

// power heuristic with exponent 2: returns the weight of the strategy whose pdf is the first argument
float misMixWeight(float a, float b) {
    float t = a * a;
    return t / (b*b + t);
}

When sampling the light source directly, we must also evaluate the pdf of the BRDF in the sampled direction L, mix it with the pdf of the hdr map to obtain mis_weight, and use it to correct the contribution of this path:

// Obtain: 1. Illumination contribution, 2. pdf of environment map at this position, 3. BRDF function value, 4. pdf of BRDF in this direction
vec3 color = hdrColor(L);
float pdf_light = hdrPdf(L, hdrResolution);
vec3 f_r = BRDF_Evaluate(V, N, L, hit.material);
float pdf_brdf = BRDF_Pdf(V, N, L, hit.material);

// Multiple importance sampling
float mis_weight = misMixWeight(pdf_light, pdf_brdf);
Lo += mis_weight * history * color * f_r * dot(N, L) / pdf_light;

For indirect illumination, when a ray misses we likewise accumulate brightness from the hdr map, and multiple importance sampling is needed here as well. The code is similar to the above; just note that when computing mis_weight the numerator is pdf_brdf, because indirect rays are emitted according to the distribution of the BRDF:

// Miss        
if(!newHit.isHit) {
    vec3 color = hdrColor(L);
    float pdf_light = hdrPdf(L, hdrResolution);            
    
    // Multiple importance sampling
    float mis_weight = misMixWeight(pdf_brdf, pdf_light);  
    Lo += mis_weight * history * color * f_r * NdotL / pdf_brdf;

    break;
}
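Putting the two snippets together, each sampling strategy's contribution is scaled by its weight, and since $w_{light} + w_{brdf} = 1$ the combined estimator remains unbiased:

$$L_o \mathrel{+}= w_{light} \cdot \frac{f_r\, L_{hdr}\, (N \cdot L)}{pdf_{light}} + w_{brdf} \cdot \frac{f_r\, L_{hdr}\, (N \cdot L)}{pdf_{brdf}}$$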

Let's look at the result at 50 spp; on the right is the result with multiple importance sampling. No more words are needed, it wins by a landslide:

In addition, there is another problem: when the hdr background is extremely high frequency, indirect illumination paths whose first bounce is diffuse and second bounce is specular still produce some noise, because the BRDF of a smooth specular lobe is very large. Once such a path happens to hit a light source, it produces fireflies that even multiple importance sampling struggles to eliminate, as shown in the figure:

This problem troubled me for a long time until I finally found this blog online. The author says that simply ignoring the specular contribution in indirect illumination causes metallic objects to reflect abnormally (because metal has only a specular lobe). The original words of the blog are as follows:

The author's own solution is to manually set an indirect specular factor to control the brightness of the specular lobe in indirect light:

For the sake of correctness I did not add this code; it is recorded here only as one possible workaround:
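For reference, a minimal sketch of what that workaround could look like in this shader. Everything here is my own guess at the blog's intent, not code from either source: indirectSpecularFactor is a hypothetical uniform, and BRDF_Evaluate_Diffuse / BRDF_Evaluate_Specular are hypothetical split versions of BRDF_Evaluate:

uniform float indirectSpecularFactor;   // hypothetical tuning knob, e.g. 0.2

// inside the indirect-illumination branch, before accumulating:
vec3 f_r_diffuse  = BRDF_Evaluate_Diffuse(V, N, L, hit.material);   // diffuse lobe only (assumed split)
vec3 f_r_specular = BRDF_Evaluate_Specular(V, N, L, hit.material);  // specular lobe only (assumed split)
vec3 f_r = f_r_diffuse + indirectSpecularFactor * f_r_specular;     // damp indirect specular to suppress fireflies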

11. Summary and postscript

The optimization of ray tracing is a process of constantly struggling with noise. We used the sobol sequence to generate more uniform samples, importance sampling to optimize the sampling strategy, and finally heuristic multiple importance sampling for fine tuning. It is the combination of this mathematics, probability theory and statistics that renders a colorful teapot:

This is also why graphics & rendering is called hardcore, but the process of exploration and learning is genuinely attractive. I consulted many blogs and much code; every day I went to foreign universities' course pages to grab PPTs and to GitHub to grab code... I often got back to the dormitory to shower at 10:30, watched Bilibili until 12:00, then looked up references, read blogs, wrote code and debugged until 2:00, and ran to the library at 8:30 the next morning, where I would see Uncle Zhang Yu. This blog was completed intermittently in that fashion.

Honestly, code written under such a rushed schedule is rough, and there are some fatal bugs. For example, the Stanford Dragon at the beginning: the normal interpolation of the model has a problem, so I had to use the triangle's face normal instead of the interpolated normal. Fortunately the model has as many as 202520 faces, so the difference is invisible:

Barring surprises, there will be one more article, on denoising and real-time rendering, which may have to wait until after the end of December. That's it.


12. References

[1] Brent Burley, Walt Disney Animation Studios 2012, "Physically-Based Shading at Disney"

[2] Stephen Joe and Frances Kuo, "Sobol sequence generator"

[3] Stephen Joe and Frances Y. Kuo, "Notes on generating Sobol sequences"

[4] cs184.eecs.berkeley.edu, "Project 3-2, Part 3: Environment Map Lights"

[5] blender (document), "Cycles Sampling Patterns, Sobol"

[6] Shih-Chin, "Implementing Disney Principled BRDF in Arnold"

[7] Matt Pharr, Wenzel Jakob, and Greg Humphreys, "Physically Based Rendering: From Theory To Implementation"

[8] knightcrawler25 (GitHub), "GLSL-PathTracer"

[9] Wendao qiuer (Zhihu), "Low difference sequence (I) - definition and properties of common sequences"

[10] ksg fk, "[Nori] task 5: path tracking, micro surface model and multiple importance sampling"

[11] Image Synthesis (CS474 F16), "Assignment 4 – Image-based Lighting"

[12] Dezeming (CSDN), "Multiple importance sampling (MIS) and ray tracing technology"
