Category: Technical Blog

  • Games202 Assignment 1 Soft Shadow Implementation

    Games202 Assignment 1 Soft Shadow Implementation

    Contents of this article: JS and WebGL related knowledge, 2-pass shadow algorithm, BIAS to alleviate self-occlusion, PCF algorithm, PCSS, object movement.

    Project source code:

    GitHub – Remyuu/GAMES202-Homework: GAMES202-Homework​

    The picture above is fun to draw.


    Foreword

    Since I know nothing about JS and WebGL, I can only use console.log() when I encounter problems.

    In addition to the content required by the assignment, I also ran into some questions while coding, and I hope you can answer them QAQ.

    1. How do we achieve a dynamic point-light shadow effect? We would need point-light shadow techniques, i.e. omnidirectional shadow maps. How is that done concretely?
    2. The poissonDiskSamples function does not really produce a Poisson disk distribution, does it?

    Framework Modification

    Before starting, please apply a few corrections to the assignment framework. Original forum post describing the fixes: https://games-cn.org/forums/topic/zuoyeziliao-daimakanwu/

    • The unpack algorithm provided by the framework is not implemented accurately. Without any bias it causes severe banding (the ground turns half white and half black, rather than the typical z-fighting pattern), which interferes with debugging the assignment to some extent.
    // homework1/src/shaders/shadowShader/shadowFragment.glsl
    vec4 pack (float depth) {
        // Use RGBA 4 bytes, 32 bits in total, to store the z value, and the precision of 1 byte is 1/255
        const vec4 bitShift = vec4(1.0, 255.0, 255.0 * 255.0, 255.0 * 255.0 * 255.0);
        const vec4 bitMask = vec4(1.0/255.0, 1.0/255.0, 1.0/255.0, 0.0);
        // gl_FragCoord: the coordinates of the fragment, fract(): returns the decimal part of the value
        vec4 rgbaDepth = fract(depth * bitShift); // Calculate the z value of each point
        rgbaDepth -= rgbaDepth.gbaa * bitMask; // Cut off the values which do not fit in 8 bits
        return rgbaDepth;
    }
    
    // homework1/src/shaders/phongShader/phongFragment.glsl
    float unpack(vec4 rgbaDepth) {
        const vec4 bitShift = vec4(1.0, 1.0/255.0, 1.0/(255.0*255.0), 1.0/(255.0*255.0*255.0));
        return dot(rgbaDepth, bitShift);
    }
    • You also need to add a gl.clear() call to actually clear the screen:
    // homework1/src/renderers/WebGLRenderer.js
    gl.clearColor(0.0, 0.0, 0.0,1.0);// Clear to black, fully opaque
    gl.clearDepth(1.0);// Clear everything
    gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

    The most basic knowledge of JS

    Variables

    • In JavaScript, variables and constants are mainly declared with three keywords: var, let and const.
    • var declares a variable that is visible throughout the enclosing function (function scope).
    • let behaves similarly to var and also declares a variable, but its scope is limited to the enclosing block (block scope), such as the block of a for loop or an if statement.
    • const is used to declare a constant; its scope is also block level.
    • It is recommended to use let and const instead of var, because block-level scope matches the scoping rules of most programming languages and is easier to understand and predict. A small example follows.
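
    A quick sketch (illustrative only) of the difference between function scope and block scope:

    // scope-demo.js (illustrative)
    function scopeDemo() {
        if (true) {
            var a = 1;   // function-scoped: visible anywhere inside scopeDemo()
            let b = 2;   // block-scoped: only visible inside this if-block
            const c = 3; // block-scoped constant
        }
        console.log(a); // 1
        // console.log(b); // ReferenceError: b is not defined
        // console.log(c); // ReferenceError: c is not defined
    }
    scopeDemo();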

    Classes

    A basic JavaScript class structure is as follows:

    class MyClass {
      constructor(parameter1, parameter2) {
        this.property1 = parameter1;
        this.property2 = parameter2;
      }
      method1() {
        // method body
      }
      static sayHello() {
        console.log('Hello!');
      }
    }

    Create an instance:

    let myInstance = new MyClass('value1', 'value2');
    myInstance.method1(); //Calling class method

    You can also call static methods directly on the class (without creating an instance):

    MyClass.sayHello();  // "Hello!"

    Brief description of project process

    The program entry is engine.js, and the main function is GAMES202Main. First, initialize WebGL related content, including the camera, camera interaction, renderer, light source, object loading, user GUI interface and the most important main loop.

    During the object loading process, loadOBJ.js is called. It first loads the corresponding GLSL from file, then constructs the Phong material and the corresponding shadow material used for shadow mapping.

    // loadOBJ.js
    case 'PhongMaterial':
        Material = buildPhongMaterial(colorMap, mat.specular.toArray(), light, Translation, Scale, "./src/shaders/phongShader/phongVertex.glsl", "./src/shaders/phongShader/phongFragment.glsl");
        shadowMaterial = buildShadowMaterial(light, Translation, Scale, "./src/shaders/shadowShader/shadowVertex.glsl", "./src/shaders/shadowShader/shadowFragment.glsl");
        break;
    }

    Then a MeshRender is created for each of them: one for the 2-pass shadow map and one for the regular Phong material. The specific code is as follows:

    // loadOBJ.js
    Material.then((data) => {
        // console.log("Now making surface material")
        let meshRender = new MeshRender(Renderer.gl, mesh, data);
        Renderer.addMeshRender(meshRender);
    });
    shadowMaterial.then((data) => {
        // console.log("Now making shadow material")
        let shadowMeshRender = new MeshRender(Renderer.gl, mesh, data);
        Renderer.addShadowMeshRender(shadowMeshRender);
    });

    Note that MeshRender is fairly generic: it accepts any type of material as its parameter. How are the two cases distinguished? By checking whether the incoming material's frameBuffer is empty: if it is, the surface material is rendered to the screen; otherwise the shadow map is rendered into that framebuffer. In the draw() function of MeshRender.js, you can see the following code:

    // MeshRender.js
    if (this.Material.frameBuffer != null) {
        // Shadow map
        gl.viewport(0.0, 0.0, resolution, resolution);
    } else {
        gl.viewport(0.0, 0.0, window.screen.width, window.screen.height);
    }

    After the shadow is generated by MeshRender, it is pushed into the renderer. The corresponding implementation can be found in WebGLRenderer.js:

    addShadowMeshRender(mesh) { this.shadowMeshes.push(mesh); }

    Finally, enter the mainLoop() main loop to update the screen frame by frame.

    Detailed explanation of the project process

    This chapter starts from a small question and explores how the fragment shader is constructed. This thread connects almost the entire project, and it is also the order in which I find the project most comfortable to read.

    Where does the GLSL actually run? — Tracing the code flow starting from the fragment shader

    Above we did not explain in detail how the GLSL files are used; here we go through it step by step.

    First, in loadOBJ.js, the .glsl files are referenced for the first time by their file paths:

    // loadOBJ.js - function loadOBJ()
    Material = buildPhongMaterial(colorMap, mat.specular.toArray(), light, Translation, Scale, "./src/shaders/phongShader/phongVertex.glsl", "./src/shaders/phongShader/phongFragment.glsl");
    shadowMaterial = buildShadowMaterial(light, Translation, Scale, "./src/shaders/shadowShader/shadowVertex.glsl", "./src/shaders/shadowShader/shadowFragment.glsl");

    Take phongFragment.glsl as an example. In PhongMaterial.js, the buildPhongMaterial function loads the GLSL code from disk via getShaderString and passes it as a constructor argument when building a PhongMaterial object. During construction, PhongMaterial calls super() to invoke the constructor of its parent class Material.js, i.e. it hands the GLSL code over to Material.js:

    // PhongMaterial.js
    super({...}, [], ..., fragmentShader);
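
    For reference, a rough sketch of what buildPhongMaterial does (simplified; the parameter order and body are assumptions based on the calls shown above, and the real framework code may differ in details):

    // Simplified sketch of buildPhongMaterial (not necessarily the framework's exact code)
    async function buildPhongMaterial(color, specular, light, translate, scale, vertexPath, fragmentPath) {
        const vertexShader = await getShaderString(vertexPath);     // read the .glsl text from disk
        const fragmentShader = await getShaderString(fragmentPath); // read the .glsl text from disk
        return new PhongMaterial(color, specular, light, translate, scale, vertexShader, fragmentShader);
    }

    Because buildPhongMaterial is an async function it returns a Promise, which is exactly why loadOBJ.js can call Material.then(...) on its result.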

    In C++, a subclass constructor has to match the parent constructor it calls; JavaScript is looser here. The parent class constructor takes 5 parameters, but only 4 arguments are actually passed, which is completely fine (the missing one is simply undefined).

    In Material.js, the subclass passes the GLSL code in through the fourth constructor parameter, #fsSrc. At this point the journey of the GLSL code pauses; the next function waiting for it is compile().

    // Material.js
    this.#fsSrc = fsSrc;
    ...
    compile(gl) {
        return new Shader(..., ..., this.#fsSrc,{...});
    }

    So when exactly is compile() called? Back to the flow of loadOBJ.js: now that buildPhongMaterial() has fully executed, the next step is the then() part mentioned in the previous section.

    Note that loadOBJ() is just a function, not an object!

    // loadOBJ.js
    Material.then((data) => {
        let meshRender = new MeshRender(Renderer.gl, mesh, data);
        Renderer.addMeshRender(meshRender);
        Renderer.ObjectID[ObjectID][0].push(Renderer.meshes.length - 1);
    });

    When constructing a MeshRender object, compile() is called:

    // MeshRender.js
    constructor(gl, mesh, Material) {
    ...
        this.shader = this.Material.compile(gl);
    }
    // Material.js
    compile(gl) {
        return new Shader(..., ..., this.#fsSrc,{...});
    }

    Next, let's take a closer look at the structure of shader.js. Material supplies all four constructor parameters when constructing the Shader object. Here we focus on fsSrc, i.e. we keep following the fate of the GLSL code.

    // shader.js
    constructor(gl, vsSrc, fsSrc, shaderLocations) {
        ...
        const fs = this.compileShader(fsSrc, ...);
        ...
    }

    When the Shader object is constructed, the fragment shader is compiled with the compileShader() function, which creates a WebGLShader object via gl.createShader(). The code is as follows:

    // shader.js
    compileShader(shaderSource, shaderType) {
        const gl = this.gl;
        var shader = gl.createShader(shaderType);
        gl.shaderSource(shader, shaderSource);
        gl.compileShader(shader);
    
        if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
            console.error(shaderSource);
            console.error('shader compiler error:\n' + gl.getShaderInfoLog(shader));
        }
    
        return shader;
    };

    What is this gl? It is passed to shader.js as the parameter renderer.gl when loadOBJ() constructs the MeshRender object. And renderer is the first parameter of loadOBJ(), which is passed in engine.js.

    In fact, renderer in loadOBJ.js is a WebGLRenderer object, and its gl member is created in engine.js:

    // engine.js
    const gl = canvas.getContext('webgl');

    gl can be understood as the WebGL rendering context obtained from the canvas element in index.html; it is the interface through which we talk to the WebGL API. The canvas itself is declared in index.html:

    <!-- index.html -->
    <canvas id="glcanvas">
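
    Putting the two pieces together (an illustrative sketch, not the framework's exact code): grab the canvas, request a WebGL context, and guard against browsers without WebGL support.

    const canvas = document.querySelector('#glcanvas');
    const gl = canvas.getContext('webgl');
    if (!gl) {
        console.error('WebGL is not supported in this browser.');
    }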

    WebGL recommended references:

    1. https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API
    2. https://webglfundamentals.org
    3. https://www.w3cschool.cn/webgl/vjxu1jt0.html

    Tips: The website has a corresponding Chinese version, but it is recommended to read the English version if you are capable~ WebGL API:

    1. https://developer.mozilla.org/en-US/docs/Web/API
    2. https://webglfundamentals.org/docs/

    After knowing what gl is, it is natural to find out where and how the project framework is connected with WebGL.

    // Shader.js
    compileShader(shaderSource, shaderType) {
        const gl = this.gl;
        var shader = gl.createShader(shaderType);
        gl.shaderSource(shader, shaderSource);
        gl.compileShader(shader);
    
        if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
            console.error(shaderSource);
            console.error('shader compiler error:\n' + gl.getShaderInfoLog(shader));
        }
    
        return shader;
    };

    That is to say, all gl methods are called through the WebGL API. gl.createShader is the first WebGL API we come into contact with.

    We just need to know that createShader() returns a WebGLShader object; we will explain this in more detail later. For now, let's focus on where shaderSource goes.

    • gl.shaderSource(shader, source): sets the GLSL source code (a string) on the given WebGLShader object.

    In other words, the GLSL source code we have been tracking is handed to the WebGLShader through the gl.shaderSource function.

    The WebGLShader is then compiled with gl.compileShader() into a form that can be used by a WebGLProgram object.

    Simply put, a WebGLProgram is a GLSL program made of compiled WebGL shaders; it must contain at least a vertex shader and a fragment shader. In WebGL you create one or more WebGLProgram objects, each containing a specific set of rendering instructions, and by switching between different WebGLPrograms you can produce a variety of rendered results.

    The if statement checks whether the shader compiled successfully; if compilation fails, the code inside the braces runs. Finally, the shader object is returned after compilation (or attempted compilation).

    At this point, we have completed the work of taking the GLSL file from the hard disk and compiling it into a shader object.

    But the rendering process is not over yet. Let's go back to the construction of the Shader object:

    // Shader.js
    class Shader {
        constructor(gl, vsSrc, fsSrc, shaderLocations) {
            this.gl = gl;
            const vs = this.compileShader(vsSrc, gl.VERTEX_SHADER);
            const fs = this.compileShader(fsSrc, gl.FRAGMENT_SHADER);
    
            this.program = this.addShaderLocations({
                glShaderProgram: this.linkShader(vs, fs),
            }, shaderLocations);
        }
        ...

    Although we just explained the GLSL compilation process of the fragment shader, the vertex shader is quite similar, so it is omitted here.


    Here we introduce how the shaders are linked using linkShader(). The code follows the list below.

    1. First, create a program object, a WebGLProgram, with gl.createProgram().
    2. Attach the compiled vertex shader and fragment shader, vs and fs, to the program. This step is called attaching shaders; concretely they are attached to the WebGLProgram with gl.attachShader().
    3. Link the WebGLProgram with gl.linkProgram(). This produces an executable program that combines the previously attached shaders. This step is called linking.
    4. Finally, check the link status and return the program object.
    // Shader.js
    linkShader(vs, fs) {
        const gl = this.gl;
        var prog = gl.createProgram();
        gl.attachShader(prog, vs);
        gl.attachShader(prog, fs);
        gl.linkProgram(prog);
    
        if (!gl.getProgramParameter(prog, gl.LINK_STATUS)) {
            abort('shader linker error:\n' + gl.getProgramInfoLog(prog));
        }
        return prog;
    };

    A WebGLProgram can be thought of as a container for shaders, which contains all the information and instructions needed to transform 3D data into 2D pixels on the screen.
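
    To make the "container" intuition concrete, here is a minimal, illustrative sketch of how such a linked program is eventually used at draw time (program and vertexCount are placeholder names for this example):

    // Illustrative only: select the linked program, then issue a draw call with it.
    gl.useProgram(program);                      // make this vertex+fragment shader pair current
    gl.drawArrays(gl.TRIANGLES, 0, vertexCount); // rasterize vertexCount vertices with that program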


    After obtaining the linked program glShaderProgram, it is packaged together with the shaderLocations object.

    Simply put, the shaderLocations object contains two properties

    • Attributes are "individual" data (such as information about each vertex)
    • Uniforms are "overall" data (such as information about a light)

    The framework packages the loading process into addShaderLocations(). Simply put, after this step, when you need to assign values to these uniforms and attributes, you can directly operate through the acquired locations without having to query the locations every time.

    addShaderLocations(Result, shaderLocations) {
        const gl = this.gl;
        Result.uniforms = {};
        Result.attribs = {};
    
        if (shaderLocations && shaderLocations.uniforms && shaderLocations.uniforms.length) {
            for (let i = 0; i < shaderLocations.uniforms.length; ++i) {
                Result.uniforms = Object.assign(Result.uniforms, {
                    [shaderLocations.uniforms[i]]: gl.getUniformLocation(Result.glShaderProgram, shaderLocations.uniforms[i]),
                });
            }
        }
        if (shaderLocations && shaderLocations.attribs && shaderLocations.attribs.length) {
            for (let i = 0; i < shaderLocations.attribs.length; ++i) {
                Result.attribs = Object.assign(Result.attribs, {
                    [shaderLocations.attribs[i]]: gl.getAttribLocation(Result.glShaderProgram, shaderLocations.attribs[i]),
                });
            }
        }
    
        return Result;
    }
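
    Once the locations are cached, uploading data later is a single call using the stored handle instead of a per-frame getUniformLocation lookup. For example (a sketch following the access path used by render() later on; uLightPos is one of the uniforms the framework declares):

    gl.useProgram(shader.program.glShaderProgram);
    gl.uniform3fv(shader.program.uniforms.uLightPos, lightPos); // reuse the cached location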

    Let's review what has been done so far: we have successfully constructed a compiled (or at least compilation-attempted) Shader object for MeshRender:

    // MeshRender.js - construct()
    this.shader = this.Material.compile(gl);

    At this point, the task of loadOBJ has been successfully completed. In engine.js, such loading needs to be done three times:

    // loadOBJ(renderer, path, name, objMaterial, transform, meshID);
    loadOBJ(Renderer, 'assets/mary/', 'Marry', 'PhongMaterial', obj1Transform);
    loadOBJ(Renderer, 'assets/mary/', 'Marry', 'PhongMaterial', obj2Transform);
    loadOBJ(Renderer, 'assets/floor/', 'floor', 'PhongMaterial', floorTransform);

    Next, we come to the main loop of the program. That is, one loop represents one frame:

    // engine.js
    loadOBJ(...);
    ...
    function mainLoop() {...}
    ...

    Main program loop — mainLoop()

    In fact, when mainLoop is executed, the function will call itself again, forming an infinite loop. This is the basic mechanism of the so-called game loop or animation loop.

    // engine.js
    function mainLoop() {
        cameraControls.update();
        renderer.render();
        requestAnimationFrame(mainLoop);
    };
    requestAnimationFrame(mainLoop);

    cameraControls.update(); Updates the camera's position or orientation, for example in response to user input.

    renderer.render(); The scene is rendered or drawn to the screen. The specific content and method of rendering depends on the implementation of the renderer object.

    The benefit of requestAnimationFrame is that it will try to synchronize with the screen refresh rate, which can provide smoother animations and higher performance because it will not execute code unnecessarily between screen refreshes.

    For more information about the requestAnimationFrame() function, refer to the following article: https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame
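
    As a small aside (an illustrative sketch, not the framework's code): requestAnimationFrame passes its callback a high-resolution timestamp in milliseconds, which can be used to measure per-frame time if you ever need it.

    let last = 0;
    function mainLoop(now) {
        const dt = (now - last) / 1000; // seconds since the previous frame (could drive animation)
        last = now;
        cameraControls.update();
        renderer.render();
        requestAnimationFrame(mainLoop);
    }
    requestAnimationFrame(mainLoop);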

    Next, focus on the operation of the render() function.

    render()

    This is a typical process of light source rendering, shadow rendering and final camera perspective rendering. I will not go into details here and will move on to the multi-light source section later.

    // WebGLRenderer.js - render()
    const gl = this.gl;
    
    gl.clearColor(0.0, 0.0, 0.0, 1.0); // The default value of shadowmap is white (no occlusion), which solves the problem of shadows on the ground edge (because the ground cannot be sampled, the default value of 0 will be considered as occluded)
    gl.clearDepth(1.0);// Clear everything
    gl.enable(gl.DEPTH_TEST); // Enable depth testing
    gl.depthFunc(gl.LEQUAL); // Near things obscure far things
    
    console.assert(this.lights.length != 0, "No light");
    console.assert(this.lights.length == 1, "Multiple lights");
    
    for (let l = 0; l < this.lights.length; l++) {
        gl.bindFramebuffer(gl.FRAMEBUFFER, this.lights[l].entity.fb);
        gl.clear(gl.DEPTH_BUFFER_BIT);
        // Draw light
        // TODO: Support all kinds of transform
        this.lights[l].meshRender.mesh.transform.translate = this.lights[l].entity.lightPos;
        this.lights[l].meshRender.draw(this.camera);
    
        // Shadow pass
        if (this.lights[l].entity.hasShadowMap == true) {
            for (let i = 0; i < this.shadowMeshes.length; i++) {
                this.shadowMeshes[i].draw(this.camera);
            }
        }
    }
    // Camera pass
    for (let i = 0; i < this.meshes.length; i++) {
        this.gl.useProgram(this.meshes[i].shader.program.glShaderProgram);
        this.gl.uniform3fv(this.meshes[i].shader.program.uniforms.uLightPos, this.lights[0].entity.lightPos);
        this.meshes[i].draw(this.camera);
    }

    GLSL Quick Start - Analyzing the Fragment Shader FragmentShader.glsl

    Above we discussed how to load GLSL. This section introduces the concept and practical usage of GLSL.

    When rendering with WebGL, we need at least one vertex shader and one fragment shader. In the previous section we took the fragment shader as an example to show how the framework reads a GLSL file from disk into the renderer. Next, we again take the fragment shader (i.e. phongFragment.glsl) as the example to walk through how GLSL is written.

    What is the use of FragmentShader.glsl?

    The role of the fragment shader is to render the correct color for the current pixel when rasterizing. The following is the simplest form of a fragment shader, which contains a main() function, in which the color of the current pixel gl_FragColor is specified.

    void main(void){
        ...
        gl_FragColor = vec4(Color, 1.0);
    }

    What data does the Fragment Shader accept?

    The fragment shader needs input data, which is mainly provided in the following three ways. For concrete usage, see Appendix 1.6:

    1. Uniforms (global variables): These are values that remain constant for all vertices and fragments within a single draw call. Common examples include transformation matrices (translation, rotation, etc.), light parameters, and material properties. Because they are uniform across the whole draw call, they are called "uniforms".
    2. Textures: Textures are arrays of image data that can be sampled by the fragment shader to get color, normal, or other types of information for each fragment.
    3. Varyings: These are the values output by the vertex shader, which are interpolated between the vertices of a graphics primitive (such as a triangle) and passed to the fragment shader. This allows us to calculate values (such as transformed positions or vertex colors) in the vertex shader and interpolate between fragments for use in the fragment shader.

    Uniforms and Varyings were used in the project.

    GLSL Basic Syntax

    I won't go over the basic usage here, because that would be too boring. Let's just look at the project:

    // phongFragment.glsl - PCF pass
    void main(void) {
        // Declare variables
        float visibility;     // Visibility (for shadows)
        vec3 shadingPoint;    // Shading point position in the light's [0,1] shadow-map space
        vec3 phongColor;      // Calculated Phong lighting color
    
        // Transform vPositionFromLight to the [0,1] range
        shadingPoint = vPositionFromLight.xyz / vPositionFromLight.w; // Perspective divide -> NDC in [-1,1]
        shadingPoint = shadingPoint * 0.5 + 0.5; // Map [-1,1] to [0,1]
    
        // Calculate visibility (shadows).
        visibility = PCF(uShadowMap, vec4(shadingPoint, 1.0)); // Use PCF (Percentage Closer Filtering) technology
    
        // Use the blinnPhong() function to calculate the Phong lighting color
        phongColor = blinnPhong();
    
        // Calculate the final fragment color, multiply the Phong lighting color by the visibility to get the fragment color that takes shadows into account
        gl_FragColor = vec4(phongColor * visibility, 1.0);
    }

    Like C language, GLSL is a strongly typed language. You cannot assign a value like this: float visibility = 1;, because 1 is of type int.

    Vectors and matrices

    In addition, glsl has many special built-in types, such as floating-point type vectors vec2, vec3 and vec4, and matrix types mat2, mat3 and mat4.

    The access method of the above data is also quite interesting.

    • .xyzw: Usually used to represent points or vectors in three-dimensional or four-dimensional space.
    • .rgba: Used when the vector represents a color, where r represents red, g represents green, b represents blue, and a represents transparency.
    • .stpq: Used when vectors are used as texture coordinates.

    Therefore,

    • v.x, v[0], v.r and v.s all refer to the first component of the vector.
    • v.y, v[1], v.g and v.t all refer to the second component of the vector.
    • For vec3 and vec4, v.z, v[2], v.b and v.p all refer to the third component.
    • For vec4, v.w, v[3], v.a and v.q all refer to the fourth component.

    You can even access these types of data using a technique called "component reassembly" or "component selection":

    • Repeating components: v.yyyy yields a new vec4 in which every component is the y component of the original v; this has the same effect as vec4(v.y, v.y, v.y, v.y).
    • Reordering components: v.bgra yields a new vec4 whose components are taken from v in the order b, g, r, a; this is the same as vec4(v.b, v.g, v.r, v.a).

    When constructing a vector or matrix you can provide multiple components at once, for example:

    • vec4(v.rgb, 1) is equivalent to vec4(v.r, v.g, v.b, 1)
    • vec4(1) is equivalent to vec4(1, 1, 1, 1)

    Reference: GLSL language specification https://www.khronos.org/files/opengles_shading_language.pdf

    Matrix storage method

    These tips can be found in the glMatrix docs: https://glmatrix.net/docs/mat4.js.html. In addition, if we look closely we will find that this library uses column-major storage; matrices in WebGL and GLSL are likewise stored column by column.

    To move an object to a new position, you can use the mat4.translate() function, which takes three parameters: a 4×4 output matrix out, an input 4×4 matrix a, and a 3-component translation vector v.

    Plain matrix multiplication is done with mat4.multiply(), scaling with mat4.scale(), the view ("look at") matrix with mat4.lookAt(), and the orthographic projection matrix with mat4.ortho().

    Implementing matrix transformation of light source camera

    If we used perspective projection, we would need to warp the frustum into a regular, orthographic-like volume:

    But since we use orthographic projection here, the depth values stay linear and the Shadow Map keeps as much precision as possible.

    // DirectionalLight.js - CalcLightMVP()
    let lightMVP = mat4.create();
    let modelMatrix = mat4.create();
    let viewMatrix = mat4.create();
    let projectionMatrix = mat4.create();
    
    // Model transform
    mat4.translate(modelMatrix, modelMatrix, translate);
    mat4.scale(modelMatrix, modelMatrix, scale);
    
    // View transform
    mat4.lookAt(viewMatrix, this.lightPos, this.focalPoint, this.lightUp);
    
    // Projection transform
    let left = -100.0, right = -left, bottom = -100.0, top = -bottom, 
        near = 0.1, far = 1024.0;  
        // Set these values as per your requirement
    mat4.ortho(projectionMatrix, left, right, bottom, top, near, far);
    
    
    mat4.multiply(lightMVP, projectionMatrix, viewMatrix);
    mat4.multiply(lightMVP, lightMVP, modelMatrix);
    
    return lightMVP;

    2-Pass Shadow Algorithm

    Before implementing the two-pass algorithm, let’s take a look at how the main() function is called.

    // phongFragment.glsl
    void main(void){  
      vec3 shadingPoint = vPositionFromLight.xyz / vPositionFromLight.w;
      shadingPoint = shadingPoint*0.5+0.5;// Normalize to [0,1]
    
      float visibility = 1.0;
      visibility = useShadowMap(uShadowMap, vec4(shadingPoint, 1.0));
    
      vec3 phongColor = blinnPhong();
    
      gl_FragColor=vec4(phongColor * visibility,1.0);
    }

    So the question is, where does vPositionFromLight come from? It is computed in the vertex shader.

    Unified space coordinates

    In layman's terms, the world coordinates of the scene's vertices are converted to new coordinates corresponding to the NDC space of the light camera. The purpose is to retrieve the required depth value in the light camera's space when rendering the shadow of a shading point of the main camera.

    vPositionFromLight is the homogeneous coordinate of a point as seen from the light source. It lives in the light's orthographic clip space, with components in the range [-w, w]. It is computed in phongVertex.glsl, whose job is to process the input vertex data and transform the vertices into clip-space coordinates using the MVP matrix computed in the previous chapter. vPositionFromLight is then converted to NDC to obtain shadingPoint, and the shading point that needs a shadow test is passed into the useShadowMap function. The relevant vertex-shader code is attached:

    // phongVertex.glsl - main()
    vFragPos = (uModelMatrix * vec4(aVertexPosition, 1.0)).xyz;
    vNormal = (uModelMatrix * vec4(aNormalPosition, 0.0)).xyz;
    
    gl_Position = uProjectionMatrix * uViewMatrix * uModelMatrix *
                vec4(aVertexPosition, 1.0);
    
    vTextureCoord = aTextureCoord;
    vPositionFromLight = uLightMVP * vec4(aVertexPosition, 1.0);

    phongVertex.glsl is loaded in loadOBJ.js together with phongFragment.glsl.

    Compare depth values

    Next, implement the useShadowMap() function. The purpose of this function is to determine whether a fragment (pixel) is in the shadow.

    texture2D() is a GLSL built-in function used to sample a 2D texture.

    The unpack() and pack() functions in the code framework are set to increase numerical precision. The reasons are as follows:

    • Depth is a continuous floating-point value whose range and precision exceed what a single 8-bit channel can provide. Storing the depth directly in one 8-bit channel loses a lot of precision and produces incorrect shadows. Therefore we make full use of the other three channels, i.e. we encode the depth value across multiple channels: by distributing different parts of the depth value over the R, G, B and A channels we can store it with much higher precision, and decode it from those four channels when it is needed. A small sketch follows.
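
    To make the precision argument concrete, here is a rough JavaScript sketch (not framework code) that mirrors the GLSL pack()/unpack() shown at the beginning of the article, with an explicit 8-bit quantization step added so the precision difference becomes visible:

    // Illustrative sketch only: mirror the GLSL pack/unpack in JS and quantize each
    // channel to 8 bits, to compare against storing the depth in one channel.
    const fract = x => x - Math.floor(x);
    const quantize8 = v => Math.round(v * 255) / 255; // what an 8-bit channel can store

    function pack(depth) {
        const bitShift = [1.0, 255.0, 255.0 * 255.0, 255.0 * 255.0 * 255.0];
        const raw = bitShift.map(s => fract(depth * s));
        // subtract the part already carried by the next channel (the bitMask step), then quantize
        return raw.map((v, i) => quantize8(i < 3 ? v - raw[i + 1] / 255.0 : v));
    }

    function unpack(rgba) {
        const bitShift = [1.0, 1.0 / 255.0, 1.0 / (255.0 * 255.0), 1.0 / (255.0 * 255.0 * 255.0)];
        return rgba.reduce((sum, v, i) => sum + v * bitShift[i], 0.0);
    }

    const d = 0.123456789;
    console.log(Math.abs(unpack(pack(d)) - d)); // tiny error when spread over four channels
    console.log(Math.abs(quantize8(d) - d));    // roughly 2e-3 error with a single 8-bit channel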

    The value sampled from the shadow map (closestDepth after unpack()) is the depth of the nearest occluder as seen from the light.

    Finally, closestDepth is compared with the current depth. If the stored occluder depth (closestDepth) is greater than the depth of the fragment being shaded (shadingPoint.z), the current shading point is not occluded and visibility returns 1.0. In addition, to reduce shadow acne and self-occlusion, the stored occluder depth can be nudged slightly further away, i.e. a small EPS (plus a bias, see below) is added before the comparison.

    // phongFragment.glsl
    float useShadowMap(sampler2D shadowMap, vec4 shadingPoint){
      // Retrieve the closest depth value from the light's perspective using the fragment's position in light space.
      float closestDepth = unpack(texture2D(shadowMap, shadingPoint.xy));
      // Compare the fragment's depth with the closest depth to determine if it's in shadow.
      return (closestDepth + EPS + getBias(.4)> shadingPoint.z) ? 1.0 : 0.0;
    }

    Actually, there is still a problem. Our current light camera is not omnidirectional, which means that its illumination range is only a small part. If the model is within the range of the lightCam, then the picture is completely correct.

    But when the model is outside the range of lightCam, it should not take part in the useShadowMap calculation. We have not implemented that logic yet; in other words, for positions outside the range covered by lightCam's MVP transform, the computation may produce unexpected errors. Take another look at my crude hand-drawn diagram:

    In the previous section, we defined zFar, zNear and other information in the directional light source script. The following code is shown:

    // DirectionalLight.js - CalcLightMVP()
    let left = -100.0, right = -left, bottom = -100.0, top = -bottom, near = 0.1, far = 1024.0;

    Therefore, in order to solve the problem that the model is outside the lightCam range, we add the following logic to useShadowMap or in the code before useShadowMap to remove the sampling points that are not in the lightCam range:

    // phongFragment.glsl - main()
    ...
    if(shadingPoint.x<0.||shadingPoint.x>1.||
       shadingPoint.y<0.||shadingPoint.y>1.){
      visibility=1.;// The light source cannot see the area, so it will not be covered by the shadow
    }else{
      visibility=useShadowMap(uShadowMap,vec4(shadingPoint,1.));
    }
    ...

    The effect is shown in the figure below: the left side has the culling logic, the right side does not. When 202 moves to the edge of the lightCam's frustum, her limbs get chopped straight off, which looks quite scary:

    Of course, it is fine to skip this step. In practice an omnidirectional light source is often used, i.e. the lightCam covers 360 degrees, and then we only need to cull the points beyond the zFar plane.

    Add bias to improve self-occlusion problem

    When we render the depth map from the light source's point of view, errors may occur due to the limitations of floating point precision. Therefore, when we use the depth map in the main rendering process, we may see the object's own shadow, which is called self-occlusion or shadow distortion.

    After completing the 2-pass rendering, we found shadow acne in many places such as 202's hair, which is very unsightly. As shown in the following figure:

    In theory, we can alleviate the self-occlusion problem by adding bias. Here I provide a method to dynamically adjust the bias:

    // phongFragment.glsl
    // Use bias offset value to optimize self-occlusion
    float getBias(float ctrl) {
      vec3 lightDir = normalize(uLightPos);
      vec3 normal = normalize(vNormal);
      float m = 200.0 / 2048.0 / 2.0; // (orthographic frustum width 200) / (shadow map resolution 2048) / 2
      float bias = max(m, m * (1.0 - dot(normal, lightDir))) * ctrl;
      return bias;
    }

    When the light direction is nearly perpendicular to the surface normal, self-occlusion is very likely to occur, for example on the back of 202-chan's head; that is why we need the light direction and the normal direction. Here m represents the size in scene space covered by one shadow-map texel as seen from the light camera.

    Finally, change useShadowMap() in phongFragment.glsl to the following:

    // phongFragment.glsl
    float useShadowMap(sampler2D shadowMap, vec4 shadingPoint){
      ...
      return (closestDepth + EPS + getBias(.3)> shadingPoint.z) ? 1.0 : 0.0;
    }

    The effect is as follows:

    It should be noted that a larger bias value may lead to over-correction and shadow loss, while a smaller value may not improve acne, so multiple attempts are required.

    PCF

    However, the resolution of the Shadow Map is limited. In real games it is usually much lower than the screen resolution (for performance reasons), so we need a way to soften the jagged shadow edges. PCF computes the visibility of a shading point by averaging the shadow test over several texels around the corresponding position in the Shadow Map.

    Originally this method was meant to anti-alias (soften the edges of) hard shadows; later people realized the same idea could be pushed further to produce soft shadows.

    Before using the PCF algorithm to estimate the shadow ratio, we need to prepare a set of sampling points. For PCF shadows, we only use 4-8 sampling points on mobile devices, while high-quality images use 16-32. In this section, we use 8 sampling points, and on this basis, we adjust the parameters of the generated samples to improve the image, reduce noise, etc.

    That said, the choice among these sampling patterns does not have a particularly large impact on the final image. The most important factor is the size of the PCF filtering window; in the code this is controlled by textureSize (together with filterStride), although in this project it is essentially a fixed value.

    So our next idea is to implement PCF first and then fine-tune the sampling method.

    After all, premature optimization is a taboo.

    Implementing PCF

    In main(), modify the shading algorithm used.

    // phongFragment.glsl
    void main(void){  
        ...
        visibility = PCF(uShadowMap, vec4(shadingPoint, 1.0));
        ...
    }

    shadingPoint.xy is the texture coordinate used to sample the shadow map, and shadingPoint.z is the depth of the current fragment.

    The sampling function takes a vec2 as a random seed and fills the global poissonDisk[] array with points inside a disk of radius 1.

    Then divide the uv coordinates of $[0, 1]^2$ into textureSize parts. After setting the filter window, sample multiple times near the current shadingPoint position and finally count:

    // phongFragment.glsl
    float PCF(sampler2D shadowMap,vec4 shadingPoint){
      // The sampling result will be returned to the global variable - poissonDisk[]
      poissonDiskSamples(shadingPoint.xy);
    
      float textureSize=256.; // The size of the shadow map, the larger the size, the smaller the filtering range
      float filterStride=1.; // Filter step size
      float filterRange=1./textureSize*filterStride; // The range of the filter window
      int noShadowCount=0; // How many points are not in the shadow
      for(int i=0;i<NUM_SAMPLES;i++){
        vec2 sampleCoord=poissonDisk[i]*filterRange+shadingPoint.xy;
        vec4 closestDepthVec=texture2D(shadowMap,sampleCoord);
        float closestDepth=unpack(closestDepthVec);
        float currentDepth=shadingPoint.z;
        if(currentDepth<closestDepth+EPS){
          noShadowCount+=1;
        }
      }
      return float(noShadowCount)/float(NUM_SAMPLES);
    }

    The effect is as follows:


    poissonDisk sampling parameter settings

    In the homework framework, I found that this poissonDiskSamples function does not really produce a Poisson disk distribution? A bit strange. Personally it looks more like points evenly distributed along a spiral, and I hope readers can enlighten me. Let me first analyze the code in the framework.


    Mathematical formulas related to poissonDiskSamples in the framework:

    // phongFragment.glsl
    float ANGLE_STEP = PI2 * float( NUM_RINGS ) / float( NUM_SAMPLES );
    float INV_NUM_SAMPLES = 1.0 / float( NUM_SAMPLES );
    float angle = rand_2to1( randomSeed ) * PI2;
    float radius = INV_NUM_SAMPLES;
    float radiusStep = radius;

    Convert polar coordinates to Cartesian coordinates: $p_i = (\cos\theta_i, \sin\theta_i)\cdot r_i^{0.75}$. Update rule: $\theta_{i+1} = \theta_i + \text{ANGLE\_STEP}$. Radius change: $r_{i+1} = r_i + \text{radiusStep}$, with $r_0 = \text{radiusStep} = 1/\text{NUM\_SAMPLES}$ and $\theta_0 = \text{rand\_2to1}(\text{randomSeed})\cdot 2\pi$.

    The specific code is as follows:

    // phongFragment.glsl
    vec2 poissonDisk[NUM_SAMPLES];
    
    void poissonDiskSamples( const in vec2 randomSeed ) {
      float ANGLE_STEP = PI2 * float( NUM_RINGS ) / float( NUM_SAMPLES );
      float INV_NUM_SAMPLES = 1.0 / float( NUM_SAMPLES );//Put the sample in a circle with a radius of 1
    
      float angle = rand_2to1( randomSeed ) * PI2;
      float radius = INV_NUM_SAMPLES;
      float radiusStep = radius;
    
      for( int i = 0; i < NUM_SAMPLES; i ++ ) {
        poissonDisk[i] = vec2( cos( angle ), sin( angle ) ) * pow( radius, 0.75 );
        radius += radiusStep;
        angle += ANGLE_STEP;
      }
    }

    That is, we can adjust the following parameters:

    • The exponent applied to the radius (the 0.75 in pow(radius, 0.75))

    As for why 0.75 is used in the homework framework, I made an animation showing how the sample positions change as the exponent applied to the radius (the distance from each sample to the disk center) varies from 0.2 to 1.1. Roughly speaking, once the exponent is at or above 0.75, the mass of the samples leans clearly toward the center of the disk. The animation code is in Appendix 1.2; readers can run and tweak it themselves.

    (The above is an animation; if you are reading the PDF version, you need to visit the website to view it.)

    • The number of rings, NUM_RINGS

    NUM_RINGS is used together with NUM_SAMPLES to calculate the angle difference ANGLE_STEP between each sample point.

    At this point, the following analysis can be made:

    If NUM_RINGS equals NUM_SAMPLES, ANGLE_STEP equals $2π$, meaning the angle advances by a full circle each iteration, which is obviously pointless. If NUM_RINGS is less than NUM_SAMPLES, ANGLE_STEP is less than $2π$ and each iteration advances by a fraction of a circle. If NUM_RINGS is greater than NUM_SAMPLES, ANGLE_STEP exceeds $2π$, which may cause samples to wrap around and overlap.

    So in this framework, with the number of samples fixed (8 here), we can choose NUM_RINGS so that the sample points end up distributed more evenly.

    Therefore, in theory NUM_RINGS can simply be set to 1 here: with NUM_SAMPLES = 8 and NUM_RINGS = 1, ANGLE_STEP = $2π/8 = π/4$, so the 8 samples sweep exactly one full turn around the disk.

    (The above is an animation; if you are reading the PDF version, you need to visit the website to view it.)

    When the sampling points are evenly distributed, the effect is quite good:

    If the sampling is very uneven, for example when NUM_RINGS equals NUM_SAMPLES, the image will look dirty and noisy:

    After obtaining these sample points, we can also weight them. For example, in the 202 course Professor Yan mentioned that different weights can be assigned according to each sample's distance from the original pixel, with farther samples getting lower weights, but that part is not implemented in this project.

    PCSS

    First, find the average blocker depth around a given uv coordinate in the Shadow Map.

    float findBlocker(sampler2D shadowMap,vec2 uv,float z_shadingPoint){
      float count=0., depth_sum=0., depthOnShadowMap, is_block;
      vec2 nCoords;
      for(int i=0;i<BLOCKER_SEARCH_NUM_SAMPLES;i++){
        nCoords=uv+BLOKER_SIZE*poissonDisk[i];
    
        depthOnShadowMap=unpack(texture2D(shadowMap,nCoords));
        if(abs(depthOnShadowMap) < EPS)depthOnShadowMap=1.;
        // The step function is used to compare two values.
        is_block=step(depthOnShadowMap,z_shadingPoint-EPS);
        count+=is_block;
        depth_sum+=is_block*depthOnShadowMap;
      }
      if(count<EPS)
        return z_shadingPoint;
      return depth_sum/count;
    }

    PCSS then proceeds in three steps (blocker search, penumbra estimation, PCF filtering); I will not go into detail here since the code follows the theoretical formula closely. The penumbra width is estimated as $w_{\text{penumbra}} = (d_{\text{receiver}} - d_{\text{blocker}})\cdot w_{\text{light}} / d_{\text{blocker}}$, which corresponds to wPenumbra = LWIDTH * dReceiver / dBlocker in the code below, clamped by MAX_PENUMBRA.

    float PCSS(sampler2D shadowMap,vec4 shadingPoint){
      poissonDiskSamples(shadingPoint.xy);
      float z_shadingPoint=shadingPoint.z;
      // STEP 1: avgblocker depth
      float avgblockerdep=findBlocker(shadowMap,shadingPoint.xy,z_shadingPoint);
      if(abs(avgblockerdep - z_shadingPoint) <= EPS) // No Blocker
        return 1.;
    
      // STEP 2: penumbra size
      float dBlocker=avgblockerdep,dReceiver=z_shadingPoint-avgblockerdep;
      float wPenumbra=min(LWIDTH*dReceiver/dBlocker,MAX_PENUMBRA);
    
      // STEP 3: filtering
      float _sum=0.,depthOnShadowMap,vis;
      vec2 nCoords;
      for(int i=0;i<NUM_SAMPLES;i++){
        nCoords=shadingPoint.xy+wPenumbra*poissonDisk[i];
    
        depthOnShadowMap=unpack(texture2D(shadowMap,nCoords));
        if(abs(depthOnShadowMap)<1e-5)depthOnShadowMap=1.;
    
        vis=step(z_shadingPoint-EPS,depthOnShadowMap);
        _sum+=vis;
      }
    
      return _sum/float(NUM_SAMPLES);
    }

    Framework Part Analysis

    This part is the comments I wrote when I was casually browsing the code, and I have organized them here a little bit.

    loadShader.js

    Although both functions in this file load GLSL files, the latter, getShaderString(filename), is more concise and higher-level. The main difference is that the former returns a Promise object, while the latter (being async/await based) reads as if it directly returns the file content. For more information about Promise, see Appendix 1.3 – Simple usage of JS Promise; for async/await, see Appendix 1.4 – async/await; for the usage of .then(), see Appendix 1.5 – About .then().

    To put it more professionally, these two functions provide different levels of abstraction. The former provides the atomic level capability of directly loading files and has finer-grained control, while the latter is more concise and convenient.
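
    As an illustration of the two styles, here is a simplified sketch using fetch (not the framework's actual implementation; the names loadShaderFilePromise and getShaderStringAsync are made up for this example):

    // Style 1: expose the raw Promise; the caller chains .then() on it.
    function loadShaderFilePromise(filename) {
        return fetch(filename).then(response => response.text());
    }

    // Style 2: async/await wrapper; at the call site (inside another async function),
    // `await getShaderStringAsync(path)` reads as if it directly returns the text.
    async function getShaderStringAsync(filename) {
        const response = await fetch(filename);
        return response.text();
    }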

    Add object translation effect

    Adding controllers to the GUI

    Recomputing shadows every frame is expensive, so I add a light controller to the GUI and use it to decide whether shadows should be recomputed each frame. In addition, when Light Moveable is unchecked, the user is not allowed to change the light position:

    After checking Light Moveable, the lightPos option box appears:

    Specific code implementation:

    // engine.js
    // Add lights
    // light - is open shadow map == true
    let lightPos = [0, 80, 80];
    let focalPoint = [0, 0, 0]; // Directional light focusing direction (starting point is lightPos)
    let lightUp = [0, 1, 0]
    const lightGUI = {// Light source movement controller. If not checked, shadows will not be recalculated.
        LightMoveable: false,
        lightPos: lightPos
    };
    ...
    function createGUI() {
        const gui = new dat.gui.GUI();
        const panelModel = gui.addFolder('Light properties');
        const panelCamera = gui.addFolder("OBJ properties");
        const lightMoveableController = panelModel.add(lightGUI, 'LightMoveable').name("Light Moveable");
        const arrayFolder = panelModel.addFolder('lightPos');
        arrayFolder.add(lightGUI.lightPos, '0').min(-10).max( 10).step(1).name("light Pos X");
        arrayFolder.add(lightGUI.lightPos, '1').min( 70).max( 90).step(1).name("light Pos Y");
        arrayFolder.add(lightGUI.lightPos, '2').min( 70).max( 90).step(1).name("light Pos Z");
        arrayFolder.domElement.style.display = lightGUI.LightMoveable ? '' : 'none';
        lightMoveableController.onChange(function(value) {
            arrayFolder.domElement.style.display = value ? '' : 'none';
        });
    }

    Appendix 1.1 – Poisson disk (spiral) sampling visualization code

    import numpy as np
    import matplotlib.pyplot as plt
    
    def simulate_poisson_disk_samples(random_seed, num_samples=100, num_rings=2):
        PI2 = 2 * np.pi
        ANGLE_STEP = PI2 * num_rings / num_samples
        INV_NUM_SAMPLES = 1.0 / num_samples
    
        # Initial angle and radius
        angle = random_seed * PI2
        radius = INV_NUM_SAMPLES
        radius_step = radius
    
        x_vals = []
        y_vals = []
    
        for _ in range(num_samples):
            x = np.cos(angle) * pow(radius, 0.1)
            y = np.sin(angle) * pow(radius, 0.1)
    
            x_vals.append(x)
            y_vals.append(y)
    
            radius += radius_step
            angle += ANGLE_STEP
    
        return x_vals, y_vals
    
    plt.figure(figsize=(8, 8))
    
    # Generate and plot the spiral 5 times with different random seeds
    for _ in range(50):
        random_seed = np.random.rand()
        x_vals, y_vals = simulate_poisson_disk_samples(random_seed)
        plt.plot(x_vals, y_vals, '-o', markersize=5, linewidth=2)
    
    plt.title("Poisson Disk Samples")
    plt.axis('on')
    plt.gca().set_aspect('equal', adjustable='box')
    plt.show()

    Appendix 1.2 – Poisson sampling point post-processing animation code

    Note: the code in Appendix 1.2 is directly modified from the code in Appendix 1.1.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation
    
    def simulate_poisson_disk_samples_with_exponent(random_seed, exponent, num_samples=100, num_rings=2):
        PI2 = 2 * np.pi
        ANGLE_STEP = PI2 * num_rings / num_samples
        INV_NUM_SAMPLES = 1.0 / num_samples
    
        angle = random_seed * PI2
        radius = INV_NUM_SAMPLES
        radius_step = radius
    
        x_vals = []
        y_vals = []
    
        for _ in range(num_samples):
            x = np.cos(angle) * pow(radius, exponent)
            y = np.sin(angle) * pow(radius, exponent)
            x_vals.append(x)
            y_vals.append(y)
            radius += radius_step
            angle += ANGLE_STEP
    
        return x_vals, y_vals
    
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.axis('on')
    ax.set_xlim(-1, 1)
    ax.set_ylim(-1, 1)
    ax.set_aspect('equal', adjustable='box')
    
    lines = [ax.plot([], [], '-o', markersize=5, linewidth=2)[0] for _ in range(50)]
    exponent = 0.2
    
    def init():
        for line in lines:
            line.set_data([], [])
        return lines
    
    def update(frame):
        global exponent
        exponent += 0.005  # Increment to adjust the exponent
        for line in lines:
            random_seed = np.random.rand()
            x_vals, y_vals = simulate_poisson_disk_samples_with_exponent(random_seed, exponent)
            # plt.title(exponent +"Poisson Disk Samples")
            line.set_data(x_vals, y_vals)
        plt.title(f"{exponent:.3f} Poisson Disk Samples")
        return lines
    
    ani = FuncAnimation(fig, update, frames=180, init_func=init, blit=False)
    
    ani.save('animation.mp4', writer='ffmpeg', fps=12)
    
    # plt.show()

    Appendix 1.3 – Simple usage of JS Promise

    Here is an example of how to use Promise:

    function delay(milliseconds) {
        return new Promise(function(resolve, reject) {
            if (milliseconds < 0) {
                reject('Delay time cannot be negative!');
            } else {
                setTimeout(function() {
                    resolve('Waited for ' + milliseconds + ' milliseconds!');
                }, milliseconds);
            }
        });
    }
    
    // Example
    delay(2000).then(function(message) {
        console.log(message);  // Output after two seconds: "Waited for 2000 milliseconds!"
    }).catch(function(error) {
        console.log('Error: ' + error);
    });
    
    // Error example
    delay(-1000).then(function(message) {
        console.log(message);
    }).catch(function(error) {
        console.log('Error: ' + error);  // Immediately output: "Error: Delay time cannot be negative!"
    });

    The standard pattern for using Promise is to pass the Promise constructor an executor function, which receives two parameters (both of which are themselves functions): resolve and reject. This lets you build an error-handling branch; for example, here, if the input does not meet the requirements, reject is called and the Promise enters its rejected state.

    For example, when the rejected branch is taken, the value passed to reject(XXX) becomes the XXX received by the following .catch(function(XXX)) callback (or by the second argument of .then()).

    To sum up, a Promise is a JS object whose core value is that it provides a very elegant and unified way to handle asynchronous operations and chained operations, while also offering error handling.

    1. With Promise’s .then() method, you can ensure that one asynchronous operation completes before executing another asynchronous operation.
    2. The .catch() method can be used to handle errors, without having to set up error handling for each asynchronous callback.

    Appendix 1.4 – async/await

    async/await is a feature introduced in ES8 (ES2017) that aims to simplify working with Promises.

    Let’s look at the example directly:

    async function asyncFunction() {
        return "Hello from async function!";
    }
    
    asyncFunction().then(result => console.log(result));  // Output: Hello from async function!

    After adding async to the function, a Promise object will be implicitly returned.

    The await keyword can only be used inside an async function. It "pauses" the execution of the function until the Promise is completed (resolved or rejected). Alternatively, you can also use try/catch to capture the reject.

    async function handleAsyncOperation() {
        try {
            const result = await maybeFails(); // maybeFails() is some function that returns a Promise
            console.log(result); // If the Promise is resolved, this will output "Success!"
        } catch (error) {
            console.error('An error occurred:', error);// If the Promise is rejected, this will output "An error occurred: Failure!"
        }
    }

    The "pause" here means pausing that specific async function, not the entire application or the JavaScript event loop.

    Here is a simplified explanation of how await works:

    1. When the await keyword is reached, execution of that async function is suspended.
    2. Control is returned to the event loop, allowing other code (such as other functions, event callbacks, etc.) to run immediately after the current asynchronous function.
    3. Once the Promise following the await is fulfilled or rejected, the previously paused asynchronous function continues to execute, resumes from the paused position, and processes the result of the Promise.

    That is, although your specific async function is logically "paused", the main thread of JavaScript is not blocked. Other events and functions can still be executed in the background.

    Here is an example:

    console.log('Start');
    
    async function demo() {
        console.log('Before await');
        await new Promise(resolve => setTimeout(resolve, 2000));
        console.log('After await');
    }
    
    demo();
    
    console.log('End');

    The output will be:

    Start
    Before await
    End
    (wait 2 seconds)
    After await

    I hope the above explanation can help you understand the asynchronous mechanism of JS. Welcome to discuss in the comment area, I will try my best to reply to you immediately.

    Appendix 1.5 – About .then()

    .then() is defined on the Promise object and is used to handle the result of the Promise. When you call .then(), it will not be executed immediately, but after the Promise is resolved (fulfilled) or rejected (rejected).

    Key points about .then() :

    1. Non-blocking: When you call .then(), the code does not pause to wait for the Promise to complete. Instead, it returns immediately and executes the callback in then when the Promise is completed.
    2. Returns a new Promise: .then() always returns a new Promise. This allows you to chain calls, i.e. a series of .then() calls, each one handling the result of the previous Promise.
    3. Asynchronous callbacks: The callbacks in .then() are executed asynchronously when the original Promise is resolved or rejected. This means that they are queued in the event loop's microtask queue instead of being executed immediately.

    For example:

    console.log('Start');
    
    const promise = new Promise((resolve, reject) => {
        setTimeout(() => {
            resolve('Promise resolved');
        }, 2000);
    });
    
    promise.then(result => {
        console.log(result);
    });
    
    console.log('End');

    The output will be:

    Start
    End
    (wait 2 seconds)
    Promise resolved
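
    Point 2 above (every .then() returns a new Promise) is what makes chaining possible. A small illustrative sketch:

    // Each .then() receives the value produced by the previous step; returning a
    // Promise inside a .then() makes the chain wait for it before continuing.
    Promise.resolve(1)
        .then(n => n + 1)                                                  // 2
        .then(n => new Promise(resolve => setTimeout(() => resolve(n * 10), 100)))
        .then(n => console.log(n))                                         // logs 20 after ~100 ms
        .catch(error => console.error(error));                             // any error above lands here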

    Appendix 1.6 - Fragment Shaders: Uniforms/Textures

    https://webglfundamentals.org/webgl/lessons/zh_cn/webgl-fundamentals.html

    Uniforms Global Variables

    A uniform (global variable) keeps the same value for the whole draw call. In the following simple example, an offset is added to every vertex in the vertex shader via a uniform:

    attribute vec4 a_position;
    uniform vec4 u_offset;
    
    void main() {
        gl_Position = a_position + u_offset;
    }

    Now we can offset all vertices by a fixed value. First, we find the address of the global variable during initialization.

    var offsetLoc = gl.getUniformLocation(someProgram, "u_offset");

    Then set the global variable before drawing

    gl.uniform4fv(offsetLoc, [1, 0, 0, 0]);  // Offset to the right by half the screen width

    It is important to note that global variables belong to a single shader program. If multiple shaders have global variables with the same name, you need to find each global variable and set its own value.
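
    For example (illustrative; programA and programB stand for two different linked programs that both declare u_offset):

    // The "same" uniform must be located and set separately for each program.
    const locA = gl.getUniformLocation(programA, "u_offset");
    const locB = gl.getUniformLocation(programB, "u_offset");

    gl.useProgram(programA);
    gl.uniform4fv(locA, [1, 0, 0, 0]);

    gl.useProgram(programB);
    gl.uniform4fv(locB, [1, 0, 0, 0]);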

    Textures

    To get texture information in the shader, you can first create a sampler2D type global variable, and then use the GLSL method texture2D to extract information from the texture.

    precision mediump float; 
    uniform sampler2D u_texture; 
    void main() {   
        vec2 texcoord = vec2(0.5, 0.5);  // Get the value of the texture center   
        gl_FragColor = texture2D(u_texture, texcoord);
    }

    The data obtained from the texture depends on many settings. At a minimum, you need to create and fill the texture with data, for example:

    var tex = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, tex);
    var level = 0;
    var width = 2;
    var height = 1;
    var data = new Uint8Array([
       255, 0, 0, 255,   // A red pixel
       0, 255, 0, 255,   // A green pixel
    ]);
    gl.texImage2D(gl.TEXTURE_2D, level, gl.RGBA, width, height, 0, gl.RGBA, gl.UNSIGNED_BYTE, data);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);

    Find the address of a global variable at initialization time

    var someSamplerLoc = gl.getUniformLocation(someProgram, "u_texture");

    WebGL requires that textures must be bound to a texture unit when rendering.

    var unit = 5;  // Pick a texture unit
    gl.activeTexture(gl.TEXTURE0 + unit);
    gl.bindTexture(gl.TEXTURE_2D, tex);

    Then tell the shader which texture unit you want to use.

    gl.uniform1i(someSamplerLoc, unit);

    References

    1. GAMES202
    2. Real-Time Rendering 4th Edition
    3. https://webglfundamentals.org/webgl/lessons/webgl-shaders-and-glsl.html
  • C++ Lambda Note

    C++ Lambda Note

    I am also a rookie QAQ, so I will share my notes on learning C++ Lambda. This is just like a glimpse of the big guys on Zhihu exploring the vast world of wisdom through a narrow slit. If I make any mistakes, please correct me.

    Lambda expressions are introduced in the C++11 standard and allow anonymous functions to be defined in code. Each chapter of this article will have a large number of code examples to help you understand. Some of the code in this article refers to Microsoft official documentation | Lambda expressions in C++ | Microsoft Learn.

    Table of contents

    Basics

    • 1. Lambda Basic Syntax
    • 2. How to use Lambda expressions
    • 3. Detailed discussion of capture lists
    • 4. mutable keyword
    • 5. Lambda return value deduction
    • 6. Nested Lambda
    • 7. Lambda, std::function and delegates
    • 8. Lambda in asynchronous and concurrent programming
    • 9. Generic Lambda (C++14)
    • 10. Lambda Scope
    • 11. Practice – Function Compute Library

    Intermediate

    • 1. Lambda's underlying implementation
    • 2. Lambda type, decltype and conditional compilation
    • 3. Lambda’s evolution in the new standard
    • 4. State-preserving Lambda
    • 5. Optimization and Lambda
    • 6. Integration with other programming paradigms
    • 7. Lambda and Exception Handling

    Advanced

    • 1. Lambda and noexcept
    • 2. Template parameters in Lambda (C++20 feature)
    • 3. Lambda Reflection
    • 4. Cross-platform and ABI issues

    Basics

    1. Lambda Basic Syntax

    Lambda basically looks like this:

    [ capture_clause ] ( parameters ) -> return_type { // function_body }
    • Capture clause (capture_clause) determines which variables in the outer scope will be captured by this lambda and how they will be captured (by value, by reference, or not captured). We discuss capture clauses in detail in the next chapter.
    • The parameter list (parameters) and the function body (function_body) are the same as in a normal function.
    • The return type (return_type) is slightly different. If the function body contains multiple return statements, the return type must be specified explicitly, unless all of them return the same type, in which case it can be deduced automatically.

    2. How to use Lambda expressions

    Syntax example:

    // a lambda that captures no outer variables, takes no arguments, and has no return value
    auto greet = [] { std::cout << "Hello, World!" << std::endl; };
    
    // a lambda that captures outer variables by reference, takes one int argument, and returns an int
    int x = 42;
    auto add_to_x = [&x](int y) -> int { return x + y; };
    
    // a lambda that captures all outer variables by value, takes two arguments, and has its return type automatically inferred
    int a = 1, b = 2;
    auto sum = [=](int x, int y) { return a + b + x + y; };
    
    // a lambda that creates new variables using initializer capture (C++14 feature)
    auto multiply = [product = a * b](int scalar) { return product * scalar; };

    Practical example:

    1. As a sorting criterion
    // As sorting criteria
    #include <algorithm>
    #include <iostream>
    #include <vector>
    
    int main() {
        std::vector<int> v{4, 1, 3, 5, 2};
        std::sort(v.begin(), v.end(), [](int a, int b) {
            return a < b; // Sort in ascending order
        });
        for (int i : v) { std::cout << i << ' '; } // Output: 1 2 3 4 5
    }
    1. For forEach operation
    #include <algorithm>
    #include <iostream>
    #include <vector>
    
    int main() {
        std::vector<int> v{1, 2, 3, 4, 5};
        std::for_each(v.begin(), v.end(), [](int i) {
            std::cout << i * i << ' '; // print the square of each number
        });
        // output: 1 4 9 16 25
    }
    1. For cumulative functions
    #include <iostream>
    #include <numeric>
    #include <vector>
    
    int main() {
        std::vector<int> v{1, 2, 3, 4, 5};
        int sum = std::accumulate(v.begin(), v.end(), 0, [](int a, int b) {
            return a + b; // Sum
        });
        std::cout << sum << std::endl; // Output: 15
    }
    1. For thread constructor
    #include <iostream>
    #include <thread>
    
    int main() {
        int x = 10;
        std::thread t([x]() { std::cout << "Value in thread: " << x << std::endl; });
        t.join(); // Output: Value in thread: 10
        // Note: x used in the thread is captured by value when the thread is created
    }

    3. Detailed discussion of the capture list

    The capture list is optional. It specifies the external variables that can be accessed from within the lambda expression. Referenced external variables can be modified from within the lambda expression, but external variables captured by value cannot be modified, that is, variables prefixed with an ampersand (&) are accessed by reference, and variables without the prefix are accessed by value.

    1. Do not capture any external variables:

    []{ /* ... */ }

    This lambda does not capture any variables from the outer scope.

    1. By default, all external variables are captured (by reference):

    [&]{ /* ... */ }

    This lambda captures all variables in the outer scope and captures them by reference. If the captured variables are destroyed or out of scope when the lambda is called, undefined behavior occurs.

    1. By default, all external variables are captured (by value):

    [=]{ /* ... */ }

    This lambda captures all outer scope variables by value, which means it uses a copy of the variables.

    1. Explicitly capture specific variables (by value):

    [x]{ /* ... */ }

    This lambda captures the outer variable x by value.

    1. Explicitly capture specific variables (by reference):

    [&x]{ /* ... */ }

    This lambda captures the outer variable x by reference.

    1. Mixed capture (by value and by reference):

    [x, &y]{ /* ... */ }

    This lambda captures the variable x by value and the variable y by reference.

    1. By default, variables are captured by value, but some variables are captured by reference.:

    [=, &x, &y]{ /* ... */ }

    This lambda captures all outer variables by value by default, but captures variables x and y by reference.

    1. By default, variables are captured by reference, but some are captured by value.:

    [&, x, y]{ /* ... */ }

    This lambda captures all outer variables by reference by default, but captures variables x and y by value.

    1. Capturing the this pointer:

    [this]{ /* ... */ }

    This allows the lambda expression to capture the this pointer of the class member function, thus giving access to the class's member variables and functions.

    1. Capture with an initializer expression (since C++14) – generalized lambda capture: [x = 42]{ /* ... */ } creates a new variable x inside the closure, which can be used in the lambda body. This is quite useful; for example, you can move a std::unique_ptr into the closure with move semantics, which is discussed in detail below.
    2. Capturing *this (since C++17): [*this]{ /* ... */ } captures the current object (the instance of its class) by value. This avoids the risk of the this pointer dangling during the lambda's lifetime. Before C++17 you could only capture this (a pointer), which carries a potential risk: if the object's lifetime ends before the lambda is called, the pointer dangles and using it is undefined behavior. Capturing *this is equivalent to making a copy of the current object.

    std::unique_ptr is a smart pointer with exclusive ownership. Its original design is to ensure that only one entity can own the object at a time. Therefore, std::unique_ptr cannot be copied, but can only be moved. If you want to capture by value, the compiler will report an error. If you capture by reference, the compiler will not report an error. But there are potential problems. I can think of three:

    1. The life of std::unique_ptr ends before lambda. In this case, accessing this destroyed std::unique_ptr from within lambda will cause the program to crash.
    2. If std::unique_ptr is moved after capture, the reference in the lambda is null, causing the program to crash.
    3. In a multi-threaded environment, the above two problems occur even more easily. To avoid them, consider capturing by value, that is, explicitly use std::move to transfer ownership into the closure; in a multi-threaded environment, also add locking.

    Code example:

    1. Using Lambda as callback function – This example also involves function()
    #include <functional>
    #include <iostream>
    
    // Suppose there is a function that calls the callback function
    void performOperationAsync(std::function<void(int)> callback) {
        // Async operation...
        int result = 42;   // Assume this is the result of the asynchronous operation
        callback(result);  // Call the callback function
    }
    
    int main() {
        int capture = 100;
        performOperationAsync([capture](int result) {
            std::cout << "Async operation result: " << result
                      << " with captured value: " << capture << std::endl;
        });
    }
    1. Used with smart pointers – This example also involves the mutable keyword
    #include <iostream>
    #include <memory>
    
    void processResource(std::unique_ptr<int> ptr) {
        // do some processing
        std::cout << "Processing resource with value " << *ptr << std::endl;
    }
    
    int main() {
        auto ptr = std::make_unique<int>(10);
        // Use a lambda to delay resource processing; mutable is needed so the
        // captured unique_ptr can be moved out of the closure
        auto deferredProcess = [p = std::move(ptr)]() mutable {
            processResource(std::move(p));
        };
        // Do some other operations...
        // ...
        deferredProcess(); // Finally process the resource
    }
    1. Synchronizing data access in multiple threads
    int main() { std::vector data; std::mutex data_mutex; std::vector threadsPool; // Lambda is used to add data to vector to ensure thread safety auto addData = [&](int value) { std::lock_guard lock(data_mutex); data.push_back(value); std::cout << "Added " << value << " to the data structure." << std::endl; }; threadsPool.reserve(10); for (int i = 0; i < 10; ++i) { threadsPool.emplace_back(addData, i); } // Wait for all threads to complete for (auto& thread: threadsPool) { thread.join(); } }
    1. Application of Lambda in range query
    #include int main() { std::vector v = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; int lower_bound = 3; int upper_bound = 7; // Use Lambda to find all numbers in a specific range auto range_begin = std::find_if(v.begin(), v.end(), [lower_bound](int x) { return x >= lower_bound; }); auto range_end = std::find_if(range_begin, v.end(), [upper_bound](int x) { return x > upper_bound; }); std::cout << "Range: "; std::for_each(range_begin, range_end, [](int x) { std::cout << x << ' '; }); std::cout << std::endl; }
    1. Delayed execution
    #include // Simulate a potentially time-consuming operation void expensiveOperation(int data) { // Simulate a time-consuming operation std::this_thread::sleep_for(std::chrono::seconds(1)); std::cout << "Processed data: " << data << std::endl; } int main() { std::vector <std::function > deferredOperations; deferredOperations.reserve(10); // Assume this is a loop that needs to perform expensive operations, but we don't want to do them right away for (int i = 0; i < 10; ++i) { // Capture i and defer execution deferredOperations.emplace_back([i] { expensiveOperation(i); }); } std::cout << "All operations have been scheduled, doing other work now." << std::endl; // Assume now is a good time to do these expensive operations for (auto& operation : deferredOperations) { // Execute the lambda expression on a new thread to avoid blocking the main thread std::thread(operation).detach(); } // Give the thread some time to process the operation std::this_thread::sleep_for(std::chrono::seconds(2)); std::cout << "Main thread finished." << std::endl; } /* Note: In actual multithreaded programs, you usually need to consider thread synchronization and resource management, such as using std::async instead of std::thread().detach(), and using appropriate synchronization mechanisms such as mutexes and condition variables to ensure thread safety. In this simplified example, these details are omitted to maintain clarity and focus on the delay operation. */ // The following shows a more reasonable version of this example #include #include #include #include // Simulate a potentially time-consuming operation int expensiveOperation(int data) { // Simulate a time-consuming operation std::this_thread::sleep_for(std::chrono::seconds(1)); return data * data; // Return some processing results } int main() { std::vector <std::future > deferredResults; // Launch multiple asynchronous tasks deferredResults.reserve(10); for (int i = 0; i < 10; ++i) { deferredResults.emplace_back( std::async(std::launch::async, expensiveOperation, i) ); } std::cout << "All operations have been scheduled, doing other work now." << std::endl; // Get the results of asynchronous tasks for (auto& future : deferredResults) { // get() will block until the asynchronous operation is completed and returns the result std::cout << "Processed data: " << future.get() << std::endl; } std::cout << "Main thread finished." << std::endl; } /* Note: std::async manages all of this for us. We also don't need to use mutexes or other synchronization mechanisms because each asynchronous operation runs on its own thread and will not interfere with each other, and the returned future object handles all the necessary synchronization for us. std::async is used with the std::launch::async parameter, which ensures that each task runs asynchronously on a different thread. If you don't specify std::launch::async, the C++ runtime can decide to execute the tasks synchronously (delayed), which is not what we want to see. The future.get() call will block the main thread until the corresponding task is completed and the result is returned. This allows us to safely obtain the results without race conditions or the need to use mutexes. */

    4. mutable keyword

    First, let's review what the mutable keyword is. In addition to being used in lambda expressions, we also generally use it in class member declarations.

    When the mutable keyword is used on a class member variable, that member can be modified inside the class's const member functions. This is usually used for members that do not affect the externally visible state of the object, such as caches, debugging information, or lazily computed data.

    class MyClass {
    public:
        mutable int cache; // can be modified in const member functions
        int data;
    
        MyClass() : data(0), cache(0) {}
    
        void setData(int d) const {
            // data = d; // Compile error: a non-mutable member cannot be modified in a const function
            cache = d;   // OK: cache is mutable
        }
    };

    In lambda expressions, the mutable keyword allows you to modify the copies of variables captured by value inside the lambda. By default, the closure's operator() is const, so value-captured variables cannot be modified unless you use mutable.

    The key point here is that mutable allows modification of the closure's own member variables (the copies), not the original variables in the outer scope. The closure's "closedness" is therefore still maintained: it does not change the state of the outer scope, only its own internal state.

    Invalid example:

    int x = 0;
    auto f = [x]() {
        x++; // error: cannot modify a variable captured by value
    };
    f();

    It should be like this:

    int x = 0;
    auto f = [x]() mutable {
        x++;
        std::cout << x << std::endl;
    };
    f(); // Correct: outputs 1

    Practical example:

    1. Capturing variable modifications
    #include <iostream>
    
    int main() {
        int count = 0;
        // creates a mutable lambda expression that increments count on each call
        auto increment = [count]() mutable {
            count++;
            std::cout << count << std::endl;
        };
        increment(); // prints 1
        increment(); // prints 2
        increment(); // prints 3
        // The external count is still 0 because it was captured by value
        std::cout << "External count: " << count << std::endl; // prints External count: 0
    }
    1. Generate a unique ID
    #include int main() { int lastId = 0; auto generateId = [lastId]() mutable -> int { return ++lastId; // Increment and return the new ID }; std::cout << "New ID: " << generateId() << std::endl; // Output New ID: 1 std::cout << "New ID: " << generateId() << std::endl; // Output New ID: 2 std::cout << "New ID: " << generateId() << std::endl; // Output New ID: 3 }
    1. State retention
    #include #include #include int main() { std::vector numbers = {1, 2, 3, 4, 5}; // Initial state int accumulator = 0; // Create a mutable lambda expression to accumulate values auto sum = [accumulator](int value) mutable { accumulator += value; return accumulator; // Returns the current accumulated value }; std::vector runningTotals(numbers.size()); // Apply sum to each element to generate a running total std::transform(numbers.begin(), numbers.end(), runningTotals.begin(), sum); // Print the running total for (int total : runningTotals) { std::cout << total << " "; // Prints 1 3 6 10 15 } std::cout << std::endl; }

    5. Lambda return value deduction

    When lambda expressions were introduced in C++11, the return type of the lambda usually needed to be explicitly specified.

    Starting from C++14, the deduction of Lambda return values has been improved and automatic type deduction has been introduced.

    The deduction of lambda return values in C++14 follows the following rules:

    1. If the lambda function body contains the return keyword, and the type of the expressions following all return statements is the same, then the lambda return type is deduced to be that type.
    2. If the body of the lambda function is a single return statement, or can be considered a single return statement (such as a constructor or brace initializer), the return type is inferred to be the type of the return statement expression.
    3. If the lambda function does not return any value (i.e. there is no return statement in the function body), or if the function body contains only return statements that do not return a value (i.e. return;), the deduced return type is void.
    4. C++11 return value deduction example

    In C++11, if the lambda body contains multiple return statements, the return type must be explicitly specified.

    auto f = [](int x) -> double { // explicitly specify the return type
        if (x > 0) return x * 2.5;
        else       return x / 2.0;
    };
    • C++14 automatic deduction

    In C++14, the return type of the above lambda expression can be automatically deduced.

    auto f = [](int x) { // The return type is automatically deduced to double
        if (x > 0) return x * 2.5; // double
        else       return x / 2.0; // double
    };
    • Error demonstration

    If the type of the return statement does not match, it cannot be automatically deduced, which will result in a compilation error.

    auto g = [](int x) { // Compilation error because the return types are inconsistent
        if (x > 0) return x * 2.5; // double
        else       return x;       // int
    };

    But after C++17, if the return types are so different that they cannot be unified into a common type directly or through conversion, you can use std::variant or std::any, which can contain multiple different types:

    #include <variant>
    
    auto g = [](int x) -> std::variant<int, double> {
        if (x > 0) return x * 2.5; // holds a double
        else       return x;       // holds an int
    };

    The lambda expression returns a std::variant, i.e. a value that holds either an int or a double; the caller can then check which alternative it holds and handle it accordingly. This part will not be discussed in detail, but a short sketch follows.
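
    One way a caller might inspect such a result (a minimal sketch, continuing the g above and assuming <variant> and <iostream> are included; std::holds_alternative and std::visit are the standard tools for this):

    auto v = g(3); // currently holds a double (7.5)
    // Check for a specific alternative...
    if (std::holds_alternative<double>(v)) {
        std::cout << "double: " << std::get<double>(v) << std::endl;
    }
    // ...or handle every alternative uniformly with a generic lambda
    std::visit([](auto value) { std::cout << "value: " << value << std::endl; }, v);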

    6. Nested Lambda

    It can also be called nested lambda, which is an advanced functional programming technique to write a lambda inside a lambda.

    Here is a simple example:

    #include <algorithm>
    #include <iostream>
    #include <vector>
    
    int main() {
        std::vector<int> numbers = {1, 2, 3, 4, 5};
        // The outer lambda is used to iterate over the collection
        std::for_each(numbers.begin(), numbers.end(), [](int x) {
            // The nested lambda is used to calculate the square
            auto square = [](int y) { return y * y; };
            // Call the nested lambda and print the result
            std::cout << square(x) << ' ';
        });
        std::cout << std::endl;
        return 0;
    }

    But we need to pay attention to many issues:

    1. Don't make it too complicated; readability is the main consideration.
    2. Note the lifetime of variables in the capture list, which will also be discussed in detail in the following examples.
    3. Capture lists should be kept as simple as possible to avoid errors.
    4. The compiler may not optimize nested lambdas as well as top-level functions or class member functions.

    If a nested Lambda captures local variables of an outer Lambda, you need to pay attention to the lifecycle of the variables. If the execution of the nested Lambda continues beyond the lifecycle of the outer Lambda, the captured local variables will no longer be valid and an error will be reported.

    #include <functional>
    #include <iostream>
    
    std::function<int()> createLambda() {
        int localValue = 10; // local variable of the outer function
        // returns a lambda that captures localValue by value
        return [localValue]() mutable {
            return ++localValue; // modifies the captured copy (legal since it's a value capture)
        };
    }
    
    int main() {
        auto myLambda = createLambda();
        // myLambda holds its own copy of localValue, which outlives the original
        std::cout << myLambda() << std::endl; // prints 11, using the copy held by the closure
        std::cout << myLambda() << std::endl; // prints 12, continuing to use that copy
        return 0;
    }

    To explain, since Lambda captures localValue by value, it holds a copy of localValue, and the life cycle of this copy is the same as that of the returned Lambda object.

    When we call myLambda() in the main function, it operates on the state of the localValue copy, not the original localValue (which has been destroyed after the createLambda function is executed). Although undefined behavior is not triggered here, the situation will be different if we use reference capture:

    std::function<int()> createLambda() {
        int localValue = 10; // local variable of the outer function
        // Return a lambda that captures localValue by reference
        return [&localValue]() mutable {
            return ++localValue; // modifies through a reference that will dangle
        };
    }
    // Calling the lambda returned by createLambda results in undefined behavior

    7. Lambda, std::function and delegates

    Lambda expression, std::function and delegate are three different concepts used to implement function call and callback mechanism in C++. Next, we will explain them one by one.

    • Lambda

    C++11 introduces a syntax for defining anonymous function objects. Lambda is used to create a callable entity, namely a Lambda closure, which is usually passed to an algorithm or used as a callback function. Lambda expressions can capture variables in scope, either by value (copy) or by reference. Lambda expressions are defined inside functions, their types are unique, and cannot be explicitly specified.

    auto lambda = [](int a, int b) { return a + b; }; auto result = lambda(2, 3); // Calling Lambda Expression
    • std::function

    std::function is a type-erased wrapper introduced in C++11 that can store, call and copy any callable entity, such as function pointers, member function pointers, lambda expressions, and function objects. The cost is a comparatively large overhead.

    std::function<int(int, int)> func = lambda;
    auto result = func(2, 3); // Call the lambda through the std::function object
    • Delegation

    Delegate is not a formal term in C++. Delegate is usually a mechanism to delegate function calls to other objects. In C#, a delegate is a type-safe function pointer. In C++, there are generally several ways to implement delegates: function pointer, member function pointer, std::function and function object. The following is an example of a delegate constructor.

    class MyClass { public: MyClass(int value) : MyClass(value, "default") { // delegates to another constructor std::cout << "Constructor with single parameter called." << std::endl; } MyClass(int value, std::string text) { std::cout << "Constructor with two parameters called: " << value << ", " << text << std::endl; } }; int main() { MyClass obj(30); // this will call both constructors }
    • Comparison of the three

    Lambda expressions are lightweight and well suited for defining simple local callbacks and for passing as parameters to algorithms.

    std::function is heavier, but more flexible. For example, if you have a scenario where you need to store different types of callback functions, std::function is an ideal choice because it can store any type of callable entity. An example that demonstrates its flexibility.

    #include #include #include // A function that takes an int and returns void void printNumber(int number) { std::cout << "Number: " << number << std::endl; } // A Lambda expression auto printSum = [](int a, int b) { std::cout << "Sum: " << (a + b) << std::endl; }; // A function object class PrintMessage { public: void operator()(const std::string &message) const { std::cout << "Message: " << message << std::endl; } }; int main() { // Create a vector of std::function that can store any type of callable object std::vector <std::function > callbacks; // Add a callback for a normal function int number_to_print = 42; callbacks.push_back([=]{ printNumber(number_to_print); }); // Add a callback for a Lambda expression int a = 10, b = 20; callbacks.push_back([=]{ printSum(a, b); }); // Add a callback for a function object std::string message = "Hello World"; PrintMessage printMessage; callbacks.push_back([=]{ printMessage(message); }); // Execute all callbacks for (auto& callback : callbacks) { callback(); } return 0; }

    Delegation is usually related to event handling. There is no built-in event handling mechanism in C++, so std::function and lambda expressions are often used to implement the delegation pattern. Specifically, you define a callback interface, and users register their own functions or lambda expressions with this interface so that they can be called when an event occurs. The general steps are as follows (with an example for each):

    1. Defines the types that can be called: You need to determine what parameters your callback function or Lambda expression needs to accept and what type of result it returns.
    using Callback = std::function<void()>; // callback with no parameters and no return value
    1. Create a class to manage callbacks: This class will hold all callback functions and allow users to add or remove callbacks.
    class Button {
    private:
        std::vector<Callback> onClickCallbacks; // Container for storing callbacks
    public:
        void addClickListener(const Callback& callback) {
            onClickCallbacks.push_back(callback);
        }
        void click() {
            for (auto& callback : onClickCallbacks) {
                callback(); // Execute each callback
            }
        }
    };
    1. Provide a method to add a callback: this method allows users to register their own functions or lambda expressions as callbacks.
    Button button; button.addClickListener([]() { std::cout << "Button was clicked!" << std::endl; });
    1. Provide a method to execute the callbacks: when necessary, this method calls all registered callback functions.
    button.click(); // User clicks the button to trigger all callbacks

    Isn’t it very simple? Let’s take another example to deepen our understanding.

    #include #include #include class Delegate { public: using Callback = std::function ; // Define the callback type, the callback here receives an int parameter // Register the callback function void registerCallback(const Callback& callback) { callbacks.push_back(callback); } // Trigger all callback functions void notify(int value) { for (const auto& callback : callbacks) { callback(value); // Execute callback } } private: std::vector callbacks; // container for storing callbacks }; int main() { Delegate del; // users register their own functions del.registerCallback([](int n) { std::cout << "Lambda 1: " << n << std::endl; }); // another Lambda expression del.registerCallback([](int n) { std::cout << "Lambda 2: " << n * n << std::endl; }); // trigger callback del.notify(10); // this will call all registered Lambda expressions return 0; }

    8. Lambda in asynchronous and concurrent programming

    Because lambdas can capture and store state, they are very useful in modern C++ concurrent programming.

    • Lambda and Threads

    Use lambda expressions directly in the std::thread constructor to define the code that the thread should execute.

    #include #include int main() { int value = 42; // Create a new thread, using a Lambda expression as the thread function std::thread worker([value]() { std::cout << "Value in thread: " << value << std::endl; }); // Main thread continues executing... // Wait for the worker thread to finish worker.join(); return 0; }
    • Lambda and std::async

    std::async is a tool that allows you to easily create asynchronous functions. After the calculation is completed, it returns a std::future object. You can call get, but it will block if the execution is not completed. There are many interesting things about async, which I will not go into here.

    #include #include int main() { // Start an asynchronous task auto future = std::async([]() { // Do some work... return "Result from async task"; }); // In the meantime, the main thread can do other tasks... // Get the result of the asynchronous operation std::string result = future.get(); std::cout << result << std::endl; return 0; }
    • Lambda and std::function

    These two are often used together, so let's take an example of storing a callable callback.

    #include <functional>
    #include <iostream>
    #include <thread>
    #include <vector>
    
    // A task queue: a std::vector that stores std::function objects
    std::vector<std::function<void()>> tasks;
    
    // Function to add tasks
    void addTask(const std::function<void()>& task) {
        tasks.push_back(task);
    }
    
    int main() {
        // Add a lambda expression as a task
        addTask([]() { std::cout << "Task 1 executed" << std::endl; });
        // Start a new thread to process the tasks
        std::thread worker([]() {
            for (auto& task : tasks) {
                task(); // Execute the task
            }
        });
        // The main thread continues to execute...
        worker.join();
        return 0;
    }

    9. Generic Lambda (C++14)

    Use the auto keyword to perform type inference in the argument list.

    Generic basic syntax:

    auto lambda = [](auto x, auto y) { return x + y; };

    Example:

    #include int main() { std::vector vi = {1, 2, 3, 4}; std::vector vd = {1.1, 2.2, 3.3, 4.4, 5.5}; // Use generic Lambda to print int elements std::for_each(vi.begin(), vi.end(), [](auto n) { std::cout << n << ' '; }); std::cout << '\n'; // Use generic Lambda to print double elements std::for_each(vd.begin(), vd.end(), [](auto n) { std::cout << n << ' '; }); std::cout << '\n'; // Use generic Lambda to calculate the sum of a vector of int auto sum_vi = std::accumulate(vi.begin(), vi.end(), 0, [](auto total, auto n) { return total + n; }); std::cout << "Sum of vi: " << sum_vi << '\n'; // Use generic Lambda to calculate the sum of a vector of double type auto sum_vd = std::accumulate(vd.begin(), vd.end(), 0.0, [](auto total, auto n) { return total + n; }); std::cout << "Sum of vd: " << sum_vd << '\n'; return 0; }

    It is also possible to make a lambda that prints any type of container.

    #include int main() { std::vector vec{1, 2, 3, 4}; std::list lst{1.1, 2.2, 3.3, 4.4}; auto print = [](const auto& container) { for (const auto& val : container) { std::cout << val << ' '; } std::cout << '\n'; }; print(vec); // print vector print(lst); // print list return 0; }

    10. Lambda Scope

    First, Lambda can capture local variables within the scope in which it is defined. After capture, even if the original scope ends, copies or references of these variables (depending on the capture method) can still continue to be used.

    It is important to note that if a variable is captured by reference and the original scope of the variable has been destroyed, this will lead to undefined behavior.

    Lambda can also capture global variables, but this is not achieved through a capture list, because global variables can be accessed from anywhere.

    If you have a lambda nested inside another lambda, the inner lambda can capture variables in the capture list of the outer lambda.
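
    A small sketch illustrating both of these points (the global g_counter and the nesting here are made up purely for illustration):

    #include <iostream>
    
    int g_counter = 0; // a global: visible inside any lambda without being captured
    
    int main() {
        int x = 1;
        auto outer = [x]() {          // the outer lambda captures x by value
            auto inner = [x]() {      // the inner lambda captures the outer lambda's copy of x
                ++g_counter;          // globals need no capture
                return x + g_counter;
            };
            return inner();
        };
        std::cout << outer() << std::endl; // prints 2 on the first call
        return 0;
    }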

    When Lambda captures a value, even if the original value is gone and Lambda is gone (returned to somewhere else), all variables captured by the value will be copied to the Lambda object. The life cycle of these variables will automatically continue until the Lambda object itself is destroyed. Here is an example:

    #include <functional>
    #include <iostream>
    
    std::function<void()> createLambda() {
        int localValue = 100; // local variable
        return [=]() mutable { // copies localValue by value capture
            std::cout << localValue++ << '\n';
        };
    }
    
    int main() {
        auto myLambda = createLambda(); // the lambda copies localValue
        myLambda(); // Even though the scope of createLambda has ended, the copy still exists in myLambda
        myLambda(); // You can safely continue to access and modify the copy
    }

    When lambda captures a reference, it’s another story. Smart readers should be able to guess that if the scope of the original variable ends, the lambda depends on a dangling reference, which will lead to undefined behavior.

    11. Practice – Function Compute Library

    After all this talk, it's time to put it into practice. No matter what you do, the following are the knowledge points you need to master:

    • capture
    • Higher-order functions
    • Callable Objects
    • Lambda Storage
    • Mutable Lambdas
    • Generic Lambda

    Our goal in this section is to create a math library that supports vector operations, matrix operations, and provides a function parser that accepts a mathematical expression in string form and returns a computable Lambda. Let's get started right away.

    This project starts with simple mathematical function calculations and gradually expands to complex mathematical expression parsing and calculations. Project writing steps:

    • Basic vector and matrix operations
    • Function parser
    • More advanced math functions
    • Composite Functions
    • Advanced Mathematical Operations
    • More expansion...

    Basic vector and matrix operations

    First, define the data structure of vectors and matrices and implement basic arithmetic operations (addition and subtraction).

    In order to simplify the project and focus on the use of Lambda, I did not use templates, so all data is implemented with std::vector.

    In the following code, I have implemented a basic vector framework. Please improve the framework by yourself, including vector subtraction, dot multiplication and other operations.

    // Vector.h
    #include <iostream>
    #include <vector>
    
    class Vector {
    private:
        std::vector<double> elements;
    public:
        // Constructor - explicit to prevent implicit conversion
        Vector() = default;
        explicit Vector(const std::vector<double> &elems);
        Vector operator+(const Vector& rhs) const;
        // Get the vector size
        [[nodiscard]] size_t size() const { return elements.size(); }
        // Access an element and return a reference to it (double&)
        double& operator[](size_t i) { return elements[i]; }
        // Output
        friend std::ostream& operator<<(std::ostream& os, const Vector& v);
    };
    
    /// Vector.cpp
    #include "Vector.h"
    #include <algorithm>
    #include <iterator>
    #include <stdexcept>
    
    Vector::Vector(const std::vector<double> &elems) : elements(elems) {}
    
    Vector Vector::operator+(const Vector &rhs) const {
        // First make sure the two vectors have the same size
        if (this->size() != rhs.size())
            throw std::length_error("Vector sizes are inconsistent!");
        Vector result;
        result.elements.reserve(this->size()); // Allocate memory in advance
        // Use iterators to traverse each element of the vector
        std::transform(elements.begin(), elements.end(), rhs.elements.begin(),
                       std::back_inserter(result.elements),
                       [](double a, double b) { return a + b; });
        return result;
    }
    
    std::ostream& operator<<(std::ostream& os, const Vector& v) {
        os << '[';
        for (size_t i = 0; i < v.elements.size(); ++i) {
            os << v.elements[i];
            if (i < v.elements.size() - 1) { os << ", "; }
        }
        os << ']';
        return os;
    }
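
    As one way to fill in the "improve it yourself" part, subtraction and dot product can reuse the same lambda-plus-algorithm pattern. This is only a sketch: it assumes the corresponding declarations (including a [[nodiscard]] double dot(const Vector&) const member) are added to Vector.h, and that <numeric> is included in addition to the headers above.

    // Vector.cpp (sketch)
    #include <numeric>
    
    Vector Vector::operator-(const Vector &rhs) const {
        if (this->size() != rhs.size())
            throw std::length_error("Vector sizes are inconsistent!");
        Vector result;
        result.elements.reserve(this->size());
        std::transform(elements.begin(), elements.end(), rhs.elements.begin(),
                       std::back_inserter(result.elements),
                       [](double a, double b) { return a - b; });
        return result;
    }
    
    double Vector::dot(const Vector &rhs) const {
        if (this->size() != rhs.size())
            throw std::length_error("Vector sizes are inconsistent!");
        // std::inner_product takes two lambdas: one combines the running sum,
        // the other multiplies the element pairs
        return std::inner_product(elements.begin(), elements.end(), rhs.elements.begin(), 0.0,
                                  [](double acc, double term) { return acc + term; },
                                  [](double a, double b) { return a * b; });
    }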

    You can add the [[nodiscard]] attribute to a declaration so that the compiler checks whether the return value is actually used; users of the library are then warned in the editor, as in the following example.
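
    A minimal sketch of the effect (dot here is just a hypothetical free function; any [[nodiscard]] function behaves the same way, assuming <vector> is included):

    [[nodiscard]] double dot(const std::vector<double>& a, const std::vector<double>& b);
    
    void demo() {
        std::vector<double> a{1, 2}, b{3, 4};
        dot(a, b);            // warning: ignoring return value of a function declared [[nodiscard]]
        double d = dot(a, b); // fine: the result is used
    }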

    Function parser

    Design a function parser that can convert mathematical expressions in string form into Lambda expressions.

    Creating a function parser that can parse mathematical expressions in string form and convert them into Lambda expressions involves parsing theory. To simplify the example, we currently only parse the most basic + and -. Then package the function parser into an ExpressionParser tool class.

    First we create a parser that recognizes + and – signs:

    // ExpressionParser.h
    #include <functional>
    #include <string>
    
    using ExprFunction = std::function<double(double, double)>;
    
    class ExpressionParser {
    public:
        static ExprFunction parse_simple_expr(const std::string& expr);
    };
    
    // ExpressionParser.cpp
    #include "ExpressionParser.h"
    
    ExprFunction ExpressionParser::parse_simple_expr(const std::string &expr) {
        if (expr.find('+') != std::string::npos) {
            return [](double x, double y) { return x + y; };
        } else if (expr.find('-') != std::string::npos) {
            return [](double x, double y) { return x - y; };
        }
        // More operations...
        return nullptr;
    }

    This section is not very relevant to Lambda, so you can skip it. Then we can improve the function parser to recognize numbers based on this. Split the string into tokens (numbers and operators), and then perform operations based on the operators. For more complex expressions, you need to use algorithms such as RPN or existing parsing libraries, so I won't make it so complicated here.

    // ExpressionParser.h ... #include ... static double parse_and_compute(const std::string& expr); ... // ExpressionParser.cpp ... double ExpressionParser::parse_and_compute(const std::string& expr) { std::istringstream iss(expr); std ::vector tokens; std::string token; while (iss >> token) { tokens.push_back(token); } if (tokens.size() != 3) { throw std::runtime_error("Invalid expression format."); } double num1 = std::stod(tokens[0]); const std::string& op = tokens[1]; double num2 = std::stod(tokens[2]); if (op == "+") { return num1 + num2; } else if (op == "-") { return num1 - num2; } else { throw std:: runtime_error("Unsupported operator."); } }

    test:

    // main.cpp
    #include "ExpressionParser.h"
    ...
    std::string expr = "10 - 25";
    std::cout << expr << " = " << ExpressionParser::parse_and_compute(expr) << std::endl;

    Interested readers can also try parsing multiple operators with an operator-precedence parsing algorithm (such as the Shunting Yard algorithm) that converts infix expressions to Reverse Polish Notation (RPN). The following is shown mostly for completeness; it is about data structures and has little to do with lambdas.

    #include #include #include #include #include #include // Determine whether it is an operator bool is_operator(const std::string& token) { return token == "+" || token == "-" || token == "*" || token == "/"; } // Determine the operator priority int precedence(const std::string& token) { if (token == "+" || token == "-") return 1; if (token == "*" || token == "/") return 2; return 0; } // Convert infix expression to reverse Polish notation std::vector infix_to_rpn(const std::vector & tokens) { std::vector output; std::stack operators; for (const auto& token : tokens) { if (is_operator(token)) { while (!operators.empty() && precedence(operators.top()) >= precedence(token)) { output.push_back(operators.top()); operators.pop(); } operators.push(token); } else if (token == "(") { operators.push(token); } else if (token == ")") { while (!operators.empty() && operators.top() != "(") { output.push_back(operators.top()); operators.pop(); } if (!operators.empty()) operators.pop(); } else { output.push_back(token); } } while (!operators.empty()) { output.push_back(operators.top()); operators.pop(); } return output; } // Compute reverse Polish notation double compute_rpn(const std::vector & tokens) { std::stack operands; for (const auto& token : tokens) { if (is_operator(token)) { double rhs = operands.top(); operands.pop(); double lhs = operands.top(); operands.pop(); if (token == "+") operands.push(lhs + rhs); else if (token == "-") operands.push(lhs - rhs); else if (token == "*") operands.push(lhs * rhs); else operands.push(lhs / rhs); } else { operands.push(std::stod(token)); } } return operands.top(); } // Main function int main() { std::string input = "3 + 4 * 2 / ( 1 - 5 )"; std::istringstream iss(input); std::vector tokens; std::string token; while (iss >> token) { tokens.push_back(token); } auto rpn = infix_to_rpn(tokens); for (const auto& t : rpn) { std::cout << t << " "; } std::cout << std::endl; double result = compute_rpn(rpn); std::cout << "Result: " << result << std::endl; return 0; }

    More advanced math functions

    Assume our parser is already able to recognize more advanced mathematical operations, such as trigonometric functions, logarithms, exponentials, etc. We need to provide a lambda expression for each corresponding operation.

    First we modify the aliases of two std::function with different signatures.

    // ExpressionParser.cpp
    using UnaryFunction = std::function<double(double)>;
    using BinaryFunction = std::function<double(double, double)>;
    ...
    // ExpressionParser.cpp
    UnaryFunction ExpressionParser::parse_complex_expr(const std::string& expr) {
        using _t = std::unordered_map<std::string, UnaryFunction>;
        static const _t functions = {
            {"sin", [](double x) -> double { return std::sin(x); }},
            {"cos", [](double x) -> double { return std::cos(x); }},
            {"log", [](double x) -> double { return std::log(x); }},
            // ... add more functions
        };
        auto it = functions.find(expr);
        if (it != functions.end()) {
            return it->second;
        } else {
            // Handle the error or return a default function
            return [](double) -> double { return 0.0; }; // Example error handling
        }
    }

    Composite Functions

    To implement compound mathematical functions, you can combine multiple Lambda expressions. Here is a small example:

    #include #include #include int main() { // define the first function f(x) = sin(x) auto f = [](double x) { return std::sin(x); }; // define the second function g(x) = cos(x) auto g = [](double x) { return std::cos(x); }; // create the composite function h(x) = g(f(x)) = cos(sin(x)) auto h = [f, g](double x) { return g(f(x)); }; // use the composite function double value = M_PI / 4; // PI/4 std::cout << "h(pi/4) = cos(sin(pi/4)) = " << h(value) << std::endl; return 0; }

    If you want a more complicated composite function, say $\cos(\sin(\exp(x)))$, you can do this:

    auto exp_func = [](double x) { return std::exp(x); }; // Create a composite function h(x) = cos(sin(exp(x))) auto h_complex = [f, g, exp_func](double x) { return g(f(exp_func(x))); }; std::cout << "h_complex(1) = cos(sin(exp(1))) = " << h_complex(1) << std::endl;

    One of the advantages of using lambda expressions for function composition is that they allow you to easily create higher-order functions, that is, composite functions that are built on top of each other.

    auto compose = [](auto f, auto g) { return [f, g](double x) { return g(f(x)); }; }; auto h_composed = compose(f, g); std:: cout << "h_composed(pi/4) = " << h_composed(M_PI / 4) << std::endl;

    The above example is the core idea of higher-order functions.

    Advanced Mathematical Operations

    Implements differential and integral calculators that can use lambda expressions to approximate the derivatives and integrals of mathematical functions.

    The differentiation here uses the forward difference method of numerical differentiation to approximate the derivative $f'(x)$.

    The integration was performed using the numerical integration method using the trapezoidal rule.

    // Derivative auto derivative = [](auto func, double h = 1e-5) { return [func, h](double x) { return (func(x + h) - func(x)) / h; }; }; // For example, derivative of sin(x) auto sin_derivative = derivative([](double x) { return std::sin(x); }); std::cout << "sin'(pi/4) ≈ " << sin_derivative(M_PI / 4) << std::endl; // Integration - lower limit a, upper limit b and number of divisions n auto trapezoidal_integral = [](auto func, double a, double b, int n = 1000) { double h = (b - a) / n; double sum = 0.5 * (func(a) + func(b)); for (int i = 1; i < n; i++) { sum += func(a + i * h); } return sum * h; }; // For example, integrate sin(x) from 0 to pi/2 auto integral_sin = trapezoidal_integral([](double x) { return std::sin(x); }, 0, M_PI / 2); std::cout << "∫sin(x)dx from 0 to pi/2 ≈ " << integral_sin << std::endl;

    Numerical Differentiation – Forward Difference Method

    The numerical approximation of the derivative of the function $$f(x)$$ at the point $$x$$ can be given by the forward difference formula:

    $$f'(x) \approx \frac{f(x+h) - f(x)}{h}$$

    Here $$h$$ represents a small increment of $$x$$. As $$h$$ approaches 0, the ratio approaches the true value of the derivative. In the code, we use a relatively small value, $$10^{-5}$$.

    Numerical Integration – Trapezoidal Rule

    The numerical approximation of the definite integral $$\int_a^b f(x)\,dx$$ can be calculated using the trapezoidal rule:

    $$\int_a^b f(x)\,dx \approx h\left[\frac{f(a)+f(b)}{2} + \sum_{i=1}^{n-1} f(a+ih)\right]$$

    Where $$n$$ is the number of small intervals into which $$[a, b]$$ is divided, and $$h$$ is the width of each small interval: $$h = \frac{b-a}{n}$$.

    Intermediate

    1. Lambda's underlying implementation

    On the surface, lambda expressions seem to be just syntactic sugar, but in fact, the compiler will perform some underlying transformations on each lambda expression.

    First, the type of each lambda expression is unique. The compiler generates a distinct class type for each lambda, which is often called a closure type.

    The concept of closure comes from closure in mathematics. It refers to a structure whose internal operations are closed and do not depend on elements outside the structure. In other words, the result of applying any operation to the elements in the collection will still be in the collection. In programming, this word is used to describe the combination of a function and its context. A closure allows you to access variables in the scope of an outer function even if the outer function has finished executing. A function "closes" or "captures" the state of the environment when it is created. By default, the operator() of the closure class generated by lambda expressions is const. In this case, developers cannot modify any data inside the closure, which ensures that they will not modify the captured values, which is consistent with the mathematical and functional origins of closures.

    The compiler generates a closure class. This class overloads operator() so that the closure object can be called like a function; the overloaded operator contains the code of the lambda body.

    Lambda expressions can capture external variables, which are implemented as member variables of the closure class. A capture can be by value or by reference, corresponding respectively to copying the value or storing a reference in the closure class.

    The closure class has a constructor that initializes the captured outer variables. If it is a value capture, these values are copied to the closure object. If it is a reference capture, the reference of the outer variable is stored.

    When a lambda expression is called, the operator() of the closure object is actually called.

    Assume the lambda expression is as follows:

    [capture](parameters) -> return_type { body }

    Here is some pseudo code that a compiler might generate:

    // The pseudocode of the closure class may look like this:
    class UniqueClosureName {
    private:
        // Captured variable
        capture_type captured_variable;
    public:
        // Constructor, used to initialize the captured variable
        UniqueClosureName(capture_type captured) : captured_variable(captured) {}
    
        // Overloaded function call operator
        return_type operator()(parameter_type parameters) const {
            // The body of the lambda expression
        }
    };
    
    // Using an instance of the closure class
    UniqueClosureName closure_instance(captured_value);
    auto result = closure_instance(parameters); // This is equivalent to calling the lambda expression

    2. Lambda types, decltype, and conditional compilation with if constexpr (C++17)

    As we know, each lambda expression has its own unique type, which is automatically generated by the compiler. Even if two lambda expressions look exactly the same, their types are different. These types cannot be expressed directly in the code, we use templates and type inference mechanisms to operate and infer them.

    The decltype keyword can be used to obtain the type of a lambda expression. In the following example, decltype(lambda) obtains the exact type of the lambda expression. In this way, another variable another_lambda of the same type can be declared and the original lambda can be assigned to it. This feature generally plays an important role in template programming.
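
    A minimal sketch of what this paragraph describes (assuming <type_traits> is included for the static_assert):

    auto lambda = [](int x) { return x * 2; };
    decltype(lambda) another_lambda = lambda; // another_lambda has exactly the same closure type
    
    // Two textually identical lambdas still have two distinct closure types:
    auto l1 = [] {};
    auto l2 = [] {};
    static_assert(!std::is_same_v<decltype(l1), decltype(l2)>, "each lambda has its own type");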

    Look at the following example of a chef cooking. You don't know the type of the ingredient yet, but you can use decltype to get the type of the ingredient. The key point is that you can clearly get the type of the return value and mark the return type for the lambda.

    template <typename T>
    auto cookDish(T ingredient) -> decltype(ingredient.prepare()) {
        return ingredient.prepare();
    }

    Furthermore, an important use of decltype in C++ is to choose different code paths for different types at compile time, that is, conditional compilation.

    #include <iostream>
    #include <type_traits>
    
    template <typename T>
    void process(T value) {
        if constexpr (std::is_same<T, int>::value) {
            std::cout << "Processing integer: " << value << std::endl;
        } else if constexpr (std::is_same<T, double>::value) {
            std::cout << "Processing floating point number: " << value << std::endl;
        } else {
            std::cout << "Processing other types: " << value << std::endl;
        }
    }

    The following example is about lambda.

    #include #include // A generic function that performs different operations depending on the lambda type passed in template void executeLambda(T lambda) { if constexpr (std::is_same ::value) { std::cout << "Lambda is a void function with no parameters." << std::endl; lambda(); } else if constexpr (std::is_same ::value) { std::cout << "Lambda is a void function taking an int." << std::endl; lambda(10); } else { std::cout << "Lambda is of an unknown type." << std::endl; } } int main() { // Lambda with no parameters auto lambda1 = []() { std::cout << "Hello from lambda1!" << std::endl; }; // Lambda with one int parameter auto lambda2 = [](int x) { std::cout << "Hello from lambda2, x = " << x << std::endl; }; executeLambda(lambda1); executeLambda(lambda2); return 0; }

    3. Lambda’s evolution in the new standard

    C++11

    • Introducing lambda expressions: lambda expressions were first introduced in the C++11 standard and make it easy to define anonymous function objects. The basic form is [capture](parameters) -> return_type { body }.
    • Capture List: Supports capturing external variables by value (=) or reference (&).

    C++14

    • Generic Lambda: Allows the use of the auto keyword in the parameter list, making Lambda work like a template function.
    • Capture Initialization: Allows the use of initializer expressions in capture lists to create lambda-specific data members.

    C++17

    • Default construction and assignment: The closure type produced by a lambda expression can be default constructible and assignable under certain conditions.
    • Capture the *this pointer: By capturing *this, you can copy the current object to Lambda by value to avoid the dangling pointer problem.
    • constexpr Lambda: constexpr Lambda can be used to perform calculations at compile time. It is particularly useful in scenarios such as template metaprogramming and compile-time data generation.
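
    A small sketch of the constexpr point above (not from the course material; <array> assumed):

    #include <array>
    
    constexpr auto square = [](int n) { return n * n; }; // constexpr lambda (C++17)
    
    static_assert(square(4) == 16);      // evaluated at compile time
    std::array<int, square(3)> buffer{}; // the result can be used as a constant expression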

    C++20

    • Template Lambda: Lambda expressions can have template parameter lists, similar to template functions.
    • More flexible capture lists: Capture lists of the form [=, this] and [&, this] are allowed.
    • Implicit move capture: automatically use move capture when appropriate (only copy and reference capture are supported in C++14).

    4. State-preserving Lambda

    In the following example, the value-captured and reference-captured variables x0 and x1 are what keep the lambda's state; with mutable, the lambda can also maintain its own internal state (count).

    #include <iostream>
    
    int main() {
        int x0 = 10, x1 = 20, count = 0;
        auto addX = [x0, &x1, count](int y) mutable {
            count++;
            return x0 + x1 + y + count;
        };
        std::cout << addX(5) << std::endl; // output 36
        std::cout << addX(5) << std::endl; // output 37
        std::cout << addX(5) << std::endl; // output 38
    }

    5. Optimization and Lambda

    Why is Lambda good?

    • Inline optimization: Lambda is generally short, and inline optimization reduces function call overhead.
    • Avoid unnecessary object creation: Reference capture and move semantics can reduce the overhead of transferring and copying large objects.
    • Deferred computation: Calculations are performed only when the result is actually needed.
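
    A small sketch combining the last two points: move capture avoids copying a large object into the closure, and no work happens until the closure is actually called (the names here are illustrative only):

    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>
    
    int main() {
        std::vector<std::string> lines(1000, "some large payload");
    
        // Move-capture: the vector is moved into the closure instead of being copied,
        // and nothing is computed yet.
        auto task = [data = std::move(lines)]() {
            std::size_t total = 0;
            for (const auto& s : data) total += s.size();
            return total;
        };
    
        // ... later, only when the result is actually needed:
        std::cout << "total characters: " << task() << '\n';
    }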

    6. Integration with other programming paradigms

    Functional Programming

    class StringBuilder { private: std::string str; public: StringBuilder& append(const std::string& text) { str += text; return *this; } const std::string& toString() const { return str; } }; // Use StringBuilder builder; builder.append("Hello, ").append("world! "); std::cout << builder.toString() << std::endl; // Output "Hello, world! "

    Pipeline call

    #include <iostream>
    #include <ranges>
    #include <vector>
    
    int main() {
        std::vector<int> vec = {1, 2, 3, 4, 5};
        auto pipeline = vec
                      | std::views::transform([](int x) { return x * 2; })
                      | std::views::filter([](int x) { return x > 5; });
        for (int n : pipeline) std::cout << n << " "; // Output the elements that meet the condition: 6 8 10
    }

    7. Lambda and Exception Handling

    auto divide = [](double numerator, double denominator) { if (denominator == 0) { throw std::runtime_error("Division by zero."); } return numerator / denominator; }; try { auto result = divide( 10.0, 0.0); } catch (const std::runtime_error& e) { std::cerr << "Caught exception: " << e.what() << std::endl; }

    Exceptions thrown inside a lambda expression can, of course, also be caught outside of it at the call site. That is:

    auto riskyTask = []() {
        // Assume something here may throw an exception
    };
    
    try {
        riskyTask();
    } catch (...) {
        // Handle the exception
    }

    Lambda expressions also support exception specifications in the form of noexcept.

    Before C++17, you could use dynamic exception specifications in function declarations, such as throw(Type), to specify the types of exceptions that a function might throw. However, this practice was deprecated in C++17 and completely removed in C++20. Instead, the noexcept keyword is used to indicate whether a function throws an exception.

    auto lambdaNoExcept = []() noexcept {
        // This guarantees that no exception will be thrown
    };

    Advanced

    1. Lambda and noexcept (C++11)

    noexcept can be used to specify whether a lambda expression is guaranteed not to throw exceptions.

    auto lambda = []() noexcept {
        // The code here is guaranteed not to throw an exception
    };

    When the compiler knows that a function will not throw exceptions, it can generate more optimized code.

    You can also write noexcept(false) explicitly to document that a lambda may throw, although it is equivalent to writing nothing at all.

    auto lambdaWithException = []() noexcept(false) {
        // The code here may throw an exception
    };

    2. Template parameters in Lambda (C++20)

    In C++20, Lambda expressions have received an important enhancement, which is the support of template parameters. How cool!

    auto lambda = []<typename T>(T param) {
        // Code using template parameter T
    };

    auto print = []<typename T>(const T& value) {
        std::cout << value << std::endl;
    };
    print(10);      // prints an integer
    print("Hello"); // prints a string

    3. Lambda Reflection

    I don’t know, I’ll write about it later.

    4. Cross-platform and ABI issues

    I don’t know, I’ll write about it later.

  • GAMES101.HW7:路径追踪代码实现和微材质模型

    GAMES101.HW7: Path tracing code implementation and micro-texture model

    Project code https://github.com/Remyuu/GAMES101-Homework

    This article is divided into two parts: the path tracing code implementation and the microfacet model.

    In HW5 we built a Whitted-style ray tracing project, and in HW6 we added a BVH acceleration structure to speed up the intersection tests. This time we implement path tracing, use multi-threading to accelerate rendering, and finally apply a microfacet model to give the project rougher materials.

    In addition, note that the microfacet-model content of this article is mainly derived from Ref. 5 and covers the basic theory and code implementation of the Cook-Torrance model.

    This article basically explains the entire content of the framework. If there are any errors in the content, please point them out. This project is about rendering a CornellBox scene, and the final effect is roughly as shown below:

    img

    Parameter set 1: {SPP: 64, res: {784, 784}, parallel: false, RussianRoulette = 0.8}, render time: {4101 seconds}

    Parameter set 2: {SPP: 64, res: {784, 784}, parallel: true, RussianRoulette = 0.8, cookTorrance, PDF = GGX}, render time: {3415 seconds}

    Assignment 7 framework: Download (please forgive the slow download speed of my small self-hosted server)

    Project Flow – main.cpp

    As usual, we start the analysis from the main function.

    The process of this project is very simple: set up the scene and render it. Let's take a closer look.

    First initialize the Scene object and set the scene resolution.

    // Change the definition here to change resolution
    Scene scene(784, 784);

    Create four materials - red, green, white and light. These materials use the DIFFUSE type and set different diffuse coefficients (Kd).

    Material* red = new Material(DIFFUSE, Vector3f(0.0f));
    red->Kd = Vector3f(0.63f, 0.065f, 0.05f);
    ...
    Material* light = new Material(DIFFUSE,
        (8.0f  * Vector3f(0.747f + 0.058f, 0.747f + 0.258f, 0.747f) +
         15.6f * Vector3f(0.740f + 0.287f, 0.740f + 0.160f, 0.740f) +
         18.4f * Vector3f(0.737f + 0.642f, 0.737f + 0.159f, 0.737f)));
    light->Kd = Vector3f(0.65f);

    Create the Cornell scene objects and then add them to the scene.

    MeshTriangle floor("../models/cornellbox/floor.obj", white);
    ...
    MeshTriangle light_("../models/cornellbox/light.obj", light);
    scene.Add(&floor);
    ...
    scene.Add(&light_);

    Build a BVH to accelerate collision detection between rays and objects in the scene.

    scene.buildBVH();

    Finally, a renderer object r is created to render the scene and the rendering time is recorded.

    Renderer r;
    auto start = std::chrono::system_clock::now();
    r.Render(scene);
    auto stop = std::chrono::system_clock::now();

    The above is the general process of the project.

    Object abstract base class – Object

    The Object class defines all the basic behaviors that an object needs in the ray tracing algorithm. It uses pure virtual functions to indicate that this is an interface that needs to be inherited by specific object classes (MeshTriangle, Sphere, and Triangle) and implement these methods.

    For detailed instructions, see the following code comments:

    Object() {}
    virtual ~Object() {}
    // Used to determine whether a ray intersects with the object
    virtual bool intersect(const Ray& ray) = 0;
    // Also detects whether a ray intersects the object, but additionally returns
    // the parametric distance of the intersection and the intersection index
    virtual bool intersect(const Ray& ray, float &, uint32_t &) const = 0;
    // Returns the intersection information between the ray and the object
    virtual Intersection getIntersection(Ray _ray) = 0;
    // Obtains the properties of the object surface, such as the surface normal, texture coordinates, etc.
    virtual void getSurfaceProperties(const Vector3f &, const Vector3f &, const uint32_t &,
                                      const Vector2f &, Vector3f &, Vector2f &) const = 0;
    // Evaluates the diffuse color of the object at a specific texture coordinate
    virtual Vector3f evalDiffuseColor(const Vector2f &) const = 0;
    // Returns the bounding box of the object
    virtual Bounds3 getBounds() = 0;
    // Returns the surface area of the object; each shape computes it differently
    virtual float getArea() = 0;
    // Samples a point on the surface of the object for light source sampling.
    // `pos` receives the sampled point and `pdf` its probability density value.
    virtual void Sample(Intersection &pos, float &pdf) = 0;
    // Determines whether the object emits light, i.e. whether it is a light source
    virtual bool hasEmit() = 0;

    Based on this class, we also created three specific object classes: MeshTriangle, Sphere and Triangle. These three classes are subclasses of the object class Object and are used to represent different geometric shapes in three-dimensional space.

    Since we need to render the picture by ray tracing, we need to implement an important operation called intersect, which is used to detect whether a ray intersects with an object.

    In addition, each class has a data member m of the Material type, which represents the material of the object. The material defines the object's color, texture, emitted light and other properties.

    In Triangle.hpp, rayTriangleIntersect uses the Möller-Trumbore algorithm to determine whether a ray intersects a triangle in 3D space and, if so, to compute the exact location of the intersection. For a detailed code explanation, please see my other article; the main steps are written in the code comments.

    bool rayTriangleIntersect(const Vector3f& v0, const Vector3f& v1, const Vector3f& v2,
                              const Vector3f& orig, const Vector3f& dir,
                              float& tnear, float& u, float& v)
    {
        // First, compute the two edge vectors of the triangle (edge1 and edge2),
        // then compute pvec as the cross product of the ray direction dir and edge2.
        Vector3f edge1 = v1 - v0;
        Vector3f edge2 = v2 - v0;
        Vector3f pvec = crossProduct(dir, edge2);
        // The determinant is the dot product of edge1 and pvec. If it is 0 or negative,
        // the ray is parallel to the triangle or approaches it from the back side; return false.
        float det = dotProduct(edge1, pvec);
        if (det == 0 || det < 0)
            return false;
        Vector3f tvec = orig - v0;
        u = dotProduct(tvec, pvec);
        if (u < 0 || u > det)
            return false;
        // Compute qvec as the cross product of tvec and edge1, and v as its dot product with dir.
        // If v is less than 0 or u + v is greater than det, return false.
        Vector3f qvec = crossProduct(tvec, edge1);
        v = dotProduct(dir, qvec);
        if (v < 0 || u + v > det)
            return false;
        float invDet = 1 / det;
        // If all tests pass, the ray intersects the triangle. Compute the depth tnear of the
        // intersection point and the barycentric coordinates (u, v) inside the triangle.
        tnear = dotProduct(edge2, qvec) * invDet;
        u *= invDet;
        v *= invDet;
        return true;
    }

    Due to space constraints, we will only focus on these three classes.

    Triangle

    Next, a Triangle object represents a triangle in three-dimensional space.

    Constructor

    Triangle(Vector3f _v0, Vector3f _v1, Vector3f _v2, Material* _m = nullptr)
        : v0(_v0), v1(_v1), v2(_v2), m(_m)
    {
        e1 = v1 - v0;
        e2 = v2 - v0;
        normal = normalize(crossProduct(e1, e2));
        area = crossProduct(e1, e2).norm() * 0.5f;
    }

    Each triangle has three vertices (v0, v1, and v2), two edge vectors (e1 and e2), a normal vector (normal), an area (area), and a material pointer (m). In the triangle constructor, the edge vector, normal vector, and area are calculated based on the three vertices input. The area is calculated by half the modulus of the cross product of e1 and e2.

    Triangle related operations

    There are three functions (Sample, getArea and hasEmit) that are directly overridden.

    ...
    void Sample(Intersection &pos, float &pdf) {
        float x = std::sqrt(get_random_float()), y = get_random_float();
        pos.coords = v0 * (1.0f - x) + v1 * (x * (1.0f - y)) + v2 * (x * y);
        pos.normal = this->normal;
        pdf = 1.0f / area;
    }
    float getArea() { return area; }
    bool hasEmit() { return m->hasEmission(); }

    1. The Sample function randomly samples a point on the surface of the triangle and returns:
       • information about this point (its position and normal vector);
       • the probability density function value (pdf) of the sampled point.
    2. The getArea function returns the area of the triangle.
    3. The hasEmit function checks whether the triangle's material emits light.

    MeshTriangle

    The MeshTriangle class is also a subclass of the Object class. It represents a 3D model or mesh composed of many triangles. In other words, a MeshTriangle object may contain many Triangle objects. For example, you can use 12 triangles (2 triangles per face, a total of 6 faces) to represent a cube model.

    The MeshTriangle class also includes some additional functions, such as calculating the model's AABB bounding box, BVHAccel objects, etc.

    Constructor

    The following pseudo code succinctly describes the construction process of MeshTriangle:

    MeshTriangle(string filename, Material* mt) {
        // 1. Load the model file
        loader.LoadFile(filename);
        // 2. Create a Triangle object for each face and store it
        for (each face in model) {
            Triangle tri = create triangle(face vertices, mt);
            triangles.push_back(tri);
        }
        // 3. Calculate the bounding box and total area of the model
        bounding_box = calculate bounding box(all vertices of the model);
        area = calculate total area(all triangles);
        // 4. Create a BVH for fast intersection testing
        bvh = create BVH(all triangles);
    }

    The constructor accepts a file name filename and a material mt, and then uses objl::Loader to load the 3D model. After loading the model, it traverses all the triangles of the model and creates corresponding Triangle objects. At the same time, the bounding box of the entire model and the total area of all triangles are calculated and stored.

    In the framework, we use the objl::Loader class to read the .obj file. Call loader.LoadFile(filename) to complete the loading. Then access loader.LoadedMeshes to obtain the loaded 3D model data.

    objl::Loader loader;
    loader.LoadFile(filename);
    ...
    auto mesh = loader.LoadedMeshes[0];

    When loading, we noticed an assertion, which checks whether a model has only one mesh. If there are multiple meshes or no mesh, an assertion error is triggered.

    assert(loader.LoadedMeshes.size() == 1);

    Initializes the minimum and maximum values of the model's vertices, which are used to calculate the axis-aligned bounding box of the 3D model.

    Vector3f min_vert = Vector3f{ std::numeric_limits<float>::infinity(),
                                  std::numeric_limits<float>::infinity(),
                                  std::numeric_limits<float>::infinity()};
    Vector3f max_vert = Vector3f{-std::numeric_limits<float>::infinity(),
                                 -std::numeric_limits<float>::infinity(),
                                 -std::numeric_limits<float>::infinity()};

    Next we need to understand the data structure of the mesh in the objl library, that is, what is stored in the objl::Loader loader.

    • MeshName: stores the name of the mesh
    • Vertices: Stores all vertex data in the model, including position, normal and texture coordinates.
    • Indices: Stores all face (usually triangle) data in the model. Each face consists of a set of indices pointing to vertices in Vertices.
    • MeshMaterial: Stores all material data in the model, including diffuse color, specular color, texture and other properties.

    The vertex information of each triangle is stored continuously in Vertices, so we construct a Triangle with three vertices as a group, and then set the AABB.

    for (int i = 0; i < mesh.Vertices.size(); i += 3) {
        std::array<Vector3f, 3> face_vertices;
        for (int j = 0; j < 3; j++) {
            auto vert = Vector3f(mesh.Vertices[i + j].Position.X,
                                 mesh.Vertices[i + j].Position.Y,
                                 mesh.Vertices[i + j].Position.Z);
            face_vertices[j] = vert;
            min_vert = Vector3f(std::min(min_vert.x, vert.x),
                                std::min(min_vert.y, vert.y),
                                std::min(min_vert.z, vert.z));
            max_vert = Vector3f(std::max(max_vert.x, vert.x),
                                std::max(max_vert.y, vert.y),
                                std::max(max_vert.z, vert.z));
        }
        triangles.emplace_back(face_vertices[0], face_vertices[1], face_vertices[2], mt);
    }
    bounding_box = Bounds3(min_vert, max_vert);

    Finally, the areas of all triangles are calculated and the BVH acceleration structure is constructed. In the loop, the code first stores the pointers of all triangles into ptrs, then calculates the sum of the areas of all triangles. Then all triangle pointers are passed into the BVH constructor.

    std::vector<Object*> ptrs;
    for (auto& tri : triangles) {
        ptrs.push_back(&tri);
        area += tri.area;
    }
    bvh = new BVHAccel(ptrs);

    Mesh triangle related operations

    1. Calculation of patch attributes

    First comes the calculation of patch attributes, getSurfaceProperties, where the following properties need to be computed:

    1. The normal vector of the triangle, N: this is easy; take two edges of the triangle and compute their cross product.
    2. The texture coordinates, st: interpolated from the texture coordinates of the triangle's vertices, where uv is the barycentric coordinate of the intersection point inside the triangle. A detailed explanation follows.

    void getSurfaceProperties(const Vector3f& P, const Vector3f& I, const uint32_t& index,
                              const Vector2f& uv, Vector3f& N, Vector2f& st) const
    {
        const Vector3f& v0 = vertices[vertexIndex[index * 3]];
        const Vector3f& v1 = vertices[vertexIndex[index * 3 + 1]];
        const Vector3f& v2 = vertices[vertexIndex[index * 3 + 2]];
        Vector3f e0 = normalize(v1 - v0);
        Vector3f e1 = normalize(v2 - v1);
        N = normalize(crossProduct(e0, e1));
        const Vector2f& st0 = stCoordinates[vertexIndex[index * 3]];
        const Vector2f& st1 = stCoordinates[vertexIndex[index * 3 + 1]];
        const Vector2f& st2 = stCoordinates[vertexIndex[index * 3 + 2]];
        st = st0 * (1 - uv.x - uv.y) + st1 * uv.x + st2 * uv.y;
    }

    We have also seen this function in Triangle, but it is different in MeshTriangle.

    As their names imply, Triangle represents an independent triangle, while MeshTriangle represents a group of interconnected triangles, that is, a triangular mesh. The getSurfaceProperties of Triangle directly uses the vertex and texture coordinate information stored in the Triangle class. Since our MeshTriangle has multiple triangles, we know the triangle to be obtained through the parameter index.

    For the calculation of texture coordinates, first we know that UV coordinates are used in the process of mapping 2D textures onto 3D models. Use the uv coordinates to weight the st coordinates of the triangle vertices to obtain the st coordinates of the intersection. This is called interpolation. These coordinates define the corresponding position of each vertex of the 3D model on the 2D texture.

    In this function,

    • st0, st1, and st2 are the texture coordinates corresponding to the triangle vertices. These coordinates specify the position of the vertices in the texture map.
    • 1 - uv.x - uv.y, uv.x and uv.y correspond to the weights of the three vertices of the triangle. In other words, if you are at a vertex of the triangle, the weight of that vertex is 1, and the weights of the other vertices are 0.
    • In addition, stCoordinates is defined in advance by the artists, so programmers don’t need to worry about it.

    According to the Möller-Trumbore algorithm, st0 corresponds to (1 - uv.x - uv.y), st1 corresponds to uv.x, and st2 corresponds to uv.y.

    To summarize what this function does: multiply the texture coordinate st of each vertex by its corresponding weight, and then add them together. This is an interpolation method that can be used to find the texture coordinates of any point within the triangle.
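    Written as a formula, the interpolation in the last line of getSurfaceProperties is simply

    $st = (1 - u - v)\,st_0 + u\,st_1 + v\,st_2$

    where $(u, v)$ are the barycentric coordinates returned by the Möller-Trumbore intersection test.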

    2. The diffuse color of the uv coordinate on a specific material

    The evalDiffuseColor function is used to calculate the diffuse color of a given 2D texture coordinate on a specific material.

    When light hits an object's surface, it will reflect according to certain rules, which are affected by the material of the object's surface. Diffuse color is a way to describe this reflection effect, which represents the object's ability to reflect light.

    The effect diagram when the pattern item is set to 0, default and 1 respectively:

    img
    Vector3f evalDiffuseColor(const Vector2f& st) const
    {
        float scale = 5;
        float pattern = (fmodf(st.x * scale, 1) > 0.5) ^ (fmodf(st.y * scale, 1) > 0.5);
        return lerp(Vector3f(0.815, 0.235, 0.031), Vector3f(0.937, 0.937, 0.231), pattern);
    }

    Collision point information class – Intersection

    Collision information structure

    This class is used to store information about the intersection of light and objects. I have written the function of each item in the comments below for your reference.

    struct Intersection {
        Intersection() {
            happened = false;
            coords = Vector3f();
            normal = Vector3f();
            distance = std::numeric_limits<double>::max();
            obj = nullptr;
            m = nullptr;
        }
        // happened indicates whether an intersection actually occurred.
        // If the ray did not hit any object, happened is false.
        bool happened;
        // coords is the position of the intersection.
        // If happened is true, coords contains the exact point where the ray hits the object.
        Vector3f coords;
        // tcoords is the texture coordinate of the intersection,
        // used to look up the surface texture information at that point.
        Vector3f tcoords;
        // normal is the normal vector at the intersection, perpendicular to the surface;
        // it determines the surface orientation and is key to lighting calculations.
        Vector3f normal;
        // emit is the emission of the light source at the intersection;
        // it is non-zero only if the hit object is a light source.
        Vector3f emit;
        // Distance from the ray origin to the intersection.
        double distance;
        // Pointer to the object that the ray hits.
        Object* obj;
        // Material of the object at the intersection (color, smoothness, reflectivity, etc.).
        Material* m;
    };

    Get collision information

    This function is actually in Triangle, Sphere and BVHAccel, but the function is inseparable from the Intersection structure. At the same time, for the sake of typesetting, it is simply placed in this chapter.

    Intersection in Triangle class

    First, paste the source code directly:

    inline Intersection Triangle::getIntersection(Ray ray)
    {
        Intersection inter;
        if (dotProduct(ray.direction, normal) > 0)
            return inter;
        double u, v, t_tmp = 0;
        Vector3f pvec = crossProduct(ray.direction, e2);
        double det = dotProduct(e1, pvec);
        if (fabs(det) < EPSILON)
            return inter;
        double det_inv = 1. / det;
        Vector3f tvec = ray.origin - v0;
        u = dotProduct(tvec, pvec) * det_inv;
        if (u < 0 || u > 1)
            return inter;
        Vector3f qvec = crossProduct(tvec, e1);
        v = dotProduct(ray.direction, qvec) * det_inv;
        if (v < 0 || u + v > 1)
            return inter;
        t_tmp = dotProduct(e2, qvec) * det_inv;
        if (t_tmp < 0)
            return inter;
        inter.distance = t_tmp;
        inter.coords = ray(t_tmp);
        inter.happened = true;
        inter.m = m;
        inter.normal = normal;
        inter.obj = this;
        return inter;
    }

    Renderer Class – Renderer

    The operation process of the Renderer is very simple: it loops to generate an image for each pixel on the screen.

    Here are some simple instructions:

    • spp (samples per pixel): The number of samples per pixel, which indicates how many rays the ray tracing algorithm will cast in one pixel.
    • framebuffer: A one-dimensional array of size width*height that stores the color of each pixel in the scene.
    • scene.castRay(Ray(eye_pos, dir), 0) function: Calls the ray tracing algorithm. This function will shoot a ray and calculate the color of the object that this ray hits in the scene. This color value is then added to the color of the corresponding pixel in the framebuffer.

    So the key to the Renderer class is this one call to castRay; castRay itself lives in the Scene class, which is the focus of this article.
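    For orientation, here is a minimal single-threaded sketch of the per-pixel loop inside Renderer::Render; the variable names (eye_pos, scale, imageAspectRatio, framebuffer, spp) follow the ones used in the multithreaded version shown later, and details such as progress output and writing the image to disk are omitted.

    // Sketch of the per-pixel loop (single-threaded), not the framework's exact code
    for (uint32_t j = 0; j < scene.height; ++j) {
        for (uint32_t i = 0; i < scene.width; ++i) {
            // Map the pixel (i, j) to a primary ray direction in camera space
            float x = (2 * (i + 0.5) / (float)scene.width - 1) * imageAspectRatio * scale;
            float y = (1 - 2 * (j + 0.5) / (float)scene.height) * scale;
            Vector3f dir = normalize(Vector3f(-x, y, 1));
            // Average spp samples for this pixel
            for (int k = 0; k < spp; k++)
                framebuffer[j * scene.width + i] += scene.castRay(Ray(eye_pos, dir), 0) / spp;
        }
    }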

    Some notes on intersect

    We see a lot of intersect functions in the project, which can be dizzying. For example, the first intersect() we meet, in the Triangle class, simply returns true; shouldn't it contain actual intersection logic rather than returning a constant? There are also Intersect() in BVHAccel, IntersectP() in Bounds3, and the getIntersection methods of the concrete objects, and it is not obvious how they relate.

    In this regard, the process is briefly explained.

    Following the call chain: castRay() first calls BVHAccel's Intersect(), which walks the BVH data structure down to the AABBs of the leaf nodes; for each node it calls Bounds3's IntersectP to test whether the node's bounding box intersects the ray.

    • If they do not intersect: return a default-constructed Intersection (an empty collision record);
    • If the current node is a leaf node: call the object's getIntersection method directly to compute the intersection of the ray and the object.

    Next, the getIntersection method will return the Intersection data structure corresponding to their object type. The following are the getIntersection methods of Sphere, MeshTriangle and Triangle respectively.

    Intersection getIntersection(Ray ray) { // Sphere
        Intersection result;
        ...
        return result;
    }

    Intersection getIntersection(Ray ray) { // MeshTriangle
        Intersection intersec;
        if (bvh)
            intersec = bvh->Intersect(ray);
        return intersec;
    }

    The getIntersection method of Triangle is declared inside the class definition (marked override), but its definition lives outside the class. The code that actually runs is therefore this inline definition:

    inline Intersection Triangle::getIntersection(Ray ray) {
        ...
    }

    Scene

    Before talking about castRay, let's take a quick look at the general shape of the Scene.

    This class contains all the objects and lights in the scene, as well as some rendering parameters, such as the scene's width, height, field of view, background color, etc. Next, we will explain the main parts of this class:

    Member variables:

    • width and height are the pixel width and height of the scene, fov is the field of view of the camera, and backgroundColor is the background color of the scene.
    • objects and lights store the objects and light sources in the scene respectively. objects is a vector of Object* pointers, and lights is a vector of smart pointers to lights (std::unique_ptr<Light>).
    • bvh is a pointer to BVHAccel, which is used to store the scene's bounding volume hierarchy (BVH) structure to speed up the intersection calculation between rays and objects.
    • maxDepth and RussianRoulette are used to control the details of the path tracing algorithm (we will come back to this in a moment).

    Member functions:

    • The Add function is used to add objects or light sources to the scene.
    • The HandleAreaLight function is used to handle ray tracing of area light sources.
    • The reflect and refract functions are used to calculate the direction of reflection and refraction of light, respectively. The refract function implements Snell's law and is used to calculate the direction of refraction of light when it enters different media. Two situations need to be handled specially: light is emitted from the inside of the object, and light is emitted from the outside of the object.
    • The fresnel function is used to calculate the Fresnel equation and obtain the ratio of reflected light to transmitted light. That is, the ratio of reflection to transmission when light passes through the interface of two different media.
    • The functions of intersect, buildBVH, castRay, sampleLight and trace are described in detail later.

    So far, this Scene class encapsulates all the data and operations of a ray tracing rendering scene. Next, we will talk about some details of this class in detail.

    In Scene.cpp, we first have the buildBVH() function, which creates a bounding volume hierarchy (BVH).

    The intersect(const Ray &ray) const function is then used to find the intersection of the ray defined by ray with any object in the scene. It does this for efficiency by using the previously constructed BVH.

    sampleLight(Intersection &pos, float &pdf) const The function is used to randomly sample the luminous objects in the scene. This function first calculates the sum of the areas of all luminous objects, and then randomly selects a position in this area as a sampling point.

    void Scene::sampleLight(Intersection &pos, float &pdf) const
    {
        float emit_area_sum = 0;
        for (uint32_t k = 0; k < objects.size(); ++k) {
            if (objects[k]->hasEmit()) {
                emit_area_sum += objects[k]->getArea();
            }
        }
        float p = get_random_float() * emit_area_sum;
        emit_area_sum = 0;
        for (uint32_t k = 0; k < objects.size(); ++k) {
            if (objects[k]->hasEmit()) {
                emit_area_sum += objects[k]->getArea();
                if (p <= emit_area_sum) {
                    objects[k]->Sample(pos, pdf);
                    break;
                }
            }
        }
    }

    Input: an Intersection reference pos and a float reference pdf.

    These two parameters are output parameters, that is, this method will change their values. The object of type Intersection is used to store the intersection information between the light and the object, and pdf represents the probability density of selecting this intersection.

    Output: This method has no return value, its result is returned by changing the values of pos and pdf.

    Specific process:

    1. First, this method calculates the sum of the areas of all emitting objects, emit_area_sum, through the first loop. For each object, it first calls the hasEmit method to check whether the object can emit light. If it can, it calls the getArea method to get the area of the object and adds it to emit_area_sum.
    2. Next, it generates a random number p between 0 and emit_area_sum. This random number p is used to randomly select an emitter object.
    3. Then, it goes through the second loop to select the emitter. In this loop, it first checks whether each object can emit light. If it can, it adds the area of the object and then checks whether p is less than or equal to the current emit_area_sum. If so, then this object is the selected object. Then, it calls the Sample method of this object, randomly selects a point on this object, and updates the values of pos and pdf. Finally, it jumps out of the loop and ends this method.

    Path Tracing Implementation – castRay()

    Finally, we come to castRay. We are going to use the Path Tracing algorithm to implement the ray tracing function.

    // Implementation of Path Tracing
    Vector3f Scene::castRay(const Ray &ray, int depth) const
    {
        // TO DO Implement Path Tracing Algorithm here
    }

    Next, let's introduce the key processes: light source sampling and indirect lighting calculation.

    In other words, we can roughly divide the PT algorithm into two parts: direct illumination and indirect illumination.

    First some basic definitions:

    • p: The collision point is the intersection of the light and the object in the scene.
    • wo: outgoing direction, which is the direction reflected from the intersection point to the camera.
    • wi: incoming direction, i.e. the direction from which light arrives at the intersection point (from the light source or from another surface).

    The following is pseudo code that details the above process:

    shade(p, wo)
        // First compute the contribution from direct lighting
        sampleLight(inter, pdf_light)
        Get x, ws, NN, emit from inter
        Shoot a ray from p to x
        If the ray is not blocked in the middle
            L_dir = emit * eval(wo, ws, N) * dot(ws, N) * dot(ws, NN)
                    / |x - p|^2 / pdf_light

        // Then compute the contribution from indirect lighting
        L_indir = 0.0
        Test Russian Roulette with probability RussianRoulette
        wi = sample(wo, N)
        Trace a ray r(p, wi)
        If ray r hit a non-emitting object at q
            L_indir = shade(q, wi) * eval(wo, wi, N) * dot(wi, N)
                      / pdf(wo, wi, N) / RussianRoulette

        Return L_dir + L_indir

    Make sure you have a clear understanding of how Path Tracing is implemented, then start writing code.

    Vector3f Scene::castRay(const Ray &ray, int depth) const
    {
        const float EPLISON = 0.0001f;
        Intersection p_inter = intersect(ray);
        if (!p_inter.happened)
            return Vector3f();                  // Default-constructed zero vector - black
        if (p_inter.m->hasEmission())           // Is it self-luminous?
            return p_inter.m->getEmission();    // Directly return the emission color

        // Sample the light source
        Intersection x_inter;
        float pdf_light = 0.0f;
        sampleLight(x_inter, pdf_light);

        // Get x, ws, N, NN, emit from inter
        Vector3f p = p_inter.coords;                // Object intersection coordinates
        Vector3f x = x_inter.coords;                // Light source intersection coordinates
        Vector3f ws_dir = (x - p).normalized();     // Direction from the object to the light source
        float ws_distance = (x - p).norm();         // Distance from the object to the light source
        Vector3f N = p_inter.normal.normalized();   // Normal at the object intersection
        Vector3f NN = x_inter.normal.normalized();  // Normal at the light source intersection
        Vector3f emit = x_inter.emit;               // Emission of the light source intersection

        // Shoot a ray from p to x
        Vector3f l_dir(0.0f), l_indir(0.0f);        // See the following description
        Ray ws_ray(p, ws_dir);                      // Build a ray from p to the light sample point
        Intersection ws_ray_inter = intersect(ws_ray);
        // If the ray is not blocked in the middle,
        // i.e. the straight path from object point p to light source point x is not occluded
        if (ws_ray_inter.distance - ws_distance > -EPLISON) {
            // See below for details
            l_dir = emit * p_inter.m->eval(ray.direction, ws_ray.direction, N)
                    * dotProduct(ws_ray.direction, N)
                    * dotProduct(-ws_ray.direction, NN)
                    / (ws_distance * ws_distance) / pdf_light;
        }

        // Test Russian Roulette with probability RussianRoulette
        if (get_random_float() <= RussianRoulette) {
            // See below for details
            Vector3f wi_dir = p_inter.m->sample(ray.direction, N).normalized();
            Ray wi_ray(p_inter.coords, wi_dir);
            // If ray r hit a non-emitting object at q
            Intersection wi_inter = intersect(wi_ray);
            // If a collision is detected and the hit point is not emitting, compute indirect lighting
            if (wi_inter.happened && (!wi_inter.m->hasEmission())) {
                // See the description below for details
                l_indir = castRay(wi_ray, depth + 1)
                          * p_inter.m->eval(ray.direction, wi_ray.direction, N)
                          * dotProduct(wi_ray.direction, N)
                          / p_inter.m->pdf(ray.direction, wi_ray.direction, N)
                          / RussianRoulette;
            }
        }
        return l_dir + l_indir;
    }

    A few points need to be explained:

    1. ray or wo_ray: This is the ray we are dealing with, called the outgoing ray, which means the ray that propagates from a certain point to a certain direction. In the castRay() function, ray is the ray passed as a parameter.
    2. intersect: will call the overloaded method located in bvh, as follows:

    Intersection BVHAccel::Intersect(const Ray& ray) const
    {
        Intersection isect;
        if (!root)
            return isect;
        isect = BVHAccel::getIntersection(root, ray);
        return isect;
    }

    3. p_inter: the intersection information between the ray and the surface. Specifically, p_inter is an object of type Intersection.
    4. x_inter: the intersection information between the ray and the light source. Like p_inter, x_inter is also an Intersection object.
    5. l_dir (direct illumination): the lighting contribution from the light source hitting the surface directly. In path tracing, this is typically obtained by sampling the light source from the surface point and computing the direct illumination.
    6. l_indir (indirect illumination): the lighting contribution reflected onto the surface by the rest of the scene. In path tracing, this is typically computed by sampling the surface's BRDF and recursively tracing the reflected ray.
    7. ws_ray: the ray from point p to point x. When computing direct lighting, a new ray is shot towards the light source to check whether the surface point can see the light source directly (i.e. nothing is in the way). This ray is ws_ray.

    Here we will focus on the direct-lighting part of the BRDF formula. The formula below takes the geometry term and the PDF of the light source sampling into account, so it differs from the formula given in my previous articles.

    The mathematical formula for direct lighting is as follows:

    $L_{dir} = L_i \cdot f_r(w_i, w_o, N) \cdot \frac{\cos\theta \cdot \cos\theta'}{r^2 \cdot p(w_i)}$

    where:

    • $L_{dir}$: radiance under direct lighting
    • $L_i$: radiance emitted by the light source
    • $f_r(w_i, w_o, N)$: the BRDF
    • $\cos\theta$: cosine of the angle between the incident direction $w_i$ and the surface normal $N$
    • $\cos\theta'$: cosine of the angle between the outgoing direction $-ws\_dir$ and the light source normal $NN$
    • $r^2$: distance attenuation
    • $p(w_i)$: probability density function (PDF) for sampling the light source direction

    Converted into code this is:

    l_dir = emit * p_inter.m->eval(ray.direction, ws_ray.direction, N)
            * dotProduct(ws_ray.direction, N)
            * dotProduct(-ws_ray.direction, NN)
            / (ws_distance * ws_distance) / pdf_light;

    The eval function is shown below; it describes how the reflected intensity changes with the incident and outgoing directions. Its inputs are the incident direction ray.direction, the outgoing direction ws_ray.direction, and the surface normal N; its output measures the intensity of the reflected light. In the code, the diffuse BRDF is ρ/π.

    ...
    float cosalpha = dotProduct(N, wo);
    if (cosalpha > 0.0f) {
        Vector3f diffuse = Kd / M_PI;
        return diffuse;
    }
    else
        return Vector3f(0.0f);
    ...

    After calculating direct illumination, we start to calculate indirect illumination using Russian roulette. The process is as follows:

    In Scene.hpp we define the value of RussianRoulette, and we will calculate indirect lighting only when the random number obtained is less than RussianRoulette. In other words, it is possible that the indirect lighting of a pixel is not calculated at all.

    If we do need to calculate indirect lighting, we generate a new ray wi_ray for it. Its direction is sampled based on the surface material at the current intersection and the original ray direction; this is where material-based importance sampling would go. In the code we simply do uniform sampling over the hemisphere, as shown below:

    ...
    // uniform sample on the hemisphere
    float x_1 = get_random_float(), x_2 = get_random_float();
    float z = std::fabs(1.0f - 2.0f * x_1);
    // r - distance from the axis to the point on the hemisphere; phi - angle in polar coordinates
    float r = std::sqrt(1.0f - z * z), phi = 2 * M_PI * x_2;
    Vector3f localRay(r * std::cos(phi), r * std::sin(phi), z);
    return toWorld(localRay, N);
    ...
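    The p_inter.m->pdf used in castRay has to match this sampling strategy: for uniform sampling over the hemisphere the density is a constant $\frac{1}{2\pi}$. A minimal sketch of what the matching pdf looks like under that assumption (the framework's actual DIFFUSE implementation may differ in details):

    // Sketch: pdf that matches uniform hemisphere sampling (assumed)
    float Material::pdf(const Vector3f &wi, const Vector3f &wo, const Vector3f &N) {
        // Only outgoing directions in the upper hemisphere around N have non-zero density
        if (dotProduct(wo, N) > 0.0f)
            return 0.5f / M_PI; // uniform over the hemisphere: 1 / (2*pi)
        else
            return 0.0f;
    }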

    Finally, let's talk about the calculation of indirect lighting l_indir. The formula is:

    $L_{indir} = \text{shade}(q, -w_i) \cdot f_r(w_o, w_i, N) \cdot \cos\theta \cdot \frac{1}{p(w_i)}$

    but here we additionally divide by RussianRoulette (explained below):

    l_indir = castRay(wi_ray, depth + 1)
              * p_inter.m->eval(ray.direction, wi_ray.direction, N)
              * dotProduct(wi_ray.direction, N)
              / p_inter.m->pdf(ray.direction, wi_ray.direction, N)
              / RussianRoulette;

    castRay(wi_ray, depth + 1): This is a recursive call, which represents emitting a new ray from the current intersection point and obtaining the lighting contribution generated by the reflection of the ray on all objects.

    • Why divide by RussianRoulette at the end?

    Because when using Russian Roulette, we randomly generate a number, and if this number is greater than a threshold (here is the RussianRoulette variable), we terminate the ray tracing. When we terminate the tracing of a certain ray path, we are actually abandoning all possible subsequent reflections of this ray path, and these abandoned reflections may contribute to the final ray information.

    Therefore, to compensate for the terminated paths, we amplify the intensity of the remaining rays by a factor of 1 / RussianRoulette. This keeps the expected value of each retained path equal to its true contribution, preserving the unbiasedness of the ray tracing algorithm. This is also mentioned in my previous article.
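    A one-line sanity check of this argument: if a path's true contribution is $X$ and we keep it with probability $P$ (the RussianRoulette value) while scaling it by $1/P$, the expected value is unchanged:

    $E = P \cdot \frac{X}{P} + (1 - P) \cdot 0 = X$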

    At this point, we have basically explained the entire project; the parts not detailed here were covered in previous articles.

    Finally, we achieved the following rendering:

    img

    Next, I will talk about how to use multi-threaded computing to speed up rendering.

    Multithreaded acceleration

    A quick introduction to C++ multithreading

    For readers who are not familiar with C++ multithreading, here is an example to help you get started quickly.

    #include <iostream>
    #include <string>
    #include <thread>

    // This is the function we will run in both threads
    void printMessage(std::string message) {
        std::cout << message << std::endl;
    }

    int main() {
        // Create and run two threads
        std::thread thread1(printMessage, "Hello from thread 1");
        std::thread thread2(printMessage, "Hello from thread 2");
        // Wait for both threads to finish
        thread1.join();
        thread2.join();
        return 0;
    }

    Before main ends, we need to wait for all threads to finish, so we must call join(). If a std::thread object is still joinable when it is destroyed (for example because the main program returns early), std::terminate is called, so skipping join leads to abnormal termination.

    Deploy multithreading

    A relatively simple method is to use OpenMP; I will not describe it in detail here, but a minimal sketch is shown below.
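    A minimal sketch of the OpenMP route, assuming the same loop variables as the manual version below and a compiler flag such as -fopenmp; this is only an illustration, not the framework's code:

    // Sketch only: parallelize the row loop with OpenMP instead of managing threads by hand
    #pragma omp parallel for schedule(dynamic, 1)
    for (int j = 0; j < scene.height; ++j) {
        for (int i = 0; i < scene.width; ++i) {
            float x = (2 * (i + 0.5) / (float)scene.width - 1) * imageAspectRatio * scale;
            float y = (1 - 2 * (j + 0.5) / (float)scene.height) * scale;
            Vector3f dir = normalize(Vector3f(-x, y, 1));
            for (int k = 0; k < spp; k++)
                framebuffer[j * scene.width + i] += scene.castRay(Ray(eye_pos, dir), 0) / spp;
        }
    }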

    The other is the most direct method: split the image into horizontal blocks by hand and protect the shared progress counter with a mutex. The code is as follows:

    // change the spp value to change the sample amount
    int spp = 4;
    int thread_num = 16;
    int thread_height = scene.height / thread_num;
    std::vector<std::thread> threads(thread_num);
    std::cout << "SPP: " << spp << "\n";
    std::mutex mtx;
    float process = 0;
    float Reciprocal_Scene_height = 1.f / (float)scene.height;
    auto castRay = [&](int thread_index) {
        int height = thread_height * (thread_index + 1);
        for (uint32_t j = height - thread_height; j < height; j++) {
            for (uint32_t i = 0; i < scene.width; ++i) {
                // generate primary ray direction
                float x = (2 * (i + 0.5) / (float)scene.width - 1) * imageAspectRatio * scale;
                float y = (1 - 2 * (j + 0.5) / (float)scene.height) * scale;
                Vector3f dir = normalize(Vector3f(-x, y, 1));
                for (int k = 0; k < spp; k++)
                    framebuffer[j * scene.width + i] += scene.castRay(Ray(eye_pos, dir), 0) / spp;
            }
            mtx.lock();
            process = process + Reciprocal_Scene_height;
            UpdateProgress(process);
            mtx.unlock();
        }
    };
    for (int k = 0; k < thread_num; k++) {
        threads[k] = std::thread(castRay, k);
        std::cout << "Thread[" << k << "] Started:" << threads[k].get_id() << "\n";
    }
    for (int k = 0; k < thread_num; k++) {
        threads[k].join();
    }
    UpdateProgress(1.f);

    But surprisingly, on my macOS, the speed of single thread and multi-thread (8t) is the same. My configuration is as follows:

    • Model Name: MacBook Pro – 14-inch, 2021
    • Chip: Apple M1 Pro
    • Total Number of Cores: 8 (6 performance and 2 efficiency)
    • Memory: 16G
    • System Version: macOS 13.2.1 (22D68)
    img

    16 threads vs. 1 thread

    Optimize progress display

    Let each thread track its own progress; the output shows that each thread is indeed working on its own region at the same time.

    auto castRay = [&](int thread_index) {
        // Create a local progress variable for this thread
        float thread_progress = 0.f;
        int height = thread_height * (thread_index + 1);
        for (uint32_t j = height - thread_height; j < height; j++) {
            for (uint32_t i = 0; i < scene.width; ++i) {
                // ... Rest of your code ...
            }
            // Update this thread's progress
            thread_progress += Reciprocal_Scene_height;
            std::cout << "Thread[" << thread_index << "] Progress: " << thread_progress << "\n";
        }
    };
    img

    After testing, 2 threads gives the best speed on my device, roughly 70% faster than a single thread. The task manager confirms this (with the program's thread count set to 2):

    img

    It is speculated that the scheduling strategy of the M1pro chip is that when encountering a large number of parallel calculations, it will prioritize starting only two large cores to collaborate, rather than starting all cores like Windows (generally speaking).

    Microfacet Models

    • The microsurface theory

    As early as 1967, physicists Torrance and Sparrow proposed the theory of microfacet models in their paper "Theory for Off-Specular Reflection From Roughened Surfaces".

    The application in computer graphics is attributed to the work of Robert L. Cook and Kenneth E. Torrance. In their 1982 paper "A Reflectance Model for Computer Graphics", they introduced the microsurface model to the field of computer graphics, providing a physically based method for simulating various materials in the real world.

    • What is a microsurface model?

    The microsurface model is a theoretical model used in computer graphics to simulate the reflection and refraction of light from rough surfaces. The basic assumption of this model is that a complex microsurface can be replaced by a simplified macrosurface, and the scattering function (BSDF) of the macrosurface can match the overall directional scattering behavior of the microsurface. In other words, the microsurface model attempts to simulate how light scatters on a rough surface through statistical methods, rather than trying to accurately simulate the details of each microsurface.

    The microsurface model provides a concrete, physically based form for the BRDF, which is a core component of the lighting model.

    • Why the Microsurface Model?

    To quote the paper "A Reflectance Model for Computer Graphics":

    why images rendered with previous models often look plastic

    In other words, one of the purposes of the proposed model is to get rid of this "plastic" look.

    definition

    In the previous shading model, we had the formula for the total reflectance $r$:

    $r = k_a + \sum_{i=0}^{n} l_c \cdot (r_d + r_s)$

    with

    $r_d = k_d \cdot (\vec{n} \cdot \vec{l}), \qquad r_s = k_s \cdot (\vec{h} \cdot \vec{n})^p$

    where $r_d, r_s$ are the diffuse and specular reflectance respectively.

    Cook-Torrance provides a variation of this equation:

    $r = k_a + \sum_{i=0}^{n} l_c \cdot (\vec{n} \cdot \vec{l}) \cdot (d \cdot r_d + s \cdot r_s)$

    This equation introduces two new ideas:

    1. The diffuse reflection term $r_d$ and the specular reflection term $r_s$ are controlled by the variables d and s respectively.
    2. The dot product of the normal vector and the light vector is separated from the diffuse reflectance, making it part of the sum. This allows the diffuse reflectance to be a constant, and this definition can be modified later if necessary.

    In addition, the specular reflectivity $r_s$ contains a divisor of $(n⋅l)$, which cancels out the $(n⋅l)$ in the sum, which may be very confusing to the reader. Let me explain why.

    • About $s$ and $d$

    Since $s$ and $d$ control the balance between the diffuse and specular terms, they satisfy

    $s + d = 1$

    Therefore, we generally eliminate $d$ and simplify the Cook-Torrance formula to:

    $r = k_a + \sum_{i=0}^{n} l_c \cdot (\vec{n} \cdot \vec{l}) \cdot ((1 - s) \cdot r_d + s \cdot r_s)$

    The relationship between $s$ and $d$ is not hard to understand: by conservation of energy, the total light energy reflected by a material (diffuse plus specular) cannot exceed the light energy it receives. So if a material reflects a lot of light specularly, it must reflect less diffusely, and vice versa. For example, a very smooth surface (such as a mirror or metal) has high specular reflection but almost no diffuse reflection, while a very rough surface (such as a brick wall or cloth) has high diffuse reflection but almost no specular reflection.

    • About $r_s$

    $r_s = \frac{D \cdot G \cdot F}{4 \cdot (\vec{n} \cdot \vec{l}) \cdot (\vec{n} \cdot \vec{v})}$

    Note that the literature and other places sometimes use $π$ in the denominator of the formula instead of $4$. This is an error that was made in the Cook-Torrance paper and has been repeated in many places. The correct derivation of the formula should use $4$. However, this is a small difference in a constant factor so it is not particularly important if not done exactly.

    In the Cook-Torrance specular reflectance (r_s) formula, D, G, and F are three functions that can be selected in different forms. They represent the distribution function (Distribution), the geometric attenuation function (Geometry), and the Fresnel reflection function (Fresnel).

    Microsurface

    The microfacet model considers the surface of an object as composed of countless tiny face elements, each of which can have its own normal. Such a model can better simulate the details of the surface of an object, including roughness and smoothness.

    img

    The surface normal and the normal of each microfacet may not be the same. For very smooth surfaces, such as a perfect mirror, all microfacets face the same direction, which is the surface normal $n$. However, for matte or rough surfaces, such as matte paint or stone, the directions faced by microfacets are random. Or, on rough surfaces, these microfacets may occlude other facets and cast shadows on the facets.

    Cook-Torrance Model

    The Cook-Torrance model attempts to explain the above three phenomena:

    1. The distribution of microfacets (via the Normal Distribution Function D): Different microfacets face different directions, depending on the roughness of the surface. This distribution describes the proportion of microfacets that reflect light for an observer.
    2. Mutual occlusion and shadow effects between microfacets (through geometric attenuation function G): This function describes the mutual occlusion and shadow effects between microfacets. On a rough surface, some microfacets may be occluded by other facets or cast shadows on other facets.
    3. The interaction between light and microfacets (through the Fresnel reflection function F): This determines how much light will be reflected, absorbed, or transmitted after it hits a microfacet.

    Although the contribution of G and F functions to the final rendering result may be small, they are still important components of the model. In the physically based BRDF model, the distribution function D is a very critical part because it determines how light is reflected from various parts of the object's surface, thus affecting the visual effect of the object. Commonly used distribution functions include Beckmann distribution, GGX distribution, etc. Choosing different distribution functions can simulate the visual effects of different types of materials.

    Fresnel term

    This part is the Cook-Torrance model used to explain the Fresnel effect.

    In graphics, the Schlick approximation is generally used:

    $F = F_0 + (1 - F_0) \cdot (1 - (\vec{v} \cdot \vec{h}))^5$

    where

    • $F_0$ is the reflectivity when the light is incident vertically, which is calculated from the refractive index n of the object
    • v and h are the view direction and half-distance vector (the average direction of the incident and reflected rays) respectively.

    This formula assumes that the reflectivity changes linearly with the incident angle, but the reflectivity increases faster in the edge area (where the incident angle is close to 90 degrees), so a fifth power term is added to better simulate this phenomenon.

    Finally, this formula can be implemented in C++ as follows:

    double Fresnel_Schlick(double n, const Vector3d &v, const Vector3d &h)
    {
        double F0 = ((n - 1) * (n - 1)) / ((n + 1) * (n + 1));
        double cosTheta = std::max(dot(v, h), 0.0);
        return F0 + (1.0 - F0) * std::pow(1.0 - cosTheta, 5);
    }

    Geometric attenuation term

    In the Cook-Torrance model, geometric attenuation is usually defined as the minimum of two attenuation factors, corresponding to the masking and shadowing effects. These factors can be computed with simple formulas known as the Torrance-Sparrow geometric shielding model:

    $G = \min\left(1,\ \frac{2(\vec{h} \cdot \vec{n})(\vec{n} \cdot \vec{v})}{\vec{v} \cdot \vec{h}},\ \frac{2(\vec{h} \cdot \vec{n})(\vec{n} \cdot \vec{l})}{\vec{v} \cdot \vec{h}}\right)$

    where,

    • V is the viewing direction
    • H is the half vector (average of the light direction and the view direction)
    • N is the surface normal
    • n_dot_v and n_dot_l are the dot products of the normal with the view direction and the light direction respectively.

    This model is more suitable when the surface roughness is low (that is, the surface is closer to complete mirror reflection), and can be expressed in C++ as follows:

    double v_dot_h = dotProduct(V, H);
    double n_dot_h = dotProduct(N, H);
    double G1 = 2 * n_dot_h * n_dot_v / v_dot_h;
    double G2 = 2 * n_dot_h * n_dot_l / v_dot_h;
    double G = clamp(0, 1, std::min(G1, G2));
    // Note that the clamp function is overridden in the framework, so there is no error here~

    D Function

    Finally, let's talk about the most critical and most contributing item, the microscopic surface item of specular reflection.

    The specular part of the Blinn-Phong model is one possible choice of D function that will make the specular part have Blinn-Phong-like properties, but it is not the only choice. In fact, there are many possible NDFs, each with different surface roughness and reflectivity properties.

    The most commonly used NDF is probably the GGX (Trowbridge-Reitz) model, which works well for a wide range of smooth to rough surfaces. Another common choice is the Beckmann model, which generally performs better for surfaces with medium to high roughness.

    Here we use the isotropic Beckmann model:

    $D = \frac{e^{-\tan^2\alpha / m^2}}{\pi m^2 \cos^4\alpha}$

    where,

    • $m$ represents the parameter of surface roughness
    • $\alpha$ is the angle between the half-vector $H$ and the surface normal $N$, calculated using $acos(\text{n_dot_h})$.
    double m = (type == MaterialType::MICROFACET_DIFFUSE) ? 0.6 : 0.2;
    double alpha = acos(n_dot_h);
    double D = exp(-pow(tan(alpha) / m, 2)) / (M_PI * m * m * pow(cos(alpha), 4));

    As the surface becomes very rough, the orientation of these microfacets becomes more and more random, causing the reflected light to become more scattered.

    Cook and Torrance recommended the Beckmann distribution, but nowadays more people use the GGX distribution; I will cover it in depth later.

    Code implementation

    The Cook-Torrance formula we need to implement is:

    $r = k_a + \sum_{i=0}^{n} l_c \cdot (\vec{n} \cdot \vec{l}) \cdot ((1 - s) \cdot r_d + s \cdot r_s)$

    We have just given the code for the three D, G, F terms, so we can directly write the $r_s$ term of the Cook-Torrance formula:

    Vector3f Material::cookTorrance(const Vector3f &wi, const Vector3f &wo, const Vector3f &N)
    {
        auto V = wo;
        auto L = wi;
        auto H = normalize(V + L);
        auto type = m_type;
        double n_dot_v = dotProduct(N, V);
        double n_dot_l = dotProduct(N, L);
        if (!(n_dot_v > 0 && n_dot_l > 0))
            return 0;
        double n_air = 1, n_diff = 1.2, n_glos = 1.2;
        double n2 = (type == MaterialType::MICROFACET_DIFFUSE) ? n_diff : n_glos;
        double r0 = (n_air - n2) / (n_air + n2);
        r0 *= r0;
        double F = r0 + (1 - r0) * pow(1 - n_dot_v, 5);
        double v_dot_h = dotProduct(V, H);
        double n_dot_h = dotProduct(N, H);
        double G1 = 2 * n_dot_h * n_dot_v / v_dot_h;
        double G2 = 2 * n_dot_h * n_dot_l / v_dot_h;
        double G = clamp(0, 1, std::min(G1, G2));
        double m = (type == MaterialType::MICROFACET_DIFFUSE) ? 0.6 : 0.2;
        double alpha = acos(n_dot_h);
        double D = exp(-pow(tan(alpha) / m, 2)) / (M_PI * m * m * pow(cos(alpha), 4));
        auto ans = F * G * D / (n_dot_l * n_dot_v * 4);
        return ans;
    }

    $r_d$ directly uses $\frac{1}{\pi}$ (the $K_d$ factor is applied where the terms are combined). This assumption is based on Lambertian reflection: an ideal, completely diffuse surface.

    Then $k_a$ represents the reflection coefficient of ambient light. Ambient light is used to simulate the global illumination in the scene. Here, ambient light is directly ignored, that is, $k_a=0$.

    case MICROFACET_DIFFUSE:
    {
        float cosalpha = dotProduct(N, wo);
        if (cosalpha > 0.0f) {
            auto ans = Ks * cookTorrance(wi, wo, N) + Kd * eval_diffuse(wi, wo, N);
            return ans;
        }
        else
            return Vector3f(0.0f);
        break;
    }
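    The eval_diffuse helper used above is not shown in the article. Consistent with the statement that $r_d$ uses $\frac{1}{\pi}$ while $K_d$ is multiplied in by the caller, a minimal sketch of what it presumably looks like:

    // Sketch: Lambertian diffuse term (assumed implementation); Kd is applied by the caller
    Vector3f Material::eval_diffuse(const Vector3f &wi, const Vector3f &wo, const Vector3f &N) {
        float cosalpha = dotProduct(N, wo);
        if (cosalpha > 0.0f)
            return Vector3f(1.0f / M_PI);
        else
            return Vector3f(0.0f);
    }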

    Additionally, the two coefficients for the material are set as follows:

    Material *white_m = new Material(MICROFACET_DIFFUSE, Vector3f(0.0f));
    white_m->Kd = Vector3f(0.725f, 0.71f, 0.68f);
    white_m->Ks = Vector3f(1, 1, 1) - white_m->Kd;

    Here is the final rendered result. The sphere uses the microfacet material MICROFACET_DIFFUSE, with SPP = 64 and a resolution of 784 × 784:

    Reference

    1. Fundamentals of Computer Graphics 4th
    2. GAMES101 Lingqi Yan
    3. "A Reflectance Model for Computer Graphics" - 357290.357293
    4. https://zhuanlan.zhihu.com/p/152226698
    5. https://graphicscompendium.com/gamedev/15-pbr
    6. "Theory for Off-Specualr Reflection From Roughened Surfaces"
    7. https://hanecci.hatenadiary.org/entry/20130511/p
  • 音乐与Python

    Music and Python

    Written in September 2020. I wrote it for fun. I will optimize the code in the next article. It is very unprofessional QAQ.

    This article starts with the study of music theory, starting from the twelve-tone equal temperament, introducing some basic and necessary music theory knowledge, and then writing a python file to output a chord audio file.

    Music theory knowledge part:

    1. Brief Introduction to Temperament

    1. Introduction

    Temperament is the discipline that studies the construction and application of pitch systems. The pitches used in music are mostly fixed. A temperament is a system of musical tones whose pitches are specified by mathematical methods according to certain interval relationships; each unit in this system is called a "law". A scale is a series of tones selected from the law system according to certain interval specifications, and each unit in it is called a "tone". When "tone" and "law" are referred to collectively as "musical temperament", the term covers not only the law system but all musical tones with precise specifications.

    In short, the law is to use mathematical methods to determine the vibration frequency of each pitch (not just one). "Law" is the basic unit of the law system. The concepts of "law" and "sound" are similar but slightly different. Each unit in the law system is called a "law", while each unit in the scale is called a "sound". The relationship between the law system and the scale is very close.

    2. Calculation of Laws

    Temperament calculation method is the interval calculation method, which uses frequency ratio or interval value to represent and calculate the size of the interval.

    From ancient times to the present, temperaments have been constantly changing. Different temperament systems are determined by different methods of generating their pitches.

    There are four types of interval values: logarithmic value, octave value, cent value and average interval value.

    3. Circle-of-fifths system

    Circle-of-fifths temperament stipulates that the frequencies of two notes forming a perfect fifth are in the ratio 2:3. Generating a new pitch every perfect fifth, one after another, is called the "fifth-generation method".

    However, because of the Pythagorean comma (the "maximum tone difference"), this temperament cannot close into a twelve-tone cycle covering every key: starting from the tonic and generating pitches twelve (or more) times, then folding them into the same octave, never returns exactly to the tonic. This creates certain obstacles to its use.

    4. Just intonation

    In just intonation, the frequencies of the notes in the scale are derived from ratios of small integers: 2:1 for an octave, 3:2 for a perfect fifth, 4:3 for a perfect fourth, and so on. These simple ratios create harmonically pure intervals that are said to sound more natural and pleasing to the ear than the equal-temperament system used in modern Western music.

    5. Twelve-tone equal temperament

    The twelve-tone equal temperament is a tuning system that divides an octave into twelve semitones with equal frequency ratios, also known as the "twelve-equally proportional temperament".

    All of our codes below are based on the twelve-tone equal temperament. The reason is that the piano is designed based on the twelve-tone equal temperament.

    Baidu Baike writes that the twelve-tone equal temperament was first worked out by Zhu Zaiyu, a scientist of China's Ming Dynasty, in 1584, and that it later spread to the West via the Silk Road.

    In 1605 the Dutch mathematician Simon Stevin, in an unfinished manuscript "Van de Spiegheling der singconst", proposed using the twelfth root of 2 to calculate the twelve-tone equal temperament, but because of insufficient precision the string lengths he computed deviated from the correct values by as much as one or two units in the last digit.

    The frequency ratio of an octave is 2:1, so the frequency ratio between adjacent tones of the twelve-tone equal temperament should be the twelfth root of 2, i.e. 2^(1/12) ≈ 1.059463.
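
    A one-line sanity check of this ratio (a small sketch):

    semitone = 2 ** (1 / 12)   # ~1.0594631
    print(semitone ** 12)      # 2.0 (up to floating-point error): twelve equal semitones make exactly one octave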

    In musical practice, musicians of the time were well aware of the convenience of the twelve-tone equal temperament; composers and performers everywhere began to use it and to develop it further. For example, J. S. Bach in Germany wrote the two volumes of "The Well-Tempered Clavier". Although these volumes did not use the twelve-tone equal temperament exclusively (some irregular temperaments also appear), they are considered exemplary works that fully exploited its effectiveness and its freedom to transpose.

    6. Comparison of the three systems

    Each of the three tuning systems has its own advantages and disadvantages. The twelve-tone equal temperament resolves some contradictions of the circle-of-fifths temperament and just intonation, such as the fact that stacking pitches never returns exactly to the starting tone, but it sacrifices some of the purity of the intervals. In general, the twelve-tone equal temperament is a reconciliation and compromise between the circle-of-fifths temperament and just intonation, lying between the two and closer to the circle-of-fifths temperament. It is currently the most widely used tuning system.
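
    To make the compromise concrete (a small calculation sketch): the major third above the tonic comes out slightly differently in each of the three systems, with equal temperament sitting between the other two.

    pythagorean_third = (3 / 2) ** 4 / 2 ** 2   # four fifths up, two octaves down: 81/64 ~ 1.2656
    just_third = 5 / 4                          # just intonation: 1.25
    equal_third = 2 ** (4 / 12)                 # four equal semitones: ~1.2599
    print(pythagorean_third, just_third, equal_third)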

    2. Basic Music Theory - P1 - Musical Tone System and Grouping

    1. Music system and musical series

    There are 88 keys on a piano. All of these musical tones taken together are called the musical system.

    Arranged from low to high, they form the tone series.

    2. Note name

    In the musical system, each musical note has a fixed name, i.e., note name, which is usually represented by letters C, D, E, F, G, A, B.

    These seven pitches are also called basic pitches, and their positions on the piano keyboard are fixed.

    3. Sound grouping

    The piano keyboard is divided into several groups (registers).

    The group that begins at middle C is called the one-lined group, also known as the "central group". Going to the right from it are the two-lined, three-lined, four-lined and five-lined groups; going to the left are the small group, the great group, the contra group and the sub-contra group.

    3. Basic Music Theory - P2 - Staff & Notes

    1. Staff

    The staff is composed of five lines and four spaces, which are used to record the pitch of notes. Because the staff has five equidistant parallel horizontal lines, it is called the five-line staff.

    image

    2. Clef

    Here we just need to briefly introduce the treble clef. In addition, there are also bass clef and alto clef.

    image

    3. Musical Notes

    A note consists of three parts: the note head (a hollow or filled oval), the stem (a short vertical line) and the tail or flag (the small curve on the right side of the stem).

    4. Rest

    Rests are symbols used to indicate pauses or interruptions in music.

    Although a rest indicates a brief silence, the music is not really interrupted during it; the silence itself carries meaning, which is why rests are an important component of a musical work.

    image

    5. Time Signature

    The top number gives the number of beats in each measure, and the bottom number indicates which note value counts as one beat (a 4 means the quarter note gets the beat). Be careful not to read the signature as a fraction!

    In addition, the time signature also clarifies the rules for changes in the strength of the melody.

    image

    4. Basic Music Theory - P3 - Intervals

    1. Interval

    The pitch distance between two notes is called the interval.

    There are two types of intervals: melodic intervals and harmonic intervals.

    In short, a melodic interval is when two notes sound one after the other, while a harmonic interval is when they sound at the same time.

    2. Melodic intervals

    The two notes are read in the order in which they sound.

    3. Harmonic Intervals

    The lower note is called the root note, and the upper note is called the crown note.

    4. Consonant and dissonant intervals

    The sound effects of consonant intervals sound pleasant and harmonious, and can be divided into perfect consonant intervals and incomplete consonant intervals.

    Perfect consonant intervals include the perfect unison, perfect fourth, perfect fifth and perfect octave; imperfect consonant intervals include the minor third, major third, minor sixth and major sixth.

    The perfect octave sounds extremely fused, almost like a single note, but precisely because it blends so completely it also sounds hollow.

    The perfect unison, perfect fourth and perfect fifth sound slightly fuller than the perfect octave, but they still sound somewhat hollow.

    Therefore, all the perfect intervals (perfect unison, perfect fourth, perfect fifth, perfect octave) are perfect consonant intervals.

    The sound of dissonant intervals is harsh, tense and unstable, such as major and minor seconds, major and minor sevenths, augmented fourths and diminished fifths.

    Although dissonant intervals sound sharp, they are also important elements in musical works.

    5. Basic Music Theory - P4 - Mode

    1. Tuning

    A system of musical notes of different pitches, organized around a stable central tone (tonic) in a certain relationship, is called a mode. The most widely used mode in the world today is the major and minor mode.

    Among them, Bach's "Well-Tempered Clavier" is a classic among classics in major and minor keys.

    2. Tonic

    In the mode, the tonic is the central tone in the core position, which has the strongest sense of stability and other tones tend to it. In songs (music), the tonic often appears on the strong beat, the longer tone or the ending.

    3. Major Mode

    The major mode, "major" for short, consists of seven scale degrees. The interval between the tonic and the Ⅲ degree is a major third, which is the defining characteristic of the major mode. The Ⅰ, Ⅲ and Ⅴ degrees form a major triad, so the color of the major mode is bright and brilliant.

    There are three types of major keys: natural major, harmonic major, and melodic major.

    1. Natural major

    The natural major is the most commonly used major key, consisting entirely of natural notes.

    The natural major scale consists of five whole tones and two half tones.

    Its scale structure is: whole tone-whole tone-semitone-whole tone-whole tone-whole tone-semitone, that is, it is composed of major second, major second, minor second, major second, major second, major second, and minor second intervals.

    There is a key (a black key) between C and D, so the step from C to D is a whole tone.

    There is no key between E and F, so the step from E to F is a semitone.

    To make it easier to memorize, the natural major pattern can be recited as: whole, whole, half, whole, whole, whole, half.

    That is what we usually sing: do re mi fa sol la si.

    The key of C natural major is shown below:

    image
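
    This whole/half pattern is exactly what the Python part below encodes as major_notes = [0, 2, 2, 1, 2, 2, 2, 1]. A small sketch in MIDI numbers (middle C is MIDI number 60, see the MIDI table later):

    # Build the C natural major scale (as MIDI numbers) from the whole/half step pattern.
    steps = [2, 2, 1, 2, 2, 2, 1]   # whole, whole, half, whole, whole, whole, half
    scale = [60]                    # start from middle C (MIDI number 60)
    for step in steps:
        scale.append(scale[-1] + step)
    print(scale)                    # [60, 62, 64, 65, 67, 69, 71, 72] -> C D E F G A B C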

    2. Harmonic major

    On the basis of the natural major scale, lowering the VI note by a semitone results in the harmonic major scale.

    Its characteristic is the augmented second interval formed between the flat VI and VII notes.

    This augmented second is the mark of the harmonic major key and also the characteristic of the harmonic major key.

    The C harmonic major scale is shown below:

    image

    3. Melodic major

    On the basis of the natural major scale, lowering both the VI and VII notes by a semitone in the descending scale gives the melodic major scale; the ascending form is the same as the natural major.

    Because the VII is lowered as well, the augmented second that marks the harmonic major does not occur here; the lowered VI and VII in the descending scale are what give the melodic major its character.

    The C melodic major scale is shown below:

    image

    4. Minor mode

    The minor mode is simply called "minor" and is also composed of seven pitches.

    The relationship between its tonic and the Ⅲ degree is a minor third, which is also the characteristic of the minor mode. Among them, the Ⅰ, Ⅲ, and Ⅴ degrees form a minor triad, so the color of the minor mode is soft and dim.

    There are also three types of minor keys: natural minor, harmonic minor, and melodic minor.

    1. Natural minor

    The natural minor scale also consists of five whole tones and two semitones, and its scale structure is whole tone-semitone-whole tone-whole tone-semitone-whole tone-whole tone, that is, it is composed of major second, minor second, major second, major second, minor second, major second, and major second intervals.

    The a natural minor scale is shown below:

    To make it easier to memorize, the mnemonic is: whole, half, whole, whole, half, whole, whole.

    2. Harmonic minor

    On the basis of the natural minor scale, raising the seventh degree by a semitone results in the harmonic minor scale.

    Its characteristic is the augmented second interval formed between the VI and the raised VII, and the VII has the function of a leading tone tending towards the tonic after being raised by a semitone, and it is more tense than the natural minor.

    The a harmonic minor scale is shown below:

    3. Melodic minor

    On the basis of the natural minor scale, the VI and VII degrees in the ascending scale of the natural minor are raised by a semitone, and then these two notes are restored in the descending scale, which is the melodic minor.

    The a-melodic minor scale is shown below:

    6. Basic Music Theory - P5 - Key Signature

    1. Key Signature

    The key signature is a symbol that indicates the key of a piece (i.e. the pitch of the tonic). It is placed at the beginning of each line of the staff, after the clef, or wherever a new key appears during the piece.

    Key signatures are written using sharp and flat signs.

    Python arrangement part:

    1. Understanding the MIDO library

    1. Import mido

    from mido import Message, MidiFile, MidiTrack

    2. Mido basic framework

    Create two mido library objects: MidiFile(), MidiTrack()

    The former is used to edit, generate and output Midi files, and the latter is used for midi file track editing.

    mid = MidiFile()
    track = MidiTrack()
    mid.tracks.append(track)

    Add messages to the track object; 'program_change' tells the synthesizer which instrument (program) to use.

    The intuitive understanding is: before editing a track, you need to select the track first. (This is the official fixed routine, it doesn’t matter if you don’t understand it, just follow it)

    track.append(Message('program_change', program=0, time=0))

    3. Write notes

    After selecting the track, you can add notes directly to it. Note that time here is a delta time measured in ticks (not milliseconds); how long a tick lasts depends on the file's ticks_per_beat and the tempo.

    track.append(Message('note_on', note=60, velocity=64, time=0))

    After the note_on, append an end mark (a note_off message); only then is the note fully written.

    track.append(Message('note_off', note=60, velocity=64, time=2000))

    After that, you can generate the MIDI file directly:

    mid.save('MyFirstDamnSong.mid')

    4. Code

    The code, which simply strings the snippets above together, is as follows:
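
    from mido import Message, MidiFile, MidiTrack

    mid = MidiFile()
    track = MidiTrack()
    mid.tracks.append(track)

    track.append(Message('program_change', program=0, time=0))          # select the instrument
    track.append(Message('note_on', note=60, velocity=64, time=0))      # start middle C
    track.append(Message('note_off', note=60, velocity=64, time=2000))  # release it 2000 ticks later

    mid.save('MyFirstDamnSong.mid')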

    2. Combined with Music Theory

    1. Midi file frequency number table:

    From the table above, we can see that the midi number of middle C is 60.
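
    The frequencies behind those numbers are easy to recompute: under twelve-tone equal temperament with A4 (MIDI number 69) tuned to 440 Hz, the frequency of MIDI note n is 440 * 2^((n - 69) / 12). A small sketch:

    def midi_to_freq(n):
        """Frequency of MIDI note n under 12-tone equal temperament, with A4 (69) = 440 Hz."""
        return 440.0 * 2 ** ((n - 69) / 12)

    print(midi_to_freq(60))   # ~261.63 Hz, middle C
    print(midi_to_freq(69))   # 440.0 Hz, A4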

    2. Writing tool library

    1. Task Analysis

    We first write a tool library Notes_Toolbox.py to define some commonly used and basic things.

    Define the basics: intervals, note names, modes, chords, beats, fundamental notes, etc.

    Then write some simple methods to call, such as returning the notes of a chord, automatically changing key, and getting the MIDI note number from a note on the staff.

    2. Define variables

    | # | Variable name | Static? | Default value | Purpose |
    |---|---------------|---------|---------------|---------|
    | 1 | bpm | No | 125 | Tempo (beats per minute) |
    | 2 | timePerBeat | No | 60 / bpm * 1000 | Duration of each beat (milliseconds) |
    | 3 | base_note | Static | 60 | MIDI number of middle C |
    | 4 | note_name[] | Static | ['C','D','E','F','G','A','B'] | Note names |
    | 5 | major_notes[] | Static | [0, 2, 2, 1, 2, 2, 2, 1] | Step pattern of the major (diatonic) scale |
    | 6 | Cmajor_notes[] | Static | computed below | MIDI numbers of the C major scale |
    | 7 | Eflatmajor_notes[] | Static | computed below | MIDI numbers of the E-flat major scale |
    | 8 | Cmajor{} | Static | {'C': 60, 'D': 62, 'E': 64, 'F': 65, 'G': 67, 'A': 69, 'B': 71} | C major dictionary |
    | 9 | Eflatmajor{} | Static | {'C': 63, 'D': 65, 'E': 67, 'F': 68, 'G': 70, 'A': 72, 'B': 74} | E-flat major dictionary |

    3. Define functions

    | # | Method name | Returns | Parameters | Purpose |
    |---|-------------|---------|------------|---------|
    | 1 | get_note | MIDI number | note, group=0, **kw | Given a note and its group, return the corresponding MIDI number (needs improvement: no black keys) |
    | 2 | get_chord | Chord array | name, **kw | Given a chord name, return the chord's interval array |
    | 3 | originToEflatMajor | New E-flat major MIDI number array | list, **kw | Given C major notes, return them transposed to E-flat major (needs improvement: allow specifying the target key) |
    bpm = 125  # why 125: bpm = 1 * 1000 / 8
    timePerBeat = 60 / bpm * 1000  # duration of one beat in milliseconds
    base_note = 60  # C4, middle C
    note_name = ['C', 'D', 'E', 'F', 'G', 'A', 'B']
    major_notes = [0, 2, 2, 1, 2, 2, 2, 1]  # step pattern of the major scale

    Cmajor_notes = []
    Eflatmajor_notes = []
    for num in range(12):
        Cmajor_notes.append(base_note + sum(major_notes[0:num + 1]))
        Eflatmajor_notes.append(base_note + 3 + sum(major_notes[0:num + 1]))

    # There is only one group (octave) for now; the dicts below only use the first seven entries
    Cmajor = dict(zip(note_name, Cmajor_notes))
    # Cmajor = {'C': 60, 'D': 62, 'E': 64, 'F': 65, 'G': 67, 'A': 69, 'B': 71}
    Eflatmajor = dict(zip(note_name, Eflatmajor_notes))
    # Eflatmajor = {'C': 63, 'D': 65, 'E': 67, 'F': 68, 'G': 70, 'A': 72, 'B': 74}

    def get_note(note, group=0, **kw):  # group = 0 means group 4 (the one-lined octave)
        global base_note, major_notes
        return base_note + group * 12 + sum(major_notes[0:note])

    # get_note(1, group=0) returns 60
    # get_note(2, group=0) returns 62

    def originToEflatMajor(list, **kw):
        Ef = []
        for x in list:
            Ef.append(x + 3)
        return Ef

    def get_chord(name):
        chord = {
            "Major3": [0, 4, 7, 12],       # major triad
            "Minor3": [0, 3, 7, 12],       # minor triad
            "Augmented3": [0, 4, 8, 12],   # augmented triad
            "Diminished3": [0, 3, 6, 12],  # diminished triad
            "M7": [0, 4, 7, 11],           # major seventh chord
            "Mm7": [0, 4, 7, 10],          # dominant seventh chord
            "m7": [0, 3, 7, 10],           # minor seventh chord
            "mM7": [],                     # ...
        }
        return chord[name]

    # get_chord("Major3") returns [0, 4, 7, 12]

    3. Write a chord breakdown

    Now that we have written the tool module, we can focus on the chord part.

    1. Write an output chord function

    Output broken chords to midi files

    When using this method, you need to pass in:

    | # | Parameter | Meaning | Example |
    |---|-----------|---------|---------|
    | 1 | track | The mido track to write to | MidiTrack() |
    | 2 | root | Name of the root note | 'C','D','E','F','G','A','B' |
    | 3 | name | Name of the chord, as defined in Notes_Toolbox | 'Major3'… |
    | 4 | format | Order in which the broken chord's notes are played | [0,1,2], [1,3,2,3]… |
    | 5 | length | How long the chord lasts | 4 |

    Here is the source code:

    from mido import Message
    import Notes_Toolbox

    def add_broken_chord(root, name, format, length, track,
                         tone_name='Cmajor', root_base=0, channel=0):
        # The default key is C major
        root_num = Notes_Toolbox.Cmajor
        if tone_name == 'Eflat':
            root_num = {'C': 63, 'D': 65, 'E': 67, 'F': 68, 'G': 70, 'A': 72, 'B': 74}
        root_note = root_num[root] + root_base * 12  # work out the root note
        # 480 is mido's default ticks_per_beat; spread the chord's total duration evenly over its notes
        time = (length * 480) / len(format)
        for broken_chord in format:  # output the notes of the chord one by one
            note = root_note + Notes_Toolbox.get_chord(name)[broken_chord]
            track.append(Message('note_on', note=note, velocity=60, time=0, channel=channel))
            track.append(Message('note_off', note=note, velocity=60, time=round(time), channel=channel))

    2. Call reference:

    format = [0, 1, 2, 3]
    add_broken_chord('C', 'Major3', format, 4, track)
    add_broken_chord('C', 'Minor3', format, 4, track)
    add_broken_chord('C', 'Augmented3', format, 4, track)
    add_broken_chord('C', 'Diminished3', format, 4, track)
    add_broken_chord('C', 'Diminished3', format, 4, track)

    Finally, call mid.save() to write the MIDI file, just as in the earlier example.
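
    As a small usage sketch (it assumes the mid and track objects from the mido section above and a made-up output file name), a common C - Am - F - G loop can be written with the same function:

    progression = [('C', 'Major3'), ('A', 'Minor3'), ('F', 'Major3'), ('G', 'Major3')]
    pattern = [0, 1, 2, 3]   # play the chord tones from bottom to top

    for root, chord_name in progression:
        add_broken_chord(root, chord_name, pattern, 4, track)

    mid.save('progression.mid')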



    3. Standardized Code

    In order to facilitate calling, we adjust the order of function parameters.

    In addition, overload the play_note method so that it can receive the int type note (that is, directly input the MIDI number).

    Similarly, all note-related inputs can be overloaded.

    def play_note(note, track, length=1, tone_name='Cmajor',
                  root_base=0, delay=0, velocity=1.0, channel=0):
        ...

    def play_note(note: int, track, length=1, tone_name='Cmajor',
                  root_base=0, delay=0, velocity=1.0, channel=0):
        ...

    def play_broken_chord(root, name, format, track, length=1, tone_name='Cmajor',
                          delay=0, velocity=1.0, root_base=0, channel=0):
        ...
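
    One detail: Python does not overload functions by signature, so the second def above simply replaces the first. A small sketch of one way to get the intended behaviour (an assumption about how the final version might look, not the article's original code; it reuses the Cmajor/Eflatmajor dictionaries from Notes_Toolbox):

    from mido import Message
    import Notes_Toolbox

    def play_note(note, track, length=1, tone_name='Cmajor',
                  root_base=0, delay=0, velocity=1.0, channel=0):
        """Accept either a note name such as 'C' or a raw MIDI number."""
        scale = Notes_Toolbox.Eflatmajor if tone_name == 'Eflat' else Notes_Toolbox.Cmajor
        midi_num = note if isinstance(note, int) else scale[note]
        midi_num += root_base * 12
        track.append(Message('note_on', note=midi_num, velocity=round(64 * velocity),
                             time=round(delay * 480), channel=channel))
        track.append(Message('note_off', note=midi_num, velocity=round(64 * velocity),
                             time=round(length * 480), channel=channel))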

    4. Conclusion: Mido's journey is yet to be continued...

    After the study and practice above, I have gained a basic understanding of the connection between music and mathematics, as well as some basic music theory, and the code can now output various chords.

    Next, combining this music theory knowledge, we will write looping chord progressions.


    References

    "Basics of Music Theory", edited by Li Chongguang, People's Music Publishing House.