{"id":1357,"date":"2023-11-22T22:57:28","date_gmt":"2023-11-22T13:57:28","guid":{"rendered":"https:\/\/xn--k10aa.com\/?p=1357"},"modified":"2024-10-20T21:12:16","modified_gmt":"2024-10-20T12:12:16","slug":"202hw3","status":"publish","type":"post","link":"https:\/\/remoooo.com\/en\/202hw3\/","title":{"rendered":"Games202 Assignment 3 SSR Implementation"},"content":{"rendered":"<p>Assignment source code:<\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/Remyuu\/GAMES202-Homework\">https:\/\/github.com\/Remyuu\/GAMES202-Homework<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">TODO List<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Implement shading of the scene&#039;s direct lighting (taking shadows into account).<\/li>\n\n\n\n<li>Implement screen space ray intersection (SSR).<\/li>\n\n\n\n<li>Implement shading of the scene&#039;s indirect lighting.<\/li>\n\n\n\n<li>Implement RayMarch with dynamic step size.<\/li>\n\n\n\n<li><strong>(Not written yet)<\/strong> Bonus 1: Screen Space Ray Tracing with Mipmap Optimization.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-232.png\" alt=\"img\" class=\"wp-image-1362 lazyload\"\/><noscript><img decoding=\"async\" width=\"1200\" height=\"612\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-232.png\" alt=\"img\" class=\"wp-image-1362 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-232.png 1200w, https:\/\/remoooo.com\/wp-content\/uploads\/image-232-300x153.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-232-1024x522.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-232-768x392.png 768w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><\/noscript><\/figure>\n\n\n\n<p>Number of samples: 32<\/p>\n\n\n\n<h3 
class=\"wp-block-heading\">Preface<\/h3>\n\n\n\n<p>The basic part of this assignment is the easiest of all the GAMES202 assignments; nothing is particularly complicated. But I don&#039;t know how to start the bonus part. Can someone please help me?<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The framework&#039;s depth buffer problem<\/h3>\n\n\n\n<p>This time I ran into a fairly serious problem on macOS: as the camera distance changed, the part of the cube close to the ground showed abnormal jagged clipping. This did not happen on Windows, which was quite strange.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-230.png\" alt=\"img\" class=\"wp-image-1360 lazyload\"\/><noscript><img decoding=\"async\" width=\"910\" height=\"708\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-230.png\" alt=\"img\" class=\"wp-image-1360 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-230.png 910w, https:\/\/remoooo.com\/wp-content\/uploads\/image-230-300x233.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-230-768x598.png 768w\" sizes=\"(max-width: 910px) 100vw, 910px\" \/><\/noscript><\/figure>\n\n\n\n<p>I suspect this is related to the precision of the depth buffer and may be caused by z-fighting, in which two or more overlapping surfaces compete for the same pixel. 
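<\/p>\n\n\n\n<p>To see why the near plane matters so much here, a quick back-of-the-envelope sketch in plain JavaScript (my own illustration, not part of the framework): for a standard perspective projection, depth precision is concentrated near the near plane, so a near plane of 0.0001 (the framework&#039;s default) wastes almost all of the depth buffer&#039;s resolution compared to 0.05.<\/p>\n\n\n\n

```javascript
// Window-space depth in [0, 1] for an eye-space distance d,
// using the standard OpenGL perspective depth mapping.
function windowDepth(d, near, far) {
  return far / (far - near) - (far * near) / ((far - near) * d);
}

// Approximate world-space size of one depth-buffer step at distance d,
// assuming a 24-bit depth buffer (one LSB = 2^-24).
function depthStepAt(d, near, far) {
  const eps = Math.pow(2, -24);
  return (d * d * (far - near)) / (far * near) * eps;
}

// Depth resolution 10 units from the camera, for both frustums:
const loose = depthStepAt(10, 1e-4, 1e5); // near = 0.0001, far = 1e5
const tight = depthStepAt(10, 5e-2, 1e2); // near = 0.05,  far = 100
console.log(loose / tight); // ~500x coarser with the loose frustum
```

\n\n\n\n<p>So at the same distance, the tightened frustum gives roughly 500 times finer depth resolution, which is why the jagged clipping disappears.<\/p>\n\n\n\n<p>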
There are generally several solutions to this problem:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Adjust the near and far planes: don&#039;t make the near plane too close to the camera, and don&#039;t make the far plane too far away.<\/li>\n\n\n\n<li>Improve the precision of the depth buffer: use 32-bit or higher precision.<\/li>\n\n\n\n<li>Multi-Pass Rendering: Use different rendering schemes for objects in different distance ranges.<\/li>\n<\/ul>\n\n\n\n<p>The simplest solution is to modify the size of the near plane, located in line 25 of the framework&#039;s engine.js.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ engine.js\n\/\/ const camera = new THREE.PerspectiveCamera(75, gl.canvas.clientWidth \/ gl.canvas.clientHeight, 0.0001, 1e5);\nconst camera = new THREE.PerspectiveCamera(75, gl.canvas.clientWidth \/ gl.canvas.clientHeight, 5e-2, 1e2);<\/code><\/pre>\n\n\n\n<p>This will give you a pretty sharp border.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-231.png\" alt=\"img\" class=\"wp-image-1361 lazyload\"\/><noscript><img decoding=\"async\" width=\"1062\" height=\"708\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-231.png\" alt=\"img\" class=\"wp-image-1361 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-231.png 1062w, https:\/\/remoooo.com\/wp-content\/uploads\/image-231-300x200.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-231-1024x683.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-231-768x512.png 768w\" sizes=\"(max-width: 1062px) 100vw, 1062px\" \/><\/noscript><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Added &quot;Pause Rendering&quot; function<\/h3>\n\n\n\n<p>This section is optional. 
To reduce the strain on your computer, simply write a button to pause the rendering.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ engine.js\nlet settings = {\n    'Render Switch': true\n};\n\nfunction createGUI() {\n    ...\n    \/\/ Add the boolean switch here\n    gui.add(settings, 'Render Switch');\n    ...\n}\n\nfunction mainLoop(now) {\n    if(settings&#91;'Render Switch']){\n        cameraControls.update();\n        renderer.render();\n    }\n    requestAnimationFrame(mainLoop);\n}\nrequestAnimationFrame(mainLoop);<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-228.png\" alt=\"img\" class=\"wp-image-1358 lazyload\"\/><noscript><img decoding=\"async\" width=\"512\" height=\"322\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-228.png\" alt=\"img\" class=\"wp-image-1358 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-228.png 512w, https:\/\/remoooo.com\/wp-content\/uploads\/image-228-300x189.png 300w\" sizes=\"(max-width: 512px) 100vw, 512px\" \/><\/noscript><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">1. 
Implementing direct lighting<\/h2>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Implement EvalDiffuse(vec3 wi, vec3 wo, vec2 uv) and EvalDirectionalLight(vec2 uv) in shaders\/ssrShader\/ssrFragment.glsl.<\/p>\n<\/blockquote>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ ssrFragment.glsl\nvec3 EvalDiffuse(vec3 wi, vec3 wo, vec2 screenUV) {\n  vec3 reflectivity = GetGBufferDiffuse(screenUV);\n  vec3 normal = GetGBufferNormalWorld(screenUV);\n  float cosi = max(0., dot(normal, wi));\n  vec3 f_r = reflectivity * cosi;\n  return f_r;\n}\n\nvec3 EvalDirectionalLight(vec2 screenUV) {\n  vec3 Li = uLightRadiance * GetGBufferuShadow(screenUV);\n  return Li;\n}<\/code><\/pre>\n\n\n\n<p>The first code snippet actually implements the Lambertian reflection model, which corresponds to $f_r \\cdot \\text{cos}(\\theta_i)$ in the rendering equation.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Strictly speaking, the Lambertian BRDF should be divided by $\\pi$, but according to the reference results given by the assignment framework there is no division, so we simply leave it out here.<\/p>\n<\/blockquote>\n\n\n\n<p>The second part is responsible for direct lighting (including shadow occlusion), corresponding to the $L_i \\cdot V$ terms of the rendering equation.<\/p>\n\n\n\n<p>$L_o(p,\\omega_o) = L_e(p,\\omega_o) + \\int_{\\Omega} L_i(p,\\omega_i) \\cdot f_r(p,\\omega_i,\\omega_o) \\cdot V(p,\\omega_i) \\cdot \\cos(\\theta_i) \\, d\\omega_i$<\/p>\n\n\n\n<p>Let&#039;s review the Lambertian reflection model here. Notice that EvalDiffuse takes two directions, wi and wo, but we only use the incident light direction wi. 
This is because the Lambertian model does not depend on the viewing direction, only on the surface normal and the cosine of the incident light direction.<\/p>\n\n\n\n<p>Finally, set the result in main().<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ ssrFragment.glsl\nvoid main() {\n  float s = InitRand(gl_FragCoord.xy);\n  vec3 L = vec3(0.0);\n  vec3 wi = normalize(uLightDir);\n  vec3 wo = normalize(uCameraPos - vPosWorld.xyz);\n  vec2 screenUV = GetScreenCoordinate(vPosWorld.xyz);\n\n  L = EvalDiffuse(wi, wo, screenUV) * \n      EvalDirectionalLight(screenUV);\n\n  vec3 color = pow(clamp(L, vec3(0.0), vec3(1.0)), vec3(1.0 \/ 2.2));\n  gl_FragColor = vec4(vec3(color.rgb), 1.0);\n}<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-229.png\" alt=\"img\" class=\"wp-image-1359 lazyload\"\/><noscript><img decoding=\"async\" width=\"1228\" height=\"774\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-229.png\" alt=\"img\" class=\"wp-image-1359 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-229.png 1228w, https:\/\/remoooo.com\/wp-content\/uploads\/image-229-300x189.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-229-1024x645.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-229-768x484.png 768w\" sizes=\"(max-width: 1228px) 100vw, 1228px\" \/><\/noscript><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">2. Specular SSR \u2013 Implementing RayMarch<\/h2>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Implement the RayMarch(ori, dir, out hitPos) function to find the intersection point between the ray and the object and return whether the ray intersects the object. 
The parameters ori and dir are values in the world coordinate system, representing the starting point and direction of the ray respectively, where the direction vector is a unit vector. For more information, please refer to EA&#039;s SIGGRAPH 2015 <a href=\"https:\/\/www.slideshare.net\/DICEStudio\/stochastic-screenspace-reflections\">course talk<\/a>.<\/p>\n<\/blockquote>\n\n\n\n<p>The framework&#039;s &quot;cube1&quot; scene itself includes the ground, so its final SSR result is not particularly beautiful. By &quot;beautiful&quot; I mean the clarity of the result images in the paper, or the exquisite water reflections seen in games.<\/p>\n\n\n\n<p>To be precise, what we implement in this article is the most basic &quot;mirror SSR&quot;, namely basic mirror-only SSR.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-233.png\" alt=\"img\" class=\"wp-image-1363 lazyload\"\/><noscript><img decoding=\"async\" width=\"1036\" height=\"604\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-233.png\" alt=\"img\" class=\"wp-image-1363 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-233.png 1036w, https:\/\/remoooo.com\/wp-content\/uploads\/image-233-300x175.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-233-1024x597.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-233-768x448.png 768w\" sizes=\"(max-width: 1036px) 100vw, 1036px\" \/><\/noscript><\/figure>\n\n\n\n<p>The easiest way to implement &quot;mirror SSR&quot; is Linear Raymarch: step along the reflected ray in small increments, and at each step check the occlusion relationship between the current position and the depth stored in the gBuffer.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" 
src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-238.png\" alt=\"img\" class=\"wp-image-1368 lazyload\"\/><noscript><img decoding=\"async\" width=\"636\" height=\"434\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-238.png\" alt=\"img\" class=\"wp-image-1368 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-238.png 636w, https:\/\/remoooo.com\/wp-content\/uploads\/image-238-300x205.png 300w\" sizes=\"(max-width: 636px) 100vw, 636px\" \/><\/noscript><\/figure>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ ssrFragment.glsl\nbool RayMarch(vec3 ori, vec3 dir, out vec3 hitPos) {\n  const int totalStepTimes = 60;\n  const float threshold = 0.0001;\n  float step = 0.05;\n  vec3 stepDir = normalize(dir) * step;\n  vec3 curPos = ori;\n\n  for(int i = 0; i &lt; totalStepTimes; i++) {\n    vec2 screenUV = GetScreenCoordinate(curPos);\n    float rayDepth = GetDepth(curPos);\n    float gBufferDepth = GetGBufferDepth(screenUV);\n\n    \/\/ Check if the ray has hit an object\n    if(rayDepth &gt; gBufferDepth + threshold){\n      hitPos = curPos;\n      return true;\n    }\n    curPos += stepDir;\n  }\n  return false;\n}<\/code><\/pre>\n\n\n\n<p>Finally, fine-tune the step size. I ended up with 0.05. If the step size is too large, the reflection will be &quot;broken&quot;. If the step size is too small and the number of steps is not enough, the calculation may be terminated because the step distance is not enough where the reflection should be. 
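<\/p>\n\n\n\n<p>To make the tradeoff concrete, here is a tiny 1D analogue of the fixed-step march in plain JavaScript (the scene and numbers are invented for illustration): the &quot;gBuffer&quot; is just a wall at depth 1.0, and the reported hit can overshoot by up to one step.<\/p>\n\n\n\n

```javascript
// 1D analogue of the fixed-step RayMarch: the "gBuffer" records a wall at depth 1.0.
function sceneDepth() { return 1.0; }

// March from ori in fixed steps; report the first sample deeper than the buffer.
function rayMarch1D(ori, step, maxSteps) {
  let cur = ori;
  for (let i = 0; i < maxSteps; i++) {
    if (cur > sceneDepth()) return cur; // ray went behind the recorded surface
    cur += step;
  }
  return null; // ran out of steps before reaching anything
}

const coarse = rayMarch1D(0.0, 0.3, 60);  // big step: hit reported at ~1.2, 0.2 past the wall
const fine   = rayMarch1D(0.0, 0.07, 60); // small step: hit at ~1.05, much closer
const tooFew = rayMarch1D(0.0, 0.01, 60); // small step, few steps: never reaches the wall
```

\n\n\n\n<p>A large step lands well past the true surface (the &quot;broken&quot; reflections), while a small step with too few iterations never reaches the surface at all, which is exactly the tuning problem described above.<\/p>\n\n\n\n<p>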
The maximum number of steps in the figure below is 150.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-235.png\" alt=\"img\" class=\"wp-image-1365 lazyload\"\/><noscript><img decoding=\"async\" width=\"804\" height=\"382\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-235.png\" alt=\"img\" class=\"wp-image-1365 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-235.png 804w, https:\/\/remoooo.com\/wp-content\/uploads\/image-235-300x143.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-235-768x365.png 768w\" sizes=\"(max-width: 804px) 100vw, 804px\" \/><\/noscript><\/figure>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ ssrFragment.glsl\nvec3 EvalSSR(vec3 wi, vec3 wo, vec2 screenUV) {\n  vec3 worldNormal = GetGBufferNormalWorld(screenUV);\n  vec3 reflectDir = normalize(reflect(-wo, worldNormal));\n  vec3 hitPos;\n  if(RayMarch(vPosWorld.xyz, reflectDir, hitPos)){\n    vec2 INV_screenUV = GetScreenCoordinate(hitPos);\n    return GetGBufferDiffuse(INV_screenUV);\n  }\n  else{\n    return vec3(0.);\n  }\n}<\/code><\/pre>\n\n\n\n<p>This function wraps the RayMarch call so it can be used in main().<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ ssrFragment.glsl\nvoid main() {\n  float s = InitRand(gl_FragCoord.xy);\n  vec3 L = vec3(0.0);\n  vec3 wi = normalize(uLightDir);\n  vec3 wo = normalize(uCameraPos - vPosWorld.xyz);\n  vec2 screenUV = GetScreenCoordinate(vPosWorld.xyz);\n\n  \/\/ Basic mirror-only SSR\n  float reflectivity = 0.2;\n\n  L = EvalDiffuse(wi, wo, screenUV) * EvalDirectionalLight(screenUV);\n  L += EvalSSR(wi, wo, screenUV) * reflectivity;\n\n  vec3 color = pow(clamp(L, vec3(0.0), vec3(1.0)), vec3(1.0 \/ 2.2));\n  gl_FragColor = vec4(vec3(color.rgb), 1.0);\n}<\/code><\/pre>\n\n\n\n<p>If you just want to see the SSR term on its own, adjust the coefficients yourself 
in main().<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-234.png\" alt=\"img\" class=\"wp-image-1364 lazyload\"\/><noscript><img decoding=\"async\" width=\"962\" height=\"592\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-234.png\" alt=\"img\" class=\"wp-image-1364 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-234.png 962w, https:\/\/remoooo.com\/wp-content\/uploads\/image-234-300x185.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-234-768x473.png 768w\" sizes=\"(max-width: 962px) 100vw, 962px\" \/><\/noscript><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-236.png\" alt=\"img\" class=\"wp-image-1366 lazyload\"\/><noscript><img decoding=\"async\" width=\"1086\" height=\"580\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-236.png\" alt=\"img\" class=\"wp-image-1366 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-236.png 1086w, https:\/\/remoooo.com\/wp-content\/uploads\/image-236-300x160.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-236-1024x547.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-236-768x410.png 768w\" sizes=\"(max-width: 1086px) 100vw, 1086px\" \/><\/noscript><\/figure>\n\n\n\n<p>Before &quot;Killzone Shadow Fall&quot; shipped in 2013, SSR was still subject to severe restrictions: in real development we usually need to simulate glossy objects, and under the performance limits of the time SSR was not widely adopted. The release of &quot;Killzone Shadow Fall&quot; marked significant progress in real-time reflection technology. 
Thanks to the PS4&#039;s specialized hardware, it became possible to render high-quality glossy and semi-reflective objects in real time.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-237.png\" alt=\"img\" class=\"wp-image-1367 lazyload\"\/><noscript><img decoding=\"async\" width=\"322\" height=\"326\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-237.png\" alt=\"img\" class=\"wp-image-1367 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-237.png 322w, https:\/\/remoooo.com\/wp-content\/uploads\/image-237-296x300.png 296w\" sizes=\"(max-width: 322px) 100vw, 322px\" \/><\/noscript><\/figure>\n\n\n\n<p>In the following years, SSR technology developed rapidly, especially in combination with technologies such as PBR.<\/p>\n\n\n\n<p>Starting with Nvidia&#039;s RTX graphics cards, the rise of real-time ray tracing has gradually replaced SSR in some scenarios. However, in most development scenarios, traditional SSR still plays a considerable role.<\/p>\n\n\n\n<p>The trend going forward is still a mixture of traditional SSR and ray tracing.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">3. Indirect lighting<\/h2>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Write it according to the pseudocode. That is, use the Monte Carlo method to solve the rendering equation. Unlike before, the samples this time are all in screen space. In the sampling process, you can use the SampleHemisphereUniform(inout s, out pdf) and SampleHemisphereCos(inout s, out pdf) provided by the framework. 
These two functions return a direction in local coordinates; the inout parameter is the random number s, and the output parameter is the sampling pdf.<\/p>\n<\/blockquote>\n\n\n\n<p>For this part, you need to understand the pseudocode in the figure below, and then complete EvalIndirectionLight() accordingly.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-240.png\" alt=\"img\" class=\"wp-image-1370 lazyload\"\/><noscript><img decoding=\"async\" width=\"1304\" height=\"512\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-240.png\" alt=\"img\" class=\"wp-image-1370 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-240.png 1304w, https:\/\/remoooo.com\/wp-content\/uploads\/image-240-300x118.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-240-1024x402.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-240-768x302.png 768w\" sizes=\"(max-width: 1304px) 100vw, 1304px\" \/><\/noscript><\/figure>\n\n\n\n<p>First of all, note that our sampling is still based on screen space, so anything not captured on screen (in the gBuffer) is treated as non-existent. You can think of the scene as a single shell of surfaces facing the camera.<\/p>\n\n\n\n<p>Indirect lighting involves randomly sampling directions on the upper hemisphere and computing the corresponding PDF. 
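<\/p>\n\n\n\n<p>Conceptually, this is what uniform hemisphere sampling plus the local basis do; below is a standalone JavaScript reconstruction of the idea (my own sketch, not the framework&#039;s actual GLSL): sample a direction on the z-up unit hemisphere with constant pdf $1\/(2\\pi)$, then rotate it around the shading normal.<\/p>\n\n\n\n

```javascript
// Uniform sampling of the z-up unit hemisphere; the pdf is constant 1/(2*pi).
function sampleHemisphereUniform(u1, u2) {
  const z = u1;                                 // cos(theta), uniform in [0, 1]
  const r = Math.sqrt(Math.max(0, 1 - z * z));
  const phi = 2 * Math.PI * u2;
  return { dir: [r * Math.cos(phi), r * Math.sin(phi), z], pdf: 1 / (2 * Math.PI) };
}

function dot(a, b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }
function cross(a, b) {
  return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]];
}

// Build two tangents orthogonal to n (the same role as the framework's LocalBasis).
function localBasis(n) {
  const axis = Math.abs(n[0]) > 0.9 ? [0, 1, 0] : [1, 0, 0]; // any axis not parallel to n
  let b1 = axis.map((v, i) => v - dot(axis, n) * n[i]);      // Gram-Schmidt against n
  const len = Math.hypot(b1[0], b1[1], b1[2]);
  b1 = b1.map(v => v / len);
  const b2 = cross(n, b1);
  return [b1, b2];
}

// Rotate a local (tangent-space) direction into world space around normal n.
function toWorld(local, n) {
  const [b1, b2] = localBasis(n);
  return [0, 1, 2].map(i => b1[i] * local[0] + b2[i] * local[1] + n[i] * local[2]);
}
```

\n\n\n\n<p>The sampled direction always has unit length and a non-negative dot product with the normal, which is exactly what the Monte Carlo sum over the hemisphere needs.<\/p>\n\n\n\n<p>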
Use InitRand(screenUV) to get a random number, then choose one of SampleHemisphereUniform(inout float s, out float pdf) or SampleHemisphereCos(inout float s, out float pdf); it updates the random number and returns the corresponding PDF along with a direction dir on the unit hemisphere in local coordinates.<\/p>\n\n\n\n<p>Pass the normal of the current shading point into LocalBasis(n, out b1, out b2) to obtain b1 and b2, where the three unit vectors n, b1, b2 are mutually orthogonal. The local frame formed by these three vectors converts dir into world coordinates. I will write about the principle of LocalBasis() at the end.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>By the way, the matrix constructed with the vectors n (normal), b1, and b2 is commonly referred to as the TBN matrix in computer graphics.<\/p>\n<\/blockquote>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ ssrFragment.glsl\n#define SAMPLE_NUM 5\n\nvec3 EvalIndirectionLight(vec3 wi, vec3 wo, vec2 screenUV){\n  vec3 L_ind = vec3(0.0);\n  float s = InitRand(screenUV);\n  vec3 normal = GetGBufferNormalWorld(screenUV);\n  vec3 b1, b2;\n  LocalBasis(normal, b1, b2);\n\n  for(int i = 0; i &lt; SAMPLE_NUM; i++){\n    float pdf;\n    vec3 direction = SampleHemisphereUniform(s, pdf);\n    vec3 worldDir = normalize(mat3(b1, b2, normal) * direction);\n\n    vec3 position_1;\n    if(RayMarch(vPosWorld.xyz, worldDir, position_1)){ \/\/ the sampled ray hit the scene at position_1\n      vec2 hitScreenUV = GetScreenCoordinate(position_1);\n      vec3 bsdf_d = EvalDiffuse(worldDir, wo, screenUV); \/\/ BRDF at the shading point\n      vec3 bsdf_i = EvalDiffuse(wi, worldDir, hitScreenUV); \/\/ BRDF at the hit point\n      L_ind += bsdf_d \/ pdf * bsdf_i * EvalDirectionalLight(hitScreenUV);\n    }\n  }\n  L_ind \/= float(SAMPLE_NUM);\n  return L_ind;\n}\n\/\/ ssrFragment.glsl\n\/\/ Main 
entry point for the shader\nvoid main() {\n  vec3 wi = normalize(uLightDir);\n  vec3 wo = normalize(uCameraPos - vPosWorld.xyz);\n  vec2 screenUV = GetScreenCoordinate(vPosWorld.xyz);\n\n  \/\/ Basic mirror-only SSR coefficient\n  float ssrCoeff = 0.0;\n  \/\/ Indirect light coefficient\n  float indCoeff = 0.3;\n\n  \/\/ Directional light\n  vec3 L_d = EvalDiffuse(wi, wo, screenUV) * EvalDirectionalLight(screenUV);\n  \/\/ SSR light\n  vec3 L_ssr = EvalSSR(wi, wo, screenUV) * ssrCoeff;\n  \/\/ Indirect light\n  vec3 L_i = EvalIndirectionLight(wi, wo, screenUV) * indCoeff;\n\n  vec3 result = L_d + L_ssr + L_i;\n  vec3 color = pow(clamp(result, vec3(0.0), vec3(1.0)), vec3(1.0 \/ 2.2));\n  gl_FragColor = vec4(vec3(color.rgb), 1.0);\n}<\/code><\/pre>\n\n\n\n<p>Indirect lighting only. Number of samples = 5.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-245.png\" alt=\"img\" class=\"wp-image-1375 lazyload\"\/><noscript><img decoding=\"async\" width=\"700\" height=\"510\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-245.png\" alt=\"img\" class=\"wp-image-1375 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-245.png 700w, https:\/\/remoooo.com\/wp-content\/uploads\/image-245-300x219.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/noscript><\/figure>\n\n\n\n<p>Direct lighting + indirect lighting. 
Number of samples = 5.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-241.png\" alt=\"img\" class=\"wp-image-1371 lazyload\"\/><noscript><img decoding=\"async\" width=\"667\" height=\"525\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-241.png\" alt=\"img\" class=\"wp-image-1371 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-241.png 667w, https:\/\/remoooo.com\/wp-content\/uploads\/image-241-300x236.png 300w\" sizes=\"(max-width: 667px) 100vw, 667px\" \/><\/noscript><\/figure>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Writing this part was such a headache. Even with SAMPLE_NUM set to 1, my computer was sweating profusely. Once the Live Server was on, even typing lagged. I couldn&#039;t stand it. Is this the performance of the M1 Pro? And what I can&#039;t stand the most is that Safari froze, and then the whole system froze. Is this macOS&#039;s User First strategy? I don&#039;t understand. I had no choice but to dig out my gaming PC and serve the project over the LAN (sad). I just didn&#039;t expect the RTX 3070 to sweat profusely running it too. <strong>It seems that the algorithm I wrote is a pile of shit, and my life is also a pile of shit.<\/strong><\/p>\n<\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\">4. 
RayMarch Improvements<\/h2>\n\n\n\n<p>The current RayMarch() is actually problematic and will cause light leakage.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-239.png\" alt=\"img\" class=\"wp-image-1369 lazyload\"\/><noscript><img decoding=\"async\" width=\"490\" height=\"318\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-239.png\" alt=\"img\" class=\"wp-image-1369 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-239.png 490w, https:\/\/remoooo.com\/wp-content\/uploads\/image-239-300x195.png 300w\" sizes=\"(max-width: 490px) 100vw, 490px\" \/><\/noscript><\/figure>\n\n\n\n<p>With 5 samples it runs at only about 46.2 FPS (my device is an M1 Pro with 16 GB).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-244.png\" alt=\"img\" class=\"wp-image-1374 lazyload\"\/><noscript><img decoding=\"async\" width=\"1106\" height=\"596\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-244.png\" alt=\"img\" class=\"wp-image-1374 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-244.png 1106w, https:\/\/remoooo.com\/wp-content\/uploads\/image-244-300x162.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-244-1024x552.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-244-768x414.png 768w\" sizes=\"(max-width: 1106px) 100vw, 1106px\" \/><\/noscript><\/figure>\n\n\n\n<p>Let&#039;s focus on why light leakage occurs. See the figure below: our gBuffer only holds the depth information of the blue part. 
Even if the algorithm above determines that the current curPos is deeper than the gBuffer depth, that does not guarantee curPos is the actual collision point. The algorithm does not consider the situation in the figure, which leads to light leakage.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-242.png\" alt=\"img\" class=\"wp-image-1372 lazyload\"\/><noscript><img decoding=\"async\" width=\"646\" height=\"366\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-242.png\" alt=\"img\" class=\"wp-image-1372 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-242.png 646w, https:\/\/remoooo.com\/wp-content\/uploads\/image-242-300x170.png 300w\" sizes=\"(max-width: 646px) 100vw, 646px\" \/><\/noscript><\/figure>\n\n\n\n<p>To <strong>solve the light leakage problem<\/strong>, we introduce a threshold (yes, it is an approximation). If the difference between curPos&#039;s depth and the depth recorded in the gBuffer is greater than this threshold, we are in the situation shown in the figure below: the screen-space information cannot correctly provide the reflection, so the SSR result of this shading point is vec3(0). 
It is so simple and crude!<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-243.png\" alt=\"img\" class=\"wp-image-1373 lazyload\"\/><noscript><img decoding=\"async\" width=\"518\" height=\"304\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-243.png\" alt=\"img\" class=\"wp-image-1373 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-243.png 518w, https:\/\/remoooo.com\/wp-content\/uploads\/image-243-300x176.png 300w\" sizes=\"(max-width: 518px) 100vw, 518px\" \/><\/noscript><\/figure>\n\n\n\n<p>The idea of the code is similar to before. At each step, we compare the depth of the next step position against the gBuffer depth. If the next position is still in front of the gBuffer surface (nextDepth &lt; gBufferDepth), we keep marching; otherwise we check whether the depth gap exceeds the threshold before reporting a hit.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bool RayMarch(vec3 ori, vec3 dir, out vec3 hitPos) {\n  const float EPS = 1e-2;\n  const int totalStepTimes = 60;\n  const float threshold = 0.1;\n  float step = 0.05;\n  vec3 stepDir = normalize(dir) * step;\n  vec3 curPos = ori + stepDir;\n  vec3 nextPos = curPos + stepDir;\n  for(int i = 0; i &lt; totalStepTimes; i++) {\n    if(GetDepth(nextPos) &lt; GetGBufferDepth(GetScreenCoordinate(nextPos))){\n      curPos = nextPos;\n      nextPos += stepDir;\n    }else if(GetGBufferDepth(GetScreenCoordinate(curPos)) - GetDepth(curPos) + EPS &gt; threshold){\n      return false;\n    }else{\n      curPos += stepDir;\n      vec2 screenUV = GetScreenCoordinate(curPos);\n      float rayDepth = GetDepth(curPos);\n      float gBufferDepth = GetGBufferDepth(screenUV);\n      if(rayDepth &gt; gBufferDepth + threshold){\n        hitPos = curPos;\n        return true;\n      }\n    }\n  }\n  return false;\n}<\/code><\/pre>\n\n\n\n<p>The frame rate dropped to around 42.6 FPS, but the image improved significantly! 
At least there was no noticeable light leakage.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-247.png\" alt=\"img\" class=\"wp-image-1377 lazyload\"\/><noscript><img decoding=\"async\" width=\"1126\" height=\"612\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-247.png\" alt=\"img\" class=\"wp-image-1377 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-247.png 1126w, https:\/\/remoooo.com\/wp-content\/uploads\/image-247-300x163.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-247-1024x557.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-247-768x417.png 768w\" sizes=\"(max-width: 1126px) 100vw, 1126px\" \/><\/noscript><\/figure>\n\n\n\n<p>However, there are still some flaws in the picture: burr-like reflection artifacts appear at the edges, which means the light leakage problem is still not fully solved, as shown in the following figure:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-246.png\" alt=\"img\" class=\"wp-image-1376 lazyload\"\/><noscript><img decoding=\"async\" width=\"980\" height=\"638\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-246.png\" alt=\"img\" class=\"wp-image-1376 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-246.png 980w, https:\/\/remoooo.com\/wp-content\/uploads\/image-246-300x195.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-246-768x500.png 768w\" sizes=\"(max-width: 980px) 100vw, 980px\" \/><\/noscript><\/figure>\n\n\n\n<p>The above method <strong>does indeed have a problem<\/strong>: when comparing against the threshold, we mistakenly used curPos (i.e., 
Step n in the figure below), which sends the code into the third branch and returns a hitPos based on the wrong curPos.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-249.png\" alt=\"img\" class=\"wp-image-1379 lazyload\"\/><noscript><img decoding=\"async\" width=\"586\" height=\"296\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-249.png\" alt=\"img\" class=\"wp-image-1379 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-249.png 586w, https:\/\/remoooo.com\/wp-content\/uploads\/image-249-300x152.png 300w\" sizes=\"(max-width: 586px) 100vw, 586px\" \/><\/noscript><\/figure>\n\n\n\n<p>Taking a step back, there is no way to guarantee that the final curPos lands exactly on the line between the object&#039;s edge and the camera origin; the blue line in the figure below is inherently discrete. Ideally we would find the curPos that sits &quot;exactly&quot; on the boundary and then compensate for the gap between &quot;Step n&quot; and that exact curPos (the source of the burr artifacts above), but for various precision reasons we simply cannot obtain it. 
In the figure below, the green line represents one step.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-251.png\" alt=\"img\" class=\"wp-image-1381 lazyload\"\/><noscript><img decoding=\"async\" width=\"590\" height=\"292\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-251.png\" alt=\"img\" class=\"wp-image-1381 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-251.png 590w, https:\/\/remoooo.com\/wp-content\/uploads\/image-251-300x148.png 300w\" sizes=\"(max-width: 590px) 100vw, 590px\" \/><\/noscript><\/figure>\n\n\n\n<p>Even if we tune the threshold\/step ratio toward 1, the artifact can only be alleviated, never fully eliminated, as shown in the figure below.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-250.png\" alt=\"img\" class=\"wp-image-1380 lazyload\"\/><noscript><img decoding=\"async\" width=\"1044\" height=\"666\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-250.png\" alt=\"img\" class=\"wp-image-1380 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-250.png 1044w, https:\/\/remoooo.com\/wp-content\/uploads\/image-250-300x191.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-250-1024x653.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-250-768x490.png 768w\" sizes=\"(max-width: 1044px) 100vw, 1044px\" \/><\/noscript><\/figure>\n\n\n\n<p>Therefore, the &quot;anti-light-leak&quot; method needs another improvement.<\/p>\n\n\n\n<p>The idea behind the improvement is simple: since I can&#039;t get the &quot;exact&quot; curPos, I will guess it. 
Specifically, I do a direct linear interpolation. Before interpolating, I make one approximation: treat the view rays as parallel to each other, form the similar triangles shown in the figure below, estimate the curPos we want, and use it as hitPos.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-248.png\" alt=\"img\" class=\"wp-image-1378 lazyload\"\/><noscript><img decoding=\"async\" width=\"470\" height=\"244\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-248.png\" alt=\"img\" class=\"wp-image-1378 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-248.png 470w, https:\/\/remoooo.com\/wp-content\/uploads\/image-248-300x156.png 300w\" sizes=\"(max-width: 470px) 100vw, 470px\" \/><\/noscript><\/figure>\n\n\n\n<p>$hitPos = curPos + stepDir \u22c5 s1 \/ (s1 + s2)$<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bool RayMarch(vec3 ori, vec3 dir, out vec3 hitPos) {\n    bool result = false;\n    const float EPS = 1e-3;\n    const int totalStepTimes = 60;\n    const float threshold = 0.1;\n    float step = 0.05;\n    vec3 stepDir = normalize(dir) * step;\n    vec3 curPos = ori + stepDir;\n    vec3 nextPos = curPos + stepDir;\n    for(int i = 0; i &lt; totalStepTimes; i++) {\n        if(GetDepth(nextPos) &lt; GetGBufferDepth(GetScreenCoordinate(nextPos))){\n            curPos = nextPos;\n            nextPos += stepDir;\n            continue;\n        }\n        float s1 = GetGBufferDepth(GetScreenCoordinate(curPos)) - GetDepth(curPos) + EPS;\n        float s2 = GetDepth(nextPos) - GetGBufferDepth(GetScreenCoordinate(nextPos)) + EPS;\n        if(s1 &lt; threshold &amp;&amp; s2 &lt; threshold){\n            hitPos = curPos + stepDir * s1 \/ (s1 + s2);\n            result = true;\n        }\n        break;\n    }\n    return result;\n}<\/code><\/pre>\n\n\n\n<p>The effect is quite good, with no ghosting or border artifacts. 
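The similar-triangle formula can be checked numerically. Here is a tiny Python sketch with made-up toy values (not the shader): when the ray overshoots the surface by exactly as much as it previously undershot it (s1 == s2), the interpolated hit lands exactly on the surface.

```python
# Numeric sketch of hitPos = curPos + stepDir * s1 / (s1 + s2).
# Toy values, not shader code: positions are (x, depth) pairs and the
# surface sits at depth 1.0.

def interpolate_hit(cur_pos, step_dir, s1, s2):
    # s1: how far curPos is still in front of the surface,
    # s2: how far nextPos has already gone past it.
    t = s1 / (s1 + s2)
    return tuple(c + d * t for c, d in zip(cur_pos, step_dir))

surface_depth = 1.0
cur = (0.9, 0.9)                          # just in front of the surface
step = (0.2, 0.2)                         # one march step; nextPos = (1.1, 1.1)
s1 = surface_depth - cur[1]               # 0.1 in front
s2 = (cur[1] + step[1]) - surface_depth   # 0.1 behind
hit = interpolate_hit(cur, step, s1, s2)
# With s1 == s2, t = 0.5 and the hit lands at depth 1.0, on the surface.
```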
And the frame rate is similar to the original algorithm, averaging around 49.2 FPS.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-252.png\" alt=\"img\" class=\"wp-image-1382 lazyload\"\/><noscript><img decoding=\"async\" width=\"1258\" height=\"512\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-252.png\" alt=\"img\" class=\"wp-image-1382 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-252.png 1258w, https:\/\/remoooo.com\/wp-content\/uploads\/image-252-300x122.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-252-1024x417.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-252-768x313.png 768w\" sizes=\"(max-width: 1258px) 100vw, 1258px\" \/><\/noscript><\/figure>\n\n\n\n<p>Next, we will focus on optimizing performance, specifically:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Add an adaptive step size<\/li>\n\n\n\n<li>Skip rays that leave the screen<\/li>\n<\/ul>\n\n\n\n<p>The <strong>off-screen early exit<\/strong> is very simple: if the uvScreen of curPos is not within [0, 1], the current march is abandoned. It only takes two extra lines at the beginning of the for loop, and in practice the frame rate rises slightly, by about 2\u20133 FPS.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>vec2 uvScreen = GetScreenCoordinate(curPos);\nif(any(bvec4(lessThan(uvScreen, vec2(0.0)), greaterThan(uvScreen, vec2(1.0))))) break;<\/code><\/pre>\n\n\n\n<p>The <strong>adaptive step<\/strong> is not difficult either. First, start with a larger initial step size. 
If, <strong>after stepping<\/strong>, curPos is <strong>off screen<\/strong>, or <strong>its depth is deeper than the gBuffer<\/strong>, or <strong>&quot;s1 &lt; threshold &amp;&amp; s2 &lt; threshold&quot;<\/strong> is not satisfied, halve the step size to regain accuracy.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bool RayMarch(vec3 ori, vec3 dir, out vec3 hitPos) {\n    const float EPS = 1e-2;\n    const int totalStepTimes = 20;\n    const float threshold = 0.1;\n    bool result = false, firstIn = false;\n    float step = 0.8;\n    vec3 curPos = ori;\n    vec3 nextPos;\n    for(int i = 0; i &lt; totalStepTimes; i++) {\n        nextPos = curPos + dir * step;\n        vec2 uvScreen = GetScreenCoordinate(curPos);\n        if(any(bvec4(lessThan(uvScreen, vec2(0.0)), greaterThan(uvScreen, vec2(1.0))))) break;\n        if (GetDepth(nextPos) &lt; GetGBufferDepth(GetScreenCoordinate(nextPos))){\n            curPos += dir * step;\n            if(firstIn) step *= 0.5;\n            continue;\n        }\n        firstIn = true;\n        if(step &lt; EPS){\n            float s1 = GetGBufferDepth(GetScreenCoordinate(curPos)) - GetDepth(curPos) + EPS;\n            float s2 = GetDepth(nextPos) - GetGBufferDepth(GetScreenCoordinate(nextPos)) + EPS;\n            if(s1 &lt; threshold &amp;&amp; s2 &lt; threshold){\n                hitPos = curPos + 2.0 * dir * step * s1 \/ (s1 + s2);\n                result = true;\n            }\n            break;\n        }\n        if(firstIn) step *= 0.5;\n    }\n    return result;\n}<\/code><\/pre>\n\n\n\n<p>After this improvement, the frame rate jumped to about 100 FPS, almost double.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-254.png\" alt=\"img\" class=\"wp-image-1384 lazyload\"\/><noscript><img decoding=\"async\" width=\"1270\" height=\"496\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-254.png\" alt=\"img\" class=\"wp-image-1384 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-254.png 1270w, https:\/\/remoooo.com\/wp-content\/uploads\/image-254-300x117.png 300w, 
https:\/\/remoooo.com\/wp-content\/uploads\/image-254-1024x400.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-254-768x300.png 768w\" sizes=\"(max-width: 1270px) 100vw, 1270px\" \/><\/noscript><\/figure>\n\n\n\n<p>Finally, tidy up the code.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#define EPS 5e-2\n#define TOTAL_STEP_TIMES 20\n#define THRESHOLD 0.1\n#define INIT_STEP 0.8\n\nbool outScreen(vec3 curPos){\n    vec2 uvScreen = GetScreenCoordinate(curPos);\n    return any(bvec4(lessThan(uvScreen, vec2(0.0)), greaterThan(uvScreen, vec2(1.0))));\n}\n\nbool testDepth(vec3 nextPos){\n    return GetDepth(nextPos) &lt; GetGBufferDepth(GetScreenCoordinate(nextPos));\n}\n\nbool RayMarch(vec3 ori, vec3 dir, out vec3 hitPos) {\n    float step = INIT_STEP;\n    bool result = false, firstIn = false;\n    vec3 nextPos, curPos = ori;\n    for(int i = 0; i &lt; TOTAL_STEP_TIMES; i++) {\n        nextPos = curPos + dir * step;\n        if(outScreen(curPos)) break;\n        if(testDepth(nextPos)){ \/\/ still in front of the surface: keep marching\n            curPos += dir * step;\n            continue;\n        }else{ \/\/ overshot: stepped past the surface\n            firstIn = true;\n            if(step &lt; EPS){\n                float s1 = GetGBufferDepth(GetScreenCoordinate(curPos)) - GetDepth(curPos) + EPS;\n                float s2 = GetDepth(nextPos) - GetGBufferDepth(GetScreenCoordinate(nextPos)) + EPS;\n                if(s1 &lt; THRESHOLD &amp;&amp; s2 &lt; THRESHOLD){\n                    hitPos = curPos + 2.0 * dir * step * s1 \/ (s1 + s2);\n                    result = true;\n                }\n                break;\n            }\n            if(firstIn) step *= 0.5;\n        }\n    }\n    return result;\n}<\/code><\/pre>\n\n\n\n<p>Switching to the cave scene with the sample count set to 32, the frame rate drops to a pitiful 4 FPS.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-258.png\" alt=\"img\" class=\"wp-image-1388 lazyload\"\/><noscript><img decoding=\"async\" width=\"1268\" height=\"620\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-258.png\" alt=\"img\" class=\"wp-image-1388 lazyload\" 
srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-258.png 1268w, https:\/\/remoooo.com\/wp-content\/uploads\/image-258-300x147.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-258-1024x501.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-258-768x376.png 768w\" sizes=\"(max-width: 1268px) 100vw, 1268px\" \/><\/noscript><\/figure>\n\n\n\n<p>And the quality of the secondary light source is very good.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-257.png\" alt=\"img\" class=\"wp-image-1387 lazyload\"\/><noscript><img decoding=\"async\" width=\"1262\" height=\"618\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-257.png\" alt=\"img\" class=\"wp-image-1387 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-257.png 1262w, https:\/\/remoooo.com\/wp-content\/uploads\/image-257-300x147.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-257-1024x501.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-257-768x376.png 768w\" sizes=\"(max-width: 1262px) 100vw, 1262px\" \/><\/noscript><\/figure>\n\n\n\n<p>However, this algorithm will cause new problems when applied to reflections, especially the following picture, which has serious distortion.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-253.png\" alt=\"img\" class=\"wp-image-1383 lazyload\"\/><noscript><img decoding=\"async\" width=\"496\" height=\"286\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-253.png\" alt=\"img\" class=\"wp-image-1383 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-253.png 496w, 
https:\/\/remoooo.com\/wp-content\/uploads\/image-253-300x173.png 300w\" sizes=\"(max-width: 496px) 100vw, 496px\" \/><\/noscript><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-255.png\" alt=\"img\" class=\"wp-image-1385 lazyload\"\/><noscript><img decoding=\"async\" width=\"538\" height=\"278\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-255.png\" alt=\"img\" class=\"wp-image-1385 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-255.png 538w, https:\/\/remoooo.com\/wp-content\/uploads\/image-255-300x155.png 300w\" sizes=\"(max-width: 538px) 100vw, 538px\" \/><\/noscript><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">5. Mipmap Implementation<\/h2>\n\n\n\n<p><a href=\"https:\/\/www.rastergrid.com\/blog\/2010\/10\/hierarchical-z-map-based-occlusion-culling\/\">Hierarchical-Z map based occlusion culling<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">6. How LocalBasis builds the TBN basis<\/h2>\n\n\n\n<p>Generally, a TBN frame (normal, tangent, and bitangent vectors) is constructed with cross products. The implementation is simple: first pick an auxiliary vector that is not parallel to the normal and take the cross product of the two to get the tangent; then take the cross product of the normal and the tangent to get the bitangent. 
The specific code is written as follows:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>void CalculateTBN(vec3 normal, out vec3 tangent, out vec3 bitangent) {\n    \/\/ Pick the coordinate axis least aligned with the normal as the helper.\n    vec3 helperVec;\n    if (abs(normal.x) &lt; abs(normal.y))\n        helperVec = vec3(1.0, 0.0, 0.0);\n    else\n        helperVec = vec3(0.0, 1.0, 0.0);\n    tangent = normalize(cross(helperVec, normal));\n    bitangent = normalize(cross(normal, tangent));\n}<\/code><\/pre>\n\n\n\n<p>But the code in the assignment framework cleverly avoids the <strong>cross product<\/strong> altogether. Simply put, it directly constructs two vectors whose pairwise <strong>dot products<\/strong> are all 0:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>$b1\u22c5n=0$<\/li>\n\n\n\n<li>$b2\u22c5n=0$<\/li>\n\n\n\n<li>$b1\u22c5b2=0$<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>void LocalBasis(vec3 n, out vec3 b1, out vec3 b2) {\n    float sign_ = sign(n.z);\n    if (n.z == 0.0) {\n        sign_ = 1.0;\n    }\n    float a = -1.0 \/ (sign_ + n.z);\n    float b = n.x * n.y * a;\n    b1 = vec3(1.0 + sign_ * n.x * n.x * a, sign_ * b, -sign_ * n.x);\n    b2 = vec3(b, sign_ + n.y * n.y * a, -n.y);\n}<\/code><\/pre>\n\n\n\n<p>This is a heuristic construction: it introduces a sign function, which is quite clever, and it even handles the division-by-zero case. The four assignment lines, however, read like the author&#039;s own algebraic expansion of the formula; here I will try to restore that derivation. 
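The three dot-product conditions can be checked numerically. Below is the LocalBasis construction transcribed to Python as a sketch (keeping the framework's n.z == 0 guard); for a unit normal, the returned b1 and b2 come out orthogonal to n and to each other, and unit-length.

```python
# LocalBasis transcribed from the GLSL above (Duff et al. 2017 style),
# as a numeric sanity check; n is assumed to be a unit vector.

def local_basis(n):
    nx, ny, nz = n
    sign_ = 1.0 if nz >= 0.0 else -1.0  # the framework treats n.z == 0 as +1
    a = -1.0 / (sign_ + nz)
    b = nx * ny * a
    b1 = (1.0 + sign_ * nx * nx * a, sign_ * b, -sign_ * nx)
    b2 = (b, sign_ + ny * ny * a, -ny)
    return b1, b2

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

n = (0.0, 0.6, 0.8)  # an example unit normal
b1, b2 = local_basis(n)
# b1·n, b2·n and b1·b2 all vanish (up to floating-point error).
```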
That is, I deduce it in reverse.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" data-src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-256.png\" alt=\"img\" class=\"wp-image-1386 lazyload\"\/><noscript><img decoding=\"async\" width=\"1201\" height=\"786\" src=\"https:\/\/\u80a5\u80a5.com\/wp-content\/uploads\/image-256.png\" alt=\"img\" class=\"wp-image-1386 lazyload\" srcset=\"https:\/\/remoooo.com\/wp-content\/uploads\/image-256.png 1201w, https:\/\/remoooo.com\/wp-content\/uploads\/image-256-300x196.png 300w, https:\/\/remoooo.com\/wp-content\/uploads\/image-256-1024x670.png 1024w, https:\/\/remoooo.com\/wp-content\/uploads\/image-256-768x503.png 768w\" sizes=\"(max-width: 1201px) 100vw, 1201px\" \/><\/noscript><\/figure>\n\n\n\n<p>Incidentally, the sign function in the code can simply be multiplied in at the last step.<\/p>\n\n\n\n<p>In fact, one could invent a hundred formulas like this, and I don&#039;t know what makes this one special. If you know, please tell me QAQ. If pressed, it can be explained like this:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Traditional cross-product-based methods may be numerically unstable because the cross-product result is close to the zero vector in this case. The method adopted in this paper is a heuristic method that constructs an orthogonal basis through a series of carefully designed steps. This method pays special attention to numerical stability, making it effective and stable when dealing with normal vectors close to extreme directions.<\/p>\n<\/blockquote>\n\n\n\n<p>Thanks to <a href=\"https:\/\/www.zhihu.com\/people\/54d8555ea97664cbdc5b362dae58e376\">@I am a dragon set little fruit<\/a> for pointing out that the method above is in fact well founded: the algorithm in the homework framework was obtained by Tom Duff et al. 
in 2017 by improving Frisvad&#039;s method. For details, please refer to the following two papers.<\/p>\n\n\n\n<p><a href=\"https:\/\/graphics.pixar.com\/library\/OrthonormalB\/paper.pdf\">https:\/\/graphics.pixar.com\/library\/OrthonormalB\/paper.pdf<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/backend.orbit.dtu.dk\/ws\/portalfiles\/portal\/126824972\/onb_frisvad_jgt2012_v2.pdf\">https:\/\/backend.orbit.dtu.dk\/ws\/portalfiles\/portal\/126824972\/onb_frisvad_jgt2012_v2.pdf<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Games 202<\/li>\n\n\n\n<li><a href=\"https:\/\/learnopengl.com\/Advanced-Lighting\/Normal-Mapping\">LearnOpenGL \u2013 Normal Mapping<\/a><\/li>\n<\/ol>","protected":false},"excerpt":{"rendered":"<p>Assignment source code: https:\/\/github.com\/Remyuu\/GAMES202-Homeworkgithu 
[&hellip;]<\/p>","protected":false},"author":1,"featured_media":1364,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[53],"tags":[56,73],"class_list":["post-1357","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech","tag-cg","tag-ssr"],"_links":{"self":[{"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/posts\/1357","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/comments?post=1357"}],"version-history":[{"count":1,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/posts\/1357\/revisions"}],"predecessor-version":[{"id":1389,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/posts\/1357\/revisions\/1389"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/media\/1364"}],"wp:attachment":[{"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/media?parent=1357"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/categories?post=1357"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/remoooo.com\/en\/wp-json\/wp\/v2\/tags?post=1357"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}