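Applying Variable Rate Shading (VRS) and Deep Learning Super Sampling (DLSS) in the QuicksilverX Engine
倪駿揚, Game Client Development, Tencent
王北辰, Game Client Development, Tencent

Contents
- Introduction to Variable Rate Shading
- Integration into the QuicksilverX engine
- VRS in Ring of Elysium (無限法則)
- Summary

VRS overview
- Rendering resolutions keep rising, and the pixel shader (PS) workload rises with them.
- HDR formats aggravate the bandwidth problem.
- VRS improves performance by reducing the number of PS invocations.
- It is a trade-off between performance and image quality.

Tier 1 limitations
- If a shader declares SV_Coverage in its inputs or outputs, VRS has no effect.
- EvaluateAttribute[AtCentroid|AtSample|Snapped] cannot be used on Tier 1. A shader that uses them also disables VRS.
- Using the centroid or sample keywords causes the same problem.

VRS: variable shading rate
- The figure shows a full-screen post-process pass at 16x16 resolution using a 2x2 shading rate.
- VRS greatly reduces the number of PS invocations. In this case they drop to about 1/4.
- VRS 1x1: 256 pixels, 256 PS invocations
- VRS 2x2: 256 pixels, 64 PS invocations
- VRS 4x4: 256 pixels, only 16 PS invocations

Edge preservation
VRS only reduces the number of PS invocations. Rasterization and depth/stencil testing still run at full resolution, so object edges do not become blurry.
[Figure: VRS 2x2 compared with a 50% resolution scale.]

D3D12 API
- Tier 1: calling RSSetShadingRate(...) on the command list makes all subsequent draw calls use that shading rate.
- Tier 2: specify a shading-rate image, or specify a per-triangle shading rate through SV_ShadingRate.
- Tier 2: combiners merge the different shading rates: min / max / combine / passthrough.

    typedef enum D3D12_SHADING_RATE
    {
        D3D12_SHADING_RATE_1X1 = 0,
        D3D12_SHADING_RATE_1X2 = 0x1,
        D3D12_SHADING_RATE_2X1 = 0x4,
        D3D12_SHADING_RATE_2X2 = 0x5,
        D3D12_SHADING_RATE_2X4 = 0x6,
        D3D12_SHADING_RATE_4X2 = 0x9,
        D3D12_SHADING_RATE_4X4 = 0xa
    } D3D12_SHADING_RATE;

Integrating VRS into QuicksilverX
[Figure: QuicksilverX engine and Ring of Elysium (無限法則) artwork.]

Integrating VRS into QuicksilverX
Advanced engine features:
- DX12 support (2019.6)
- Dynamic global illumination (2020.3)
- Ray-traced global illumination (2020.10)
- Strand-based hair system

Integrating VRS into QuicksilverX: API-level integration
- The interface is compatible with both DX9 and DX12.
- VRS is integrated at the DX12 API level.
- Tier 1 is supported. The engine still uses SM 5.0 to stay compatible with Windows 7.
- The code is compatible with both DX9 and DX12.
- VRS support is detected through the D3D12 feature query.
- Each pass uses its own shading rate.
- The shading rate parameter is cached to reduce API calls.

    #ifdef QS_DX12
    #define VRS_API_ENABLED 1 // VRS core API code
    #define VRS_APP_ENABLED 1 // VRS application code
    #else
    #define VRS_API_ENABLED 0 // VRS core API code
    #endif // QS_DX12

    // Shading rate enum
    enum EShadingRate
    {
        EShadingRate_1x1 = 0,
        EShadingRate_1x2 = 0x1,
        EShadingRate_2x1 = 0x4, // value missing on the slide; 0x4 mirrors D3D12_SHADING_RATE_2X1
        EShadingRate_2x2 = 0x5,
        // Additional Shading Rates Supported
        EShadingRate_2x4 = 0x6,
        EShadingRate_4x2 = 0x9,
        EShadingRate_4x4 = 0xa
    };

    inline void QsD3D12Drv::SetShadingRate(EShadingRate rate)
    {
    #if VRS_API_ENABLED
        GetGraphicCmdList()->SetShadingRate(rate);
    #endif
    }

    inline EShadingRate QsD3D12Drv::GetShadingRate()
    {
    #if VRS_API_ENABLED
        return GetGraphicCmdList()->GetShadingRate();
    #else
        return EShadingRate_1x1;
    #endif
    }

VRS pipeline changes
- VRS can be configured per pass or per draw call. GBufferOpaque is a pass; Sky is a draw call.
- Implementation details:
  - Passes are subdivided.
  - Shading rates nest: a pass uses shading rate A, and a sub-pass inside it can use shading rate B.
- Debugging: shading rates can be flipped quickly to compare how different rates look, e.g. alternating between EShadingRate_1x2 and EShadingRate_2x1.
- VRS LUT: a per-pass lookup table holds the configuration.

    // VRS enabled passes.
    enum EVRSPass
    {
        EVRSPass_OpaqueShadow,
        EVRSPass_GBufferOpaque,
        EVRSPass_GBufferTerrain,
        EVRSPass_GBufferRoad,
        EVRSPass_PostGBuffer,
        EVRSPass_OutdoorLighting,
        EVRSPass_Sky,
        EVRSPass_Fog,
        EVRSPass_WaterAndTranslucent, // name is truncated on the slide
        EVRSPass_Post,
        EVRSPass_Number
    };

    // VRS pass config
    struct VRSPassConfig
    {
        EShadingRate mShadingRate = EShadingRate_1x1;
        bool mAltered = false;
        int mAlterState = 0; // type is illegible on the slide; int assumed
    };

    // VRS pipeline look up table
    #include <array>
    using VRSPipelineConfig = std::array<VRSPassConfig, EVRSPass_Number>;

VRS pipeline changes
- Render thread: simple push/pop of the shading-rate state.
- Almost no changes to the original pipeline code.
- At most 30 API calls.
- States can be nested.

    void QSDirectLightingRenderer::LaunchRoadRenderList()
    {
        QS_RENDER_EVENT_EX(GBufferRoad);
        VRS_PUSH_PASS_RATE(EVRSPass_GBufferRoad);
        mRoadList.Render();
        VRS_POP_PASS_RATE(EVRSPass_GBufferRoad);
    }

VRS usage in Ring of Elysium
- Basic objects
- Terrain and roads, sky, fog
- Water and rain
- Lighting
- Particles

Basic objects
[Figure: basic objects at 1x1, 1x2, 2x1, 2x2, 2x4, 4x2 and 4x4.]
Edges are preserved, but normal detail is lost.

Basic objects
[Figure: basic objects at 2x1, 1x2, 2x2 and 1x1.]
Texture detail is lost.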
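[Figure, continued: basic objects at 4x4, 2x4 and 4x2.]

Basic objects
[Figure: alpha-tested basic objects with VRS enabled.]
Alpha-test detail is lost.

Basic objects: with VRS enabled
- Edges are preserved (O)
- Normal detail is lost (X)
- Texture detail is lost (X)
- Alpha-test detail is lost (X)

Performance comparison (RTX 2080, 1440p)

    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            3202                 1801                  658.62
    2x2            1394                 1795                  524.96              -20.3%
    4x2            775                  1778                  518.66              -21.3%

Terrain and roads
[Figure: terrain and roads at 1x1, 1x2, 2x1, 2x2, 2x4, 4x2 and 4x4.]

Terrain and roads
- Anisotropic sampling is relatively expensive.
- 2x1 and 4x2 look better than 1x2 and 2x4.
- Beyond 10 meters, 4x2 and 1x2 look about the same.
- 2x1 shows almost no artifacts at 1080p.
- Near and far terrain are separated: 1x1 for the nearest terrain, 2x1 at medium distance, 4x2 farther away.
[Figure: 2x1 vs 1x1 difference image; red pixels indicate 3% error.]

Performance comparison (RTX 2080, 1440p)

    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            2769                 2052                  646.14
    2x1            1494                 2052                  425.66              -34.1%
    2x2            880                  2052                  367.36              -43.1%
    4x2            489                  2052                  343.62              -46.8%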
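
The near/mid/far terrain split described above could be expressed as a small selection function. This is only a sketch, not the engine's code: `TerrainVrsBands`, `SelectTerrainShadingRate` and both distance thresholds are hypothetical, since the talk gives no band boundaries. The enum repeats three values of the engine's `EShadingRate` so the block compiles on its own.

```cpp
#include <cassert>

// Subset of the engine enum from the slides (values match D3D12_SHADING_RATE).
enum EShadingRate
{
    EShadingRate_1x1 = 0x0,
    EShadingRate_2x1 = 0x4,
    EShadingRate_4x2 = 0x9
};

// Hypothetical distance bands in metres. Both values are assumptions
// chosen for illustration; the talk does not state any thresholds.
struct TerrainVrsBands
{
    float nearEnd = 10.0f; // below this distance: 1x1
    float midEnd  = 50.0f; // below this distance: 2x1, otherwise 4x2
};

// Picks a shading rate for one terrain batch from its distance to the camera.
// On Tier 1 the rate applies to whole draw calls, so near and far terrain
// have to be submitted as separate batches for this to take effect.
inline EShadingRate SelectTerrainShadingRate(float distance,
                                             const TerrainVrsBands& bands = TerrainVrsBands())
{
    assert(distance >= 0.0f);
    assert(bands.nearEnd <= bands.midEnd);

    if (distance < bands.nearEnd)
        return EShadingRate_1x1; // nearest terrain keeps full shading detail
    if (distance < bands.midEnd)
        return EShadingRate_2x1; // medium distance
    return EShadingRate_4x2;     // far terrain
}
```

A renderer would call this once per terrain batch and pass the result to the driver's `SetShadingRate` before submitting the batch. Only 2x1 and 4x2 are used for the coarse bands because the slide reports that they look better than 1x2 and 2x4 on terrain.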
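Sky
[Figure: sky rendered with "Set VRS: Sky - EShadingRate_1x1", "Set VRS: Sky - EShadingRate_2x2" and "Set VRS: Sky - EShadingRate_4x4".]
Artifacts at 2x2 are almost invisible. Distant scenery like this is well suited to VRS.

Sky: performance comparison (RTX 2080, 1440p)

    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            1955                 1906                  120.74
    2x2            513                  1906                  86.08               -28.7%
    4x4            141                  1906                  85.66               -29.1%

Note: 4x4 brings no significant improvement over 2x2.

Water
[Figure: water at 2x2, 4x2, 2x4 and 4x4.]

Water: performance comparison (RTX 2080, 1440p)

    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            1868                 1743                  543.26
    2x1            945                  1743                  350.34              -35.5%
    4x2            275                  1743                  180.51              -66.8%
    4x4            170                  1743                  183.97              -66.1%

Rain
[Figure: rain at 1x1, 2x4 and 4x4.]
Under fast motion, 2x4 shows no noticeable loss of image quality.

Rain: performance comparison (RTX 2080, 1440p)

    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            1918                 1906                  212.91
    2x4            245                  1906                  88.26               -58.5%
    4x4            125                  1906                  85.47               -59.9%

Lighting
- Heavy computation and long GPU time, but no overdraw.
- VRS cannot be applied everywhere:
  - It affects the overall lighting result.
  - Artifacts are obvious.
  - Normal and edge detail are lost.
- VRS is applied partially:
  - It is used for some distant scenery, but not for characters.

Lighting
[Figure: Original compared with OutdoorLighting - 2x2.]
- 1x1 for character lighting: high quality.
- 2x2 for the scene: better performance.
- At 2x2 the scene shows little visible difference.

Lighting: performance comparison (RTX 2080, 1440p)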
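
    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            968                  949                   947.23
    2x2            249                  949                   377.78              -60.1%
    4x4            64.6                 949                   223.00              -76.5%

Particles
- Full-screen smoke is a very expensive pass in the game.
- Alpha blending covers the whole screen and is expensive even at half resolution.
- Overdraw is very high.
[Figure: overdraw view of the render target; overdraw 47.]

Particles
[Figure: particles at 1x1, 2x2 and 4x4.]

Particles: performance comparison (RTX 2080, 1440p)

    Shading rate   PS invocations (k)   Pixels rendered (k)   GPU duration (us)   Compared to 1x1
    1x1            30000                18600                 2890
    2x2            7601                 18600                 2640                -8.65%
    4x4            1952                 18600                 1410                -51.2%

In-game results: image quality comparison
[Figure: in-game frame with VRS off and with VRS on.]

In-game results
[Figure: Original compared with GBufferOpaque - 1x2, GBufferTerrain - 1x2, GBufferRoad - 1x2, PostGBuffer - 2x2, Sky - 4x4.]
Very close!

In-game results: VRS on/off comparison
[Figure: debug panel with Enable VRS, Gliding Override, Gliding Speed Threshold, Driving Override, Driving Speed Threshold, and per-pass Default and Alt shading rates. Temporal AA enabled.]

Summary
- VRS Tier 1 is integrated into the QuicksilverX engine and enables coarse-grained optimization.
- VRS is used on passes that are bottlenecks and whose image quality is not noticeably affected.
- Limited by Tier 1, finer-grained optimization is not possible.

Future work
- Performance optimization
- Exploring Tier 2:
  - Image-based
  - Per-triangle
- VRS + DLSS:
  - Optimize translucent and other objects that DLSS cannot optimize.
  - Mix 1x1 and 2x2 shading rates to remove some DLSS artifacts (ghosting and some edge problems).

References
- Variable Rate Shading | DirectX-Specs: https://microsoft.github.io/DirectX-Specs/d3d/VariableRateShading.html
- VRWorks - Variable Rate Shading (VRS): https:/…
- Variable Rate Shading Tier 1 with Microsoft DirectX* 12 from Theory to Practice: https:/…
- Variable Rate Shading with Depth of Field: https:/…
- Yang et al.'s GDC talk: https:/…

[Figure: DLSS in the frame pipeline. Recoverable labels: Start, Render, PostEffect, Tonemap, DLSS, Render Resolution, Display Out.]

Jitter points
[Figure]

Motion vectors
[Figure]

Depth map
[Figure]

Performance: DLSS off
[Figure]

Performance: DLSS on
[Figure]

Anti-aliasing: DLSS off
[Figure]

Anti-aliasing: DLSS on
[Figure]

Scene stability: DLSS off
[Figure]

Scene stability: DLSS on
[Figure]

Summary
- Frame rate improves significantly on 4K displays.
- Scene stability is high.
- The more jitter points, the better the result.
- Motion vectors must be computed accurately.

Thank you! Q&A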
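
The DLSS summary notes that more jitter points give better results, but the talk does not say how the engine generates them. As an illustration only, the sketch below assumes a Halton(2,3) sequence, a common choice for sub-pixel camera jitter in temporal techniques. `Halton`, `JitterOffset`, `SubPixelJitter` and the `phaseCount` parameter are hypothetical names, not engine or DLSS APIs.

```cpp
#include <cassert>
#include <cstdint>

// Radical inverse of `index` in `base`: element `index` of the Halton
// sequence for that base, a value in [0, 1).
inline float Halton(std::uint32_t index, std::uint32_t base)
{
    assert(base >= 2);
    float result = 0.0f;
    float fraction = 1.0f / static_cast<float>(base);
    while (index > 0)
    {
        result += static_cast<float>(index % base) * fraction;
        index /= base;
        fraction /= static_cast<float>(base);
    }
    return result;
}

struct JitterOffset
{
    float x; // sub-pixel offset in pixels, range [-0.5, 0.5)
    float y;
};

// Returns the jitter for a frame, cycling through `phaseCount` distinct
// points. A larger phaseCount means more jitter points per cycle.
inline JitterOffset SubPixelJitter(std::uint32_t frameIndex, std::uint32_t phaseCount)
{
    assert(phaseCount > 0);
    // Start at index 1: Halton index 0 is (0, 0) in every base.
    const std::uint32_t i = (frameIndex % phaseCount) + 1;
    return { Halton(i, 2) - 0.5f, Halton(i, 3) - 0.5f };
}
```

A renderer would typically apply this offset to the projection matrix each frame and report the same offset to the upscaler. How that is done is specific to the engine and the DLSS integration and is not shown here.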