鄧小鐵_watermark.pdf

報告預覽

鄧小鐵_watermark.pdf

編號：155527

PDF 41頁 1.18MB 下載積分：免費下載

下載報告請您先登錄！

鄧小鐵_watermark.pdf

1、PreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryProvably Bound of Nash EquilibriumApproximatorXiaotie Deng Peking University2023.11.24based onZhaohua Chen,Xiaotie Deng,Wenhan Huang,Hanyu Li,Yuhao Li:On tightness of Tsaknakis-Spirakisdescent methods

2、 for approximate Nash equilibria.Inf.Comput.293:105046(2023)Zhijian Duan,Wenhan Huang,Dinghuai Zhang,Yali Du,Jun Wang,Yaodong Yang,Xiaotie Deng:Is NashEquilibrium Approximator Learnable?AAMAS 2023:233-241Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash E

3、quilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummary1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function Approxima

4、torApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryFinite Normal-Form GamesIn a finite normal-form game,there are several players.Ea

5、ch player has a set of pure strategies.Their joint choices,called their strategy profiles,determine a payoff function for each player.At a play,each player chooses a mixed(i.e.,randomized)strategy,simultaneously and without communication.The goal of each player is to maximize its expected payoff.Xia

6、otie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryNash Equilibrium(NE)Nash equilibrium:a strategy profile that every player cangain no extra payoff by unilterally deviating fro

7、m it.Example:Paper-Scissors-Rock.Payoff functions:PSRP0,0-1,11,-1S1,-10,0-1,1R-1,11,-10,0Unique Nash equilibrium:both players play(P,S,R)withprobability(1/3,1/3,1/3)T.Theorem(Nash,1950)Every finite normal-form game has at least one Nash equilibrium.Xiaotie Deng Peking UniversityProvably Bound of Nas

8、h Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryApproximate Nash EquilibriumIn practice,we may not be able to reach a Nash equilibrium.Nash equilibrium requires players to be perfectly rational.Infinite computation power,t

9、ime,etc.We may want to find a strategy profile that is close to a Nashequilibrium.approximate Nash equilibrium:a strategy profile thatevery player can gain at most extra payoff by unilterallydeviating from it.We term as the approximation.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibri

10、um ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS Al

11、gorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash

12、 Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmReference1Zhaohua Chen,Xiaotie Deng,Wenhan Huang,Hanyu Li,andYuhao Li.On tightness of the Tsaknakis-Spirakis algorithmfor approximate Nash equilibrium.SAGT 2021.2Zhaohua C

13、hen,Xiaotie Deng,Wenhan Huang,Hanyu Li,andYuhao Li.On tightness of Tsaknakis-Spirakis descentmethods for approximate Nash equilibria.Information andComputation 293(2023):105046.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for

14、 One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilib

15、rium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Uppe

16、r Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmLower Bound ResultsComputing(approximate)Nash equilibrium is a well-studiedtopic in complexity theory.Computing 1/poly(n)-approximate Nash equilibrium isPPAD-complete for k-player games when k 2 DGP06,CD06.For small but constant,com

17、puting-approximate Nashequilibrium requires quasi-polynomial time for 2-player gamesunder plausible complexity assumptions(ETH for PPAD)Rubinstein,2016.Computing approximate Nash equilibrium is hard evenfor 2-player games and small but constantapproximations.Xiaotie Deng Peking UniversityProvably Bo

18、und of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmUpper Bound ResultsThe other direction:find the minimum (after normalization

19、)admitting a polynomial-time algorithm.Almost all known algorithms only work for 2-player games,and follow the same framework:Search-and-mix methods:Search for several strategies x1,xsof player 1 andy1,ytof player 2 in polynomial time.Make certain convex combinations on these strategies,obtaining st

20、rategy profiles(x1,y1),.,(xu,yu).Calculate i=argmin1iuapprox(xi,yi);output(xi,yi).Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS Alg

21、orithmTight InstancesTightness of DFM AlgorithmUpper Bound ResultsSummary of polynomial-time algorithms computing-NE withconstant.Author initialsPublish yearApproximationKPS20060.75DMP20060.5DMP20070.38+BBM20070.36TS20070.3393+CDFFJS20150.38DFM20221/3+Xiaotie Deng Peking UniversityProvably Bound of

22、Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Boun

23、d ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator fo

24、r One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTS AlgorithmThe previous state-of-the-art approximation,0.3393+,isattained by the TS algorithm.In WINE 2007,H.Tsaknakis and P.G.Spirakis proposed it.An optim

25、ization approach-descent procedure.The current state-of-the-art approximation is 1/3+DFM22,obtained by elaborately modifying the TS algorithm.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function

26、ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTS AlgorithmTS algorithm is hard to analyze and improve.0.3393 was only known as the upper bound of the algorithm.Experiment studies show that TS algorithm perform well inpractice.The observed

27、worst approximation was 0.3385 FIS2015.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algor

28、ithmMain ContributionsWe give a comprehensive analysis and further understandings forthe worst cases of the algorithm with the following results:0.3393 is the lower bound for TS algorithm,even when wearbitrarily modify its mixing phase(called generalized TSalgorithms).Characterize all tight instance

29、s for generalized TS algorithmsas LP feasible solutions.1/3 is the lower bound of the DFM algorithm.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper B

30、ound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTwo-Player Normal-Form GamesThe two players are called row player and column player.They have m and n pure strategies,respectively.Their payoff functions can be described by m n matrices,R and C,respectively.Their mixed strategies are

31、x,y,respectively.Let k=(x1,.,xk)Rk:Pixi=1,xi 0.x m,y n.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightn

32、ess of DFM AlgorithmApproximate Nash EquilibriumNotice that all Nash equilibria will not change if we scale and shiftpayoff functions uniformly.So w.l.o.g.we assume Rij,Cij 0,1.A strategy profile(x,y)is called an approximate Nashequilibrium if every player can gain at most extra payoff bydeviating f

33、rom the present strategy,i.e.,fR(x,y):=max(Ry)xTRy .fC(x,y):=max(CTx)xTCy .Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmT

34、ight InstancesTightness of DFM AlgorithmLoss Function Minimizationf(x,y):=maxfR(x,y),fC(x,y).Clearly,f(x,y)is the approximation of(x,y).Optimization formalization:minimizef(x,y)subject tox m,y n.Now we can use optimization algorithms to solve this problem.Xiaotie Deng Peking UniversityProvably Bound

35、 of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTS AlgorithmSearch:Use a linear-programming-based descent method to find astati

36、onary point(x,y),together with a dual solution,w,z.Mix:The adjusted strategy pair is(x,y):=?11+w+1+x,z?,?w,11+z+1+y?,0(r2lnN,1(H,r)m+2r)+4r2ln(4/)mLD(h):=EuDE(h(u),u)is the expected approximation of h.LS(h):=1|S|PuSE(h(u),u)is the empirical averageapproximation of h over dataset S.N,1(H,r)is the cov

37、ering number of H that characterizes itscapacity.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryNash Equilibrium Function ApproximatorApplicationGeneralization BoundGener

38、alization BoundFor a hypothesis class H of NE function approximators and a gamedistribution D,with probability at least 1 over the draw ofdataset S from D,for all h H,the following generalizationbound holds:LD(h)LS(h)2 infr0(r2lnN,1(H,r)m+2r)+4r2ln(4/)mLD(h):=EuDE(h(u),u)is the expected approximatio

39、n of h.LS(h):=1|S|PuSE(h(u),u)is the empirical averageapproximation of h over dataset S.N,1(H,r)is the covering number of H that characterizes itscapacity.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibr

40、ium Function ApproximatorSummary1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie

41、 Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummarySummaryPolynomial Time Algorithm Design for a Bi-matrix Game isDifficult in General.Any known algorithm with a provable approximation bound is1/3 up to now.Warmstarting approximation algorithm performs well in termsof sampling complexity.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium Approximator

友情提示

1、下載報告失敗解決辦法
2、PDF文件下載后，可能會被瀏覽器默認打開，此種情況可以點擊瀏覽器菜單，保存網頁到桌面，就可以正常下載了。
3、本站不支持迅雷下載，請使用電腦自帶的IE瀏覽器，或者360瀏覽器、谷歌瀏覽器下載即可。
4、本站報告下載后的文檔和圖紙-無水印,預覽文檔經過壓縮，下載后原文更清晰。

本文（鄧小鐵_watermark.pdf）為本站（張5G）主動上傳，三個皮匠報告文庫僅提供信息存儲空間，僅對用戶上傳內容的表現方式做保護處理，對上載內容本身不做任何修改或編輯。若此文所含內容侵犯了您的版權或隱私，請立即通知三個皮匠報告文庫（點擊聯系客服），我們立即給予刪除！

溫馨提示：如果因為網速或其他原因下載失敗請重新下載，重復下載不扣分。

相關報告

聯系我們

0731-84720580
sgpjbg002
工作日 9:30 - 18:00

關于我們

侵權處理

關于我們

出版物經營許可證
工信部備案號：湘ICP備17000430號-2
公安備案號：湘公網安備43010402001071號

三個皮匠報告專業的行業報告下載站，每日更新，歡迎大家關注！

copyright@2008-2013 長沙景略智創信息技術有限公司版權所有
網站備案/許可證號：湘B2-20190120

客服

小程序

服務號

折疊

午夜网日韩中文字幕,日韩Av中文字幕久久,亚洲中文字幕在线一区二区,最新中文字幕在线视频网站