Artificial Intelligence Index Report 2023

Introduction to the AI Index Report 2023

Welcome to the sixth edition of the AI Index Report! This year, the report introduces more original data than any previous edition, including a new chapter on AI public opinion, a more thorough technical performance chapter, original analysis about large language and multimodal models, detailed trends in global AI legislation records, a study of the environmental impact of AI systems, and more.

The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI. The report aims to be the world's most credible and authoritative source for data and insights about AI.

From the Co-Directors

AI has moved into its era of deployment; throughout 2022 and the beginning of 2023, new large-scale AI models have been released every month. These models, such as ChatGPT, Stable Diffusion, Whisper, and DALL-E 2, are capable of an increasingly broad range of tasks, from text manipulation and analysis, to image generation, to unprecedentedly good speech recognition. These systems demonstrate capabilities in question answering and the generation of text, image, and code unimagined a decade ago, and they outperform the state of the art on many benchmarks, old and new. However, they are prone to hallucination, routinely biased, and can be tricked into serving nefarious aims, highlighting the complicated ethical challenges associated with their deployment.

Although 2022 was the first year in a decade where private AI investment decreased, AI is
still a topic of great interest to policymakers, industry leaders, researchers, and the public. Policymakers are talking about AI more than ever before. Industry leaders that have integrated AI into their businesses are seeing tangible cost and revenue benefits. The number of AI publications and collaborations continues to increase. And the public is forming sharper opinions about AI and which elements they like or dislike.

AI will continue to improve and, as such, become a greater part of all our lives. Given the increased presence of this technology and its potential for massive disruption, we should all begin thinking more critically about how exactly we want AI to be developed and deployed. We should also ask questions about who is deploying it: as our analysis shows, AI is increasingly defined by the actions of a small set of private sector actors, rather than a broader range of societal actors. This year's AI Index paints a picture of where we are so far with AI, in order to highlight what might await us in the future.

Jack Clark and Ray Perrault

Top Ten Takeaways

1. Industry races ahead of academia. Until 2014, most significant machine learning models were released by academia. Since then, industry has taken over. In 2022, there were 32 significant industry-produced machine learning models compared to just three produced by academia. Building state-of-the-art AI systems increasingly requires large amounts of data, computer power, and money: resources that industry actors inherently possess in greater amounts compared to nonprofits and academia.

2. Performance saturation on traditional benchmarks. AI continued to post state-of-the-art results, but year-over-year improvement on many benchmarks continues to be marginal. Moreover, the speed at which benchmark saturation is being reached is increasing. However, new, more comprehensive benchmarking suites such as BIG-bench and HELM are being released.

3. AI is both helping and harming the environment. New research suggests that AI systems can have serious environmental impacts. According to Luccioni et al., 2022, BLOOM's training run emitted 25 times more carbon than a single air traveler on a one-way trip from New York to San Francisco. Still, new reinforcement learning models like BCOOLER show that AI systems can be used to optimize energy usage.

4. The world's best new scientist ... AI? AI models are starting to rapidly accelerate scientific progress and in 2022 were used to aid hydrogen fusion, improve the efficiency of matrix manipulation, and generate new antibodies.

5. The number of incidents concerning the misuse of AI is rapidly rising. According to the AIAAIC database, which tracks incidents related to the ethical misuse of AI, the number of AI incidents and controversies has increased 26 times since 2012. Some notable incidents in 2022 included a deepfake video of Ukrainian President Volodymyr Zelenskyy surrendering and U.S. prisons using call-monitoring technology on their inmates. This growth is evidence of both greater use of AI technologies and awareness of misuse possibilities.

6. The demand for AI-related professional skills is increasing across virtually every American industrial sector. Across every sector in the United States for which there is data (with the exception of agriculture, forestry, fishing, and hunting), the number of AI-related job postings has increased on average from 1.7% in 2021 to 1.9% in 2022. Employers in the United States are increasingly looking for workers with AI-related skills.

7. For the first time in the last decade, year-over-year private investment in AI decreased. Global AI private investment was $91.9 billion in 2022, which represented a 26.7% decrease since 2021. The total number of AI-related funding events as well as the number of newly funded AI companies likewise decreased. Still, during the last decade as a
whole, AI investment has significantly increased. In 2022 the amount of private investment in AI was 18 times greater than it was in 2013.

8. While the proportion of companies adopting AI has plateaued, the companies that have adopted AI continue to pull ahead. The proportion of companies adopting AI in 2022 has more than doubled since 2017, though it has plateaued in recent years between 50% and 60%, according to the results of McKinsey's annual research survey. Organizations that have adopted AI report realizing meaningful cost decreases and revenue increases.

9. Policymaker interest in AI is on the rise. An AI Index analysis of the legislative records of 127 countries shows that the number of bills containing “artificial intelligence” that were passed into law grew from just 1 in 2016 to 37 in 2022. An analysis of the parliamentary records on AI in 81 countries likewise shows that mentions of AI in global legislative proceedings have increased nearly 6.5 times since 2016.

10. Chinese citizens are among those who feel the most positively about AI products and services. Americans ... not so much. In a 2022 IPSOS survey, 78% of Chinese respondents (the highest proportion of surveyed countries) agreed with the statement that products and services using AI have more benefits than drawbacks. After Chinese respondents, those from Saudi Arabia (76%) and India (71%) felt the most positive about AI products. Only 35% of sampled Americans (among the lowest of surveyed countries) agreed that products and services using AI
had more benefits than drawbacks.

Steering Committee

Co-directors
Jack Clark, Anthropic, OECD
Raymond Perrault, SRI International

Members
Erik Brynjolfsson, Stanford University
John Etchemendy, Stanford University
Katrina Ligett, Hebrew University
Terah Lyons
James Manyika, Google
Juan Carlos Niebles, Stanford University, Salesforce
Vanessa Parli, Stanford University
Yoav Shoham (Founding Director), Stanford University, AI21 Labs
Russell Wald, Stanford University

Staff and Researchers

Research Manager and Editor in Chief
Nestor Maslej, Stanford University

Research Associate
Loredana Fattorini, Stanford University

Affiliated Researchers
Elif Kiesow Cortez, Stanford Law School Research Fellow
Helen Ngo, Hugging Face
Robi Rahman, Data Scientist
Alexandra Rome, Freelance Researcher

Graduate Researcher
Han Bai, Stanford University

Undergraduate Researchers
Vania Chow, Stanford University
Sukrut Oak, Stanford University
Mena Hassan, Stanford University
Lucy Zimmerman, Stanford University
Elizabeth Zhu, Stanford University
Siddhartha Javvaji, Stanford University
Stone Yang, Stanford University
Naima Patel, Stanford University

How to Cite This Report

Nestor Maslej, Loredana Fattorini, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Helen Ngo, Juan Carlos Niebles, Vanessa Parli, Yoav Shoham, Russell Wald, Jack Clark, and Raymond Perrault, “The AI Index 2023 Annual Report,” AI Index Steering Committee, Institute for Human-Centered AI, Stanford University, Stanford, CA, April 2023.

The AI Index 2023 Annual Report by Stanford University is licensed under Attribution-NoDerivatives 4.0 International.

Public Data and Tools

The AI Index 2023 Report is supplemented by raw data and an interactive tool. We invite each reader to use the data and the tool in a way most relevant to their work and interests.

Raw data and charts: The public data and high-resolution images of all the charts in the report are available on Google Drive.

Global AI Vibrancy Tool: Compare up to 30 countries across 21 indicators. The Global AI Vibrancy tool will be updated in the latter half of 2023.

AI Index and Stanford HAI

The AI Index is an independent initiative at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). We welcome feedback and new ideas for next year. Contact us at AI-Index-Report@stanford.edu.

The AI Index was conceived within the One Hundred Year Study on AI (AI100).

Contributors

We want to acknowledge the following individuals by chapter and section for their contributions of data, analysis, advice, and expert commentary included in the AI Index 2023 Report:

Research and Development
Sara Abdulla, Catherine Aiken, Luis Aranda, Peter Cihon, Jack Clark, Loredana Fattorini, Nestor Maslej, Besher Massri, Vanessa Parli, Naima Patel, Ray Perrault, Robi Rahman, Alexandra Rome, Kevin Xu

Technical Performance
Jack Clark, Loredana Fattorini, Siddhartha Javvaji, Katrina Ligett, Nestor Maslej, Juan Carlos Niebles, Sukrut Oak, Vanessa Parli, Ray Perrault, Robi Rahman, Alexandra Rome, Yoav Shoham, Elizabeth Zhu

Technical AI Ethics
Jack Clark, Loredana Fattorini, Katrina Ligett, Nestor Maslej, Helen Ngo, Sukrut Oak, Vanessa Parli, Ray Perrault, Alexandra Rome, Elizabeth Zhu, Lucy Zimmerman

Economy
Susanne Bieller, Erik Brynjolfsson, Vania Chow, Jack Clark, Natalia Dorogi, Murat Erer, Loredana Fattorini, Akash Kaura, James Manyika, Nestor Maslej, Layla O'Kane, Vanessa Parli, Ray Perrault, Brittany Presten, Alexandra Rome, Nicole Seredenko, Bledi Taska, Bill Valle, Casey Weston

Education
Han Bai, Betsy Bizot, Jack Clark, John Etchemendy, Loredana Fattorini, Katrina Ligett, Nestor Maslej, Vanessa Parli, Ray Perrault, Sean Roberts, Alexandra Rome

Policy and Governance
Meghan Anand, Han Bai, Vania Chow, Jack Clark, Elif Kiesow Cortez, Rebecca DeCrescenzo, Loredana Fattorini, Taehwa Hong, Joe Hsu, Kai Kato, Terah Lyons, Nestor Maslej, Alistair Murray, Vanessa Parli, Ray Perrault, Alexandra Rome, Sarah Smedley, Russell Wald, Brian Williams, Catherina Xu, Stone Yang, Katie Yoon, Daniel Zhang

Diversity
Han Bai, Betsy Bizot, Jack Clark, Loredana Fattorini, Nezihe Merve Gürel, Mena Hassan, Katrina Ligett, Nestor Maslej, Vanessa Parli, Ray Perrault, Sean Roberts, Alexandra Rome, Sarah Tan, Lucy Zimmerman

Public Opinion
Jack Clark, Loredana Fattorini, Mena Hassan, Nestor Maslej, Vanessa Parli, Ray Perrault, Alexandra Rome, Nicole Seredenko, Bill Valle, Lucy Zimmerman

Conference Attendance
Terri Auricchio (ICML), Lee Campbell (ICLR), Cassio de Campos (UAI), Meredith Ellison
(AAAI), Nicole Finn (CVPR), Vasant Gajanan (AAAI), Katja Hofmann (ICLR), Gerhard Lakemeyer (KR), Seth Lazar (FAccT), Shugen Ma (IROS), Becky Obbema (NeurIPS), Vesna Sabljakovic-Fritz (IJCAI), Csaba Szepesvari (ICML), Matthew Taylor (AAMAS), Sylvie Thiebaux (ICAPS), Pradeep Varakantham (ICAPS)

We thank the following organizations and individuals who provided data for inclusion in the AI Index 2023 Report:

Organizations
Code.org: Sean Roberts
Center for Security and Emerging Technology, Georgetown University: Sara Abdulla, Catherine Aiken
Computing Research Association: Betsy Bizot
GitHub: Peter Cihon, Kevin Xu
Govini: Rebecca DeCrescenzo, Joe Hsu, Sarah Smedley
Lightcast: Layla O'Kane, Bledi Taska
LinkedIn: Murat Erer, Akash Kaura, Casey Weston
McKinsey & Company: Natalia Dorogi, Brittany Presten
NetBase Quid: Nicole Seredenko, Bill Valle
OECD.AI Policy Observatory: Luis Aranda, Besher Massri
Women in Machine Learning: Nezihe Merve Gürel, Sarah Tan

We also would like to thank Jeanina Casusi, Nancy King, Shana Lynch, Jonathan Mindes, Michi Turner, and Madeleine Wright for their help in preparing this report, and Joe Hinman and Santanu Mukherjee for their help in maintaining the AI Index website.

Table of Contents

Report Highlights
Chapter 1 Research and Development
Chapter 2 Technical Performance
Chapter 3 Technical AI Ethics
Chapter 4 The Economy
Chapter 5 Education
Chapter 6 Policy and Governance
Chapter 7 Diversity
Chapter 8 Public Opinion
Appendix

Report Highlights

Chapter 1: Research and Development

The United States and China had the greatest number of cross-country collaborations in AI publications from 2010 to 2021, although the pace of collaboration has slowed. The number of AI research collaborations between the United States and China increased roughly 4 times since 2010, and was 2.5 times greater than the collaboration totals of the next nearest country pair, the United Kingdom and China. However, the total number of U.S.-China collaborations only increased by 2.1% from 2020
to 2021, the smallest year-over-year growth rate since 2010.

AI research is on the rise, across the board. The total number of AI publications has more than doubled since 2010. The specific AI topics that continue dominating research include pattern recognition, machine learning, and computer vision.

China continues to lead in total AI journal, conference, and repository publications. The United States is still ahead in terms of AI conference and repository citations, but those leads are slowly eroding. Still, the majority of the world's large language and multimodal models (54% in 2022) are produced by American institutions.

Industry races ahead of academia. Until 2014, most significant machine learning models were released by academia. Since then, industry has taken over. In 2022, there were 32 significant industry-produced machine learning models compared to just three produced by academia. Building state-of-the-art AI systems increasingly requires large amounts of data, computer power, and money: resources that industry actors inherently possess in greater amounts compared to nonprofits and academia.

Large language models are getting bigger and more expensive. GPT-2, released in 2019, considered by many to be the first large language model, had 1.5 billion parameters and cost an estimated $50,000 USD to train. PaLM, one of the flagship large language models launched in 2022, had 540 billion parameters and cost an estimated $8 million USD; PaLM was around 360 times larger than GPT-2 and cost 160 times more. It's not
just PaLM: Across the board, large language and multimodal models are becoming larger and pricier.

Chapter 2: Technical Performance

Performance saturation on traditional benchmarks. AI continued to post state-of-the-art results, but year-over-year improvement on many benchmarks continues to be marginal. Moreover, the speed at which benchmark saturation is being reached is increasing. However, new, more comprehensive benchmarking suites such as BIG-bench and HELM are being released.

Generative AI breaks into the public consciousness. 2022 saw the release of text-to-image models like DALL-E 2 and Stable Diffusion, text-to-video systems like Make-A-Video, and chatbots like ChatGPT. Still, these systems can be prone to hallucination, confidently outputting incoherent or untrue responses, making it hard to rely on them for critical applications.

AI systems become more flexible. Traditionally AI systems have performed well on narrow tasks but have struggled across broader tasks. Recently released models challenge that trend; BEiT-3, PaLI, and Gato, among others, are single AI systems increasingly capable of navigating multiple tasks (for example, vision, language).

Capable language models still struggle with reasoning. Language models continued to improve their generative capabilities, but new research suggests that they still struggle with complex planning tasks.

AI is both helping and harming the environment. New research suggests that AI systems can have serious environmental impacts. According to Luccioni et al., 2022, BLOOM's training run emitted 25 times more carbon than a single air traveler on a one-way trip from New York to San Francisco. Still, new reinforcement learning models like BCOOLER show that AI systems can be used to optimize energy usage.

The world's best new scientist ... AI? AI models are starting to rapidly accelerate scientific progress and in 2022 were used to aid hydrogen fusion, improve the efficiency of matrix manipulation, and generate new antibodies.

AI starts to build better AI. Nvidia used an AI reinforcement learning agent to improve the design of the chips that power AI systems. Similarly, Google recently used one of its language models, PaLM, to suggest ways to improve the very same model. Self-improving AI learning will accelerate AI progress.

Chapter 3: Technical AI Ethics

The effects of model scale on bias
and toxicity are confounded by training data and mitigation methods. In the past year, several institutions have built their own large models trained on proprietary data, and while large models are still toxic and biased, new evidence suggests that these issues can be somewhat mitigated after training larger models with instruction-tuning.

Generative models have arrived and so have their ethical problems. In 2022, generative models became part of the zeitgeist. These models are capable but also come with ethical challenges. Text-to-image generators are routinely biased along gender dimensions, and chatbots like ChatGPT can be tricked into serving nefarious aims.

The number of incidents concerning the misuse of AI is rapidly rising. According to the AIAAIC database, which tracks incidents related to the ethical misuse of AI, the number of AI incidents and controversies has increased 26 times since 2012. Some notable incidents in 2022 included a deepfake video of Ukrainian President Volodymyr Zelenskyy surrendering and U.S. prisons using call-monitoring technology on their inmates. This growth is evidence of both greater use of AI technologies and awareness of misuse possibilities.

Fairer models may not be less biased. Extensive analysis of language models suggests that while there is a clear correlation between performance and fairness, fairness and bias can be at odds: Language models which perform better on certain fairness benchmarks tend to have worse gender bias.

Interest in AI ethics continues
to skyrocket. The number of accepted submissions to FAccT, a leading AI ethics conference, has more than doubled since 2021 and increased by a factor of 10 since 2018. 2022 also saw more submissions than ever from industry actors.

Automated fact-checking with natural language processing isn't so straightforward after all. While several benchmarks have been developed for automated fact-checking, researchers find that 11 of 16 of such datasets rely on evidence “leaked” from fact-checking reports which did not exist at the time of the claim surfacing.

Chapter 4: The Economy

The demand for AI-related professional skills is increasing across virtually every American industrial sector. Across every sector in the United States for which there is data (with the exception of agriculture, forestry, fishing, and hunting), the number of AI-related job postings has increased on average from 1.7% in 2021 to 1.9% in 2022. Employers in the United States are increasingly looking for workers with AI-related skills.

For the first time in the last decade, year-over-year private investment in AI decreased. Global AI private investment was $91.9 billion in 2022, which represented a 26.7% decrease since 2021. The total number of AI-related funding events as well as the number of newly funded AI companies likewise decreased. Still, during the last decade as a whole, AI investment has significantly increased. In 2022 the amount of private investment in AI was 18 times greater than it was in 2013.

Once again, the United States leads in investment in AI. The U.S. led the world in terms of total amount of AI private investment. In 2022, the $47.4 billion invested in the U.S. was roughly 3.5 times the amount invested in the next highest country, China ($13.4 billion). The U.S. also continues to lead in
terms of total number of newly funded AI companies, seeing 1.9 times more than the European Union and the United Kingdom combined, and 3.4 times more than China.

In 2022, the AI focus area with the most investment was medical and healthcare ($6.1 billion); followed by data management, processing, and cloud ($5.9 billion); and Fintech ($5.5 billion). However, mirroring the broader trend in AI private investment, most AI focus areas saw less investment in 2022 than in 2021. In the last year, the three largest AI private investment events were: (1) a $2.5 billion funding event for GAC Aion New Energy Automobile, a Chinese manufacturer of electric vehicles; (2) a $1.5 billion Series E funding round for Anduril Industries, a U.S. defense products company that builds technology for military agencies and border surveillance; and (3) a $1.2 billion investment in Celonis, a business-data consulting company based in Germany.

While the proportion of companies adopting AI has plateaued, the companies that have adopted AI continue to pull ahead. The proportion of companies adopting AI in 2022 has more than doubled since 2017, though it has plateaued in recent years between 50% and 60%, according to the results of McKinsey's annual research survey. Organizations that have adopted AI report realizing meaningful cost decreases and revenue increases.

AI is being deployed by businesses in multifaceted ways. The AI capabilities most likely to have been embedded in businesses include robotic process automation (39%), computer vision (34%), NL text understanding (33%), and virtual agents (33%). Moreover, the most commonly adopted AI use case in 2022 was service operations optimization (24%), followed by the creation of new AI-based products (20%), customer segmentation (19%), customer service analytics (19%), and new AI-based enhancement of products (19%).

AI tools like Copilot are tangibly helping workers. Results of a GitHub survey on the use of Copilot, a text-to-code AI system, find that 88% of surveyed respondents feel more productive when using the system, 74% feel they are able to focus on more satisfying work, and 88% feel they are able to complete tasks more quickly.

China dominates industrial robot installations. In 2013, China overtook Japan as the nation installing the most industrial robots. Since then, the gap between the total number of industrial robots installed by
China and the next-nearest nation has widened. In 2021, China installed more industrial robots than the rest of the world combined.

Chapter 5: Education

More and more AI specialization. The proportion of new computer science PhD graduates from U.S. universities who specialized in AI jumped to 19.1% in 2021, from 14.9% in 2020 and 10.2% in 2010.

New AI PhDs increasingly head to industry. In 2011, roughly the same proportion of new AI PhD graduates took jobs in industry (40.9%) as opposed to academia (41.6%). Since then, however, a majority of AI PhDs have headed to industry. In 2021, 65.4% of AI PhDs took jobs in industry, more than double the 28.2% who took jobs in academia.

New North American CS, CE, and information faculty hires stayed flat. In the last decade, the total number of new North American computer science (CS), computer engineering (CE), and information faculty hires has decreased: There were 710 total hires in 2021 compared to 733 in 2012. Similarly, the total number of tenure-track hires peaked in 2019 at 422 and then dropped to 324 in 2021.

The gap in external research funding for private versus public American CS departments continues to widen. In 2011, the median amount of total expenditure from external sources for computing research was roughly the same for private and public CS departments in the United States. Since then, the gap has widened, with private U.S. CS departments receiving millions more in additional funding than public universities. In 2021, the median expenditure for private universities was $9.7 million, compared to $5.7 million for public universities.

Interest in K-12 AI and computer science education grows in both the United States and the rest of the world. In 2021, a total of 181,040 AP computer science exams were taken by American students, a 1.0% increase from the previous year. Since 2007, the number of AP computer science exams has increased ninefold. As of 2021, 11 countries, including Belgium, China, and South Korea, have officially endorsed and implemented a K-12 AI curriculum.

Chapter 6: Policy and Governance

Policymaker interest in AI is on the rise. An AI Index analysis of the legislative records of 127 countries shows that the number of bills containing “artificial intelligence” that were passed into law grew from just 1 in 2016 to 37 in 2022. An analysis of the parliamentary records on AI in 81
countries likewise shows that mentions of AI in global legislative proceedings have increased nearly 6.5 times since 2016.

From talk to enactment: the U.S. passed more AI bills than ever before. In 2021, only 2% of all federal AI bills in the United States were passed into law. This number jumped to 10% in 2022. Similarly, last year 35% of all state-level AI bills were passed into law.

When it comes to AI, policymakers have a lot of thoughts. A qualitative analysis of the parliamentary proceedings of a diverse group of nations reveals that policymakers think about AI from a wide range of perspectives. For example, in 2022, legislators in the United Kingdom discussed the risks of AI-led automation; those in Japan considered the necessity of safeguarding human rights in the face of AI; and those in Zambia looked at the possibility of using AI for weather forecasting.

The U.S. government continues to increase spending on AI. Since 2017, the amount of U.S. government AI-related contract spending has increased roughly 2.5 times.

The legal world is waking up to AI. In 2022, there were 110 AI-related legal cases in United States state and federal courts, roughly seven times more than in 2016. The majority of these cases originated in California, New York, and Illinois, and concerned issues relating to civil, intellectual property, and contract law.

Chapter 7: Diversity

North American bachelor's, master's, and PhD-level computer science students are becoming more ethnically diverse. Although white students are still the most represented ethnicity among new resident bachelor's, master's, and PhD-level computer science graduates, students from other ethnic backgrounds (for example, Asian, Hispanic, and Black or African American) are becoming increasingly more represented. For example, in 2011, 71.9% of new resident CS bachelor's graduates were white. In 2021, that number dropped to 46.7%.

New AI PhDs are still overwhelmingly male. In 2021, 78.7% of new AI PhDs were male. Only 21.3% were female, a 3.2 percentage point increase from 2011. There continues to be a gender imbalance in higher-level AI education.

Women make up an increasingly greater share of CS, CE, and information faculty hires. Since 2017, the proportion of new female CS, CE, and information faculty hires has increased from 24.9% to 30.2%. Still, most CS, CE, and information faculty in North American universities are male (75.9%). As of 2021, only
0.1% of CS, CE, and information faculty identify as nonbinary.

American K-12 computer science education has become more diverse, in terms of both gender and ethnicity. The share of AP computer science exams taken by female students increased from 16.8% in 2007 to 30.6% in 2021. Year over year, the share of Asian, Hispanic/Latino/Latina, and Black/African American students taking AP computer science has likewise increased.

Chapter 8: Public Opinion

Chinese citizens are among those who feel the most positively about AI products and services. Americans ... not so much. In a 2022 IPSOS survey, 78% of Chinese respondents (the highest proportion of surveyed countries) agreed with the statement that products and services using AI have more benefits than drawbacks. After Chinese respondents, those from Saudi Arabia (76%) and India (71%) felt the most positive about AI products. Only 35% of sampled Americans (among the lowest of surveyed countries) agreed that products and services using AI had more benefits than drawbacks.

Men tend to feel more positively about AI products and services than women. Men are also more likely than women to believe that AI will mostly help rather than harm. According to the 2022 IPSOS survey, men are more likely than women to report that AI products and services make their lives easier, trust companies that use AI, and feel that AI products and services have more benefits than drawbacks. A 2021 survey by Gallup and Lloyd's Register Foundation likewise revealed that men are more likely than women to agree with the statement that AI will mostly help rather than harm their country in the next 20 years.

People across the world and especially America remain unconvinced by self-driving cars. In a global survey, only 27% of respondents reported feeling safe in a self-driving car. Similarly, Pew Research suggests that only 26% of Americans feel that driverless passenger vehicles are a good idea for society.

Different causes for excitement and concern. Among a sample of surveyed Americans, those who report feeling excited about AI are most excited about the potential to make life and society better (31%) and to save time and make things more efficient (13%). Those who report feeling more concerned worry about the loss of human jobs (19%); surveillance, hacking, and digital privacy (16%); and the lack of human connection (12%).

NLP researchers have some strong opinions as well. According to a survey widely distributed to NLP researchers, 77% either agreed or weakly agreed that private AI firms have too much influence, 41% said that NLP should be regulated, and 73% felt that AI could soon lead to revolutionary societal change. These were some of the many strong opinions held
by the NLP research community.

Artificial Intelligence Index Report 2023

CHAPTER 1: Research and Development

Chapter 1 Preview

Overview
Chapter Highlights
1.1 Publications
  Overview
    Total Number of AI Publications
    By Type of Publication
    By Field of Study
    By Sector
    Cross-Country Collaboration
    Cross-Sector Collaboration
  AI Journal Publications
    Overview
    By Region
    By Geographic Area
    Citations
  AI Conference Publications
    Overview
    By Region
    By Geographic Area
    Citations
  AI Repositories
    Overview
    By Region
    By Geographic Area
    Citations
  Narrative Highlight: Top Publishing Institutions
    All Fields
    Computer Vision
    Natural Language Processing
    Speech Recognition
1.2 Trends in Significant Machine Learning Systems
  General Machine Learning Systems
    System Types
    Sector Analysis
    National Affiliation
    Systems
    Authorship
    Parameter Trends
    Compute Trends
  Large Language and Multimodal Models
    National Affiliation
    Parameter Count
    Training Compute
    Training Cost
1.3 AI Conferences
  Conference Attendance
1.4 Open-Source AI Software
  Projects
  Stars

Overview

This chapter captures trends in AI R&D. It begins by examining AI publications, including journal articles, conference papers, and repositories. Next it considers data on significant machine learning systems, including large language and multimodal models. Finally, the chapter concludes by looking at AI conference attendance and open-source AI research. Although the United States and China continue to dominate AI R&D, research efforts are becoming increasingly geographically dispersed.

Chapter Highlights

The United States and China had the greatest number of cross-country collaborations in AI publications from 2010 to 2021, although the pace of collaboration has since slowed. The number of AI research collaborations between the United States and China increased roughly 4 times since 2010, and was 2.5 times greater than the collaboration totals of the next nearest country pair, the United Kingdom and China. However, the total number of U.S.-China collaborations only increased by 2.1% from 2020 to 2021, the smallest year-over-year growth rate since 2010.

Industry races ahead of academia. Until 2014, most significant machine learning models were released by academia. Since then, industry has taken over. In 2022, there were 32 significant industry-produced machine learning models compared to just three produced by academia. Building state-of-the-art AI systems increasingly requires large amounts of data, computing power, and money: resources that industry actors inherently possess in greater amounts compared to nonprofits and academia.

AI research is on the rise, across the board. The total number of AI publications has more than doubled since 2010. The specific AI topics that continue to dominate research include pattern recognition, machine learning, and computer vision.

China continues to lead in total AI journal, conference, and repository publications. The United States is still ahead in terms of AI conference and repository citations, but those leads are slowly eroding. Still, the majority of the world's large language and multimodal models (54% in 2022) are produced by American institutions.

Large language models are getting bigger and more expensive. GPT-2, released in 2019, considered by many to be the first large language model, had 1.5 billion parameters and cost an estimated $50,000 USD to train. PaLM, one of the flagship large language models
launched in 2022, had 540 billion parameters and cost an estimated $8 million USD. PaLM was around 360 times larger than GPT-2 and cost 160 times more. It's not just PaLM: across the board, large language and multimodal models are becoming larger and pricier.

[Figure 1.1.1: Number of AI Publications in the World, 2010-21. Y-axis: number of AI publications (in thousands). 2021 value: 496.01. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]
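The size and cost multiples quoted for PaLM and GPT-2 in the chapter highlights follow directly from the reported parameter counts and estimated training costs. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only, and the inputs are the report's own estimates rather than independently verified values.

```python
# Figures quoted in the chapter highlights (the report's estimates).
GPT2_PARAMS = 1.5e9         # GPT-2: 1.5 billion parameters
PALM_PARAMS = 540e9         # PaLM: 540 billion parameters
GPT2_COST_USD = 50_000      # estimated GPT-2 training cost
PALM_COST_USD = 8_000_000   # estimated PaLM training cost


def times_larger(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


size_ratio = times_larger(PALM_PARAMS, GPT2_PARAMS)
cost_ratio = times_larger(PALM_COST_USD, GPT2_COST_USD)

print(f"PaLM vs. GPT-2, parameters: about {size_ratio:.0f}x")
print(f"PaLM vs. GPT-2, training cost: about {cost_ratio:.0f}x")
```

Both ratios are simple quotients, so they inherit whatever uncertainty sits in the underlying cost estimates.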
1.1 Publications

Overview

The figures below capture the total number of English-language and Chinese-language AI publications globally from 2010 to 2021, by type, affiliation, cross-country collaboration, and cross-industry collaboration. The section also breaks down publication and citation data by region for AI journal articles, conference papers, repositories, and patents.

Total Number of AI Publications

Figure 1.1.1 shows the number of AI publications in the world. From 2010 to 2021, the total number of AI publications more than doubled, growing from 200,000 in 2010 to almost 500,000 in 2021.

[1] See the Appendix for more information on CSET's methodology. For more on the challenge of defining AI and correctly capturing relevant bibliometric data, see the AI Index team's discussion in the paper “Measurement in AI Policy: Opportunities and Challenges.”

This section draws on data from the Center for Security and Emerging Technology (CSET) at Georgetown University. CSET maintains a merged corpus of scholarly literature that includes Digital Science's Dimensions, Clarivate's Web of Science, Microsoft Academic Graph, China National Knowledge Infrastructure, arXiv, and Papers With Code. In that corpus, CSET applied a classifier to identify English-language publications related to the development or application of AI and ML since 2010. For this year's report, CSET also used select Chinese AI keywords to identify Chinese-language AI papers; CSET did not deploy this method for previous iterations of the AI Index report.[1]

In last year's edition of the report, publication trends were reported up to the year 2021. However, given that there is a significant lag in the collection of publication metadata, and that in some cases it takes until the middle of any given year to fully capture the previous year's publications, in this
year's report, the AI Index team elected to examine publication trends only through 2021, which we, along with CSET, are confident yields a more fully representative report.

By Type of Publication

Figure 1.1.2 shows the types of AI publications released globally over time. In 2021, 60% of all published AI documents were journal articles, 17% were conference papers, and 13% were repository submissions. Books, book chapters, theses, and unknown document types made up the remaining 10% of publications. While journal and repository publications have grown 3 and 26.6 times, respectively, in the past 12 years, the number of conference papers has declined since 2019.
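The percentage breakdown by publication type can be recomputed from the 2021 counts labeled in Figure 1.1.2 (in thousands). The sketch below is only an illustration of that arithmetic; the dictionary and variable names are hypothetical, and because the report rounds its percentages, a recomputed share may differ from the quoted figure by a point or so.

```python
# 2021 counts by publication type, in thousands, as labeled in Figure 1.1.2.
counts_thousands = {
    "Journal": 293.48,
    "Conference": 85.09,
    "Repository": 65.21,
    "Thesis": 29.88,
    "Book Chapter": 13.77,
    "Unknown": 5.82,
    "Book": 2.76,
}

# The total should match the 2021 value shown in Figure 1.1.1.
total = sum(counts_thousands.values())

# Share of each type as a percentage of all 2021 AI publications.
shares = {kind: 100 * n / total for kind, n in counts_thousands.items()}

# Everything other than journals, conferences, and repositories.
major_types = {"Journal", "Conference", "Repository"}
remaining_share = sum(s for kind, s in shares.items() if kind not in major_types)

print(f"Total: {total:.2f} thousand publications")
for kind, share in shares.items():
    print(f"{kind:>12}: {share:.1f}%")
print(f"Remaining types combined: {remaining_share:.1f}%")
```

Summing the per-type counts is also a quick consistency check between the two figures.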
[Figure 1.1.2: Number of AI Publications by Type, 2010-21. Y-axis: number of AI publications (in thousands). 2021 values: Journal 293.48; Conference 85.09; Repository 65.21; Thesis 29.88; Book Chapter 13.77; Unknown 5.82; Book 2.76. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Field of Study

Figure 1.1.3 shows that publications in pattern recognition and machine learning have experienced the sharpest growth in the last half decade. Since 2015, the number of pattern recognition papers has roughly doubled while the number of machine learning papers has roughly quadrupled. Following those two topic areas, in 2021, the next most published AI fields of study were computer vision (30,075), algorithm (21,527), and data mining (19,181).
[Figure 1.1.3: Number of AI Publications by Field of Study (Excluding Other AI), 2010-21. Y-axis: number of AI publications (in thousands). 2021 values: Pattern Recognition 59.36; Machine Learning 42.55; Computer Vision 30.07; Algorithm 21.53; Data Mining 19.18; Natural Language Processing 14.99; Control Theory 11.57; Human-Computer Interaction 10.37; Linguistics 6.74. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Sector

This section shows the number of AI publications affiliated with education, government, industry, nonprofit, and other sectors, first globally (Figure 1.1.4), then looking at the United States, China, and the European Union plus the United Kingdom (Figure 1.1.5).[2] The education sector dominates in each region. The level of industry participation is highest in the United States, then in the European Union. Since 2010, the share of education AI publications has been dropping in each region.

[Figure 1.1.4: AI Publications (% of Total) by Sector, 2010-21. 2021 values: Education 75.23%; Nonprofit 13.60%; Industry 7.21%; Government 3.74%; Other 0.22%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

[2] The categorization is adapted based on the Global Research Identifier Database (GRID). Healthcare, including hospitals and facilities, is included under nonprofit. Publications affiliated with state-sponsored universities are included in the education sector.

[Figure 1.1.5: AI Publications (% of Total) by Sector and Geographic Area, 2021. Sectors: Education, Nonprofit, Industry, Government, Other. Areas: United States, European Union and United Kingdom, China. Bar labels in extraction order: 69.17%, 14.82%, 12.60%, 3.21%, 0.20%; 69.23%, 3.92%, 7.90%, 18.63%, 0.33%; 5.47%, 77.85%, 4.74%, 11.73%, 0.20%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Cross-Country Collaboration

Cross-border collaborations between academics, researchers, industry experts, and others are a key component of modern STEM (science, technology, engineering, and mathematics) development, accelerating the dissemination of new ideas and the growth of research teams. Figures 1.1.6 and 1.1.7 depict the top cross-country AI collaborations from 2010 to 2021. CSET counted cross-country collaborations as distinct pairs of countries across authors for each publication (e.g., four U.S. and four Chinese-affiliated authors on a single publication are counted as one U.S.-China collaboration; two publications between the same authors count as two collaborations).

By far, the greatest number of collaborations in the past 12 years took place between the United States and China, increasing roughly four times since 2010. However, the total number of U.S.-China collaborations only increased by 2.1% from 2020 to 2021, the smallest year-over-year growth rate since 2010.

The next largest set of collaborations was between the United Kingdom and both China and the United States. In 2021, the number of collaborations between the United States and China was 2.5 times greater than between the United Kingdom and China.

[Figure 1.1.6: United States and China Collaborations in AI Publications, 2010-21. Y-axis: number of AI publications (in thousands). 2021 value: 10.47. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

[Figure 1.1.7: Cross-Country Collaborations in AI Publications (Excluding U.S. and China), 2010-21. Y-axis: number of AI publications (in thousands). 2021 values: United Kingdom and China 4.13; United States and United Kingdom 4.04; United States and Germany 3.42; China and Australia 2.80; United States and Australia 2.61; United States and France 1.83. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]
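The counting rule described above for cross-country collaborations (distinct country pairs per publication) can be sketched in a few lines. This is a minimal illustration and not CSET's actual pipeline: the function name, the country codes, and the sample data are hypothetical, and counting every distinct pair once when a publication involves three or more countries is an assumption that goes beyond what the report states.

```python
from collections import Counter
from itertools import combinations


def count_collaborations(publications):
    """Count cross-country collaborations as distinct country pairs per publication.

    `publications` is an iterable of lists; each list holds the country
    affiliation of every author on one publication (hypothetical input format).
    """
    pair_counts = Counter()
    for author_countries in publications:
        # Collapse repeated countries: eight authors from two countries
        # still yield a single pair for that publication.
        distinct = sorted(set(author_countries))
        for pair in combinations(distinct, 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical sample data.
sample = [
    ["US"] * 4 + ["CN"] * 4,  # four U.S. and four Chinese authors: one US-CN collaboration
    ["US", "CN"],             # same two authors on one paper...
    ["US", "CN"],             # ...and on a second paper: two collaborations
    ["UK", "CN", "US"],       # assumed: each distinct pair counted once
    ["US", "US"],             # single-country paper: no cross-country pair
]

counts = count_collaborations(sample)
for (a, b), n in sorted(counts.items()):
    print(f"{a}-{b}: {n}")
```

Sorting the country codes before pairing keeps each pair in a single canonical order, so "US-CN" and "CN-US" are never tallied separately.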
Cross-Sector Collaboration

The increase in AI research outside of academia has broadened and grown collaboration across sectors in general. Figure 1.1.8 shows that in 2021 educational institutions and nonprofits (32,551) had the greatest number of collaborations, followed by industry and educational institutions (12,856), and educational and government institutions (8,913). Collaborations between educational institutions and industry have been among the
fastest growing, increasing 4.2 times since 2010.

[Figure 1.1.8: Cross-Sector Collaborations in AI Publications, 2010-21. Y-axis: number of AI publications (in thousands). 2021 values: Education and Nonprofit 32.55; Industry and Education 12.86; Education and Government 8.91; Government and Nonprofit 2.95; Industry and Nonprofit 2.26; Industry and Government 0.63. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

AI Journal Publications

Overview

After growing only slightly from 2010 to 2015, the number of AI journal publications has grown around 2.3 times since 2015. From 2020 to 2021, it increased 14.8% (Figure 1.1.9).

[Figure 1.1.9: Number of AI Journal Publications, 2010-21. Y-axis: number of AI journal publications (in thousands). 2021 value: 293.48. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

[Figure 1.1.10: AI Journal Publications (% of World Total) by Region, 2010-21. 2021 values: East Asia and Pacific 47.14%; Europe and Central Asia 17.20%; North America 11.61%; Unknown 6.93%; South Asia 6.75%; Middle East and North Africa 4.64%; Latin America and the Caribbean 2.66%; Rest of the World 2.30%; Sub-Saharan Africa 0.77%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Region[3]

Figure 1.1.10 shows the share of AI journal publications by region between 2010 and 2021. In
2021, East Asia and the Pacific led with 47.1%, followed by Europe and Central Asia (17.2%), and then North America (11.6%). Since 2019, the shares of publications from East Asia and the Pacific, Europe and Central Asia, and North America have been declining. During that period, there has been an increase in publications from other regions such as South Asia and the Middle East and North Africa.

[3] Regions in this chapter are classified according to the World Bank analytical grouping.

[Figure 1.1.11: AI Journal Publications (% of World Total) by Geographic Area, 2010-21. 2021 values: China 39.78%; Rest of the World 22.70%; European Union and United Kingdom 15.05%; United States 10.03%; Unknown 6.88%; India 5.56%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Geographic Area[4]

Figure 1.1.11 breaks down the share of AI journal publications over the past 12 years by geographic area. This year's AI Index included India in recognition of the increasingly important role
it plays in the AI ecosystem. China has remained the leader throughout, with 39.8% in 2021, followed by the European Union and the United Kingdom (15.1%), then the United States (10.0%). The share of Indian publications has been steadily increasing, from 1.3% in 2010 to 5.6% in 2021.

[4] In this chapter we use “geographic area” based on CSET's classifications, which are disaggregated not only by country, but also by territory. Further, we count the European Union and the United Kingdom as a single geographic area to reflect the regions' strong history of research collaboration.

[Figure 1.1.12: AI Journal Citations (% of World Total) by Geographic Area, 2010-21. 2021 values: China 29.07%; Rest of the World 27.37%; European Union and United Kingdom 21.51%; United States 15.08%; India 6.05%; Unknown 0.92%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Citations

China's share of citations in AI journal publications has gradually increased since 2010, while those of the European Union
and the United Kingdom, as well as those of the United States, have decreased (Figure 1.1.12). Together, China, the European Union and the United Kingdom, and the United States accounted for 65.7% of the total citations in the world.

AI Conference Publications

Overview

The number of AI conference publications peaked in 2019 and by 2021 had fallen 20.4% below that peak (Figure 1.1.13). The total number of 2021 AI conference publications, 85,094, was marginally greater than the 2010 total of 75,592.

[Figure 1.1.13: Number of AI Conference Publications, 2010-21. Y-axis: number of AI conference publications (in thousands). 2021 value: 85.09. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Region

Figure 1.1.14 shows the number of AI conference publications by region. As with the trend in journal publications, East Asia and the Pacific, Europe and Central Asia, and North America account for the world's highest numbers of AI conference publications. Specifically, the
share represented by East Asia and the Pacific continues to rise, accounting for 36.7% in 2021, followed by Europe and Central Asia (22.7%), and then North America (19.6%). The percentage of AI conference publications in South Asia saw a noticeable rise in the past 12 years, growing from 3.6% in 2010 to 8.5% in 2021.

[Figure 1.1.14: AI Conference Publications (% of World Total) by Region, 2010-21. 2021 values: East Asia and Pacific 36.72%; Europe and Central Asia 22.66%; North America 19.56%; South Asia 8.45%; Middle East and North Africa 3.82%; Latin America and the Caribbean 3.07%; Unknown 2.76%; Rest of the World 2.35%; Sub-Saharan Africa 0.60%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]
By Geographic Area

In 2021, China produced the greatest share of the world's AI conference publications at 26.2%, having overtaken the European Union and the United Kingdom in 2017. The European Union plus the United Kingdom followed at 20.3%, and the United States came in third at 17.2% (Figure 1.1.15). Mirroring trends seen in other parts of the research and development section, India's share of AI conference publications is also increasing.

[Figure 1.1.15: AI Conference Publications (% of World Total) by Geographic Area, 2010-21. 2021 values: Rest of the World 26.84%; China 26.15%; European Union and United Kingdom 20.29%; United States 17.23%; India 6.79%; Unknown 2.70%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Citations

Despite China producing the most AI conference publications in 2021, Figure 1.1.16 shows that the United States had the greatest share of AI conference citations, with 23.9%, followed by China's 22.0%. However, the gap between American and Chinese AI conference citations is narrowing.

[Figure 1.1.16: AI Conference Citations (% of World Total) by Geographic Area, 2010-21. 2021 values: Rest of the World 25.57%; United States 23.86%; China 22.02%; European Union and United Kingdom 21.59%; India 6.09%; Unknown 0.87%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

AI Repositories

Overview

Publishing pre-peer-reviewed papers on repositories of electronic preprints (such as arXiv and SSRN) has become a popular way for AI researchers to disseminate their work outside traditional avenues for publication. These repositories allow researchers to share their findings before submitting them to journals and conferences, thereby accelerating the cycle of information discovery. The number of AI repository publications grew almost 27 times in the past 12 years (Figure 1.1.17).

[Figure 1.1.17: Number of AI Repository Publications, 2010-21. Y-axis: number of AI repository publications (in thousands). 2021 value: 65.21. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Region

Figure 1.1.18 shows that North America has maintained a steady lead in the world share of AI repository publications since 2016. Since 2011, the share of repository publications from Europe and Central Asia has declined. The share represented by East Asia and the Pacific has grown significantly since 2010 and continued
growing from 2020 to 2021, a period in which the year-over-year share of North American as well as European and Central Asian repository publications declined.

[Figure 1.1.18: AI Repository Publications (% of World Total) by Region, 2010-21. 2021 values: North America 26.32%; Unknown 23.99%; Europe and Central Asia 21.40%; East Asia and Pacific 17.88%; South Asia 3.41%; Middle East and North Africa 3.06%; Rest of the World 1.81%; Latin America and the Caribbean 1.80%; Sub-Saharan Africa 0.34%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

By Geographic Area

Although the United States has held the lead in the percentage of global AI repository publications since 2016, China is catching up, while the share of the European Union plus the United Kingdom continues to drop (Figure 1.1.19). In 2021, the United States accounted for 23.5% of the world's AI repository publications, followed by the European Union plus the United Kingdom (20.5%), and then China (11.9%).

[Figure 1.1.19: AI Repository Publications (% of World Total) by Geographic Area, 2010-21. 2021 values: United States 23.48%; Unknown 23.18%; European Union and United Kingdom 20.54%; Rest of the World 18.07%; China 11.87%; India 2.85%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Citations

In the citations of AI repository publications, Figure 1.1.20 shows that in 2021 the United States topped the list with 29.2% of overall citations, maintaining a dominant lead over the European Union plus the United Kingdom (21.5%), as well as China (21.0%).

[Figure 1.1.20: AI Repository Citations (% of World Total) by Geographic Area, 2010-21. 2021 values: United States 29.22%; Rest of the World 21.79%; European Union and United Kingdom 21.52%; China 20.98%; Unknown 4.59%; India 1.91%. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]
Narrative Highlight: Top Publishing Institutions

All Fields

Since 2010, the institution producing the greatest number of total AI papers has been the Chinese Academy of Sciences (Figure 1.1.21). The next top four are all Chinese universities: Tsinghua University, the University of the Chinese Academy of Sciences, Shanghai Jiao Tong University, and Zhejiang University.[5] The total number of publications released by each of these institutions in 2021 is displayed in Figure 1.1.22.

[5] It is important to note that many Chinese research institutions are large, centralized organizations with thousands of researchers. It is therefore not entirely surprising that, purely by the metric of publication count, they outpublish most non-Chinese institutions.
[Figure 1.1.21: Top Ten Institutions in the World in 2021 Ranked by Number of AI Publications in All Fields, 2010-21. 2021 ranking: 1, Chinese Academy of Sciences; 2, Tsinghua University; 3, University of Chinese Academy of Sciences; 4, Shanghai Jiao Tong University; 5, Zhejiang University; 6, Harbin Institute of Technology; 7, Beihang University; 8, University of Electronic Science and Technology of China; 9, Peking University; 10, Massachusetts Institute of Technology. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

[Figure 1.1.22: Top Ten Institutions in the World by Number of AI Publications in All Fields, 2021. Chinese Academy of Sciences 5,099; Tsinghua University 3,373; University of Chinese Academy of Sciences 2,904; Shanghai Jiao Tong University 2,703; Zhejiang University 2,590; Harbin Institute of Technology 2,016; Beihang University 1,970; University of Electronic Science and Technology of China 1,951; Peking University 1,893; Massachusetts Institute of Technology 1,745. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Computer Vision

In 2021, the top 10 institutions publishing the greatest number of AI computer vision publications were all Chinese (Figure 1.1.23). The Chinese Academy of Sciences published the largest number of such publications, with a total of 562.

[Figure 1.1.23: Top Ten Institutions in the World by Number of AI Publications in Computer Vision, 2021. Chinese Academy of Sciences 562; Shanghai Jiao Tong University 316; University of Chinese Academy of Sciences 314; Tsinghua University 296; Zhejiang University 289; Beihang University 247; Wuhan University 231; Beijing Institute of Technology 229; Harbin Institute of Technology 210; Tianjin University 182. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Natural Language Processing

American institutions are represented to a greater degree in the share of top NLP publishers (Figure 1.1.24). Although the Chinese Academy of Sciences was again the world's leading institution in 2021 (182 publications), Carnegie Mellon took second place (140 publications), followed by Microsoft (134). In addition, 2021 was the first year Amazon and Alibaba were represented among the top-ten largest publishing NLP institutions.

[Figure 1.1.24: Top Ten Institutions in the World by Number of AI Publications in Natural Language Processing, 2021. Chinese Academy of Sciences 182; Carnegie Mellon University 140; Microsoft (United States) 134; Tsinghua University 127; Carnegie Mellon University Australia 116; Google (United States) 116; Peking University 113; University of Chinese Academy of Sciences 112; Alibaba Group (China) 100; Amazon (United States) 98. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

Speech Recognition

In 2021, the greatest number of speech recognition papers came from the Chinese Academy of Sciences (107), followed by Microsoft (98) and Google (75) (Figure 1.1.25). The Chinese Academy of Sciences reclaimed the top spot in 2021 from Microsoft, which held first position in 2020.

[Figure 1.1.25: Top Ten Institutions in the World by Number of AI Publications in Speech Recognition, 2021. Chinese Academy of Sciences 107; Microsoft (United States) 98; Google (United States) 75; University of Chinese Academy of Sciences 66; Tsinghua University 61; University of Science and Technology of China 59; Carnegie Mellon University 57; Tencent (China) 57; Chinese University of Hong Kong 55; Amazon (United States) 54. Source: Center for Security and Emerging Technology, 2022 | Chart: 2023 AI Index Report]

1.2 Trends in Significant Machine Learning Systems

General Machine Learning Systems

The figures below report trends among all machine learning systems included in the Epoch dataset. For reference, these systems are referred to as significant machine learning systems throughout the subsection.

[Figure 1.2.1: Number of Significant Machine Learning Systems by Domain, 2022. Language 23; Multimodal 4; Drawing 3; Vision 2; Speech 2; Text-to-Video 1; Other 1; Games 1. Source: Epoch, 2022 | Chart: 2023 AI Index Report]

System Types

Among the significant AI machine
206、 learning systems released in 2022,the most common class of system was language(Figure 1.2.1).There were 23 significant AI language systems released in 2022,roughly six times the number of the next most common system type,multimodal systems.6 There were 38 total significant AI machine learning syste
207、ms released in 2022,according to Epoch;however,one of the systems,BaGuaLu,did not have a domain classification and is therefore omitted from Figure 1.2.1.Epoch AI is a collective of researchers investigating and forecasting the development of advanced AI.Epoch curates a database of significant AI an
208、d machine learning systems that have been released since the 1950s.There are different criteria under which the Epoch team decides to include particular AI systems in their database;for example,the system may have registered a state-of-the-art improvement,been deemed to have been historically signif
209、icant,or been highly cited.This subsection uses the Epoch database to track trends in significant AI and machine learning systems.The latter half of the chapter includes research done by the AI Index team that reports trends in large language and multimodal models,which are models trained on large a
210、mounts of data and adaptable to a variety of downstream applications.
211、[Figure 1.2.2: Number of Significant Machine Learning Systems by Sector, 2002–22. 2022 values: Nonprofit, 0; Industry-Academia Collaboration, 1; Research Collective, 2; Academia, 3; Industry, 32. Source: Epoch, 2022 | Chart: 2023 AI Index Report] Sector Analysis: Which sector among industry, academia, or nonprof
212、it has released the greatest number of significant machine learning systems? Until 2014, most significant machine learning systems were released by academia. Since then, industry has taken over (Figure 1.2.2). In 2022, there were 32 significant industry-produced machine learning systems compared to just three produced
213、 by academia. Producing state-of-the-art AI systems increasingly requires large amounts of data, computing power, and money, resources that industry actors possess in greater amounts than nonprofits and academia.
214、National Affiliation: To paint a picture of AI's evolving geopolitical landscape, the AI Index research team identified the nationality of the authors who contributed to the development of each significant m
215、achine learning system in the Epoch dataset.[7] Systems: Figure 1.2.3 shows the total number of significant machine learning systems attributed to researchers from particular countries.[8] A researcher is considered to have belonged to the country in which their institution, for example a university or 
216、AI-research firm, was headquartered. In 2022, the United States produced the greatest number of significant machine learning systems with 16, followed by the United Kingdom (8) and China (3). Moreover, since 2002 the United States has outpaced the European Union and the United Kingdom, as well as China, in ter
217、ms of the total number of significant machine learning systems produced (Figure 1.2.4). Figure 1.2.5 displays the total number of significant machine learning systems produced by each country worldwide since 2002. Footnote 7: The methodology by which the AI Index identifie
218、d authors' nationality is outlined in greater detail in the Appendix. Footnote 8: A machine learning system is considered to be affiliated with a particular country if at least one author involved in creating the model was affiliated with that country. Consequently, in cases where a system has authors from multip
219、le countries, double counting may occur. [Figure 1.2.3: Number of Significant Machine Learning Systems by Country, 2022. Singapore, 1; Russia, 1; Israel, 1; India, 1; France, 1; Germany, 2; Canada, 2; China, 3; United Kingdom, 8; United States, 16. Source: Epoch and AI Index, 2022 | Chart: 2023 AI Index Report]
220、[Figure 1.2.4: Number of Significant Machine Learning Systems by Select Geographic Area, 2002–22. 2022 values: China, 3; European Union and United Kingdom, 12; United States, 16. Source: Epoch and AI Index, 2022 | Chart: 2023 AI Index Report]
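The attribution rule in footnote 8 (a system counts toward every country with at least one affiliated author) can be sketched in a few lines of Python. This is only an illustrative sketch and not the AI Index's actual pipeline: the function name, the record layout, and the three example systems are all hypothetical, invented to show why per-country counts can add up to more than the number of systems.

```python
from collections import Counter


def count_systems_by_country(systems):
    """Count systems per country under the 'at least one author' rule.

    A system is counted once for each distinct country among its authors'
    institutions, so a multi-country system is counted more than once.
    """
    counts = Counter()
    for system in systems:
        # De-duplicate so that several authors from one country count once.
        for country in set(system["author_countries"]):
            counts[country] += 1
    return counts


# Hypothetical records, invented for illustration only (not from the report).
example_systems = [
    {"name": "System A", "author_countries": ["United States", "United States"]},
    {"name": "System B", "author_countries": ["United States", "United Kingdom"]},
    {"name": "System C", "author_countries": ["China"]},
]

counts = count_systems_by_country(example_systems)
total_attributions = sum(counts.values())
print(counts)
print(len(example_systems), "systems,", total_attributions, "country attributions")
```

In this toy input, System B is attributed to two countries, so the country totals exceed the number of systems; that gap is the double counting the footnote describes.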
221、[Figure 1.2.5: Number of Significant Machine Learning Systems by Country, 2002–22 (Sum). Map legend bins: 0, 1–10, 11–20, 21–60, 61–255. Source: AI Index, 2022 | Chart: 2023 AI Index Report]
222、Authorship: Figures 1.2.6 to 1.2.8 look at the total number of authors, disaggregated by national affiliation, who contributed to the launch of sig
223、nificant machine learning systems. As was the case with total systems, in 2022 the United States had the greatest number of authors producing significant machine learning systems, with 285, more than double that of the United Kingdom and nearly six times that of China (Figure 1.2.6).
224、[Figure 1.2.6: Number of Authors of Significant Machine Learning Systems by Country, 2022. France, 1; India, 2; Russia, 3; Germany, 7; Sweden, 8; Israel, 13; Canada, 21; China, 49; United Kingdom, 139; United States, 285. Source: Epoch and AI Index, 2022 | Chart: 2023 AI Index Report]
225、[Figure 1.2.7: Number of Authors of Significant Machine Learning Systems by Select Geographic Area, 2002–22. 2022 values: China, 49; European Union and United Kingdom, 155; United States, 285. Source: Epoch and AI Index, 2022 | Chart: 2023 AI Index Report]
226、[Figure 1.2.8: Number of Authors of Significant Machine Learning Systems by Country, 2002–22 (Sum). Map legend bins: 0, 1–10, 11–20, 21–60, 61–180, 181–370, 371–680, 681–2000. Source: AI Index, 2022 | Chart: 2023 AI Index Report] Parameter Tre
227、nds: Parameters are numerical values that are learned by machine learning models during training. The values of a model's parameters determine how it interprets input data and makes predictions. Adjusting parameters is an essential step in ensuring that the performance of a mac
228、hine learning system is optimized. Figure 1.2.9 shows the number of parameters of the machine learning systems included in the Epoch dataset by sector. Over time, there has been a steady increase in the number of parameters, an increase that has become particularly sharp since the early 2010s. The f
229、act that AI systems are rapidly increasing their parameter counts reflects the increased complexity of the tasks they are being asked to perform, the greater availability of data, advancements in underlying hardware, and, most importantly, the demonstrated performance of larger models. Chapter 1: Researc
230、h and Development [Figure 1.2.9: Number of Parameters of Significant Machine Learning Systems by Sector, 1950–2022. Y-axis: Number of Parameters (Log Scale). Series: Academia, Industry, Industry-Academia Collaboration, Nonprofit, Research Collective. Source: Epoch, 2022 | Chart: 2023 AI Index Report]
231、Figure 1.2.10 shows the number of parameters of significant machine learning systems by domain. In recent years
232、, there has been a rise in parameter-rich systems. [Figure 1.2.10: Number of Parameters of Significant Machine Learning Systems by Domain, 1950–2022. Y-axis: Number of Parameters (Log Scale). Series: Language, Vision, Games. Source: Epoch, 2022 | Chart: 2023 AI Index Report]
233、Compute Trends: The computational power, or “compute,” of AI systems refers to the amount of comput
234、ational resources needed to train and run a machine learning system. Typically, the more complex a system is, and the larger the dataset on which it is trained, the greater the amount of compute required. The amount of compute used by significant machine learning systems has increased exponentially in
235、 the last half-decade (Figure 1.2.11).[9] The growing demand for compute in AI carries several important implications. For example, more compute-intensive models tend to have greater environmental impacts, and industrial players tend to have easier access to computational resources than others, such as uni
236、versities. [Figure 1.2.11: Training Compute (FLOP/s) of Significant Machine Learning Systems by Sector, 1950–2022. Y-axis: Training Compute (FLOP/s, Log Scale). Series: Academia, Industry, Industry-Academia Collaboration, Nonprofit, Research Collective. Source: Epoch, 2022 | Chart: 2023 AI Index Report]
237、Footnote 9: FLOP/s stands for “Floating Point Operations per second” and is a measure of the performance of a computational device.
238、Since 2010, language models have increasingly demanded more computational resources than any other type of machine learning system.
239、[Figure 1.2.12: Training Compute (FLOP/s) of Significant Machine Learning Systems by Domain, 1950–2022. Y-axis: Training Compute (FLOP/s, Log Scale). Series: Language, Vision, Games. Source: Epoch, 2022 | Chart: 2023 AI Index Report]
240、Large Language and Multimodal Models: Large language and multimodal models, sometimes called foundation models, are an emerging and increasingly popular type of AI model that is trained on huge amounts of data
241、 and adaptable to a variety of downstream applications. Large language and multimodal models like ChatGPT, DALL-E 2, and Make-A-Video have demonstrated impressive capabilities and are starting to be widely deployed in the real world. National Affiliation: This year the AI Index conducted an analysis of th
242、e national affiliation of the authors responsible for releasing new large language and multimodal models.[10] The majority of these researchers were from American institutions (54.0%) (Figure 1.2.13). In 2022, for the first time, researchers from Canada, Germany, and India contributed to the development of l
243、arge language and multimodal models. [Figure 1.2.13: Authors of Select Large Language and Multimodal Models (% of Total) by Country, 2019–22. 2022 values: Korea, 0.00%; India, 0.89%; Germany, 3.12%; Israel, 5.80%; Canada, 6.25%; China, 8.04%; United Kingdom, 21.88%; United States, 54.02%.
244、Source: Epoch and AI Index, 2022 | Chart: 2023 AI Index Report] Footnote 10: The AI models that were considered to be large language and multimodal models were hand-selected by the AI Index steering committee. It is possible that this selection may h
245、ave omitted certain models. Figure 1.2.14 offers a timeline view of the large language and multimodal models that have been released since GPT-2, along with the national affiliations of the researchers who produced the models. Some of the notable American large language and multimodal models released i
246、n 2022 included OpenAI's DALL-E 2 and Google's PaLM (540B). The only Chinese large language and multimodal model released in 2022 was GLM-130B, an impressive bilingual (English and Chinese) model created by researchers at Tsinghua University. BLOOM, also launched in late 2022, was listed as indeterminate give
247、n that it was the result of a collaboration of more than 1,000 international researchers.
248、[Figure 1.2.14: Timeline and National Affiliation of Select Large Language and Multimodal Model Releases, January 2019 to January 2023.[11] Models shown: GPT-2, Grover-Mega, Megatron-LM (Original, 8.3B), T5-3B, T5-11B, Meena, Turing NLG, GPT-3 175B (davinci), ERNIE-GEN (large), DALL-E, Wu Dao-Wen Yuan, GPT-Neo, PanGu-alpha, GPT-J-6B, HyperClova, CogView, Wu Dao 2.0, ERNIE 3.0, Codex, Jurassic-1-Jumbo, Megatron-Turing NLG 530B,
249、Gopher, InstructGPT, AlphaCode, GPT-NeoX-20B, Chinchilla, PaLM (540B), DALL-E 2, Stable Diffusion (LDM-KL-8-G), OPT-175B, Jurassic-X, Imagen, Minerva (540B), GLM-130B, BLOOM. Legend: United States; United Kingdom; China; United States, United Kingdom, Germany, India; Korea; Canada; Israel; Germany; Indeterminate. Source: AI Index, 2022 | Chart: 2023 AI Index Report]
250、Footnote 11: While we were conducting the analysis to produce Figure 1.2.14, Irene Solaiman published a paper that has a similar analysis. We were not aware of the paper at the time of our re
251、search. Parameter Count: Over time, the number of parameters of newly released large language and multimodal models has massively increased. For example, GPT-2, which was the first
252、 large language and multimodal model released in 2019, had only 1.5 billion parameters. PaLM, launched by Google in 2022, had 540 billion, 360 times as many as GPT-2. The median number of parameters in large language and multimodal models is increasing exponentially over time (Figure 1.2.15).
253、[Figure 1.2.15: Number of Parameters of Select Large Language and Multimodal Models, 2019–22. Models shown: GPT-2, Grover-Mega, Megatron-LM (Original, 8.3B), T5-3B, T5-11B, Meena, Turing NLG, GPT-3 175B (davinci), ERNIE-GEN (large), DALL-E, Wu Dao-Wen Yuan, GPT-Neo, PanGu-alpha, GPT-J-6B, HyperClova, CogView, Wu Dao 2.0, ERNIE 3.0, Codex, Jurassic-1-Jumbo, Megatron-Turing NLG 530B, Gopher, GPT-NeoX-20B, Chinchilla, PaLM (540B), DALL-E 2,
254、Stable Diffusion (LDM-KL-8-G), OPT-175B, Jurassic-X, Minerva (540B), GLM-130B, BLOOM. X-axis: release date, February 2019 to November 2022. Y-axis: Number of Parameters (Log Scale), 3.2e+8 to 3.2e+12.
255、Source: Epoch, 2022 | Chart: 2023 AI Index Report]
256、Training Compute: The training compute of large language and multimodal models has also steadily increased (Figure 1.2.16). The compute used to train Minerva (540B), a large language and multimodal model released by Google in June 2022 that displayed impressive abilities on quantitative reasoning
257、 problems, was roughly nine times greater than that used for OpenAI's GPT-3, which was released in 2020, and roughly 1,839 times greater than that used for GPT-2 (released February 2019).
258、[Figure 1.2.16: Training Compute (FLOP/s) of Select Large Language and Multimodal Models, 2019–22. Models shown: GPT-2, Megatron-LM (Original, 8.3B), T5-3B, T5-11B, Meena, Turing NLG, GPT-3 175B (davinci), DALL-E, Wu Dao-Wen Yuan, GPT-Neo, PanGu-alpha, GPT-J-6B, HyperClova, CogView, ERNIE 3.0, Jurassic-1-Jumbo, Megatron-Turing NLG 530B, Gopher, AlphaCode, PaLM (540B), Chinchilla, OPT-175B, Minerva (540B), GLM-130B, BLOOM, Stable Diffusion, GPT-NeoX-20B.
259、X-axis: release date, February 2019 to November 2022. Y-axis: Training Compute (FLOP/s, Log Scale), 1.0e+18 to 3.2e+24. Source: Epoch, 2022 | Chart: 2023 AI Index Report]
260、Training Cost: A particular theme of the discourse around large language and multimodal models has to do with their hypothesized costs. Alth
261、ough AI companies rarely speak openly about training costs, it is widely speculated that these models cost millions of dollars to train and will become increasingly expensive with scale. This subsection presents novel analysis in which the AI Index research team generated estimates for the training co
262、sts of various large language and multimodal models (Figure 1.2.17). These estimates are based on the hardware and training time disclosed by the models' authors. In cases where training time was not disclosed, we calculated it from hardware speed, training compute, and hardware utilization efficiency. Given t
263、he possible variability of the estimates, we have qualified each estimate with the tag of mid, high, or low: mid where the estimate is thought to be a mid-level estimate, high where it is thought to be an overestimate, and low where it is thought to be an underestimate. In certain cases, there was not enoug
264、h data to estimate the training cost of particular large language and multimodal models; these models were therefore omitted from our analysis. The AI Index estimates support popular claims that large language and multimodal models increasingly cost millions of dollars to train. For example, Chi
265、nchilla, a large language model launched by DeepMind in March 2022, is estimated to have cost $2.1 million, while BLOOM's training is thought to have cost $2.3 million. [Figure 1.2.17: Estimated Training Cost of Select Large Language and Multimodal Models (in Millions of U.S. Dollars), 2019–22. Each estimate is tagged Mid, High, or Low in the original chart. GPT-2, 0.05; T5-11B, 1.97; Meena, 1.47; Turing NLG, 0.11; GPT-3 175B, 1.80; DALL-E, 0.23; Wu Dao-Wen Yuan, 0.02; GPT-Neo, 0.09; GPT-J-6B, 0.43; HyperClova, 0.27; ERNIE 3.0, 0.01; Codex, 0.14;
266、Megatron-Turing NLG 530B, 11.35; Gopher, 8.55; AlphaCode, 0.09; GPT-NeoX-20B, 0.24; Chinchilla, 2.11; PaLM (540B), 8.01; Stable Diffusion (LDM-KL-8-G), 0.60; OPT-175B, 1.69; Minerva (540B), 1.03; GLM-130B, 0.16; BLOOM, 2.29.
267、Source: AI Index, 2022 | Chart: 2023 AI Index Report] Footnote 12: See Appendix for the complete methodology behind the cost estimates.
268、There is also a clear relationship between the cost of large language and multimodal models and their size. As evidenced in Figures 1.2.18 and 1.2.19, the large language and multimodal models that have more parameters and train using larger amounts of compute tend to 
269、be more expensive. [Figure 1.2.18: Estimated Training Cost of Select Large Language and Multimodal Models and Number of Parameters. Axes: Training Cost (in U.S. Dollars, Log Scale) and Number of Parameters (Log Scale). Models shown: the same 23 models as in Figure 1.2.17.
270、Source: AI Index, 2022 | Chart: 2023 AI Index Report]
271、[Figure 1.2.19: Estimated Training Cost of Select Large Language and Multimodal Models and Training Compute (FLOP/s). Axes: Training Cost (in U.S. Dollars, Log Scale) and Training Compute (FLOP/s, Log Scale). Models shown: BLOOM, GLM-130B, Minerva (540B), OPT-175B, Chinchilla, GPT-NeoX-20B, AlphaCode, Gopher, PaLM (540B), Megatron-Turing NLG 530B, ERNIE 3.0, Stable Diffusion, GPT-J-6B, GPT-Neo, Wu Dao-Wen Yuan, DALL-E, Turing NLG, Meena, T5-11B, GPT-2.
272、Source: AI Index, 2022 | Chart: 2023 AI Index Report]
273、[Figure 1.3.1: Number of Attendees at Select AI Conferences (in Thousands), 2010–22. 2022 total: 59.45. Source: AI Index, 2022 | Chart: 2023 AI Index Report] Conference Attendance: After a period of increasing attendance, the total attendance at the conferences for which the AI Index collected data
274、 dipped in 2021 and again in 2022 (Figure 1.3.1).[13] This decline may be attributed to the fact that many conferences returned to hybrid or in-person formats after being fully virtual in 2020 and 2021. For example, the International Joint Conference on Artificial Intelligence (IJCAI) and the 
275、International Conference on Principles of Knowledge Representation and Reasoning (KR) were both held strictly in-person. Neural Information Processing Systems (NeurIPS) continued to be one of the most attended conferences, with around 15,530 attendees (Figure 1.3.2).[14] The conference with the greatest
276、 one-year increase in attendance was the International Conference on Robotics and Automation (ICRA), from 1,000 in 2021 to 8,008 in 2022. Footnote 13: This data should be interpreted with caution given that many conferences in the last few years have had virtual or hybrid formats. Conference organizers report tha
277、t measuring the exact attendance numbers at virtual conferences is difficult, as virtual conferences allow for higher attendance of researchers from around the world. Footnote 14: In 2022, 9,560 of the attendees attended NeurIPS in-person and 5,970 remotely. 1.3 AI Conferences: AI conferences are key venues for researchers to share 
278、their work and connect with peers and collaborators. Conference attendance is an indication of broader industrial and academic interest in a scientific field. In the past 20 years, AI conferences have grown in size, number, and prestige. This section presents data on the trends in attendance at major AI c
279、onferences. [Figure 1.3.2: Attendance at Large Conferences (Number of Attendees in Thousands), 2010–22. 2022 values: AAAI, 3.56; IROS, 4.32; ICLR, 5.35; ICML, 7.73; ICRA, 8.01; CVPR, 10.17; NeurIPS, 15.53. Source: AI Index, 2022 | Chart: 2023 AI Index Report]
280、[Figure 1.3.3: Attendance at Small Conferences (Number of Attendees in Thousands), 2010–22. 2022 values: KR, 0.12; ICAPS, 0.39; AAMAS, 0.50; UAI, 0.66; FAccT, 1.09; IJCAI, 2.01. Source: AI Index, 2022 | Chart: 2023 AI Index Report]
281、[Figure 1.4.1: Number of GitHub AI Projects (in Thousands), 2011–22. 2022 value: 348. Source: GitHub, 2022; OECD.AI, 2022 | Chart: 2023 AI Index Report]
282、Projects: A GitHub project is a collection of files that can include the source code, documentation, configuration files, and images that constitute a software project
283、. Since 2011, the total number of AI-related GitHub projects has steadily increased, growing from 1,536 in 2011 to 347,934 in 2022. 1.4 Open-Source AI Software: GitHub is a web-based platform where individuals and coding teams can host, review, and collaborate on various code repositories. GitHub is used extensively by software develo
284、pers to manage and share code, collaborate on various projects, and support open-source software. This subsection uses data provided by GitHub and the OECD.AI policy observatory. These trends can serve as a proxy for some of the broader trends occurring in the world of open-source AI software not capture
285、d by academic publication data. [Figure 1.4.2: GitHub AI Projects (% of Total) by Geographic Area, 2011–22. 2022 values: China, 2.40%; United States, 14.00%; European Union and United Kingdom, 17.30%; India, 24.19%; Rest of the World, 42.11%.
286、Source: GitHub, 2022; OECD.AI, 2022 | Chart: 2023 AI Index Report] As of 2022, a large proportion of GitHub AI projects were contributed by software developers in India (24.2%) 
287、(Figure 1.4.2). The next most represented geographic area was the European Union and the United Kingdom (17.3%), followed by the United States (14.0%). The share of American GitHub AI projects has been declining steadily since 2016.
288、[Figure 1.4.3: Number of GitHub Stars by Geographic Area (Cumulative, in Millions), 2011–22. 2022 values: India, 0.46; China, 1.53; European Union and United Kingdom, 2.34; Rest of the World, 2.69; United States, 3.44.
289、Source: GitHub, 2022; OECD.AI, 2022 | Chart: 2023 AI Index Report] Stars: GitHub users can bookmark or save a repository of interest by “starring” it. A GitHub star is similar to a “like” on a social media platform and indicates support for a particular open-source proje
290、ct. Some of the most starred GitHub repositories include libraries like TensorFlow, OpenCV, Keras, and PyTorch, which are widely used by software developers in the AI coding community. Figure 1.4.3 shows the cumulative number of stars attributed to projects belonging to owners of various geographic areas.
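As a small worked example, the 2022 end-of-series values printed on the Figure 1.4.3 chart (cumulative stars, in millions) can be ranked directly. The sketch below is illustrative only: the dictionary and function names are hypothetical, and setting aside the "Rest of the World" aggregate when ranking named areas is an assumption made here, not a method stated in the report.

```python
# Cumulative GitHub stars as of 2022, in millions, as labeled on Figure 1.4.3.
cumulative_stars_millions = {
    "India": 0.46,
    "China": 1.53,
    "European Union and United Kingdom": 2.34,
    "Rest of the World": 2.69,
    "United States": 3.44,
}


def rank_areas(stars, exclude=("Rest of the World",)):
    """Return area names sorted by cumulative stars, highest first.

    The aggregate 'Rest of the World' bucket is excluded by default
    (an assumption of this sketch) so that only named areas are ranked.
    """
    named = {area: value for area, value in stars.items() if area not in exclude}
    return sorted(named, key=named.get, reverse=True)


ranking = rank_areas(cumulative_stars_millions)
total_millions = sum(cumulative_stars_millions.values())
us_share = cumulative_stars_millions["United States"] / total_millions

print(ranking)
print(f"United States share of all listed stars: {us_share:.1%}")
```

Because these are cumulative totals, they describe stars accumulated over the whole period rather than new stars earned in 2022 alone.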
291、As of 2022, GitHub AI projects from the United States had received the most stars, followed by the European Union and the United Kingdom, and then China. In many geographic areas, the total number of new GitHub stars has leveled off in the last few years.
292、CHAPTER 2: Technical Performance. Chapter 2 Preview (table of contents): Overview; Chapter Highlights.
293、2.1 What's New in 2022: A Timeline. 2.2 Computer Vision - Image: Image Classification (ImageNet); Face Detection and Recognition (National Institute of Standards and Technology Face Recognition Vendor Test, FRVT); Deepfake Detection (Celeb-DF); Human Pose Estimation (MPII); Semantic Segmentation (Cityscapes Challenge, Pixel-Level Semantic Labeling Task);
294、Medical Image Segmentation (Kvasir-SEG); Object Detection (Common Objects in Context, COCO); Image Generation (CIFAR-10 and STL-10; Narrative Highlight: A Closer Look at Progress in Image Generation); Visual Reasoning (Visual Question Answering (VQA) Challenge;
295、Narrative Highlight: The Rise of Capable Multimodal Reasoning Systems; Visual Commonsense Reasoning (VCR)). 2.3 Computer Vision - Video: Activity Recognition (Kinetics-400, Kinetics-600, Kinetics-700; Narrative Highlight: A Closer Look at the Progress of Video Generation).
296、2.4 Language: English Language Understanding (SuperGLUE; Reading Comprehension Dataset Requiring Logical Reasoning (ReClor); Narrative Highlight: Just How Much Better Have Language Models Become?; Narrative Highlight: Planning and Reasoning in Large Language Models); Text Summarization (arXiv and PubMed);
297、Natural Language Inference (Abductive Natural Language Inference (aNLI)); Sentiment Analysis (SST-5 Fine-Grained Classification);
298、Multitask Language Understanding (Massive Multitask Language Understanding (MMLU)); Machine Translation (MT) (Number of Commercially Available MT Systems). 2.5 Speech: Speech Recognition (VoxCeleb; Narrative Highlight: Whisper).
299、2.6 Reinforcement Learning: Reinforcement Learning Environments (Procgen; Narrative Highlight: Benchmark Saturation). 2.7 Hardware: MLPerf Training Time; MLPerf Inference; Trends in GPUs. 2.8 Environment: Environmental Impact of Select Large Language Models (Narrative Highlight: Using AI to Optimize Energy Usage).
300、2.9 AI for Science: Accelerating Fusion Science Through Learned Plasma Control; Discovering Novel Algorithms for Matrix Manipulation With AlphaTensor; Designing Arithmetic Circuits With Deep Reinforcement Learning; Unlocking de Novo Antibody Design With Generative AI.
301、Chapter 2 Preview. Overview: This year's technical performance chapter features analysis of the technical progress in AI during 2022. Building on previous reports, this chapter chronicles advancement in computer visio
302、n, language, speech, reinforcement learning, and hardware. Moreover, this year the chapter features an analysis of the environmental impact of AI, a discussion of the ways in which AI has furthered scientific progress, and a timeline-style overview of some of the most significant recent AI developments.
303、Chapter Highlights. Performance saturation on traditional benchmarks: AI continued to post state-of-the-art results, but year-over-year improvement on many benchmarks continues to be marginal. Moreover
304、, the speed at which benchmark saturation is being reached is increasing. However, new, more comprehensive benchmarking suites such as BIG-bench and HELM are being released. Generative AI breaks into the public consciousness: 2022 saw the release of text-to-image models like DALL-E 2 and Stable Diffusion, 
305、text-to-video systems like Make-A-Video, and chatbots like ChatGPT. Still, these systems can be prone to hallucination, confidently outputting incoherent or untrue responses, making it hard to rely on them for critical applications. AI systems become more flexible: Traditionally, AI systems have performed w
306、ell on narrow tasks but have struggled across broader tasks. Recently released models challenge that trend; BEiT-3, PaLI, and Gato, among others, are single AI systems increasingly capable of navigating multiple tasks (for example, vision and language). AI is both helping and harming the environment: New research
307、 suggests that AI systems can have serious environmental impacts. According to Luccioni et al., 2022, BLOOM's training run emitted 25 times more carbon than a single air traveler on a one-way trip from New York to San Francisco. Still, new reinforcement learning models like BCOOLER show that AI systems ca
308、n be used to optimize energy usage. Capable language models still struggle with reasoning: Language models continued to improve their generative capabilities, but new research suggests that they still struggle with complex planning tasks. The world's best new scientist AI? A
309、I models are starting to rapidly accelerate scientific progress and in 2022 were used to aid hydrogen fusion, improve the efficiency of matrix manipulation, and generate new antibodies. AI starts to build better AI: Nvidia used an AI reinforcement learning agent to improve the design of the chips that p
310、ower AI systems. Similarly, Google recently used one of its language models, PaLM, to suggest ways to improve the very same model. Self-improving AI learning will accelerate AI progress. DeepMind Releases AlphaCode: AlphaCode, an AI
311、 system that writes computer programs at a competitive level, achieves a rank within the top 54% of participants in a human programming competition. This represents an improvement on the more complex problem-solving tasks with which AI has traditionally struggled. DeepMind Trains Reinforcement Learning 
312、Agent to Control Nuclear Fusion Plasma in a Tokamak: Nuclear fusion is a potential source of clean, limitless energy, but producing such energy in tokamaks is difficult due to a lack of experimental data. DeepMind simulated optimal tokamak management, an example of how AI can accelerate science and combat
313、 climate change. IndicNLG Benchmarks Natural Language Generation for Indic Languages: An international research collective launches IndicNLG, a collection of datasets for benchmarking natural language generation for 11 Indic languages. The creation of IndicNLG increases the potential for AI systems to ge
314、nerate language in more diverse, non-English linguistic settings. [Figures 2.1.1, 2.1.2, 2.1.3] 2.1 What's New in 2022: A Timeline: The technical performance chapter begins with an overview of some
315、 of the most significant technical developments in AI during 2022, as selected by the AI Index Steering Committee. Dates of the three entries above, in order: Feb. 2, 2022; Feb. 16, 2022; March 10, 2022. Meta AI Releases Make-A-Scene: Make-A-Scene is a text-to-image AI model that e
316、nables users to generate images through text.Make-A-Scene is one of many text-to-image models released in 2022.Google Releases PaLMGoogles AI team trains one of the worlds largest language models,PaLM.Made up of 540 billion parameters,PaLM reinforces the belief that researchers can improve performan
317、ce on large language models by simply training them on more data.OpenAI Releases DALL-E 2DALL-E 2,a text-to-image AI system that can create realistic art and images from textual descriptions,is released to the public,igniting a generative AI craze.DeepMind Launches GatoGato is a new reinforcement le
318、arning agent capable of doing a wide range of tasks such as robotic manipulation,game playing,image captioning,and natural language generation.The release of such models suggests that AI systems are becoming better at generalization.Artificial IntelligenceIndex Report 20232.1 Whats New in 2022:A Tim
319、elineChapter 2:Technical PerformanceFigure 2.1.4Figure 2.1.5Figure 2.1.6Figure 2.1.7March 24,2022April 5,2022April 13,2022May 12,2022Table of ContentsChapter 2 Preview76Artificial IntelligenceIndex Report 2023Google Releases ImagenImagen is a text-to-image diffusion model capable of producing images
320、 with a high degree of photorealism.Imagens launch also comes with the release of DrawBench,a challenging new benchmark for text-to-image systems.442 Authors Across 132 Institutions Team Up to Launch BIG-benchIn order to better challenge increasingly capable large language models,a team of 442 autho
321、rs across 132 institutions launch the Beyond the Imitation Game benchmark(BIG-bench).The benchmark consists of 204 tasks ranging from linguistics,childhood development,math,common-sense reasoning,biology,physics,social bias,and software development.GitHub Makes Copilot Available as a Subscription-Ba
322、sed Service for Individual DevelopersCopilot is a generative AI system capable of turning natural language prompts into coding suggestions across multiple languages.Similar systems include OpenAIs Codex and Salesforces CodeGen.Surveys suggest that Copilot makes coders more productive and less frustr
323、ated.Artificial IntelligenceIndex Report 20232.1 Whats New in 2022:A TimelineChapter 2:Technical PerformanceFigure 2.1.8Figure 2.1.9Figure 2.1.10May 23,2022June 9,2022June 21,2022Table of ContentsChapter 2 Preview77Artificial IntelligenceIndex Report 2023Nvidia Uses Reinforcement Learning to Design
324、Better-Performing GPUsNvidia uses its AI systems to improve the performance of its latest H100 class of GPU chips.GPUs being essential to AI training,this is one example of how AI is starting to develop better AI.Meta Announces No Language Left BehindNo Language Left Behind(NLLB)is a family of model
325、s that can translate across 200 distinct languages.NLLB is one of the first systems that can perform well across a wide range of low-resource languages like Kamba and Lao.Tsinghua Researchers Launch GLM-130BChinese researchers affiliated with Tsinghua University release GLM-130B,a large language mod
326、el that outperforms others such as Metas OPT,Hugging Faces BLOOM,and OpenAIs original GPT-3.Stability AI Releases Stable DiffusionStable Diffusion is an open-source text-to-image diffusion-based model,meaning users can freely use the model weights to generate their own images.Stable Diffusion is tra
327、ined on existing images created by humans and gives no credit or acknowledgment,leaving open questions around the ethical use of image generators.Artificial IntelligenceIndex Report 20232.1 Whats New in 2022:A TimelineChapter 2:Technical PerformanceFigure 2.1.11Figure 2.1.12Figure 2.1.13Figure 2.1.1
328、4July 8,2022July 11,2022Aug 4,2022Aug 22,2022Table of ContentsChapter 2 Preview78Artificial IntelligenceIndex Report 2023OpenAI Launches WhisperWhisper is a large-scale speech-recognition system trained on roughly 700,000 hours of audio data and capable of respectable performance on various speech r
329、ecognition tasks.The fact that Whisper required neither supervised pre-training nor unsupervised training with fine-tuning yet was able to achieve strong performance by merely increasing training data further validates the approach of increasingly scaling AI models.Meta Releases Make-A-VideoMake-A-V
330、ideo is a system that allows users to create videos from short text descriptions.The quality of the videos is high and again demonstrates the validity of the scaling approach.DeepMind Launches AlphaTensorAlphaTensor is an AI reinforcement-learning-based system able to discover new and efficient algo
331、rithms for matrix manipulation.Matrix manipulation is essential to a wide range of digital practices and is a process that researchers have been trying to make more efficient for decades.Artificial IntelligenceIndex Report 20232.1 Whats New in 2022:A TimelineChapter 2:Technical PerformanceFigure 2.1
332、.15Figure 2.1.16Figure 2.1.17Sept 21,2022Sept 29,2022Oct 5,2022Table of ContentsChapter 2 Preview79Artificial IntelligenceIndex Report 2023Google Uses PaLM to Improve the Reasoning of PaLMGoogle researchers use one of their existing language models,PaLM,to improve the reasoning of the very same mode
333、l.This process is yet another example of AI systems using their own knowledge to improve.International Research Group Releases BLOOMA collaboration of over 100 researchers from across the globe develop an open-access language model called BLOOM.BLOOM impresses with its public release and for furthering the possibilities of international collaboration in AI research.Stanford Researchers Release HEL