State of AI Report
October 11, 2022 | #stateofai | stateof.ai
Nathan Benaich | Ian Hogarth

About the authors

Nathan Benaich
Nathan is the General Partner of Air Street Capital, a venture capital firm investing in AI-first technology and life science companies. He founded RAAIS and London.AI (an AI community for industry and research), the RAAIS Foundation (funding open-source AI projects), and Spinout.fyi (improving university spinout creation). He studied biology at Williams College and earned a PhD from Cambridge in cancer research.

Ian Hogarth
Ian is a co-founder at Plural, an investment platform for experienced founders to help the most ambitious European startups. He is a Visiting Professor at UCL working with Professor Mariana Mazzucato. Ian was co-founder and CEO of Songkick, the concert service. He started studying machine learning in 2005, when his Master's project was a computer vision system to classify breast cancer biopsy images.
Introduction | Research | Industry | Politics | Safety | Predictions

Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines. We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.

The State of AI Report is now in its fifth year. Consider this report a compilation of the most interesting things we have seen, with a goal of triggering an informed conversation about the state of AI and its implications for the future. We consider the following key dimensions in our report:
- Research: Technology breakthroughs and their capabilities.
- Industry: Areas of commercial application for AI and its business impact.
- Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
- Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
- Predictions: What we believe will happen in the next 12 months and a 2021 performance review to keep us honest.

Produced by Nathan Benaich (nathanbenaich), Ian Hogarth (soundboy), Othmane Sebbouh (osebbouh) and Nitarshan Rajkumar (nitarshan).

Thank you!

Othmane Sebbouh, Research Assistant
Othmane is a PhD student in ML at ENS Paris, CREST-ENSAE and CNRS. He holds an MSc in management from ESSEC Business School and a Master in Applied Mathematics from ENSAE and Ecole Polytechnique.

Nitarshan Rajkumar, Research Assistant
Nitarshan is a PhD student in AI at the University of Cambridge. He was a research student at Mila and a software engineer at Airbnb. He holds a BSc from the University of Waterloo.
Definitions

Artificial intelligence (AI): a broad discipline with the goal of creating intelligent machines, as opposed to the natural intelligence demonstrated by humans and animals.

Artificial general intelligence (AGI): a term used to describe future machines that could match and then exceed the full range of human cognitive ability across all economically valuable tasks.

AI Safety: a field that studies and attempts to mitigate the catastrophic risks which future AI could pose to humanity.

Machine learning (ML): a subset of AI that often uses statistical techniques to give machines the ability to learn from data without being explicitly given the instructions for how to do so. This process is known as "training" a "model" using a learning "algorithm" that progressively improves model performance on a specific task.

Reinforcement learning (RL): an area of ML in which software agents learn goal-oriented behavior (a "policy") by trial and error in an environment that provides rewards or penalties in response to their actions.

Deep learning (DL): an area of ML that attempts to mimic the activity in layers of neurons in the brain to learn how to recognise complex patterns in data. The "deep" refers to the large number of layers of neurons in contemporary models that help to learn rich representations of data to achieve better performance.
Model: once an ML algorithm has been trained on data, the output of the process is known as the model. This can then be used to make predictions.

Self-supervised learning (SSL): a form of unsupervised learning where manually labeled data is not needed. Raw data is instead modified in an automated way to create artificial labels to learn from. An example of SSL is learning to complete text by masking random words in a sentence and trying to predict the missing ones.

(Large) Language model (LM, LLM): a model trained on textual data. The most common use case of an LM is text generation. The term "LLM" is used to designate multi-billion parameter LMs, but this is a moving definition.

Computer vision (CV): enabling machines to analyse, understand and manipulate images and video.

Transformer: a model architecture at the core of most state-of-the-art (SOTA) ML research. It is composed of multiple "attention" layers which learn which parts of the input data are the most important for a given task. Transformers started in language modeling, then expanded into computer vision, audio, and other modalities.
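The masking example in the SSL definition is easy to make concrete. A toy sketch (our illustration, not code from the report): raw text is turned into a (masked input, hidden word) training pair with no manual labeling.

```python
import random

def make_ssl_example(sentence, rng):
    """Turn raw text into a (masked input, label) pair by hiding one word.
    No manual labeling is needed: the label comes from the data itself."""
    words = sentence.split()
    i = rng.randrange(len(words))  # pick a random word to hide
    label = words[i]
    words[i] = "[MASK]"
    return " ".join(words), label

rng = random.Random(0)
masked, target = make_ssl_example("the cat sat on the mat", rng)
# A model would then be trained to predict `target` given `masked`.
```

At scale, this is exactly the recipe behind masked language modeling: the "labels" are free, so any raw text corpus becomes training data.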
Executive Summary

Research
- Diffusion models took the computer vision world by storm with impressive text-to-image generation capabilities.
- AI attacks more science problems, ranging from plastic recycling and nuclear fusion reactor control to natural product discovery.
- Scaling laws refocus on data: perhaps model scale is not all that you need. Progress towards a single model to rule them all.
- Community-driven open sourcing of large models happens at breakneck speed, empowering collectives to compete with large labs.
- Inspired by neuroscience, AI research is starting to look like cognitive science in its approaches.

Industry
- Have upstart AI semiconductor startups made a dent vs. NVIDIA? Usage statistics in AI research show NVIDIA ahead by 20-100x.
- Big tech companies expand their AI clouds and form large partnerships with A(G)I startups.
- Hiring freezes and the disbanding of AI labs precipitate the formation of many startups from giants including DeepMind and OpenAI.
- Major AI drug discovery companies have 18 clinical assets and the first CE mark is awarded for autonomous medical imaging diagnostics.
- The latest in AI-for-code research is quickly translated by big tech and startups into commercial developer tools.

Politics
- The chasm between academia and industry in large-scale AI work is potentially beyond repair: almost 0% of such work is done in academia.
- Academia is passing the baton to decentralized research collectives funded by non-traditional sources.
- The Great Reshoring of American semiconductor capabilities is kicked off in earnest, but geopolitical tensions are sky high.
- AI continues to be infused into a greater number of defense product categories and defense AI startups receive even more funding.

Safety
- AI Safety research is seeing increased awareness, talent, and funding, but is still far behind capabilities research.

Scorecard: Reviewing our predictions from 2021
Our 2021 Prediction | Grade | Evidence
Transformers replace RNNs to learn world models with which RL agents surpass human performance in large and rich games. | Yes | DeepMind's Gato model makes progress in this direction, in which a transformer predicts the next state and action, but it is not trained with RL. University of Geneva's GPT-like transformer model IRIS solves tasks in Atari environments.
ASML's market cap reaches $500B. | No | Current market cap is circa $165B (3 Oct 2022).
Anthropic publishes on the level of GPT, Dota, AlphaGo to establish itself as a third pole of AGI research. | No | Not yet.
A wave of consolidation in AI semiconductors with at least one of Graphcore, Cerebras, SambaNova, Groq, or Mythic being acquired by a large technology company or major semiconductor incumbent. | No | No new AI semiconductor consolidation has been announced yet.
Small transformer + CNN hybrid models match current SOTA on ImageNet top-1 accuracy (CoAtNet-7, 90.88%, 2.44B params) with 10x fewer parameters. | Yes | MaxViT from Google with 475M parameters almost matched (89.53%) CoAtNet-7's performance (90.88%) on ImageNet top-1 accuracy.
DeepMind shows a major breakthrough in the physical sciences. | Yes | Three (!) DeepMind papers in mathematics and materials science.
The JAX framework grows from 1% to 5% of monthly repos created as measured by Papers With Code. | No | JAX usage still accounts for […]
[…] $100M in the next 12 months. | Yes | Helsing (Germany) raised a $100M Series A in 2022.
NVIDIA does not end up completing its acquisition of Arm. | Yes | Deal is formally cancelled in 2022.
Section 1: Research

2021 Prediction: DeepMind's breakthroughs in the physical sciences (1/3)

In 2021, we predicted: "DeepMind releases a major research breakthrough in the physical sciences." The company has since made significant advances in both mathematics and materials science.

One of the decisive moments in mathematics is formulating a conjecture, or hypothesis, about the relationship between variables of interest. This is often done by observing a large number of instances of the values of these variables, potentially using data-driven conjecture generation methods. But these methods are limited to low-dimensional, linear, and generally simple mathematical objects.
In a Nature article, DeepMind researchers proposed an iterative workflow involving mathematicians and a supervised ML model (typically a neural network, NN). Mathematicians hypothesize a function relating two variables (input X(z) and output Y(z)). A computer generates a large number of instances of the variables, and an NN is fit to the data. Gradient saliency methods are used to determine the most relevant inputs in X(z). Mathematicians can then refine their hypothesis and/or generate more data until the conjecture holds on a large amount of data.
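The saliency step in this workflow can be sketched minimally. Here a finite-difference gradient stands in for the saliency method, and a simple fixed function stands in for the fitted NN (both are our illustrative assumptions, not DeepMind's implementation):

```python
def saliency(f, x, eps=1e-6):
    """Rank the input dimensions of a fitted model f by gradient magnitude:
    inputs whose perturbation changes the output most are deemed relevant."""
    base = f(x)
    grads = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += eps
        grads.append(abs((f(xp) - base) / eps))  # finite-difference partial derivative
    return grads

# Stand-in for a trained NN: depends strongly on x[0], weakly on x[1], not at all on x[2].
model = lambda x: 5.0 * x[0] + 0.1 * x[1]
scores = saliency(model, [1.0, 1.0, 1.0])
# scores[0] dominates, suggesting the conjecture should focus on the first input.
```

In the actual workflow, large saliency scores tell mathematicians which components of X(z) their refined conjecture should involve.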
2021 Prediction: DeepMind's breakthroughs in the physical sciences (2/3)

DeepMind researchers used their framework in a collaboration with mathematics professors from the University of Sydney and the University of Oxford to (i) propose an algorithm that could settle a 40-year-old conjecture in representation theory and (ii) prove a new theorem in the study of knots.

DeepMind made an important contribution in materials science as well. It showed that the exact functional in Density Functional Theory (DFT), an essential tool for computing electronic energies, can be efficiently approximated using a neural network. Notably, instead of constraining the neural network to satisfy the mathematical constraints of the DFT functional, the researchers simply incorporated them into the training data to which they fit the NN.
2021 Prediction: DeepMind's breakthroughs in the physical sciences (3/3)

DeepMind repurposed AlphaZero (their RL model trained to beat the best human players of Chess, Go and Shogi) to do matrix multiplication. This AlphaTensor model was able to find new deterministic algorithms to multiply two matrices. To use AlphaZero, the researchers recast the matrix multiplication problem as a single-player game, where each move corresponds to an algorithm instruction and the goal is to zero out a tensor measuring how far from correct the predicted algorithm is.

Finding faster matrix multiplication algorithms, a seemingly simple and well-studied problem, has been stale for decades. DeepMind's approach not only helps speed up research in the field, but also boosts matrix-multiplication-based technology, that is, AI, imaging, and essentially everything happening on our phones.
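For intuition about what "a faster matrix multiplication algorithm" means, the classic example of the kind of decomposition AlphaTensor searches over is Strassen's 1969 scheme, which multiplies 2x2 matrices with 7 scalar multiplications instead of the naive 8 (this sketch shows Strassen's algorithm, not AlphaTensor's discoveries):

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)
    instead of the naive 8; applied recursively, this beats O(n^3)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

AlphaTensor's game formulation searches for decompositions like this one (fewer multiplications for a given matrix size), including some that beat the best previously known schemes.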
Reinforcement learning could be a core component of the next fusion breakthrough

DeepMind trained a reinforcement learning system to adjust the magnetic coils of Lausanne's TCV (Variable Configuration Tokamak). A popular route to achieving nuclear fusion requires confining extremely hot plasma for long enough using a tokamak. A major obstacle is that the plasma is unstable: it loses heat and degrades materials when it touches the tokamak's walls. Stabilizing it requires tuning the magnetic coils thousands of times per second.

DeepMind's deep RL system did just that: first in a simulated environment, and then when deployed on the TCV in Lausanne. The system was also able to shape the plasma in new ways, including making it compatible with ITER's design. The system's flexibility means it could also be used in ITER, the promising next-generation tokamak under construction in France.
Predicting the structure of the entire known proteome: what could this unlock next?

Since its open sourcing, DeepMind's AlphaFold 2 has been used in hundreds of research papers. The company has now deployed the system to predict the 3D structure of 200 million known proteins from plants, bacteria, animals and other organisms. The extent of the downstream breakthroughs enabled by this technology, ranging from drug discovery to basic science, will need a few years to materialize.

There are 190k empirically determined 3D structures in the Protein Data Bank today. These have been derived through X-ray crystallography and cryogenic electron microscopy. The first release of AlphaFold DB in July 2021 included 1M predicted protein structures.
This new release 200xs the database size. Over 500,000 researchers from 190 countries have made use of the database. AlphaFold mentions in the AI research literature are growing massively and are predicted to triple year on year (right chart).

Language models for proteins: a familiar story of open source and scaled models

Researchers independently applied language models to the problems of protein generation and structure prediction while scaling model parameters. Both report large benefits from scaling their models.

Salesforce researchers find that scaling their LMs allows them to better capture the training distribution of protein sequences (as measured by perplexity). Using the 6B-param ProGen2, they generated proteins with similar folds to natural proteins, but with a substantially different sequence identity. But to unlock the full potential of scale, the authors insist that more emphasis be placed on data distribution.
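Perplexity, the scaling metric mentioned here, is just the exponential of the average negative log-likelihood a model assigns to the true sequence. A minimal sketch with made-up per-token probabilities:

```python
import math

def perplexity(probs):
    """Perplexity of a sequence under a model: exp of the average negative
    log-likelihood of each true next token. Lower means the model captures
    the data distribution better."""
    nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(nll)

# Hypothetical per-token probabilities a model assigns to a protein sequence.
confident = perplexity([0.9, 0.8, 0.95, 0.85])
uncertain = perplexity([0.2, 0.1, 0.3, 0.25])
# confident < uncertain: the first model "explains" the sequence better.
```

A uniform guess over k tokens gives perplexity exactly k, which is why perplexity is often read as an effective branching factor.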
Meta et al. introduced the ESM family of protein LMs (dubbed ESM-2), whose sizes range from 8M to 15B parameters. Using ESM-2, they built ESMFold to predict protein structure. They show that ESMFold produces similar predictions to AlphaFold 2 and RoseTTAFold, but is an order of magnitude faster. This is because ESMFold doesn't rely on the multiple sequence alignments (MSA) and templates used by AlphaFold 2 and RoseTTAFold, and instead only uses protein sequences.

OpenCell: understanding protein localization with a little help from machine learning

An important goal of genomic research is to understand where proteins localize and how they interact in a cell to enable particular functions. Researchers used CRISPR-based endogenous tagging (modifying genes to illuminate specific aspects of a protein's function) to determine protein localization in cells. They then used clustering algorithms to identify protein communities and formulate mechanistic hypotheses about uncharacterized proteins.

With its dataset of 1,310 tagged proteins across 5,900 3D images, the OpenCell initiative enabled researchers to draw important links between the spatial distribution of proteins, their functions, and their interactions. Markov clustering on the graph of protein interactions successfully delineated functionally related proteins. This will help researchers better understand so-far uncharacterized proteins.

We often expect ML to deliver definitive predictions. But here, as with math, ML first gives partial answers (here, clusters); humans then interpret, formulate and test hypotheses before delivering a definitive answer.
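Markov clustering itself is simple enough to sketch. A minimal MCL implementation on a toy interaction graph (our illustration; the inflation factor and iteration count are assumptions, not OpenCell's settings):

```python
import numpy as np

def markov_cluster(adj, inflation=2.0, iters=50):
    """Minimal Markov Clustering (MCL) sketch: alternate random-walk expansion
    (matrix squaring) and inflation (elementwise powering) until the flow
    matrix converges; rows that retain mass define the clusters."""
    M = adj + np.eye(len(adj))             # self-loops, standard MCL preprocessing
    M = M / M.sum(axis=0)                  # column-stochastic transition matrix
    for _ in range(iters):
        M = np.linalg.matrix_power(M, 2)   # expansion: spread flow along walks
        M = M ** inflation                 # inflation: strengthen strong currents
        M = M / M.sum(axis=0)
    return {tuple(np.nonzero(row > 1e-6)[0]) for row in M if row.max() > 1e-6}

# Two triangles joined by a single edge: a toy "protein interaction graph".
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
print(markov_cluster(A))  # expected: the two triangles recovered as separate clusters
```

On real interaction graphs these clusters are the "partial answers" that biologists then interpret and test.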
Plastic recycling gets a much-needed ML-engineered enzyme

Researchers from UT Austin engineered an enzyme capable of degrading PET, a type of plastic responsible for 12% of global solid waste. The PET hydrolase, called FAST-PETase, is more robust to different temperatures and pH levels than existing ones. FAST-PETase was able to almost completely degrade 51 different products in 1 week. They also showed that they could resynthesize PET from monomers recovered from FAST-PETase degradation, potentially opening the way for industrial-scale closed-loop PET recycling.

Beware of compounded errors: in science, ML in and garbage out?

With the increased use of ML in quantitative sciences, methodological errors in ML can leak into these disciplines. Researchers from Princeton warn of a growing reproducibility crisis in ML-based science, driven in part by one such methodological error: data leakage.
Data leakage is an umbrella term covering all cases where data that shouldn't be available to a model in fact is. The most common example is when test data is included in the training set. But the leakage can be more pernicious: when the model uses features that are a proxy of the outcome variable, or when test data come from a distribution different from the one about which the scientific claim is made.

The authors argue that the ensuing reproducibility failures in ML-based science are systemic: they study 20 reviews across 17 science fields examining errors in ML-based science, and find that data leakage errors happened in every one of the 329 papers the reviews span. Inspired by the increasingly popular model cards in ML, the authors propose that researchers use model info sheets designed to prevent data leakage issues.
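Test-set leakage, the most common case above, is easy to demonstrate. A toy sketch (our illustration) in which a memorizing classifier scores perfectly on pure-noise labels once test rows leak into the training set:

```python
import random

def knn1_accuracy(train, test):
    """Evaluate a 1-nearest-neighbour classifier: predict the label of the
    closest training point. Memorization-friendly, so leakage shows up clearly."""
    correct = 0
    for x, y in test:
        pred = min(train, key=lambda t: abs(t[0] - x))[1]
        correct += pred == y
    return correct / len(test)

rng = random.Random(0)
# Toy data: labels are pure noise, so honest generalization accuracy should be ~50%.
data = [(rng.random(), rng.randint(0, 1)) for _ in range(200)]

honest = knn1_accuracy(data[:100], data[100:])  # disjoint train/test split
leaky = knn1_accuracy(data, data[100:])         # test rows also appear in training
# leaky == 1.0: every test point finds itself in the training set,
# while honest stays near chance. Classic test-set leakage.
```

The same failure mode hides in subtler forms (preprocessing fit on all data, duplicated records, temporal overlap), which is what the proposed model info sheets are meant to catch.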
OpenAI uses Minecraft as a testbed for computer-using agents

OpenAI trained a model (Video PreTraining, VPT) to play Minecraft from video frames, using a small amount of labeled mouse and keyboard interactions. VPT is the first ML model to learn to craft diamonds, "a task that usually takes proficient humans over 20 minutes (24,000 actions)".

OpenAI gathered 2,000 hours of video labeled with mouse and keyboard actions and trained an inverse dynamics model (IDM) to predict actions given past and future frames; this is the PreTraining part. They then used the IDM to label 70k hours of video, on which they trained a model to predict actions given only past video frames. They show that this model can be fine-tuned with imitation learning and reinforcement learning (RL) to achieve performance that is too hard to reach with RL from scratch.
Corporate AI labs rush into AI-for-code research

OpenAI's Codex, which drives GitHub Copilot, has impressed the computer science community with its ability to complete code on multiple lines or directly from natural language instructions. This success spurred more research in this space, including from Salesforce, Google and DeepMind.

With the conversational CodeGen, Salesforce researchers leverage the language understanding of LLMs to specify coding requirements in multi-turn language interactions. It is the only open-source model to be competitive with Codex.

A more impressive feat was achieved by Google's LLM PaLM, which achieves a similar performance to Codex, but with 50x less code in its training data (PaLM was trained on a larger non-code dataset). When fine-tuned on Python code, PaLM outperformed peers (82% vs. 71.7% SOTA) on DeepFix, a code repair task.

DeepMind's AlphaCode tackles a different problem: the generation of whole programs for competitive programming tasks. It ranked in the top half on Codeforces, a coding competitions platform. It was pre-trained on GitHub data and fine-tuned on Codeforces problems and solutions. Millions of possible solutions are then sampled, filtered, and clustered to obtain 10 final candidate submissions.
Five years after the Transformer, there must be some efficient alternative, right... right?

The attention layer at the core of the transformer model famously suffers from a quadratic dependence on the length of its input. A slew of papers promised to solve this, but no method has been adopted. SOTA LLMs come in different flavors (autoencoding, autoregressive, encoder-decoder), yet all rely on the same attention mechanism.

A googol of transformers have been trained over the past few years, costing millions (billions?) to labs and companies around the world. But so-called "Efficient Transformers" are nowhere to be found in large-scale LM research (where they would make the biggest difference!). GPT-3, PaLM, LaMDA, Gopher, OPT, BLOOM, GPT-Neo, Megatron-Turing NLG, GLM-130B, etc. all use the original attention layer in their transformers.

Several reasons can explain this lack of adoption: (i) the potential linear speed-up is only useful for large input sequences, (ii) the new methods introduce additional constraints that make the architectures less universal, (iii) the reported efficiency measures don't translate into actual compute and time savings.
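The quadratic dependence is visible directly in a plain attention implementation: the score matrix below has one entry per (query, key) pair, i.e. n^2 entries for sequence length n. A minimal sketch (illustrative, not any production implementation):

```python
import math

def attention(Q, K, V):
    """Plain scaled dot-product attention. The `scores` matrix is
    len(Q) x len(K): this quadratic object is exactly what
    "Efficient Transformer" variants try to avoid materializing."""
    d = len(Q[0])
    scores = [[sum(q[i] * k[i] for i in range(d)) / math.sqrt(d) for k in K]
              for q in Q]
    out = []
    for row in scores:
        m = max(row)
        e = [math.exp(s - m) for s in row]
        z = sum(e)
        w = [x / z for x in e]  # softmax over keys
        out.append([sum(w[j] * V[j][i] for j in range(len(V))) for i in range(d)])
    return out

# One query attending over two keys/values: output is a convex mix of V's rows.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Doubling the sequence length quadruples both the memory and the compute spent on `scores`, which is why the pain only bites at long contexts, reason (i) above.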
Mathematical abilities of language models largely surpass expectations

Built on Google's 540B-parameter LM PaLM, Google's Minerva achieves a 50.3% score on the MATH benchmark (43.4 percentage points better than the previous SOTA), beating forecasters' expectations for the best score in 2022 (13%). Meanwhile, OpenAI trained a network to solve two mathematical olympiad (IMO) problems.

Google trained its (pre-trained) LLM PaLM on an additional 118GB dataset of scientific papers from arXiv and web pages using LaTeX and MathJax. Using chain-of-thought prompting (including intermediate reasoning steps in prompts rather than the final answer only) and other techniques like majority voting, Minerva improves the SOTA on most datasets by at least double-digit percentage points.

Minerva only uses a language model and doesn't explicitly encode formal mathematics. It is more flexible, but can only be automatically evaluated on its final answer rather than its whole reasoning, which might justify some score inflation. In contrast, OpenAI built a (transformer-based) theorem prover in the Lean formal environment. Different versions of their model were able to solve a number of problems from AMC12 (26), AIME (6) and IMO (2), in increasing order of difficulty.
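Majority voting, one of the techniques above, can be sketched in a few lines: sample several reasoning chains, parse out each chain's final answer, and return the most frequent one (the sampled answers below are made up for illustration):

```python
from collections import Counter

def majority_vote(answers):
    """Majority voting on top of chain-of-thought sampling: many reasoning
    paths are sampled, only each path's final answer is kept, and the most
    common answer wins."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical final answers parsed from 8 sampled reasoning chains.
samples = ["12", "12", "7", "12", "9", "12", "7", "12"]
best = majority_vote(samples)  # "12"
```

The intuition: wrong reasoning chains tend to scatter across many wrong answers, while correct chains concentrate on the one right answer.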
Fast progress in LLM research renders benchmarks obsolete, but a BIG one comes to help

Only 66% of machine learning benchmarks have received more than 3 results at different time points, and many are solved or saturated soon after their release. BIG (Beyond the Imitation Game), a new benchmark designed by 444 authors across 132 institutions, aims to challenge current and future language models.

A study from the University of Vienna, Oxford, and FHI examined 1,688 benchmarks for 406 AI tasks and identified different submission dynamics (see right). They note that language benchmarks in particular tend to be quickly saturated. Rapid LLM progress and emerging capabilities seem to outrun current benchmarks. As a result, much of this progress is only captured through circumstantial evidence like demos or one-off breakthroughs, and/or evaluated on disparate dedicated benchmarks, making it difficult to identify actual progress.

The new BIG benchmark contains 204 tasks, all with strong human expert baselines, which evaluate a large set of LLM capabilities from memorization to multi-step reasoning. They show that, for now, even the best models perform poorly on the BIG benchmark.
Ducking language model scaling laws: more data please

DeepMind revisited LM scaling laws and found that current LMs are significantly undertrained: they're not trained on enough data given their large size. They train Chinchilla, a 4x smaller version of their Gopher, on 4.6x more data, and find that Chinchilla outperforms Gopher and other large models on BIG-bench.

Empirical LM scaling laws determine, for a fixed compute budget, the model and training data sizes that should be used. Past work from OpenAI had established that model size should increase faster than training data size as the compute budget increases. DeepMind claims that model size and the number of training tokens should instead increase at roughly the same rate. Compared to OpenAI's work, DeepMind uses larger models to derive their scaling laws. They emphasize that data scaling leads to better predictions from multi-billion parameter models.

Following these new scaling laws, Chinchilla (70B params) is trained on 1.4T tokens; Gopher (280B) on 300B. Though trained with the same compute budget, the lighter Chinchilla should be faster to run.
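A back-of-the-envelope check of the "same compute budget" claim, using the common C ≈ 6·N·D approximation for training FLOPs (our assumption; the report does not state this formula):

```python
def training_flops(params, tokens):
    """Common rule of thumb: training cost is roughly 6 * N * D FLOPs
    for N parameters and D training tokens (an approximation, not exact)."""
    return 6 * params * tokens

chinchilla = training_flops(70e9, 1.4e12)  # ~5.9e23 FLOPs
gopher = training_flops(280e9, 300e9)      # ~5.0e23 FLOPs
# Similar budgets, but Chinchilla spends its compute on 4.6x more data
# with 4x fewer parameters, and is therefore cheaper at inference time.
```

Under this approximation the two training runs land within ~20% of each other, consistent with the slide's "same compute budget" framing.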
Ducking language model scaling laws: emergence

While model loss can be reasonably predicted as a function of size and compute using well-calibrated scaling laws, many LLM capabilities emerge unpredictably when models reach a critical size. These acquired capabilities are exciting, but the emergence phenomenon makes evaluating model safety more difficult.

Emergence is not fully understood. It could be that for multi-step reasoning tasks, models need to be deeper to encode the reasoning steps; for memorization tasks, having more parameters is a natural solution. The metrics themselves may be part of the explanation, as an answer on a reasoning task is only considered correct if its conclusion is: despite continuous improvements with model size, we only count a model as successful once increments accumulate past a certain point.

A possible consequence of emergence is that there is a range of tasks currently out of reach of LLMs that could soon be successfully tackled. Alternatively, deploying LLMs on real-world tasks at larger scales becomes more uncertain, as unsafe and undesirable abilities can emerge. Alongside the brittle nature of ML models, this is another feature practitioners will need to account for.

[Figure: emergent abilities (arithmetic, transliteration, multi-task NLU, figures of speech) appearing abruptly as a function of training FLOPs]
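The metric-threshold explanation above can be illustrated numerically: if a task requires many steps to all be correct, a smoothly improving per-step skill produces an abrupt jump in exact-match accuracy (the numbers below are illustrative, not measured):

```python
# Sketch of one proposed explanation for "emergence": smooth per-step progress
# plus an all-or-nothing metric yields an apparently sudden capability.
def task_accuracy(per_step_acc, steps=20):
    """Exact-match accuracy on a task needing `steps` correct steps in a row,
    assuming independent per-step success."""
    return per_step_acc ** steps

smooth = [0.80, 0.85, 0.90, 0.95, 0.99]        # per-step skill rises gradually
emergent = [task_accuracy(p) for p in smooth]  # task metric stays ~0, then jumps
# 0.8**20 is ~0.01 while 0.99**20 is ~0.82: a smoothly improving capability
# looks like a sudden one under an exact-match metric.
```

This is only one candidate explanation; the slide's point is that under such metrics, extrapolating from small models systematically understates what slightly larger ones will do.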
Teach a machine to fish: tool use as the next frontier?

Language models can learn to use tools such as search engines and calculators simply by making text interfaces to these tools available and training on a very small number of human demonstrations. OpenAI's WebGPT was the first model to demonstrate this convincingly, by fine-tuning GPT-3 to interact with a search engine to provide answers grounded with references.

This merely required collecting data of humans doing this task and converting the interaction data into text that the model could consume for training by standard supervised learning. Importantly, the use of increasing amounts of human demonstration data significantly increased the truthfulness and informativeness of answers (right panel, white bars for WebGPT), a significant advance from when we covered truthfulness evaluation in our 2021 report (slide 44).

Adept, a new AGI company, is commercializing this paradigm. The company trains large transformer models to interact with websites, software applications and APIs (see more at adept.ai/act) in order to drive workflow productivity.
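The "text interface to tools" idea can be sketched as a simple read-act loop. Everything below (the CALC:/RESULT:/ANSWER: protocol and the scripted stand-in model) is hypothetical and for illustration only; it is not WebGPT's or Adept's actual interface:

```python
import re

def run_with_tools(model_step, prompt, max_turns=5):
    """Hypothetical tool-use loop: the model emits lines like 'CALC: 2+3',
    the environment appends the tool's result to the transcript, and the
    loop continues until the model emits an 'ANSWER: ...' line."""
    transcript = prompt
    for _ in range(max_turns):
        out = model_step(transcript)
        call = re.match(r"CALC: (.+)", out)
        if call:
            result = str(eval(call.group(1), {"__builtins__": {}}))  # toy calculator tool
            transcript += f"\n{out}\nRESULT: {result}"
        else:
            return out  # e.g. "ANSWER: 42"
    return "ANSWER: (gave up)"

# A scripted stand-in for the LM: first queries the calculator, then answers.
def fake_model(transcript):
    return "CALC: 6*7" if "RESULT:" not in transcript else "ANSWER: 42"

print(run_with_tools(fake_model, "Q: what is 6*7?"))  # ANSWER: 42
```

The key point the slide makes is that nothing tool-specific is built into the model: the tool only ever appears to it as more text in the transcript.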
Looking back: three eras of compute in machine learning

A study documents the incredible acceleration of compute requirements in machine learning. It identifies 3 eras of machine learning according to the doubling time of training compute per model: the Pre-Deep Learning Era (pre-2010, training compute doubled every 20 months), the Deep Learning Era (2010-15, doubling every 6 months), and the Large-Scale Era (2016-present, a 100-1000x jump, then doubling every 10 months).
Diffusion models take over text-to-image generation and expand into other modalities

When we covered diffusion models in the 2021 Report (slide 36), they were overtaking GANs in image generation on a few benchmarks. Today, they are the undisputable SOTA for text-to-image generation, and are diffusing (pun intended) into text-to-video, text generation, audio, molecular design and more.

Diffusion models (DMs) learn to reverse successive noise additions to images by modeling the inverse distribution (generating denoised images from noisy ones) at each step as a Gaussian whose mean and covariance are parametrized as a neural network. DMs generate new images from random noise. Sequential denoising makes them slow, but new techniques (like denoising in a lower-dimensional space) allow them to be faster at inference time and to generate higher-quality samples (classifier-free guidance trades off diversity for fidelity).

SOTA text-to-image models like DALL-E 2, Imagen and Stable Diffusion are based on DMs. They're also used in controllable text generation (generating text with a pre-defined structure or semantic context), model-based reinforcement learning, video generation and even molecular generation.
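The "successive noise additions" have a closed form that is easy to sketch: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise. A minimal forward-process sketch on a single scalar "pixel" (illustrative; noise schedules and notation vary across papers):

```python
import math, random

def forward_noising(x0, alpha_bar):
    """Sample x_t ~ q(x_t | x_0) for the standard Gaussian diffusion process:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise.
    This is only the fixed, training-time forward half of a diffusion model;
    the learned network is trained to undo these steps."""
    noise = random.gauss(0.0, 1.0)
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * noise

random.seed(0)
pixel = 1.0
trajectory = [forward_noising(pixel, ab) for ab in (0.99, 0.5, 0.01)]
# As alpha_bar decays over timesteps, the signal fades into pure noise.
```

Sampling runs this in reverse: starting from pure noise, the network's predicted denoising step is applied at every timestep, which is exactly why naive DM sampling is slow.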
DALL-E 2, Imagen and Parti: the battle for text-to-image generation rages

The second iteration of OpenAI's DALL-E, released in April 2022, came with a significant jump in the quality of generated images. Soon after, another at least equally impressive diffusion-based model came from Google (Imagen). Meanwhile, Google's Parti took a different, autoregressive, route.

Instead of using a diffusion model, Parti treats text-to-image generation as a simple sequence-to-sequence task, where the sequence to be predicted is a representation of the pixels of the image. Notably, as the number of parameters and the training data in Parti are scaled, the model acquires new abilities like spelling. Other impressive text-to-image models include GLIDE (OpenAI) and Make-a-Scene (Meta; can use both text and sketches), which predate DALL-E 2, and CogView2 (Tsinghua, BAAI; both English and Chinese).

[Figure: sample generations from DALL-E 2, Imagen, Parti-350M and Parti-20B]

The text-to-image diffusion model frenzy gives birth to new AI labs

Stability.ai and Midjourney came out of seemingly nowhere with text-to-image models that rival those of established AI labs. Both have APIs in beta, Midjourney is reportedly profitable, and Stability has already open-sourced their model.
already open-sourced their model. But more on their emergence and research dynamics in our Politics section.

The text-to-image diffusion model frenzy gives birth to new AI labs

[Image credits: Fabian Stelzer]

The text-to-video generation race has started

Research on diffusion-based text-to-video generation kicked off around April 2022, with work from Google and the University of British Columbia. But in late September, new research from Meta and Google came with a jump in quality, announcing a sooner-than-expected
DALL-E moment for text-to-video generation.

Meta made the first splash from Big Tech in text-to-video generation by releasing Make-a-Video, a diffusion model for video generation. In an eerily similar fashion to text-to-image generation, Google then published, less than a week later, two models almost simultaneously: one diffusion-based, Imagen Video, and one not based on diffusion, Phenaki. The latter can dynamically adapt the video via additional prompts.

[Figure: sample frames from Imagen Video, Make-a-Video and Phenaki]

Closed for 14 months: community-driven open sourcing of GPT et al.

Landmark models from OpenAI and DeepMind have been implemented/cloned/improved by the open source community much faster than we'd have expected.

[Timeline, June 2020 to Aug 2022, open-sourced models in red: GPT-3 (175B), Pan-Gu (200B), HyperCLOVA (204B), Jurassic-1 Jumbo (178B), FLAN (137B), Megatron-Turing NLG (530B), Yuan 1.0 (246B), Gopher (280B), Ernie 3.0 Titan (260B), LaMDA (137B), GPT-J (6B), GPT-NeoX (20B), PaLM (540B), OPT (175B), BLOOM (176B), GLM (130B), Chinchilla (70B); open-sourced: GPT-J, GPT-NeoX, OPT, BLOOM, GLM]

Closed for 15 months: community-driven open sourcing of DALL-E et al.

[Timeline, Jan 2021 to Aug 2022, open-sourced models in red: DALL-E, Make-a-Scene, DALL-E 2, DALL-E mini, Imagen, Parti, CogView2, Stable Diffusion; open-sourced: DALL-E mini, CogView2, Stable Diffusion]

Closed for 35 months: community-driven open sourcing of AlphaFold et al.

[Timeline, Aug 2018 to Aug 2022, open-sourced models in red: AlphaFold 1, ESM, ESM-1b, AlphaFold 2, RosettaFold, ESM-2, OpenFold]

Note that the models we reference are not necessarily replicas, but can be improved versions or independently developed.

Thanks to their large range of capabilities, LLMs could in principle enable robots to perform any task by explaining its steps in natural language. But LLMs have little contextual knowledge of the robot's environment and its
abilities, making their explanations generally infeasible for the robot. PaLM-SayCan solves this.

LLMs empower robots to execute diverse and ambiguous instructions

Given an ambiguous instruction ("I spilled my drink, can you help?"), a carefully prompt-engineered LLM (e.g. Google's PaLM) can devise a sequence of abstract steps to pick up and bring you a sponge. But any given skill (e.g. pick up, put down) needs to be doable by the robot in concordance with its environment (e.g. the robot sees a sponge). To incentivise the LLM to output feasible instructions, SayCan maximises the likelihood of an instruction being successfully executed by the robot. Assume the robot can execute a set of skills. Then, for any given instruction and state, the system selects the skill that maximizes: the probability of a given completion (restricted to the set of available skills) times the probability of success given the completion and the current state. The system is trained using reinforcement learning. Researchers tested SayCan on 101 instructions from 7 types of language instructions. It was successful in planning and execution 84% and 74% of the time respectively.
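The selection rule described above can be sketched with toy numbers. Both probability tables below are invented for illustration; in the real system the first comes from the LLM's likelihoods over completions and the second from value functions trained with RL:

```python
# SayCan-style skill selection sketch (our illustration, not Google's code).
def select_skill(llm_prob, success_prob):
    """Pick the skill maximizing P(completion) * P(success | completion, state)."""
    return max(llm_prob, key=lambda s: llm_prob[s] * success_prob[s])

# Hypothetical skills and made-up probabilities for "I spilled my drink".
llm_prob = {"find a sponge": 0.5, "pick up the sponge": 0.4, "go to the user": 0.1}
success_prob = {"find a sponge": 0.9, "pick up the sponge": 0.2, "go to the user": 0.8}

print(select_skill(llm_prob, success_prob))  # prints "find a sponge"
```

Weighting the LLM's plan by the robot's affordances is what keeps the chosen step both plausible in language and feasible in the current state.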
The introduction last year of Vision Transformers (ViT) and other image transformers as SOTA models on imaging benchmarks seemed to announce the end of the ConvNet era. Not so fast: work from Meta and UC Berkeley argues that modernizing ConvNets gives them an edge over ViTs.

2021 Prediction: in vision, convolutional networks want a fair fight with transformers

The researchers introduce ConvNeXt, a ResNet augmented with the recent design choices introduced in hierarchical vision transformers like Swin, but without attention layers. ConvNeXt is competitive with both the Swin Transformer and ViT on ImageNet-1K and ImageNet-22K, and benefits from scale like them. Transformers quickly replaced recurrent neural networks in language modeling, but we don't expect a similarly abrupt drop-off in ConvNet usage, especially in smaller-scale ML use cases. Meanwhile, our 2021 prediction of small transformers+
CNN hybrid models manifested in MaxViT from Google, whose 475M-parameter model almost matches (89.53%) CoAtNet-7's performance (90.88%) on ImageNet top-1 accuracy.

Transformer-based autoencoder LMs are trained to predict randomly masked words in large text corpora. This results in powerful models that are SOTA in language modeling tasks (e.g. BERT). While masking a word in a sentence makes the sentence nonsensical and creates a challenging task for LMs, reconstructing a few randomly masked pixels in images is trivial thanks to neighbouring pixels. The solution: mask large patches of pixels (e.g. 75% of the pixels). Meta use this and other adjustments (the encoder only sees visible patches, the decoder is much smaller than the encoder) to pre-train a ViT-Huge model on ImageNet-1K and then fine-tune it to achieve a task-best 87.8% top-1 accuracy. Self-supervised learning isn't new to computer vision (see e.g. Meta's SEER model). Nor are masking techniques (e.g. Context Encoders, or the more recent SiT). But this work is further evidence that SOTA techniques in language transition seamlessly to vision. Can domain unification be pushed further?

Self-supervision techniques used to train transformers on text are now transposed almost as-is to images and are achieving state-of-the-art results on ImageNet-1K.

...but the inevitable vision and language modeling unification continues
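A minimal sketch of this masking recipe, using the 75% ratio from the text over an assumed 14x14 patch grid (our illustration, not Meta's implementation):

```python
import random

# MAE-style random patch masking sketch. The 75% mask ratio matches the text;
# the grid size and seed are assumptions for illustration.
def mask_patches(num_patches, mask_ratio=0.75, seed=0):
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    num_visible = round(num_patches * (1 - mask_ratio))
    visible, masked = sorted(idx[:num_visible]), sorted(idx[num_visible:])
    # The encoder embeds only `visible`; a small decoder reconstructs `masked`.
    return visible, masked

visible, masked = mask_patches(14 * 14)   # a 224px image as 16px patches
print(len(visible), len(masked))          # prints "49 147"
```

Because the encoder only ever processes the 25% of visible patches, pre-training is also substantially cheaper than encoding the full image.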
Transformers trained on a specific task (via supervised or self-supervised learning) can be used for a broader set of tasks via fine-tuning. Recent work shows that a single transformer can be directly and efficiently trained on various tasks across different modalities (multi-task multimodal learning). Attempts at generalist multitask, multimodal models date back to at least Google's "One model to learn them all" (2017), which tackled 8 tasks in image, text and speech. DeepMind's Gato brings this effort to another level: researchers train a 1.2B-parameter transformer to perform hundreds of tasks in robotics, simulated environments, and vision and language. This partially proves our 2021 Prediction. They showed that scaling consistently improved the model, but it was kept "small" for live low-latency robotics tasks.

2021 Prediction: culminating in a single transformer to rule them all?

To train their model on different modalities, all data was serialized into a sequence of tokens which are embedded in a learned vector space. The model is trained in a fully supervised fashion. Separately: with data2vec, on a narrower set of tasks, Meta devised a unified self-supervision strategy across modalities. But for now, different transformers are used for each modality.

In 2021, we predicted: "Transformers replace RNNs to learn world models with which RL agents surpass human performance in large and rich game environments." Researchers from the University of Geneva used a GPT-like transformer to simulate the world environment. They showed that their agent (dubbed IRIS) was sample-efficient and surpassed human performance on 10 of the 26 Atari games considered. IRIS was notably the best method among those that don't use lookahead
search.

2021 Prediction: transformers for learning world models in reinforcement learning

In the 2020 State of AI Report we predicted that transformers would expand beyond NLP to achieve state of the art in computer vision. It is now clear that transformers are a candidate general-purpose architecture. Analysing transformer-related papers in 2022 shows just how ubiquitous this model architecture has become.

Transformers are becoming truly cross-modality

[Figure: share of transformer-related papers in 2022 by modality]

The seminal NeRF paper was published in March 2020. Since then, fundamental improvements to the methods and new applications have been developed quickly and continuously. For example, more than 50 papers on NeRF alone appeared at CVPR in 2022.

NeRFs expand into their own mature field of research

From last year's Report (slide 18): given multiple views of an image, NeRF uses a multilayer perceptron to learn a representation of the image and to render new views of it. It learns a mapping from every pixel location and view direction to the color and density at that location. Among this year's work, Plenoxels stands out by removing the MLP altogether and achieving a 100x speedup in NeRF training. Another exciting direction was rendering large-scale sceneries from a few views with NeRFs, whether city-scale (rendering entire neighborhoods of San Francisco with Block-NeRF) or satellite-scale with Mega-NeRF*. Given the current quality of the results and the field's rate of progress, we expect that in a year or two, NeRFs will feature prominently in our Industry section.

*You can better appreciate NeRF research by checking demos, e.g. Block-NeRF, NeRF in the Dark, Light Field Neural Rendering.

Resistance to antibacterial agents is common and often arises as a result of a different pathogen already present in a patient's body. So how should doctors find the right antibiotic that cures the infection but doesn't render the patient susceptible to a new infection?

Treating bacterial infections by data-driven personalised selection of antibacterial agents

By comparing the microbiome profiles of 200,000 patients with urinary tract or wound infections who were treated with known antibiotics before and after their infections, ML can be used to predict the risk of treatment-induced gain of resistance on a patient-specific level. Indeed, urinary tract infection (UTI) patients treated with antibiotics that the ML system would not have recommended showed significantly higher gain of resistance (E). Both UTI (F) and wound infection (G) patients would have suffered far fewer reinfections had they been prescribed antibiotics according to the ML system.

Very few biological samples can typically be identified from reference libraries. Property-prediction transformers outperform at predicting a suite of medicinally relevant chemical properties, like solubility, drug-likeness and synthetic accessibility, directly from MS/MS, without using structure-prediction intermediates or reference lookups.

Interpreting small molecule mass spectra using transformers

Tandem mass spectrometry (MS/MS) is commonly used in metabolomics, the study of small molecules in biological samples. Less than 10% of small molecules can be identified from spectral reference libraries, as most of nature's chemical space is unknown. Transformers enable fast, accurate, in silico characterization of the molecules in metabolic mixtures, enabling biomarker and natural product drug discovery at scale.

The researchers had trained their "MegaSyn" model to maximize bioactivity
and minimize toxicity. To design toxic molecules, they kept the same model but simply trained it to maximize both bioactivity and toxicity. They used a public database of drug-like molecules. They directed the model towards generating the nerve agent VX, known to be one of the most toxic chemical warfare agents. However, as with regular drug discovery, finding molecules with a high predicted toxicity doesn't mean they are easy to make. But as drug discovery with AI in the loop improves dramatically, we can imagine best practices in drug discovery diffusing into building cheap biochemical weapons.

Researchers from Collaborations Pharmaceuticals and King's College London showed that machine learning models designed for therapeutic use can easily be repurposed to generate biochemical weapons.

Drug discovery, the flagship "AI for good" application, is not immune
to misuse

Compared to US AI research, Chinese papers focus more on surveillance-related tasks. These include autonomy, object detection, tracking, scene understanding, and action and speaker recognition.

Comparing data modalities and machine learning tasks in Chinese vs. US papers

[Figure: red = more common in China, blue = more common in the US]

While US-based authors published more AI papers than their Chinese peers in 2022, China and Chinese institutions are growing their output at a faster rate

[Figure: number of papers published in 2022 and change vs. 2021, by country and by institution]

Chinese institutions have authored 4.5x as many papers as American institutions since 2010. The China-US AI research paper gap explodes if we include the Chinese-language database, China National Knowledge Infrastructure.

Section 2: Industry

NVIDIA's FY 2021 datacenter revenue came in at $10.6B. In Q4 2021, they recognised $3.26B, which on an annualised basis is greater than the combined valuation of the top-3 AI semiconductor startups. NVIDIA has over 3 million developers on their platform, and the company's latest H100 chip generation is expected to deliver 9x training performance vs. the A100. Meanwhile, revenue figures for Cerebras, SambaNova and Graphcore are not publicly available.

Do upstart AI chip companies still have a chance vs. NVIDIA's GPU?

[Figure: latest private valuations ($13B, $4B, $2.8B, $5.1B) vs. NVIDIA's annualised datacenter revenue]

GPUs are 131x more commonly used than ASICs, 90x more than chips from Graphcore, Habana, Cerebras, SambaNova and Cambricon combined, 78x more than Google's TPU, and 23x more than FPGAs.

NVIDIA's chips are the most popular in AI research papers, and by a massive margin

[Figure: citation counts on a log scale, spanning 23x to 78-131x]

The V100, released in 2017, is NVIDIA's workhorse chip, followed by the A100, released in 2020. The H100 is hotly awaited in 2022. Of the major AI chip challengers, Graphcore is cited most often.

For NVIDIA, the V100 is most popular, and Graphcore is most used amongst challengers

[Figure: number of AI papers citing use of specific NVIDIA cards and of specific AI chip startups]

Announced at $40B, NVIDIA's attempted acquisition of Arm fell through due to significant geopolitical and anti-competition pushback. Nonetheless, NVIDIA's enterprise value grew by $295B during the period (!)

NVIDIA fails to acquire Arm, and grows its revenue 2.5x and valuation 2x during the deal

[Figure: NVIDIA enterprise value between deal announcement and deal cancellation]

NVIDIA has been investing heavily in AI research, producing some of the best works in imaging over the years. For instance, their latest work on view synthesis just won the best paper award at SIGGRAPH, one of the most prestigious computer graphics conferences. But NVIDIA has now gone a step further and applied their reinforcement learning work to designing their next-generation AI chip, the H100 GPU.

NVIDIA reaps rewards from investing in AI research tying up hardware and software

The hyperscalers and challenger AI compute providers are tallying up major AI compute partnerships, notably Microsoft's $1B investment into OpenAI. We expect more to come.

David teaming up with Goliath: training large models requires compute partnerships

"We think the most benefits will go to whoever has the biggest computer." Greg Brockman, OpenAI CTO

In a gold rush for compute, companies build bigger than national supercomputers

[Figure: current NVIDIA A100 GPU counts and future (estimated) H100 GPU counts]

In 1962 the US government bought all integrated circuits in the world, supercharging the development of this technology and its end markets. Some governments are providing that opportunity again, as "buyers of first resort" for AI companies. With access to unique high-quality data, companies could gain an edge in building consumer or enterprise AI software.

The compounding effects of government contracting in AI

Researchers examined Chinese facial recognition AI companies and showed a causal relationship between the number of government contracts they signed and the cumulative amount of general AI software they produced. Unsurprisingly, leadership in the computer vision space has now largely been ceded to Chinese companies. The principle should hold in other heavily regulated sectors, like defence or healthcare, which build expertise through unique data that is transferable to everyday AI products.

Meta's release of the BlenderBot 3 chatbot for free public use in August 2022 was met with catastrophic press because the chatbot was spouting misinformation. Meanwhile, Google, which published a paper on their chatbot LaMDA in May 2021, had decided to keep the system in-house. But a few weeks after BlenderBot's release, Google announced a larger initiative called "AI Test Kitchen", where regular users will be able to interact with Google's latest AI agents, including LaMDA. Large-scale release of AI systems to the 1B+ users of Google and Facebook all but ensures that every ethics or safety issue with these systems will be surfaced, either by coincidence or by adversarial querying. But only by making these systems widely available can these companies fix those issues, understand user behaviour and create useful and profitable systems. Running away from this dilemma, 4 of the authors of the paper introducing LaMDA went on to found or join Character.AI, which describes itself as "an AI company creating revolutionary open-ended conversational applications". Watch this space.

How should big tech deal with their language model consumer
products?

Once considered untouchable, talent from Tier 1 AI labs is breaking loose and becoming entrepreneurial. Alums are working on AGI, AI safety, biotech, fintech, energy, dev tools and robotics. Others, such as Meta, are folding their centralised AI research group after letting it run free from product roadmap pressure for almost 10 years. Meta concluded that "while the centralized nature of the AI organization gave us leverage in some areas, it also made it a challenge to integrate as deeply as we would hope."

DeepMind and OpenAI alums form new startups and Meta disbands its core AI group

All but one author of the landmark paper that introduced transformer-based neural networks have left Google to build their own startups in AGI, conversational agents, AI-first biotech and blockchain.

Attention is all you need to build your AI startup

[Figure: amounts raised in 2022: $580M, $225M, $125M, $65M]

OpenAI's Codex quickly evolved from research (July 2021) to open commercialization (June 2022), with (Microsoft's) GitHub Copilot now publicly available for $10/month or $100/year. Amazon followed suit by announcing CodeWhisperer in preview in June 2022. Google revealed that it was using an internal ML-powered code completion tool (so maybe in a few years in a browser IDE?). Meanwhile, with its 1M+ users, Tabnine raised $15M, promising accurate multiline code completions.

AI coding assistants are deployed fast, with early signs of developer productivity gains

Metrics for Google's coding assistant (users are 10k+ Google-internal developers, 5k+ for the multi-line experiments):
- Fraction of code added by ML: 2.6% single-line, 0.6% multi-line
- Average characters per accept: 21 single-line, 73 multi-line
- Acceptance rate (for suggestions visible for 750ms): 25% single-line, 34% multi-line
- Reduction in coding iteration duration: 6% single-line

And many more assets are in early discovery stages. We expect early clinical trial readouts from 2023 onwards.

AI-first drug discovery companies have 18 assets in clinical trials, up from 0 in 2020

[Figure: number of assets per pipeline stage per company, and % of assets per pipeline stage overall; updated as of 26 Aug 2022]

A study of 6,151 successful phase transitions between 2011 and 2020 found that it takes 10.5 years on average for a drug to achieve regulatory approval. This includes 2.3 years at Phase I, 3.6 years at Phase II, 3.3 years at Phase III, and 1.3 years at the regulatory stage. What's more, it costs $6.5k on average to recruit one patient into a clinical trial. With 30% of patients eventually dropping out due to non-compliance, the fully-loaded recruitment cost is closer to $19.5k/patient. While AI promises better drugs faster, we need to solve for the physical bottlenecks of clinical trials today.

Can AI and compute bend the physical reality of clinical trial chokepoints?

[Figure: number of registered studies (ClinicalTrials.gov, end of year) and stepwise probability of drug success]

A large pre-trained protein language model was trained on viral spike protein sequences of variants. New spike protein variants are fed to a transformer that outputs embeddings and a probability distribution over the 20 natural amino acids at each position, to determine how the variant would affect immune escape and fitness. The red dashed line indicates the date when the
EWS predicted the variant would be high-risk, and the green dash-dot line is when the WHO designated the variant. In almost all cases, the EWS alerted several months before the WHO designation.

mRNA vaccine leader BioNTech and enterprise AI company InstaDeep collaboratively built and validated an Early Warning System (EWS) to predict high-risk variants. The EWS could identify all 16 WHO-designated variants on average more than one and a half months prior to their officially receiving the designation.

Predicting the evolution of real-world COVID variants using language models

Due to a shortage of radiologists and an increasing volume of imaging, the diagnostic task of assessing which X-rays contain disease and which don't is challenging. Oxipit's ChestLink is a computer vision system tasked with identifying scans that are normal. The system is trained on over a million diverse images. Lithuanian startup Oxipit received the industry's first autonomous certification for their computer vision-based diagnostic. The system autonomously reports on chest X-rays that feature no abnormalities, removing the need for radiologists to look at them.

The first regulatory approval for an autonomous AI-first medical imaging diagnostic

In a retrospective study of 10,000 chest X-rays of Finnish primary health care patients, the AI achieved a sensitivity of 99.8% and a specificity of 36.4% for recognising clinically significant pathology on a chest X-ray. As such, the AI could reliably remove 36.4% of normal chest X-rays from a primary health care population data set with a minimal number of false negatives, leading to effectively no compromise on patient safety and a potentially significant reduction of workload.

Universities are a hotbed for AI spinouts: the UK case study

Universities are an important source of AI companies, including Databricks, Snorkel, SambaNova, Exscientia and more. In the UK, 4.3% of UK AI companies are university spinouts, compared to 0.03% for all UK companies. AI is indeed among the most represented sectors for spinout formation. But this comes at a steep price: Technology Transfer Offices (TTOs) often negotiate spinout deal terms which are unfavourable to founders, e.g. a high equity share in the company or royalties on sales.

Spinout.fyi: an open database to help founders and policymakers fix the spinout problem

Spinout.fyi crowdsourced a database of spinout deal terms from founders representing 70 universities all over the world. The database spans AI and non-AI companies across different product categories (software, hardware, medical, materials, etc.), and shows that the UK situation, while particularly discouraging for founders, isn't isolated. Only a few regions stand out as founder-friendly, like the Nordics and Switzerland (ETH Zürich in particular). A major reason for the current situation is the information asymmetry between founders and TTOs, and the Spinout.fyi database aims to give founders a leg up in the process.

As 5-year programmes at Berkeley and Stanford wrap up, what comes next?

In 2011, UC Berkeley launched the "Algorithms, Machines, and People" lab (AMPLab) as a 5-year collaborative research agenda amongst professors and students, supported by research agencies and companies. The program famously developed the critical Big Data technology Spark (spun out as Databricks), as well as Mesos (spun out as Mesosphere). This hugely successful program was followed in 2017 by the "Real-time Intelligence Secure Explainable systems" lab (RISELab) at Berkeley and "Data Analytics for What's Next" (DAWN) at Stanford, which focused on AI technologies. RISELab created the Ray ML workload manager (spun out as Anyscale), and DAWN created and spun out the Snorkel active labelling platform. Will other universities and countries learn from the successes of the 5-year model to fund ambitious open-source research with high spinout potential?

[Figure: lab names, OSS projects created and spinouts that emerged, 2011-16 and 2017-22, with a $38B valuation, $250M raised, and two $1B valuations among the spinouts]

In 2022, investment in startups using AI has slowed down along with the broader market

Private companies using AI are expected to raise
200、36%less money in 2022*vs.last year,but are still on track to exceed the 2020 level.This is comparable with the investment in all startups&scaleups worldwide.$100B$50B$150B0$250M+$100-250M$40-100M(Series C)$15-40M(Series B)$4-15M(Series A)$1-4M(Seed)$0-1M(Pre-Seed)$111.4B$70.9B*2021202020192018201720
201、162015201420132012201120102022 YTDWorldwide investment in startups&scaleups using AI by round size view online$600B$400B$800B0$250M+$100-250M$40-100M(Series C)$15-40M(Series B)$4-15M(Series A)$1-4M(Seed)$0-1M(Pre-Seed)2021202020192018201720162015201420132012201120102022 YTDWorldwide investment in al
202、l startups&scaleupsby round size view online$726.5B$399.7B$553.2B*$47.5B$200B$351.7B$69.3B-36%-24%Introduction|Research|Industry|Politics|Safety|Predictions#stateofai|70In this slide,startups&scaleups using AI include both startups&scaleups with AI-first and AI-enabled products and solutions.(*)esti
203、mated amount to be raised by the end of 2022The drop in investment is most noticeable in megaroundsstateof.ai 2022The drop in VC investment is most noticeable in 100M+rounds,whereas smaller rounds are expected to amount to$30.9B worldwide by the end of 2022,which is almost on track with the 2021 lev
204、el.$40B$20B$80B0$250M+$100-250M2021202020192018201720162015201420132012201120102022 YTDWorldwide investment in startups&scaleups using AI by round size view online$30B$20B$40B0$40-100M(Series C)$15-40M(Series B)$4-15M(Series A)$1-4M(Seed)$0-1M(Pre-Seed)20212020201920182017201620152014201320122011201
205、02022 YTDWorldwide investment in startups&scaleups using AI by round size view online$10B$77.5B$34.9B*$25.2B$44.4B$60B$33.9B$24.9B$22.3B$30.9B*-55%-9%Introduction|Research|Industry|Politics|Safety|Predictions#stateofai|71In this slide,startups&scaleups using AI include both startups&scaleups with AI
206、-first and AI-enabled products and solutions.(*)estimated amount to be raised by the end of 2022Public valuations have dropped in 2022,while private keep growingstateof.ai 2022Combined public enterprise value(EV)has dropped to the 2020 level.Meantime,private valuations keep growing,with the combined
207、 EV already reaching$2.2T,up 16%from last year.$4.0T$2.0T$8.0T0 2021202020192018201720162015201420132012201120102022 YTDCombined EV of public startups&scaleups using AI by launch year;worldwide view online$1.5T$1.0T 02021202020192018201720162015201420132012201120102022 YTDCombined EV of privately ow
208、ned startups&scaleups using AI by launch year;worldwide view online$500B$6.0T$2.0T$2.5T$10.0T$9.6T$6.8T$6.8T$2.2T$1.4T$1.9T 2015-2022 YTD 2010-2014 2005-2009 2000-2004 1995-1999 1990-1994 2015-2022 YTD 2010-2014 2005-2009 2000-2004 1995-1999 1990-1994-29%+16%Introduction|Research|Industry|Politics|S
209、afety|Predictions#stateofai|72In this slide,startups&scaleups using AI include both startups&scaleups with AI-first and AI-enabled products and solutions.(*)estimated amount to be raised by the end of 2022The US leads by the number of AI unicorns,followed by China&the UKstateof.ai 2021 Introduction|
Countries with the largest number of AI unicorns (number of unicorns; combined enterprise value, 2022 YTD):
- United States: 292; $4.6T
- China: 69; $1.4T
- United Kingdom: 24; $207B
- Israel: 14; $53B
- Germany: 10; $56B
- Canada: 7; $12B
- Singapore: 6; $39B
- Switzerland: 6; $14B

The US has created 292 AI unicorns, with a combined enterprise value of $4.6T.

Investment in the USA accounts for more than half of worldwide VC in AI

Despite a significant drop in investment in US-based startups & scaleups using AI, they still account for more than half of AI investment worldwide: 53% in 2022 YTD vs. 57% in 2021.

[Chart: amount invested in companies using AI by region (USA; China; United Kingdom; EU-27, Switzerland & Norway; rest of the world), 2010-2022 YTD. US investment fell from $63.6B in 2021 to $25.1B in 2022 YTD.]

Enterprise software is the most invested category globally, while robotics captures the largest share of VC investment into AI

[Table: amount invested in startups & scaleups using AI by industry, showing 2010-2022 YTD totals, 2021-2022 YTD totals, AI's share of all VC, and the number of rounds. Enterprise software leads on cumulative investment ($133.7B since 2010); robotics has the highest AI share of VC investment, at 71%. Categories range from fintech, transportation, health and semiconductors down to gaming, home living, jobs recruitment, legal, sports, music, fashion, hosting, wellness & beauty, event tech, kids and dating.]

A widening compute chasm is separating industry from academia in large model AI

…300,000x in the last decade. Over the same period, the share of these projects run by academics has plummeted from 60% to almost 0%. If the AI community is to continue scaling models, this chasm of "haves" and "have nots" creates significant challenges for AI safety, the pursuit of diverse ideas, talent concentration, and more.

Slow progress in providing academics with more compute leaves others to act faster

There is a growing appreciation that AI is an engineering science in which the objects of study must first be built. Western academics and governments are starting to wake up to this reality, most notably through the National AI Research Resource (NAIRR) process in the US. But while they spend years on consultations and marketing, others in China and outside academia are finding creative ways to do large-scale AI projects.

[Timeline: government/academia milestones (National AI Initiative Act enacted, Jan 2021; Stanford opens the Center for Research on Foundation Models; National AI Research Resource Task Force created, with meetings 1-10 through 2022 and a final report due Dec 2022) alongside research-collective releases (The Pile, GPT-Neo, GPT-J-6B and GPT-NeoX-20B from Eleuther; BLOOM from BigScience; Stable Diffusion from Stability; GLM-130B from Tsinghua).]

The baton is passing from academia to decentralized research collectives

Decentralized research projects are gaining members, funding and momentum. They are succeeding at ambitious large-scale model and data projects that were previously thought to be possible only in large centralised technology companies, most visibly demonstrated by the public release of Stable Diffusion.

- The most notable large-scale academic project this year came from China: Tsinghua's GLM-130B LLM.
- Eleuther, the original AI research collective, released the 20B parameter GPT-NeoX. However, core members have since moved on to OpenAI, Stability and Conjecture.
- Hugging Face led the BigScience initiative, releasing the 176B parameter BLOOM multilingual LLM.
- Stability came out of nowhere, obtained 4,000 A100 GPUs, brought together multiple open-source communities and created Stable Diffusion.

(Graph data source: Sevilla et al., Parameter, Compute and Data Trends in Machine Learning.)

Stability has embedded itself as a compute platform for independent and academic open-source AI communities: supporting LAION in building a dataset of 5B image-text pairs and training an open-source CLIP model, and supporting the CompVis group's research
in efficient diffusion models. It funds PhD students to work on community projects, and has directly hired generative AI artists, core members of Eleuther, and renowned ML researchers such as David Ha. Stable Diffusion cost $600K to train, and while the weights were released, access is also sold through the DreamStudio API. Where large-scale projects previously depended on ad-hoc compute donations, Stability is pioneering a new approach of structured compute and resource provision for open-source communities, while also commercializing these projects with revenue-sharing for developers. In doing so, Stability AI is attempting a new paradigm in commercializable open-source AI.

AI continues to be infused into a greater number of defense product categories

Defense technology companies are applying AI to electronic warfare and geospatial sensor fusion, and to create autonomous hardware platforms.

- Epirus, founded in 2018, has built a next-generation electromagnetic pulse weapon capable of defeating the swarms of drones that pose threats to human safety.
- Sweden's Saab is also working towards AI-driven automation of electronic warfare: it built the COMINT and C-ESM sensors to balance automated and operator-controlled surveillance depending on the context in the field. The company is also collaborating with the defense startup Helsing.
- Modern Intelligence, founded in 2020, builds platform-independent AI for geospatial sensor data fusion, situational awareness and maritime surveillance.
- Meanwhile, through both organic and inorganic growth, Anduril has expanded its autonomous hardware platforms. It acquired Area-I to launch a new product in air-launched effects with an increased payload and better data sharing and integration with other UAVs, and expanded into underwater autonomous vehicles by acquiring Dive Technologies.

AI in defense gathers big funding momentum

Heavily funded start-ups, together with Amazon, Microsoft and Google, continue to normalise the use of AI in defense.

- NATO published its AI Strategy and announced a $1B fund to invest in companies working on a range of dual-use technologies, described as the world's first multi-sovereign venture capital fund, spanning 22 nations.
- Helsing, a European defense AI company, announced a €102.5M Series A led by Daniel Ek of Spotify.
- Microsoft, Amazon and Google continue to compete for a major role in defense; most notably, Microsoft's $10B contract with the Pentagon was cancelled after a lawsuit from Amazon. The new beneficiaries of the contract will be announced in late 2022.
- Anduril landed its largest DoD contract to date and is now reportedly valued at $7B. Shield AI, a developer of military drones, raised at a $2.3B valuation.

Ukraine's homegrown geospatial intelligence software GIS Arta is a sign of things to come

GIS Arta is a homegrown application developed prior to Russia's invasion, based on lessons learned from the conflict in the Donbas. It is a guidance, command and control system for drone, artillery or mortar strikes: the app ingests various forms of intelligence (from drones, GPS, forward observers etc.) and converts it into dispatch requests for reconnaissance and artillery. GIS Arta was allegedly developed by a volunteer team of software developers led by Yaroslav Sherstyuk, inspired by the Uber taxi model. The use of geospatial (GIS) software has reportedly reduced the decision chain around artillery from 20 minutes to under one minute.

The Great Reshoring will be slow: the US lags in new fab projects, which take years to build

Between 1990 and 2020, China accelerated its output of greenfield fab projects by almost 7x while the US slowed down by 2.5x. Moreover, while fabs in China and Taiwan take roughly 650 days from construction start to being production-ready, the US builds fabs 42% slower today than it did 30 years ago.

[Charts: total number of greenfield fab projects; average number of days from build start to production.]

The US CHIPS and Science Act of 2022: $250B for US semiconductor R&D and production

The bill poses a dilemma for Korean (e.g. Samsung), Taiwanese (e.g. TSMC) and other manufacturers: if they accept US subsidies, they must pivot away from China without provoking a backlash from Beijing, which is opposed to this "friendshoring". Since the bill passed, Micron has announced a $40B investment in memory chip manufacturing to increase its US market share from 2% to 10%, and Qualcomm will expand its US semiconductor production by 50% by 2027, investing $4.2B in partnership with GlobalFoundries to expand the latter's upstate New York facility. CSET estimates that the US should focus its manufacturing capabilities on leading-edge, legacy logic and DRAM (right chart).

The bipartisan legislation was signed into law in August 2022. It provides $53B to boost US-based semiconductor R&D, workforce and manufacturing, as well as a 25% investment tax credit for semiconductor manufacturers' capital expenses. In exchange, recipients must not upgrade or expand their existing operations in China for
10 years, nor can they use funds for share buybacks or to issue dividends.

The US cuts China off from NVIDIA and AMD chips: will this spur Chinese AI R&D?

Earlier this year, CSET analysed 24 public contracts awarded by Chinese PLA units and state-owned defense enterprises in 2020. It found that nearly all of the 97 AI chips in these purchase orders were designed by NVIDIA, AMD, Intel and Microsemi; domestic AI chip companies did not feature. American chips are thus arming Chinese defense capabilities. Chinese semiconductor manufacturers have already been cut off from the advanced lithography machines made by ASML and from related equipment from Lam Research and Applied Materials. It is unlikely that domestic AI chip companies (e.g. Biren) can fill the void: leading-edge node manufacturing is still only possible at TSMC in Taiwan, and domestic talent, software and technology remain years behind NVIDIA's. Even so, China will still accelerate its development.

NVIDIA GPUs are used by all major Chinese technology companies (Baidu, Tencent et al.) and universities (Tsinghua, the Chinese Academy of Sciences et al.). Washington ordered NVIDIA and AMD to stop exporting their latest AI chips (e.g. the NVIDIA A100 and H100, and the AMD MI100 and MI200) to China as a means of curbing their use in applications that threaten American national security. The companies will have to provide statistics on previous shipments and customer lists. Not having access to state-of-the-art AI chips could stall a large swath of Chinese industry if domestic suppliers don't step into the void, and fast.

The EU advances with its plans to regulate AI

In April 2021, the EU tabled a proposal for regulating the placement on the market and use in the EU of AI systems (the "AI Act"). The proposal introduces certain minimum requirements (mainly information obligations) that all AI systems in use must meet, and more elaborate requirements (e.g. risk assessments, quality management) for AI systems that pose higher risks to users. The AI Act bans the use of certain types of AI-based techniques altogether (e.g. social scoring, real-time remote biometric identification (subject to exceptions), and "subliminal techniques").

The AI Act is now moving through the EU legislative process. The European Parliament worked over the summer on a compromise text to address tabled amendments and the opinions of the Parliament's various committees. The compromise text is scheduled to go through the various stages of the voting process at the European Parliament by the end of 2022, and the AI Act is expected to be voted into law in 2023, under either the Swedish or the Spanish Presidency of the EU. Current realistic expectations are that it will become effective in the second half of 2023. (Source: Dessislava Fessenko)

The EU aims at quick operationalization of the AI Act

The EU aims at quick operationalization of the requirements under the AI Act through standardization, the setting up of testing facilities, and the launch of pan-European and national regulatory sandboxes.

- European standardization efforts are already underway. The EU standardization organizations CEN and CENELEC have commenced preparatory work and expect to be requested to develop the relevant sets of standards by 31 October 2024.
- The EU appears to favor testing of high-risk AI systems, in controlled or possibly even real-world conditions, as a suitable way of supporting and promoting compliance with the AI Act among businesses of all sizes.
- Pan-European and national regulatory sandboxes are starting to emerge. Spain launched the first one in June 2022, and other member states (e.g. the Czech Republic) have announced similar plans. Sandboxes are considered by EU regulators to be suitable testbeds for technical, policy and standardization solutions, and are also intended as a medium for supporting small and medium-sized businesses in particular in attaining compliance with the AI Act. (Source: Dessislava Fessenko)

Section 4: Safety

While AI advances rapidly, the safety of highly-capable future systems remains unclear

While many concerns still appear speculative, early AI pioneers considered that highly capable and economically integrated AI systems of the future could fail catastrophically and pose a risk to humanity, including through the emergence of behaviours directly opposed to human oversight and control.

Alan Turing (1951): "…it seems probable that
once the machine thinking method has started, it would not take long to outstrip our feeble powers. At some stage therefore we should have to expect the machines to take control."

I.J. Good (1965): "Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control."

Marvin Minsky (1984): "The problem is that, with such powerful machines, it would require but the slightest accident of careless design for them to place their goals ahead of ours."

The UK is taking the lead on acknowledging these uncertain but catastrophic risks

The UK's national strategy for AI, published in late 2021, notably made multiple references to AI safety and the long-term risks posed by misaligned AGI:

- "While the emergence of Artificial General Intelligence (AGI) may seem like a science fiction concept, concern about AI safety and non-human-aligned systems is by no means restricted to the fringes of the field."
- "We take the firm stance that it is critical to watch the evolution of the technology, to take seriously the possibility of AGI and more general AI, and to actively direct the technology in a peaceful, human-aligned direction."
- "The government takes the long term risk of non-aligned AGI, and the unforeseeable changes that it would mean for the UK and the world, seriously."
- "We must establish medium and long term horizon scanning functions to increase government's awareness of AI safety."
- "We must work with national security, defence, and leading researchers to understand how to anticipate and prevent catastrophic risks."

AI researchers increasingly believe that AI safety is a serious concern

Long dismissed as science fiction by mainstream AI research and academia, researchers are now shifting consensus towards greater concern for the risks of human-level AI and superhuman AGI in the near future. A survey of the ML community found that 69% believe AI safety should be prioritized more than it currently is. A separate survey of the NLP community found that a majority believe AGI is an important concern that we are making progress towards; over 70% believe AI will lead to social change at the level of the Industrial Revolution this century, and nearly 40% believe AI could cause a catastrophe as bad as nuclear war during that time.

[Chart note: the number in green represents the fraction of respondents who agree with the position, out of all those who took a side; the number in black shows the average predicted rate of agreement.]

AI safety is attracting more talent yet remains extremely neglected

Increased awareness of AI existential risk is leading to increased headcount, with an estimated 300 researchers now working full-time on AI safety. However, this is still orders of magnitude fewer than are working in the broader field, which itself is growing faster than ever (right chart: researchers by venue/field, a roughly 30x gap).

- New non-profit research labs include the Center for AI Safety and the Fund for Alignment Research. The Centre for the Governance of AI was spun out of Oxford's Future of Humanity Institute as an independent organization.
- There was a huge increase in interest in education programmes, with over 750 people taking part in the online AGI Safety Fundamentals course. New scholarships were created, including the Vitalik Buterin PhD Fellowship in AI Existential Safety.
- Notably, Ilya Sutskever, OpenAI's Chief Scientist, has shifted to spending 50% of his time on safety research.

Funding secured, though trailing far behind what goes into capabilities

Increased awareness of AI existential risk has also led to rapidly increasing funding for research into the safety of highly-capable systems, primarily through donations and investments from sympathetic tech billionaires Dustin Moskovitz (Open Philanthropy) and Sam Bankman-Fried (FTX Foundation). However, total VC and philanthropic safety funding still trails far behind the resources going into advanced capabilities research, not even matching DeepMind's 2018 opex. [Chart: philanthropic AI safety funding pales in comparison to AI capabilities funding. *We include fundraises for Adept, Hugging Face, Cohere, AI21, Stability and Inflection under Capabilities VC, and fundraises for Anthropic under Safety VC.]

RLHF has emerged as a key method to fine-tune LLMs and align them with human values. This involves humans ranking language model outputs sampled for a given input, using these
rankings to learn a reward model of human preferences, and then using this reward model as a reward signal to fine-tune the language model with RL. OpenAI started the year by fine-tuning GPT-3 with RLHF to produce InstructGPT models that improved helpfulness on instruction-following tasks. Notably, the fine-tuning only needed …

Predictions

… $1B into an AGI or open source AI company (e.g. OpenAI). 6. Reality bites for semiconductor startups in the face of NVIDIA's dominance, and a high-profile start-up is shut down or acquired for … $100M is invested in dedicated AI Alignment organisations in the next year, as more people become aware of the risk we are facing by letting AI capabilities run ahead of safety. 9. A major user-generated content site (e.g. Reddit) negotiates a commercial settlement with a start-up producing AI models (e.g. OpenAI) for training on their corpus of user-generated content.
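The reward-modelling step in the RLHF recipe described above (humans rank sampled outputs, and a reward model is fit to those rankings before RL fine-tuning) can be sketched with a toy example. Everything here is illustrative: the linear model and hand-made feature vectors are hypothetical stand-ins for a transformer LM and its embeddings, though the pairwise Bradley-Terry loss is the standard choice for learning from preference pairs.

```python
# Toy sketch of the reward-modelling step in RLHF (illustrative only).
import math

def reward(w, features):
    # Linear reward model: r = w . phi(prompt, completion)
    return sum(wi * fi for wi, fi in zip(w, features))

def train_reward_model(pairs, dim, lr=0.1, steps=200):
    # Bradley-Terry objective on human-ranked (chosen, rejected) pairs:
    # minimise -log sigmoid(r(chosen) - r(rejected)) by gradient descent.
    w = [0.0] * dim
    for _ in range(steps):
        for chosen, rejected in pairs:
            margin = reward(w, chosen) - reward(w, rejected)
            sig = 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                w[i] += lr * (1.0 - sig) * (chosen[i] - rejected[i])
    return w

# Hypothetical preference data: each pair holds the features of the
# completion a human ranked higher, then the one ranked lower.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.2], [0.1, 0.8]),
]
w = train_reward_model(pairs, dim=2)
# The learned reward now prefers the kind of output humans ranked higher.
assert reward(w, [1.0, 0.0]) > reward(w, [0.0, 1.0])
```

In a full pipeline, this learned reward model scores fresh samples from the language model, and that score is used as the reward signal for an RL algorithm such as PPO.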
Thanks!

Congratulations on making it to the end of the State of AI Report 2022! Thanks for reading.

In this report, we set out to capture a snapshot of the exponential progress in the field of artificial intelligence, with a focus on developments since last year's issue, published on 12 October 2021. We believe that AI will be a force multiplier on technological progress in our world, and that wider understanding of the field is critical if we are to navigate such a huge transition. We aimed to compile everything that caught our attention in the last year across AI research, industry, politics and safety. We would appreciate any and all feedback on how we could improve this report further, as well as suggested contributions for next year's edition.

Thanks again for reading!

Nathan Benaich (@nathanbenaich), Ian Hogarth (@soundboy), Othmane Sebbouh (@osebbouh) and Nitarshan Rajkumar (@nitarshan)

Reviewers

We'd like to thank the following individuals for providing critical review of this year's Report:
- Andrej Karpathy
- Moritz Mueller-Freitag
- Shubho Sengupta
- Miles Brundage
- Markus Anderljung
- Elena Samuylova

Conflicts of interest

The authors declare a number of conflicts of interest as a result of being investors and/or advisors, personally or via funds, in a number of private and public companies whose work is cited in this report. Ian is an angel investor in the following companies mentioned in this report: Anthropic and Helsing AI. Nathan is an investor in the following companies: …

About the authors

Othmane Sebbouh, Research Assistant. Othmane is a PhD student in ML at ENS Paris, CREST-ENSAE and CNRS. He holds an MSc in management from ESSEC Business School and a Master's in Applied Mathematics from ENSAE and École Polytechnique.

Nitarshan Rajkumar, Research Assistant. Nitarshan is a PhD student in AI at the University of Cambridge. He was a research student at Mila and a software engineer at Airbnb. He holds a BSc from the University of Waterloo.