Best Practices of TensorRT ONNX Parser
WANG Meng, NVIDIA, 2020/12

OUTLINE
- ONNX Introduction
- TF2ONNX Introduction
- TensorRT ONNX Parser
- Optimization
- Refit
- Summary

ONNX INTRODUCTION
ONNX: Open Neural Network Exchange
- ONNX is an open and interoperable format for ML models, sitting between training frameworks and deployment targets.
- For users: freedom to use the tool(s) of choice, as long as they are compatible with the ONNX format.
- For hardware vendors: focus innovation on NN optimizations for a single interoperable format instead of many.

ONNX Overview
- ONNX is an open specification that consists of the following components:
  - A definition of an extensible computation graph model.
  - Definitions of standard data types.
  - Definitions of built-in operators.
- Operator sets define the available built-in operators and their versions (currently 6-12).
- The newest operator set supports around 160 operators.

Intermediate Representation
- Model is the top-level ONNX construct, represented in protocol buffers as the type onnx.ModelProto. A model consists of a graph and associated metadata.
- Graph defines the computational logic of a model and contains a list of nodes that form a directed acyclic graph based on their inputs and outputs. The nodes in the graph are sorted topologically.
- Edges in the computation graph are established by the outputs of one node being referenced by name in the inputs of a subsequent node.
- Nodes are comprised of a name, the name of the operator they invoke (e.g. Conv, Relu), a list of named inputs, a list of named outputs, and a list of attributes.
- All node output names MUST be unique within a graph.

Structure of onnx.proto3
onnx.proto3 is a general network definition protobuf:

```proto
message ModelProto {
  GraphProto graph = 7;
  repeated OperatorSetIdProto opset_import = 8;
}

message GraphProto {
  repeated NodeProto node = 1;
  string name = 2;
  repeated TensorProto initializer = 5;
  repeated ValueInfoProto input = 11;
  repeated ValueInfoProto output = 12;
}

message NodeProto {
  repeated string input = 1;
  repeated string output = 2;
  string name = 3;
  string op_type = 4;   // namespace: Operator
  string domain = 7;    // namespace: Domain
  repeated AttributeProto attribute = 5;
}
```

TF2ONNX INTRODUCTION
Convert TensorFlow models to ONNX
- Requires Python >= 3.6.
- tf2onnx supports ONNX opset-6 to opset-12.
- tf2onnx supports nearly 200 kinds of op. For unsupported ops, --continue_on_error is recommended so that an ONNX model file is still produced.
- A higher opset such as 11 or 12 is recommended.

Experience Share
- Usage example:

```shell
python3 -m tf2onnx.convert --input onnx_model/model.pb --output onnx_model/model.onnx \
    --verbose --opset=11 --continue_on_error \
    --inputs GRAPH_INPUTS --outputs GRAPH_OUTPUTS
```

- --inputs and --outputs are the lists of input/output node names in the graph, in the format node_name:port_id, typically like input0:0. If some input nodes are not actually used, remove them from --inputs. All output names must be unique within a graph, so prepare your graph input and output node names carefully.
- --verbose summarizes abundant information, such as the type and number of operators and the graph optimization results.
- --continue_on_error allows unsupported operators and preserves all the information.
- The tf2onnx optimizer might change the graph undesirably, e.g. removing an Add node whose bias is zero-initialized. A fully trained pb model is therefore preferred.
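The --verbose summary reports operator types and counts; doing the same tally yourself makes it easy to spot ops a downstream parser cannot import before you try to build an engine. A minimal sketch in plain Python — the op list and the supported set are made up for illustration (in practice the op types come from model.graph.node via the onnx package, and the real supported list lives in the parser):

```python
from collections import Counter

# Hypothetical op_type list as it might be read from model.graph.node;
# hard-coded here to stay self-contained.
op_types = ["Conv", "Relu", "Conv", "Relu", "OneHot", "Resize", "Conv"]

# Hypothetical subset of ops the target parser handles.
SUPPORTED = {"Conv", "Relu", "Resize"}

counts = Counter(op_types)
unsupported = {op: n for op, n in counts.items() if op not in SUPPORTED}

print(counts.most_common())  # [('Conv', 3), ('Relu', 2), ('OneHot', 1), ('Resize', 1)]
print(unsupported)           # {'OneHot': 1}
```

Anything that lands in `unsupported` is a candidate for a plugin or a graph rewrite, as discussed in the following sections.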
TENSORRT ONNX PARSER
Convert ONNX model to TensorRT
- The parser sources include builtin_op_importers.cpp/.hpp, ModelImporter.cpp/.hpp, NvOnnxParser.cpp/.h, onnx2trt.hpp, onnx2trt_runtime.hpp, onnx2trt_utils.cpp/.hpp, OnnxAttrs.cpp/.hpp, onnx_trt_backend.cpp, onnx_utils.hpp, ShapedWeights.cpp/.hpp, ShapeTensor.cpp/.hpp, Status.hpp, TensorOrWeights.hpp, toposort.hpp, trt_utils.hpp, utils.hpp.
- The parser supports more than one hundred OPs. Unsupported OPs can be implemented with TensorRT plugins.
- TensorRT also imports models from TensorFlow and Caffe; the network definition is built into an optimized engine (via C++ API, Python API or MATLAB) that can be serialized and deserialized at runtime.

Parse the ONNX model file and populate the TensorRT network
1. Create builder: createInferBuilder(gLogger) creates an IBuilder with gLogger as the input argument.
2. Create network: createNetwork() of IBuilder creates the INetworkDefinition.
3. Create parser and parse the imported model: parser = nvonnxparser::createParser(network, gLogger); parse() reads the model file and populates the TensorRT network, with the model as input and the network as output.
4. Build engine: in the optimization phase, buildCudaEngine() of IBuilder is called to create the ICudaEngine from the network.

Usage of Python API
- Usage example:

```python
# Create builder, network and parser
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network(EXPLICIT_BATCH) as network, \
     trt.OnnxParser(network, TRT_LOGGER) as parser:
    # Parse the ONNX model and build the engine
    with open("model.onnx", "rb") as model:
        parser.parse(model.read())
    engine = builder.build_cuda_engine(network)
```

- Inside the parser, every built-in op has an importer function in builtin_op_importers.cpp:

```cpp
#define DEFINE_BUILTIN_OP_IMPORTER(op)                                 \
    NodeImportResult import##op(IImporterContext* ctx,                 \
                                ONNX_NAMESPACE::NodeProto const& node, \
                                std::vector<TensorOrWeights>& inputs)

// e.g. Relu maps onto a TensorRT activation layer:
//   auto* layer = ctx->network()->addActivation(input, nvinfer1::ActivationType::kRELU);
//   return {{layer->getOutput(0)}};
```

- ONNX type constraints for Relu: T: tensor(float16), tensor(float), tensor(double) — input and output types are constrained to float tensors.

How to Support New Layers/OPs?
Solution 1: modify TensorRT OSS (parsers and plugins)
- Implement TensorRT plugins.
- Add the plugins to the main TensorRT repository.
- Add a specific importer function DEFINE_BUILTIN_OP_IMPORTER(plugin) in builtin_op_importers.cpp.
- Build TensorRT-OSS.

OneHot in ONNX
- Produces a one-hot tensor based on inputs.
- Inputs:
  - indices: input tensor containing indices.
  - depth: scalar specifying the number of classes in the one-hot tensor.
  - values: rank-1 tensor of [off_value, on_value], like [0, 1].
- Attributes:
  - axis: the axis along which the one-hot representation is added; default -1.

[Figure: OneHot node properties as shown in a graph visualizer — type OneHot, with the depth input fed by an int32 initializer.]
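Before wiring up a plugin, it helps to pin down the semantics the kernel must reproduce. A plain-Python sketch of the OneHot behaviour described above, for a flat index list with the default axis=-1 (handling of negative or out-of-range indices is omitted; this is an illustration, not the plugin kernel):

```python
def one_hot(indices, depth, values=(0, 1)):
    """One-hot encode a flat list of indices along a new last axis
    (the ONNX default axis=-1), using values = (off_value, on_value)."""
    off, on = values
    return [[on if j == i else off for j in range(depth)] for i in indices]

print(one_hot([0, 2, 1], depth=3))
# [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

The CUDA kernel in the plugin below has to compute exactly this mapping, with depth passed in as a plugin field.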
Develop TensorRT Plugin (OneHot)
- A custom layer is implemented by extending the class IPluginCreator and one of TensorRT's base classes for plugins:

Table 1. Base classes, ordered from least expressive to most expressive

| Base class          | Introduced in TensorRT version | Mixed input/output formats/types | Dynamic shapes? |
|---------------------|--------------------------------|----------------------------------|-----------------|
| IPluginV2Ext        | 5.1                            | Limited                          | No              |
| IPluginV2IOExt      | 6.0.1                          | General                          | No              |
| IPluginV2DynamicExt | 6.0.1                          | General                          | Yes             |

- Of these interfaces, we recommend IPluginV2IOExt if you do not need to support dynamic shapes; otherwise use IPluginV2DynamicExt.

- ./OnehotPlugin/OnehotPlugin.h:

```cpp
class onehotEncoder : public IPluginV2IOExt
{
    // ...implement all plugin methods here
};

class onehotEncoderCreator : public IPluginCreator
{
    // ...implement all creator methods here
};
```

- ./OnehotPlugin/OnehotPlugin.cu:

```cpp
int onehotEncoder::enqueue(int batchSize, const void* const* inputs, void** outputs,
                           void* workspace, cudaStream_t stream)
{
    // provide parameters and call onehotEncoderKernel
}

REGISTER_TENSORRT_PLUGIN(onehotEncoderCreator);
```

- ./OnehotPlugin/CMakeLists.txt:

```cmake
file(GLOB SRCS *.cu)
```

Add plugin to the main TensorRT repository (OneHot)
- Add a new folder OnehotPlugin with the source code under the $TRT/plugins directory: OnehotPlugin/CMakeLists.txt, OnehotPlugin.cu, OnehotPlugin.h.
- Add the folder to $TRT/plugins/CMakeLists.txt:

```cmake
set(PLUGIN_LISTS
    ...
    OnehotPlugin)
```

Add specific importer function DEFINE_BUILTIN_OP_IMPORTER(OneHot)

```cpp
DEFINE_BUILTIN_OP_IMPORTER(OneHot)
{
    nvinfer1::ITensor* indices = &convertToTensor(inputs.at(0), ctx);
    auto weight = inputs.at(1).weights();
    int depth = static_cast<int*>(weight.values)[0];
    OnnxAttrs attrs(node, ctx);

    // Populate OneHot plugin properties.
    const std::string pluginName = "onehotEncoder";
    const std::string pluginVersion = "1";
    std::vector<nvinfer1::PluginField> f;
    f.emplace_back("depth", &depth, nvinfer1::PluginFieldType::kINT32, 1);

    nvinfer1::IPluginV2* plugin = createPlugin(
        node.name(), importPluginCreator(pluginName, pluginVersion), f);
    ASSERT(plugin != nullptr && "OneHot plugin was not found in the plugin registry!",
           ErrorCode::kUNSUPPORTED_NODE);

    nvinfer1::ITensor* input_data[] = {indices};
    RETURN_FIRST_OUTPUT(ctx->network()->addPluginV2(input_data, 1, plugin));
}
```

Build TensorRT-OSS
- Steps:

```shell
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TRT/TensorRT-7.2.1.6/lib
cd $TRT/build
cmake .. -DTRT_LIB_DIR=$TRT/TensorRT-7.2.1.6/lib -DTRT_BIN_DIR=`pwd`/out
make -j$(nproc)
make install
```

- Verify that the new plugins have been integrated into TensorRT successfully.

How to Support New Layers/OPs?
Solution 2: utilize the fallback mechanism
- Implement TensorRT plugins.
- Build a standalone library for the individual plugins.
- Pre-load the library; the ONNX parser will then automatically attempt to import unsupported layers/OPs as plugins (FallbackPluginImporter).

Tips
- Implement TensorRT plugins:
  - The inputs/outputs of the plugin layer in the ONNX graph should be the same as those of your TensorRT plugin.
  - The name/version of the plugin layer in the ONNX graph should be the same as the name/version returned by the getPluginName/getPluginVersion functions of the IPluginCreator class.
  - The attributes set for the custom layer in ONNX must match the plugin attributes of the IPluginCreator class.
  - Remember to implement IPluginCreator::getFieldNames(); groupNormalizationPlugin is a good example to learn from.
- Build a standalone library for individual plugins:
  - Use a makefile to build a standalone library (./lib/onehot.so).
- Preload the plugin library and parse:
  - Python API: ctypes.cdll.LoadLibrary("./lib/onehot.so")
  - Command line tool: trtexec --onnx=model.onnx --plugins=./lib/onehot.so
Align ONNX Layer/OP with TensorRT Plugin
- OneHot in ONNX — inputs: indices, depth, values; attributes: axis.
- onehotEncoder plugin in TensorRT — inputs: indices; attributes: depth. Assume axis is -1 (the last dimension) and values are [0, 1].
- Alignment: modify OneHot to the customized OP in the ONNX graph:

```python
for node in graph.node:
    if node.op_type == "OneHot":
        onehot = onnx.helper.make_node(
            "onehotEncoder",
            name=node.name,
            depth=depth,
            inputs=[node.input[0]],
            outputs=node.output[:])
        nodes_remove.append(node)
        nodes_extend.append(onehot)
```

Case 1: unsupported OPs, implemented by plugins
- Example: OneHot.
- Solution 1: modify TensorRT OSS
  - Very complicated, since we need to set up the build environment and modify the parser and plugin folders.
  - No strict restrictions on plugins as long as the importer function is well written; very flexible.
- Solution 2: utilize the fallback mechanism
  - Easy, since we only build a standalone library.
  - Suitable for importing unsupported OPs as plugins, but restricted.
  - Requires modifying the definition of ONNX OPs when their inputs, outputs or attributes don't match the plugin.

Case 2: unsupported OPs, implemented by TensorRT layers
- Example: Sign, CudnnRNNV3 — cases where we want to import these OPs with TensorRT layers, since a plugin implementation is complicated and error-prone for beginners.
- Solution 1: modify TensorRT OSS
  - Add a specific importer function for the unsupported OPs; for example, use IRNNv2Layer to parse the CudnnRNNV3 op.
- Solution 2: utilize the fallback mechanism — not suitable.

Case 3: supported OPs, but inefficient
- Example: Resize.
- Solution 1: modify TensorRT OSS
  - Modify the importer function to call plugins instead of the original TRT layers.
- Solution 2: utilize the fallback mechanism
  - Requires modifying the op_type of the ONNX OPs, e.g. appending a "Plugin" tag (from Resize to ResizePlugin), since the fallback mechanism is only used for unsupported OPs.

Case 4: required to select TRT layers or plugins based on conditions
- Example: Reduce — sometimes we only implement a plugin for a specific case.
- Solution 1: modify TensorRT OSS
  - Add a specific importer function that selects based on conditions such as input shape or axis.
- Solution 2: utilize the fallback mechanism
  - Modify the op_type of the ONNX OPs based on conditions.

Case 5: required to fix issues of the ONNX parser
- Example: ONNX parser v6.0 doesn't support bool weights.
- Solution 1: modify TensorRT OSS
  - Fix the issues of the ONNX parser; for example, add support for importing bool weights.
- Solution 2: utilize the fallback mechanism — not suitable.

Comparison based on my own experience

| Case | Solution 1: modify TensorRT OSS | Solution 2: fallback mechanism |
|---|---|---|
| 1. Unsupported OPs, implemented by plugins | Complicated | Preferred, much simpler |
| 2. Unsupported OPs, implemented by TensorRT layers | Modify parser | Not suitable |
| 3. Supported OPs but inefficient, implemented by plugins | Modify parser to call plugins instead of original TRT layers | Modify op_type to utilize the fallback plugin importer |
| 4. Required to select TRT layers or plugins based on conditions | Modify parser | Modify op_type based on conditions |
| 5. Required to fix issues of the ONNX parser | Modify parser | Not suitable |

OPTIMIZATION
Introduction
- Kernel optimization
- Graph fusion
  - ONNX graph level: the ONNX Python API. ONNX GraphSurgeon is a tool that allows you to easily generate new ONNX graphs or modify existing ones; refer to the NVIDIA developer blog for more details.
  - TensorFlow graph level: graphsurgeon-tf allows you to transform TensorFlow graphs. Its capabilities are broadly divided into two categories: search and manipulation. Search functions allow you to find nodes in a TensorFlow graph; manipulation functions allow you to modify, add, or remove nodes.

Kernel Optimization
[Figure: CUDA kernel profile on Tesla V100-PCIE-32GB. ResizeBilinearKernel alone accounts for 18.3% of kernel time on the default stream — more than any single cuDNN convolution kernel — making Resize a worthwhile optimization target.]

- Solution 1:
  - Implement your resize plugin.
  - Add the plugin to TensorRT.
  - Modify the ONNX parser to call the plugin instead of the default resize layer.
  - Build TensorRT-OSS.
- Solution 2 (preferred):
  - Implement your resize plugin, named ResizePlugin.
  - Modify the op_type of the original Resize in the ONNX graph to ResizePlugin.
  - Build a standalone library.

Graph Fusion: dilated conv
- In the TensorFlow graph, dilated conv is implemented by 5 small OPs (SpaceToBatchND, Conv2D, BatchToSpaceND, BiasAdd, ...). In the ONNX graph, dilated conv is converted to as many as 12 OPs.
- However, both ONNX and TRT support dilated conv directly.
- Optimization plan: merge these small OPs into a dilated conv at the ONNX-graph level.

Graph Fusion with ONNX Python API

```python
# Slide a five-node window over the topologically sorted nodes and
# match the op chain that TF dilated conv was expanded into.
for node in graph.node:
    if (first and first.op_type == "Pad"
            and second and second.op_type == "Transpose"
            and third and third.op_type == "SpaceToDepth"
            and fourth and fourth.op_type == "Transpose"
            and fifth and fifth.op_type == "Conv"):
        x = first.input[0]
        dilations = third.attribute[0].i
        kernel_shape = fifth.attribute[2].ints
        strides = fifth.attribute[1].ints
        weights = fifth.input[1]
        biases = "/".join(fifth.output[0].split("/")[:-1]) + "/biases/read:0"
        y = "/".join(fifth.output[0].split("/")[:-1]) + "/BatchToSpaceND:0"
        dilated_conv = onnx.helper.make_node(
            "Conv", name=fifth.name,
            inputs=[x, weights, biases], outputs=[y],
            kernel_shape=kernel_shape, strides=strides,
            dilations=[dilations, dilations])
        # Default values for other attributes: strides=[1,1],
        # dilations=[1,1], groups=1, auto_pad="SAME_UPPER"
        nodes_remove.extend([first, second, third, fourth, fifth])
        nodes_extend.append(dilated_conv)
    first = second
    second = third
    third = fourth
    fourth = fifth
    fifth = node
```

```python
for node in graph.node:
    # Search for the pattern to modify in the ONNX graph.
    if (first and first.op_type == "DepthToSpace"
            and second and second.op_type == "Transpose"
            and third and third.op_type == "Slice"
            and fourth and fourth.op_type == "Add"
            and fifth and fifth.op_type == "Relu"
            and node.op_type == "Transpose"):
        relu = onnx.helper.make_node(
            "Relu", inputs=fourth.input, outputs=node.output)
        nodes_remove.extend([first, second, third, fourth, fifth, node])
        nodes_extend.append(relu)
    first = second
    second = third
    third = fourth
    fourth = fifth
    fifth = node

for node in nodes_remove:
    graph.node.remove(node)
for node in nodes_extend:
    graph.node.extend([node])

model_def = onnx.helper.make_model(graph)
onnx.save(model_def, "/onnx_model/model_dilated_conv.onnx")
```

REFIT
Introduction
- TensorRT can refit an engine with new weights without having to rebuild it.
- The engine must be built as "refittable": set builder.refittable = True.
- Create a refitter object: with trt.Refitter(engine, TRT_LOGGER) as refitter:
- Use refitter.get_all() to get a list of all refittable layers (layer names, strings) and the associated weight roles in the network.
- Update the weights that you want to update: refitter.set_weights("MyLayer", trt.WeightsRole.KERNEL, trt.Weights(tf_weight))
- Update the engine with all the weights that are provided: refitter.refit_cuda_engine()
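Routing new framework weights to the right set_weights call is just a lookup over the parallel lists that the parser's refit map returns. A plain-Python sketch with made-up names (the real lists come from parser.get_refit_map(), shown in the next section):

```python
# Made-up parallel lists in the shape parser.get_refit_map() returns:
# ONNX weight names, TRT network layer names, and weight roles.
weight_names = ["conv1/kernel:0", "conv1/bias:0"]
layer_names = ["conv1", "conv1"]
roles = ["KERNEL", "BIAS"]

# Index by weight name so each newly trained weight can be routed
# to refitter.set_weights(layer, role, value).
refit_map = {w: (l, r) for w, l, r in zip(weight_names, layer_names, roles)}

print(refit_map["conv1/bias:0"])  # ('conv1', 'BIAS')
```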
Refit Map
- Target: update TRT refittable weights with newly trained TF weight values.
- Difficulty: how to map a TRT network layer name to a TF weight name.
- Solution: the ONNX parser maintains a refit map that records the mapping between ONNX weight names and TRT network layer names.
- Refit map:

```python
weightNames, layerNames, roles = parser.get_refit_map()
for w, l, r in zip(weightNames, layerNames, roles):
    print("layerName: {}, weightName: {}, weightRole: {}".format(l, w, r))
```

- TF -> ONNX -> TRT: TF weight name -> ONNX weight name -> TRT network layer name.

Remaining Issues
- TensorRT does not support refitting together with dynamic shapes right now.
- Although the ONNX parser provides the refit map for easier name mapping, sometimes we still have to determine the mapping rules manually, because tf2onnx might change layer or weight names.
- The refitting time might be too long when one weight is repeatedly refit. For example, if one weight is used by five layers, we have to do the refitting five times with this weight.
- We have some workarounds for these issues and are actively working to solve them.

SUMMARY
- The ONNX parser can parse an ONNX model and populate a TensorRT network in a nearly automatic way.
- The ONNX parser only supports full-dimensions mode, meaning that your network definition must be created with the EXPLICIT_BATCH flag set. Refer to the dynamic shapes documentation for more details.
- The ONNX parser might cause some performance drop compared with a well-built TensorRT network created through the TensorRT API, but the gap can be reduced with different kinds of optimization, like kernel optimization and graph fusion.
- For further optimization, you are recommended to try mixed precision, CUDA graphs and multi-streams.
- Welcome to contribute to the ONNX parser!