NIST FACE IN VIDEO EVALUATION (FIVE)
Austin Hom, Patrick Grother, Mei Ngan
National Institute of Standards and Technology
IFPC, April 1st, 2025

FRTE and FATE
- FRTE (Face Recognition Technology Evaluation) — recognition: who is in an image.
  - 1:1 verification — same person or not?
  - 1:N search — who? where? when?
  - Twins disambiguation — same person, or twin?
  - Face in Video 2024 — 1:N on non-cooperative people
- FATE (Face Analysis Technology Evaluation) — analysis: about an image.
  - Morph detection — two people in one face?
  - PAD — subversive photo?
  - Age estimation — how old? old enough?
  - Quality + diagnostics — how bad is this photo?

NIST's Open Benchmarks
Benchmarks are: independent, free, regular, fast, repeatable, fair, black box, IP-protecting, open globally, large-scale, statistically robust, public, transparent, and extensible, run on sequestered datasets. They report absolute and relative accuracy.

Challenges of FR in Video
- Pose: compound rotation of the head relative to the optical axis
- Resolution: range to subject; legacy cameras
- Adverse compression for storage or transmission
- Motion blur
- But: multiple frames are available
FIVE 2024 Goals
- Assess the state of the art of 1:N face recognition (FR) on video sequences, and relative improvement since FIVE 2015
- Assess FR on low-quality videos: low resolution, including compressed video and long-range imaging affected by atmospheric turbulence
- Elevated cameras resulting in high look-down angles
- Passively observed subjects who at no point face the camera directly
- Face detection in wide field-of-view imagery
- Absolute accuracy
- Comparative accuracy of algorithms
- Comparative computational cost
- Threshold calibration: the ability to target specific false positive identification rates

Out of Scope
- Re-identification
- Anomaly detection
- Detection of un-cooperative actions, evasion
- Other modalities (e.g., body and gait recognition)
- Clothing and other non-facial metrics
- 1:1 verification

FIVE 2024 Timeline
Date        Activity
2024-01-23  First draft of API published
2024-02-29  API comments due
2024-03-07  Final API published
2024-03-11  Phase 1 submission window opens
2024-05-18  Phase 1 submission window closes
2024-06-28  Phase 1 report cards distributed
2024-07-01  Phase 2 submission window opens
2024-08-30  Phase 2 submission window closes
2024-10-18  Phase 2 initial report cards distributed
2024-12-06  Phase 2 final report cards distributed
2025 Q2/Q3  Public report published

FIVE 2024: Who Participated
11 developers from around the world submitted 31 algorithms in total: Cognitec, Corsight, Dermalog, Gpstechvn, Idemia, Innovatrics, NEC, Neurotechnology, ROC, Viante, Videmo.
FIVE 2024: How Algorithms Are Run
- Dynamically-linked C++ library (.so file)
- Run "bare metal" on Linux (Ubuntu 20.04.3)
- Hardware: Intel server-class CPUs (no GPUs)
- Hard duration limits, measured on a single CPU core:
  - Still enrollment (face detection + feature extraction + encoding): 1.5 sec/image
  - Video enrollment (face detection + tracking + feature extraction + encoding): 1.5 sec/frame/person
  - Finalize enrollment (gallery size = 10,000): 4000 seconds
  - Search (gallery size = 10,000): 1 second
- Code that crashes is rejected.

What's in the Gallery
Galleries were typically composed of templates generated from still imagery. Occasionally, galleries were composed of templates generated from a combination of multiple stills and/or video sequences of the same subject.

Probes
Probes were single video clips (sequences of frames) with one or more people in the scene.
Recognition Is the Goal: Not Detection, Not Tracking
Software should maximize recognition performance by detecting a person, tracking that person through time, determining which imagery is most recognizable, and extracting features/embeddings. We do not report metrics for false detection, missed detection, spatial (bounding box) location accuracy, track integrity, or restoration.

One Video, One Person, One Track: K = 1 Search
Algorithm software:
1. Detects K = 1 face.
2. Extracts a set of features.
3. Searches the gallery, producing a candidate list of fixed length L ≤ N. The value of L is an input specified by NIST via the API.
NIST evaluation:
1. Chooses a threshold T, e.g. 4.0.
2. Records a false negative error unless the candidate list includes ID = 123 at or above T.
This is repeated for many probes and many threshold values to produce FNIR vs. T. Non-detection is immaterial if the subject is (later) found correctly and identified.
Example candidate list (true positive at T = 4.0): 4.498 Marcia, 1.616 Mei, 0.750 Mae, 0.300 Maria, 0.128 Melissa, 0.072 Marissa, 0.012 Melani, 0.007 James.

One Video, One Person, Multiple Tracks: K ≥ 1 Searches
Algorithm software:
1. Even if the person is present in the entire clip, as she is here, an algorithm might find the person, say, K = 3 times (broken tracks).
2. Extracts K sets of features (aka templates).
3. Searches the gallery, producing K candidate lists, each of fixed length L ≤ N.
NIST:
1. Chooses a threshold T, e.g. 4.0.
2. Records a false negative error unless ANY of the K candidate lists includes ID = Marcia at or above threshold T.
This is repeated for many probes and many threshold values to produce FNIR vs. T.
Example candidate lists:
- FN: 3.142 Mary, 2.998 Maria, 1.626 Marcia, 0.707 Mae, 0.330 Mei, 0.198 Melissa, 0.074 Marissa, 0.016 Melani
- FN: 2.901 Mary, 2.798 Marcia, 1.616 Mei, 0.750 Mae, 0.300 Maria, 0.128 Melissa, 0.072 Marissa, 0.012 Melani
- TP: 4.498 Marcia, 1.616 Mei, 0.750 Mae, 0.300 Mary, 0.128 Melissa, 0.072 Marissa, 0.012 Melani, 0.007 James
Discussion: This method allows tracks to be broken. NIST doesn't care about track integrity per se, only that recognition succeeds. The algorithm implementation is free to select the best-quality frames, to perform restoration, to perform feature-level fusion, or to produce a template that internally contains multiple embeddings to allow score-level fusion.
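The false-negative decision rule — one probe, K candidate lists, a hit in any list suffices — can be sketched in a few lines. This is an illustrative sketch, not NIST's scoring code; the candidate lists here mirror the broken-track example's top entries:

```python
def is_false_negative(candidate_lists, mate_id, T):
    """A probe is a false negative unless ANY of its K candidate
    lists contains the mated identity at or above threshold T."""
    for candidates in candidate_lists:      # one list per track (K lists)
        for score, identity in candidates:  # each of fixed length L <= N
            if identity == mate_id and score >= T:
                return False                # hit: not a false negative
    return True

# The K = 3 broken-track searches from the example (top three entries each)
lists = [
    [(3.142, "Mary"), (2.998, "Maria"), (1.626, "Marcia")],
    [(2.901, "Mary"), (2.798, "Marcia"), (1.616, "Mei")],
    [(4.498, "Marcia"), (1.616, "Mei"), (0.750, "Mae")],
]
print(is_false_negative(lists, "Marcia", T=4.0))  # False: third track hits at 4.498
```

Only the third track exceeds T = 4.0 for Marcia, but that single hit is enough — the probe is not counted as a miss, which is exactly why broken tracks are tolerated.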
One Video, Two Persons: K ≥ 0 Searches
Example candidate lists:
- Person 1, track 1: 4.498 Julio, 1.616 Jean, 0.750 Jacques, 0.300 Julian, 0.128 Jesus, 0.072 Javier, 0.012 Jimi
- Person 1, track 2: 4.298 Julio, 1.516 John, 0.850 Jacques, 0.600 Julian, 0.428 Jason, 0.172 Job, 0.002 Jimi
- Person 2: 4.498 Pedro, 1.616 Pierre, 0.750 Peter, 0.300 Prado, 0.128 Papa, 0.072 Paolo, 0.012 Paulus
Discussion: If, say, only Julio is in the gallery, then the algorithm is rewarded for correctly finding him at some point. The scoring software does not reward twice for finding the person twice. If, say, the person on the right is not in the gallery, then the high score against gallery person Pedro could be accumulated into a count of false positives. That said, false positives are usually measured over sets where the gallery and probes are subject-disjoint. If the gallery is unconsolidated, and Julio is enrolled multiple times, the algorithm is rewarded for finding any occurrence of Julio in the gallery.

1:N Search: False Positives in Operations
- https:/ Eduardo Medina, 2023-12-21
- https:/ FTC reports that "the system generated thousands of false-positive matches": https://www.ftc.gov/news-events/news/press-releases/2023/12/rite-aid-banned-using-ai-facial-recognition-after-ftc-says-retailer-deployed-technology-without

Measuring False Positives
False positive identification rate:
- It is conventional to measure the false positive identification rate (FPIR): run searches of individuals who are known to be absent from the enrolled gallery. FPIR is computed as the proportion of searches that produce one or more false positives above a threshold T.
- In FIVE 2024, false positive error is reported as FPIR, given the availability of videos where we know the exact number of people present.
Number of false positives:
- In FIVE 2015, calculating FPIR was not possible, because the number of individuals in the search imagery was not known. Instead, false positive errors were stated as the number of false positives from running searches of individuals known to be absent from the gallery.
Calculating False Positive Identification Rate
- Gallery contains only nonmated subjects.
- Probe videos contain a known number of people known not to be in the gallery.
- Any subject who comes up above threshold contributes to FPIR.
- FPIR = (# of subjects with any track with a hit above threshold T, max 1 per subject per probe) / (total # of subjects in probes)
[Diagram: probe subjects S1, S2 searched against nonmated gallery N1, N2, N3]

Calculating False Negative Identification Rate
- Gallery contains mated subjects.
- Mated probe videos contain people known to be in the gallery; they can also include unknown subjects.
- Any mated subject who does not come up above threshold contributes to FNIR.
- FNIR = (# of mated subjects with no track with a hit above threshold T) / (total # of mated subjects in probes)
[Diagram: probe subjects S1, S2, U1, U2 searched against mated gallery S1, S2, S3]
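The two rate formulas can be written directly in code. This is an illustrative sketch, not NIST's implementation: each input value is the best score any of a subject's tracks achieved against the gallery (a simplification of the per-track candidate-list machinery), and a hit is a score strictly above T, per the "above threshold" wording:

```python
def fpir(nonmated_best_scores, T):
    """FPIR = (# of nonmated probe subjects with any track hitting above T,
    counted at most once per subject) / (total nonmated probe subjects)."""
    hits = sum(1 for best in nonmated_best_scores if best > T)
    return hits / len(nonmated_best_scores)

def fnir(mated_best_scores, T):
    """FNIR = (# of mated probe subjects with no track hitting their mate
    above T) / (total mated probe subjects)."""
    misses = sum(1 for best in mated_best_scores if best <= T)
    return misses / len(mated_best_scores)

# Toy example: 4 nonmated and 4 mated probe subjects at threshold T = 4.0
print(fpir([4.5, 1.2, 0.3, 3.9], T=4.0))  # 0.25: one impostor clears T
print(fnir([4.5, 4.1, 2.0, 3.9], T=4.0))  # 0.5: two mates fail to clear T
```

Sweeping T through the observed score range yields the FNIR-vs-T (and FPIR-vs-T) curves described earlier.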
FIVE RESULTS

FIVE 2015 Datasets: People on the Move
1) Sports Arena — 11 consumer cameras, 50 actors
2) Passenger Loading Bridge — 10 pro cameras, 50 actors
3) Concourse — 10 pro cameras, 50 actors
4) Self Boarding Gate — classic chokepoint, 1 webcam, 250 actors
5) In the Wild — photojournalism, not social media; TV cameras, 500 actors
6) Public Space — multiple professional + legacy cameras, 80 actors
7) Luggage — 2 webcams, 375 actors; adverse resolution and pose
See details on some datasets in the FIVE 2015 report.

Accuracy Gains Since 2015
Miss rates across different datasets (best 2015 vs. best 2024):

Dataset             Gallery size  Footage (min)  # False positives  Best 2015  Best 2024
Self Boarding Gate  48000         18             1                  0.09       0.00
Luggage             4800          487            1200               0.45       0.00
Sports Arena        480           995            10                 0.24       0.03
Concourse           48000         288            10                 0.35       0.05
Public Space        4800          3600           10                 0.26       0.01
Video Journalism    935           699            0                  0.62       0.20
FIVE 2024 Results
Algorithm matters! Video Journalism is difficult because it is comprised of celebrity videos, where high yaw angles are typical and the enrollment images are also unconstrained.

New in FIVE 2024: Long Range with Turbulence
Imagery collected at long range can potentially be distorted by atmospheric turbulence. Turbulence here refers to the distortions in an image caused by the movement of air due to temperature differentials. Examples were collected at 300 meter, 650 meter, and 1000 meter ranges, at low and high turbulence levels. Note that some videos collected at long range will have turbulence; some will not.

Phase 2 Results: Long Range
- Cooperative subjects, long lens; N = 48000.
- Metric: FNIR at FPIR = 0.01, aka the miss rate with T set to give a 1% false alarm rate.
- Recognition is possible at 300 m, even with high turbulence.
- Algorithm matters!
- Recognition accuracy decreases significantly at 650 m and above.
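Reporting "FNIR at FPIR = 0.01" requires threshold calibration — one of the evaluation's stated goals: choose T so that impostor searches alarm at the target rate, then read off the miss rate at that T. A minimal sketch with made-up scores (assuming distinct impostor scores and using best-score-per-subject inputs; not NIST's calibration code):

```python
def fnir_at_fpir(mated_best, nonmated_best, target_fpir):
    """Calibrate T so at most target_fpir of nonmated probe subjects
    score strictly above T, then return (T, mated miss rate at T)."""
    impostors = sorted(nonmated_best, reverse=True)
    k = int(target_fpir * len(impostors))  # number of allowed false positives
    T = impostors[k]                       # k impostor scores lie strictly above T
    misses = sum(1 for s in mated_best if s <= T)
    return T, misses / len(mated_best)

# Toy example: 100 impostor searches, 1% false alarm budget
impostor_scores = [float(i) for i in range(100)]   # best impostor scores 0.0 .. 99.0
genuine_scores = [99.5, 98.0, 97.0, 100.0]         # best mated scores
T, miss_rate = fnir_at_fpir(genuine_scores, impostor_scores, 0.01)
print(T, miss_rate)  # 98.0 0.5
```

With a 1% budget over 100 impostors, one false positive is allowed, so T lands at the second-highest impostor score; the two genuine scores at or below it are counted as misses.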
New in FIVE 2024: Long Range and High Altitude
[Diagram: UAV and pole-mounted collections at 100 m, 200 m, 400 m, and 600 m ranges; elevation angle 18 degrees; 10 m]

Dataset Description
- Wide range of optical setups for probes: various ranges (close range to 1 km+), various pitch angles; specialized sensors, non-specialized sensors, UAVs.
- Detailed enrollment data: high-quality stills at various pitch and yaw angles; high-quality, close-range enrollment videos; random walk; structured walk.
- Multiple different collection locations and scenarios.

Long Range Dataset, 1:N Open Search (Main, Blended Gallery, Face Included)
Probes: Various types. Some collected outdoors, sometimes at close distance but often at longer ranges, using pole-mounted cameras and unmanned aerial vehicles (UAVs) with cameras facing down. Subjects were sometimes stationary and sometimes walked around.
Gallery: Two galleries, average N = 559 people. Each gallery subject has a variable amount and type of imagery, which could include any combination of videos and still imagery.

Long Range Dataset by Range

THANK YOU
FRVT@NIST.GOV
Patrick Grother (not), Mei Ngan, Kayee Hanaoka (+Mei Ngan), Austin Hom, Joyce Yang, Jim Matey