US008531535B2

(10) Patent No.: US 8,531,535 B2
(45) Date of Patent: Sep. 10, 2013

Kwatra et al.
`
`(54) METHODS AND SYSTEMS FOR
`PROCESSING A VIDEO FOR STABILIZATION
`AND RETARGETING
`
(75) Inventors: Vivek Kwatra, Santa Clara, CA (US);
Matthias Grundmann, Atlanta, GA (US)
`
`(73) Assignee: Google Inc., Mountain View, CA (US)
`
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 198 days.
`
`(21) Appl. No.: 13/023,299
`
(22) Filed: Feb. 8, 2011

(65) Prior Publication Data

US 2012/0105654 A1    May 3, 2012
`
`Related U.S. Application Data
`
`(60) Provisional application No. 61/407,630, filed on Oct.
`28, 2010.
`
(51) Int. Cl.
H04N 5/228 (2006.01)
H04N 7/18 (2006.01)
(52) U.S. Cl.
USPC ..... 348/208.14; 348/155
(58) Field of Classification Search
USPC ..... 348/208.99, 208.12, 208.3, 143, 348/208.4, 154, 155, 208.14
`See application file for complete search history.
`
(56) References Cited

U.S. PATENT DOCUMENTS

7,433,497 B2 *    10/2008  Chen ..... 382/107
7,558,405 B2 *    7/2009   Tico et al. ..... 382/107
2007/0002146 A1 * 1/2007   Tico et al. ..... 348/208.1
2009/0066800 A1   3/2009   Wei
2009/0251594 A1 * 10/2009  Hua et al. ..... 348/441
`
`
`OTHER PUBLICATIONS
`
International Search Report and Written Opinion issued by the U.S. Patent Office in PCT/US2011/58111, mailed Feb. 13, 2012.
Smith, Brandon et al., "Light Field Video Stabilization", IEEE International Conference on Computer Vision (ICCV), Sep. 29-Oct. 2, 2009.
Gleicher, Michael et al., "Re-Cinematography: Improving the Camera Dynamics of Casual Video", ACM Multimedia 2007, Sep. 2007.
Liu, Feng et al., "Content-Preserving Warps for 3D Video Stabilization", ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH 2009), vol. 28, Nov. 3, 2009.
Wikipedia, "OpenCV". Retrieved from http://en.wikipedia.org/wiki/OpenCV, printed on Feb. 8, 2011.
Wikipedia, "RANSAC". Retrieved from http://en.wikipedia.org/wiki/RANSAC, printed on Feb. 8, 2011.
`
`* cited by examiner
`
`Primary Examiner — James M Hannett
`Assistant Examiner — Ahmed A Berhan
`
`(74) Attorney, Agent, or Firm — McDonnell Boehnen
`Hulbert & Berghoff LLP
`
(57) ABSTRACT

Methods and systems for processing a video for stabilization and retargeting are described. A recorded video may be stabilized by removing shake introduced in the video, and a video may be retargeted by modifying the video to fit to a different aspect ratio. Constraints can be imposed that require a modified video to contain pixels from the original video and/or to preserve salient regions. In one example, a video may be processed to estimate an original path of a camera that recorded the video, to estimate a new camera path, and to recast the video from the original path to the new camera path. To estimate a new camera path, a virtual crop window can be designated. A difference transformation between the original and new camera path can be applied to the video using the crop window to recast the recorded video from the smooth camera path.
`
`25 Claims, 10 Drawing Sheets
`
`
`
[Front-page drawing: graph of motion in x]
`
`Exhibit 1012
`
`Insta360
`Page 1
`
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 1 of 10    US 8,531,535 B2

[FIGURE 1: block diagram of camera path translation system 100, comprising camera path estimation engine 102, video stabilization and retargeting engine 104, and video translation engine 106.]

[FIGURE 2: block diagram of video hosting service 200, comprising camera path translation system 202 (with camera path estimation engine 218 and video stabilization and retargeting engine 220), video server 210, ingest server 212, and video database 216. Clients 204A to 204N, each with a player 208A to 208N, connect through network 206.]
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 2 of 10    US 8,531,535 B2

[FIGURE 3: flowchart of method 300]
302: Receive a recorded video
304: Extract trackable feature matches in frames of the video
306: Perform local outlier rejection to remove spurious feature matches
308: Fit 2D linear motion models to the trackable feature matches to describe motion of the camera between video frames
310: Transform linear motion models to a common coordinate system and concatenate the linear motion models
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 3 of 10    US 8,531,535 B2

[FIGURE 4: flowchart of method 400]
402: Receive estimate of original camera path motion
404: Receive constraints limiting changes to the original camera path motion
406: Receive cost function and perform minimization
408: Determining a crop window transformation of a pre-defined scale less than one with respect to the original frame size subject to the constraints and minimizations of the residuals
410: Applying the crop window transformation to the original video
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 4 of 10    US 8,531,535 B2

[FIGURES 5A-5D: graphs of camera paths; reference numerals 500 and 502.]
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 5 of 10    US 8,531,535 B2

[Drawing, rotated on the sheet: frame rectangle, crop window, residual motion, and camera path (known).]
`
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 6 of 10    US 8,531,535 B2

[FIGURE 8A: graph of motion in x. FIGURE 8B: graph of motion in y. FIGURE 9B: video frame with crop window; reference numerals 900 and 902.]
`
`
`
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 7 of 10    US 8,531,535 B2

[Drawing: residual motion, crop window (fixed), warp transform, and feature transforms (known); a salient point s_i transformed by A(p_t).]

[FIGURE 11: video frame 1100 with origin (0,0) and crop window 1102.]
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 8 of 10    US 8,531,535 B2

[Drawing, rotated on the sheet: key-frames, frame transforms, and optimal camera path.]
`
`
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 9 of 10    US 8,531,535 B2

[FIGURE 13: functional block diagram of an example computing device; drawing labels are rotated on the sheet.]
`
`
`
`
`
`
`
`
`
`
`
`
U.S. Patent    Sep. 10, 2013    Sheet 10 of 10    US 8,531,535 B2

[FIGURE 14: computer program product 1400]
Signal bearing medium 1401
Program instructions 140:
- Estimating an original motion path of a camera that recorded a video
- Determining at each time t a substantially constant path, a substantially constant velocity, or a substantially constant acceleration of the original motion path of the camera
- Determining a modified motion camera path of the original motion path of the camera including for each time t the substantially constant path, the substantially constant velocity, or the substantially constant acceleration of the original motion path of the camera
- Based on the modified motion camera path and the original motion path of the camera, determining a crop window transform that describes how to modify the original motion path of the camera to the modified motion camera path, the crop window transform determined according to at least one constraint limiting changes to the original motion path of the camera
- Applying the crop window transform to the video to recast the video from a viewpoint of the original motion path of the camera to a viewpoint of the modified motion camera path
Computer readable medium; computer recordable medium; communications medium
`
`
`
`METHODS AND SYSTEMS FOR
`PROCESSING A VIDEO FOR STABILIZATION
`AND RETARGETING
`
`CROSS REFERENCE TO RELATED
`APPLICATION
`
The present application claims priority to U.S. provisional patent application Ser. No. 61/407,630, filed on Oct. 28, 2010, the entire contents of which are herein incorporated by reference as if fully set forth in this description.
`
`BACKGROUND
`
Image stabilization includes many techniques used to reduce blurring associated with motion of a camera during exposure. Image stabilization techniques may compensate for pan and tilt (angular movement) of a camera or other imaging device. With still cameras, camera shake can be problematic at slow shutter speeds or with long focal length (telephoto) lenses, and image stabilization techniques can be used to improve a still picture.

Similarly, video stabilization techniques may be used to improve recorded videos. With video cameras, camera shake can cause visible frame-to-frame jitter in a recorded video. For example, handheld camera or handheld video recording is a film and video technique in which a camera is held in the camera operator's hands, and a handheld recorded video may be perceptibly shakier than a video recorded using a tripod-mounted camera (or other stabilization equipment, such as camera dollies or steady-cams) due to motion of the operator holding the camera during recording. However, recording videos using handheld video recording may enable more opportunities for filming.

Video stabilization techniques may be used to create a stable version of a casually shot video (e.g., a video recorded on a device with little or no stabilization equipment). Video stabilization techniques generally attempt to render the recorded video as the video would have been recorded from a smooth or stable camera path.
`
`SUMMARY
`
The present application discloses embodiments of systems and methods for processing a video for stabilization and retargeting. In one aspect, a method for processing a video is described. The method may comprise estimating an original motion path of a camera that recorded a video. The method may also comprise determining at each time t a substantially constant path, a substantially constant velocity, or a substantially constant acceleration of the original motion path of the camera. The method also may comprise determining a modified motion camera path of the original motion path of the camera including for each time t the substantially constant path, the substantially constant velocity, or the substantially constant acceleration of the original motion path of the camera. The method may further comprise, based on the modified motion camera path and the original motion path of the camera, determining a crop window transform that describes how to modify the original motion path of the camera to the modified motion camera path, and the crop window transform may be determined according to at least one constraint limiting changes to the original motion path of the camera. The method may further comprise applying the crop window transform to the video to recast the video from a viewpoint of the original motion path of the camera to a viewpoint of the modified motion camera path.
`
In another aspect, a non-transitory computer readable medium having stored therein instructions executable by a computing device to cause the computing device to perform functions is described. The functions may comprise estimating an original motion path of a camera that recorded a video. The functions may further comprise determining at each time t a substantially constant path, a substantially constant velocity, or a substantially constant acceleration of the original motion path of the camera. The functions also may comprise determining a modified motion camera path of the original motion path of the camera including for each time t the substantially constant path, the substantially constant velocity, or the substantially constant acceleration of the original motion path of the camera. The functions further may comprise, based on the modified motion camera path and the original motion path of the camera, determining a crop window transform that describes how to modify the original motion path of the camera to the modified motion camera path, the crop window transform determined according to at least one constraint limiting changes to the original motion path of the camera. The functions further may comprise applying the crop window transform to the video to recast the video from a viewpoint of the original motion path of the camera to a viewpoint of the modified motion camera path.

In still another aspect, a camera path translation system is provided that comprises a camera path estimation engine, a video stabilization and retargeting engine, and a video translation engine. The camera path estimation engine may be configured to receive a video, and to estimate an original motion path of a camera that recorded the video based on motion of objects within the video. The video stabilization and retargeting engine may be configured to determine a crop window transform that describes how to modify the original motion path of the camera to a modified motion camera path, and the crop window transform may be determined according to at least one constraint limiting changes to the original motion path of the camera. The video translation engine may be configured to apply the crop window transform to the video to recast the video from a viewpoint of the original motion path of the camera to a viewpoint of the modified motion camera path.

The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the figures and the following detailed description.
`
BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates a block diagram of an example camera path translation system.
FIG. 2 is a block diagram illustrating an example system view of a video hosting service that includes a camera path translation system.
FIG. 3 is an example block diagram of a method to process a video to estimate an original motion of the camera or camera path, in accordance with at least some embodiments described herein.
FIG. 4 is an example block diagram of a method to process a video to estimate an updated or modified motion of the camera or camera path, in accordance with at least some embodiments described herein.
FIGS. 5A-5D illustrate example graphs of an optimal camera path determined based on a synthetic camera path.
FIG. 6 illustrates an example video frame rectangle and a crop window rectangle.
`
FIG. 7 illustrates example video frames and an example crop window for each video frame.
FIGS. 8A-8B illustrate example graphs of an original camera path motion versus an updated or smooth camera path motion.
FIGS. 9A-9B illustrate an example original video frame and feature points a, b, c, and d.
FIG. 10 illustrates example video frames including a fixed crop window to transform the video frames such that salient points reside within the fixed crop window.
FIG. 11 illustrates an example video frame and a crop window for transformation of the video frame.
FIG. 12 illustrates example video frames of a video, and performing wobble suppression of the video frames.
FIG. 13 is a functional block diagram illustrating an example computing device used in a computing system that is arranged in accordance with at least some embodiments described herein.
FIG. 14 is a schematic illustrating a conceptual partial view of an example computer program product that includes a computer program for executing a computer process on a computing device, arranged according to at least some embodiments presented herein.
`
`DETAILED DESCRIPTION
`
The following detailed description describes various features and functions of the disclosed systems and methods with reference to the accompanying figures. In the figures, similar symbols identify similar components, unless context dictates otherwise. The illustrative system and method embodiments described herein are not meant to be limiting. It may be readily understood that certain aspects of the disclosed systems and methods can be arranged and combined in a wide variety of different configurations, all of which are contemplated herein.

This disclosure may disclose, inter alia, systems and methods for stabilizing and retargeting recorded videos. For example, a recorded video may be stabilized by removing at least a portion of shake introduced in the video, and a video may be retargeted by modifying the video to fit to a different aspect ratio. The disclosure describes examples for stabilizing and retargeting recorded video by imposing constraints that require a modified video to contain valid pixels from the original recorded video and/or to preserve salient regions and objects, for example.

In one example, a video may be stabilized by performing post-processing techniques. The video may be processed to estimate an original path (e.g., motion) of a camera that recorded the video, to estimate a new steady/smooth camera path, and to recast the video from the original path to the smooth camera path. In one example, to estimate a new camera path, a virtual crop window of a pre-defined scale less than one with respect to an original frame size can be designated. A difference transformation between the original and smooth camera path can be applied to the recorded video using the crop window to recast the recorded video as if the video had been recorded from the smooth camera path to remove shake from the recorded video, for example. If the crop window does not fit in the original frame, in one example, undefined areas may be filled using motion-in-painting. In another example, constraints can be imposed to prevent undefined areas from occurring.

I. Example Camera Path Translation System

Referring now to FIG. 1, a block diagram of a camera path translation system 100 is illustrated. The camera path translation system 100 includes a camera path estimation engine 102, a video stabilization and retargeting engine 104, and a video translation engine 106. The camera path translation system 100 may be configured to receive a video, and to perform video stabilization processes on the video. For example, the camera path estimation engine 102 may estimate a path of a camera that recorded the video based on motion of objects within the received video. The video stabilization and retargeting engine 104 may then estimate a new steady/smooth camera path, and the video translation engine 106 may recast the received video from a viewpoint of the smooth camera path determined by the video stabilization and retargeting engine 104.

One or more of the described functions or components of the system 100 may be divided up into additional functional or physical components, or combined into fewer functional or physical components. In some further examples, additional functional and/or physical components may be added to the examples illustrated by FIG. 1. Still further, any of the camera path estimation engine 102, the video stabilization and retargeting engine 104, and/or the video translation engine 106 may include or be provided in the form of a processor (e.g., a microprocessor, a digital signal processor (DSP), etc.) configured to execute program code including one or more instructions for implementing logical functions described herein. The system 100 may further include any type of computer readable medium (non-transitory medium), for example, such as a storage device including a disk or hard drive, to store the program code. In other examples, the camera path translation system 100 may be included within other systems.

FIG. 2 is a block diagram illustrating a system view of a video hosting service 200 that includes a camera path translation system 202. Multiple users/viewers may use clients 204A-N to send video hosting requests to the video hosting service 200, such as to upload videos to a video hosting website, and to receive the requested services from the video hosting service 200. The video hosting service 200 may be configured to communicate with the one or more clients 204A-N via a network 206. The video hosting service 200 may receive the video hosting service requests from the clients 204A-N over wired or wireless connections.

Turning to the individual entities illustrated in FIG. 2, each client 204A-N may be used by a user to request video hosting services. For example, a user can use the client 204A to send a request for uploading a video for sharing, or playing a video. The clients 204A-N can be any type of computer device, such as a personal computer (e.g., a desktop, notebook, tablet, or laptop computer), as well as devices such as a mobile telephone, personal digital assistant, or IP enabled video player. The clients 204A-N may include a processor, a display device (or output to a display device), a local storage, such as a hard drive or flash memory device, to which the clients 204A-N store data used by the user in performing tasks, and a network interface for coupling to the video hosting service 200 via the network 206.

The clients 204A-N may include a video player 208A-N (e.g., the Flash™ player from Adobe Systems, Inc., or a proprietary one) for playing a video stream. The video player 208A-N may be a standalone application, or a plug-in to another application such as a network or Internet browser. Where the client 204A-N is a general purpose device (e.g., a desktop computer, mobile phone), the player 208A-N may be implemented as software executed by the computer. Where the client 204A-N is a dedicated device (e.g., a dedicated video player), the player 208A-N may be implemented in hardware, or a combination of hardware and software. The player 208A-N may include user interface controls (and corresponding application programming interfaces) for selecting a video feed, starting, stopping, and rewinding a video feed. Also, the player 208A-N can include in a user interface a video display format selection configured to indicate a video display format (e.g., a standard definition TV or a high-definition TV). Other types of user interface controls (e.g., buttons, keyboard controls) can be used as well to control the playback and video format selection functionality of the player 208A-N.
The network 206 enables communications between the clients 204A-N and the video hosting service 200. In one embodiment, the network 206 is the Internet, and uses standardized internetworking communications technologies and protocols, known now or subsequently developed, that enable the clients 204A-N to communicate with the video hosting service 200. In another embodiment, the network 206 may be a wireless cellular network that enables wireless communication between the clients 204A-N and the video hosting service 200.

The video hosting service 200 comprises the camera path translation system 202, a video server 210, an ingest server 212, and a video database 216. The video server 210 may be configured to serve videos from the video database 216 in response to user video hosting service requests. The ingest server 212 may be configured to receive user uploaded videos and store the videos in the video database 216. The video database 216 may be configured to store user uploaded videos and videos processed by the camera path translation system 202. In one embodiment, the video database 216 stores a large video corpus.

The camera path translation system 202 may include a camera path estimation engine 218, a video stabilization and retargeting engine 220, and a video translation engine 222. The camera path translation system 202 may be configured to receive user uploaded videos from the ingest server 212, and to perform video stabilization of the videos.

II. Video Stabilization

In one example, the camera path estimation engine 218 may estimate a path of a camera that recorded the video based on motion of objects or images within the received video. A camera path may be estimated by extracting trackable features in frames of the video, matching features, and performing local outlier rejection to remove spurious matches that may distort motion estimation. Linear motion models (e.g., translation, similarity, affine) may be fit to the tracked features to estimate a motion of the camera between two frames, and the motion models can be transformed to a common coordinate system and concatenated to yield an estimated original camera path over all frames of the video.

The video stabilization and retargeting engine 220 may then estimate a new steady/smooth camera path based on constraints. For example, a base vertical line may be established for desired vertical camera path motion, and constraints can be established to allow a camera path to be modified by a constrained amount (e.g., if camera motion moves downward, pixels in images are moved upward to align with a previous frame and bottom row(s) of pixels can be removed or cropped out to an extent as allowed by the constraints).

A smooth camera path can be estimated using minimization of derivatives of the original camera path as estimated by the camera path estimation engine 218. For example, a constant path may represent a static camera, i.e., $\frac{dP}{dt} = 0$ (where P is a function representing the camera path), a path of constant velocity may represent a panning or a dolly shot, i.e., $\frac{d^2P}{dt^2} = 0$, and a path of constant acceleration may represent ease-in and out transitions between static and panning cameras, i.e., $\frac{d^3P}{dt^3} = 0$. The estimated smooth camera path may be segments of constant, linear, or parabolic motion. The segments may be static segments rather than a superposition of the segments to avoid residual motion.

In one example, to estimate a camera path P(t) comprising segments of constant, linear, and parabolic motion, an optimization may be performed as a constrained L1 minimization solution. For example, an N-dimensional vector norm of order p is defined as

$$|x|_p = \left( \sum_{i=1}^{N} |x_i|^p \right)^{1/p}$$

which is a sum of the pth power of the absolute value of its components followed by the pth root of the result. An L1/L2 norm calculation may be performed using this standard. L1 minimization may result in a path with derivatives (described above) being zero for many segments. An L1 camera path may comprise segments resembling a static camera, linear motion, and constant acceleration. In another embodiment, L2 minimization may be used to minimize the above derivatives on average to result in small, but possibly non-zero gradients (e.g., which may result in an L2 camera path that has some small non-zero motion).
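As an illustration of the norm defined above, the following minimal Python sketch computes the order-p norm and compares the L1 and L2 costs of the first differences of two sampled one-dimensional paths. The function names and the two sample paths are hypothetical and chosen only for illustration; the sketch evaluates costs and does not perform the constrained minimization itself.

```python
def p_norm(x, p):
    """Order-p norm of a vector: (sum_i |x_i|**p) ** (1/p)."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def first_differences(path):
    """Discrete stand-in for the first derivative of a sampled path."""
    return [b - a for a, b in zip(path, path[1:])]


# Two hypothetical 1-D camera paths that both move from 0 to 4.
step_path = [0, 0, 0, 0, 4]  # static camera, then one abrupt move
ramp_path = [0, 1, 2, 3, 4]  # small motion in every frame

for name, path in (("step", step_path), ("ramp", ramp_path)):
    d = first_differences(path)
    print(name, "L1 =", p_norm(d, 1), "L2 =", p_norm(d, 2))
```

For these two sample paths the L1 cost of the first differences is the same, while the L2 cost is lower for the ramp. This is consistent with the description above: an L2 objective favors spreading small non-zero motion over all frames, whereas an L1 objective tolerates derivatives that are exactly zero over many samples.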
In addition, a camera path P(t) can be determined that minimizes the above derivatives while satisfying constraints. A variety of constraints may be used, such as inclusion, proximity, and saliency constraints. An inclusion constraint requires a crop window transformed by the path P(t) to always or substantially always be contained in an original frame rectangle transformed by C(t), the camera path. A proximity constraint includes a new camera path P(t) preserving (or substantially preserving) the original intent of the movie, e.g., if the original path contained segments of zooming, the new camera path may follow this motion. A saliency constraint includes salient points (e.g., obtained by a face detector or general mode finding in a saliency map) within all or a part of a crop window transformed by P(t). Many other constraints may be used as well.
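The inclusion constraint can be illustrated with a small check. The sketch below assumes that the crop window is mapped into the coordinate frame of the original video frame by a 2x3 affine transform, and that the frame rectangle is [0, w] x [0, h]. The function names, the transform layout, and the sample sizes are hypothetical. Because an affine map sends a rectangle to a convex quadrilateral and the frame rectangle is convex, testing the four corners is sufficient.

```python
def apply_affine(m, pt):
    """Apply a 2x3 affine transform ((a, b, tx), (c, d, ty)) to a point (x, y)."""
    (a, b, tx), (c, d, ty) = m
    x, y = pt
    return (a * x + b * y + tx, c * x + d * y + ty)


def crop_inside_frame(crop_transform, crop_corners, frame_w, frame_h):
    """Return True if every transformed crop corner lies inside the frame."""
    for corner in crop_corners:
        x, y = apply_affine(crop_transform, corner)
        if not (0 <= x <= frame_w and 0 <= y <= frame_h):
            return False
    return True


# Hypothetical 640x360 frame with a centered crop window of scale 0.8.
corners = [(64, 36), (576, 36), (576, 324), (64, 324)]
identity = ((1, 0, 0), (0, 1, 0))
shift_50 = ((1, 0, 50), (0, 1, 0))    # crop pushed 50 px to the right
shift_100 = ((1, 0, 100), (0, 1, 0))  # crop pushed 100 px to the right

print(crop_inside_frame(identity, corners, 640, 360))
print(crop_inside_frame(shift_50, corners, 640, 360))
print(crop_inside_frame(shift_100, corners, 640, 360))
```

In this example the 100-pixel shift moves the right edge of the crop window past the frame boundary, so that transform would violate the inclusion constraint, while the identity and the 50-pixel shift satisfy it.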
The video translation engine 222 may recast the received video from a viewpoint of the smooth camera path determined by the video stabilization and retargeting engine 220 by applying a transformation to the video to produce a cropped video with less shake, for example. Thus, in one embodiment, video stabilization may be performed by (1) estimating per-frame motion transforms $F_t$, (2) determining an optimal camera path $P_t = C_t B_t$ (where $C_t$ is based on the motion transforms $F_t$ and $B_t$ is a crop window transform estimated as described below), and (3) stabilizing the video by warping according to $B_t$.
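The relationship between the original path, the smooth path, and the crop window transform can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: transforms are 3x3 homogeneous affine matrices stored as nested lists, the original path is accumulated as C_t = F_1 F_2 ... F_t (the accumulation order is an assumption), and the crop window transform is recovered as B_t = C_t^{-1} P_t from P_t = C_t B_t. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_inverse(m):
    """Inverse of a 3x3 homogeneous affine matrix with last row [0, 0, 1]."""
    a, b, tx = m[0]
    c, d, ty = m[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("transform is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


def translation(tx, ty):
    return [[1.0, 0.0, float(tx)], [0.0, 1.0, float(ty)], [0.0, 0.0, 1.0]]


def camera_path(frame_transforms):
    """Accumulate per-frame transforms F_t into path matrices C_t (assumed order)."""
    path, c = [], translation(0, 0)
    for f in frame_transforms:
        c = matmul(c, f)
        path.append(c)
    return path


def crop_transforms(original_path, smooth_path):
    """Solve P_t = C_t B_t for B_t at every frame."""
    return [matmul(affine_inverse(c), p)
            for c, p in zip(original_path, smooth_path)]


# Hypothetical jittery horizontal motion and a static smooth path.
F = [translation(2, 0), translation(-1, 0), translation(3, 0)]
C = camera_path(F)                 # x offsets 2, 1, 4
P = [translation(2, 0)] * len(C)   # smooth path: constant x offset 2
B = crop_transforms(C, P)
print([b[0][2] for b in B])        # per-frame crop window x offsets
```

In this sketch the per-frame crop offsets absorb the difference between the jittery original path and the static smooth path, which corresponds to step (3) above, where the video is warped according to the crop window transform.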
FIG. 3 is an example block diagram of a method to process a video to estimate an original motion of the camera or camera path, in accordance with at least some embodiments described herein. Method 300 shown in FIG. 3 presents an embodiment of a method that, for example, could be used with the systems 100 and 200, and may be performed by a device, a server, or a combination of the device and the server. Method 300 may include one or more operations, functions, or actions as illustrated by one or more of blocks 302, 304, 306, 308, and 310. Although the blocks are illustrated in a sequential order, these blocks may in some instances be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed based upon the desired implementation.

In addition, for the method 300 and other processes and methods disclosed herein, the flowchart shows functionality and operation of one possible implementation of present embodiments. In this regard, each block may represent a module, a segment, or a portion of program code, which includes one or more instructions executable by a processor for implementing specific logical functions or steps in the process. The program code may be stored on any type of computer readable medium, for example, such as a storage device including a disk or hard drive. The computer readable medium may include a non-transitory computer readable medium, for example, such as computer-readable media that stores data for short periods of time like register memory, processor cache and Random Access Memory (RAM). The computer readable medium may also include non-transitory media, such as secondary or persistent long term storage, like read only memory (ROM), optical or magnetic disks, compact-disc read only memory (CD-ROM), for example. The computer readable media may also be any other volatile or non-volatile storage systems. The computer readable medium may be considered a computer readable storage medium, a tangible storage device, or other article of manufacture, for example.

In addition, for the method 300 and other processes and methods disclosed herein, each block in FIG. 3 may represent circuitry that is wired to perform the specific logical functions in the process.
At block 302, a recorded video is received. At block 304, trackable feature matches in frames of the video are extracted. For example, trackable features in each frame of the video are extracted, or trackable features in substantially all frames of the video are extracted. Trackable features in frames of the
changes in illumination, brightness values of a small image patch (e.g., 7x7 pixels) centered around the feature point x in $I_{t-1}$ and its matching point y in $I_t$ may be nearly identical. For each feature x in $I_{t-1}$, a displacement vector d may be determined such that $I_{t-1}(x) = I_t(x + d)$, and therefore x + d = y using the previous notation (i.e., feature matches (x <-> y)). This expression can be linearized by Taylor series expansion around x, yielding $DI_t(x) \cdot d = I_{t-1}(x) - I_t(x)$, which is linear in the unknown displacement vector d. An overdetermined linear system of equations of the form $A d = b$ may be determined, which can then be solved by using normal equations (i.e., solving the symmetric linear system $A^T A d = A^T b$ by Gaussian elimination, where $A^T$ denotes the transpose of A). This process may be referred to as pyramidical Lucas-Kanade Tracking.
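The normal-equation step described above can be sketched in a few lines of Python. The sketch assumes that each row of A holds the image gradient at one pixel of the patch and that the matching entry of b holds the brightness difference between the two frames at that pixel. The function name and the synthetic data are hypothetical, and the image pyramid and iterative refinement of a full tracker are omitted.

```python
def solve_normal_equations(A, b):
    """Solve (A^T A) d = A^T b for a 2-vector d by Gaussian elimination."""
    # Accumulate the symmetric 2x2 matrix A^T A and the right-hand side A^T b.
    s00 = sum(r[0] * r[0] for r in A)
    s01 = sum(r[0] * r[1] for r in A)
    s11 = sum(r[1] * r[1] for r in A)
    t0 = sum(r[0] * v for r, v in zip(A, b))
    t1 = sum(r[1] * v for r, v in zip(A, b))
    m = [[s00, s01, t0], [s01, s11, t1]]  # augmented matrix

    # Partial pivoting, then eliminate the first unknown from the second row.
    if abs(m[1][0]) > abs(m[0][0]):
        m[0], m[1] = m[1], m[0]
    if abs(m[0][0]) < 1e-12:
        raise ValueError("patch has no usable gradient")
    factor = m[1][0] / m[0][0]
    m[1] = [m[1][j] - factor * m[0][j] for j in range(3)]
    if abs(m[1][1]) < 1e-12:
        raise ValueError("gradients are degenerate (aperture problem)")

    # Back substitution.
    dy = m[1][2] / m[1][1]
    dx = (m[0][2] - m[0][1] * dy) / m[0][0]
    return dx, dy


# Synthetic check: build b from a known displacement and recover it.
gradients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
true_d = (0.5, -0.25)
differences = [g[0] * true_d[0] + g[1] * true_d[1] for g in gradients]
print(solve_normal_equations(gradients, differences))
```

The degenerate-gradient check reflects a general property of this kind of system: when all gradients in the patch point in the same direction, the 2x2 matrix is singular and the displacement cannot be determined uniquely.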
During feature tracking from one frame to the next frame, errors may accumulate. To detect potentially poor feature matches, images in a window around the feature in the current frame can be monitored to determine if the images are similar to the images around the feature in the first frame. Features may be tracked over many frame



