CN102957869A - Video stabilization - Google Patents

Video stabilization

Info

Publication number
CN102957869A
CN102957869A CN2012103630530A CN201210363053A
Authority
CN
China
Prior art keywords
video camera
frame
pixel displacement
equipment
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012103630530A
Other languages
Chinese (zh)
Other versions
CN102957869B (en)
Inventor
C.奥文
P.卡尔森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Skype Ltd Ireland
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GBGB1116566.9A external-priority patent/GB201116566D0/en
Application filed by Skype Ltd Ireland filed Critical Skype Ltd Ireland
Publication of CN102957869A publication Critical patent/CN102957869A/en
Application granted granted Critical
Publication of CN102957869B publication Critical patent/CN102957869B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/681 Motion detection
    • H04N23/6812 Motion detection based on additional sensors, e.g. acceleration sensors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682 Vibration or motion blur correction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682 Vibration or motion blur correction
    • H04N23/683 Vibration or motion blur correction performed by a processor, e.g. controlling the readout of an image memory
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H04N2007/145 Handheld terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

The invention relates to a method, a device and a computer program product for transmitting a video signal from a user device. A plurality of frames of the video signal are captured using a camera at the user device. A functional state of the device is determined, and the video signal is selectively stabilised prior to transmission based on that functional state.

Description

Video stabilization
Technical field
The present invention relates to stabilisation of a video signal. In particular, the present invention relates to capturing frames of a video signal using a camera and compensating for motion of the camera, thereby stabilising the video signal.
Background technology
A camera may be used to capture a sequence of images to be used as frames of a video signal. The camera may be fixed to a stable object, for example mounted on a stand such as a tripod, so that the camera remains still while the video frames are captured. However, cameras are often implemented in mobile devices and are not necessarily mounted to fixed objects; for example, a camera may be held in a hand or placed in a moving object such as a vehicle. Movement of the camera while it is capturing frames of a video signal may result in unwanted motion in the video signal itself.
Image stabilisation is a method that can be used to compensate for unwanted motion in a video signal. Some systems perform motion estimation in order to generate motion vectors for use in image stabilisation; one such system is described in the paper "Online Video Stabilization Based on Particle Filters" by Julan Yang et al. An image stabilisation algorithm may consist of three main parts: motion estimation, motion smoothing and motion compensation. The motion estimation part estimates local motion vectors within the video signal and calculates a global motion vector based on these local estimates. The motion smoothing part then filters the estimated global motion vector in order to smooth the calculated value and prevent large, undesirable differences relative to motion vectors calculated previously. The motion compensation part then shifts the image in the direction opposite to the filtered global motion vector, thereby stabilising the video signal. The motion compensation part may take into account sophisticated transformations such as rotation, warping or zooming.
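As an illustration of the estimation–smoothing–compensation split described above, the following sketch smooths per-frame global motion vectors with a moving average and compensates by translating opposite to the high-frequency residual. It is a minimal toy example under assumed names, not the particle-filter algorithm of the cited paper and not the method of this patent.

```python
def smooth(vectors, window=3):
    """Moving-average smoothing of per-frame global motion vectors (dx, dy)."""
    out = []
    for i in range(len(vectors)):
        lo = max(0, i - window + 1)
        xs = vectors[lo:i + 1]
        out.append((sum(v[0] for v in xs) / len(xs),
                    sum(v[1] for v in xs) / len(xs)))
    return out

def compensate(frame_offset, raw_vec, smooth_vec):
    """Shift the image opposite to the high-frequency (raw - smoothed) motion."""
    return (frame_offset[0] - (raw_vec[0] - smooth_vec[0]),
            frame_offset[1] - (raw_vec[1] - smooth_vec[1]))

raw = [(0.0, 0.0), (4.0, 0.0), (-4.0, 0.0), (4.0, 0.0)]  # shaky global motion
smoothed = smooth(raw)
```

The smoothed vectors retain the low-frequency (intentional) motion, while the difference between the raw and smoothed vectors is the shake that the compensation step cancels.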
Performing image stabilisation based on motion vectors as described above can require a large amount of processing resources. This may be a problem when the video signal is to be stabilised in real time, i.e. when a stabilised version of the video signal is to be used (e.g. transmitted in a video call or output from the device) while the video signal is being captured by the camera. It may equally be a problem when the device performing the image stabilisation is a small mobile device, such as a mobile phone, in which processing resources are limited.
In recent years, motion sensors have become simpler and cheaper to manufacture and have decreased significantly in size, and it is now feasible to implement motion sensors in mobile devices. A motion sensor generates samples representing the motion of the sensor. The possibility of using data derived from motion sensors to stabilise a video signal is addressed in two prior documents: "Using Sensors for Efficient Video Coding in Hand-held devices" by Andy L. Lin and "Accelerometer Based Digital Video Stabilization for General Security Surveillance Systems" by Martin Drahansk et al.
The known processing tool "VirtualDub" provides offline (i.e. non-real-time) stabilisation which requires cropping of the video signal.
In British application 1109071.9, a mechanism for online (i.e. real-time) transmitter-side digital stabilisation has been proposed. This mechanism is effective in many situations, but it requires cropping of the transmitted video, which may affect the quality of the transmitted video.
Summary of the invention
According to an aspect of the present invention there is provided a method of transmitting a video signal from a user device, the method comprising: capturing a plurality of frames of the video signal using a camera at the user device; determining a functional state of the device; and selectively stabilising the video signal prior to transmission based on the functional state.
Preferably, the functional state is the degree of motion of the camera, and the method comprises monitoring the motion of the camera and comparing it with a threshold.
The monitoring may comprise: using a motion sensor associated with the camera to generate a plurality of samples representing motion of the camera; using the samples to determine a displacement of the camera between successive frames captured by the camera; and determining a pixel displacement representing motion in the video signal between the successive frames caused by the determined camera displacement. The method may further comprise: comparing the pixel displacement with the threshold; if the pixel displacement exceeds the threshold, stabilising the video signal before transmission; and otherwise transmitting the video signal without applying stabilisation.
Preferably, the motion of the camera is rotational motion, the motion sensor is a rotational motion sensor, and the displacement of the camera is an angular displacement of the camera.
The user device may comprise a front-facing camera and a rear-facing camera. When the functional state is that the front-facing camera is selected, the video signal may be transmitted without applying stabilisation. Preferably, when the rear-facing camera of the device is selected, the video signal is stabilised.
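The selective stabilisation described in this aspect — skipping stabilisation entirely when the front-facing camera is selected, and otherwise comparing the camera-induced pixel displacement with a threshold — can be sketched as follows. The function name and the rotation-to-pixels scaling factor are illustrative assumptions, not taken from the patent.

```python
def should_stabilise(front_camera_selected, angular_displacement_rad,
                     pixels_per_radian, threshold_px):
    """Decide whether to stabilise the video signal before transmission.

    Stabilisation is skipped for the front-facing camera; otherwise the
    camera's angular displacement is mapped to a pixel displacement and
    compared with the threshold.
    """
    if front_camera_selected:
        return False
    pixel_displacement = abs(angular_displacement_rad) * pixels_per_radian
    return pixel_displacement > threshold_px
```

For example, with an assumed scaling of 1000 pixels per radian and a 5-pixel threshold, a 0.01 rad displacement of the rear camera would trigger stabilisation, while the same displacement of the front camera would not.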
Preferably, the sample rate of the samples generated by the motion sensor is higher than the frame rate of the video signal.
The camera and the motion sensor may be housed within a mobile device.
The step of stabilising the video signal may comprise: filtering the pixel displacement; and shifting the image of at least one of the first and second frames in accordance with the filtered pixel displacement.
The step of filtering the pixel displacement may comprise: determining an accumulated pixel displacement based on the pixel displacement determined for the second frame; and determining a filtered accumulated pixel displacement for the second frame based on a weighted sum of the accumulated pixel displacement determined for the second frame and the filtered accumulated pixel displacement for the first frame.
Preferably, a time offset is added to at least one of: (i) the plurality of captured frames; and (ii) the plurality of generated samples, such that the timing of the captured frames matches the timing of the generated samples.
If the pixel displacement exceeds the threshold, a timer may be used to determine whether a predetermined period has passed, and the video signal may be stabilised only if that period has passed. If the pixel displacement does not exceed the threshold, the timer may be reset.
According to a second aspect of the invention there is provided a device for stabilising a video signal, the device comprising: a camera configured to capture a plurality of frames of the video signal; means for determining a functional state of the device; and means for selectively stabilising the video signal prior to transmission based on the functional state.
Preferably, the functional state is the degree of motion of the camera, and the device comprises means for monitoring the motion of the camera and comparing it with a threshold.
The device may further comprise: a motion sensor associated with the camera and configured to generate a plurality of samples representing motion of the camera; a pixel displacement determining block configured to use the samples to determine a pixel displacement of the camera between successive frames captured by the camera, the pixel displacement representing motion in the video signal between the successive frames caused by the camera motion; a comparison block configured to compare the pixel displacement with a predetermined threshold; and a motion compensation block configured to stabilise the video signal before transmission if the pixel displacement exceeds the threshold, and otherwise to apply no stabilisation before transmission.
Preferably, the motion of the camera is rotational motion, the motion sensor is a rotational motion sensor, and the displacement of the camera is an angular displacement of the camera.
The motion sensor may be a gyroscopic motion sensor. The device may be a mobile device.
According to a third aspect of the present invention there is provided a computer program product for stabilising a video signal, the computer program product being embodied on a non-transitory computer-readable medium and configured so as, when executed on a processor of a device, to perform the operations of any of claims 1 to 13.
The inventors have realised that, in order to maximise the resolution of the output video, it is desirable to apply video stabilisation only when it is needed, so as to avoid unnecessarily cropping the transmitted video signal.
Description of drawings
For a better understanding of the present invention and to show how it may be put into effect, reference will now be made by way of example to the following drawings, in which:
Figures 1a and 1b show a device according to a preferred embodiment;
Figure 2 is a flow chart of a process for stabilising a video signal according to a preferred embodiment;
Figure 3 is a flow chart of a monitoring process according to a preferred embodiment;
Figure 4 is an exemplary graphical representation of a camera shutter position over time;
Figure 5 is an exemplary graphical representation of the angular velocity of a camera over time;
Figure 6 is a flow chart of a stabilisation process according to a preferred embodiment; and
Figure 7 is a representation of an image before and after a cropping process.
Embodiment
Preferred embodiments of the invention will now be described by way of example only.
Figures 1a and 1b show a device 100 according to a preferred embodiment. The device 100 may, for example, be a mobile device such as a mobile phone or other handheld device. The device 100 comprises a front-facing camera 104 facing in the same direction as a screen 101, and a rear-facing camera 102 facing in the opposite direction to the screen 101. The device 100 further comprises a motion sensor 110, a CPU 106 and a memory 108. Each of the front-facing camera 104 and the rear-facing camera 102 is configured to capture images when selected by a user. The captured images may be used to form a video signal, whereby each image is used as a frame of the video signal and the images are captured at the frame rate of the video signal. The frame rate may, for example, be 25 frames per second, although the cameras may operate at different frame rates. The minimum frame rate for achieving the perception of a moving image is approximately 15 frames per second, but this may depend on the person viewing the video and on the content of the video signal (i.e. how much motion there is in the subject matter of the video signal). The motion sensor 110 is configured to generate samples representing the motion of the device 100. Since both the motion sensor 110 and the cameras 104 and 102 are housed in the device 100, they are associated with one another, and the samples generated by the motion sensor 110 can therefore be used to represent the motion of either of the cameras 104 and 102. The CPU 106 is configured to perform computations on the device 100, as is known in the art. The memory 108 is used to store data on the device 100, as is known in the art. The components 102, 104, 106 and 108 can communicate with one another by sending data via a bus of the device 100 (not shown in Figure 1), as is known in the art.
The stabiliser described here can be in one of three states. The stabiliser may be implemented as code executed on the CPU 106.
The state of the stabiliser is selected based on the functional state of the mobile device. In the first state, the stabiliser is switched off and the stabilisation process described below is not applied. The stabiliser is in the first state when the front-facing camera 104 is capturing the video signal. Stabilisation is disabled for the front-facing camera 104 because the cropping process narrows the field of view, and because a stabiliser that steadies the background may actually introduce more motion into the video signal.
In the second state, the stabiliser monitors the device for high-frequency motion (shakiness), but the video signal is not cropped. The stabiliser is in the second state when the rear-facing camera 102 is capturing the video signal.
In the third state, stabilisation is applied, whereby the video signal captured by the rear-facing camera 102 is cropped.
Figure 2 shows a process for stabilising a video signal using the device 100 according to a preferred embodiment.
In step S201 the process determines whether the video signal is being captured by the front-facing camera 104, i.e. whether the stabiliser is in the first state. If the video signal is being captured by the front-facing camera 104, the process does not proceed further. If the video signal is not being captured by the front-facing camera 104 (that is, it is being captured by the rear-facing camera 102), the process proceeds to step S202.
In step S202 the shakiness of the rear-facing camera 102 is monitored.
The monitoring step S202 is shown in more detail in Figure 3. In step S302 the rear-facing camera 102 captures images to be used as frames of the video signal. For example, the camera 102 may have an array of light sensors which record the level of light incident on the sensors during the time allocated to a frame of the video signal. A shutter of the camera 102 is used to separate the frames in time, such that during each frame the shutter is open for a period of time and closed for another period of time. The captured frames of the video signal are provided to a pre-processor (implemented, for example, by the CPU 106) in a processing block. The pre-processor operates to stabilise the images in the frames of the video signal before the frames are encoded using video encoding techniques known in the art.
In step S304, while the camera 102 is capturing frames of the video signal, the motion sensor 110 generates samples representing the motion of the device 100. For example, the motion sensor 110 may be a rotational motion sensor such as a gyroscope. The gyroscope measures the angular velocity of the device 100 and outputs samples representing the angular velocity at particular intervals. The intervals may, but need not, be regular. Preferably, the average sample rate of the samples output from the gyroscope 110 is higher than the frame rate of the video signal, but this is not essential. For example, the sample rate output from the gyroscope 110 may be 60 samples per second, which reflects the usual maximum vibration frequency of the device 100 and is independent of the frame rate. The samples generated by the gyroscope 110 are provided to the pre-processor.
In step S306 the angular displacement of the camera 102 between two frames (frame 1 and frame 2) of the video signal is determined. This determination may be performed by a processing block of the CPU 106. The inventors have determined that, in order to use the data from the gyroscope 110 to determine the angular displacement between two frames effectively, it is useful to integrate the angular velocity over the time interval between the midpoints of the exposure times of the frames captured by the camera 102. The inventors have also determined that, because the sample rate of the gyroscope 110 may not be synchronised with the frame rate of the camera 102, this can be particularly problematic, especially in the following circumstances:
the camera 102 is set to adjust its exposure time in dependence on the available light (as many cameras are);
the timestamps which the camera 102 provides for the frames of the video signal relate to the times at which the shutter closes (i.e. the end times of the frames, as opposed to the midpoints of the exposure times of the frames); and
no gyroscope data is available at the midpoint of the exposure time of a frame.
As mentioned above, the pre-processor receives the frames of the video signal from the camera 102, and also receives the samples from the gyroscope 110. The samples from the gyroscope 110 are provided to the pre-processor (e.g. at regular intervals) at a rate at least equal to the frame rate of the video signal captured by the camera 102. Using a higher sample rate in the gyroscope 110 gives more accurate angle estimates, but at a higher cost in CPU usage.
The timestamp t1 which the camera 102 assigns to the first frame (frame 1) of the video signal indicates the end time of that frame, i.e. the time at which the shutter of the camera 102 closes to end frame 1. Similarly, the timestamp t2 which the camera 102 assigns to the second frame (frame 2) indicates the end time of that frame, i.e. the time at which the shutter of the camera 102 closes to end frame 2. In determining the angular displacement (θ) of the device 100 between the first frame and the second frame, rather than using the timestamps of the frames to represent the times of the frames, it is more accurate to use the midpoints of the exposure times of frame 1 and frame 2. The exposure times of the first and second frames are denoted e1 and e2 respectively. The angular displacement is determined by integrating the angular velocity of the device 100 (represented by the samples output from the gyroscope 110) between time t1 - 0.5e1 and time t2 - 0.5e2. The angular displacement between frame 1 and frame 2 is therefore given by:
\theta = \int_{t_1 - 0.5 e_1}^{t_2 - 0.5 e_2} \omega(t)\, dt
where ω(t) is the angular velocity of the device 100 at time t.
Figure 4 is an exemplary graphical representation of the shutter position of the camera 102 over time. The shutter of the camera 102 closes at time t1 at the end of frame 1. The shutter reopens in order to capture frame 2, and then closes again at time t2 at the end of frame 2. In Figure 4 the exposure time of frame 1 is shown as e1 and the exposure time of frame 2 is shown as e2, and T12 denotes the time over which the angular velocity is integrated. It can be seen from Figure 4 that the integration over the time T12 corresponds to integration between the midpoint of the exposure time of the first frame (at time t1 - 0.5e1) and the midpoint of the exposure time of the second frame (at time t2 - 0.5e2). Figure 4 shows the open time of the shutter as being equal to its closed time, but this is only an example: in some embodiments (implementing short exposure times) the time for which the shutter is open is shorter than the time for which it is closed, whereas in other embodiments (implementing long exposure times) the time for which the shutter is open is longer than the time for which it is closed.
Since the gyroscope 110 does not necessarily generate samples at the midpoints of the frames (frame 1 and frame 2), a situation may arise in which the samples from the gyroscope 110 are not synchronised with the timing of the frames of the video signal captured by the camera 102. In this case, the device 100 may determine the angular velocity at the midpoint of a frame by interpolating between the angular velocities represented by the samples generated by the gyroscope 110. The interpolation allows the angular velocity to be evaluated at any time, and the midpoints of the exposure times of the frames define the integration interval used to calculate the angular displacement according to the equation above.
Figure 5 is an exemplary graphical representation of the angular velocity of the camera 102 over time. In Figure 5 the samples generated by the gyroscope 110, representing the angular velocity of the device 100, are shown as samples 502, 504, 506, 508 and 510. It can be seen that in the example shown in Figure 5 the timing of the samples from the gyroscope 110 is not regular; for example, the time between samples 504 and 506 is shorter than the time between samples 506 and 508. The dotted line connecting the samples in Figure 5 shows the value of the angular velocity that can be determined, as a function of time, by interpolating between the angular velocities represented by the samples generated by the gyroscope 110. The interpolated angular velocity (shown by the dotted line) can be integrated between times (t1 - 0.5e1) and (t2 - 0.5e2) in order to determine the angular displacement of the camera 102 between the first frame and the second frame. Figure 5 shows a simple linear interpolation between the samples from the gyroscope 110; in other embodiments, more advanced interpolation could be used.
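The interpolation and integration described above can be sketched as follows. Linear interpolation between gyroscope samples corresponds to the dotted line of Figure 5, while the trapezoidal evaluation of the integral and the function names (`interp`, `angular_displacement`) are illustrative assumptions rather than details taken from the patent.

```python
def interp(ts, ws, t):
    """Linearly interpolate the angular velocity at time t from samples (ts, ws)."""
    if t <= ts[0]:
        return ws[0]
    if t >= ts[-1]:
        return ws[-1]
    for i in range(1, len(ts)):
        if t <= ts[i]:
            f = (t - ts[i - 1]) / (ts[i] - ts[i - 1])
            return ws[i - 1] + f * (ws[i] - ws[i - 1])

def angular_displacement(ts, ws, t1, e1, t2, e2):
    """Integrate angular velocity from t1 - e1/2 to t2 - e2/2 (trapezoid rule)."""
    a, b = t1 - 0.5 * e1, t2 - 0.5 * e2
    # integration knots: the two exposure midpoints plus all interior samples
    knots = [a] + [t for t in ts if a < t < b] + [b]
    vals = [interp(ts, ws, t) for t in knots]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (knots[i + 1] - knots[i])
               for i in range(len(knots) - 1))
```

With a constant angular velocity of 2 rad/s over a 2 s interval, for example, the sketch returns an angular displacement of 4 rad, matching the closed-form integral.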
A situation may arise in which a frame to be stabilised is received at the pre-processor after the latest sample from the gyroscope 110. For example, when frame 2 is captured by the camera 102, frame 2 may be received at the pre-processor before the gyroscope has generated any sample after the midpoint (t2 - 0.5e2) of the exposure time of frame 2; for instance, frame 2 may be received at the pre-processor before the sample 510 shown in Figure 5. In this situation, a delay may be introduced into the video stream so that the sample 510 is received at the pre-processor before frame 2 is processed, thereby allowing the angular velocity at time (t2 - 0.5e2) to be determined before the pre-processor processes frame 2. Alternatively, the angular velocity may be extrapolated from the samples previously received from the gyroscope 110 in order to determine the angular velocity of the device 100 at time (t2 - 0.5e2).
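The extrapolation alternative mentioned above could, for example, be a linear extrapolation from the last two received samples. This is an illustrative sketch only; the patent does not specify the extrapolation method.

```python
def extrapolate_angular_velocity(ts, ws, t):
    """Linearly extrapolate the angular velocity at time t from the last two samples."""
    if len(ts) < 2 or ts[-1] == ts[-2]:
        return ws[-1]            # fall back to holding the last value
    slope = (ws[-1] - ws[-2]) / (ts[-1] - ts[-2])
    return ws[-1] + slope * (t - ts[-1])
```

Extrapolation avoids adding delay to the video stream, at the cost of accuracy if the angular velocity changes abruptly after the last sample.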
If the camera 102 is not moving (for example because the device 100 has been placed on a fixed surface), the gyroscope 110 may be disabled in order to save battery life. A no-motion state can be determined from feedback from the video encoder, which encodes the video signal after the image stabilisation method implemented by the pre-processor and described herein has been performed. As part of the encoding process, the video encoder may perform motion estimation and can thereby determine whether the camera is moving. Similarly, when the camera 102 is moving, a motion state can be determined and used to enable the gyroscope 110. While the device 100 is operating in the no-motion state, the motion sensor 110 may be polled at slow intervals in order to determine whether the device 100 has started moving again. Depending on the hardware and the application programming interfaces (APIs) implemented in the operating system of the device 100, there may be ways with less computational overhead of determining when the device 100 starts to move.
The timing of operation of the hardware used for the camera 102 and for the gyroscope 110 might not match, one reason being that the camera 102 and the gyroscope 110 may be implemented in independent hardware chips. It may therefore be useful to add an offset to the timestamps of either the samples generated by the gyroscope 110 or the frames of the video signal (or both), so that the timing of the samples from the gyroscope 110 correctly matches the timing of the frames of the video signal. The offset may be constant for a particular combination of hardware chips; it can therefore be computed offline and used at the device 100 without incurring any processing penalty for the methods described herein.
Referring back to Figure 3, in step S308 a pixel displacement representing the motion of the camera 102 is determined. In general, rotation of the camera 102 causes an approximately constant pixel displacement in the images of the frames of the video signal, independent of the distance to the objects in the image. This is in contrast to linear camera motion, for which the pixel displacement is a function of the distance to the object. The function (or algorithm) mapping a rotation of the device 100 to a pixel displacement depends on parameters of the camera 102 (for example the focal length and the width of the lens) and on the resolution of the images captured by the camera 102. Encoder feedback can be used to determine the precision of the samples generated by the gyroscope 110 and to adapt the mapping algorithm. There are also some situations, involving motion and object displacement, in which the stabilisation model described herein, based on the samples from the gyroscope 110, is inaccurate (for example, for rotation of the camera 102 around the user's face, the user's face may remain steady in the centre of the frame while the gyroscope 110 detects the rotation, so the stabilisation process would attempt to steady the background); this can be detected by the encoder and fed back to the stabilisation algorithm, which can be adapted in this way.
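As noted above, the rotation-to-pixels mapping depends on camera parameters such as focal length, lens width and image resolution. Under a simple pinhole-camera assumption (not stated in the patent), a rotation θ about an axis perpendicular to the optical axis maps to roughly f·tan(θ) pixels, where f is the focal length expressed in pixels and is itself derivable from the horizontal field of view and the image width. The following is a sketch of such a mapping, not the patent's own algorithm:

```python
import math

def focal_length_px(image_width_px, horizontal_fov_rad):
    """Pinhole-model focal length in pixel units."""
    return (image_width_px / 2.0) / math.tan(horizontal_fov_rad / 2.0)

def rotation_to_pixels(theta_rad, image_width_px, horizontal_fov_rad):
    """Approximate horizontal pixel displacement caused by a rotation theta."""
    f = focal_length_px(image_width_px, horizontal_fov_rad)
    return f * math.tan(theta_rad)
```

Note that this displacement does not depend on the distance to the scene, which is consistent with the observation above that rotation causes an approximately constant pixel displacement regardless of object distance.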
The pixel displacement determined in step S308 represents the magnitude of the motion in the images of the frames of the video signal caused by the motion of the camera 102 (as opposed to motion of the subjects in the image). In this way, the pixel displacement determined in step S308 represents the unwanted high-frequency motion (shaking) in the images of the frames of the video signal.
Referring back to Figure 2, after the monitoring step S202, in step S204 the magnitude of the motion in the images of the frames of the video signal caused by the motion of the camera 102 is compared with a predetermined threshold, in order to determine whether the device 100 is shaking.
If it is determined in step S204 that the device 100 is not shaking, the process passes back to the monitoring step S202; the video signal is thus not cropped, so that the video signal retains the maximum output video resolution. Once this determination has been made, a timer implemented in the device 100 (not shown in Figure 1b) is reset.
If it is determined in step S204 that the device 100 is shaking, the process proceeds to step S206, in which it is determined whether a period of time has passed since the timer was last reset.
If the period has not yet passed, the process passes back to the monitoring step S202. If the period has passed, the process proceeds to the stabilisation step S208.
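The behaviour of steps S204–S208 amounts to a hysteresis on the shake detector: the timer is reset whenever the displacement is below the threshold, and stabilisation engages only once shaking has persisted for the whole period. A sketch follows; the class name, units and example values are illustrative assumptions.

```python
class ShakeMonitor:
    """Engage stabilisation only after shake persists for `hold_period_s` seconds."""

    def __init__(self, threshold_px, hold_period_s):
        self.threshold = threshold_px
        self.hold = hold_period_s
        self.shake_start = None      # time at which continuous shaking began

    def update(self, pixel_displacement, now_s):
        """Return True if the current frame should be stabilised."""
        if abs(pixel_displacement) <= self.threshold:
            self.shake_start = None  # below threshold: reset the timer
            return False
        if self.shake_start is None:
            self.shake_start = now_s
        return (now_s - self.shake_start) >= self.hold

m = ShakeMonitor(threshold_px=3.0, hold_period_s=0.5)
```

The hold period prevents the stabiliser from toggling on for brief, isolated jolts, which would otherwise cause the cropped field of view to flicker in and out.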
The stabilisation step S208 is described in more detail below with reference to Figure 6.
In step S602 the pixel displacements determined in step S308 are filtered. This is done in order to smooth the changes that the image stabilisation process applies to the video signal over time, thereby providing a smoother stabilised video signal. The filter used to filter the pixel displacements can be designed in different ways; for example, its design may depend on the resolution of the images captured by the camera 102, the acceptable delay that may be applied to the video signal, and the permitted rate of cropping that may be applied to the original image of the video signal received at the pre-processor from the camera 102. For example, higher-resolution video frames may benefit from an algorithm which allows larger high-frequency changes in the pixel displacements applied in the image stabilisation process. On the other hand, the crop rate places a hard limit on the maximum that the algorithm can apply.
The pixel displacements may be filtered with an exponential filter according to the following equation:
x_filt(n) = (1 - w) * x_filt(n-1) + w * x(n),
where n denotes the frame number of the video signal, x denotes the accumulated displacement (or "position") according to the pixel displacements determined in step S308, and x_filt denotes the filtered accumulated displacement, which is subsequently used to determine how to align the input image in order to stabilize it, as described in more detail below. In this way the filter acts as an exponential filter. When the motion stops, x_filt - x converges to zero, which corresponds to a non-moving image. The filter smooths out changes in the determined pixel displacements over time by basing the filtered pixel displacement for the current frame on the corresponding filtered pixel displacement of the previous frame and on the pixel displacement determined for the current frame in step S308. The weighting applied to the filtered pixel displacement of the previous frame is (1 - w), whereas the weighting applied to the pixel displacement determined for the current frame is w. Adjusting the weighting parameter w therefore adjusts how the filter responds to changes in the pixel displacement (x). When the output x_filt is clipped to keep it within the range [x - crop, x + crop], a recursive (infinite impulse response, IIR) filter is more suitable than a finite impulse response (FIR) filter, because the clipped value is fed back into the filter loop, which makes subsequent x_filt outputs less prone to clipping.
Weighting parameters w is adapted to resolution and the instant frame rate of vision signal, in order to obtain the constant physics cut-off frequency of measuring with hertz.If filter is ideal filter, the physics cut-off frequency will define the highest frequency component of the variation that is introduced in the x among the x_filt so.Variation with x of the frequency higher than cut-off frequency will be decayed by ideal filter, and will can not occur in x_filt.Yet this filter is not ideal filter, and just because of this, the highest frequency that the cut-off frequency definition is such, the decay of being used by filter for it is lower than 3dB.Therefore, for nonideal filter, below cut-off frequency, exist some decay, and more than cut-off frequency, also do not have complete attenuation.Filter output will be pruned, so that the difference between x_filt and the x can not pruned size greater than frame.W is adapted to and causes the physics cut-off frequency is constant, for example 0.5Hz.From filter transfer function, can derive function w(fc, a fs), described function can be mapped to w with physics cut-off frequency fc.When sample frequency (frame rate) fs changed, even fc is constant, w also can change.Than other filters, be fit to very much the cut-off frequency (changing w) of instant variation according to the filter of above-mentioned filtering equation.
At step S604 the image of the second frame (frame 2) is shifted using the filtered pixel displacement from step S602. In this way, the motion in the image of the second frame (relative to the first frame) due to the motion of the camera 102 is attenuated. In other words, the filtered pixel displacement is used to compensate for the motion in the video signal between the first and second frames caused by the camera motion, thereby stabilizing the video signal.
The filtered pixel displacements are rounded to full-pixel displacements (i.e. displacements by an integer number of pixels). This allows the image of the second frame to be shifted in a simple manner. The image is represented by a stride value indicating the memory space of the image, a plurality of pixel values, a pointer indicating the position of the first pixel of the image, and a width value indicating the width of the image. Shifting the image comprises adjusting the pointer and the width value without adjusting the stride. It can be seen that the width value is independent of the stride value, which allows the width of the image to be changed without affecting the stride of the image. Therefore, when the image is shifted (and/or resized), the memory space of the image (for example in the memory 108) does not need to be changed. This means that the method does not require copying of data within the memory 108. This is in contrast to conventional image cropping methods, in which the cropped region of the image is copied to a new memory area. Copying the cropped region can be computationally complex, which can be detrimental, particularly when the method is implemented on a mobile device in which the processing resources available to the CPU 106 may be limited. With the method described herein, because the width value is independent of the stride value, a new, shifted image can be created by changing the pointer and the width while keeping the stride constant.
An image may be represented by multiple image planes, for example a luma plane (Y) and two chroma planes (U and V). The image planes of the input image can be shifted and resized simply by changing the pointers to the luma and chroma planes, thereby modifying the width of the image planes while keeping the stride constant. The image planes are all shifted by the same amount, in order to ensure that the shifted image planes together represent the shifted image.
In order to implement this image shifting, the image planes require respective pointers; that is to say, they cannot all be represented by the same pointer. Furthermore, as described above, the image must have independent width and stride values.
Figure 7 is a representation of an image before and after the shifting and cropping process. The original image is denoted 702 and the shifted and cropped image is denoted 704. It can be seen that the stride value of the image is left unchanged, whereas the width of the image is reduced. Furthermore, the original pointer points to the top-left pixel of the original image, whereas the adjusted pointer points to the top-left pixel of the shifted and cropped image (which is at a different position from the top-left pixel of the original image). In this way, the image is shifted and cropped simply by changing the width value and the pointer.
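The pointer/width/stride scheme of Figure 7 can be mirrored in a short sketch using NumPy, whose array views likewise store a data pointer, shape, and strides separately for each array. The 2-pixel crop margin and plane sizes used below are illustrative only, not values from the patent:

```python
import numpy as np

def shift_view(plane, dx, dy, crop):
    """Zero-copy shift of one image plane by (dx, dy), with |dx|, |dy| <= crop.

    Slicing a NumPy array only moves the view's data pointer and shrinks its
    width/height; the underlying pixel buffer and its row stride are left
    untouched, so no data is copied -- the analogue of adjusting the pointer
    and width value while keeping the stride constant."""
    h, w = plane.shape
    assert abs(dx) <= crop and abs(dy) <= crop
    y0, x0 = crop + dy, crop + dx
    return plane[y0:y0 + h - 2 * crop, x0:x0 + w - 2 * crop]

# Each plane (e.g. Y, U and V) would be shifted by the same amount with its
# own call, since each plane needs its own pointer.
```

A call such as `shift_view(y_plane, 1, -1, crop=2)` returns a view whose row stride equals that of the original plane while its width and height are each reduced by twice the crop margin.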
To summarize the method described above, the following stages are implemented in the pre-processor to stabilize the images of the frames of the video signal before the video signal is encoded with the video encoder:
1. estimate the angular displacement of the camera 102 between frame 1 and frame 2 (step S306);
2. map the estimated angular displacement to a pixel displacement of the image of frame 2 (step S308);
3. remove unintended motion in the image of frame 2 by applying a filter to the sequence of pixel displacements (or accumulated pixel displacements, as described above) (step S602); and
4. create a stabilized image of frame 2 by shifting the image to the position calculated by the filter (step S604). The frame dimensions of the stabilized image of frame 2 are equal to or smaller than the corresponding dimensions of the original image of frame 2. In other words, the stabilized images of the video signal are constructed by cutting away a moving border within the original images of the video signal captured by the camera 102.
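The four stages above can be sketched as a per-frame loop as follows. This is an illustrative outline only: the small-angle mapping from angular displacement to pixel displacement assumed in `angle_to_pixels`, and the width, field-of-view, weight and crop values, are hypothetical and are not taken from the patent (the actual mapping of step S308 is described earlier in the patent, outside this excerpt):

```python
import math

def angle_to_pixels(theta_rad, width_px, fov_rad):
    # Stage 2 (assumed small-angle approximation): a rotation of theta moves
    # the scene by roughly theta/fov of the image width.
    return theta_rad * width_px / fov_rad

def stabilise_frame(theta_rad, state, width_px=640, fov_rad=math.radians(60),
                    w=0.1, crop=8):
    """state = (x, x_filt): accumulated position and filtered position."""
    x, x_filt = state
    x += angle_to_pixels(theta_rad, width_px, fov_rad)  # stages 1-2
    x_filt = (1 - w) * x_filt + w * x                   # stage 3: IIR filter
    x_filt = min(max(x_filt, x - crop), x + crop)       # keep within crop margin
    shift = round(x_filt - x)                           # stage 4: integer shift
    return shift, (x, x_filt)
```

Here `shift` is the full-pixel displacement that would be applied to the frame's image planes via the pointer/width adjustment described above.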
Referring back to Figure 2, after stabilization step S208 the process proceeds to step S210.
At step S210 the next frame captured in step S302 is considered, in order to determine whether the device 100 is still shaking. That is, the angular displacement of the camera 102 between two frames (frame 2 and frame 3) is determined and mapped to a pixel displacement, which represents the motion in the images of the frames of the video signal caused by the motion of the camera 102, as described in detail above with reference to steps S306 and S308.
At step S210 the filtered magnitude of the motion in the images of the frames of the video signal caused by the motion of the camera 102 is compared with a predetermined threshold, in order to determine whether the device 100 is still shaking.
If it is determined at step S210 that the device 100 is still shaking, the process returns to step S208. At step S208, the unintended motion in the image of frame 3 is removed by applying the filter to the sequence of pixel displacements (or accumulated pixel displacements, as described above), and a stabilized image of frame 3 is created by shifting the image to the position calculated by the filter, as described in detail above. Once this has been done, the timer implemented at the device 100 (not shown in Figure 1b) is reset.
If it is determined at step S210 that the device 100 is not shaking, the process proceeds to step S212, in which it is determined whether a period of time has passed since the timer was last reset.
If it is determined at step S212 that the period of time has not yet passed, the process returns to stabilization step S208, in which a stabilized image of frame 3 is created by shifting the image to the position calculated by the filter.
If it is determined at step S212 that the period of time has passed, the process returns to monitoring step S202, whereby no cropping process is implemented for frame 3.
It will be appreciated that the continuity of the transmitted video signal is improved by the use of the timeout values (which define the periods of time mentioned above) at steps S206 and S212.
The predetermined threshold used at step S204 to turn the stabilization process on is applied by comparing the (filtered) change in acceleration with an acceleration threshold. The predetermined threshold used at step S210 to turn the stabilization process off is applied by comparing the pixel motion vectors with a pixel threshold in each direction.
It will be appreciated that the thresholds and timeouts described above can be implementation-specific, depending on the accuracy of the sensor data and the resolution changes that are required.
It will be appreciated that the process described above continues to loop, in order to consider, for each frame captured by the camera 102, whether stabilization is required. Using the process described above with reference to Figure 2 ensures that the stabilization process is enabled only when it is needed, i.e. the video signal is cropped only when necessary, thereby retaining the maximum resolution of the video signal.
In the embodiments described above, the motion sensor 110 is a gyroscope which generates samples representing the rotational motion of the device 100. In other embodiments, the motion sensor 110 may sense other types of motion, such as translational motion, and may generate samples representing the translational motion of the device 100. These samples can be used to stabilize the video signal in the same way as described above in relation to rotational motion. However, as described above, with translational motion the pixel displacement depends on the distance to the objects in the image, which must therefore be taken into account when determining the pixel displacement. For example, multiple accelerometers may be capable of estimating rotational motion, in which case the accelerometers can be used without further modification. For more general translational stabilization, implementing the method described herein may become more difficult, because different regions within the image will have moved by different numbers of pixels. However, if the distance to the objects is constant (and known), implementing the method for translational motion is simple. Even where the distance to the objects is not constant (but is still known), the method can be implemented for translational motion, but extra complexity is added to the process of determining the pixel displacements caused by translational motion of the camera 102.
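For the translational case discussed above, the dependence on object distance can be made concrete under a pinhole-camera assumption (the symbols and values below are illustrative, not from the patent): a lateral camera translation T at object depth Z produces an image displacement of approximately f*T/Z pixels, where f is the focal length expressed in pixels.

```python
def translational_pixel_displacement(translation_m, depth_m, focal_px):
    """Pinhole-camera approximation: a lateral camera translation T (metres)
    shifts an object at depth Z (metres) by roughly f * T / Z pixels.
    Unlike the rotational case, the result depends on the object's depth,
    so different regions of the image generally move by different amounts."""
    return focal_px * translation_m / depth_m
```

For instance, a 1 cm translation with f = 800 px shifts an object 2 m away by 4 px but an object 0.5 m away by 16 px, which is why a single global shift only works when the depth is (approximately) constant.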
After the video signal has been stabilized, it is encoded with the video encoder. The encoded video signal may, for example, be transmitted as part of a video call with another user or as a broadcast signal. It is therefore important that the video signal can be stabilized and encoded in real time (i.e. with very little delay), for use in communication events, such as video calls, in which the users perceive delay in the signal very clearly. Alternatively, the encoded video signal may be stored at the device 100, for example in the memory 108.
The method steps S200 to S212 may be implemented at the device 100 in software or in hardware. For example, the CPU 106 may execute processing blocks to implement the steps S200 to S212. For example, a computer program product for stabilizing a video signal may be provided, which can be stored in the memory 108 and executed by the CPU 106. The computer program product may be configured so as, when executed on the CPU 106, to perform the method steps S200 to S212. Alternatively, hardware blocks may be implemented in the device 100 to perform the steps S200 to S212.
Furthermore, while this invention has been particularly shown and described with reference to preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made without departing from the scope of the invention as defined by the appended claims.

Claims (10)

1. A method of transmitting a video signal from a user device (100), the method comprising:
capturing a plurality of frames of the video signal with a camera (102, 104) at the user device (100);
determining a functional state of the device (100); and
selectively stabilizing the video signal, in dependence on the functional state, before transmission.
2. The method of claim 1, wherein the functional state is a degree of motion of the camera (102, 104), the method comprising:
monitoring the motion of the camera (102, 104) and comparing it with a threshold.
3. The method of claim 2, wherein said monitoring comprises:
using a motion sensor (110) associated with the camera (102, 104) to generate a plurality of samples representing motion of the camera (102, 104);
using the samples to determine a displacement of the camera (102, 104) between successive frames captured by the camera (102, 104); and
determining a pixel displacement representing motion in the video signal between the successive frames caused by the determined displacement of the camera (102, 104), the method further comprising:
comparing the pixel displacement with a threshold;
stabilizing the video signal before transmission if the pixel displacement exceeds said threshold; and
otherwise transmitting the video signal without implementing a stabilization process,
wherein the motion of the camera (102, 104) is rotational motion, the motion sensor (110) is a rotational motion sensor, and the displacement of the camera (102, 104) is an angular displacement of the camera (102, 104).
4. The method of claim 1, wherein the user device (100) comprises a front-facing camera (104) and a rear-facing camera (102), and the functional state is that the front-facing camera (104) has been selected, the video signal being transmitted with the stabilization process applied.
5. The method of any preceding claim, wherein stabilizing the video signal comprises:
filtering the pixel displacement; and
shifting the image of at least one of the first and second frames in accordance with the filtered pixel displacement.
6. The method of claim 5, wherein said filtering of the pixel displacement comprises:
determining an accumulated pixel displacement based on said pixel displacement determined for the second frame; and
determining a filtered accumulated pixel displacement for the second frame based on a weighted sum of the accumulated pixel displacement determined for the second frame and a filtered accumulated pixel displacement for the first frame.
7. The method of any preceding claim, further comprising:
adding a time offset to at least one of (i) the plurality of captured frames and (ii) the plurality of generated samples, such that the timing of the plurality of captured frames matches the timing of the plurality of generated samples.
8. The method of any of claims 3 to 7, further comprising: if the pixel displacement exceeds said threshold, using a timer to determine whether a predetermined period of time has passed, and stabilizing the video signal only if that period of time has passed,
wherein the method further comprises: resetting the timer if the pixel displacement does not exceed said threshold.
9. A device (100) for stabilizing a video signal, the device (100) comprising:
a camera (102, 104) configured to capture a plurality of frames of the video signal;
means for determining a functional state of the device (100); and
means for selectively stabilizing the video signal, in dependence on the functional state, before transmission.
10. A computer program product for stabilizing a video signal, the computer program product being embodied on a non-transitory computer-readable medium and configured so as, when executed on a processor (106) of a device (100), to perform the operations of any of claims 1 to 8.
CN201210363053.0A 2011-09-26 2012-09-26 Video stabilization Active CN102957869B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB1116566.9 2011-09-26
GBGB1116566.9A GB201116566D0 (en) 2011-09-26 2011-09-26 Video stabilisation
US13/307,800 US8723966B2 (en) 2011-09-26 2011-11-30 Video stabilization
US13/307800 2011-11-30

Publications (2)

Publication Number Publication Date
CN102957869A true CN102957869A (en) 2013-03-06
CN102957869B CN102957869B (en) 2016-08-17

Family

ID=47143266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210363053.0A Active CN102957869B (en) 2011-09-26 2012-09-26 Video stabilization

Country Status (1)

Country Link
CN (1) CN102957869B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105306804A (en) * 2014-07-31 2016-02-03 北京展讯高科通信技术有限公司 Intelligent terminal and video image stabilizing method and device
CN107018315A (en) * 2015-11-26 2017-08-04 佳能株式会社 Picture pick-up device, device for detecting motion vector and its control method
CN107248168A (en) * 2013-03-18 2017-10-13 快图有限公司 Method and apparatus for motion estimation
CN107743190A (en) * 2016-08-10 2018-02-27 成都鼎桥通信技术有限公司 Video anti-fluttering method
CN107852458A (en) * 2015-06-29 2018-03-27 微软技术许可有限责任公司 Frame of video processing
WO2019080879A1 (en) * 2017-10-25 2019-05-02 腾讯科技(深圳)有限公司 Data processing method, computer device, and storage medium
CN113409489A (en) * 2020-03-17 2021-09-17 安讯士有限公司 Wearable camera and method for power consumption optimization in a wearable camera

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4864409A (en) * 1986-10-09 1989-09-05 Deutsche Thomson-Brandt Gmbh Television camera acceleration compensation apparatus
EP1377040A1 (en) * 2002-06-19 2004-01-02 STMicroelectronics S.r.l. Method of stabilizing an image sequence
US20050179784A1 (en) * 2004-02-13 2005-08-18 Yingyong Qi Adaptive image stabilization
CN1819626A (en) * 2004-12-28 2006-08-16 精工爱普生株式会社 Imaging apparatus and portable device and portable telephone using the same
US20070122129A1 (en) * 2005-11-25 2007-05-31 Seiko Epson Corporation Image pickup device, method of controlling image pickup device, and recording medium
CN101897174A (en) * 2007-12-14 2010-11-24 三洋电机株式会社 Imaging device and image reproduction device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANDY L. LIN et al.: "Using sensor for Efficient Video Coding in Hand-held Device", 《HTTP://SCIEN.STANFORD.EDU/PAGES/LABSITE/2010/EE398A/PROJECTS/REPORTS/ANDY.PDF》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107248168A (en) * 2013-03-18 2017-10-13 快图有限公司 Method and apparatus for motion estimation
CN105306804A (en) * 2014-07-31 2016-02-03 北京展讯高科通信技术有限公司 Intelligent terminal and video image stabilizing method and device
CN105306804B (en) * 2014-07-31 2018-08-21 北京展讯高科通信技术有限公司 Intelligent terminal and its video image stabilization method and device
CN107852458A (en) * 2015-06-29 2018-03-27 微软技术许可有限责任公司 Frame of video processing
CN107018315A (en) * 2015-11-26 2017-08-04 佳能株式会社 Picture pick-up device, device for detecting motion vector and its control method
CN107743190A (en) * 2016-08-10 2018-02-27 成都鼎桥通信技术有限公司 Video anti-fluttering method
WO2019080879A1 (en) * 2017-10-25 2019-05-02 腾讯科技(深圳)有限公司 Data processing method, computer device, and storage medium
US11245763B2 (en) 2017-10-25 2022-02-08 Tencent Technology (Shenzhen) Company Limited Data processing method, computer device and storage medium
CN113409489A (en) * 2020-03-17 2021-09-17 安讯士有限公司 Wearable camera and method for power consumption optimization in a wearable camera

Also Published As

Publication number Publication date
CN102957869B (en) 2016-08-17

Similar Documents

Publication Publication Date Title
EP2742680B1 (en) Video stabilisation
EP3079352B1 (en) Video stabilisation
CN102957869B (en) Video stabilization
JP6140171B2 (en) Stabilize received video
JP4500875B2 (en) Method and apparatus for removing motion blur effect
CN106911889B (en) Image blur correction apparatus and tilt correction apparatus, and control methods thereof
WO2012132168A1 (en) Information processing device, information processing method, and data structure of location information
CN102075681A (en) Image processing apparatus, image processing method, program, and recording medium
EP2898665B1 (en) Preventing motion artifacts by intelligently disabling video stabilization
JP6237035B2 (en) Motion detection apparatus and motion detection program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200330

Address after: Redmond, Washington, USA

Patentee after: MICROSOFT TECHNOLOGY LICENSING, LLC

Address before: Dublin, Ireland

Patentee before: Skype