Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2009/0129666 A1 to Goevert et al., hereinafter “Goevert,” in view of “Periodic Motion Detection and Segmentation via Approximate Sequence Alignment” to Laptev et al., hereinafter “Laptev.”
Claim 1. A system for detecting movement in a scene, the system comprising a processor in communication with memory, the processor being configured to execute instructions stored in memory that cause the processor to: access a first set of images and a second set of images of a scene over time; Goevert [0003] teaches passive methods for three-dimensional scene reconstruction by means of image data are generally based on the determination of spatial correspondences between a number of images of the scene recorded from various directions and distances. This determination of correspondences corresponds to an identification of pixel positions or pixel areas in the images with points or objects or object sections in the scene to be reconstructed.
Goevert [0010] teaches …at least one camera for recording a plurality of images of the scene including the object, recording a first sequence of first images of the scene from a first perspective relative to the scene, and recording a second sequence of second images of the scene from a second perspective relative to the scene… comprise a plurality of first and second images so that a spatial position of the image areas and any movement over time of the image areas are used in order to identify the correspondences.
generate, based on the first set of images, a first temporal pixel image comprising a first set of temporal pixels, wherein each temporal pixel in the first set of temporal pixels comprises a set of pixel values at an associated position from each image of the first set of images; Goevert [0002] teaches the invention relates to a method and a device for three-dimensional reconstruction of a scene, and more particularly to a method and a device for determining spatial correspondences between image areas in a number of images forming at least two image sequences of a scene that are recorded from different observation perspectives.
Goevert [0010] teaches determining a plurality of first image areas within the first images and determining a plurality of second image areas within the second images, identifying a plurality of correspondences between the first and second image areas, and reconstructing the scene based on the correspondences between the first and second image areas, wherein the correspondences are identified by matching a parameterized function to each image area in order to obtain a plurality of first and second function parameters representing the first and second image areas, and by comparing respective first and second function parameters, and wherein the first and second sequences each comprise a plurality of first and second images so that a spatial position of the image areas and any movement over time of the image areas are used in order to identify the correspondences.
generate, based on the second set of images, a second temporal pixel image comprising a second set of temporal pixels, wherein each temporal pixel in the second set of temporal pixels comprises a set of pixel values at an associated position from each image of the second set of images; Goevert [0010] teaches determining a plurality of first image areas within the first images and determining a plurality of second image areas within the second images, identifying a plurality of correspondences between the first and second image areas, and reconstructing the scene based on the correspondences between the first and second image areas, wherein the correspondences are identified by matching a parameterized function to each image area in order to obtain a plurality of first and second function parameters representing the first and second image areas, and by comparing respective first and second function parameters, and wherein the first and second sequences each comprise a plurality of first and second images so that a spatial position of the image areas and any movement over time of the image areas are used in order to identify the correspondences.
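For illustration only, the "temporal pixel image" limitation recited above can be sketched as follows. This is a minimal sketch and is not taken from Goevert or from the specification; the function name `build_temporal_pixel_image` and the sample frames are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one grayscale frame, indexed as image[row][col]


def build_temporal_pixel_image(images: List[Image]) -> List[List[List[float]]]:
    """Stack frames so that result[row][col] lists the values at that position over time."""
    if not images:
        raise ValueError("at least one image is required")
    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(r) != cols for r in img):
            raise ValueError("all images must share the same dimensions")
    # Each temporal pixel gathers the value at (row, col) from every frame, in order.
    return [[[img[r][c] for img in images] for c in range(cols)] for r in range(rows)]


# Three hypothetical 2x2 frames recorded by one camera.
frames = [
    [[10, 20], [30, 40]],
    [[11, 20], [35, 40]],
    [[12, 20], [40, 40]],
]
temporal = build_temporal_pixel_image(frames)
print(temporal[0][0])  # [10, 11, 12]
```

Applying the same function to the second image sequence would yield the second temporal pixel image.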
determine one or more derived values based on values of the temporal pixels in the first temporal pixel image, the second temporal pixel image, or both; Goevert [Abstract] teaches a parameterized function h(u,v,t) is matched to each of the image areas in a space R(uvgt) defined by pixel position (u, v), image value g and time t. The parameters of the parameterized functions are used to form a similarity measure between the image areas.
Goevert [0010] teaches wherein the correspondences are identified by matching a parameterized function to each image area in order to obtain a plurality of first and second function parameters representing the first and second image areas, and by comparing respective first and second function parameters, 
Goevert [0042] teaches a parameterized function h(u,v,t) is adapted to each individual interest pixel and the local environment thereof, preferably on the basis of the original image and/or of the difference image. The interest pixels are in this case represented in a four-dimensional space R(uvgt) that is defined by the pixel position u, v, the image value or gray value g, and the time t. The parameterized function h(u,v,t) is designed in the simplest case as a hyperplane. This parameterized function h(u,v,t) is matched to an interest pixel and the environment thereof by using information relating to the image value or gray value distribution and the temporal behavior thereof. The local environment of the interest pixel covers the environment with reference to the pixel position u, v and the environment with reference to the time t.
Goevert [0024] teaches the matching of the parameterized function h(u,v,t) to each an interest pixel and the local environment thereof is advantageously performed in the space R(uvgt) such that a parameterized function is adapted for each interest pixel, taking account of its environment. The local environment preferably covers a pixel area with pixels that is directly adjacent to the respective interest pixel. This pixel area is preferably of square design, in particular with an odd number of pixels at the boundary edge. The local environment alternatively or in addition covers the temporal environment in the space R(uvgt), which extends over a suitably selected number of images of the image sequence.
Goevert [0043] teaches local pixel environment and/or the temporal environment of the recorded individual images are/is preferably selected in a fashion specific to the application. However, it is also possible to select the pixel environment and/or the temporal environment in an object-specific fashion by means of the object size. 
Goevert [Abstract] teaches a parameterized function h(u,v,t) is matched to each of the image areas in a space R(uvgt) defined by pixel position (u, v), image value g and time t. The parameters of the parameterized functions are used to form a similarity measure between the image areas.
Goevert [0010] teaches…and wherein the first and second sequences each comprise a plurality of first and second images so that a spatial position of the image areas and any movement over time of the image areas are used in order to identify the correspondences.
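For illustration only, the simplest case described in Goevert [0042], a hyperplane matched to an interest pixel and its spatio-temporal environment in R(uvgt), can be sketched as a least-squares fit of g = a*u + b*v + c*t + d. This is a sketch under stated assumptions: the function names, the pure-Python solver, the sample neighborhood, and the use of a Euclidean distance between parameter vectors as the similarity measure are all hypothetical and are not taken from Goevert.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: neighborhood is degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_hyperplane(samples: Sequence[Tuple[float, float, float, float]]) -> List[float]:
    """Least-squares fit of g = a*u + b*v + c*t + d to (u, v, t, g) samples."""
    ata = [[0.0] * 4 for _ in range(4)]  # normal-equation matrix A^T A
    atg = [0.0] * 4                      # right-hand side A^T g
    for u, v, t, g in samples:
        x = (u, v, t, 1.0)
        for i in range(4):
            atg[i] += x[i] * g
            for j in range(4):
                ata[i][j] += x[i] * x[j]
    return solve_linear_system(ata, atg)


def parameter_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Assumed similarity measure: Euclidean distance between parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5


# Hypothetical 3x3x3 neighborhood whose gray values follow g = 2u - v + 0.5t + 7.
neighborhood = [(u, v, t, 2 * u - v + 0.5 * t + 7)
                for u in range(3) for v in range(3) for t in range(3)]
params = fit_hyperplane(neighborhood)
print([round(p, 6) for p in params])
```

Under this sketch, parameter vectors fitted in the first and second sequences would be compared with `parameter_distance`, a smaller distance suggesting a more likely correspondence.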
Goevert fails to explicitly teach determining, based on the one or more derived values and the correspondence data, an indication of whether there is a likelihood of motion in the scene. Laptev, in the field of motion detection in image data, teaches determine, based on the one or more derived values and the correspondence data, an indication of whether there is a likelihood of motion in the scene. Laptev [Abstract] teaches we note that periodic motion detection can be seen as an approximate case of sequence alignment where an image sequence is matched to itself over one or more periods of time… For periodic motion, we match corresponding points across periods and develop a RANSAC procedure to simultaneously estimate the period and the dynamic geometric transformations between periodic views.
Laptev [Introduction] teaches describe our approach to solving the correspondence problem using spatio-temporal interest points in Section 3.
Laptev Figures 3 and 4
Laptev [3. Space-time image features] teaches to estimate the dynamic F(t) and H(t) matrices of Section 2, we can take advantage of time linearity and apply SVD-based methods that are commonly used for estimating static F, H from two views [7]. Unlike the static case, however, estimation of F(t), H(t) requires correspondences of space-time points in two image sequences. We find these correspondences by directly matching points in space-time… To estimate corresponding points in two sequences, we consider space-time interest points with significant variation of local motion and shape. Such points or Local Space-Time Features (LSTF) can be detected by maximizing the local variation of the image function over space and time [9]. Given the distinctive spatio-temporal properties of such points, correspondence can be estimated from the similarity of their local spatio-temporal neighborhoods… Figure 3 illustrates LSTF points detected for a sequence containing a jogging person. Close similarity of spatio-temporal neighborhoods of matching periodic points can be confirmed in Figure 3(c). The detector in [9] delivers a rather sparse set of points that is sufficient for the detection of periodic motion described in Section 4. Segmentation of periodic motion in Section 5, however, requires a denser set of points that enable more accurate alignment of periodic views. To detect such points, we relax the assumption of local extrema of the image variation over time and detect Weak Local Space-Time Features (WLSTF) by applying a standard static interest point detector [14] restricted to the regions of non-constant motion [11]. For each detected point we then compute a local spatio-temporal descriptor according to [10]. Examples of WLSTF points detected for pairs of periodic frames are illustrated in Figures 5(a)-(b), and Figure 1.

Laptev [4. Periodic motion detection] 
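For illustration only, the idea quoted from Laptev's abstract, that a sequence is matched to itself over one or more periods, can be sketched on a one-dimensional signal. This is a deliberate simplification and is not Laptev's method: it replaces the space-time feature matching and RANSAC estimation with a brute-force search over lags, and the function name `estimate_period` and the sample signal are hypothetical.

```python
from typing import Optional, Sequence


def estimate_period(signal: Sequence[float], min_lag: int, max_lag: int) -> Optional[int]:
    """Return the lag in [min_lag, max_lag] with the smallest mean absolute self-difference."""
    best_lag, best_cost = None, float("inf")
    for lag in range(min_lag, max_lag + 1):
        # Pair each sample with the sample `lag` steps later in the same sequence.
        pairs = list(zip(signal, signal[lag:]))
        if not pairs:
            continue
        cost = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best_lag, best_cost = lag, cost
    return best_lag


# Hypothetical per-frame measurement that repeats every four frames.
signal = [0, 1, 3, 1] * 5
print(estimate_period(signal, 2, 6))  # 4
```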

Hence the prior art includes each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. Thus, it would have been obvious to one of ordinary skill in the art to modify Goevert's accessing of a first set of images and a second set of images of a scene over time with Laptev’s teaching of an indication of whether there is a likelihood of motion in the scene. One would have been motivated to make this combination because it allows one to accurately detect motion using correspondences in image data. In combination, Goevert is not altered in that Goevert continues to identify correspondences in image data. Laptev's teachings perform the same function in combination as they do separately, namely detecting motion in a scene.
Therefore, one of ordinary skill in the art, such as an individual working in the field of mobile robots, could have combined the elements as claimed by known methods, and in combination each element merely performs the same function as it does separately. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to claim 1.

Claim 2. Goevert further teaches wherein determining the one or more derived values comprises: determining a first set of derived values based on values of the temporal pixels in the first temporal pixel image; and Goevert [Abstract] teaches a parameterized function h(u,v,t) is matched to each of the image areas in a space R(uvgt) defined by pixel position (u, v), image value g and time t. The parameters of the parameterized functions are used to form a similarity measure between the image areas.
Goevert [0042] teaches a parameterized function h(u,v,t) is adapted to each individual interest pixel and the local environment thereof, preferably on the basis of the original image and/or of the difference image. The interest pixels are in this case represented in a four-dimensional space R(uvgt) that is defined by the pixel position u, v, the image value or gray value g, and the time t. The parameterized function h(u,v,t) is designed in the simplest case as a hyperplane. This parameterized function h(u,v,t) is matched to an interest pixel and the environment thereof by using information relating to the image value or gray value distribution and the temporal behavior thereof. The local environment of the interest pixel covers the environment with reference to the pixel position u, v and the environment with reference to the time t.
determining a second set of derived values based on values of the temporal pixels in the second temporal pixel image. Goevert [Abstract] teaches a parameterized function h(u,v,t) is matched to each of the image areas in a space R(uvgt) defined by pixel position (u, v), image value g and time t. The parameters of the parameterized functions are used to form a similarity measure between the image areas.
Goevert [0042] teaches a parameterized function h(u,v,t) is adapted to each individual interest pixel and the local environment thereof, preferably on the basis of the original image and/or of the difference image. The interest pixels are in this case represented in a four-dimensional space R(uvgt) that is defined by the pixel position u, v, the image value or gray value g, and the time t. The parameterized function h(u,v,t) is designed in the simplest case as a hyperplane. This parameterized function h(u,v,t) is matched to an interest pixel and the environment thereof by using information relating to the image value or gray value distribution and the temporal behavior thereof. The local environment of the interest pixel covers the environment with reference to the pixel position u, v and the environment with reference to the time t. [0045-0050]
Claim 3. Goevert further teaches wherein determining the one or more derived values comprises: determining, for each temporal pixel of a first set of temporal pixels of the first temporal pixel image, first average data indicative of an average of values of the temporal pixel; and determining, for each temporal pixel of the first set of temporal pixels, first deviation data indicative of a deviation of values of the temporal pixel. Goevert [0049-0050]
Claim 4. Goevert further teaches wherein determining the one or more derived values further comprises: determining, for each temporal pixel of a second set of temporal pixels of the second temporal pixel image, second average data indicative of an average of values of the temporal pixel; and determining, for each temporal pixel of the second set of temporal pixels, second deviation data indicative of a deviation of values of the temporal pixel. Goevert [0049-0050]
Claim 5. Goevert further teaches wherein calculating the first average data comprises calculating, for each temporal pixel in the first set of temporal pixels: a temporal average of intensity values of the temporal pixel; and a root mean square deviation of the intensity values of the temporal pixel. Goevert [0050] teaches 2. The amplitude p_1(v,t) of the sigmoid is spatially constant and proportional to the standard deviation σ_1(t) of the pixel intensities in the spatial local environment of the interest pixel with p_1(v,t) = kσ_1(t), k representing a constant factor prescribed by the user. The value of k preferably lies between 0.8 and 3. It is also possible here to use spatial temporal mean values and standard deviations instead of spatial mean values and standard deviations.
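For illustration only, the temporal average and root mean square deviation recited in claims 3-5 can be sketched for a single temporal pixel as follows. This is a minimal sketch, not taken from Goevert or from the specification; the function name `temporal_mean_and_rms` and the sample values are hypothetical, and the deviation is assumed to be taken about the temporal mean with a population (divide by N) normalization.

```python
import math
from typing import Sequence, Tuple


def temporal_mean_and_rms(values: Sequence[float]) -> Tuple[float, float]:
    """Return the temporal average and the root mean square deviation about it."""
    mean = sum(values) / len(values)
    # Assumed normalization: divide by N (population form), not N - 1.
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, rms


# Hypothetical intensity values of one temporal pixel across three frames.
mean, rms = temporal_mean_and_rms([10.0, 12.0, 14.0])
print(mean, round(rms, 4))  # 12.0 1.633
```

Applying the function to every temporal pixel of a temporal pixel image would yield one average value and one deviation value per pixel position.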
Claim 6. Goevert and Laptev further teach wherein determining the indication comprises: determining a plurality of regions of the first temporal pixel image, the second temporal pixel image, or both; and Goevert [0018] teaches the relevant image regions may have a variability in the image values exceeding a threshold sufficient for forming correspondence.
Goevert [0019] teaches the relevant image regions are preferably determined by means of an interest operator.
Goevert [0020] teaches spatial temporal features are applied to determine the relevant image regions.
determining, for each region of the plurality of regions: an average of the one or more derived values associated with the region; a correspondence indication based on correspondences associated with the region; Goevert [0011] teaches and wherein the calculating unit is configured for reconstructing the scene based on the correspondences between the first and second image areas, wherein the correspondences are identified by matching a parameterized function to each image area in order to obtain a plurality of first and second function parameters representing the first and second image areas, and by comparing respective first and second function parameters
and determining, based on the average and the correspondence indication, a region indication of whether there is a likelihood of motion in the region. Goevert [0049] teaches 1. The offset p_4(u,v,t) is spatially constant and corresponds to the pixel intensity Ī_uv(t) of the local environment of the interest pixel, averaged over the spatial image coordinates u and v.
Goevert [0050] teaches 2. The amplitude p_1(v,t) of the sigmoid is spatially constant and proportional to the standard deviation σ_1(t) of the pixel intensities in the spatial local environment of the interest pixel with p_1(v,t) = kσ_1(t), k representing a constant factor prescribed by the user. The value of k preferably lies between 0.8 and 3. It is also possible here to use spatial temporal mean values and standard deviations instead of spatial mean values and standard deviations.
Goevert [0030] teaches In order to minimize the computational outlay, it is envisaged that differential images are formed in order to ascertain the relevant image regions, specifically between images of the image sequences and of previously recorded reference images of the scene. Thus, instead of the current image, it is the absolute difference between the current image and a reference image of the scene that is used. In particular, this method variant is used when the image sequences are recorded by means of stationary cameras, i.e. cameras that do not move over time. A possible criterion for the interest operator is in this case a rise from 0 to 1 or a drop from 1 to 0 in a difference image binarized by means of a fixed threshold value.
Goevert [0040] teaches In a step 4, relevant image regions are determined by means of an interest operator. For this purpose, the difference images are binarized, pixels of image regions with image values below a defined threshold value being given the value 0, and image regions above the threshold value being given the value 1. The image regions with pixel values 1 are denoted below as relevant image regions.
Laptev [3. Space-time image features], [4. Periodic motion detection] and Figures 3 and 4
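For illustration only, the binarized difference image described in Goevert [0030] and [0040] can be sketched as follows: the absolute difference between a current image and a reference image is compared with a fixed threshold, and pixels above the threshold are given the value 1. The function name `binarize_difference`, the sample images, and the threshold value are hypothetical.

```python
from typing import List

Image = List[List[float]]


def binarize_difference(current: Image, reference: Image, threshold: float) -> List[List[int]]:
    """Return 1 where |current - reference| exceeds the threshold, else 0."""
    return [[1 if abs(c - r) > threshold else 0 for c, r in zip(crow, rrow)]
            for crow, rrow in zip(current, reference)]


# Hypothetical 2x3 reference image of the static scene and a later frame.
reference = [[10, 10, 10], [10, 10, 10]]
current = [[10, 40, 12], [11, 10, 55]]
mask = binarize_difference(current, reference, threshold=20)
print(mask)  # [[0, 1, 0], [0, 0, 1]]
```

Pixels with the value 1 in `mask` correspond to what Goevert [0040] denotes as relevant image regions.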
Claim 7. Goevert and Laptev further teach wherein determining the region indication comprises: determining the average meets a first metric; determining the correspondence indication meets a second metric; and generating the region indication to indicate a likelihood of motion in the region. Goevert [0030] teaches In order to minimize the computational outlay, it is envisaged that differential images are formed in order to ascertain the relevant image regions, specifically between images of the image sequences and of previously recorded reference images of the scene. Thus, instead of the current image, it is the absolute difference between the current image and a reference image of the scene that is used. In particular, this method variant is used when the image sequences are recorded by means of stationary cameras, i.e. cameras that do not move over time. A possible criterion for the interest operator is in this case a rise from 0 to 1 or a drop from 1 to 0 in a difference image binarized by means of a fixed threshold value.
Goevert [0040] teaches in a step 4, relevant image regions are determined by means of an interest operator. For this purpose, the difference images are binarized, pixels of image regions with image values below a defined threshold value being given the value 0, and image regions above the threshold value being given the value 1. The image regions with pixel values 1 are denoted below as relevant image regions.
Laptev [3. Space-time image features], [4. Periodic motion detection] and Figures 3 and 4
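For illustration only, the two-metric test recited in claim 7 can be sketched as follows. The claim does not specify the metrics, so everything concrete here is an assumption: the function name `region_motion_indication`, the choice of "average above a threshold" as the first metric, the choice of "fraction of matched temporal pixels below a threshold" as the second metric, and the sample values.

```python
from typing import Sequence


def region_motion_indication(derived_values: Sequence[float],
                             has_correspondence: Sequence[bool],
                             average_threshold: float,
                             correspondence_threshold: float) -> bool:
    """Flag a region as likely containing motion when both assumed metrics are met."""
    average = sum(derived_values) / len(derived_values)
    matched_fraction = sum(has_correspondence) / len(has_correspondence)
    # Assumed first metric: the regional average of the derived values is high.
    # Assumed second metric: few temporal pixels in the region found a correspondence.
    return average > average_threshold and matched_fraction < correspondence_threshold


# Hypothetical region with large temporal deviation and few correspondences.
busy = region_motion_indication([5.0, 6.0, 7.0, 6.0], [True, False, False, False], 4.0, 0.5)
# Hypothetical region with small temporal deviation and full correspondence coverage.
quiet = region_motion_indication([0.5, 0.4, 0.6, 0.5], [True, True, True, True], 4.0, 0.5)
print(busy, quiet)  # True False
```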

Claim 8. Laptev further teaches determining, based on a set of region indications associated with each region of the plurality of regions, an indication to indicate a likelihood of motion in the scene. Laptev [3. Space-time image features], [4. Periodic motion detection] and Figures 3 and 4
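For illustration only, combining per-region indications into a scene-level indication as recited in claim 8 can be sketched as follows. The aggregation rule is an assumption: the function name `scene_motion_indication` and the `min_fraction` parameter are hypothetical, and neither reference is asserted to use this rule.

```python
from typing import Sequence


def scene_motion_indication(region_indications: Sequence[bool], min_fraction: float) -> bool:
    """Flag the scene when at least min_fraction of its regions are flagged."""
    flagged = sum(1 for flag in region_indications if flag)
    return flagged / len(region_indications) >= min_fraction


# Hypothetical indications for four regions; one of four is flagged (0.25 >= 0.2).
print(scene_motion_indication([False, True, False, False], min_fraction=0.2))  # True
```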
Claim 9. Goevert further teaches wherein: each image in the first set of images and the second set of images captures an associated portion of a light pattern projected onto the scene; Goevert [0013] teaches image sequences are recorded in order to implement the method, with an image sequence consisting of a succession of individual images of the scene, which images are recorded from an observation perspective and preferably have an equidistant temporal spacing. The image sequences are recorded from different observation perspectives, preferably from different observation directions and/or observation distances and/or using various optical imaging devices. Examiner interprets various optical imaging devices to include the ability to project a light pattern.
each image in the first set of images is of a first perspective of the scene; and Goevert [0002] teaches the invention relates to a method and a device for three-dimensional reconstruction of a scene, and more particularly to a method and a device for determining spatial correspondences between image areas in a number of images forming at least two image sequences of a scene that are recorded from different observation perspectives.
Goevert [0003] teaches passive methods for three-dimensional scene reconstruction by means of image data are generally based on the determination of spatial correspondences between a number of images of the scene recorded from various directions and distances.
Goevert [0010] teaches there is provided a method comprising the steps of providing at least one camera for recording a plurality of images of the scene including the object, recording a first sequence of first images of the scene from a first perspective relative to the scene, and recording a second sequence of second images of the scene from a second perspective relative to the scene, the first and second perspectives being different from one another, determining a plurality of first image areas within the first images and determining a plurality of second image areas within the second images
each image in the second set of images is of a second perspective of the scene. Goevert [0010] teaches there is provided a method comprising the steps of providing at least one camera for recording a plurality of images of the scene including the object, recording a first sequence of first images of the scene from a first perspective relative to the scene, and recording a second sequence of second images of the scene from a second perspective relative to the scene, the first and second perspectives being different from one another, determining a plurality of first image areas within the first images and determining a plurality of second image areas within the second images
Goevert [0013] teaches image sequences are recorded in order to implement the method, with an image sequence consisting of a succession of individual images of the scene, which images are recorded from an observation perspective and preferably have an equidistant temporal spacing. The image sequences are recorded from different observation perspectives, preferably from different observation directions and/or observation distances and/or using various optical imaging devices.
Claim 10. The system of claim 1, wherein: each image in the first set of images is captured by a camera; and Goevert [0010] teaches there is provided a method comprising the steps of providing at least one camera for recording a plurality of images of the scene including the object
each image in the second set of images comprises a portion of a pattern sequence projected onto the scene by a projector. Goevert [0013] teaches image sequences are recorded in order to implement the method, with an image sequence consisting of a succession of individual images of the scene, which images are recorded from an observation perspective and preferably have an equidistant temporal spacing. The image sequences are recorded from different observation perspectives, preferably from different observation directions and/or observation distances and/or using various optical imaging devices. Examiner interprets various optical imaging devices to include the ability to project a light pattern.
Goevert [0032] teaches new device comprises a camera system and at least one evaluation unit that is connected to the camera system. It is preferable that the camera system is designed as a stereo camera system that comprises two cameras.
Goevert [0032] teaches the camera system may generally be designed as a calibrated multicamera system, i.e. it has two or more cameras. In particular, the cameras may have an overlapping observation area. It is preferred for the camera system to be calibrated, while it is also possible as an alternative to provide automatic calibration. 
Per the specification: [0002] Typical machine vision systems include one or more cameras directed at an area of interest, a frame grabber/image processing elements that capture and transmit images, a computer or onboard processing device, and a user interface for running the machine vision software application and manipulating the captured images, and appropriate illumination on the area of interest. [0003] One form of 3D vision system is based upon stereo cameras employing at least two cameras arranged in a side-by-side relationship with a baseline of one-to-several inches therebetween.
Claim 11. It differs from claim 1 in that it is a computerized method performed by the system of claim 1. Therefore claim 11 has been analyzed and reviewed in the same way as claim 1. See the above analysis. 
Claim 12. It differs from claim 2 in that it is a computerized method performed by the system of claim 2. Therefore claim 12 has been analyzed and reviewed in the same way as claim 2. See the above analysis. 
Claim 13. It differs from claim 3 in that it is a computerized method performed by the system of claim 3. Therefore claim 13 has been analyzed and reviewed in the same way as claim 3. See the above analysis. 
Claim 14. It differs from claim 4 in that it is a computerized method performed by the system of claim 4. Therefore claim 14 has been analyzed and reviewed in the same way as claim 4. See the above analysis. 
Claim 15. It differs from claim 5 in that it is a computerized method performed by the system of claim 5. Therefore claim 15 has been analyzed and reviewed in the same way as claim 5. See the above analysis. 
Claim 16. It differs from claim 6 in that it is a computerized method performed by the system of claim 6. Therefore claim 16 has been analyzed and reviewed in the same way as claim 6. See the above analysis. 
Claim 17. It differs from claim 7 in that it is a computerized method performed by the system of claim 7. Therefore claim 17 has been analyzed and reviewed in the same way as claim 7. See the above analysis. 
Claim 18. It differs from claim 8 in that it is a computerized method performed by the system of claim 8. Therefore claim 18 has been analyzed and reviewed in the same way as claim 8. See the above analysis. 
Claim 19. It differs from claim 9 in that it is a computerized method performed by the system of claim 9. Therefore claim 19 has been analyzed and reviewed in the same way as claim 9. See the above analysis. 
Claim 20. It differs from claim 1 in that it is at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one computer hardware processor, cause the at least one computer hardware processor to perform the acts of the system of claim 1. Therefore claim 20 has been analyzed and reviewed in the same way as claim 1. See the above analysis.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US 2011/0043706 A1 to Van Beek et al.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DELOMIA L GILLIARD whose telephone number is (571)272-1681.  The examiner can normally be reached on 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached on 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/DELOMIA L GILLIARD/Primary Examiner, Art Unit 2661