DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
a first segmentation block, a first feature extraction block, a second segmentation block, a second feature extraction block, a matching block, in claim 13; a motion estimation block in claim 15.  
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 2, 6-10, 13, 15, 16, 19 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Risinger et al (US Pub. 2017/0083790).
With respect to claim 1, Risinger discloses a method of analyzing one or more objects in a set of frames comprising at least a first frame and a second frame, (see Abstract) the method comprising:
segmenting the first frame, to produce a plurality of first masks, each first mask identifying pixels belonging to a potential object-instance detected in the first frame, (see paragraph 0035, segmenting scene foreground; paragraph 0023, BG/FG mask may depict every pixel of foreground, also paragraph 0036 normalized values as feature vector for each frame);
for each potential object-instance detected in the first frame, extracting from the first frame a first feature vector characterising the potential object-instance, (see paragraph 0022, identify distinct instances of a foreground);
segmenting the second frame, to produce a plurality of second masks, each second mask identifying pixels belonging to a potential object-instance detected in the second frame, (see figure 2, 205 video stream as input i.e. multiple frames, 220 BG/FG component i.e. segmenting BG/FG for every frame);
for each potential object-instance detected in the second frame, extracting from the second frame a second feature vector characterising the potential object-instance, (see paragraph 0022, identify distinct instances of a foreground); and
matching at least one of the potential object-instances in the first frame with one of the potential object-instances in the second frame, based at least in part on the first feature vectors, the first masks, the second feature vectors and the second masks, (see paragraph 0007, geometric matching), as claimed.

With respect to claim 2, Risinger further discloses wherein the matching comprises clustering the potential object-instances detected in the first and second frames, based at least in part on the first feature vectors and the second feature vectors, to generate clusters of potential object-instances, (see paragraph 0037, organize the vectors into clusters), as claimed.

With respect to claim 6, Risinger further discloses wherein the matching comprises rejecting potential object-instances based on any one or any combination of two or more of the following:
an object confidence score, which estimates whether a potential object-instance is more likely to be an object or part of the background; a mask confidence score, which estimates a likelihood that a mask represents an object; and a mask area, (see paragraph 0052, scores on the range), as claimed.

With respect to claim 7, Risinger further discloses wherein the mask confidence score is generated by a machine learning algorithm trained to predict a degree of correspondence between the mask and a ground truth mask, (see figure 2, 230 background model), as claimed.

With respect to claim 8, Risinger further discloses wherein the masks and feature vectors are generated by a first machine learning algorithm, (see figure 2, 230), as claimed.

With respect to claims 9 and 10, Risinger further discloses for at least one matched object in the first frame and the second frame, estimating a motion of the object between the first frame and the second frame; and wherein estimating the motion of the object comprises, for each of a plurality of pixels of the object: estimating a translational motion vector; estimating a non-translational motion vector; and calculating a motion vector of the pixel as the sum of the translational motion vector and the non-translational motion vector, (see paragraph 0057, according to an estimate of motion of foreground, and weighted average of most likely particle candidates), as claimed.

Claims 13, 15, 16 and 19 are rejected for the same reasons as set forth in the rejection of claims 1, 2, 9 and 8, because claims 13, 15, 16 and 19 claim subject matter of similar scope to that claimed in claims 1, 2, 9 and 8, respectively.

Claim 20 is rejected for the same reasons as set forth in the rejection of claim 1, because claim 20 claims subject matter of similar scope to that claimed in claim 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-5, 14 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Risinger in view of OVSNet: Towards one-pass real time video object segmentation, by Sun (IDS document).
With respect to claim 3, Risinger discloses all the limitations as claimed and as rejected in claim 2.  However, Risinger fails to explicitly disclose wherein the matching further comprises, for each cluster in each frame: evaluating a distance between the potential object-instances in the cluster in that frame; and splitting the cluster into multiple clusters based on a result of the evaluating, as claimed.
Sun, in the same field, teaches, for each cluster in each frame: evaluating a distance between the potential object-instances in the cluster in that frame; and splitting the cluster into multiple clusters based on a result of the evaluating (see section 3.1.1, predicting a class agnostic foreground mask, also features at the corresponding position by 20% “distance between the potential object instances” in order to segment the boundary of the object, as bounding boxes “splitting cluster into multiple clusters”), as claimed.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the two references, as they are analogous art directed to the similar problem of object detection using image analysis.  The teaching of Sun to identify the object in the frames can be incorporated into Risinger’s system (see Risinger figure 2, 230) as the suggestion, and the modification will yield a system that is faster (see Sun Abstract, last five lines) as the motivation.

With respect to claim 4, combination of Risinger and Sun further discloses wherein the matching comprises selecting a single object-instance from among the potential object-instances in each cluster in each frame, (see Sun section 3.1.1, in order to segment the boundary of the object accurately), as claimed.

With respect to claim 5, combination of Risinger and Sun further discloses wherein the matching comprises matching at least one of the single object-instances in the first frame with a single object-instance in the second frame, (see Risinger paragraph 0007, matching between a first set of boundary regions in a current frame with the previous frame), as claimed.

With respect to claim 14, combination of Risinger and Sun further discloses wherein the first and second segmentation blocks are the same segmentation block, and/or the first and second feature extraction blocks are the same feature extraction block, (see Sun figure 3, mask probability and the output is the same), as claimed.

Claims 17 and 18 are rejected for the same reasons as set forth in the rejection of claims 3 and 4+5, because claims 17 and 18 claim subject matter of similar scope to that claimed in claims 3 and 4+5, respectively.

Claim(s) 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Risinger in view of Dinerstein et al (US 2020/0356827).  
With respect to claim 11, Risinger discloses all the limitations as claimed and as rejected in claim 9.  However, Risinger fails to explicitly disclose wherein estimating the motion of the object comprises:
generating a coarse estimate of the motion based at least in part on the mask in the first frame and the corresponding matched mask in the second frame; and refining the coarse estimate using a second machine learning algorithm, wherein the second machine learning algorithm takes as input the first frame, the second frame, and the coarse estimate, and the second machine learning algorithm is trained to predict a motion difference between the coarse motion vector and a ground truth motion vector, as claimed.
Dinerstein, in the same field, teaches two CNNs “machine learning algorithms” for coarse and refined motion detection in an image “generating a coarse estimate of the motion based at least in part on the mask in the first frame and the corresponding matched mask in the second frame; and refining the coarse estimate using a second machine learning algorithm, wherein the second machine learning algorithm takes as input the first frame, the second frame, and the coarse estimate, and the second machine learning algorithm is trained to predict a motion difference between the coarse motion vector and a ground truth motion vector” (see Abstract, figure 6), as claimed.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the two references, as they are analogous art directed to the similar problem of object detection using image analysis.  The teaching of Dinerstein to use two CNNs in order to arrive at a refined motion detection can be incorporated into Risinger’s system (see Risinger figure 1, 125 machine learning components) as the suggestion, and the modified system yields a more accurate system, as the motivation.

With respect to claim 12, combination of Risinger and Dinerstein discloses wherein the machine learning algorithm is trained to predict the motion difference at a plurality of resolutions, starting with the lowest resolution and predicting the motion difference at successively higher resolutions based on up-sampling the motion difference from the preceding resolution, (see Dinerstein paragraph 0067), as claimed.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIKKRAM BALI whose telephone number is (571)272-7415. The examiner can normally be reached Monday-Friday 7:00AM-3:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached on 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VIKKRAM BALI/Primary Examiner, Art Unit 2663