DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 06/16/2020, 07/20/2020, 09/04/2020, 10/07/2020 and 11/04/2020 are in compliance with the provisions of 37 CFR 1.97 and are being considered by the examiner.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1 and 11 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sabripour et al. US 2016/0110612.


1. (Original) A method for generating content, the method comprising: identifying a first object in a plurality of frames of a source content segment; 

	
At step 102, the process includes detecting an object of interest in a set of video frames. In at least one embodiment, the object of interest is a person. In at least one embodiment, the object of interest includes a feature of a person.
Sabripour, 0026: 1-3, emphasis added

FIG. 1 depicts an example process, in accordance with an embodiment. The example process 100 includes steps 102-108 and describes functionality similar to that described below in connection with FIG. 2. The example process 100 described below may be carried out by a system, which includes a communication interface, a processor, and data storage containing instructions executable by the processor for causing the system to carry out the described functions
Sabripour, 0025



identifying a second object in the plurality of frames of the source content segment; 


In at least one embodiment, the object of interest is a face of a person. In at least one such embodiment, detecting the object of interest in the set of video frames includes using at least one of a facial-detection engine and a facial-recognition engine to detect the object of interest in the set of video frames. In at least one other embodiment, the object of interest is a weapon. In at least one embodiment, the object of interest is a vehicle. See also FIG. 3, which depicts a man and a woman as two different objects of interest.
Sabripour, 0026, 0040: emphasis added



generating a first data structure that comprises a first plurality of attributes of the first object, 

the process includes generating a composite video stream from the video frames in the subset. The composite video stream shows the tracked movements of the detected object of interest without showing background data from the video frames in the subset. In at least one embodiment, the composite video stream displays only the detected object of interest or a symbol representing the detected object of interest and tracked movements. A visual example of this aspect is depicted in FIG. 2. In at least one embodiment, information associated with the detected object of interest is displayed, such as timestamp information, location information, a public threat level, and/or various other relevant data

Sabripour, 0031: emphasis added


wherein the first object can be reconstructed based on the first data structure;

the process includes generating a composite video stream from the video frames in the subset. The composite video stream shows the tracked movements of the detected object of interest without showing background data from the video frames in the subset. In at least one embodiment, the composite video stream displays only the detected object of interest or a symbol representing the detected object of interest and tracked movements. A visual example of this aspect is depicted in FIG. 2. In at least one embodiment, information associated with the detected object of interest is displayed, such as timestamp information, location information, a public threat level, and/or various other relevant data

Sabripour, 0031: emphasis added



 generating a second data structure that comprises a second plurality of attributes of the second object, 


FIG. 3 depicts a second example conceptual overview of the presently disclosed methods and systems including a plurality of video sources and objects of interest, in accordance with an embodiment. In particular, FIG. 3 depicts a conceptual overview 300 wherein a subset of frames, frames 202-206 of FIG. 2 as well as frames 302-304, are used to generate a composite video 306. In FIG. 3, depictions of composite frames, analogous to the composite frames 208-212 of FIG. 2, are omitted for the sake of simplicity. In at least one embodiment, the object of interest is a set of multiple objects of interest. The conceptual overview 300 depicts a man and a woman as both being objects of interest.
Sabripour, 0031, 0038, 0040: emphasis added


wherein the second object can be reconstructed based on the second data structure; 


the process includes generating a composite video stream from the video frames in the subset. The composite video stream shows the tracked movements of the detected object of interest without showing background data from the video frames in the subset. In at least one embodiment, the composite video stream displays only the detected object of interest or a symbol representing the detected object of interest and tracked movements. A visual example of this aspect is depicted in FIG. 2. In at least one embodiment, information associated with the detected object of interest is displayed, such as timestamp information, location information, a public threat level, and/or various other relevant data
Sabripour, 0031, 0038, 0040: emphasis added



modifying the first data structure by changing an attribute of the first plurality of attributes;


FIG. 4 depicts a database upload conceptual overview, in accordance with an embodiment. In particular, FIG. 4 depicts a conceptual overview 400. In at least one embodiment, the process described herein further includes (i) identifying a set of attribute values, values 414-422, that correspond with a set of attributes, attributes 404-412, of a detected object of interest 402, (ii) generating an unique ID 424 from the identified values 414-422, and (iii) storing the generated unique ID 424 in association with a generated composite video 430. In at least one embodiment, the process further includes storing the generated unique ID 424 and the generated composite video 430 in association with a set of video frames 426, wherein the composite video 430 is derived at least in part from the video frames 426.
Sabripour, 0041-0042: emphasis added
 

generating a resulting content segment by reconstructing a first modified object based on the first modified data structure and reconstructing the second object based on the second data structure.  


storing the unique ID 424 in association with the generated composite video 430 includes storing the composite video 430 in a searchable database 428 of such generated composite videos, the searchable database 428 being indexed by such generated unique IDs. In at least one embodiment, generating the composite video 430 includes including searchable metadata in the composite video 430, the searchable metadata including at least one of time data and location data.
Sabripour, 0042: emphasis added

FIG. 5 depicts a database query conceptual overview, in accordance with an embodiment. In particular, FIG. 5 depicts a conceptual overview 500. The conceptual overview 500 highlights some of the inputs and outputs of a database query. The database 428 includes composites videos 506-510 and 514-516, which are indexed by unique IDs, unique ID "A" 504 and unique ID "B" 512
Sabripour, 0044-0045: emphasis added
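
For illustration only, the upload and query flow described in the passages quoted above (attribute values, a unique ID generated from those values, and a database of composite videos indexed by that ID) can be sketched as follows. This is a minimal sketch and not a characterization of how the reference implements these steps: the reference does not state how the unique ID is computed, so the hash used here is an assumption, and the function names, attribute values, and file names are hypothetical.

```python
import hashlib

# Hypothetical searchable database: unique ID -> list of stored records.
database = {}


def make_unique_id(attribute_values):
    """Derive a deterministic ID from an ordered list of attribute values.

    Hashing is an assumption; the reference only says the ID is generated
    from the identified values.
    """
    joined = "|".join(str(value) for value in attribute_values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:12]


def store_composite(attribute_values, composite_video, source_frames):
    """Store a composite video and its source frames under the generated ID."""
    unique_id = make_unique_id(attribute_values)
    database.setdefault(unique_id, []).append(
        {"composite": composite_video, "frames": source_frames}
    )
    return unique_id


def query(attribute_values):
    """Return every composite video indexed by the ID of these values."""
    unique_id = make_unique_id(attribute_values)
    return [record["composite"] for record in database.get(unique_id, [])]


# Hypothetical attribute values for one detected object of interest.
values = ["male", "tall", "red jacket"]
store_composite(values, "composite_430.mp4", ["frame_1", "frame_2"])
print(query(values))
```

Because the ID is a pure function of the attribute values, a later query with the same values resolves to the same index entry, which is the behavior the quoted database-query overview describes.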


Claim 11 lists elements similar to those of claim 1, but in system form rather than method form. Therefore, the supporting rationale of the rejection of claim 1 applies equally to claim 11.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2-10 and 12-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sabripour et al. US 2016/0110612 as applied to claim 1 above, and further in view of MJ US 2015/0269441.

Regarding claim 2, Sabripour teaches:

2. (Original) The method of claim 1,  

However, Sabripour fails to explicitly teach wherein one attribute of the first plurality of attributes is a vectorized representation of the first object.

wherein one attribute of the first plurality of attributes is a vectorized representation of the first object.
Visual tracking of an object in a video involves comparison of certain frames of the video with respect to a reference source of object's information. This source information from the initial frame is stored in the form of a dictionary D, which is defined in Equation 1 as: 
$D = [d_1, d_2, \ldots, d_n] \in R^{l \times n}$ (Equation 1)
where the dictionary D contains an array of atoms/patches ($[d_1, d_2, \ldots, d_n]$), which are elements of ($\in$) a matrix R, which is of a size l by n vectors. That is, elements from the array $[d_1, d_2, \ldots, d_n]$ are vectorized patches sampled from the target image
MJ, 0047-0048: emphasis added
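
For illustration only, the dictionary of Equation 1 can be sketched as a matrix whose columns are vectorized, overlapping patches sampled from a target image. This is a minimal sketch under stated assumptions, not MJ's implementation: the row-major flattening order, the stride of one pixel, the toy 4 by 4 image, and the function names are all hypothetical.

```python
def extract_patches(image, patch_size, stride=1):
    """Return overlapping square patches, each flattened row by row."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - patch_size + 1, stride):
        for c in range(0, cols - patch_size + 1, stride):
            patch = [
                image[r + i][c + j]
                for i in range(patch_size)
                for j in range(patch_size)
            ]
            patches.append(patch)
    return patches


def build_dictionary(image, patch_size, stride=1):
    """Arrange the n vectorized patches as the columns of an l x n matrix."""
    atoms = extract_patches(image, patch_size, stride)
    vector_length = patch_size * patch_size  # l in Equation 1
    return [[atom[k] for atom in atoms] for k in range(vector_length)]


# Toy 4x4 "target image" whose pixel value encodes its position.
target = [[r * 4 + c for c in range(4)] for r in range(4)]
D = build_dictionary(target, patch_size=3)
print(len(D), len(D[0]))  # l = 9 rows, n = 4 columns
```

Each column of D is one atom $d_i$; here a 3 by 3 patch becomes a vector of length l = 9, and the four overlapping patch positions give n = 4.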


Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of MJ with the system of Sabripour such that one attribute of the first plurality of attributes is a vectorized representation of the first object, because, as MJ states: "The candidate patches are co-located with the multiple densely overlapping patches to form a dynamic candidate dictionary Y of candidate patches. Candidate patches that best match the densely overlapping patches from the first frame are identified by an L1-norm solution, in order to identify a best-matched patch in the new frame." (MJ, Abstract).

Note: The motivation applied to claim 2 above applies equally to claims 3-10 and 12-20 as presented below.

Regarding claim 3, Sabripour and MJ teach:
3. (Original) The method of claim 2, further, MJ teaches wherein modifying the first data structure comprises modifying the vectorized representation of the first object. 


Each column in the dictionary D has feature vectors (patches) that uniquely represent certain parts of the target image. As described herein, the object window 304 and the search window 306 in FIG. 3 are partitioned into densely overlapping patches, both in the initial frame (for dictionary D) as well as in subsequent frames (for dictionary Y). Thus, these densely overlapping patches from different frames are used in the creation of the initial dictionary D (for the initial first frame) and the subsequent candidate dictionary Y (for the subsequent frame(s))
MJ, 0050-0051: emphasis added


Regarding claim 4, Sabripour and MJ teach:

4. (Original) The method of claim 3, further, MJ teaches wherein modifying the vectorized representation of the first object comprises removing a portion of vectors of the vectorized representation of the first object and adding new vectors to the vectorized representation of the first object.

Each column in the dictionary D has feature vectors (patches) that uniquely represent certain parts of the target image. As described herein, the object window 304 and the search window 306 in FIG. 3 are partitioned into densely overlapping patches, both in the initial frame (for dictionary D) as well as in subsequent frames (for dictionary Y). Thus, these densely overlapping patches from different frames are used in the creation of the initial dictionary D (for the initial first frame) and the subsequent candidate dictionary Y (for the subsequent frame(s))
MJ, 0050-0051: emphasis added

With reference now to FIG. 2, an overview of the present inventive method is shown in flowchart 200. At block 202, a first frame is initialized. For example, consider frame 300 in FIG. 3. Frame 300 is an initial frame, showing an object 302 (in this example a person). The initial frame also shows a background, which is of no interest to a tracker. That is, it is movement of the object 302 that needs to be tracked, not the objects in the background that do not move

Within the frame 300 is an object window 304, which shows the object 302 whose movement is to be tracked. The key identification information for object 302 (e.g., the subject's face) is found within object window 304. However, when the subject's face moves, so does the rest of her head (i.e., her hair). This other screen area is identified as a search window 306. Overlapping patches 308 are subcomponents of not only the search window 306 (as depicted), but also the object window 304 (not depicted). That is, both the object window 304 and the search window 306 are subdivided into unique patches
MJ, 0040-0041: emphasis added



Regarding claim 6, Sabripour and MJ teach:
6. (Original) The method of claim 3, further, MJ teaches wherein modifying the vectorized representation of the object comprises resizing a portion of vectors of the vectorized representation of the first object.  

Returning to FIG. 2, block 204 depicts the creation of an object pyramid. That is, a same image (e.g., that found in the search window 306 depicted in FIG. 3) is resized to various sizes, thus compensating for the image (but not the tracked object) getting larger or smaller as it moves closer to or farther away from the video camera.
MJ, 0042: emphasis added



Regarding claim 7, Sabripour and MJ teach:
7. (Original) The method of claim 3, further, MJ teaches wherein modifying the vectorized representation of the object comprises replacing the vectorized representation of the first object with a vectorized representation of a third object, wherein the third object is resized to be the same size as the first object.  

When tracking a particular object in a video, the candidate object in the current frame is taken as the area of pixels where the object was located in the previous frame. The object in the current frame is to be searched within the specified search window. The size of the search window depends on three level hierarchical pyramidal structure, as described and depicted in block 204 in FIG. 2. For example, in the first level of the pyramid, the size of the search window may be defined as 64×64 pixels. The second level depicts the size as 32×32 pixels, and last phase/level resizes the search window to 16×16 pixels. Patches of size 3×3 pixels are used in the search window of size 16×16. This pyramidal object representation helps the tracker in two ways. First, it is used to reduce the search space, thus reducing the amount of requisite computation, and thereby increasing the speed of the tracker. Second, capturing the object at different scales improves the robustness when the object undergoes scale change
MJ, 0055: emphasis added
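
For illustration only, the three-level pyramid described in the passage quoted above (a search window of 64 by 64 pixels resized to 32 by 32 and then to 16 by 16) can be sketched as follows. This is a minimal sketch under stated assumptions: the quoted passage does not say how the resizing is performed, so the averaging of 2 by 2 pixel blocks, the one-pixel patch stride, the synthetic window contents, and the function names are all hypothetical.

```python
def downsample_by_two(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (
                image[2 * r][2 * c]
                + image[2 * r][2 * c + 1]
                + image[2 * r + 1][2 * c]
                + image[2 * r + 1][2 * c + 1]
            )
            / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(window, levels=3):
    """Return the window at full size followed by successively halved copies."""
    pyramid = [window]
    for _ in range(levels - 1):
        pyramid.append(downsample_by_two(pyramid[-1]))
    return pyramid


# Synthetic 64x64 search window; real pixel data would come from a frame.
window = [[float((r + c) % 256) for c in range(64)] for r in range(64)]
pyramid = build_pyramid(window)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)  # [(64, 64), (32, 32), (16, 16)]

# Overlapping 3x3 patches at the coarsest level, assuming a stride of 1.
patches_at_coarsest = (16 - 3 + 1) ** 2
print(patches_at_coarsest)  # 196
```

The coarsest level holds one sixteenth of the pixels of the full window, which is consistent with the quoted statement that the pyramid reduces the search space.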


Regarding claim 8, Sabripour and MJ teach:
8. (Original) The method of claim 1, further, MJ teaches wherein the vectorized representation of the first object comprises vectorized representation of a sub-portion of the first object.  

When tracking a particular object, a portion of that particular object is represented as a candidate feature vector (i.e., candidate patch $y_k$). This candidate feature vector is created by a linear combination of a few patches in the original target image (i.e., atoms from D from the first frame). This combination of patches is called a "sparse representation", or "sparse coding", since only a few patches are required from D to generate the candidate patch $y_k$. That is, the initial dictionary D has more than enough vectors to represent a candidate patch from the subsequent frame
MJ, 0053: emphasis added


Regarding claim 9, Sabripour and MJ teach:
9. (Original) The method of claim 8, further, MJ teaches wherein modifying the first data structure comprises modifying the vectorized representation of the sub-portion of the first object. 


Overlapping patches of the search window from a subsequent screen are placed in a subsequent dictionary of overlapping patches, referred to herein as "Y". Thus, the present invention uses two dictionaries, where one (D) is static, having been initialized with the object selected in the first frame, while the other (Y) is dynamic, since it is updated based on the spatial information obtained from a confidence map
MJ, 0032: emphasis added

 
Regarding claim 10, Sabripour and MJ teach:

10. (Original) The method of claim 8, further, MJ teaches wherein modifying the first data structure comprises removing the vectorized representation of the sub-portion of the object.

Once feature vectors of a target image (from the start frame) and images (from subsequent frame(s)) are extracted and stored in respective target dictionary 'D' and candidate dictionary 'Y', a candidate patch for each image patch vector in 'Y' is created by a sparse linear combination of the image patch vectors in D. That is, a "candidate patch" is created from a combination of patches from D. This candidate patch is then compared to other patches in Y to identify which of the patches in Y are part of the same image captured by one or more patches from D. Thus, consider a candidate patch $y_k$ as defined in Equation 2:
$y_k = D\alpha_k$ (Equation 2)
where candidate patch $y_k$ is created from one or more of the atoms from dictionary D, and $\alpha_k$ is an n-dimensional coefficient vector for the candidate patch $y_k$.
MJ, 0052: emphasis added
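
For illustration only, the synthesis step of Equation 2 can be sketched as a matrix-vector product in which only the non-zero coefficients contribute. This is a minimal sketch under stated assumptions: it shows how a candidate patch is formed once a sparse coefficient vector is known, it does not show how that vector is found (the cited reference uses an L1-norm solution for that), and the toy dictionary, the coefficient values, and the function name are hypothetical.

```python
def synthesize_patch(dictionary, alpha):
    """Compute y_k = D * alpha_k, skipping atoms with a zero coefficient."""
    vector_length, atom_count = len(dictionary), len(dictionary[0])
    if len(alpha) != atom_count:
        raise ValueError("alpha must have one coefficient per atom (column)")
    active = [j for j, coefficient in enumerate(alpha) if coefficient != 0.0]
    return [
        sum(dictionary[i][j] * alpha[j] for j in active)
        for i in range(vector_length)
    ]


# Toy dictionary with l = 3 rows and n = 4 atoms (columns).
D = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 0.0, 0.0],
]
# Sparse coefficient vector: only atoms 0 and 3 are used.
alpha_k = [0.5, 0.0, 0.0, 0.25]
y_k = synthesize_patch(D, alpha_k)
print(y_k)  # [0.5, 0.5, 0.5]
```

The coefficient vector has one entry per atom, matching the statement that $\alpha_k$ is n-dimensional, and the sparsity reflects that only a few atoms are needed to form the candidate patch.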


Claims 12-14 and 16-20 list elements similar to those of claims 2-4 and 6-10, but in system form rather than method form. Therefore, the supporting rationale of the rejection of claims 2-4 and 6-10 applies equally to claims 12-14 and 16-20.

Claim Rejections - 35 USC § 103

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Sabripour et al. US 2016/0110612 and MJ US 2015/0269441.

Regarding claim 5, Sabripour and MJ teach:

5. (Original) The method of claim 3,

However, Sabripour and MJ fail to explicitly teach wherein modifying the vectorized representation of the object comprises changing a color of a portion of vectors of the vectorized representation of the first object.

Official Notice is taken that both the concept and the advantage of modifying a vectorized representation of an object by changing a color of a portion of its vectors are well known and expected in the art. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize said feature within the system taught by Sabripour and MJ, because such incorporation would result in identifying the vectorized representation of the object based on a change of color.


Prior Art

The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.  
Tsai et al. 	US 2019/0370984 
Amer et al. 	US 2019/0304157
Jang et al. 	US 2015/0254497


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL T TEKLE whose telephone number is (571)270-1117.  The examiner can normally be reached on Monday-Friday 8:00-4:30 ET.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn, can be reached on 571-272-3922.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/DANIEL T TEKLE/Examiner, Art Unit 2481