DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The following is a non-final, first Office action on the merits. 
Claims 1-20 are pending.

Information Disclosure Statement
Information disclosure statements (IDS) were submitted on 10/10/2019 and 10/27/2019. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Drawings
The Drawings filed on 10 October 2019 and 19 November 2019 have been acknowledged. 

Specification
The specification, as originally filed, has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.


Claim Objections
Claim 17 is objected to because of the following informalities:  
Claim 17 recites “a task management node” and “the task management module”. The examiner suggests that applicant review the claim language and refer to this element consistently, for example by using the term “task management module” as disclosed in the specification. Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by U.S. Patent Application Publication, US 20200026949, to Nicholas John Alcock et al, hereinafter “Alcock”.

Regarding claim 1, Alcock teaches a method (Alcock, ¶ [0101], discloses a flow diagram of an example embodiment of a method for performing appearance matching to locate an object of interest on one or more image frames of a video captured by a video capture device (camera). The video is captured by the camera over a period of time), comprising: detecting features from a sensor signal via a deep-learning network in an edge processing node (Alcock, ¶ [0102], discloses the video analytics module performs the object detection and classification, and also generates images (cropped bounding boxes) from the video that best represent the objects in the scene. Alcock, ¶ [0107], discloses the Process uses a learning machine to process the cropped bounding boxes to generate the feature vectors or signatures of the images of the objects captured in the video. The learning machine is for example a neural network such as a convolutional neural network (CNN) running on a graphics processing unit (GPU). Alcock, ¶ [0112], discloses after computing and storing the feature vectors of the detected objects in the database, searching similar objects is done using an exact nearest neighbor search, exhaustively evaluating the distance from the queried feature vector (feature vector of the object of interest) to all other vectors in the time frame of interest); creating machine-learned metadata that describes the features (Alcock, ¶ [0125], discloses the number of dimensions may be larger or smaller depending on the learning machine being used to generate the feature vectors. While higher dimensions generally have greater accuracy, the computational resources required may also be very high); creating a hash with the machine-learned metadata (Alcock, ¶ [0141], discloses a processor, such as a processor comprising part of the server 406, performs a hash-based appearance search that does not require the processor to determine the Euclidean distance between feature vectors); and storing the sensor signal as a content object at the edge processing node, the object keyed with the hash at the edge processing node (Alcock, ¶ [0142], discloses the server hashes the feature vector to generate a hash vector that comprises one or more hashes corresponding to a respective one or more components of the hash vector. The server 406 then applies a threshold criterion to each of those hashes. The hashes that satisfy the threshold criterion continue to be considered by the server, while the server subsequently ignores the hashes that do not. Following the thresholding, the server accesses a scoring database that has been generated based on different examples of the search target to determine whether the hashes that satisfy the threshold criterion are represented in that database. The server then determines a score representing a similarity of the search subject to the different examples of the search target used to generate the scoring database based on how many of the hashes that satisfy the threshold criterion are represented in the scoring database).
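For illustration only, the thresholding and scoring sequence cited from Alcock, ¶ [0142], can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not taken from Alcock or from the application: the numeric hash values, the use of a component's position as its hash identifier, the threshold of 0.5, the weights, and the names surviving_hashes and similarity_score are all hypothetical.

```python
# Illustrative sketch only; every name and value below is an assumption.

def surviving_hashes(hash_vector, threshold):
    """Return the identifiers (here: positions) of hashes that meet the threshold."""
    return [i for i, h in enumerate(hash_vector) if h >= threshold]


def similarity_score(hash_vector, scoring_db, threshold):
    """Sum the weights of surviving hashes that are represented in the scoring database."""
    return sum(
        scoring_db.get(i, 0.0)
        for i in surviving_hashes(hash_vector, threshold)
    )


# Hypothetical hash vector with four components.
hash_vector = [0.9, 0.1, 0.7, 0.4]

# Hypothetical scoring database: hash identifier -> weight.
scoring_db = {0: 1.0, 3: 0.5}

kept = surviving_hashes(hash_vector, threshold=0.5)
score = similarity_score(hash_vector, scoring_db, threshold=0.5)
```

In this sketch the hashes below the threshold are ignored before the scoring database is consulted, so a hash that is present in the database but fails the threshold contributes nothing to the score.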

Regarding claim 2, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the hash is encapsulated within the content object (Alcock, ¶ [0112-0113], discloses after computing and storing the feature vectors of the detected objects in the database, searching similar objects is done using an exact nearest neighbor search, exhaustively evaluating the distance from the queried feature vector (feature vector of the object of interest) to all other vectors in the time frame of interest).

Regarding claim 3, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the edge processing node comprises a processor 15integrated into a data storage drive that conforms to mechanical and electrical drive enclosure standards (Alcock, ¶ [0146], discloses a hash-based appearance search, whether it be for a facet of an object of interest or an object of interest itself, may be performed in accordance with a method such as that depicted in FIG. 11. The method is performed by a processor, which in the example embodiment described below comprises part of the server. However, in some other example embodiments the processor may comprise part of the camera, and in still other example embodiments may comprise part of another component of the system such as the client. Alternatively, in some other example embodiments the method may be performed in a distributed manner, with certain actions of the method being performed by different processors comprising parts of different components of the system).  

Regarding claim 4, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the sensor signal comprises a video signal and wherein the features comprise any combination of motion, people, and faces (Alcock, ¶ [0128], discloses therein is illustrated the scene and the cropped bounding boxes of the example embodiment of FIG. 4. There are shown in the scene the three people who are detected. Their images are extracted by the camera and sent to the server as the cropped bounding boxes. The images are the representative images of the three people in the video over a period of time. The three people in the video are in motion and their captured images will accordingly be different over a given period of time. To filter the images to a manageable number, a representative image (or images) is selected as the cropped bounding boxes for further processing).  

Regarding claim 5, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches storing the hash in a key query handling node separate from the edge processing node (Alcock, ¶ [0141], discloses performs a hash-based appearance search that does not require the processor to determine the Euclidean distance between feature vectors. The hash-based appearance search may be for an object of interest, such as a person or a vehicle), the hash facilitating access to the content stored at the edge processing node (Alcock, ¶ [0142], discloses the server accesses a scoring database that has been generated based on different examples of the search target to determine whether the hashes that satisfy the threshold criterion are represented in that database. The server then determines a score representing a similarity of the search subject to the different examples of the search target used to generate the scoring database based on how many of the hashes that satisfy the threshold criterion are represented in the scoring database), the key query handling module storing hash keys for content objects stored on a plurality of edge processing nodes (Alcock, ¶ [0155], discloses each hash vector has at most four components. For each of those components that is a scoring component, the scoring database stores a data structure that relates a hash identifier corresponding to one of the hash vector's components, and a weight assigned to that component).

Docket No. 0430.074475.00US (STL074475.OOUS)

Regarding claim 6, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches receiving a query at the key query handling node from a client (Alcock, ¶ [0116], discloses the user searches the database for an image of the object of interest); sending one or more queried keys to the client, the queried keys indicating a subset of the edge processing nodes having content objects matching the query (Alcock, ¶ [0118], discloses with each selection of a new image (or images) of the object of interest at selection, the feature vectors of the new images are searched at the database and new candidate images of the object of interest are presented at the client for the user to again select new images which are or may be of the object of interest); sending the queried keys from the client to the subset of edge processing nodes to retrieve the matching content objects (Alcock, ¶ [0120], discloses to initiate an appearance search for an object of interest, a feature vector of the object of interest is needed in order to search the database for similar feature vectors); and sending the matching content objects from the subset of edge processing nodes to the client in response to the sending of the queried keys (Alcock, ¶ [0120], discloses an image of an object of interest is received at the client where it is sent to the Process to generate a feature vector of the object of interest. In the second method, the user searches the database for an image of the object of interest and retrieves the feature vector of the object of interest which was previously generated when the video was processed before storage in the database).

Regarding claim 7, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches determining that the machine-learned metadata matches a predefined condition defined by a client, and sending a real-time alert to the client in response thereto (Alcock, ¶ [0097], discloses a visual indicator may be added to the image frame to visually identify each of the detected one or more foreground visual objects. The visual indicator may be a bounding box that surrounds each of the one or more foreground visual objects within the image frame. Alcock, ¶ [0107], discloses the Process uses a learning machine to process the cropped bounding boxes to generate the feature vectors or signatures of the images of the objects captured in the video. Alcock, ¶ [0131], discloses the input to the instantaneous object classification module is preferably a sub-region (for example within a bounding box) of an image in which the visual object of interest is located rather than the entire image frame. Further, Alcock, ¶ [0134], discloses the video analytics module detects the objects and extracts the images of each object. An image selected from these images is referred to as a finalization of the object. The finalizations of the objects are intended to select the best representation of the visual appearance of each object during its lifetime in the scene).

Regarding claim 8, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the hash comprises similarity-preserving vectors that are mapped from features extracted by a feature extractor (Alcock, ¶ [0133], discloses the video analytics module may use facial recognition (as is known in the art) to detect faces in the images of humans and accordingly provides confidence levels. The appearance search system of such an embodiment may include using feature vectors of the images or cropped bounding boxes of the faces instead of the whole human as shown in FIG. 8. Such facial feature vectors may be used alone or in conjunction with feature vectors of the whole object. Further, feature vectors of parts of objects may similarly be used alone or in conjunction with feature vectors of the whole object. For example, a part of an object may be an image of an ear of a human. Ear recognition to identify individuals is known in the art).

Regarding claim 9, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the hash comprises a globally unique identifier associated with the content object (Alcock, ¶ [0148], discloses the server performs a hash-based appearance search with a facet as the search subject. In this example, the facet is a “red shirt”; that is, the facet descriptor is “shirt color” and the facet tag is “red”).

Regarding claim 10, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the hash has a smaller size than the machine learned metadata (Alcock, ¶ [0113], discloses an approximate nearest neighbor search may be used. It is similar to its ‘exact’ counterpart, but it retrieves the most likely similar results without looking at all results. This is faster, but may introduce false negatives. An example of approximate nearest neighbor may use an indexing of a hashing of the feature vectors. An approximate nearest neighbor search may be faster where the number of feature vectors is large such as when the search time frames are long).  
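For illustration only, the approximate nearest neighbor search that indexes a hashing of the feature vectors, as cited from Alcock, ¶ [0113], can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not taken from Alcock or from the application: the sign-of-projection hashing scheme, the three fixed projection directions in PLANES, the sample vectors, and the names hash_key, build_index, and candidates are all hypothetical.

```python
# Illustrative sketch only; every name and value below is an assumption.
from collections import defaultdict

# Hypothetical fixed projection directions, one hash bit per direction.
PLANES = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def hash_key(vec):
    """Map a feature vector to a short bit tuple: 1 where the projection is non-negative."""
    return tuple(
        int(sum(p * v for p, v in zip(plane, vec)) >= 0)
        for plane in PLANES
    )


def build_index(vectors):
    """Group the names of the stored feature vectors by their hash key."""
    index = defaultdict(list)
    for name, vec in vectors.items():
        index[hash_key(vec)].append(name)
    return index


def candidates(index, query):
    """Return only the entries sharing the query's hash key, without scanning all vectors."""
    return index.get(hash_key(query), [])


# Hypothetical stored feature vectors.
vectors = {
    "a": (0.9, 0.2, -0.1),
    "b": (1.0, 0.1, -0.3),
    "c": (-0.8, 0.5, 0.4),
}
index = build_index(vectors)
result = candidates(index, (0.7, 0.3, -0.2))
```

In this sketch each hash key holds three bits, whereas each feature vector holds three floating-point values, and a similar vector that happens to fall in a different bucket would be missed, which corresponds to the false negatives noted in the cited passage.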

Regarding claim 11, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches after creating the hash: detecting additional features based on a change to the deep-learning network (Alcock, ¶ [0118], discloses with each selection of a new image (or images) of the object of interest at selection, the feature vectors of the new images are searched at the database and new candidate images of the object of interest are presented at the client for the user to again select new images which are or may be of the object of interest); creating additional machine-learned metadata that describes the additional features (Alcock, ¶ [0108], discloses the Process deploys a trained model in what is known as batch learning where all of the training is done before it is used in the appearance search system. The trained model, in this embodiment, is a convolutional neural network); and updating the hash based on an additional hash value created from the additional machine learned metadata (Alcock, ¶ [0107], discloses the Process uses a learning machine to process the cropped bounding boxes to generate the feature vectors or signatures of the images of the objects captured in the video. The learning machine is for example a neural network such as a convolutional neural network (CNN) running on a graphics processing unit (GPU). Alcock, ¶ [0118], discloses this searching loop of Appearance Search may continue until the user decides enough images of the object of interest has been located and ends the search. The user may then, for example, view or download the videos associated with the images on the list).

Regarding claim 12, Alcock teaches an edge processing node (At least, Alcock, FIG. 1, 2A, and 2B, discloses video processing appliance/devices), comprising: a processor; a data storage medium; input/output circuitry coupled to the processor and the data storage medium and configured to receive a sensor signal (Alcock, FIG. 1, discloses features such as client device, camera with memory sensors and CPU, etc.); and memory that stores instructions that are operable by the processor to: detect features from the sensor signal via a deep-learning network or other feature engineering method (Alcock, ¶ [0102], discloses the video analytics module performs the object detection and classification, and also generates images (cropped bounding boxes) from the video that best represent the objects in the scene. Alcock, ¶ [0107], discloses the Process uses a learning machine to process the cropped bounding boxes to generate the feature vectors or signatures of the images of the objects captured in the video); create machine-learned metadata that describes the features (Alcock, ¶ [0125], discloses the number of dimensions may be larger or smaller depending on the learning machine being used to generate the feature vectors. While higher dimensions generally have greater accuracy, the computational resources required may also be very high); create a hash with the machine-learned metadata (Alcock, ¶ [0141], discloses a processor, such as a processor comprising part of the server 406, performs a hash-based appearance search that does not require the processor to determine the Euclidean distance between feature vectors. The hash-based appearance search may be for an object of interest, such as a person or a vehicle, as described in the example embodiments of FIGS. 1-9. Alternatively, the hash-based appearance search may be for a facet of an object of interest); and store the sensor signal as a content object in the data storage medium, the object keyed with the hash (Alcock, ¶ [0142], discloses the server hashes the feature vector to generate a hash vector that comprises one or more hashes corresponding to a respective one or more components of the hash vector. The server 406 then applies a threshold criterion to each of those hashes. The hashes that satisfy the threshold criterion continue to be considered by the server, while the server subsequently ignores the hashes that do not).

Regarding claim 13, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the edge processing node conforms to mechanical and electrical drive enclosure standards (Alcock, ¶ [0146], discloses a hash-based appearance search, whether it be for a facet of an object of interest or an object of interest itself, may be performed in accordance with a method such as that depicted in FIG. 11. The method is performed by a processor, which in the example embodiment described below comprises part of the server. However, in some other example embodiments the processor may comprise part of the camera, and in still other example embodiments may comprise part of another component of the system such as the client. Alternatively, in some other example embodiments the method may be performed in a distributed manner, with certain actions of the method being performed by different processors comprising parts of different components of the system).

Regarding claim 14, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the sensor signal comprises a video signal and wherein the features comprise any combination of motion, people, and faces (Alcock, ¶ [0128], discloses therein is illustrated the scene and the cropped bounding boxes of the example embodiment of FIG. 4. There are shown in the scene the three people who are detected. Their images are extracted by the camera and sent to the server as the cropped bounding boxes. The images are the representative images of the three people in the video over a period of time. The three people in the video are in motion and their captured images will accordingly be different over a given period of time. To filter the images to a manageable number, a representative image (or images) is selected as the cropped bounding boxes for further processing).  

Regarding claim 15, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the processor is further operable to store the hash in a key query handling node separate from the edge processing node (Alcock, ¶ [0141], discloses performs a hash-based appearance search that does not require the processor to determine the Euclidean distance between feature vectors. The hash-based appearance search may be for an object of interest, such as a person or a vehicle), the hash facilitating access to the content stored at the edge processing node (Alcock, ¶ [0142], discloses the server accesses a scoring database that has been generated based on different examples of the search target to determine whether the hashes that satisfy the threshold criterion are represented in that database. The server then determines a score representing a similarity of the search subject to the different examples of the search target used to generate the scoring database based on how many of the hashes that satisfy the threshold criterion are represented in the scoring database), the key query handling module storing hash keys for content objects stored on a plurality of edge processing nodes (Alcock, ¶ [0155], discloses each hash vector has at most four components. For each of those components that is a scoring component, the scoring database stores a data structure that relates a hash identifier corresponding to one of the hash vector's components, and a weight assigned to that component) and processing query requests to identify matching keys stored (Alcock, ¶ [0113], discloses an approximate nearest neighbor search may be used. It is similar to its ‘exact’ counterpart, but it retrieves the most likely similar results without looking at all results. This is faster, but may introduce false negatives. An example of approximate nearest neighbor may use an indexing of a hashing of the feature vectors. An approximate nearest neighbor search may be faster where the number of feature vectors is large such as when the search time frames are long. Further, Alcock, ¶ [0113], discloses appearance Search for performing appearance matching at the client to locate recorded videos of an object of interest. To initiate an appearance search for an object of interest, a feature vector of the object of interest is needed in order to search the database for similar feature vectors).

Regarding claim 16, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the processor is further operable to determine that the machine-learned metadata matches a predefined condition defined by a client, and send a real-time alert to the client in response thereto (Alcock, ¶ [0097]).

Regarding claim 17, Alcock teaches a system (Alcock, FIG. 1, illustrates a block diagram of connected devices of a video capture and playback system according to an example embodiment) comprising: a plurality of sensors that generate a respective plurality of data streams (Alcock, FIG. 1, discloses features such as client device, camera with memory sensor and CPU, etc. Further, Alcock, ¶ [0159], discloses the search subject may be an individual, and the different examples of the search target may comprise images of that individual taken at different times from different video ; a task management node that directs the plurality of data streams to a respective plurality of active object store nodes (Alcock, ¶ [0101], discloses a flow diagram of an example embodiment of a method for performing appearance matching to locate an object of interest on one or more image frames of a video captured by a video capture device (camera). The video is captured by the camera over a period of time), each of the active object store nodes operable to: detect features from the data stream via a deep-learning network or other feature engineering methods (Alcock, ¶ [0102], discloses the video analytics module performs the object detection and classification, and also generates images (cropped bounding boxes) from the video that best represent the objects in the scene. Alcock, ¶ [0107], discloses the Process uses a learning machine to process the cropped bounding boxes to generate the feature vectors or signatures of the images of the objects captured in the video. The learning machine is for example a neural network such as a convolutional neural network (CNN) running on a graphics processing unit (GPU). Alcock, ¶ [0112], discloses after computing and storing the feature vectors of the detected objects in the database, searching similar objects is done using an exact nearest neighbor search, exhaustively evaluating the distance from the queried feature vector (feature vector of the object of interest) to all other vectors in the time frame of interest); create machine-learned metadata that describes the features (Alcock, ¶ [0125], discloses the number of dimensions may be larger or smaller depending on the learning machine being used to generate the feature vectors. While higher dimensions generally have greater accuracy, the computational resources required may also be very high); create a hash with the machine-learned metadata (Alcock, ¶ [0141], discloses a processor, such as a processor comprising part of the server 406, performs a hash-based appearance search that does not require the processor to determine the Euclidean distance between feature vectors); and store the sensor signal as a content object in the active object store node, the object keyed with the hash (Alcock, ¶ [0142], discloses the server hashes the feature vector to generate a hash vector that comprises one or more hashes corresponding to a respective one or more components of the hash vector. The server 406 then applies a threshold criterion to each of those hashes. The hashes that satisfy the threshold criterion continue to be considered by the server, while the server subsequently ignores the hashes that do not. Following the thresholding, the server accesses a scoring database that has been generated based on different examples of the search target to determine whether the hashes that satisfy the threshold criterion are represented in that database. The server then determines a score representing a similarity of the search subject to the different examples of the search target used to generate the scoring database based on how many of the hashes that satisfy the threshold criterion are represented in the scoring database); and a key query node operable to: receive and store the hashes from the active object store nodes (Alcock, ¶ [0147], discloses the server begins performing the method … the server obtains a hash vector representing a search subject depicted in an image, with the hash vector comprising one or more hashes as a respective one or more components of the hash vector); receive a query via the task management module (Alcock, ¶ [0157]); process the query and identify matches in the stored hashes (Alcock, ¶ [0159], discloses the method of FIG. 12 begins and proceeds where the server obtains a hash vector training set that comprises training hash vectors representing respective examples of a search target common to training images. Continuing the example of the “red shirt” facet used in respect of FIG. 11 above, the training hash vectors are derived from respective chips that are different from each other and that have been verified to in fact depict a person wearing a red shirt. Verification may be done, for example, by a user via the client. The examples of the search target may vary with the specificity of the search to be done); and return a list of the hashes that satisfy the query, the list of the hashes facilitating access to a subset of the content objects stored at a subset of the active object store nodes (Alcock, ¶ [0159], discloses the examples of the search target may vary with the specificity of the search to be done. For example, in the context of the “red shirt” facet used in respect of FIG. 11, all of the facet examples used to generate the hash vector training set may be of identical type (the “shirt” clothing type) and value (red)).

Regarding claim 18, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the hash is encapsulated within the content object (Alcock, ¶ [0112-0113], discloses after computing and storing the feature vectors of the detected objects in the database, searching similar objects is done using an exact nearest neighbor search, exhaustively evaluating the distance from the queried feature vector (feature vector of the object of interest) to all other vectors in the time frame of interest).

Regarding claim 19, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the active object store nodes and the key query node are implemented as edge processing nodes that conform to mechanical and electrical drive enclosure standards (Alcock, ¶ [0146], discloses a hash-based appearance search, whether it be for a facet of an object of interest or an object of interest itself, may be performed in accordance with a method such as that depicted in FIG. 11. The method is performed by a processor, which in the example embodiment described below comprises part of the server. However, in some other example embodiments the processor may comprise part of the camera, and in still other example embodiments may comprise part of another component of the system such as the client. Alternatively, in some other example embodiments the method may be performed in a distributed manner, with certain actions of the method being performed by different processors comprising parts of different components of the system).

Regarding claim 20, Alcock teaches the claimed invention substantially as claimed, and Alcock further teaches the edge processing nodes are each further operable to determine that the machine-learned metadata matches a predefined condition defined by a client, and send a real-time alert to the client in response thereto (Alcock, ¶ [0097], discloses a visual indicator may be added to the image frame to visually identify each of the detected one or more foreground visual objects. The visual indicator may be a bounding box that surrounds each of the one or more foreground visual objects within the image frame. Alcock, ¶ [0107], discloses the Process uses a learning machine to process the cropped bounding boxes to generate the feature vectors or signatures of the images of the objects captured in the video. Alcock, ¶ [0131], discloses the input to the instantaneous object classification module is preferably a sub-region (for example within a bounding box) of an image in which the visual object of interest is located rather than the entire image frame. Further, Alcock, ¶ [0134], discloses the video analytics module detects the objects and extracts the images of each object. An image selected from these images is referred to as a finalization of the object. The finalizations of the objects are intended to select the best representation of the visual appearance of each object during its lifetime in the scene. A finalization is used to extract a signature/feature vector which can further be used to query other finalizations to retrieve the closest match in an appearance search setting).




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US PGPub 20190327506 (Zou et al) discloses edge compute servers are assigned to predefined number of cameras to perform computational tasks, such as feature extraction, event detection, object identification, target tracking and so forth.


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALICIA M ANTOINE whose telephone number is (571)431-0687.  The examiner can normally be reached on Mon - Fri: 9am - 3pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, PIERRE M VITAL, can be reached on 571-272-4215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.






/ALICIA M ANTOINE/
Examiner, Art Unit 2162
11/06/2021