Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Mody US PGPub: US 2018/0197067 A1 Jul. 12, 2018.
 
A CNN structure often receives input data, which is usually an RGB color image, and convolves samples of the input data with a set of pre-trained weights. A non-linear activation function, such as tanh, sigmoid, or ReLU, follows the convolution layers. Several such convolution layers are used together to provide robust feature identification. Pooling layers, often max-pooling, are inserted between convolution layers to provide some invariance to the size of objects in the image data (ABSTRACT, Figs. 1, 2, 15, paragraph 0004).
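
For illustration only, the convolution, activation, and pooling sequence summarized above can be sketched in Python on a small single-channel array. The kernel values, the 2x2 pooling window, and all function names are assumptions chosen for this sketch and are not taken from Mody.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of image with kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace every negative value with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 5x5 single-channel input (a real RGB image would have three channels).
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]  # hypothetical stand-in for pre-trained weights

conv_out = convolve2d(image, kernel)   # 4x4 feature map
pooled = max_pool(relu(conv_out))      # 2x2 map after ReLU and pooling
```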

Boesch US PGPub: US 2018/0189642 A1 Jul. 5, 2018.
A configurable accelerator framework device that includes a stream switch and a plurality of convolution accelerators. The stream switch has a plurality of input ports and a plurality of output ports. Each of the input ports is configurable at run time to unidirectionally pass data to any one or more of the output ports via a stream link. Each one of the plurality of convolution accelerators is configurable at run time to unidirectionally receive input data via at least two of the plurality of stream switch output ports, and each one of the plurality of convolution accelerators is further configurable at run time to unidirectionally communicate output data via an input port of the stream switch (ABSTRACT).
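
As a loose software analogy for the run-time configurable routing described in the Boesch abstract, the following Python sketch models a switch whose input ports can each be pointed at one or more output ports. The class name, method names, and port numbering are hypothetical and do not reflect the hardware interface of the reference.

```python
class StreamSwitch:
    """Toy model: each input port forwards data one way to a set of output ports."""

    def __init__(self, num_inputs, num_outputs):
        self.num_inputs = num_inputs
        self.num_outputs = num_outputs
        self.routes = {}  # input port -> set of output ports

    def configure(self, input_port, output_ports):
        """(Re)configure at run time which output ports an input port feeds."""
        if not 0 <= input_port < self.num_inputs:
            raise ValueError("unknown input port")
        targets = set(output_ports)
        if any(not 0 <= p < self.num_outputs for p in targets):
            raise ValueError("unknown output port")
        self.routes[input_port] = targets

    def send(self, input_port, data):
        """Return a mapping of output port -> data delivered on that port."""
        return {out: data for out in sorted(self.routes.get(input_port, ()))}


switch = StreamSwitch(num_inputs=4, num_outputs=4)
switch.configure(0, [1, 2])             # fan out input 0 to outputs 1 and 2
delivered = switch.send(0, "frame-0")
switch.configure(0, [3])                # reroute input 0 at run time
rerouted = switch.send(0, "frame-1")
```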

An optional normalization operation is also illustrated in FIG. 1I. The normalization operation is typically performed by a Rectified Linear Unit (ReLU). The ReLU identifies every negative number in the pooled output map and replaces the negative number with the value of zero (i.e., “0”) in a normalized output map. After processing in the ReLU layer, data in the normalized output map may be averaged in order to predict whether or not the feature of interest characterized by the kernel is found or is not found in the unknown image (Fig. 1I, paragraphs 0045, 0046).
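
The two steps summarized above, zeroing negative values and then averaging the result, can be sketched in Python as follows. The sample values and function names are assumptions for illustration and do not come from Boesch.

```python
def relu_map(pooled):
    """Replace each negative entry of the pooled output map with zero."""
    return [[v if v > 0 else 0 for v in row] for row in pooled]


def mean_of_map(feature_map):
    """Average all entries of a 2D map into a single score."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


pooled_output = [[0.5, -0.25], [-1.0, 0.75]]   # hypothetical pooled output map
normalized = relu_map(pooled_output)           # negatives replaced by 0
score = mean_of_map(normalized)                # averaged value used for the prediction
```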

A plurality of processing modules 410, where a second processing module 410 is an H264 processing module 410b, which is arranged to perform particular video encoding/decoding operations. A third processing module 410 is a color converter processing module 410n, which is arranged to perform color-based operations on certain multimedia data (Fig. 4A/410, paragraph 0175).

Butt US PGPub: US 2018/0157939 A1 Jun. 7, 2018.
An appearance search system comprising one or more cameras configured to capture video of a scene, the video having images of objects. The method comprises identifying one or more of the objects within the images of the objects. The method further comprises implementing a learning machine configured to generate signatures of the identified objects and generate a signature of an object of interest. The method further comprises comparing the signatures of the identified objects with the signature of the object of interest to generate similarity scores for the identified objects, and transmitting an instruction for presenting on a display one or more of the images of the objects based on the similarity scores (ABSTRACT).
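
The comparison step in the Butt abstract can be sketched in Python as follows. The abstract does not state which similarity measure is used; cosine similarity is an assumption made here, and the signature vectors and object identifiers are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two signature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_signature, signatures):
    """Score every identified object against the object of interest, best match first."""
    scores = {
        obj_id: cosine_similarity(query_signature, sig)
        for obj_id, sig in signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical signatures of identified objects and of the object of interest.
signatures = {"obj_a": [1.0, 0.0], "obj_b": [0.0, 1.0], "obj_c": [3.0, 4.0]}
query = [1.0, 0.0]
ranked = rank_by_similarity(query, signatures)
```

The ranked list could then drive which object images are presented on a display.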

Processing video may include, but is not limited to, image processing operations, analyzing, managing, compressing, encoding, storing, transmitting and/or playing back the video data. Analyzing the video may include segmenting areas of image frames and detecting visual objects, tracking and/or classifying visual objects located within the captured scene represented by the image data. The video camera 108 is an analog camera connected to an encoder (paragraphs 0063, 0072). 

The video analytics module 224 receives image data and analyzes the image data to determine properties or characteristics of the captured image or video and/or of objects found in the scene represented by the image or video. The determination may include one or more of foreground/background segmentation, object detection, object tracking, object classification, virtual tripwire, anomaly detection, facial detection, facial recognition, license plate recognition, identifying objects “left behind” or “removed”, and business intelligence (Fig. 3/300, paragraphs 0092, 0104).

The metadata may define the location (reference coordinates) of the foreground visual object within the image frame. For example, the location metadata may be further used to generate a bounding box (such as, for example, when encoding video or playing back video) outlining the detected foreground visual object (Fig. 3, paragraph 0106).
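
One way a bounding box could be derived from foreground location data is sketched below in Python. The binary-mask representation and the (x_min, y_min, x_max, y_max) box format are assumptions for this sketch; Butt's actual metadata format is not reproduced here.

```python
def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the non-zero pixels, or None if there are none."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical foreground mask: 1 marks a foreground pixel.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
```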

Sharma US PGPub: US 2017/0032285 A1 Feb. 2, 2017.
Authenticating physical objects using a machine learning method, including classifying the image into one or more classes based on the product specification. The product specification includes a name of a brand, a product line, or other details on a label of the physical object (ABSTRACT).

The first network architecture consists of 3 convolution layers along with 3 max-pooling layers and ReLU (Rectified Linear Unit), followed by 2 independent convolution layers (which do not have max-pooling layers) and 3 fully connected layers in the final section (Figs. 3 - 5, paragraph 0054).

Allowable Subject Matter
Claims 1 – 8 are allowed.
The following is the examiner’s statement of reasons for allowance:

Claim 1 and its dependent claims are allowed because the closest prior art, either alone or in combination, fails to anticipate or render obvious a neural network architecture, comprising: a foreground attentive subnetwork, including: an encoder subnetwork configured to extract features from an RGB image, including passing the RGB image through two convolutional layers including learned filters, a rectified linear unit (ReLU), and a max pooling kernel; and a decoder subnetwork configured to formulate a binary mask of the RGB image foreground, including passing the extracted features through two deconvolutional layers including learned filters; a body part subnetwork configured to: concurrently with formulating the binary mask, averagely slice the extracted features into four equal sliced feature maps; for each of the four equal sliced feature maps: discriminatively learn feature representations of different body parts, including passing the sliced feature map through two convolutional layers, a rectified linear unit (ReLU), and two max pool kernels; and learn a local feature map of a body part, including passing output of the two max pool kernels to a first fully connected layer, and passing output from the first fully connected layer to a second fully connected layer; a feature fusion subnetwork configured to: concatenate outputs from the first fully connected layers into a third fully connected layer; concatenate outputs from the second fully connected layers and the third fully connected layer into a normalization layer; and normalize feature vectors on a unit sphere space; and a symmetric triplet loss layer configured to learn final feature vectors, in combination with all other limitations in the claim(s) as defined by applicant.
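
For orientation only, two of the recited operations, slicing extracted features into four equal maps and normalizing a feature vector onto the unit sphere, can be sketched in Python. The slicing axis (rows), the toy sizes, and the function names are assumptions; the claim does not specify them, and the learned layers between these steps are omitted.

```python
import math


def slice_into_four(feature_map):
    """Split a 2D feature map into four equal bands along its rows (assumed axis)."""
    rows = len(feature_map)
    if rows % 4 != 0:
        raise ValueError("row count must be divisible by 4 for equal slices")
    band = rows // 4
    return [feature_map[i * band:(i + 1) * band] for i in range(4)]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        return list(vector)
    return [v / norm for v in vector]


# Toy 8x2 feature map standing in for the extracted features.
feature_map = [[r * 2 + c for c in range(2)] for r in range(8)]
slices = slice_into_four(feature_map)

# Hypothetical fused feature vector standing in for the fusion subnetwork output.
unit_vector = l2_normalize([3.0, 4.0])
```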


Claim 4 and its dependent claims are allowed because the closest prior art, either alone or in combination, fails to anticipate or render obvious a method for person re-identification, comprising: extracting features from an RGB image, including passing the RGB image through two convolutional layers including learned filters, a rectified linear unit (ReLU), and a max pooling kernel; formulating a binary mask of the RGB image foreground, including passing the extracted features through two deconvolutional layers including learned filters; concurrently with formulating the binary mask, averagely slicing the extracted features into four equal sliced feature maps; for each of the four equal sliced feature maps: discriminatively learning feature representations of different body parts, including passing the sliced feature map through two convolutional layers, a rectified linear unit (ReLU), and two max pool kernels; and learning a local feature map of a body part, including passing output of the two max pool kernels to a first fully connected layer, and passing output from the first fully connected layer to a second fully connected layer; concatenating outputs from the first fully connected layers into a third fully connected layer; concatenating outputs from the second fully connected layers and the third fully connected layer into a normalization layer; normalizing feature vectors on a unit sphere space; and learning final feature vectors at a symmetric triplet loss layer, in combination with all other limitations in the claim(s) as defined by applicant.
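
As background on the final recited step, a conventional margin-based triplet loss can be sketched in Python as follows. This is the standard formulation and not the symmetric variant recited in the claims, which this action does not define; the margin value, the squared Euclidean distance, and the sample vectors are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: zero once the negative is at least `margin` farther than the positive."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


# Hypothetical unit-length feature vectors.
anchor = [1.0, 0.0]
positive = [0.0, 1.0]    # same identity as the anchor
negative = [-1.0, 0.0]   # different identity

satisfied = triplet_loss(anchor, positive, negative)   # positive is closer by more than the margin
violated = triplet_loss(anchor, negative, positive)    # roles swapped, so the loss is positive
```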

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIMESH PATEL, whose telephone number is (571) 270-1228.  The examiner can normally be reached Monday through Friday, 6:30 AM - 3:30 PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rafael Perez-Gutierrez, can be reached at 571-272-7915.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NIMESH PATEL/Primary Examiner, Art Unit 2642