DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 7/27/2021 has been entered.
 
Response to Arguments
Applicant's arguments filed 7/27/21 are moot in view of new grounds of rejection.
The Examiner withdraws the double patenting rejection.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-26 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-12 of U.S. Patent No. 9400925 and claims 1-26 of U.S. Patent No. 10402632. Although the claims at issue are not identical, they are not patentably distinct from each other because they recite substantially the same limitations.  

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5 and 7-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bourdev (“Describing People: A Poselet-Based Approach to Attribute Classification”), hereafter referred to as Bourdev, in view of Yang (PGPub 2011/0222724), hereafter referred to as Yang, in view of Bourdev (“Detecting People Using Mutually Consistent Poselet Activations”), hereafter referred to as Bourdev2 (see footnote 1), in view of Bourdev (“Poselets: Body Part Detectors Trained Using 3D Human Pose Annotation”), hereafter referred to as Bourdev3 (see footnote 2).


locating a plurality of candidate patches from an image, wherein each candidate patch comprises at least a portion of the image; (Bourdev, Section 4, Step 1. “We detect the poselets [see footnote 3] on the test image and determine which ones are true positives referring to the target person (Section 5)”)
determining a plurality of part patches for the image based on comparisons between each candidate patch with multiple training patches corresponding to multiple distinct human body portions or poses, wherein each determined part patch comprises a candidate patch having a closeness greater than a threshold value to a training patch corresponding to one of the multiple human body portions or poses, and wherein the plurality of determined part patches correspond to one or more of the multiple distinct human body portions or poses; (Bourdev2, section 3, “For each seed window we extract patches from other training examples that have similar local keypoint configuration. Following [14], we compute a similarity transform that aligns the keypoints of each annotated image of a person with the keypoint configuration within the seed window and we discard any annotations whose residual error is too high. In the absence of 3D annotation, we propose the following distance metric: D(P1, P2) = Dproc(P1, P2) + Dvis(P1, P2), (3) where Dproc is the Procrustes distance between the common keypoints in the seed and destination patch and Dvis is a visibility distance, set to the intersection over union of the keypoints present in both patches. Dvis has the effect of ensuring that the two configurations have a similar aspect, which is an important cue when 3D information is not available”, thus each part patch is compared with a plurality of training patches to generate a score, the score is compared with a threshold and patches with large errors are discarded)
generating a plurality of sets of feature data corresponding to the one or more distinct human body portions or poses by processing each of the plurality of part patches with a plurality of (Bourdev, Section 4, Step 2, “For each poselet type i we extract a feature vector φi from the image patch of the activation, as described in Section 6. The feature vector consists of HOG cells at three scales, a color histogram and skin-mask features.”, See also section 6 and section 7.1, “We train a separate classifier for each of the 1200 poselet types i and for each attribute j.”)
determining, based on the plurality of sets of feature data, whether a human attribute exists in the image. (Bourdev, Section 4, steps 3-5 and Section 7, where various human attributes are determined based on the classification engine, such as “Is male”, “Is Female”, “Long Pants”, “Has Hat”, “long hair”, etc. See Figs. 4 and 7)
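For illustration only, the distance metric quoted from Bourdev2 above, D(P1, P2) = Dproc(P1, P2) + Dvis(P1, P2), might be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference: the function names, the dictionary representation of keypoints, and the `max_distance` threshold are hypothetical, and the visibility term is written as 1 minus the intersection over union so that it behaves as a distance, which is an assumption rather than something the quoted passage states.

```python
def procrustes_distance(a, b):
    """Residual left after the best similarity transform maps 2D points a onto b."""
    # Treat each (x, y) keypoint as a complex number so that rotation plus
    # uniform scale is multiplication by one complex coefficient.
    za = [complex(x, y) for x, y in a]
    zb = [complex(x, y) for x, y in b]
    ca = sum(za) / len(za)
    cb = sum(zb) / len(zb)
    za = [z - ca for z in za]  # centering removes translation
    zb = [z - cb for z in zb]
    denom = sum(abs(z) ** 2 for z in za)
    if denom == 0:
        return sum(abs(z) ** 2 for z in zb)
    # Closed-form least-squares coefficient for q ~ coef * p.
    coef = sum(p.conjugate() * q for p, q in zip(za, zb)) / denom
    return sum(abs(q - coef * p) ** 2 for p, q in zip(za, zb))


def patch_distance(p1, p2):
    """D = Dproc + Dvis for two patches given as {keypoint_name: (x, y)}."""
    common = sorted(set(p1) & set(p2))
    union = set(p1) | set(p2)
    if not common:
        return float("inf")
    d_proc = procrustes_distance([p1[k] for k in common],
                                 [p2[k] for k in common])
    # ASSUMPTION: visibility distance taken as 1 - IoU of the keypoint sets.
    d_vis = 1.0 - len(common) / len(union)
    return d_proc + d_vis


def select_part_patches(seed, candidates, max_distance):
    """Keep candidates close enough to the seed; discard the rest."""
    return [c for c in candidates if patch_distance(seed, c) <= max_distance]


seed = {"l_eye": (0, 0), "r_eye": (2, 0), "nose": (1, 1)}
# Same configuration, rotated 90 degrees, scaled by 2, and shifted.
moved = {"l_eye": (5, 3), "r_eye": (5, 7), "nose": (3, 5)}
# Only two of the three keypoints are visible.
partial = {"l_eye": (1, 1), "r_eye": (4, 5)}
kept = select_part_patches(seed, [moved, partial], max_distance=0.1)
```

In this sketch the rotated and scaled copy is kept because its aligned residual is essentially zero, while the patch with a missing keypoint is discarded on the visibility term alone.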
Bourdev discloses the above steps for classification of a human attribute in a digital image. Bourdev does not explicitly disclose a “convolutional neural network” as claimed.
Yang (paragraph 6) discloses using a convolutional neural network (CNN) in order to estimate age and gender. Yang (paragraph 14) discloses multiple stages of feature extraction using a CNN. Figure 2 also shows an example of this, where the convolution and subsampling operations are iteratively performed to generate feature vectors (concatenated) for determining human attributes.
Bourdev and Yang are combinable because they are from the same field of endeavor of analysis of images to determine human attributes.
At the time of the invention, it would have been obvious to a person of ordinary skill in the art to use a CNN to generate the feature vector.
The suggestion/motivation for doing so would have been to have a system that learns as more data is entered.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
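For illustration only, the combination discussed above (one convolution and subsampling stage applied to each part patch, with the resulting feature data concatenated) might be sketched as follows. This is a minimal sketch and not the architecture of Yang or of the claimed invention: the function names, the 2x2 averaging used for subsampling, and the kernels are all hypothetical, and no learned weights or nonlinearities are modeled.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def subsample2x2(fmap):
    """Average non-overlapping 2x2 blocks; an odd trailing row or column is dropped."""
    return [[(fmap[i][j] + fmap[i][j + 1]
              + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]


def extract_features(patch, kernels):
    """One convolution + subsampling stage per kernel, flattened to a vector."""
    features = []
    for kernel in kernels:
        fmap = subsample2x2(conv2d_valid(patch, kernel))
        features.extend(value for row in fmap for value in row)
    return features


def concatenated_features(part_patches, kernels_per_part):
    """Process each part patch with its own kernel set and concatenate."""
    out = []
    for patch, kernels in zip(part_patches, kernels_per_part):
        out.extend(extract_features(patch, kernels))
    return out


ones_patch = [[1] * 5 for _ in range(5)]
box_kernel = [[1, 1], [1, 1]]
single = extract_features(ones_patch, [box_kernel])
combined = concatenated_features([ones_patch, ones_patch],
                                 [[box_kernel], [box_kernel]])
```

Each part patch has its own kernel set here only to mirror the "plurality of convolutional neural networks" language; a practical network would stack several such stages.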


Bourdev in view of Yang discloses 2. (Currently Amended) The method of Claim 1, wherein locating the plurality of candidate patches from the image comprises: scanning the image using a plurality of windows having various sizes; (Bourdev3, Section 1, pg. 2 left column, before last paragraph, “Given a set of poselets (256 linear support vector machines in the current implementation), we scan the input image at multiple scales and use the outputs of these to, in turn, vote for location of the torso bounds or body keypoints (Section 5). This is essentially a Hough transform step in which we weigh each vote using weights learned in a max-margin framework [10].”; see also section 3, last paragraph, “We chose instead to run a scanning window over all positions and scales of all annotations in our training set.”) (Bourdev3, Section 3, last paragraph, “We have a simple and efficient procedure to generate a poselet candidate from our training data: Given a rectangular window from one human annotation, we use the above described least-squares method to find the closest corresponding window from every other human annotation in our training set and we keep the examples whose residual distance is less than λ. The parameter λ controls the tradeoff between quantity and quality of the examples.”) wherein the comparisons between each candidate patch with multiple training patches corresponding to multiple distinct human body portions or poses comprises comparing scanned portions of the image confined by the windows with a plurality of training patches from a database, wherein the training patches are annotated with keypoints of body parts and the database comprises the training patches that form a cluster in a 3D configuration space corresponding to a recognized human body portion or pose. (Bourdev3, section 2, paragraph 2, “H3D currently consists of 2000 annotations3 which we have split into 1500 training 500 test human annotations. 
We have chosen the images from Flickr with Creative Commons Attributions License4 which allows free redistribution and derivative work. H3D provides annotation of 15 types of regions of a person (such as “face”, “upper clothes”, “hair”, “hat”, “left leg”, “background”) and 19 types of keypoint annotations, which include joints, eyes, nose, etc. Cross-referencing appearance and 3D structure allows us to do new and powerful types of queries for pose statistics and appearance, described below.”)
At the time of the invention, it would have been obvious to a person of ordinary skill in the art to use the process of Bourdev3.
The suggestion/motivation for doing so would have been to have a system that learns as more data is entered.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Bourdev in view of Yang and Bourdev3 to obtain the invention as specified in claim 2.
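For illustration only, scanning an image with a plurality of windows of various sizes, as discussed for claim 2 above, might be sketched as follows. This is a minimal sketch: the function name, the fixed stride, and the example window sizes are hypothetical and are not taken from Bourdev3.

```python
def scan_windows(image_w, image_h, window_sizes, stride):
    """Yield (x, y, w, h) for every window of every size that fits in the image."""
    for w, h in window_sizes:
        for y in range(0, image_h - h + 1, stride):
            for x in range(0, image_w - w + 1, stride):
                yield (x, y, w, h)


# A 4x4 image scanned with a 2x2 window and a 4x4 window, stride 2.
windows = list(scan_windows(4, 4, [(2, 2), (4, 4)], stride=2))
```

Each yielded window delimits a candidate patch that could then be compared against the training patches.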


Bourdev in view of Yang discloses 3. (Original) The method of Claim 1, wherein processing each of the plurality of part patches with the plurality of convolutional neural networks is based on a plurality of convolution operations, wherein at least one of the convolution operations uses a plurality of filters having dimensions of more than one. (Yang, Fig. 2)


Bourdev in view of Yang discloses 4. (Original) The method of Claim 3, wherein the plurality of filters are capable of detecting spatially local correlations present in the part patches. (Yang, Fig. 2)

(Yang, paragraph 19, “For every frame of the input video, the system performs face detection and tracking, then the detected faces are aligned and normalized to 64x64 patches and fed to the CNN recognition engine to estimate the gender and age. The face detection and face alignment modules are also based on certain CNN models.”)

Bourdev in view of Yang discloses 7. (Original) The method of Claim 1, further comprising: resizing the plurality of part patches to a common resolution, where the common resolution is a required resolution for processing each of the plurality of part patches with the plurality of convolutional neural networks. (Yang, paragraph 19, “For every frame of the input video, the system performs face detection and tracking, then the detected faces are aligned and normalized to 64x64 patches and fed to the CNN recognition engine to estimate the gender and age. The face detection and face alignment modules are also based on certain CNN models.”)
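For illustration only, resizing part patches to a common resolution before they are fed to a network might be sketched as follows. This is a minimal sketch: nearest-neighbor sampling and the function name are assumptions chosen for brevity, and the cited passage of Yang does not say which interpolation is used to normalize faces to 64x64.

```python
def resize_nearest(patch, out_h, out_w):
    """Resize a 2D patch (list of rows) by nearest-neighbor sampling."""
    in_h, in_w = len(patch), len(patch[0])
    return [[patch[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)]
            for i in range(out_h)]


small = [[1, 2], [3, 4]]
upscaled = resize_nearest(small, 4, 4)
# Patches of different sizes all end up at the same 64x64 resolution.
common = [resize_nearest(p, 64, 64) for p in (small, upscaled)]
```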

Bourdev in view of Yang discloses 8. (Original) The method of Claim 1, further comprising: breaking down the plurality of part patches into three layers based on red, green and blue channels of the plurality of part patches. (Bourdev, Section 4, Step 2, “For each poselet type i we extract a feature vector φi from the image patch of the activation, as described in Section 6. The feature vector consists of HOG cells at three scales, a color histogram and skin-mask features.”, See also section 6)

Bourdev in view of Yang discloses 9. (Original) The method of Claim 1, further comprising: presenting, through an output interface of the computing device, a signal indicating whether the human attribute exists in the image. (Bourdev, see Fig. 7)

Bourdev in view of Yang discloses 10. (Original) The method of Claim 1, further comprising: concatenating the plurality of sets of feature data to generate a set of concatenated feature data. (Bourdev, Section 3, where the feature vector consisting of the 3 scales, color histogram and skin-mask features is input into a linear SVM)

Bourdev in view of Yang discloses 11. (Original) The method of Claim 10, further comprising: locating a whole-body portion from the image, wherein the whole-body portion covers an entire human body depicted in the image; feeding the whole-body portion into a deep neural network to generate a set of whole-body feature data; and incorporating the set of whole-body feature data into the set of concatenated feature data. (Bourdev, See Fig. 4)

Bourdev in view of Yang discloses 12. (Original) The method of Claim 1, wherein determining whether the human attribute exists in the image comprises: calculating a prediction score indicating a likelihood of the human attribute existing in the image. (Bourdev, See Section 8, Fig. 7)

Bourdev in view of Yang discloses 13. (Original) The method of Claim 1, wherein the human attribute comprises one or more of gender, age, race, hair, or clothing. (Bourdev, Fig. 4, “Hat”)


Bourdev in view of Yang discloses 14. (Original) The method of Claim 1, wherein determining whether the human attribute exists in the image is based on a classification engine, wherein the classification engine comprises a linear support vector machine that is trained using training data associated with the human attribute.  (Bourdev, See Section 7, throughout)
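For illustration only, a linear support vector machine decision of the kind discussed for claims 12 and 14 above might be sketched as follows. This is a minimal sketch: the weights, bias, and function names are hypothetical, training is not shown, and the sign of the score is used as the decision rule as an assumption rather than as a description of Bourdev's classifier.

```python
def linear_score(features, weights, bias):
    """Prediction score of a linear classifier: dot(weights, features) + bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def attribute_present(features, weights, bias):
    """Declare the attribute present when the score is positive."""
    return linear_score(features, weights, bias) > 0


score = linear_score([1.0, 2.0], [0.5, -1.0], 0.25)
decision = attribute_present([1.0, 2.0], [0.5, -1.0], 0.25)
```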

Claim 15 is rejected under similar grounds as claim 1.
Claim 16 is rejected under similar grounds as claim 2.
Claim 17 is rejected under similar grounds as claim 3.
Claim 18 is rejected under similar grounds as claim 4.
Claim 19 is rejected under similar grounds as claim 5.
Claim 20 is rejected under similar grounds as claim 1.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Bourdev in view of Yang in view of Suleyman (PGPub 2014/0270488), hereafter referred to as Suleyman.
Bourdev in view of Yang discloses 6. The method of Claim 3, further comprising:
but does not expressly disclose “applying a max-pooling operation to each of plurality of the part patches after one of the convolution operations has been applied to the part patch”.
Suleyman discloses “applying a max-pooling operation to each of plurality of the part patches after one of the convolution operations has been applied to the part patch” (Suleyman, paragraphs 100 and 104-105).
Bourdev in view of Yang and Suleyman are combinable because they are from the same field of endeavor of Convolutional Neural Networks.
At the time of the invention, it would have been obvious to a person of ordinary skill in the art to use max-pooling.
The suggestion/motivation for doing so would have been to reduce the dimension of the input data.

Therefore, it would have been obvious to combine Bourdev in view of Yang with Suleyman to obtain the invention as specified in claim 6.
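For illustration only, a max-pooling operation applied after a convolution, and the reduction in dimension noted above as the motivation, might be sketched as follows. This is a minimal sketch: the function names, the pooling size, and the example kernel are hypothetical and are not taken from Suleyman.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]


def max_pool(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    return [[max(fmap[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


patch = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
pooled_patch = max_pool(patch)                          # 4x4 -> 2x2
convolved = conv2d_valid(patch, [[1, 0], [0, 1]])       # 4x4 -> 3x3
pooled_after_conv = max_pool(convolved)                 # 3x3 -> 1x1
```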

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GANDHI THIRUGNANAM whose telephone number is (571)270-3261.  The examiner can normally be reached on M-F 8:30-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached on 571-272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/GANDHI THIRUGNANAM/ Primary Examiner, Art Unit 2662

1 Incorporated by reference in section 5, paragraph 1, “We use the method of Bourdev et al. [3] to train 1200 poselets using images from the training and validation sets. Instead of all poselets having the same aspect ratios, we used four aspect ratios: 96x64, 64x64, 64x96 and 64x128 and trained 300 poselets of each. For each poselet, during training, we build a soft mask for the probability of each body component (such as hair, face, upper clothes, lower clothes, etc.) at each location within the normalized poselet patch (Figure 5) using body component annotations on the H3D dataset [4].”
[3] L. Bourdev, S. Maji, T. Brox, and J. Malik. Detecting people using mutually consistent poselet activations. In ECCV, 2010.
[4] L. Bourdev and J. Malik. Poselets: Body part detectors trained using 3d human pose annotations. In ICCV, 2009.

2 Incorporated by reference in the same passage of section 5, paragraph 1, quoted in footnote 1, where it is cited as reference [4].

3 a salient pattern corresponding to a given viewpoint and local pose