DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  



Claim Interpretations - 35 USC § 112
The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Claim limitations “a joint-determination module”, “a pose-estimation module”, “an action-identification module”, “image-region module” and “extractor module” have been interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because they use a generic placeholder coupled with functional language without reciting sufficient structure to achieve the function.
Since the claim limitations invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, claims 1-20 have been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof.
A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitation: the modules are defined on page 29 of applicant’s specification as software stored on hardware or the hardware itself.
If applicant wishes to provide further explanation or dispute the examiner’s interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action. 
If applicant does not intend to have the claim limitations treated under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may amend the claims so that they will clearly not invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, or present a sufficient showing that the claims recite sufficient structure, material, or acts for performing the claimed function to preclude application of 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
For more information, see MPEP § 2173 et seq. and Supplementary Examination Guidelines for Determining Compliance With 35 U.S.C. 112 and for Treatment of Related Issues in Patent Applications, 76 FR 7162, 7167 (Feb. 9, 2011).



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 4-6, 8, 10-16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Alghazzawi et al., WO2016/166508, as provided in the applicant’s information disclosure statement (hereinafter referred to as Alghazzawi), in view of Black et al., USPN 10529137 (hereinafter referred to as Black), and Gkioxari et al., “R-CNNs for Pose Estimation and Action Detection”, published 19 June 2014, as provided in the applicant’s information disclosure statement (hereinafter referred to as Gkioxari).

As per Claim 1, Alghazzawi teaches an apparatus for performing image analysis to identify human actions represented in an image, comprising:
 a joint-determination module configured to analyse an image depicting one or more people to determine a set of joint candidates for the one or more people depicted in the image; (Alghazzawi, Page 13, Lines 17-33)
a pose-estimation module configured to derive pose estimates from the set of joint candidates that estimate a body configuration for the one or more people depicted in the image; and (Alghazzawi, Page 27, Lines 18-28)
an action-identification module configured to analyse a region of interest (Alghazzawi, Page 19, Lines 8-14) within the image identified from the derived pose estimates to identify an action performed by a person depicted in the image. (Alghazzawi, Page 20, Lines 10-20) 
	Alghazzawi does not explicitly teach a joint-determination module using a first computational neural network.
	Black teaches a joint-determination module using a first computational neural network. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts)
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Black into Alghazzawi because using a neural network will assist and complement the camera used for joint detection, including for joints that may not be visible to the camera, resulting in more accurate processing.
	Alghazzawi in view of Black does not explicitly teach an action-identification module using a second computational neural network.
	Gkioxari teaches an action-identification module using a second computational neural network (Gkioxari, Page 3, “A Single Convolutional Neural Network For Multiple Tasks”, Pose Estimation, tasks of predicting location of specific keypoints in body utilizing R-CNN, Action Classification predicting action a person is performing utilizing R-CNN)
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Gkioxari into Alghazzawi in view of Black because utilizing a neural network in conjunction with the methodology of Alghazzawi to determine behavior/action will result in a more accurate action determination.
	Therefore it would have been obvious to one of ordinary skill to combine the three references to obtain the invention in Claim 1.

As per Claim 2, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the region of interest defines a sub-region of the image, and the action-identification module is configured to analyse only the sub-region of the image using the second computational neural network. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies, which is therefore considered a region of interest; Alghazzawi, Page 20, Lines 10-20; and Gkioxari, Page 5, 4.2.2. Action Classification)
The rationale applied to the rejection of claim 1 has been incorporated herein. 


As per Claim 4, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the apparatus further comprises an image-region module configured to identify the region of interest within the image from the derived pose estimates. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies, which is therefore considered a region of interest)
The rationale applied to the rejection of claim 1 has been incorporated herein. 


As per Claim 5, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 4, wherein the region of interest bounds a specified subset of joints of a derived pose estimate. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies, which is therefore considered a region of interest; and Alghazzawi, Page 27, Lines 18-28)
The rationale applied to the rejection of claim 1 has been incorporated herein. 

As per Claim 6, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 4, wherein the region of interest bounds one or more derived pose estimates. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies, which is therefore considered a region of interest)
The rationale applied to the rejection of claim 4 has been incorporated herein. 
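
For illustration only, a region of interest that bounds a specified subset of joints (claim 5) or an entire derived pose estimate (claim 6) can be sketched as an axis-aligned bounding box over 2-D joint coordinates. The sketch below is an assumption made for clarity: the function name `region_of_interest`, the joint names, and the `padding` parameter are hypothetical and are not taken from Alghazzawi, Black, Gkioxari, or the applicant’s specification.

```python
from typing import Dict, Iterable, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def region_of_interest(pose: Dict[str, Point],
                       joints: Optional[Iterable[str]] = None,
                       padding: float = 0.0) -> Box:
    """Return a box bounding the named joints, or the whole pose if joints is None."""
    names = list(pose) if joints is None else list(joints)
    if not names:
        raise ValueError("at least one joint is required")
    xs = [pose[name][0] for name in names]
    ys = [pose[name][1] for name in names]
    # Expand the tight box by the padding on every side.
    return (min(xs) - padding, min(ys) - padding,
            max(xs) + padding, max(ys) + padding)


# Hypothetical pose estimate in pixel coordinates (illustrative values only).
pose = {"shoulder": (100.0, 50.0), "elbow": (120.0, 90.0), "wrist": (150.0, 110.0)}
hand_box = region_of_interest(pose, ["elbow", "wrist"], padding=5.0)  # subset of joints
full_box = region_of_interest(pose)  # entire pose estimate
```

In this sketch only the pixels inside the returned box would be passed on for action identification, which corresponds to analysing a sub-region of the image rather than the whole image.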


As per Claim 8, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the apparatus is configured to receive the image from a 2-D camera. (Black, Figure 2A, Column 5, Lines 49-52)
The rationale applied to the rejection of claim 1 has been incorporated herein. 


As per Claim 10, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the first network is a convolutional neural network.  (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts)
The rationale applied to the rejection of claim 1 has been incorporated herein. 


As per Claim 11, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the second network is a convolutional neural network. (Gkioxari, Page 5, 4.2.2. Action Classification, train network called Action R-CNN)
The rationale applied to the rejection of claim 1 has been incorporated herein. 

As per Claim 12, Claim 12 claims a method executing the modules of the apparatus as claimed in Claim 1. Therefore the rejection and rationale are analogous to that made in Claim 1.

As per Claim 13, Alghazzawi teaches an apparatus for performing image analysis to identify human actions from one or more images, comprising:
 a joint-determination module configured to analyse one or more images each depicting one or more people to determine for each image a set of joint candidates for the one or more people depicted in the image; (Alghazzawi, Page 13, Lines 17-33)
a pose-estimation module configured to derive for each image pose estimates from the set of joint candidates that estimate a body configuration for the one or more people depicted in the image; and (Alghazzawi, Page 27, Lines 18-28)
 an action-identification module configured to analyse the derived pose estimates for the one or more images to identify an action performed by a person depicted in the one or more images.  (Alghazzawi, Page 20, Lines 10-20)
	Alghazzawi does not explicitly teach a joint-determination module using a first computational neural network.
	Black teaches a joint-determination module using a first computational neural network. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts)
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Black into Alghazzawi because using a neural network will assist and complement the camera used for joint detection, including for joints that may not be visible to the camera, resulting in more accurate processing.
	Alghazzawi in view of Black does not explicitly teach an action-identification module using a second computational neural network.
	Gkioxari teaches an action-identification module using a second computational neural network (Gkioxari, Page 3, “A Single Convolutional Neural Network For Multiple Tasks”, Pose Estimation, tasks of predicting location of specific keypoints in body utilizing R-CNN, Action Classification predicting action a person is performing utilizing R-CNN)
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Gkioxari into Alghazzawi in view of Black because utilizing a neural network in conjunction with the methodology of Alghazzawi to determine behavior/action will result in a more accurate action determination.
	Therefore it would have been obvious to one of ordinary skill to combine the three references to obtain the invention in Claim 13.

As per Claim 14, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 13, the apparatus further comprising an extractor module configured to extract from each derived pose estimate values for a set of one or more parameters characterising the pose estimate, wherein the action-identification module is configured to use the second computational neural network to identify an action performed by a person depicted in the one or more images in dependence on the extracted parameter values. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies; determining joint locations is considered extracting parameters; Alghazzawi, Page 20, Lines 10-20; and Gkioxari, Page 5, 4.2.2. Action Classification)
The rationale applied to the rejection of claim 13 has been incorporated herein. 


As per Claim 15, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 14, wherein the set of one or more parameters relate to a specified subset of joints of the pose estimate. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies; determining joint locations is considered extracting parameters)
The rationale applied to the rejection of claim 14 has been incorporated herein. 


As per Claim 16, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 14, wherein the set of one or more parameters comprises at least one of: joint position for specified joints; joint angles between specified connected joints; joint velocity for specified joints; and the distance between specified pairs of joints. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies; determining joint locations is considered joint position)
The rationale applied to the rejection of claim 14 has been incorporated herein. 
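
For illustration only, the parameters listed in claim 16 (joint position, joint angle, joint velocity, and distance between pairs of joints) can each be computed from 2-D joint coordinates as sketched below. This is a minimal sketch under stated assumptions: the function names, the 2-D coordinate representation, and the frame interval `dt` are hypothetical and are not taken from any cited reference or from the applicant’s specification.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # a joint position is simply its (x, y) coordinate


def joint_distance(a: Point, b: Point) -> float:
    """Euclidean distance between a specified pair of joints."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("connected joints must not coincide")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def joint_velocity(prev: Point, curr: Point, dt: float) -> Point:
    """Per-axis velocity of one joint between two frames taken dt seconds apart."""
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    return ((curr[0] - prev[0]) / dt, (curr[1] - prev[1]) / dt)
```

Note that joint velocity, unlike the other three parameters, requires the same joint to be located in at least two images of a series.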

As per Claim 19, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 13, wherein each image of the one or more images depicts a plurality of people, and the action-identification module is configured to analyse the derived pose estimates for each of the plurality of people for the one or more images using the second computational neural network and to identify an action performed by each person depicted in the one or more images. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies; Alghazzawi, Page 20, Lines 10-20; and Gkioxari, Page 5, 4.2.2. Action Classification)
The rationale applied to the rejection of claim 13 has been incorporated herein. 

As per Claim 20, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 13, wherein the pose-estimate module is configured to derive the pose estimates from the set of joint candidates and further from imposed anatomical constraints on the joints. (Black, Column 18, Lines 39-55)
The rationale applied to the rejection of claim 13 has been incorporated herein. 






Claims 3, 7, 9 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Alghazzawi et al., WO2016/166508, as provided in the applicant’s information disclosure statement (hereinafter referred to as Alghazzawi), in view of Black et al., USPN 10529137 (hereinafter referred to as Black), and Gkioxari et al., “R-CNNs for Pose Estimation and Action Detection”, published 19 June 2014, as provided in the applicant’s information disclosure statement (hereinafter referred to as Gkioxari), as applied to Claims 1, 4, 13 and 14 respectively, and further in view of Burry et al., US2018/0181995.


As per Claim 3, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the action-identification module is configured to analyse the region of interest and to identify the action in response 
Alghazzawi in view of Black and Gkioxari does not explicitly teach identifying objects of a specified object class and detecting an object of the specified class in the region of interest.
Burry teaches identifying objects of a specified object class and detecting an object of the specified class in the region of interest. (Burry, Paragraph [0020])
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Burry into Alghazzawi in view of Black and Gkioxari because providing an application of the action and pose determination of Alghazzawi will allow the technology to be used in a store setting.
	Therefore it would have been obvious to one of ordinary skill to combine the four references to obtain the invention in Claim 3.

As per Claim 7, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 4, wherein the image-region module is configured to identify a region of interest that bounds terminal ends of a derived pose estimate, and the action-identification module is configured to analyse the identified region of interest using the second computational neural network. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies, which is therefore considered a region of interest; Alghazzawi, Page 20, Lines 10-20; and Gkioxari, Page 5, 4.2.2. Action Classification)
Alghazzawi in view of Black and Gkioxari does not explicitly teach identifying whether the person depicted in the image is holding an object of a specified class or not.
Burry teaches identifying whether the person depicted in the image is holding an object of a specified class or not. (Burry, Paragraph [0020])
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Burry into Alghazzawi in view of Black and Gkioxari because providing an application of the action and pose determination of Alghazzawi will allow the technology to be used in a store setting.
	Therefore it would have been obvious to one of ordinary skill to combine the four references to obtain the invention in Claim 7.

As per Claim 9, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 1, wherein the action-identification module is configured to identify the action from a class of actions. (Alghazzawi, Page 20, Lines 10-20 and Gkioxari, Page 5, 4.2.2. Action Classification)
Alghazzawi in view of Black and Gkioxari does not explicitly teach the class of actions including: scanning an item at a point-of-sale; and selecting an item for purchase.
Burry teaches a class of actions including: scanning an item at a point-of-sale; and selecting an item for purchase. (Burry, Paragraph [0020])
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Burry into Alghazzawi in view of Black and Gkioxari because providing an application of the action and pose determination of Alghazzawi will allow the technology to be used in a store setting.
	Therefore it would have been obvious to one of ordinary skill to combine the four references to obtain the invention in Claim 9.


As per Claim 17, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 13, wherein the action-identification module is configured to identify the action performed. (Alghazzawi, Page 20, Lines 10-20)
Alghazzawi in view of Black and Gkioxari does not explicitly teach wherein the one or more images is a series of multiple images and the action is identified for the person depicted in the series of images from changes in their derived pose estimate over the series of images.
Burry teaches wherein the one or more images is a series of multiple images and the action is identified for the person depicted in the series of images from changes in their derived pose estimate over the series of images. (Burry, Paragraph [0020])
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Burry into Alghazzawi in view of Black and Gkioxari because providing an application of the action and pose determination of Alghazzawi will allow the technology to be used in a store setting.
	Therefore it would have been obvious to one of ordinary skill to combine the four references to obtain the invention in Claim 17.

As per Claim 18, Alghazzawi in view of Black and Gkioxari teaches the apparatus as claimed in claim 14, wherein the one or more images is a series of multiple images, and the action-identification module is configured to identify the action performed by the person and wherein the action-identification module is configured to use the second computational neural network to identify an action performed by the person from the change in the extracted parameter values. (Black, Column 18, Lines 39-55, use of DeepCut CNN to partition and label body parts from different bodies; determining joint locations is considered extracting parameters; Alghazzawi, Page 20, Lines 10-20; and Gkioxari, Page 5, 4.2.2. Action Classification)
Alghazzawi in view of Black and Gkioxari does not explicitly teach wherein the one or more images is a series of multiple images and the action is identified for the person depicted in the series of images from changes in their derived pose estimate over the series of images.
Burry teaches wherein the one or more images is a series of multiple images and the action is identified for the person depicted in the series of images from changes in their derived pose estimate over the series of images. (Burry, Paragraph [0020])
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of Burry into Alghazzawi in view of Black and Gkioxari because providing an application of the action and pose determination of Alghazzawi will allow the technology to be used in a store setting.
	Therefore it would have been obvious to one of ordinary skill to combine the four references to obtain the invention in Claim 18.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MING HON whose telephone number is (571)270-5245.  The examiner can normally be reached on M-F 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell, can be reached at 570-270-3717.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MING Y HON/Primary Examiner, Art Unit 2666