DETAILED ACTION

Examiner’s Comment
The Non-Final Rejection mailed 12/10/2021 omitted the preliminary amendment that added claims 10-20. This rejection addresses all current claims and replaces the previously mailed one.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.

Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 


This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “generation unit”, “learning unit”, “specifying unit”, “extraction unit”, “determination unit”, in claims 1-7. 

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. Original specification in paragraph [0142] discloses computer processor(s) or equivalents thereof to execute functions performed by the unit(s). 



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8-9 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (US Pub No. 20160171346 A1) in view of Zheng et al. (US Pub No. 20090304251 A1). 

Regarding Claim 1,
Han discloses an image analysis device comprising: a generation unit which generates a similar set, which is a set of similar pieces of learning data selected from among a plurality of pieces of learning data, each including an image and information that represents an object to be recognized that is displayed in the image. (Han, [0010], discloses a training image and object, wherein the training image object and its elements are a similar set, or pieces of information, generated for object recognition in images).

Han does not explicitly disclose a learning unit which uses the generated similar set to learn parameters for a predetermined recognition model that allow the predetermined recognition model to recognize the object to be recognized that is displayed in each image included in the generated similar set.

Zheng discloses a learning unit which uses the generated similar set to learn parameters for a predetermined recognition model that allow the predetermined recognition model to recognize the object to be recognized that is displayed in each image included in the generated similar set. (Zheng, [0004-0005], discloses marginal space learning (MSL) has been proposed for efficient and automatic 3D object localization based on learning of discriminative classifiers. The full parameter space for 3D object localization has nine dimensions: three for position, three for orientation, and three for anisotropic scaling. In MSL, in order to efficiently localize an ...; parameters are learned for object recognition from the original set of object information).

Han discloses the claimed device except for generating parameters from the generated set of image information data. Zheng discloses that it is known in the art to generate parameters from image information data. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to provide the device of Han with the parameter generation of Zheng, in order to reduce computational resources by reducing the data to be processed. (See Zheng, paragraph [0005]).


Regarding Claim 2, 
The combination of Han and Zheng further discloses a specifying unit which specifies a region used for recognition in the image included in each of the plurality of pieces of learning data, as a recognition region, wherein the learning unit performs the learning using a specified recognition region in each image included in the generated similar set. (Han, [0052], discloses deep learning structure 200 may be applied to recognize and verify various input images 101. For example, the input image 101 may include an image associated with an object (for example, an image representing a shape of an object). The object may include, for example, an animal, an inanimate object, or a human (for example, a human face, or a human body) included in a region of interest (ROI) of an image. For example, the deep learning structure 200 may be used to recognize a human face and to perform a recognition and authentication of a user. Also, the deep learning structure 200 may be used to search for and manage a considerable amount of content (for example, multimedia including a picture or video), automatically; a specific region, such as a human face or body features, is learned in the deep learning algorithm). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.


Regarding Claim 3, 
specific region such as human face, body features are learned in deep learning algorithm). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.


Claims 8 and 9 recite a method and a computer program with steps and instructions corresponding to the device elements recited in Claim 1. Therefore, the recited steps and instructions of the method and computer program of Claims 8 and 9 are mapped to the proposed combination in the same manner as the corresponding elements of Claim 1.

Furthermore, the combination of Han and Zheng further discloses a non-transitory computer readable medium (Han, [0009], discloses an apparatus for recognizing a feature of an image includes a memory storing computer-readable instructions; and one or more processors configured to: execute the computer-readable instructions such that the one or more processors are configured to, receive an input image including an object; extract first feature information using a first layer, the first feature information indicating a first feature among a plurality of first feature information, the indicated first feature corresponding to the input image; extract second feature information using a second layer, the second feature information indicating a second feature among a plurality of second features, the indicated second feature corresponding to the first feature information; and recognize an element corresponding to the object based on the first feature information and the second feature information; a computer readable medium is thus disclosed).

	
Regarding Claim 18, 
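The combination of Han and Zheng further discloses wherein the learning unit performs weighting so as to emphasize loss corresponding to an error only between similar categories during learning. (Han, [0135], discloses that, to determine whether the objects are similar to each other, the verifier 1130 may calculate a similarity between the generated feature vectors. When the similarity exceeds a predetermined or, alternatively, desired threshold similarity, the verifier 1130 may determine that the object of the input image 1101 and the object of the other image 1102 are identical to each other. The similarity between the feature vectors may be calculated, for example, as a level to which feature values of the feature vectors and histograms are similar to each other; the threshold determines the weighting of similarity, and a similarity below the threshold is an error in determining similarities). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.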

Regarding Claim 19, 
The combination of Han and Zheng further discloses wherein the learning unit performs weighting so as to emphasize loss corresponding to an error only between similar categories during learning. (Han, [0135], discloses that, to determine whether the objects are similar to each other, the verifier 1130 may calculate a similarity between the generated feature vectors. When the similarity exceeds a predetermined or, alternatively, desired threshold similarity, the verifier 1130 may determine that the object of the input image 1101 and the object of the other image 1102 are identical to each other. The similarity between the feature vectors may be calculated, for example, as a level to which feature values of the feature vectors and histograms are similar to each other; the threshold determines the weighting of similarity, and a similarity below the threshold is an error in determining similarities). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.


Regarding Claim 20,
The combination of Han and Zheng further discloses wherein the determination unit determines similarity between a plurality of categories, on the basis of an integrated value of the resemblance to each category in category discrimination. (Han, [0135], discloses that, to determine whether the objects are similar to each other, the verifier 1130 may calculate a similarity between the generated feature vectors. When the similarity exceeds a predetermined or, alternatively, desired threshold similarity, the verifier 1130 may determine that the object of the input image 1101 and the object of the other image 1102 are identical to each other. The similarity between the feature vectors may be calculated, for example, as a level to which feature values of the feature vectors and histograms are similar to each other; the threshold determines the weighting of similarity, and a similarity below the threshold is an error in determining similarities). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.
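
For illustration only, the threshold comparison quoted from Han, [0135], can be sketched as follows. This is a minimal sketch under stated assumptions, not Han's implementation: cosine similarity is assumed as the similarity measure (the quoted passage does not name one), and the function names and the default threshold value of 0.9 are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the cited passage does not specify the similarity
    measure; cosine similarity is used here purely as an example.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_same_object(feat_a, feat_b, threshold=0.9):
    """Return True when the similarity exceeds the threshold similarity.

    The default threshold is a hypothetical value chosen for the example.
    """
    return cosine_similarity(feat_a, feat_b) > threshold
```

Under these assumptions, two identical feature vectors are judged to show the same object, while orthogonal feature vectors are not.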


Claims 4-7 and 11-17 are rejected under 35 U.S.C. 103 as being unpatentable over Han as modified by Zheng, and further in view of Sakamoto et al. (US Pub No. 20140016822 A1). The teachings of Han and Zheng have been discussed previously. 

Regarding Claim 4, 


The combination of Han and Zheng does not explicitly disclose a determination unit which determines similarity between the plurality of pieces of learning data, wherein the generation unit generates the similar set on the basis of the determined similarity.

Sakamoto discloses a determination unit which determines similarity between the plurality of pieces of learning data, wherein the generation unit generates the similar set on the basis of the determined similarity. (Sakamoto, [0070], discloses similar image retrieval device 3 extracts information (hereinafter, referred to as similar image information) including an image which is similar to the image of the specific object included in the retrieval request and an address of a posting source web page from an internal DB. The number of items of extracted similar image information is the number which is set in advance by the similar image retrieval device 3 or the number which is designated by the retrieval request and the similar image information is selected from the information having the highest similarity of the image in the order of similarity. Further, the similar image retrieval device 3 may extract all similar image information having a distance between the feature vectors which is equal to or smaller than a predetermined value as the similar image information to be extracted; similarity between items of image data is determined to generate a similar set of data pairing each image with its similarity feature).

The combination of Han and Zheng discloses the claimed device except for generating the similar set on the basis of similarity. Sakamoto discloses that similarity between object image data is determined by feature vector similarity, and that a feature vector and its object are produced as a similar-set pair. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to provide the device of Han and Zheng with the similar pair generation of Sakamoto, in order to accurately detect objects having similar characteristics, with the object and feature vector stored as a similar set of data pairs for processing new images and detecting objects with similar characteristics. (See Sakamoto, paragraph [0070]).
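
For illustration only, the two selection modes quoted from Sakamoto, [0070], (a preset number of items in order of similarity, or all items whose feature-vector distance is equal to or smaller than a predetermined value) can be sketched as follows. This is a minimal sketch under stated assumptions: Euclidean distance is assumed as the distance between feature vectors, and the function names, parameter names, and the dictionary standing in for the internal DB are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors.

    Assumption: the cited passage does not name the distance metric.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_similar(query, database, max_items=None, max_distance=None):
    """Return keys of database entries similar to the query vector.

    database: hypothetical mapping of item key -> feature vector.
    max_items: keep only this many items, closest first.
    max_distance: keep only items within this distance (inclusive).
    """
    # Smallest distance first, i.e. highest similarity first.
    scored = sorted(
        (euclidean_distance(query, vec), key) for key, vec in database.items()
    )
    if max_distance is not None:
        scored = [(d, k) for d, k in scored if d <= max_distance]
    if max_items is not None:
        scored = scored[:max_items]
    return [key for _, key in scored]
```

Either limit can be used alone or both together; with neither set, every entry is returned in order of increasing distance.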


Regarding Claim 5, 
The combination of Han, Zheng and Sakamoto further discloses an extraction unit which extracts a feature value of the image included in each of the plurality of pieces of learning data, wherein the determination unit determines the similarity between the plurality of pieces of learning data, on the basis of a distance between respective feature values extracted from the plurality of pieces of learning data. (Sakamoto, [0059-0060], discloses object recognizing unit 24 determines an area including an image of the person from the image data (hereinafter, referred to as an input image) input from the information distributing unit 23 based on the first dictionary data. In such a process, the object recognizing unit 24 scans the input image and determines whether there is an image having a feature vector that has a distance from ...; the distance between feature vectors is determined to obtain the similarity between information data of images). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Regarding Claim 6, 
The combination of Han and Zheng does not explicitly disclose wherein each of the plurality of pieces of learning data includes information indicating a category to which the object to be recognized displayed in the image included in the piece of learning data belongs, and wherein the determination unit determines similarity between a plurality of categories to which respective objects to be recognized indicated by the plurality of pieces of learning data belong, on the basis of respective feature values extracted from the plurality of pieces of learning data.

Sakamoto discloses wherein each of the plurality of pieces of learning data includes information indicating a category to which the object to be recognized displayed in the image included in the piece of learning data belongs, and wherein the determination unit determines similarity between a plurality of categories to which respective objects to be recognized indicated by the plurality of pieces of learning data belong, on the basis of respective feature values extracted from the plurality of pieces of learning data. (Sakamoto, [0030], Figure 1C, discloses the information providing device 2 requests the similar image retrieval device 3 to retrieve a similar image with the extracted image of the specific object as a retrieval key. Here, as illustrated in FIG. 1C, it is considered that the image of the moving image content which is being reproduced in the terminal device 4 (hereinafter, referred to as a reproducing image) includes images 33a and 33b of specific objects 34a and 34b. In this case, the information providing device 2 extracts the images 33a and 33b of the specific objects and requests the similar image retrieval device 3 to retrieve the similar image with the images 33a and 33b of the specific objects as retrieval keys; the category of similar pieces of information data between an object and its features is determined, such as the hat or cell category in Figure 1C). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Regarding Claim 7, 
The combination of Han and Zheng does not explicitly disclose wherein the generation unit generates, as the similar set, a set of pieces of learning data in which objects to be recognized displayed in respective images belong to similar categories, and wherein the learning unit learns the parameters for the predetermined recognition model that allow the predetermined recognition model to recognize a category to which each of the objects to be recognized included in the generated similar set belongs.

Sakamoto discloses wherein the generation unit generates, as the similar set, a set of pieces of learning data in which objects to be recognized displayed in respective images belong to similar categories, and wherein the learning unit learns the parameters for the predetermined recognition model that allow the predetermined recognition model to recognize a category to which each of the objects to be recognized included in the generated similar set belongs. (Sakamoto, [0030], Figure 1C, discloses the information providing device 2 requests the similar image retrieval device 3 to retrieve a similar image with the extracted image of the specific object as a retrieval key. Here, as illustrated in FIG. 1C, it is considered that the image of the moving image content which is being reproduced in the terminal device 4 (hereinafter, referred to as a reproducing image) includes images 33a and 33b of specific objects 34a and 34b. In this case, the information providing device 2 extracts the images 33a and 33b of the specific objects and requests the similar image retrieval device 3 to retrieve the similar image with the images 33a and 33b of the specific objects as retrieval keys; the category of similar pieces of information data between an object and its features is determined, such as the hat or cell category in Figure 1C). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Regarding Claim 10, 
The combination of Han and Zheng further discloses a determination unit which determines similarity between the plurality of pieces of learning data, wherein the generation unit generates the similar set on the basis of the determined similarity. (Han, [0052], discloses deep learning structure 200 may be applied to recognize and verify various input images 101. For example, the input image 101 may include an image associated with an object (for example, an image representing a shape of an object). The object may include, for example, an animal, an inanimate object, or a human (for example, a human face, or a human body) included in a region of interest (ROI) of an image. For example, the deep learning structure 200 may be used to recognize a human face and to perform a recognition and authentication of a user. Also, the deep learning structure 200 may be used to search for and manage a considerable amount of content (for example, multimedia including a picture or video), automatically; a specific region, such as a human face or body features, is learned in the deep learning algorithm). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.


Regarding Claim 11,
The combination of Han and Zheng further discloses a determination unit which determines similarity between the plurality of pieces of learning data, wherein the generation unit generates the similar set on the basis of the determined similarity. (Han, [0052], discloses deep learning structure 200 may be applied to recognize and verify various input images 101. For example, the input image 101 may include an image associated with an object (for example, an image representing a shape of an object). The object may include, for example, an animal, an inanimate object, or a human (for example, a human face, or a human body) included in a region of interest (ROI) of an image. For example, the deep learning structure 200 may be used to recognize a human face and to perform a recognition and authentication of a user. Also, the deep learning structure 200 may be used to search for and manage a considerable amount of content (for example, multimedia including a picture or video), automatically; a specific region, such as a human face or body features, is learned in the deep learning algorithm). Additionally, the rationale and motivation to combine the references Han and Zheng as applied in claim 1 apply to this claim.

Regarding Claim 12, 
The combination of Han, Zheng and Sakamoto further discloses an extraction unit which extracts a feature value of the image included in each of the plurality of pieces of learning data, wherein the determination unit determines the similarity between the plurality of pieces of learning data, on the basis of a distance between respective feature values extracted from the plurality of pieces of learning data. (Sakamoto, [0059-0060]; the distance between feature vectors is determined to obtain the similarity between information data of images). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.



Regarding Claim 13,
The combination of Han, Zheng and Sakamoto further discloses an extraction unit which extracts a feature value of the image included in each of the plurality of pieces of learning data, wherein the determination unit determines the similarity between the plurality of pieces of learning data, on the basis of a distance between respective feature values extracted from the plurality of pieces of learning data. (Sakamoto, [0059-0060], discloses object recognizing unit 24 determines an area including an image of the person from the image data (hereinafter, referred to as an input image) input from the information distributing unit 23 based on the first dictionary data. In such a process, the object recognizing unit 24 scans the input image and determines whether there is an image having a feature vector that has a distance from the feature vector of the person set in the first dictionary data which is equal to or smaller than a predetermined value. If it is determined that the image of the person is present in the input image, the object recognizing unit 24 extracts the image of the person (hereinafter, referred to as a person area image) from the input image; the object recognizing unit 24 determines the image of the specific object from the person area image based on the second dictionary data. In such a process, the object recognizing unit 24 scans the person area image and determines whether there is an image having a feature vector that has a distance from the feature vector of each of the specific objects set in the second dictionary data which is equal to or smaller than a predetermined value. If it is determined that the image of the specific object is present in the person area image, the object recognizing unit 24 extracts the image of the specific object from the person area image and outputs the extracted image to the similar image ...; the distance between feature vectors is determined to obtain the similarity between information data of images). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Regarding Claim 14, 
The combination of Han, Zheng and Sakamoto further discloses wherein each of the plurality of pieces of learning data includes information indicating a category to which the object to be recognized displayed in the image included in the piece of learning data belongs, and wherein the determination unit determines similarity between a plurality of categories to which respective objects to be recognized indicated by the plurality of pieces of learning data belong, on the basis of respective feature values extracted from the plurality of pieces of learning data. (Sakamoto, [0030], Figure 1C, discloses the information providing device 2 requests the similar image retrieval device 3 to retrieve a similar image with the extracted image of the specific object as a retrieval key. Here, as illustrated in FIG. 1C, it is considered that the image of the moving image content which is being reproduced in the terminal device 4 (hereinafter, referred to as a reproducing image) includes images 33a and 33b of specific objects 34a and 34b. In this case, the information providing device 2 extracts the images 33a and 33b of the specific objects and requests the similar image retrieval device 3 to retrieve the similar image with the images 33a and 33b of the specific objects as retrieval keys; the category of similar pieces of information data between an object and its features is determined, such as the hat or cell category in Figure 1C). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Regarding Claim 15, 
The combination of Han, Zheng and Sakamoto further discloses wherein each of the plurality of pieces of learning data includes information indicating a category to which the object to be recognized displayed in the image included in the piece of learning data belongs, and wherein the determination unit determines similarity between a plurality of categories to which respective objects to be recognized indicated by the plurality of pieces of learning data belong, on the basis of respective feature values extracted from the plurality of pieces of learning data. (Sakamoto, [0030], Figure 1C, discloses the information providing device 2 requests the similar image retrieval device 3 to retrieve a similar image with the extracted image of the specific object as a retrieval key. Here, as illustrated in FIG. 1C, it is considered that the image of the moving image content which is being reproduced in the terminal device 4 (hereinafter, referred to as a reproducing image) includes images 33a and 33b of specific objects 34a and 34b. In this case, the information providing device 2 extracts the images 33a and 33b of the specific objects and requests the similar image retrieval device 3 to retrieve the similar image with the images 33a and 33b of the specific objects as retrieval keys; the category of similar pieces of information data between an object and its features is determined, such as the hat or cell category in Figure 1C). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Regarding Claim 16, 
The combination of Han, Zheng and Sakamoto further discloses wherein the generation unit generates, as the similar set, a set of pieces of learning data in which objects to be recognized displayed in respective images belong to similar categories, and wherein the learning unit learns the parameters for the predetermined recognition model that allow the predetermined recognition model to recognize a category to which each of the objects to be recognized included in the generated similar set belongs. (Sakamoto, [0030], Figure 1C, discloses the information providing device 2 requests the similar image retrieval device 3 to retrieve a similar image with the extracted image of the specific object as a retrieval key. Here, as illustrated in FIG. 1C, it is considered that the image of the moving image content which is being reproduced in the terminal device 4 (hereinafter, referred to as a reproducing image) includes images 33a and 33b of specific objects 34a and 34b. In this case, the information providing device 2 extracts the images 33a and 33b of the specific objects and requests the similar image retrieval device 3 to retrieve the similar image with the images 33a and 33b of the specific objects as retrieval keys; the category of similar pieces of information data between an object and its features is determined, such as the hat or cell category in Figure 1C). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.

Regarding Claim 17, 
The combination of Han, Zheng and Sakamoto further discloses wherein the generation unit generates, as the similar set, a set of pieces of learning data in which objects to be recognized displayed in respective images belong to similar categories, and wherein the learning unit learns the parameters for the predetermined recognition model that allow the predetermined recognition model to recognize a category to which each of the objects to be recognized included in the generated similar set belongs. (Sakamoto, [0030], Figure 1C, discloses the information providing device 2 requests the similar image retrieval device 3 to retrieve a similar image with the extracted image of the specific object as a retrieval key. Here, as illustrated in FIG. 1C, it is considered that the image of the moving image content which is being reproduced in the terminal device 4 (hereinafter, referred to as a reproducing image) includes images 33a and 33b of specific objects 34a and 34b. In this case, the information providing device 2 extracts the images 33a and 33b of the specific objects and requests the similar image retrieval device 3 to retrieve the similar image with the images 33a and 33b of the specific objects as retrieval keys; the category of similar pieces of information data between an object and its features is determined, such as the hat or cell category in Figure 1C). Additionally, the rationale and motivation to combine the references Han, Zheng and Sakamoto as applied in claim 4 apply to this claim.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PINALBEN V PATEL whose telephone number is (571)270-5872. The examiner can normally be reached M-F: 10am - 8pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph can be reached on (571)272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Pinalben Patel/Examiner, Art Unit 2661