DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 10/14/2021 and 09/16/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
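As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;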
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are:
“based on determining that the query object corresponds to a known object class, a step for detecting the query object in one or more digital images” in claim 18; and
“based on determining that the query object does not correspond to a known object class, a step for detecting the query object in one or more digital images” in claim 18.
	The corresponding structure for performing these steps is described in ¶[0119] and ¶[0144] of the specification of the instant application and in Figures 4-8D.
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 102
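In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.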
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim 1 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by Xia et al., US PG-Pub (US 20210192375 A1).
Regarding Claim 1, Xia teaches a non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computing device to (¶[0105] Computer-readable instructions stored on a computer-readable medium are executable by the processing unit 702 of the computer 700. A hard drive, CD-ROM, RAM, and flash memory are some examples of articles including a non-transitory computer-readable medium such as a storage device.): identify a query that comprises a query object identification label indicating a query object to be detected in one or more digital images (¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including the at least one object in the data set; select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database); determine whether the query object corresponds to a known object class based on comparing the query object identification label to known object class labels before utilizing a known object class detection model to detect the query object (Abstract, The processing circuitry selects one or more models using an entity knowledge database that includes a plurality of entities corresponding to objects to be identified.); based on determining that the query object corresponds to the known object class, utilize a known object class detection model to detect the query object within the one or more digital images (¶[0014], the one or more processors execute the instructions to: identify, as the at least one identified recognition model, a plurality of identified recognition models linked to a plurality of nodes that are ontologically coupled to the node corresponding to the at least one object; process the data set using the selected plurality of identified recognition models; and combine results of processing the selected plurality of identified recognition models to provide the indication of whether the data set includes the at least one object. The examiner interprets that the prior art determines whether the query object matches any of the object recognition models.); based on determining that the query object does not correspond to a known object class, utilize a concept embedding model that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images; (¶[0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112.); and provide an indication of the detected query object in the one or more digital images in response to the query. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
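For illustration only, the two-branch logic recited in claim 1 (compare the query label to known class labels first, then select a detection path) might be sketched as follows. This sketch is not taken from Xia or from the specification of the instant application; the names KNOWN_CLASSES, normalize, route_query, and the two detector callables are hypothetical placeholders, and the label set is an assumed example.

```python
from typing import Callable, List, Tuple

# Hypothetical label set; an actual system would derive it from the
# classes its known-class detector was trained on.
KNOWN_CLASSES = {"dog", "car", "person"}

Detector = Callable[[str, List[str]], list]


def normalize(label: str) -> str:
    """Trim whitespace and lowercase so label comparison is consistent."""
    return label.strip().lower()


def route_query(
    label: str,
    known_detector: Detector,
    embedding_detector: Detector,
    images: List[str],
) -> Tuple[str, list]:
    """Choose the detection branch before any detector is run.

    Returns the name of the branch taken and that branch's detections.
    """
    if normalize(label) in KNOWN_CLASSES:
        # Label matches a known class: use the known-class detector.
        return "known", known_detector(label, images)
    # Label is not a known class: fall back to the embedding-based path.
    return "embedding", embedding_detector(label, images)
```

In this sketch the label comparison happens before either detector is invoked, which mirrors the ordering recited in the claim; how each detector works internally is left open.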

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-4, 6-7, 12-13, 15, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Xia et al., US PG-Pub (US 20210192375 A1), in view of Sun et al., US Patent (US 9858496 B2).
Regarding Claim 2, while Xia teaches the non-transitory computer-readable medium of claim 1, wherein the instructions, when executed by the at least one processor, cause the computing device to (¶[0105] Computer-readable instructions stored on a computer-readable medium are executable by the processing unit 702 of the computer 700. A hard drive, CD-ROM, RAM, and flash memory are some examples of articles including a non-transitory computer-readable medium such as a storage device.) utilize the known object class detection model to detect the query object within the one or more digital images (¶[0014], the one or more processors execute the instructions to: identify, as the at least one identified recognition model, a plurality of identified recognition models linked to a plurality of nodes that are ontologically coupled to the node corresponding to the at least one object; process the data set using the selected plurality of identified recognition models; and combine results of processing the selected plurality of identified recognition models to provide the indication of whether the data set includes the at least one object),
Xia does not explicitly teach detecting the potential objects in the one or more digital images utilizing a region proposal neural network; and generating approximate boundaries about the potential objects. 


Sun teaches detecting potential objects in the one or more digital images utilizing a region proposal neural network (Col 1, Lines 23-28, a computing device can receive an image. The computing device can process the image, and generate a convolutional feature map. In some configurations, the convolutional feature map can be processed through a Region Proposal Network (RPN) to generate the proposal candidate locations of objects in the image.)
and generating approximate boundaries around the potential objects (Col 2, Lines 43-51, the computing system can input the convolutional feature map into an RPN. In such examples, the RPN can evaluate the convolutional feature map and generate proposals (e.g., object candidate locations) on the convolutional feature map. In some examples, the proposals can be in the form of bounding boxes. For instance, the RPN can generate rectangular bounding boxes around potential objects (e.g., substance in the image that is more likely than not to be an object) on the convolutional feature map).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to use a region proposal network to detect potential objects in the image. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)
Regarding Claim 3, the combination of Xia and Sun teaches the non-transitory computer-readable medium of claim 2, where Sun further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the known object class detection model to detect the query object within the one or more digital images by generating an object label for the potential objects utilizing an object classification neural network (Col 1, Lines 34-38, In various examples, the computing system can input the convolutional feature map with proposals into an FRCN to determine a classification (e.g., a type, a class, a group, a category, etc.) of each potential object. In some examples, the FRCN may determine that the object class matches one of a pre-determined number of object classes, and may label the object accordingly. Conversely, the FRCN may determine that the object does not match one of the pre-determined number of object classes, and it may label the object as background.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to generate an object label for the object. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)
Regarding Claim 4, the combination of Xia and Sun teaches the non-transitory computer-readable medium of claim 3, where Sun further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the known object class detection model to detect the query object within the one or more digital images by determining that an object label of one or more potential objects corresponds to a query object class corresponding to the query object (Col 2 Lines 52-61, the computing system can input the convolutional feature map with proposals into an FRCN to determine a classification (e.g., a type, a class, a group, a category, etc.) of each potential object. In some examples, the FRCN may determine that the object class matches one of a pre-determined number of object classes, and may label the object accordingly. Conversely, the FRCN may determine that the object does not match one of the pre-determined number of object classes, and it may label the object as background. The examiner interprets that the prior art is using the FRCN to determine if the object matches one of the corresponding object classes and will label the object accordingly.)
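It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to determine if the object label corresponds to a known class. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)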
Regarding Claim 6, the combination of Xia and Sun teaches the non-transitory computer-readable medium of claim 3, where Sun further teaches instructions that, when executed by the at least one processor, cause the computing device to: determine that an object label of at least one potential object does not correspond to the query object identification label and filter out the at least one potential object based on the object label of the at least one potential object not corresponding to the query object identification label. (Col 10, Lines 36-43, the proposal classifier can determine an object category by comparing the object to pre-designated objects in the proposal classifier network (e.g., the FRCN). In some examples, if the proposal classifier does not recognize a category, it may designate the proposal as background. For example, as illustrated in FIG. 3, the forest in the input image is not recognized as one of the object categories. Thus, at 306, the forest is designated as background. The examiner interprets that, if an object does not correspond to a category, the proposal classifier filters it out by classifying it as background.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to determine if the object label corresponds to an object class and filter out the potential object if it does not. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)
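For illustration only, the filtering step recited in claim 6 (discarding potential objects whose object label does not correspond to the query label) might be sketched as follows. The Proposal type, its box format, and the function name filter_by_label are hypothetical and are not drawn from Sun, Xia, or the instant specification; exact string equality is an assumed stand-in for whatever correspondence test a real system applies.

```python
from typing import List, NamedTuple, Tuple


class Proposal(NamedTuple):
    """A labeled candidate region (hypothetical structure)."""
    box: Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel corners
    label: str                      # label assigned by a classifier


def filter_by_label(proposals: List[Proposal], query_label: str) -> List[Proposal]:
    """Keep only proposals whose label equals the query label.

    Proposals with any other label, including "background", are dropped.
    """
    return [p for p in proposals if p.label == query_label]
```

Under this sketch, a proposal labeled "background" is removed by the same comparison that removes any other non-matching label.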
Regarding Claim 7, while Xia teaches the non-transitory computer-readable medium of claim 1, wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding model to detect the query object within the one or more digital images by: 
Xia does not explicitly teach detecting the potential objects in the one or more digital images utilizing a region proposal neural network jointly with a concept embedding neural network and generating approximate boundaries around the potential objects.
Sun teaches detecting the potential objects in the one or more digital images utilizing a region proposal neural network jointly with a concept embedding neural network (Abstract, the convolutional feature map can be processed through a Region Proposal Network (RPN) to generate proposals for candidate objects in the image. In various examples, the computing device can process the convolutional feature map with the proposals through a Fast Region-Based Convolutional Neural Network (FRCN) proposal classifier to determine a class of each object in the image and a confidence score associated therewith. The computing device can then provide a requestor with an output including the object classification and/or confidence score. The examiner interprets that the prior art is using a region proposal neural network with a convolutional neural network to perform the classification of the object in the image. The examiner interprets that it would have been obvious to one of ordinary skill in the art to also incorporate a concept embedding neural network as taught by Xia into Sun’s invention in order to further detect the potential object in the image.) and generating approximate boundaries around the potential objects (Col 2, Lines 43-51, the computing system can input the convolutional feature map into an RPN. In such examples, the RPN can evaluate the convolutional feature map and generate proposals (e.g., object candidate locations) on the convolutional feature map. In some examples, the proposals can be in the form of bounding boxes. For instance, the RPN can generate rectangular bounding boxes around potential objects (e.g., substance in the image that is more likely than not to be an object) on the convolutional feature map).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to determine the potential object using a region proposal network jointly with the concept embedding model as taught by Xia. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)
Regarding Claim 12, Xia teaches a system for automatically selecting objects within digital images comprising: 
one or more memory devices (¶[0102] One example computing device in the form of a computer 700 may include a processing unit 702, memory 703, removable storage 710, and non-removable storage 712.) comprising: one or more digital images (¶[0068], AI image recognition system in which a user provides or indicates a data set including an image, set of images or video to be searched); an object classification model (¶[0015], the entity knowledge database includes a plurality of ontologically organized nodes corresponding to the at least one object at different levels of generality and the one or more processors execute the instructions to: select, as the identified plurality of recognition models, respective models associated with the plurality of nodes corresponding to the different levels of generality of the at least one object); and a concept embedding model that determines a correspondence between queries and query objects in digital images (¶[0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102.); one or more servers that cause the system to (¶[0075], The example environment 200 includes a user terminal 202 coupled to a local server 204): identify a query that comprises a query object identification label indicating a query object to be detected in the one or more digital images (¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including the at least one object in the data set; select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database); determine whether the query object corresponds to a known object class based on comparing the query object identification label to a list of known object classes before utilizing the object classification model to detect the query object (Abstract, The processing circuitry selects one or more models using an entity knowledge database that includes a plurality of entities corresponding to objects to be identified. Each of a plurality or recognition models is linked to multiple entities of the entity knowledge database so that the processing circuitry may select multiple recognition models. ¶[0068] The embodiments below are described in the context of an AI image recognition system in which a user provides or indicates a data set including an image, set of images or video to be searched and also provides a query asking if a particular object or class of objects is represented in the data set. The query based on determining that the query object corresponds to the known object class, utilize the object classification model to detect the query object within the one or more digital images by determining object classifications for each of the potential objects (¶[0017] searching an entity knowledge database including a plurality of nodes corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple nodes of the entity knowledge database; selecting at least one recognition model of the plurality of recognition models to be used to identify the at least one object in response to the search of the entity knowledge database; and processing the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object.); based on determining that the query object does not correspond to the known object class, utilize the concept embedding model to detect the query object within the one or more digital images by determining a correspondence between the query object identification label and each of the potential objects; (¶[0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102. The examiner interprets that, based on the object not being linked to a known class, the prior art uses a graph embedding process to determine whether the object can be linked to any of the potential models.)
Xia does not explicitly teach a region proposal model; detect a plurality of potential objects in the one or more digital images utilizing the region proposal model; 
Sun teaches a region proposal model (Col 7, Lines 58-60, the object proposal module may include a Region Proposal Network (RPN), which may be a neural network.); detect a plurality of potential objects in the one or more digital images utilizing the region proposal model (Col 1, Lines 23-28, a computing device can receive an image. The computing device can process the image, and generate a convolutional feature map. In some configurations, the convolutional feature map can be processed through a Region Proposal Network (RPN) to generate the proposal candidate locations of objects in the image. As seen in figure 3, the prior art utilizes a region proposal network to detect objects in a digital image.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to incorporate a region proposal and object classification model in order to determine the potential objects in the image. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)
Regarding Claim 13, the combination of Xia and Sun teaches the system of claim 12, where Xia further teaches wherein the query comprises an image search request and the one or more servers cause the system to provide the indication of detected query object in the one or more digital images in response to the query by returning a subset of the one or more digital images including the query object. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
Regarding Claim 15, the combination of Xia and Sun teaches the system of claim 12, where Xia further teaches wherein the one or more servers cause the system to determine the list of known object classes from labels of objects used to generate the object classification model (¶[0005], select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database. The examiner interprets that the prior art determines whether the object corresponds to a model in the database before selecting the object classification model.).
Regarding Claim 18, Xia teaches in a digital medium environment for creating or editing digital images, a computer-implemented method of selecting query objects, comprising: identifying a query that comprises a query object identification label indicating a query object to be detected(¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including the at least one object in the data set; select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database); determining whether the query object identification label corresponds to a known object class before utilizing an object detection model to detect the query object(Abstract, The processing circuitry selects one or more models using an entity knowledge database that includes a plurality of entities corresponding to objects to be identified. Each of a plurality or recognition models is linked to multiple entities of the entity knowledge database so that the processing circuitry may select multiple recognition models.¶[0068] The embodiments below are described in the context of an AI image recognition system in which a user provides or indicates a data set including an image, set of  based on determining that the query object does not correspond to a known object class, a step for detecting the query object in one or more digital images; (On page 14 of the Amendment filed 9/23/21, Applicants’ point to [0144] of the specification of the subject application for providing corresponding acts for this claim limitation which invokes 35 USC 112(f). This portion of the specification discloses that the corresponding acts include a concept embedding process which is performed when the query object doesn’t correspond to a known class. 
¶[0097] of Xia discloses “Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102.” The examiner interprets that, based on the object not being linked to a known class, the prior art uses a graph embedding process to determine whether the object can be linked to any of the potential models.); and providing an indication of one or more instances of the detected query object within the one or more digital images in response to the query. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
Xia does not explicitly teach based on determining that the query object corresponds to a known object class, a step for detecting the query object in one or more digital images;
Sun teaches based on determining that the query object corresponds to a known object class, a step for detecting the query object in one or more digital images (On page 14 of the Amendment filed 9/23/21, Applicants point to [0119] of the specification of the subject application for providing corresponding acts for this claim limitation which invokes 35 USC 112(f). This portion of the specification discloses that the corresponding acts include utilizing an RPN to detect the potential object if it corresponds to a known class. Col. 1, lines 23-28 of Sun discloses “a computing device can receive an image. The computing device can process the image, and generate a convolutional feature map. In some configurations, the convolutional feature map can be processed through a Region Proposal Network (RPN) to generate the proposal candidate locations of objects in the image.” As seen in figure 3, the prior art utilizes a region proposal network to detect objects in a digital image.);
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Sun to Xia in order to incorporate a region proposal and object classification model to determine the potential objects in the image. One skilled in the art would have been motivated to modify Xia in this manner in order to determine a classification (e.g., a type, a class, a group, a category, etc.) of each object, and a confidence score associated with the classification (e.g., how accurate the system believes the classification to be). (Sun, Col 1, Lines 34-38)
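For illustration only, the control flow recited in claim 18 (a known-class check that routes the query either to a detector for known classes or to an embedding-based fallback) can be sketched as follows. This is a minimal sketch and is not code from Xia, Sun, or the subject application. The class set `KNOWN_CLASSES`, the function `detect_query_object`, and the two detector callables are hypothetical stand-ins introduced for this example.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixel coordinates
Detector = Callable[[str, object], List[Box]]

# Hypothetical set of classes that the known-class detector was trained on.
KNOWN_CLASSES: FrozenSet[str] = frozenset({"dog", "car", "person"})


def detect_query_object(
    label: str,
    image: object,
    known_detector: Detector,
    fallback_detector: Detector,
    known_classes: FrozenSet[str] = KNOWN_CLASSES,
) -> Dict[str, object]:
    """Route a query label to one of two detectors based on a known-class check."""
    key = label.strip().lower()
    if key in known_classes:
        # Known class: use the detector trained on that class
        # (for example, one built on region proposals).
        route = "known_class_detector"
        boxes = known_detector(key, image)
    else:
        # Unknown class: fall back to an embedding-based comparison.
        route = "concept_embedding"
        boxes = fallback_detector(key, image)
    return {"label": key, "route": route, "instances": boxes}
```

The check runs before either detector is invoked, which mirrors the ordering in the claim language ("before utilizing an object detection model").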
Regarding Claim 19, the combination of Xia and Sun teaches the method of claim 18, where Xia further teaches wherein providing the indication of one or more instances of the detected query object comprises automatically selecting an instance of the detected query object within a digital image of the one or more digital images without selecting other areas in the digital image. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of 
Claims 8-11 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Xia et al US PG-Pub (US 20210192375 A1) in view of Sun et al US Patent (US 9858496 B2) and in further view of Li et al. US PG-Pub (US 20200250538 A1).
Regarding Claim 8, while the combination of Xia and Sun teaches the non-transitory computer-readable medium of claim 7, where Xia further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding model to detect the query object within the one or more digital images ([0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102. The examiner interprets that, based on the object not being linked to a known class, the prior art uses a graph embedding process to determine whether the object can be linked to any of the potential models.) by:
They do not explicitly teach generating a correlation score for each of the potential objects relative to the query object identification label utilizing a concept embedding neural network; and selecting at least one potential object of the plurality of potential objects as an instance of the query object based on the correlation scores.
Li teaches generating a correlation score for each of the potential objects relative to the query object identification label utilizing a concept embedding neural network (¶[0088], The search system); and selecting at least one potential object of the plurality of potential objects as an instance of the query object based on the correlation scores (¶[0057] FIG. 5 shows an example search results page 500 provided by the search system 306 that includes image search results for a search query that includes an image. In particular, the search results page 500 displays search results 502, 504, and 506 for the search query 508 that includes an image depicting a truck. In response to receiving a search query that includes a query image, the search system 306 may be configured to provide search results which identify images similar to the query image. In this example, each of the search results 502, 504, and 506 identify images which are similar to the query image. The examiner interprets that the prior art is displaying the image that is closely similar to the search query.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li to Xia and Sun in order to generate a correlation score of the query object. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to improve model training to characterize a wide range of concepts. (Li, ¶[0019])
Regarding Claim 9, the combination of Xia, Sun and Li teaches the non-transitory computer-readable medium of claim 8, where Li further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding model to detect the query object within the one or more digital images by generating image embeddings for each of the potential objects utilizing the concept embedding neural network. (¶[0044] FIG. 1 shows an example image embedding model 100. The image embedding model 100 is configured to process an image 102 in accordance with current values of a set of image embedding model parameters to generate an embedding 104 of the image 102. The embedding 104 is a representation of the image 102 as an ordered collection of numerical values, for example, as a vector or matrix. As will be described in more detail below, the image embedding model 100 can be trained using machine learning techniques to generate an embedding 104 of an image 102 which implicitly represents the semantic content of the image 102 (e.g., objects depicted by the image 102). The examiner interprets that the prior art is utilizing a concept embedding model that uses image embeddings to determine the query object in the image.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li to Xia and Sun in order to generate image embedding to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to improve model training to characterize a wide range of concepts. (Li, ¶[0019])
Regarding Claim 10, the combination of Xia, Sun and Li teaches the non-transitory computer-readable medium of claim 9, where Li further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding model to detect the query object within the one or more digital images by generating a topic embedding for the query object identification label utilizing the concept embedding neural network. (¶[0047] FIG. 2 shows an example text embedding model 200. The text embedding model 200 is configured to process a representation of a sequence of one or more words in a natural language (i.e., the text 202) in accordance with current values of a set of text embedding model parameters to generate an embedding 204 of the text 202. The embedding 204 is a representation of the text 202 as an ordered collection of   
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li to Xia and Sun in order to generate topic embeddings to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to improve model training to characterize a wide range of concepts. (Li, ¶[0019])
Regarding Claim 11, the combination of Xia, Li and Sun teaches the non-transitory computer-readable medium of claim 10, where Li further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to generate the correlation score for each of the potential objects by comparing the topic embedding with the image embeddings. (¶[0088], The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using the embeddings from both the image and the text in order to generate a relevance score for each given image in the search index.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li to Xia and Sun in order to compare the topic and image embedding in order to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and 
Regarding Claim 16, while the combination of Xia and Sun teaches the system of claim 12, they do not explicitly teach wherein the one or more servers cause the system to utilize the concept embedding model to detect the query object within the one or more digital images by: generating image embeddings for each of the potential objects utilizing a concept embedding neural network; generating a topic embedding for the query object identification label utilizing the concept embedding neural network; and comparing the topic embedding with the image embeddings.
Li teaches wherein the one or more servers cause the system to utilize the concept embedding model to detect the query object within the one or more digital images by: 
generating image embeddings for each of the potential objects utilizing a concept embedding neural network (¶[0044] FIG. 1 shows an example image embedding model 100. The image embedding model 100 is configured to process an image 102 in accordance with current values of a set of image embedding model parameters to generate an embedding 104 of the image 102. The embedding 104 is a representation of the image 102 as an ordered collection of numerical values, for example, as a vector or matrix. As will be described in more detail below, the image embedding model 100 can be trained using machine learning techniques to generate an embedding 104 of an image 102 which implicitly represents the semantic content of the image 102 (e.g., objects depicted by the image 102). The examiner interprets that the prior art is utilizing a concept embedding model that uses image embeddings to determine the query object in the image.);
generating a topic embedding for the query object identification label utilizing the concept embedding neural network; (¶[0047] FIG. 2 shows an example text embedding model 200. The text embedding model 200 is configured to process a representation of a sequence of one or more words in a natural language (i.e., the text 202) in accordance with current values of a set of text embedding model ; 
and comparing the topic embedding with the image embeddings (¶[0088], The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using the embeddings from both the image and the text in order to generate a relevance score for each given image in the search index.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li to Xia and Sun in order to compare the topic and image embedding in order to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to improve model training to characterize a wide range of concepts. (Li, ¶[0019])
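For illustration only, the comparison described in ¶[0088] of Li (a relevance score based on a measure of similarity, e.g., a Euclidean distance, between an image embedding and a text embedding) can be sketched as follows. This is a minimal sketch and is not code from Li or the subject application. The function names are hypothetical, and negating the distance so that a higher score means a closer match is an assumption made for this example.

```python
import math
from typing import List, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relevance_scores(
    topic_embedding: Sequence[float],
    image_embeddings: Sequence[Sequence[float]],
) -> List[float]:
    """Score each image embedding against the topic embedding.

    The distance is negated (an assumption of this sketch) so that a
    higher score corresponds to a closer match.
    """
    return [-euclidean_distance(topic_embedding, e) for e in image_embeddings]


def select_best(
    topic_embedding: Sequence[float],
    image_embeddings: Sequence[Sequence[float]],
) -> int:
    """Return the index of the image embedding closest to the topic embedding."""
    scores = relevance_scores(topic_embedding, image_embeddings)
    if not scores:
        raise ValueError("no image embeddings to compare")
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch each candidate object contributes one image embedding, the query label contributes one topic embedding, and the candidate with the smallest distance is selected.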
Regarding Claim 17, the combination of Xia, Li and Sun teaches the system of claim 16, where Xia teaches wherein the one or more servers cause the system to determine one or more potential objects with object classifications that correspond to the query object identification label. (¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including ;
Claims 5, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Xia et al US PG-Pub (US 20210192375 A1) in view of Sun et al US Patent (US 9858496 B2) and in further view of Price et al. US PG-Pub (US 20170140236 A1).
Regarding Claim 5, while the combination of Xia and Sun teaches the non-transitory computer-readable medium of claim 4, they do not explicitly teach further comprising instructions that when executed by the at least one processor, cause the computing device to generate an object mask for each detected instance of the query object utilizing an object mask model.
Price teaches further comprising instructions that when executed by the at least one processor, cause the computing device to generate an object mask for each detected instance of the query object utilizing an object mask model (¶[0094] As illustrated, the digital selection system provides the positive and negative distance maps 158, 160 (together with one or more color channels) to the trained neural network 162. In particular, the digital selection system concatenates the distance maps (and color channels) into an image/user interaction pair and provides the image/user interaction pair to the trained neural network 162. Moreover, utilizing the trained neural network 162, the digital selection system generates an object mask. The examiner interprets the digital selection system will generate an object mask when it detects the object.). 
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Price to Xia and Sun in order to generate an object mask for the detected query object. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to identify a set of pixels representing target objects quickly and accurately. (Price, ¶[0011])
Regarding Claim 14, the combination of Xia and Sun teaches the system of claim 12, where Xia further teaches wherein the query comprises a selection query, and the one or more servers cause the system to: provide the indication of detected query object in the one or more digital images in response to the query by returning a selection of one or more detected instances of the query object in at least one digital image of the one or more digital images (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.); 
They do not explicitly teach generate an object mask for the one or more detected instances of the query object in the digital image utilizing an object mask model.
Price teaches generate an object mask for the one or more detected instances of the query object in the digital image utilizing an object mask model. (¶[0094] As illustrated, the digital selection system provides the positive and negative distance maps 158, 160 (together with one or more color channels) to the trained neural network 162. In particular, the digital selection system concatenates the distance maps (and color channels) into an image/user interaction pair and provides the image/user interaction pair to the trained neural network 162. Moreover, utilizing the trained neural network 162, the digital selection system generates an object mask. The examiner interprets that the digital selection system will generate an object mask when it detects the object.). 
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Price to Xia and Sun in order to generate an object mask for the detected query object. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to identify a set of pixels representing target objects quickly and accurately. (Price, ¶[0011])
Regarding Claim 20, while the combination of Xia and Sun teaches the method of claim 19, they do not explicitly teach further comprising generating an object mask for the one or more instances of the detected query object in the one or more digital images.
Price teaches further comprising generating an object mask for the one or more instances of the detected query object in the one or more digital images. (¶[0094] As illustrated, the digital selection system provides the positive and negative distance maps 158, 160 (together with one or more color channels) to the trained neural network 162. In particular, the digital selection system concatenates the distance maps (and color channels) into an image/user interaction pair and provides the image/user interaction pair to the trained neural network 162. Moreover, utilizing the trained neural network 162, the digital selection system generates an object mask. The examiner interprets the digital selection system will generate an object mask when it detects the object.). 
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Price to Xia and Sun in order to generate an object mask for the detected query object. One skilled in the art would have been motivated to modify Xia and Sun in this manner in order to identify a set of pixels representing target objects quickly and accurately. (Price, ¶[0011])
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAN D HOANG whose telephone number is (571)272-4344. The examiner can normally be reached Monday-Friday 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire X. Wang can be reached on (571) 270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/HAN HOANG/Examiner, Art Unit 2663                                                                                                                                                                                                        
/SEAN M CONNER/Primary Examiner, Art Unit 2663