DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 07/27/2022 has been entered.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 07/28/2022, 05/11/2022 and 10/28/2021 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Response to Arguments
Applicant’s arguments with respect to claims 1-20 have been considered but are not persuasive.
On page 14 of the remarks, the applicant argues that Xia fails to disclose “selecting from among a set of possible object class detection neural networks from a set of possible object detection neural networks.” However, Xia discloses in ¶[0005] that a recognition model is selected based on whether the query object is present in the dataset. Xia further discloses in ¶[0029] that a known recognition model is selected based on whether the object matches the query, and discloses in ¶[0097] the case in which a classification cannot be determined for an object. Xia does not, however, explicitly teach using an unknown object detection network to classify the object. The newly cited prior art of Li et al., US PG-Pub (US 20200175344 A1), cures this deficiency: as disclosed in ¶[0023], the object classification program uses an unknown object detection network in order to classify the unknown object. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the claimed invention as taught by Xia with Li in order to use an unknown object detection network to determine classifications for unknown objects. One skilled in the art would have been motivated to modify Xia in this manner in order to utilize neural networks to identify an unknown piece of art, as disclosed by Li in ¶[0001]. Thus, the applicant’s arguments are not persuasive.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 6-13 and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Xia et al., US PG-Pub (US 20210192375 A1), in view of Li et al., US PG-Pub (US 20200175344 A1), in view of Li 2 et al., US PG-Pub (US 20200250538 A1).
Regarding Claim 1, Xia teaches a non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computing device to(¶0105] Computer-readable instructions stored on a computer-readable medium are executable by the processing unit 702 of the computer 700. A hard drive, CD-ROM, RAM, and flash memory are some examples of articles including a non-transitory computer-readable medium such as a storage device.): identify a query that comprises a query object identification label indicating a query object to be detected in one or more digital images(¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including the at least one object in the data set; select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database); select, from among a set of possible object class detection neural networks (¶[0005], select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database; and process the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object.), an object class detection neural network for classifying the query object based on determining whether the query object corresponds to a known object class by comparing the query object identification label to known object class labels before utilizing the object class detection neural network to detect the query object (Abstract, The processing circuitry selects one or more models using an entity knowledge database that 
includes a plurality of entities corresponding to objects to be identified. Each of a plurality or recognition models is linked to multiple entities of the entity knowledge database so that the processing circuitry may select multiple recognition models.¶[0068] The embodiments below are described in the context of an AI image recognition system in which a user provides or indicates a data set including an image, set of images or video to be searched and also provides a query asking if a particular object or class of objects is represented in the data set. The query may be a specific query, such as, “does the data set include a German Shepard” or it may be a more general query, such as, “identify all of the animals in the data set.” The examiner interprets that the prior art is determining if the query belongs to a known set of classes or models before using a object detection model to detect the object in the dataset.);wherein the set of possible object class detection neural networks comprises: a known object class detection neural network that determines classifications indicating known objects (¶[0029, select at least one recognition model of the plurality of recognition models to be used to identify the at least one object in response to the search of the entity knowledge database; and process the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object. ¶[0074] The system 100 includes a block 118 that receives queries. The data sets associated with the queries may be provided to the model selection and serving process 126. As described above, in the illustrated examples, a data set may be an image file or video file and an example query may be to determine whether an object or class of objects is represented in the data set. 
The examiner interprets that ¶[0029] talks about selecting a model that corresponds to a known object class and ¶[0074] talks about the model selection is based on the query by the user to see if an object is represented in the dataset.),based on determining that the query object corresponds to the known object class, utilize the known object class detection neural network to detect the query object within the one or more digital images ([0014], the one or more processors execute the instructions to: identify, as the at least one identified recognition model, a plurality of identified recognition models linked to a plurality of nodes that are ontologically coupled to the node corresponding to the at least one object; process the data set using the selected plurality of identified recognition models; and combine results of processing the selected plurality of identified recognition models to provide the indication of whether the data set includes the at least one object. The examiner interprets the prior art is determining if the query object matches any of the object recognition models.);based on determining that the query object does not correspond to a known object class ([0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102. 
The examiner interprets that, when the object is not linked to a known class, the prior art uses a graph embedding process to determine whether the object can be linked to any of the potential models.) and provide an indication of a detected query object corresponding to the query object in the one or more digital images in response to the query (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.).
Xia does not explicitly teach an unknown object class detection neural network that determines classifications indicating unknown objects.
Li teaches an unknown object class detection neural network that determines classifications indicating unknown objects(¶[0023], the image identification and classification program 132 is capable of receiving one or more images of the unknown object 112, i.e. an unknown artistic object, and identifying the unknown object 112 using the generated identification model. The image identification and classification program 132 may receive one or more images of the unknown object 112 from the imaging device 120 and/or the user device 140 and input into the one or more images of the unknown object 112 into the identification model. The image identification and classification program 132 may generate a novel linguistic description of the unknown object 112. For example, the image identification and classification program 132 may receive one or more images of the unknown object 112, such as, but not limited to, an unknown porcelain vase and input the one or more images into the identification model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the claimed invention as taught by Xia with Li in order to use an unknown object detection network to determine classifications for unknown objects. One skilled in the art would have been motivated to modify Xia in this manner in order to utilize neural networks to identify an unknown piece of art (Li, ¶[0001]).
However, Xia and Li do not explicitly teach utilize the unknown object class detection neural network comprising a concept embedding neural network that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images.
Li 2 teaches utilize the unknown object class detection neural network comprising a concept embedding neural network that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images (¶[0088] In another example, the trained text embedding model and the trained image embedding model can both be used by the search system 306 in ranking image search results responsive to a search query that includes a sequence of one or more words. More specifically, the search system 306 can use the image embedding model 100 to generate a respective embedding of each image in a search index maintained by the search system. After receiving a search query that includes a sequence of one or more words, the search system can use the text embedding model to generate an embedding of the sequence of words, and thereafter use the generated embedding to determine a respective relevance score for each of multiple images in the search index. The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using a text embedding model and image embedding model which determines the correspondence between a search query and query object in the image by giving a relevance score for the given image based on the search query.);
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to utilize a concept embedding model to determine the correspondence between the query and query object used to detect the query object in the image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
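For illustration only, the selection logic of claim 1 as mapped above (comparing the query object identification label against known object class labels, then routing to either a known-class or an unknown-class detection network) may be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the label set, and the placeholder detectors are hypothetical and do not appear in Xia, Li, or Li 2, and the case-insensitive label comparison is likewise an assumption.

```python
from typing import Callable, List, Set

# Hypothetical detector signature: (query_label, image_id) -> list of detections.
Detector = Callable[[str, str], List[str]]


def known_detector(label: str, image_id: str) -> List[str]:
    """Placeholder for a detector trained on known object classes."""
    return [f"known:{label}@{image_id}"]


def unknown_detector(label: str, image_id: str) -> List[str]:
    """Placeholder for an embedding-based detector for unknown classes."""
    return [f"embedding:{label}@{image_id}"]


def select_detector(
    query_label: str,
    known_labels: Set[str],
    known: Detector,
    unknown: Detector,
) -> Detector:
    """Choose a detector BEFORE any detection is run.

    The query label is compared with the known class labels; a match routes
    to the known-class detector, otherwise to the unknown-class detector.
    Normalizing case and whitespace is an assumption of this sketch.
    """
    if query_label.strip().lower() in known_labels:
        return known
    return unknown


# Hypothetical set of labels the known-class detector was trained on.
KNOWN_LABELS = {"dog", "truck", "person"}


def detect(query_label: str, image_id: str) -> List[str]:
    """Select a detector for the query label, then run it on the image."""
    detector = select_detector(
        query_label, KNOWN_LABELS, known_detector, unknown_detector
    )
    return detector(query_label, image_id)
```

Under these assumptions, a query such as "Dog" is routed to the known-class detector, while a label outside the known set, such as "porcelain vase", is routed to the unknown-class detector.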
Regarding Claim 2, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 1, where Xia further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to(¶0105] Computer-readable instructions stored on a computer-readable medium are executable by the processing unit 702 of the computer 700. A hard drive, CD-ROM, RAM, and flash memory are some examples of articles including a non-transitory computer-readable medium such as a storage device.) utilize the known object class detection neural network to detect the query object within the one or more digital images (¶[0014], the one or more processors execute the instructions to: identify, as the at least one identified recognition model, a plurality of identified recognition models linked to a plurality of nodes that are ontologically coupled to the node corresponding to the at least one object; process the data set using the selected plurality of identified recognition models; and combine results of processing the selected plurality of identified recognition models to provide the indication of whether the data set includes the at least one object):detecting potential objects in the one or more digital images utilizing a region proposal neural network; (¶[0083], if the recognition model/service associated with the node 350 only isolates the entities and generates the bounding boxes 304, 306, 308, 310, and 312, the system 100 may process each of the bounded images indicated by the bounding boxes 304, 306, 308, 310, and 312 using each of the second level recognition models associated with the nodes 352, 354, and 356. For example, each of the bounded images may be processed by the model associated with vehicle node 356 to identify the pick-up truck 304 in the image 302 with a confidence value of 1.000, as indicated by the dashed line 320. 
The examiner interprets that the prior art is using bounding boxes around objects in the digital image to classify the object.) and generating approximate boundaries about the potential objects (¶[0082], The response to this query may identify all entities in the image 302 using one or more recognition models 102 associated with the entity node 350. This recognition model may be a general classification/detection service that provides an entry level classification of the image 302 as a whole and blob detection to parse out objects in the image 302. This service may also provide a general characterization of the scene in the image 302 (e.g., “outdoors” and/or “stable yard”) and outline detected blobs in the image 302 with the bounding boxes 304, 306, 308, 310, and 312.).
Regarding Claim 3, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 2,  where Xia further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the known object class detection neural network to detect the query object within the one or more digital images by generating object labels for the potential objects utilizing an object classification neural network. (¶[0082], One example classification/detection system may also include general models (e.g., the second level models of the database 340) that provide a broad taxonomy of many categories which includes objects that may be searched identified in an image 302. These are entry level categories, for example, animal, person, vehicle, fashion, food_and_drink, plant, sports and other broad categories. The result returned by each recognition model 102 may be accompanied by a confidence value indicating a likelihood that the image 302 belongs to the category. The models/service associated with the node 350 may return multiple categories corresponding to the nodes 352, 354 and 356, each associated with a corresponding confidence value.)
Regarding Claim 4, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 3, where Xia further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the known object class detection neural network to detect the query object within the one or more digital images by determining that an object label of one or more potential objects corresponds to a query object class associated with the query object (¶[0082], One example classification/detection system may also include general models (e.g., the second level models of the database 340) that provide a broad taxonomy of many categories which includes objects that may be searched identified in an image 302. These are entry level categories, for example, animal, person, vehicle, fashion, food_and_drink, plant, sports and other broad categories. The result returned by each recognition model 102 may be accompanied by a confidence value indicating a likelihood that the image 302 belongs to the category. The models/service associated with the node 350 may return multiple categories corresponding to the nodes 352, 354 and 356, each associated with a corresponding confidence value. The examiner interprets that, as seen in Figure 3, the prior art shows labels of potential objects corresponding to a category.).
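The label-matching step addressed for claims 3 and 4 (object labels with confidence values generated for bounded regions, then compared with the query object class) can be illustrated with a short sketch. The Proposal record, the exact-match comparison, and the min_confidence cutoff are assumptions introduced for illustration; Xia describes confidence values, but the cutoff and all identifiers below are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proposal:
    """A potential object: an approximate boundary plus a predicted label."""
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)
    label: str                      # label assigned by a classification model
    confidence: float               # likelihood that the label is correct


def match_query_class(
    proposals: List[Proposal],
    query_class: str,
    min_confidence: float = 0.5,  # hypothetical cutoff, not from the references
) -> List[Proposal]:
    """Keep the proposals whose label corresponds to the query object class."""
    return [
        p
        for p in proposals
        if p.label == query_class and p.confidence >= min_confidence
    ]
```

Proposals whose label differs from the query class, or whose confidence falls below the assumed cutoff, are simply dropped from the result.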
Regarding Claim 6, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 3, where Li further teaches instructions that, when executed by the at least one processor, cause the computing device to: determine that an object label of at least one potential object does not correspond to the query object identification label and filter out the at least one potential object based on the object label of the at least one potential object not corresponding to the query object identification label (¶[0023], the one or more deep recurrent neural networks to train the art identification model. Further, the image identification and classification program 132 is capable of receiving one or more images of the unknown object 112, i.e. an unknown artistic object, and identifying the unknown object 112 using the generated identification model. The image identification and classification program 132 may receive one or more images of the unknown object 112 from the imaging device 120 and/or the user device 140 and input into the one or more images of the unknown object 112 into the identification model. The image identification and classification program 132 may generate a novel linguistic description of the unknown object 112. For example, the image identification and classification program 132 may receive one or more images of the unknown object 112, such as, but not limited to, an unknown porcelain vase and input the one or more images into the identification model. The examiner interprets that, if the network does not know the label, it will create a new label for the object.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the claimed invention as taught by Xia with Li in order to use an unknown object detection network to determine classifications for unknown objects. One skilled in the art would have been motivated to modify Xia in this manner in order to utilize neural networks to identify an unknown piece of art (Li, ¶[0001]).
Regarding Claim 7, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 1, where Xia teaches wherein the instructions, when executed by the at least one processor, cause the computing device to detect the potential objects in the one or more digital images utilizing a region proposal neural network jointly with a concept embedding neural network (¶[0083], if the recognition model/service associated with the node 350 only isolates the entities and generates the bounding boxes 304, 306, 308, 310, and 312, the system 100 may process each of the bounded images indicated by the bounding boxes 304, 306, 308, 310, and 312 using each of the second level recognition models associated with the nodes 352, 354, and 356. For example, each of the bounded images may be processed by the model associated with vehicle node 356 to identify the pick-up truck 304 in the image 302 with a confidence value of 1.000, as indicated by the dashed line 320. The examiner interprets that it would be obvious to one of ordinary skill in the art to also incorporate a concept embedding neural network as taught by Li 2 into Xia’s RPN in order to further detect the potential object in the image.) and generate approximate boundaries around the potential objects (¶[0082], The response to this query may identify all entities in the image 302 using one or more recognition models 102 associated with the entity node 350. This recognition model may be a general classification/detection service that provides an entry level classification of the image 302 as a whole and blob detection to parse out objects in the image 302. This service may also provide a general characterization of the scene in the image 302 (e.g., “outdoors” and/or “stable yard”) and outline detected blobs in the image 302 with the bounding boxes 304, 306, 308, 310, and 312.).
Li 2 teaches utilize the concept embedding neural network to detect the query object within the one or more digital images(Li 2, ¶[0088] In another example, the trained text embedding model and the trained image embedding model can both be used by the search system 306 in ranking image search results responsive to a search query that includes a sequence of one or more words. More specifically, the search system 306 can use the image embedding model 100 to generate a respective embedding of each image in a search index maintained by the search system. After receiving a search query that includes a sequence of one or more words, the search system can use the text embedding model to generate an embedding of the sequence of words, and thereafter use the generated embedding to determine a respective relevance score for each of multiple images in the search index. The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using a text embedding model and image embedding model which determines the correspondence between a search query and query object in the image by giving a relevance score for the given image based on the search query.);
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to utilize a concept embedding model to determine the correspondence between the query and query object used to detect the query object in the image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019]) 
Regarding Claim 8, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 7, where Li 2 further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding neural network to detect the query object within the one or more digital images by:
generating a correlation score for each of the potential objects relative to the query object identification label utilizing the concept embedding neural network (¶[0088], The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is generating a relevance score between the query and query object in the image in order to choose which image to display to the user based on whether the score is high or low.); and selecting at least one potential object of the plurality of potential objects as an instance of the query object based on the correlation scores (¶[0057], FIG. 5 shows an example search results page 500 provided by the search system 306 that includes image search results for a search query that includes an image. In particular, the search results page 500 displays search results 502, 504, and 506 for the search query 508 that includes an image depicting a truck. In response to receiving a search query that includes a query image, the search system 306 may be configured to provide search results which identify images similar to the query image. In this example, each of the search results 502, 504, and 506 identify images which are similar to the query image. The examiner interprets that the prior art is displaying the images that are most similar to the search query.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to generate a correlation score of the query object. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
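The scoring and selection addressed for claim 8 can be illustrated as follows, assuming each potential object and the query label have already been embedded as numeric vectors. Li 2 ¶[0088] names a Euclidean distance as one example measure of similarity; the mapping of distance to a score via 1 / (1 + d), the threshold, and all identifiers below are hypothetical and are not taken from any cited reference.

```python
import math
from typing import Dict, List, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_scores(
    query_embedding: Sequence[float],
    object_embeddings: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Score each potential object against the query label embedding.

    Assumption: score = 1 / (1 + distance), so identical embeddings score
    1.0 and the score decreases toward 0 as the distance grows.
    """
    return {
        name: 1.0 / (1.0 + euclidean_distance(query_embedding, emb))
        for name, emb in object_embeddings.items()
    }


def select_instances(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the objects scoring at or above the threshold, best first."""
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


# Hypothetical two-dimensional embeddings for three potential objects.
query = [1.0, 0.0]
objects = {
    "box_a": [1.0, 0.0],  # distance 0
    "box_b": [1.0, 3.0],  # distance 3
    "box_c": [4.0, 4.0],  # distance 5
}
scores = correlation_scores(query, objects)
selected = select_instances(scores, threshold=0.2)
```

With these example vectors, box_a and box_b meet the assumed threshold of 0.2 and box_c does not, so only the first two are selected as instances of the query object.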
Regarding Claim 9, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 8, where Li 2 further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding neural network to detect the query object within the one or more digital images by generating image embeddings for each of the potential objects utilizing the concept embedding neural network (¶[0044], FIG. 1 shows an example image embedding model 100. The image embedding model 100 is configured to process an image 102 in accordance with current values of a set of image embedding model parameters to generate an embedding 104 of the image 102. The embedding 104 is a representation of the image 102 as an ordered collection of numerical values, for example, as a vector or matrix. As will be described in more detail below, the image embedding model 100 can be trained using machine learning techniques to generate an embedding 104 of an image 102 which implicitly represents the semantic content of the image 102 (e.g., objects depicted by the image 102). The examiner interprets that the prior art is utilizing a concept embedding model that uses image embeddings to determine the query object in the image.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to generate image embedding to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
Regarding Claim 10, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 9, where Li 2 further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to utilize the concept embedding neural network to detect the query object within the one or more digital images by generating a topic embedding for the query object identification label utilizing the concept embedding neural network. (¶[0047] FIG. 2 shows an example text embedding model 200. The text embedding model 200 is configured to process a representation of a sequence of one or more words in a natural language (i.e., the text 202) in accordance with current values of a set of text embedding model parameters to generate an embedding 204 of the text 202. The embedding 204 is a representation of the text 202 as an ordered collection of numerical values, for example, as a vector or matrix. As will be described in more detail below, the text embedding model 200 can be trained using machine learning techniques to generate an embedding 204 of the text 202 which implicitly represents the semantic content of the text 202 (e.g., objects described by the text 202). The examiner interprets that the prior art is utilizing a concept embedding model that uses topic/text embeddings to determine the query object in the image.).  
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to generate topic embeddings to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
Regarding Claim 11, the combination of Xia, Li and Li 2 teaches the non-transitory computer-readable medium of claim 10, where Li 2 further teaches wherein the instructions, when executed by the at least one processor, cause the computing device to generate the correlation score for each of the potential objects by comparing the topic embedding with the image embeddings.(¶[0088], The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using the embeddings from both the image and text in order to generate a relevance score for each given image with the search index.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to compare the topic embedding with the image embeddings to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
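The comparison relied on from Li 2, ¶[0088] (a relevance score based on a measure of similarity, e.g., a Euclidean distance, between a text embedding and an image embedding) can be illustrated with a minimal sketch. The function names, the toy two-dimensional vectors, and the convention of scoring by negated distance are assumptions made here for illustration only; they are not taken from Li 2 or from the claims.

```python
import math

def relevance_scores(text_embedding, image_embeddings):
    """Score each image embedding against the text embedding.

    Assumption: the score is the negated Euclidean distance, so a
    smaller distance yields a higher score.
    """
    return [-math.dist(text_embedding, emb) for emb in image_embeddings]

def rank_images(text_embedding, image_embeddings):
    """Return image indices ordered from most to least relevant."""
    scores = relevance_scores(text_embedding, image_embeddings)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical embeddings: one query (topic) vector and three image vectors.
query = [1.0, 0.0]
images = [[0.0, 1.0], [0.9, 0.1], [1.0, 0.0]]
ranking = rank_images(query, images)
```

In this sketch the third image vector coincides with the query vector, so it ranks first; any monotonically decreasing function of the distance would give the same ordering.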
Regarding Claim 12, Xia teaches a system for automatically selecting objects within digital images comprising: 
one or more memory devices(¶[0102] One example computing device in the form of a computer 700 may include a processing unit 702, memory 703, removable storage 710, and non-removable storage 712.) comprising: one or more digital images(¶[0068], AI image recognition system in which a user provides or indicates a data set including an image, set of images or video to be searched); a region proposal neural network(¶[0082], The response to this query may identify all entities in the image 302 using one or more recognition models 102 associated with the entity node 350. This recognition model may be a general classification/detection service that provides an entry level classification of the image 302 as a whole and blob detection to parse out objects in the image 302. This service may also provide a general characterization of the scene in the image 302 (e.g., “outdoors” and/or “stable yard”) and outline detected blobs in the image 302 with the bounding boxes 304, 306, 308, 310, and 312. The examiner interprets as seen in Fig. 3 that the prior art is using bounding boxes to determine object classification in the image.)
an object classification neural network (¶[0015], the entity knowledge database includes a plurality of ontologically organized nodes corresponding to the at least one object at different levels of generality and the one or more processors execute the instructions to: select, as the identified plurality of recognition models, respective models associated with the plurality of nodes corresponding to the different levels of generality of the at least one object);and one or more servers that cause the system to: (¶[0075], The example environment 200 includes a user terminal 202 coupled to a local server 204): 
identify a query that comprises a query object identification label indicating a query object to be detected in the one or more digital images(¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including the at least one object in the data set; select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database);select, from among a set of possible object class detection neural networks (¶[0005], select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database; and process the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object.), an object class detection neural network for classifying the query object based on determining whether the query object corresponds to a known object class by comparing the query object identification label to known object class labels before utilizing the object class detection neural network to detect the query object (Abstract, The processing circuitry selects one or more models using an entity knowledge database that includes a plurality of entities corresponding to objects to be identified. 
Each of a plurality or recognition models is linked to multiple entities of the entity knowledge database so that the processing circuitry may select multiple recognition models.¶[0068] The embodiments below are described in the context of an AI image recognition system in which a user provides or indicates a data set including an image, set of images or video to be searched and also provides a query asking if a particular object or class of objects is represented in the data set. The query may be a specific query, such as, “does the data set include a German Shepard” or it may be a more general query, such as, “identify all of the animals in the data set.” The examiner interprets that the prior art is determining if the query belongs to a known set of classes or models before using a object detection model to detect the object in the dataset.);wherein the set of possible object class detection neural networks comprises: a known object class detection neural network that determines classifications indicating known objects (¶[0029, select at least one recognition model of the plurality of recognition models to be used to identify the at least one object in response to the search of the entity knowledge database; and process the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object. ¶[0074] The system 100 includes a block 118 that receives queries. The data sets associated with the queries may be provided to the model selection and serving process 126. As described above, in the illustrated examples, a data set may be an image file or video file and an example query may be to determine whether an object or class of objects is represented in the data set. 
The examiner interprets that ¶[0029] talks about selecting a model that corresponds to a known object class and ¶[0074] talks about the model selection is based on the query by the user to see if an object is represented in the dataset.),based on determining that the query object corresponds to the known object class, utilize the known object class detection neural network to detect the query object within the one or more digital images ([0014], the one or more processors execute the instructions to: identify, as the at least one identified recognition model, a plurality of identified recognition models linked to a plurality of nodes that are ontologically coupled to the node corresponding to the at least one object; process the data set using the selected plurality of identified recognition models; and combine results of processing the selected plurality of identified recognition models to provide the indication of whether the data set includes the at least one object. The examiner interprets the prior art is determining if the query object matches any of the object recognition models.);based on determining that the query object does not correspond to a known object class ([0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102. 
The examiner interprets that, because the object is not linked to a known class, the prior art uses a graph embedding process to determine whether the object can be linked to any of the potential models.)
detecting potential objects in the one or more digital images utilizing a region proposal neural network; (¶[0083], if the recognition model/service associated with the node 350 only isolates the entities and generates the bounding boxes 304, 306, 308, 310, and 312, the system 100 may process each of the bounded images indicated by the bounding boxes 304, 306, 308, 310, and 312 using each of the second level recognition models associated with the nodes 352, 354, and 356. For example, each of the bounded images may be processed by the model associated with vehicle node 356 to identify the pick-up truck 304 in the image 302 with a confidence value of 1.000, as indicated by the dashed line 320. The examiner interprets the prior art is using bounding boxes around objects in the digital image to classify the object)and provide an indication of a detected query object corresponding to the query object in the one or more digital images in response to the query. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
Xia does not explicitly teach an unknown object class detection neural network that determines classifications indicating unknown objects.
Li teaches an unknown object class detection neural network that determines classifications indicating unknown objects(¶[0023], the image identification and classification program 132 is capable of receiving one or more images of the unknown object 112, i.e. an unknown artistic object, and identifying the unknown object 112 using the generated identification model. The image identification and classification program 132 may receive one or more images of the unknown object 112 from the imaging device 120 and/or the user device 140 and input into the one or more images of the unknown object 112 into the identification model. The image identification and classification program 132 may generate a novel linguistic description of the unknown object 112. For example, the image identification and classification program 132 may receive one or more images of the unknown object 112, such as, but not limited to, an unknown porcelain vase and input the one or more images into the identification model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the claimed invention as taught by Xia with Li in order to use an unknown object detection network to determine classifications for unknown objects. One skilled in the art would have been motivated to modify Xia in this manner in order to utilize neural networks to identify an unknown piece of art. (Li, ¶[0001])
However, Xia and Li do not explicitly teach a concept embedding neural network that determines a correspondence between queries and query objects in digital images and utilizing the unknown object class detection neural network comprising a concept embedding neural network that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images.
Li 2 teaches a concept embedding neural network that determines a correspondence between queries and query objects in digital images (¶[0088] In another example, the trained text embedding model and the trained image embedding model can both be used by the search system 306 in ranking image search results responsive to a search query that includes a sequence of one or more words. More specifically, the search system 306 can use the image embedding model 100 to generate a respective embedding of each image in a search index maintained by the search system. After receiving a search query that includes a sequence of one or more words, the search system can use the text embedding model to generate an embedding of the sequence of words, and thereafter use the generated embedding to determine a respective relevance score for each of multiple images in the search index. the examiner interprets that the prior art is using a text embedding model and image embedding model which determines the correspondence between a search query and query object in the image by giving a relevance score for the given image based on the search query.); utilize the unknown object class detection neural network comprising a concept embedding neural network that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images (¶[0088] In another example, the trained text embedding model and the trained image embedding model can both be used by the search system 306 in ranking image search results responsive to a search query that includes a sequence of one or more words. More specifically, the search system 306 can use the image embedding model 100 to generate a respective embedding of each image in a search index maintained by the search system. 
After receiving a search query that includes a sequence of one or more words, the search system can use the text embedding model to generate an embedding of the sequence of words, and thereafter use the generated embedding to determine a respective relevance score for each of multiple images in the search index. The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using a text embedding model and image embedding model which determines the correspondence between a search query and query object in the image by giving a relevance score for the given image based on the search query.);
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to utilize a concept embedding model to determine the correspondence between the query and query object used to detect the query object in the image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
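The selection limitation of claim 12 (comparing the query object identification label to known object class labels before choosing which detection network to use) can be illustrated with a minimal sketch. The function name, the returned path names, and the case-insensitive exact-match comparison are hypothetical choices made for illustration; neither the claims nor the cited references specify them.

```python
def select_detection_path(query_label, known_class_labels):
    """Choose a detection path by comparing the query label to known labels.

    Assumption: a case-insensitive exact match decides whether the
    query object belongs to a known object class.
    """
    normalized = query_label.strip().lower()
    known = {label.strip().lower() for label in known_class_labels}
    if normalized in known:
        return "known_class_network"
    # No match: fall back to the unknown-class (concept embedding) path.
    return "unknown_class_network"

# Hypothetical label set and queries.
known_labels = ["dog", "car", "person"]
path_known = select_detection_path("Dog", known_labels)
path_unknown = select_detection_path("porcelain vase", known_labels)
```

Here the label "Dog" matches a known class and routes to the known-class path, while "porcelain vase" has no match and routes to the unknown-class path.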
Regarding Claim 13, the combination of Xia, Li and Li 2 teaches the system of claim 12, where Xia further teaches wherein the query comprises an image search request and the one or more servers cause the system to provide the indication of detected query object in the one or more digital images in response to the query by returning a subset of the one or more digital images including the query object. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
Regarding Claim 15, the combination of Xia, Li and Li 2 teaches the system of claim 12, where Xia further teaches wherein the one or more servers cause the system to determine the known object class labels from labels of objects used to generate the object classification neural network (¶[0005], select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database. The examiner interprets the prior art is determining if the object corresponds to a model in the database before selecting the object classification model.)
Regarding Claim 16, the combination of Xia, Li and Li 2 teaches the system of claim 12, where Li 2 further teaches wherein the one or more servers cause the system to utilize the concept embedding neural network to detect the query object within the one or more digital images by: generating image embeddings for each of the plurality of potential objects utilizing the concept embedding neural network; (¶[0044] FIG. 1 shows an example image embedding model 100. The image embedding model 100 is configured to process an image 102 in accordance with current values of a set of image embedding model parameters to generate an embedding 104 of the image 102. The embedding 104 is a representation of the image 102 as an ordered collection of numerical values, for example, as a vector or matrix. As will be described in more detail below, the image embedding model 100 can be trained using machine learning techniques to generate an embedding 104 of an image 102 which implicitly represents the semantic content of the image 102 (e.g., objects depicted by the image 102). The examiner interprets that the prior art is utilizing a concept embedding model that uses image embeddings to determine the query object in the image.);
generating a topic embedding for the query object identification label utilizing the concept embedding neural network; 
and comparing the topic embedding with the image embeddings(¶[0088], The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using the embeddings from both the image and text in order to generate a relevance score for each given image with the search index.)
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to compare the topic and image embedding in order to detect the query object in the digital image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
Regarding Claim 17, the combination of Xia, Li and Li 2 teaches the system of claim 16, where Xia further teaches wherein the one or more servers cause the system to determine one or more potential objects with object classifications that correspond to the query object identification label. (¶[0082], One example classification/detection system may also include general models (e.g., the second level models of the database 340) that provide a broad taxonomy of many categories which includes objects that may be searched identified in an image 302. These are entry level categories, for example, animal, person, vehicle, fashion, food_and_drink, plant, sports and other broad categories. The result returned by each recognition model 102 may be accompanied by a confidence value indicating a likelihood that the image 302 belongs to the category. The models/service associated with the node 350 may return multiple categories corresponding to the nodes 352, 354 and 356, each associated with a corresponding confidence value. The examiner interprets as seen in figure 3, it shows labels of potential objects corresponding to a category)
Regarding Claim 18, Xia teaches in a digital medium environment for creating or editing digital images, a computer-implemented method of selecting query objects, comprising: identifying a query that comprises a query object identification label indicating a query object to be detected in one or more digital images(¶[0005] wherein the one or more processors execute the instructions to: receive a data set and a query including the at least one object in the data set; select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity); select, from among a set of possible object class detection neural networks (¶[0005], select at least one recognition model using an entity knowledge database including a plurality of entities corresponding to objects to be identified, wherein each recognition model of a plurality of recognition models is linked to multiple entities of the entity knowledge database; and process the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object.), an object class detection neural network for classifying the query object based on determining whether the query object corresponds to a known object class by comparing the query object identification label to known object class labels before utilizing the object class detection neural network to detect the query object (Abstract, The processing circuitry selects one or more models using an entity knowledge database that includes a plurality of entities corresponding to objects to be identified. 
Each of a plurality or recognition models is linked to multiple entities of the entity knowledge database so that the processing circuitry may select multiple recognition models.¶[0068] The embodiments below are described in the context of an AI image recognition system in which a user provides or indicates a data set including an image, set of images or video to be searched and also provides a query asking if a particular object or class of objects is represented in the data set. The query may be a specific query, such as, “does the data set include a German Shepard” or it may be a more general query, such as, “identify all of the animals in the data set.” The examiner interprets that the prior art is determining if the query belongs to a known set of classes or models before using a object detection model to detect the object in the dataset.);wherein the set of possible object class detection neural networks comprises: a known object class detection neural network that determines classifications indicating known objects (¶[0029, select at least one recognition model of the plurality of recognition models to be used to identify the at least one object in response to the search of the entity knowledge database; and process the data set using the at least one selected recognition model to provide an indication of whether the data set includes the at least one object. ¶[0074] The system 100 includes a block 118 that receives queries. The data sets associated with the queries may be provided to the model selection and serving process 126. As described above, in the illustrated examples, a data set may be an image file or video file and an example query may be to determine whether an object or class of objects is represented in the data set. 
The examiner interprets that ¶[0029] talks about selecting a model that corresponds to a known object class and ¶[0074] talks about the model selection is based on the query by the user to see if an object is represented in the dataset.),based on determining that the query object corresponds to the known object class, utilize the known object class detection neural network to detect the query object within the one or more digital images ([0014], the one or more processors execute the instructions to: identify, as the at least one identified recognition model, a plurality of identified recognition models linked to a plurality of nodes that are ontologically coupled to the node corresponding to the at least one object; process the data set using the selected plurality of identified recognition models; and combine results of processing the selected plurality of identified recognition models to provide the indication of whether the data set includes the at least one object. The examiner interprets the prior art is determining if the query object matches any of the object recognition models.);based on determining that the query object does not correspond to a known object class ([0097] Returning to FIG. 4B, after processing unknown entities at block 456 through the graph embedding process 124, block 458 determines whether the identified objects in the entity-knowledge graph database 114 are linked to recognition models 102 in the model repository 112. When the entities exist in the entity-knowledge graph database 114 but the entities are not linked to one or more recognition models 102, block 460 identifies similar objects in the entity-knowledge graph database 114 based on the graph ontology. For example, if the query object were poodle and the poodle node 536 was not directly linked to a recognition model 102. 
The examiner interprets that, because the object is not linked to a known class, the prior art uses a graph embedding process to determine whether the object can be linked to any of the potential models.) and provide an indication of a detected query object corresponding to the query object in the one or more digital images in response to the query. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
Xia does not explicitly teach an unknown object class detection neural network that determines classifications indicating unknown objects.
Li teaches an unknown object class detection neural network that determines classifications indicating unknown objects(¶[0023], the image identification and classification program 132 is capable of receiving one or more images of the unknown object 112, i.e. an unknown artistic object, and identifying the unknown object 112 using the generated identification model. The image identification and classification program 132 may receive one or more images of the unknown object 112 from the imaging device 120 and/or the user device 140 and input into the one or more images of the unknown object 112 into the identification model. The image identification and classification program 132 may generate a novel linguistic description of the unknown object 112. For example, the image identification and classification program 132 may receive one or more images of the unknown object 112, such as, but not limited to, an unknown porcelain vase and input the one or more images into the identification model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the claimed invention as taught by Xia with Li in order to use an unknown object detection network to determine classifications for unknown objects. One skilled in the art would have been motivated to modify Xia in this manner in order to utilize neural networks to identify an unknown piece of art. (Li, ¶[0001])
However, Xia and Li do not explicitly teach utilizing the unknown object class detection neural network comprising a concept embedding neural network that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images.
Li 2 teaches utilize the unknown object class detection neural network comprising a concept embedding neural network that determines a correspondence between the query object identification label and potential query objects within the one or more digital images to detect the query object within the one or more digital images (¶[0088] In another example, the trained text embedding model and the trained image embedding model can both be used by the search system 306 in ranking image search results responsive to a search query that includes a sequence of one or more words. More specifically, the search system 306 can use the image embedding model 100 to generate a respective embedding of each image in a search index maintained by the search system. After receiving a search query that includes a sequence of one or more words, the search system can use the text embedding model to generate an embedding of the sequence of words, and thereafter use the generated embedding to determine a respective relevance score for each of multiple images in the search index. The search system can determine the relevance score for a given image in the search index based on a measure of similarity (e.g., a Euclidean distance) between the embedding of the given image and the embedding of the sequence of words of the search query. The search system 306 can determine the ranking of the image search results for the search query based at least in part on the relevance scores determined using the embeddings generated by the image embedding model and the text embedding model. The examiner interprets that the prior art is using a text embedding model and image embedding model which determines the correspondence between a search query and query object in the image by giving a relevance score for the given image based on the search query.);
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Li 2 to Xia and Li in order to utilize a concept embedding model to determine the correspondence between the query and query object used to detect the query object in the image. One skilled in the art would have been motivated to modify Xia and Li in this manner in order to improve model training to characterize a wide range of concepts. (Li 2, ¶[0019])
Regarding Claim 19, the combination of Xia, Li and Li 2 teaches the computer-implemented method of claim 18, where Xia further teaches wherein providing the indication of one or more instances of the detected query object comprises automatically selecting an instance of the detected query object within a digital image of the one or more digital images without selecting other areas in the digital image. (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.)
Claims 5, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Xia et al US PG-Pub(US 20210192375 A1) in view of Li et al. US PG-Pub(US 20200175344 A1) in view of Li 2 et al. US PG-Pub(US 20200250538 A1) in view of Price et al. US PG-Pub (US 20170140236 A1).
Regarding Claim 5, while the combination of Xia, Li, and Li 2 teaches the non-transitory computer-readable medium of claim 4, they do not explicitly teach further comprising instructions that, when executed by the at least one processor, cause the computing device to generate an object mask for each detected instance of the query object utilizing an object mask model.
Price teaches further comprising instructions that when executed by the at least one processor, cause the computing device to generate an object mask for each detected instance of the query object utilizing an object mask model (¶[0094] As illustrated, the digital selection system provides the positive and negative distance maps 158, 160 (together with one or more color channels) to the trained neural network 162. In particular, the digital selection system concatenates the distance maps (and color channels) into an image/user interaction pair and provides the image/user interaction pair to the trained neural network 162. Moreover, utilizing the trained neural network 162, the digital selection system generates an object mask. The examiner interprets the digital selection system will generate an object mask when it detects the object.). 
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Price to Xia, Li and Li 2 in order to generate an object mask for the detected query object. One skilled in the art would have been motivated to modify Xia, Li and Li 2 in this manner in order to identify a set of pixels representing target objects quickly and accurately. (Price, ¶[0011])
Regarding Claim 14, the combination of Xia, Li and Li 2 teaches the system of claim 12, where Xia further teaches wherein: the query comprises a selection query indicating selections of one or more digital object instances within a query digital image; and the one or more servers cause the system to: provide the indication of detected query object in the one or more digital images in response to the query by returning a selection of one or more detected instances of the query object in at least one digital image of the one or more digital images (¶[0076], In this embodiment, the local server 204 may receive the data sets and queries, retrieve one or more of the recognition models 1-5 from the data store 206 execute the recognition models on the data set and display determined prediction results 128 via the user terminal 202.);
They do not explicitly teach generate an object mask for the one or more detected instances of the query object in the digital image utilizing an object mask model.
Price teaches generate an object mask for the one or more detected instances of the query object in the digital image utilizing an object mask model (¶[0094] As illustrated, the digital selection system provides the positive and negative distance maps 158, 160 (together with one or more color channels) to the trained neural network 162. In particular, the digital selection system concatenates the distance maps (and color channels) into an image/user interaction pair and provides the image/user interaction pair to the trained neural network 162. Moreover, utilizing the trained neural network 162, the digital selection system generates an object mask. The examiner interprets the digital selection system will generate an object mask when it detects the object.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Price to Xia, Li and Li 2 in order to generate an object mask for the detected query object. One skilled in the art would have been motivated to modify Xia, Li and Li 2 in this manner in order to identify a set of pixels representing target objects quickly and accurately. (Price, ¶[0011])
Regarding Claim 20, while the combination of Xia, Li and Li 2 teaches the computer-implemented method of claim 19, they do not explicitly teach further comprising generating an object mask for the one or more instances of the detected query object in the one or more digital images.
Price teaches further comprising generating an object mask for the one or more instances of the detected query object in the one or more digital images. (¶[0094] As illustrated, the digital selection system provides the positive and negative distance maps 158, 160 (together with one or more color channels) to the trained neural network 162. In particular, the digital selection system concatenates the distance maps (and color channels) into an image/user interaction pair and provides the image/user interaction pair to the trained neural network 162. Moreover, utilizing the trained neural network 162, the digital selection system generates an object mask. The examiner interprets the digital selection system will generate an object mask when it detects the object.). 
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Price to Xia, Li and Li 2 in order to generate an object mask for the detected query object. One skilled in the art would have been motivated to modify Xia, Li and Li 2 in this manner in order to identify a set of pixels representing target objects quickly and accurately. (Price, ¶[0011])
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAN D HOANG whose telephone number is (571)272-4344. The examiner can normally be reached Monday-Friday 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire X. Wang, can be reached at (571) 270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/HAN HOANG/Examiner, Art Unit 2663
/ANDREW M MOYER/Primary Examiner, Art Unit 2663