Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


DETAILED ACTION

Claim Status
Claims 1-20 have been considered and are pending examination.


Information Disclosure Statement
The information disclosure statement (IDS) submitted on 03/29/2019 was filed for Application Number 16369694. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.


NOTE
It is noted that any citations to specific pages, columns, lines, or figures in the prior art references, and any interpretation of the references, should not be considered limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.



Claim(s) 1-6, 9-11, 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Garera (U.S. Publication Number 2014/0314311) in view of Jayaraman (U.S. Publication Number 2020/0234162).

Referring to claims 1, 13 and 17, taking claim 13 as exemplary, Garera teaches “13. A computing device, comprising: a memory;” Garera figure 2 element 204, [0029] discloses memory; “and a processor device coupled to the memory to:” Garera figure 2 element 202, [0024], [0025], [0028], [0029] discloses workstations embodied as computing devices; “receive from a first requestor a first request for a machine learning training dataset comprising a plurality of objects, the plurality of objects comprising data for training a machine learning model;” Garera [0055], [0056], [0064] (also see [0052], [0053], [0054], [0085]) discloses a workstation receiving a training data request from a first requestor (i.e. a server), where the training data contains a plurality of records or entries; “and send, to the first requestor, a first group of objects from the plurality of objects,” Garera [0056], [0064] discloses the workstation transmitting training data of a classification or category to the requesting server.
Garera does not explicitly teach “determine a uniqueness characteristic for objects of the plurality of objects, the uniqueness characteristic indicative of how unique each object is relative to each other object;” and “the first group of objects being selected based at least partially on the uniqueness characteristic or sent in an order based at least partially on the uniqueness characteristic.”
However, Jayaraman teaches “determine a uniqueness characteristic for objects of the plurality of objects, the uniqueness characteristic indicative of how unique each object is relative to each other object;” Jayaraman Figure 10 elements 1020, 1024, [0118], [0143], [0156] discloses pre-processing (conditioning) of a target data set, where a sub-phase of the conditioning removes duplicate entries (deduplicates the data set) and determines a uniqueness characteristic of the entries; “the first group of objects being selected based at least partially on the uniqueness characteristic or sent in an order based at least partially on the uniqueness characteristic.” Jayaraman [0142], [0143], [0152], [0156], [0157], [0163] discloses deduplicating a subset of entries in the target data set and selecting an adequate data set which will generate a non-deficient ML model.
Garera and Jayaraman are analogous art because they are from the same field of endeavor, namely memory management.
A person of ordinary skill in the art before the effective filing date of the claimed invention would have recognized, as taught by Jayaraman, that predicting ML model failures improves performance by using a pipeline to prepare and analyze training data before building a model and by using metrics to determine whether the data is inadequate or adequate, thus pre-emptively terminating a model or generating a non-deficient model (Jayaraman [0002], [0139], [0152]). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Jayaraman’s predicting of ML model failures in the system of Garera to improve performance by using a pipeline to prepare and analyze training data before building a model and by using metrics to determine whether the data is inadequate or adequate, thus pre-emptively terminating a model or generating a non-deficient model.
As per the non-exemplary claims 1 and 17, these claims have similar limitations and are rejected based on the reasons given above.

As per claim 2, the combination of Garera and Jayaraman teaches “2. The method of claim 1 wherein determining the uniqueness characteristic comprises: obtaining each object from a machine learning training dataset storage;” Garera [0037] discloses training data stored in a database; “storing each object in a deduplication storage system;” Jayaraman Figure 10 element 1024, [0143], [0156] discloses deduplication logic; “and receiving the uniqueness characteristic for each object from the deduplication storage system.” Jayaraman Figure 10 elements 1020, 1024, [0118], [0143], [0156] discloses pre-processing (conditioning) of a target data set, where a sub-phase of the conditioning removes duplicate entries (deduplicates the data set) and determines a uniqueness characteristic of the entries.
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 1 is equally applicable to claim 2.

As per claim 3, the combination of Garera and Jayaraman teaches “3. The method of claim 1 further comprising: generating, by the computing device, an object listing structure that corresponds to the machine learning training dataset, the object listing structure identifying each object in the machine learning training dataset and identifying the uniqueness characteristic associated with each object.” Jayaraman Figure 10 elements 1024, 1028, [0144], [0146] discloses pre-processed data set entries that are deduplicated, sorted, indexed, hashed and contained in a data structure for quick access to and manipulation of elements of the data sets, where indexing speeds up the process of generating an ML model.
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 1 is equally applicable to claim 3.

As per claim 4, the combination of Garera and Jayaraman teaches “4. The method of claim 3 further comprising: receiving, by the computing device from a second requestor, a second request for the machine learning training dataset;” Garera [0024], [0055], [0064] discloses requests from a plurality of requestors (i.e. one or more servers); “determining, by the computing device, that the object listing structure that corresponds to the machine learning training dataset exists; accessing the object listing structure;” Jayaraman Figure 10 elements 1024, 1028, [0144], [0146] discloses pre-processed data set entries that are deduplicated, sorted, indexed, hashed and contained in a data structure for quick access to and manipulation of elements of the data sets, where indexing speeds up the process of generating an ML model; “and sending, to the second requestor, a second group of objects,” Garera [0024], [0055], [0064] discloses requests from a plurality of requestors (i.e. one or more servers) to a plurality of workstations to generate training data.
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 3 is equally applicable to claim 4.

As per claim 9, the combination of Garera and Jayaraman teaches “9. The method of claim 1 wherein sending the first group of the objects from the plurality of objects, the first group of objects being selected based on the uniqueness characteristic” Jayaraman [0142], [0143], [0152], [0156], [0157], [0163] discloses deduplicating a subset of entries in the target data set and selecting an adequate data set (i.e. by determining how many entries in an input data set are unique, determining that there are fewer than a threshold amount of unique entries in the de-duplicated data set, or determining that there are too few unique entries in the data set) which will generate a non-deficient ML model. Further, a data set may be remediated for a successful ML build by changing the criteria used to select the input data set from a larger set of training data; “or sent in an order based at least partially on the uniqueness characteristic comprises sending the first group of objects in an order from a highest uniqueness metric to a lowest uniqueness metric.”
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 1 is equally applicable to claim 9.

As per claim 10, the combination of Garera and Jayaraman teaches “10. The method of claim 1 wherein sending the first group of the objects from the plurality of objects, the first group of objects being selected based on the uniqueness characteristic” Jayaraman [0142], [0143], [0152], [0156], [0157], [0163] discloses deduplicating a subset of entries in the target data set and selecting an adequate data set (i.e. by determining how many entries in an input data set are unique, determining that there are fewer than a threshold amount of unique entries in the de-duplicated data set, or determining that there are too few unique entries in the data set) which will generate a non-deficient ML model. Further, a data set may be remediated for a successful ML build by changing the criteria used to select the input data set from a larger set of training data; “or sent in an order based at least partially on the uniqueness characteristic comprises sending the first group of objects in an order from a greatest uniqueness-to-size ratio to a lowest uniqueness-to-size ratio.”
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 1 is equally applicable to claim 10.

Referring to claims 6, 14 and 18, taking claim 14 as exemplary, the combination of Garera and Jayaraman teaches “14. The computing device of claim 13, wherein the first request includes object selection criteria,” Garera [0064] discloses requesting training data for identified classification values or categories; “and wherein to send the first group of objects to the first requestor in the order based at least partially on the uniqueness characteristic, the processor device is further to send the first group of objects to the first requestor in the order based at least partially on the uniqueness characteristic” Jayaraman [0142], [0143], [0152], [0156], [0157], [0163] discloses deduplicating a subset of entries in the target data set and selecting an adequate data set (i.e. by determining how many entries in an input data set are unique, determining that there are fewer than a threshold amount of unique entries in the de-duplicated data set, or determining that there are too few unique entries in the data set) which will generate a non-deficient ML model. Further, a data set may be remediated for a successful ML build by changing the criteria used to select the input data set from a larger set of training data; “in accordance with the object selection criteria.” Garera [0048], [0049], [0050], [0053], [0054] discloses generating training data relating to selected classification values.
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 13 is equally applicable to claim 14.
As per the non-exemplary claims 6 and 18, these claims have similar limitations and are rejected based on the reasons given above.

Referring to claims 11, 15 and 19, taking claim 15 as exemplary, the combination of Garera and Jayaraman teaches “15. The computing device of claim 13 wherein each object comprises a plurality of blocks, and the uniqueness characteristic comprises a uniqueness metric that is based on a number of unique blocks contained by each object, each unique block not being contained by any other object” Jayaraman [0142], [0143], [0152], [0156], [0157], [0163] discloses deduplicating a subset of entries in the target data set and selecting an adequate data set (i.e. by determining how many entries in an input data set are unique, determining that there are fewer than a threshold amount of unique entries in the de-duplicated data set, or determining that there are too few unique entries in the data set) which will generate a non-deficient ML model.
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 13 is equally applicable to claim 15.
As per the non-exemplary claims 11 and 19, these claims have similar limitations and are rejected based on the reasons given above.
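
For illustration only, the block-based uniqueness metric recited in claim 15 and the highest-to-lowest ordering recited in claim 9 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from the cited references; the function names, the dictionary representation of objects, and the sample block identifiers are all hypothetical.

```python
from collections import Counter


def unique_block_counts(objects):
    """Map each object name to the number of its blocks found in no other object.

    `objects` is a hypothetical mapping of object name -> list of block identifiers.
    """
    # For every block, count how many distinct objects contain it.
    owners = Counter()
    for blocks in objects.values():
        owners.update(set(blocks))
    # A block is "unique" when exactly one object contains it.
    return {
        name: sum(1 for block in set(blocks) if owners[block] == 1)
        for name, blocks in objects.items()
    }


def order_by_uniqueness(objects):
    """Return object names ordered from highest to lowest uniqueness metric."""
    counts = unique_block_counts(objects)
    return sorted(objects, key=lambda name: counts[name], reverse=True)


# Hypothetical sample data: three objects sharing some blocks.
objects = {
    "a": ["b1", "b2", "b3", "b5"],
    "b": ["b1", "b4"],
    "c": ["b1", "b2"],
}
print(unique_block_counts(objects))
print(order_by_uniqueness(objects))
```

In this sample, object "c" contains no block that is absent from the other objects, so under the assumption above it would carry a uniqueness metric of zero and would sort last.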

Referring to claims 5, 16 and 20, taking claim 16 as exemplary, the combination of Garera and Jayaraman teaches “16. The computing device of claim 13 wherein the processor device is further to: determine that at least one object in the plurality of objects contains no unique data; and not include the at least one object in the first group of objects based on determining that the at least one object in the plurality of objects contains no unique data” Jayaraman [0118], [0143] discloses a pre-processing phase that determines duplicate entries in the training data set and a deduplication sub-phase that removes the duplicate training data set entries.
The same motivation that was utilized for combining Garera and Jayaraman as set forth in claim 13 is equally applicable to claim 16.
As per the non-exemplary claims 5 and 20, these claims have similar limitations and are rejected based on the reasons given above.


Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Garera (U.S. Publication Number 2014/0314311) in view of Jayaraman (U.S. Publication Number 2020/0234162) in further view of Pourmohammad (U.S. Publication Number 2019/0138512).

As per claim 12, the combination of Garera and Jayaraman teaches all the limitations of claim 1, from which claim 12 depends.
Additionally, the combination of Garera and Jayaraman teaches “and wherein the first request is directed toward a machine learning training dataset storage,” Garera [0037], [0055], [0056], [0064] (also see [0052], [0053], [0054], [0085]) discloses a workstation receiving a training data request from a first requestor (i.e. a server), where the training data is stored in a database.
The combination of Garera and Jayaraman does not explicitly teach “a reverse proxy,” and “the reverse proxy intercepting the first request and sending the first group of objects to the first requestor in a manner that is transparent to the first requestor.”
However, Pourmohammad teaches “a reverse proxy,” and “the reverse proxy intercepting the first request and sending the first group of objects to the first requestor in a manner that is transparent to the first requestor.” Pourmohammad [0262]-[0264] discloses requests sent from an operator to an engine in order to obtain data categories. The operator can talk to a server configured to work as a reverse proxy, which relays all incoming requests to an underlying server, and the reverse proxy exposes the engine to the operator and returns classification results.
Garera, Jayaraman and Pourmohammad are analogous art because they are from the same field of endeavor, namely data storage.
A person of ordinary skill in the art before the effective filing date of the claimed invention would have recognized, as taught by Pourmohammad, that an analytics system improves performance by being configured for risk analysis and, via a data ingestion service, receiving, collecting and pulling threat data to provide solid security and protect assets (Pourmohammad [0212], [0230], [0258], [0259], [0264]). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Pourmohammad’s analytics system in the system of Garera and Jayaraman to improve performance by configuring the analytics system for risk analysis and, via a data ingestion service, receiving, collecting and pulling threat data to provide solid security and protect assets.


Allowable Subject Matter

Claims 7 and 8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
After careful consideration, examination, and search of the claimed invention, prior art was not found to teach the limitations “wherein the object selection criteria requests a number N of the most unique objects, and further comprising selecting only the number N objects that have a highest uniqueness metric for the first group of objects” and “wherein the object selection criteria requests all objects having a uniqueness metric greater than X, and further comprising selecting only the objects that have a uniqueness metric greater than X for the first group of objects.”
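
For illustration only, the two selection criteria quoted above (the N most unique objects, and all objects with a uniqueness metric greater than X) can be sketched as follows. This is a minimal sketch that assumes a uniqueness metric has already been computed for each object; the function names, the `metrics` mapping, and the sample values are hypothetical and are not taken from the application or from the cited references.

```python
def select_top_n(metrics, n):
    """Return the names of the n objects with the highest uniqueness metric."""
    ranked = sorted(metrics, key=lambda name: metrics[name], reverse=True)
    return ranked[:n]


def select_above(metrics, x):
    """Return the names of all objects whose uniqueness metric is greater than x."""
    return [name for name, metric in metrics.items() if metric > x]


# Hypothetical per-object uniqueness metrics.
metrics = {"a": 5, "b": 2, "c": 9, "d": 0}
print(select_top_n(metrics, 2))
print(select_above(metrics, 2))
```

The threshold comparison is strict, matching the quoted "greater than X" language, so an object whose metric equals X is not selected.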


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAHILBA O PUCHE whose telephone number is (571)272-9163. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi, can be reached on 07519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/TAHILBA O PUCHE/Examiner, Art Unit 2132
06/03/2022
/DAVID YI/Supervisory Patent Examiner, Art Unit 2132