DETAILED ACTION
Response to Arguments
Claim Rejections - 35 USC § 112(a)
Applicants’ arguments with respect to the rejection of claim 7 under 35 USC § 112(a) have been fully considered but they are not persuasive.
The applicants argued that the recited limitation “generating, utilizing a different machine learning model, a second set of labels corresponding to the dataset, wherein the second set of labels is different than the set of labels generated by the machine learning model; filtering the dataset using a second set of conditions to generate at least a second subset of the dataset; generating a second virtual object based at least in part on the second subset of the dataset and the second set of labels; and training a third machine learning model using the second virtual object and at least the second subset of the dataset” was described in paragraphs [0049] & [0114] of the specification.
The examiner respectfully points out that paragraph [0049] of the specification teaches the first and second machine learning models as recited in claim 1, e.g., a first machine learning application can generate a first annotation object with a first set of labels for a particular dataset, while a second machine learning application can generate a second annotation object with a different set of labels for the same dataset as used by the first machine learning application. These respective machine learning applications can then generate different split objects and/or package objects that are applicable for training their respective machine learning models. Paragraph [0114] also teaches the features of the first and second machine learning models as recited in claim 1.
As noted, paragraphs [0049] & [0114] of the specification do not teach a different machine learning model, i.e., a model that is neither the first nor the second machine learning model, nor do they teach training a third machine learning model using the second virtual object and at least the second subset of the dataset of the different machine learning model.

Applicants’ arguments with respect to the rejection of claim 8 under 35 USC § 112(a) have been fully considered but they are not persuasive.
The applicants argued that the recited limitation “training the second machine learning model based at least in part on a first dataset corresponding to a query on the dataset, the query being provided by the virtual object and the first dataset comprising the subset of the dataset; and validating the second machine learning model based at least in part on a second dataset corresponding to a second query on the dataset provided by the virtual object” was described in paragraph [0115] of the specification.
The examiner respectfully points out that paragraph [0115] of the specification teaches that electronic device 110 trains a second machine learning model using the virtual object and at least the subset of the dataset, wherein the virtual object corresponds to a selection of data (e.g., defining columns of the view) similar to a particular query of the dataset (1516). In an example, the virtual object (e.g., the package) is based at least in part on a particular query with SQL-like commands such as defining a selection of columns in the dataset and/or joining data from annotations and/or splits objects.
As noted, paragraph [0115] clearly indicates that the second machine learning model is trained based at least in part on a first dataset corresponding to a query on the dataset, wherein the first dataset corresponding to a query is the virtual object. Paragraph [0115] does not teach the query being provided by the virtual object and the first dataset comprising the subset of the dataset, or validating the second machine learning model based at least in part on a second dataset corresponding to a second query on the dataset provided by the virtual object.
  
Applicants’ arguments with respect to the rejection of claim 10 under 35 USC § 112(a) have been fully considered but they are not persuasive.
The applicants argued that the recited limitation “the second machine learning model provides a prediction using a second dataset as input” was described in paragraphs [0022] & [0107] of the specification.
The examiner respectfully points out that paragraph [0022] of the specification teaches that machine learning may utilize models that are executed to provide predictions in particular applications (e.g., analyzing images and videos) among many other types of applications. Paragraph [0107] teaches that ML training experiments may target only a subset of the entire dataset, e.g., to train a model to classify the dog breeds, and a ML model may only be interested in the dog images from the entire computer vision dataset. After identifying the image IDs, the actual images might be scattered across many partitions, the data block layout design will allow a client to stream only those data blocks of interest.
As noted, paragraphs [0022] & [0107] of the specification do not teach that a second dataset is used as input for a prediction provided by the second machine learning model as recited.

For at least the reasons as noted, the rejection of claims 7, 8 & 10 under 35 USC § 112(a) is sustained. Claims 18 & 19 include features analogous to claims 8 & 9. Claim 9 depends from claim 8. The rejection of claims 9, 18 & 19 is sustained for at least the reasons as noted with regard to claims 7, 8 & 10.

Claim Rejections - 35 USC § 112(b)
Claim 8 was rejected under 35 USC § 112(b) and was amended. The examiner respectfully points out that amended claim 8 is still indefinite.
As recited in claim 1, the second machine learning model is trained by using the virtual object and at least the subset of the dataset. As disclosed in the specification (paragraph [0115]), electronic device 110 trains a second machine learning model using the virtual object and at least the subset of the dataset, wherein the virtual object corresponds to a selection of data (e.g., defining columns of the view) similar to a particular query of the dataset (1516). In an example, the virtual object (e.g., the package) is based at least in part on a particular query with SQL-like commands such as defining a selection of columns in the dataset and/or joining data from annotations and/or splits objects.
Paragraph [0115] as noted indicates that a first dataset corresponding to a query is the virtual object, and the second machine learning model is trained using either the virtual object OR a first dataset corresponding to a query on the dataset.
If claim 8 is incorporated into claim 1, it is unclear whether the second machine learning model is trained using the virtual object and at least the subset of the dataset as recited in claim 1, or is based at least in part on a first dataset corresponding to a query on the dataset, the query being provided by the virtual object and the first dataset comprising the subset of the dataset as recited in claim 8.

For at least the reasons as noted, the rejection of claim 8 under 35 USC § 112(b) is sustained. Claims 18 & 19 include features analogous to claim 8. Claim 9 depends from claim 8. The rejection of claims 9, 18 & 19 is sustained for at least the reasons as noted with regard to claim 8.

Claim Rejections - 35 USC § 102
Applicants’ arguments with respect to the rejection of claim 1 under 35 USC § 102 have been fully considered but they are not persuasive.
The applicants argued that claim 1 recites, in part, “training a second machine learning model using the virtual object and at least the subset of the dataset, wherein training the second machine learning model includes utilizing streaming file input/output (I/O), the streaming file I/O providing on demand access to at least the subset of the dataset during training.” The cited portions of Tappen do not disclose or suggest at least these features of independent claim 1… Tappen discloses that “a second feature transformation is performed using a second training set comprising at least some of the images that were used to train the first feature transformation.” However, the cited portions of Tappen do not expressly or inherently disclose any “streaming file input/output,” nor that “the streaming file I/O providing on-demand access to at least the subset of the dataset during training,” as recited in independent claim 1.
The examiner respectfully disagrees.
As taught in TAPPEN, a second feature transformation is trained using the cluster of labels and a cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50), wherein training the second feature transformation is based on training inputs and training outputs, and wherein the cluster of the training set of images is accessed by the training inputs and outputs during training (TAPPEN, Col. 15-Lines 13→24).
The training inputs and outputs as taught in TAPPEN are considered equivalent to the claimed streaming file input/output (I/O), and access to the cluster of the training set of images is considered equivalent to the claimed on-demand access.
In short, TAPPEN’s teaching reads on the wherein clause as recited in claim 1: “wherein training the second machine learning model includes utilizing streaming file input/output (I/O), the streaming file I/O providing on-demand access to at least the subset of the dataset during training,” e.g., training the second feature transformation is based on training inputs and training outputs, wherein the cluster of the training set of images is accessed by the training inputs and outputs during training (TAPPEN, Col. 15-Lines 13→24).

Applicants’ arguments with respect to the rejection of claim 11 under 35 USC § 102 have been fully considered but they are not persuasive.
The applicants argued that the cited portions do not expressly or inherently disclose at least, for example, “providing a file system view of raw files from the subset of the dataset to provide streaming access to the raw files,” as recited in independent claim 11.
The examiner respectfully disagrees.
As taught in TAPPEN, at box 360, a second feature transformation is performed using a second training set, wherein the training inputs include the images. At box 370, training outputs are received (TAPPEN, Col. 15-Lines 5→29).
The training inputs and outputs as taught in TAPPEN are considered equivalent to the claimed file system view of raw files.
The teaching in TAPPEN as noted reads on the limitation “providing a file system view of raw files from the subset of the dataset,” e.g., training inputs/outputs of images from the at least one cluster of the training set of images are provided, “to provide streaming access to the raw files,” e.g., the purpose is to provide access to the images for training the second machine learning tool.

Applicants’ arguments with respect to the rejection of claim 20 under 35 USC § 102 have been fully considered but they are not persuasive.
The applicants argued that Tappen discloses that “the second feature transformation may determine a subset of the labels or categories of the cluster 120 with which the image 110 is most likely associated.” However, the cited portions of Tappen do not expressly or inherently disclose “training a third machine learning model using another virtual object generated from the subset of the dataset object and another annotation object corresponding to another set of labels for the dataset object, the third machine learning model being independent of the second machine learning model,” as recited in independent claim 20.
The examiner respectfully disagrees.
TAPPEN’s teaching reads on the newly added limitation as shown below.

| CLAIM 20 | TAPPEN et al. |
| --- | --- |
| training a third machine learning model using | a fourth feature transformation is trained (TAPPEN, Col. 20-Lines 6→11) using |
| another virtual object generated from the subset of the dataset object and | a category generated from an image of the training set of images at the third feature transformation (TAPPEN, Col. 20-Lines 6→11), |
| another annotation object corresponding to another set of labels for the dataset object, | a third set of labels corresponding to labels 644 for the training set of images (TAPPEN, Col. 20-Lines 6→11), |
| the third machine learning model being independent of the second machine learning model. | wherein the fourth feature transformation is independent of the second machine learning tool (TAPPEN, Col. 2-Lines 44→50). |



Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 7-10 & 18-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 

Regarding claim 7, the limitation “generating, utilizing a different machine learning model, a second set of labels corresponding to the dataset, wherein the second set of labels is different than the set of labels generated by the machine learning model; filtering the dataset using a second set of conditions to generate at least a second subset of the dataset; generating a second virtual object based at least in part on the second subset of the dataset and the second set of labels; and training a third machine learning model using the second virtual object and at least the second subset of the dataset” was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Regarding claim 8, the limitation “training the second machine learning model based at least in part on a first dataset corresponding to a query on the dataset provided by the virtual object; and validating the second machine learning model based at least in part on a second dataset corresponding to a second query on the dataset provided by the virtual object” was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Dependent claim 9 is rejected for at least the reasons as noted with regard to claim 8.

Regarding claim 10, the limitation “the second machine learning model provides a prediction using a second dataset as input” was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claims 18 & 19 include features analogous to claims 8 & 9. Claims 18 & 19 are rejected for at least the reasons as noted with regard to claims 8 & 9.

Regarding claim 20, the limitation “training a third machine learning model using another virtual object generated from the subset of the dataset object and another annotation object corresponding to another set of labels for the dataset object, the third machine learning model being independent of the second machine learning model” was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8-9 & 18-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 8, it is unclear whether the second machine learning model is trained by using the virtual object and at least the subset of the dataset as recited in claim 1, or is based at least in part on a first dataset corresponding to a query on the dataset, the query being provided by the virtual object and the first dataset comprising the subset of the dataset as recited in claim 8. For at least the reasons as noted, the feature as recited in claim 8 is considered an optional feature.

Dependent claim 9 is rejected for at least the reasons as noted with regard to claim 8.

Claims 18 & 19 include features analogous to claims 8 & 9. Claims 18 & 19 are rejected for at least the reasons as noted with regard to claims 8 & 9.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 3-11 & 13-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by TAPPEN et al. [US 9,704,054 B1], hereinafter referred to as TAPPEN.

Regarding claims 1, 11 & 20, TAPPEN teaches a system and program to perform a method. The method as taught in TAPPEN reads on the method of claims 1, 11 & 20 as shown below.

| CLAIM 1 | TAPPEN et al. |
| --- | --- |
| A method comprising: | A method comprising: |
| generating a dataset based at least in part on a set of files; | a training set of images is generated based on at least one set of images (TAPPEN, FIG. 3 & Col. 14-Lines 8→11); |
| generating, utilizing a machine learning model, a set of labels corresponding to the dataset, wherein the machine learning model is pre-trained based at least in part on a portion of the dataset; | a set of labels corresponding to the training set of images is generated (TAPPEN, FIG. 3 & Col. 14-Lines 30→34) based on a first feature transformation (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), wherein the first feature transformation is a first machine learning tool (TAPPEN, FIG. 3 & Col. 2-Lines 18→27) that is trained based on the training set of images (TAPPEN, FIG. 3 & Col. 14-Lines 21→22); |
| filtering the dataset using a set of conditions to generate at least a subset of the dataset; | the training set of images is sorted using a set of attributes to generate at least one cluster of the training set of images (TAPPEN, FIG. 3 & Col. 7-Lines 53→57 & Col. 14-Lines 42→58); |
| generating a virtual object based at least in part on the subset of the dataset and the set of labels, | a cluster of labels is generated based on the at least one cluster of the training set of images and the set of labels (TAPPEN, FIG. 3 & Col. 8-Lines 30→42 & Col. 14-Lines 59→62), |
| wherein the virtual object corresponds to a selection of data from the dataset; and | wherein the cluster of labels corresponds to selected images from the training set of images (TAPPEN, FIG. 4 & Col. 14-Lines 42→62); and |
| training a second machine learning model using the virtual object and at least the subset of the dataset, | a second feature transformation is trained using the cluster of labels and the at least one cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50), |
| wherein training the second machine learning model includes utilizing streaming file input/output (I/O), the streaming file I/O providing on-demand access to at least the subset of the dataset during training. | wherein training the second feature transformation is based on training inputs and training outputs, wherein the cluster of the training set of images is accessed by the training inputs and outputs during training (TAPPEN, Col. 15-Lines 13→24). |

| CLAIM 11 | TAPPEN et al. |
| --- | --- |
| A system comprising: | A system comprising: |
| a processor; | a processor (TAPPEN, Col. 9-Lines 39→55); |
| a memory device containing instructions, which when executed by the processor cause the processor to: | a memory device containing instructions, which when executed by the processor cause the processor to (TAPPEN, Col. 13-Lines 23→36): |
| generate a dataset based at least in part on a set of files; | a training set of images is generated based on at least one set of images (TAPPEN, FIG. 3 & Col. 14-Lines 8→11); |
| generate, utilizing a machine learning model, a set of labels corresponding to the dataset, wherein the machine learning model is pre-trained based at least in part on a portion of the dataset; | a set of labels corresponding to images is generated (TAPPEN, FIG. 3 & Col. 14-Lines 30→34) based on a first feature transformation (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), wherein the first feature transformation is a first machine learning tool (TAPPEN, FIG. 3 & Col. 2-Lines 18→27) that is trained based on the training set of images (TAPPEN, FIG. 3 & Col. 14-Lines 21→22); |
| filter the dataset using a set of conditions to generate at least a subset of the dataset; | the training set of images is sorted using a set of attributes to generate at least one cluster of the training set of images (TAPPEN, FIG. 3 & Col. 7-Lines 53→57 & Col. 14-Lines 42→58); |
| generate a virtual object based at least in part on the subset of the dataset and the set of labels; and | a cluster of labels is generated based on the at least one cluster of the training set of images and the set of labels (TAPPEN, FIG. 3 & Col. 8-Lines 30→42 & Col. 14-Lines 59→62), |
| train a second machine learning model using the virtual object and at least the subset of the dataset, | a second feature transformation is trained using the cluster of labels and the at least one cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50), |
| wherein to train the second machine learning model includes providing a file system view of raw files from the subset of the dataset to provide streaming access to the raw files. | wherein training inputs/outputs of images from the at least one cluster of the training set of images are provided (TAPPEN, Col. 15-Lines 13→24), the purpose being to provide access to the images for training the second machine learning tool (TAPPEN, Col. 15-Lines 5→29). |

| CLAIM 20 | TAPPEN et al. |
| --- | --- |
| A non-transitory computer-readable medium comprising instructions, which when executed by a computing device, cause the computing device to perform operations comprising: | A non-transitory computer-readable medium comprising instructions, which when executed by a computing device, cause the computing device to perform operations comprising (TAPPEN, Col. 13-Lines 23→36): |
| generating a dataset object based at least in part on a set of files; | a training set of images is generated based on at least one set of images (TAPPEN, FIG. 3 & Col. 14-Lines 8→11); |
| generating, utilizing a machine learning model, an annotation object corresponding to the dataset object, the annotation object corresponding to a set of labels for the dataset object, wherein the machine learning model is pre-trained based at least in part on a portion of the dataset object; | a set of labels corresponding to the training set of images is generated (TAPPEN, FIG. 3 & Col. 14-Lines 30→34) based on a first feature transformation (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), wherein the first feature transformation is a first machine learning tool (TAPPEN, FIG. 3 & Col. 2-Lines 18→27) that is trained based on the training set of images (TAPPEN, FIG. 3 & Col. 14-Lines 21→22); |
| filtering the dataset using a set of conditions to generate a split object, the split object corresponding to at least a subset of the dataset; | the training set of images is sorted using a set of attributes to generate at least one cluster of the training set of images (TAPPEN, FIG. 3 & Col. 7-Lines 53→57 & Col. 14-Lines 42→58); |
| generating a virtual object based at least in part on the subset of the dataset object and the annotation object; and | a cluster of labels is generated based on the at least one cluster of the training set of images and the set of labels (TAPPEN, FIG. 3 & Col. 8-Lines 30→42 & Col. 14-Lines 59→62); |
| training a second machine learning model using the virtual object and at least the split object; | a second feature transformation is trained using the cluster of labels and the at least one cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50); |
| training a third machine learning model using | a fourth feature transformation is trained (TAPPEN, Col. 20-Lines 6→11) using |
| another virtual object generated from the subset of the dataset object and | a category generated from an image of the training set of images at the third feature transformation (TAPPEN, Col. 20-Lines 6→11), |
| another annotation object corresponding to another set of labels for the dataset object, | a third set of labels corresponding to labels 644 for the training set of images (TAPPEN, Col. 20-Lines 6→11), |
| the third machine learning model being independent of the second machine learning model. | wherein the fourth feature transformation is independent of the second machine learning tool (TAPPEN, Col. 2-Lines 44→50). |



Regarding claims 3 & 13, TAPPEN further teaches that the set of files represents an abstraction of raw data that is stored remotely in cloud storage (TAPPEN, FIG. 4 & Col. 12-Lines 10→20), and the machine learning model is pre-trained (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), and the method further comprising: providing the second machine learning model for execution at a local electronic device or at a remote server (TAPPEN, Col. 13-Line 23→Col. 14-Line 20).

Regarding claims 4 & 14, TAPPEN further teaches that the set of labels comprises metadata corresponding to extracted features or supplementary properties of the dataset (TAPPEN, FIG. 4 & Col. 2-Lines 20→21).

Regarding claims 5 & 15, TAPPEN further teaches the step of creating a split object based at least in part on the filtering the dataset using the set of conditions, the split object comprising the subset of the dataset and a second subset of the dataset, e.g., a collection of data set is generated comprising training set and testing set (TAPPEN, Col. 14-Lines 21→41).

Regarding claims 6 & 16, TAPPEN further teaches that the subset of the dataset comprises training data and the second subset of the dataset comprises validation data, the training data and the validation data comprising respective mutually exclusive subsets of the dataset (TAPPEN, Col. 14-Lines 21→41).

Regarding claim 7, TAPPEN further teaches that the set of files include raw data that is used as inputs for evaluation of the machine learning model (TAPPEN, Col. 14-Lines 34→41), and further comprising: generating, utilizing a different machine learning model, a second set of labels corresponding to the dataset, wherein the second set of labels is different than the set of labels generated by the machine learning model; filtering the dataset using a second set of conditions to generate at least a second subset of the dataset; generating a second virtual object based at least in part on the second subset of the dataset and the second set of labels; and training a third machine learning model using the second virtual object and at least the second subset of the dataset (TAPPEN, FIG. 5, Col. 17-Line 24→Col. 18-Line 56).

Regarding claims 8, 9, 18 & 19, the features as recited are optional features. Therefore, whether or not TAPPEN discloses the features as recited in claims 8, 9, 18 & 19, TAPPEN’s teaching still reads on the claimed invention.

Regarding claim 10, TAPPEN further teaches that the second machine learning model provides a prediction using a second dataset as input (TAPPEN, Col. 7-Lines 16→22).

Regarding claim 17, TAPPEN further teaches that the set of files includes raw data that is used as inputs for evaluation of the machine learning model (TAPPEN, Col. 14-Lines 34→41).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 2 & 12 are rejected under 35 U.S.C. 103 as being unpatentable over TAPPEN et al. [US 9,704,054 B1], hereinafter referred to as TAPPEN, in view of LIU et al. [US 2014/0108471 A1], hereinafter referred to as LIU.

Regarding claim 2, TAPPEN does not explicitly teach the step of performing a mount command to provide access to raw files from the subset of the dataset, the mount command enabling streaming access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations.
LIU teaches that a mount command is performed to provide access to raw files from the subset of the dataset, the mount command enabling streaming access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations (LIU, [0014]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of LIU into TAPPEN in order to manage access to one or more storage devices.

Regarding claim 12, TAPPEN does not explicitly teach the limitation of performing a mount command to provide access to raw files from the subset of the dataset in a logical file system, wherein the mount command provides the file system view of the raw files, the file system view enabling access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations.
LIU teaches that a mount command is performed to provide access to raw files from the subset of the dataset in a logical file system, wherein the mount command provides the file system view of the raw files, the file system view enabling access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations (LIU, [0014]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of LIU into TAPPEN in order to manage access to one or more storage devices.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNG Q. PHAM whose telephone number is (571)272-4040. The examiner can normally be reached Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D. Reyes, can be reached at 571-270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

HUNG Q. PHAM
Primary Examiner
Art Unit 2159

/HUNG Q PHAM/Primary Examiner, Art Unit 2159
September 30, 2022