DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 7-10 & 18-19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Regarding claim 7, the limitation "generating, utilizing a different machine learning model, a second set of labels corresponding to the dataset, wherein the second set of labels is different than the set of labels generated by the machine learning model; filtering the dataset using a second set of conditions to generate at least a second subset of the dataset; generating a second virtual object based at least in part on the second subset of the dataset and the second set of labels; and training a third machine learning model using the second virtual object and at least the second subset of the dataset" was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Regarding claim 8, the limitation "training the second machine learning model based at least in part on a first dataset corresponding to a query on the dataset provided by the virtual object; and validating the second machine learning model based at least in part on a second dataset corresponding to a second query on the dataset provided by the virtual object" was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Dependent claim 9 is rejected for at least the reasons as noted with regard to claim 8.

Regarding claim 10, the limitation "the second machine learning model provides a prediction using a second dataset as input" was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claims 18 & 19 include features analogous to claims 8 & 9. Claims 18 & 19 are rejected for at least the reasons as noted with regard to claims 8 & 9.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8-9 & 18-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 8:
1.	It is unclear whether the second machine learning model is trained using the virtual object and at least the subset of the dataset, as recited in claim 1, or is trained based at least in part on a first dataset corresponding to a query on the dataset provided by the virtual object, as recited in claim 8. For at least the reasons noted, the feature recited in claim 8 is considered an optional feature;
2.	It is unclear whether the dataset is generated based at least in part on a set of files (i.e., generating a dataset based at least in part on a set of files), as recited in claim 1, or is provided by the virtual object (i.e., the dataset provided by the virtual object), as recited in claim 8.

Dependent claim 9 is rejected for at least the reasons as noted with regard to claim 8.

Claims 18 & 19 include features analogous to claims 8 & 9. Claims 18 & 19 are rejected for at least the reasons as noted with regard to claims 8 & 9.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 3-11 & 13-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by TAPPEN et al. [US 9,704,054 B1], hereinafter referred to as TAPPEN.

Regarding claims 1, 11 & 20, TAPPEN teaches a system and program to perform a method. The method as taught in TAPPEN reads on the method of claims 1, 11 & 20 as shown below.

CLAIM 1 / TAPPEN et al.

Claim 1: A method comprising:
TAPPEN: A method comprising:

Claim 1: generating a dataset based at least in part on a set of files;
TAPPEN: a training set of images is generated based on at least one set of images (TAPPEN, FIG. 3 & Col. 14-Lines 8→11);

Claim 1: generating, utilizing a machine learning model, a set of labels corresponding to the dataset, wherein the machine learning model is pre-trained based at least in part on a portion of the dataset;
TAPPEN: a set of labels corresponding to the training set of images is generated (TAPPEN, FIG. 3 & Col. 14-Lines 30→34) based on a first feature transformation (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), wherein the first feature transformation is a first machine learning tool (TAPPEN, FIG. 3 & Col. 2-Lines 18→27) that is trained based on the training set of images (TAPPEN, FIG. 3 & Col. 14-Lines 21→22);

Claim 1: filtering the dataset using a set of conditions to generate at least a subset of the dataset;
TAPPEN: the training set of images is sorted using a set of attributes to generate at least one cluster of the training set of images (TAPPEN, FIG. 3 & Col. 7-Lines 53→57 & Col. 14-Lines 42→58);

Claim 1: generating a virtual object based at least in part on the subset of the dataset and the set of labels,
TAPPEN: a cluster of labels is generated based on the at least one cluster of the training set of images and the set of labels (TAPPEN, FIG. 3 & Col. 8-Lines 30→42 & Col. 14-Lines 59→62),

Claim 1: wherein the virtual object corresponds to a selection of data from the dataset; and
TAPPEN: wherein the cluster of labels corresponds to selected images from the training set of images (TAPPEN, FIG. 4 & Col. 14-Lines 42→62); and

Claim 1: training a second machine learning model using the virtual object and at least the subset of the dataset,
TAPPEN: a second feature transformation is trained using the cluster of labels and the at least one cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50),

Claim 1: wherein training the second machine learning model includes utilizing streaming file input/output (I/O), the streaming file I/O providing access to at least the subset of the dataset during training.
TAPPEN: wherein training the second feature transformation is based on training data input/output (I/O), the training data I/O providing access to the at least one cluster of the training set of images during training (TAPPEN, Col. 15-Lines 13→24).

CLAIM 11 / TAPPEN et al.

Claim 11: A system comprising:
TAPPEN: A system comprising:

Claim 11: a processor;
TAPPEN: a processor (TAPPEN, Col. 9-Lines 39→55);

Claim 11: a memory device containing instructions, which when executed by the processor cause the processor to:
TAPPEN: a memory device containing instructions, which when executed by the processor cause the processor to (TAPPEN, Col. 13-Lines 23→36):

Claim 11: generate a dataset based at least in part on a set of files;
TAPPEN: a training set of images is generated based on at least one set of images (TAPPEN, FIG. 3 & Col. 14-Lines 8→11);

Claim 11: generate, utilizing a machine learning model, a set of labels corresponding to the dataset, wherein the machine learning model is pre-trained based at least in part on a portion of the dataset;
TAPPEN: a set of labels corresponding to the training set of images is generated (TAPPEN, FIG. 3 & Col. 14-Lines 30→34) based on a first feature transformation (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), wherein the first feature transformation is a first machine learning tool (TAPPEN, FIG. 3 & Col. 2-Lines 18→27) that is trained based on the training set of images (TAPPEN, FIG. 3 & Col. 14-Lines 21→22);

Claim 11: filter the dataset using a set of conditions to generate at least a subset of the dataset;
TAPPEN: the training set of images is sorted using a set of attributes to generate at least one cluster of the training set of images (TAPPEN, FIG. 3 & Col. 7-Lines 53→57 & Col. 14-Lines 42→58);

Claim 11: generate a virtual object based at least in part on the subset of the dataset and the set of labels; and
TAPPEN: a cluster of labels is generated based on the at least one cluster of the training set of images and the set of labels (TAPPEN, FIG. 3 & Col. 8-Lines 30→42 & Col. 14-Lines 59→62),

Claim 11: train a second machine learning model using the virtual object and at least the subset of the dataset,
TAPPEN: a second feature transformation is trained using the cluster of labels and the at least one cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50),

Claim 11: wherein to train the second machine learning model includes providing a file system view of raw files from the subset of the dataset.
TAPPEN: wherein training data input/output (I/O) is provided to access the at least one cluster of the training set of images during training (TAPPEN, Col. 15-Lines 13→24).

CLAIM 20 / TAPPEN et al.

Claim 20: A non-transitory computer-readable medium comprising instructions, which when executed by a computing device, cause the computing device to perform operations comprising:
TAPPEN: A non-transitory computer-readable medium comprising instructions, which when executed by a computing device, cause the computing device to perform operations comprising (TAPPEN, Col. 13-Lines 23→36):

Claim 20: generating a dataset object based at least in part on a set of files;
TAPPEN: a training set of images is generated based on at least one set of images (TAPPEN, FIG. 3 & Col. 14-Lines 8→11);

Claim 20: generating, utilizing a machine learning model, an annotation object corresponding to the dataset object, the annotation object corresponding to a set of labels for the dataset object, wherein the machine learning model is pre-trained based at least in part on a portion of the dataset object;
TAPPEN: a set of labels corresponding to the training set of images is generated (TAPPEN, FIG. 3 & Col. 14-Lines 30→34) based on a first feature transformation (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), wherein the first feature transformation is a first machine learning tool (TAPPEN, FIG. 3 & Col. 2-Lines 18→27) that is trained based on the training set of images (TAPPEN, FIG. 3 & Col. 14-Lines 21→22);

Claim 20: filtering the dataset using a set of conditions to generate a split object, the split object corresponding to at least a subset of the dataset;
TAPPEN: the training set of images is sorted using a set of attributes to generate at least one cluster of the training set of images (TAPPEN, FIG. 3 & Col. 7-Lines 53→57 & Col. 14-Lines 42→58);

Claim 20: generating a virtual object based at least in part on the subset of the dataset object and the annotation object; and
TAPPEN: a cluster of labels is generated based on the at least one cluster of the training set of images and the set of labels (TAPPEN, FIG. 3 & Col. 8-Lines 30→42 & Col. 14-Lines 59→62);

Claim 20: training a second machine learning model using the virtual object and at least the split object.
TAPPEN: a second feature transformation is trained using the cluster of labels and the at least one cluster of the training set of images (TAPPEN, Col. 15-Lines 5→29), wherein the second feature transformation is a second machine learning tool (TAPPEN, Col. 2-Lines 44→50).



Regarding claims 3 & 13, TAPPEN further teaches that the set of files represents an abstraction of raw data that is stored remotely in cloud storage (TAPPEN, FIG. 4 & Col. 12-Lines 10→20), and the machine learning model is pre-trained (TAPPEN, FIG. 3 & Col. 14-Lines 21→22), and the method further comprising: providing the second machine learning model for execution at a local electronic device or at a remote server (TAPPEN, Col. 13-Line 23→Col. 14-Line 20).

Regarding claims 4 & 14, TAPPEN further teaches that the set of labels comprises metadata corresponding to extracted features or supplementary properties of the dataset (TAPPEN, FIG. 4 & Col. 2-Lines 20→21).

Regarding claims 5 & 15, TAPPEN further teaches the step of creating a split object based at least in part on the filtering the dataset using the set of conditions, the split object comprising the subset of the dataset and a second subset of the dataset, e.g., a collection of data sets is generated comprising a training set and a testing set (TAPPEN, Col. 14-Lines 21→41).

Regarding claims 6 & 16, TAPPEN further teaches that the subset of the dataset comprises training data and the second subset of the dataset comprises validation data, the training data and the validation data comprising respective mutually exclusive subsets of the dataset (TAPPEN, Col. 14-Lines 21→41).

Regarding claim 7, TAPPEN further teaches that the set of files include raw data that is used as inputs for evaluation of the machine learning model (TAPPEN, Col. 14-Lines 34→41), and further comprising: generating, utilizing a different machine learning model, a second set of labels corresponding to the dataset, wherein the second set of labels is different than the set of labels generated by the machine learning model; filtering the dataset using a second set of conditions to generate at least a second subset of the dataset; generating a second virtual object based at least in part on the second subset of the dataset and the second set of labels; and training a third machine learning model using the second virtual object and at least the second subset of the dataset (TAPPEN, FIG. 5, Col. 17-Line 24→Col. 18-Line 56).

Regarding claims 8, 9, 18 & 19, the features as recited are optional features. Therefore, regardless of whether TAPPEN discloses the features as recited in claims 8, 9, 18 & 19, TAPPEN’s teaching still reads on the claimed invention.

Regarding claim 10, TAPPEN further teaches that the second machine learning model provides a prediction using a second dataset as input (TAPPEN, Col. 7-Lines 16→22).

Regarding claim 17, TAPPEN further teaches that the set of files includes raw data that is used as inputs for evaluation of the machine learning model (TAPPEN, Col. 14-Lines 34→41).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 2 & 12 are rejected under 35 U.S.C. 103 as being unpatentable over TAPPEN et al. [US 9,704,054 B1], hereinafter referred to as TAPPEN, in view of LIU et al. [US 2014/0108471 A1], hereinafter referred to as LIU.

Regarding claim 2, TAPPEN does not explicitly teach the step of performing a mount command to provide access to raw files from the subset of the dataset, the mount command enabling streaming access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations.
LIU teaches that a mount command is performed to provide access to raw files from the subset of the dataset, the mount command enabling streaming access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations (LIU, [0014]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of LIU into TAPPEN in order to manage access to one or more storage devices.

Regarding claim 12, TAPPEN does not explicitly teach the limitation "perform a mount command to provide access to raw files from the subset of the dataset in a logical file system, wherein the mount command provides the file system view of the raw files, the file system view enabling access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations."
LIU teaches that a mount command is performed to provide access to raw files from the subset of the dataset in a logical file system, wherein the mount command provides the file system view of the raw files, the file system view enabling access to different raw files in one or more machine learning frameworks or stored in one or more respective storage locations (LIU, [0014]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of LIU into TAPPEN in order to manage access to one or more storage devices.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNG Q. PHAM whose telephone number is (571)272-4040. The examiner can normally be reached Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D. Reyes, can be reached at 571-270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

HUNG Q. PHAM
Primary Examiner
Art Unit 2159

/HUNG Q PHAM/Primary Examiner, Art Unit 2159
June 8, 2022