Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2017/0256254 A1 to Huang et al., hereinafter “Huang”.
Claim 1. A method for machine learning, comprising: training, on a model generator, a model having a plurality of layers. Huang [Abstract] teaches that the technology described herein uses a modular model to process speech. A deep learning based acoustic model comprises a stack of different types of neural network layers.

Huang [0005] teaches a deep learning acoustic model comprises a stack of different types of neural network layers (e.g. fully connected layers, convolution layers, long short term memory cell layer) or their combination. The layers can be organized in a feed-forward or recurrent network structure. 

wherein at least one of the layers comprises both a trainable portion and a shared portion. See Huang [FIG. 1] and [FIG. 3].

and transmitting the model from the model generator to a model executor. Huang [0050] teaches example system 100 includes client devices 102 and 104, which may comprise any type of computing device where it is desirable to have an automatic speech recognition (ASR) system on the device or interact with a server-based ASR system. For example, in one embodiment, client devices 102 and 104 may be one type of computing device described in relation to FIG. 10 herein. By way of example and not limitation, a user device may be embodied as a personal data assistant (PDA), a mobile device, smartphone, smart watch, smart glasses (or other wearable smart device), augmented reality headset, virtual reality headset, a laptop, a tablet, remote control, entertainment system, vehicle computer system, embedded system controller, appliance, home computer system, security system, consumer electronic device, or other similar electronics device.

Huang [0177] teaches an NUI processes air gestures, voice, or other physiological inputs generated by a user. Appropriate NUI inputs may be interpreted as ink strokes for presentation in association with the computing device 1000. These requests may be transmitted to the appropriate network element for further processing. An NUI implements any combination of speech recognition, touch and stylus recognition, facial recognition, biometric recognition, gesture recognition both on screen and adjacent to the screen, air gestures, head and eye tracking, and touch recognition associated with displays on the computing device 1000. The computing device 1000 may be equipped with depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, and combinations of these, for gesture detection and recognition. Additionally, the computing device 1000 may be equipped with accelerometers or gyroscopes that enable detection of motion. The output of the accelerometers or gyroscopes may be provided to the display of the computing device 1000 to render immersive augmented reality or virtual reality.

It would have been obvious, before the effective filing date of the claimed invention, to one of ordinary skill in the art to modify and combine the embodiments of Huang. One skilled in the art would have been motivated to modify the embodiments in this manner because it would allow different processes to achieve optimal results and would not cause significant change to the design. The embodiments overcome the challenge for a single acoustic model to accurately identify sounds across a plurality of environments and speakers (Huang [0002]).

Claim 2. Huang further teaches wherein the at least one of the layers is implemented in a deep learning framework. Huang [0005] teaches a deep learning acoustic model comprises a stack of different types of neural network layers (e.g. fully connected layers, convolution layers, long short term memory cell layer) or their combination.

Claim 3. Huang further teaches wherein the trainable portion comprises weights of a feedforward network. Huang [0005] teaches a deep learning acoustic model comprises a stack of different types of neural network layers (e.g. fully connected layers, convolution layers, long short term memory cell layer) or their combination. The layers can be organized in a feed-forward or recurrent network structure. 

Claim 4. Huang further teaches wherein the shared portion comprises weights of a feedforward network. Huang [0005] teaches a deep learning acoustic model comprises a stack of different types of neural network layers (e.g. fully connected layers, convolution layers, long short term memory cell layer) or their combination. The layers can be organized in a feed-forward or recurrent network structure. 
 
Claim 5. Huang further teaches wherein the model is used for a classification task. 
Huang [0065] teaches the decoding component 128 applies the trained model to categorize audio data.

Huang [0069] teaches decoder 260 comprises an acoustic model (AM) 265 and a language model (LM) 270. AM 265 can use a modular model to extract features for individual speakers from the features 258 provided…The LM 270 receives the corpus of words, acoustic units, in some instances with associated scores, and determines a recognized speech 280, which may comprise words, entities (classes), or phrases.

Claim 6. Huang further teaches wherein transmitting the model from the model generator to a model executor comprises transmitting the model to a smart phone via a network. Huang [0050] teaches example system 100 includes client devices 102 and 104, which may comprise any type of computing device where it is desirable to have an automatic speech recognition (ASR) system on the device or interact with a server-based ASR system. For example, in one embodiment, client devices 102 and 104 may be one type of computing device described in relation to FIG. 10 herein. By way of example and not limitation, a user device may be embodied as a personal data assistant (PDA), a mobile device, smartphone, smart watch, smart glasses (or other wearable smart device), augmented reality headset, virtual reality headset, a laptop, a tablet, remote control, entertainment system, vehicle computer system, embedded system controller, appliance, home computer system, security system, consumer electronic device, or other similar electronics device.

Claim 7. It differs from claim 1 in that it is an apparatus performing the method of claim 1. Therefore claim 7 has been analyzed and reviewed in the same way as claim 1. See the above analysis. 

Claim 8. It differs from claim 2 in that it is an apparatus performing the method of claim 2. Therefore claim 8 has been analyzed and reviewed in the same way as claim 2. See the above analysis. 

Claim 9. It differs from claim 3 in that it is an apparatus performing the method of claim 3. Therefore claim 9 has been analyzed and reviewed in the same way as claim 3. See the above analysis. 

Claim 10. It differs from claim 4 in that it is an apparatus performing the method of claim 4. Therefore claim 10 has been analyzed and reviewed in the same way as claim 4. See the above analysis. 

Claim 11. It differs from claim 5 in that it is an apparatus performing the method of claim 5. Therefore claim 11 has been analyzed and reviewed in the same way as claim 5. See the above analysis. 

Claim 12. It differs from claim 6 in that it is an apparatus performing the method of claim 6. Therefore claim 12 has been analyzed and reviewed in the same way as claim 6. See the above analysis. 

Claim 13. Huang further teaches wherein the model trainer comprises one or more graphics processing units (GPUs). Huang [0050] teaches example system 100 includes client devices 102 and 104, which may comprise any type of computing device where it is desirable to have an automatic speech recognition (ASR) system on the device or interact with a server-based ASR system. For example, in one embodiment, client devices 102 and 104 may be one type of computing device described in relation to FIG. 10 herein. By way of example and not limitation, a user device may be embodied as a personal data assistant (PDA), a mobile device, smartphone, smart watch, smart glasses (or other wearable smart device), augmented reality headset, virtual reality headset, a laptop, a tablet, remote control, entertainment system, vehicle computer system, embedded system controller, appliance, home computer system, security system, consumer electronic device, or other similar electronics device. It is understood in the art that smartphones, smart watches, smart glasses, and other smart devices are equipped with GPUs.

Huang [0177] teaches the computing device 1000 may be equipped with depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, and combinations of these, for gesture detection and recognition. Additionally, the computing device 1000 may be equipped with accelerometers or gyroscopes that enable detection of motion. The output of the accelerometers or gyroscopes may be provided to the display of the computing device 1000 to render immersive augmented reality or virtual reality.


Claim 14. A non-transitory machine readable medium having instructions stored thereon, the instructions when executed causing a machine to: receive, from a model generator and via a network, a model having a plurality of layers, wherein at least one of the layers comprises both a trainable portion and a shared portion; and process data through inference using the received model.  

Claim 15. It differs from claim 2 in that it is the non-transitory machine readable medium having instructions stored thereon, the instructions when executed causing a machine to perform the method of claim 2. Therefore claim 15 has been analyzed and reviewed in the same way as claim 2. See the above analysis. 

Claim 16. It differs from claim 3 in that it is the non-transitory machine readable medium having instructions stored thereon, the instructions when executed causing a machine to perform the method of claim 3. Therefore claim 16 has been analyzed and reviewed in the same way as claim 3. See the above analysis.
 
Claim 17. It differs from claim 4 in that it is the non-transitory machine readable medium having instructions stored thereon, the instructions when executed causing a machine to perform the method of claim 4. Therefore claim 17 has been analyzed and reviewed in the same way as claim 4. See the above analysis.

Claim 18. It differs from claim 5 in that it is the non-transitory machine readable medium having instructions stored thereon, the instructions when executed causing a machine to perform the method of claim 5. Therefore claim 18 has been analyzed and reviewed in the same way as claim 5. See the above analysis.

Claim 19. It differs from claim 6 in that it is the non-transitory machine readable medium having instructions stored thereon, the instructions when executed causing a machine to perform the method of claim 6. Therefore claim 19 has been analyzed and reviewed in the same way as claim 6. See the above analysis.

Claim 20. Huang further teaches wherein the data is not transmitted outside the machine during the inference. Huang [0034] teaches that, as an alternative to discrete or continuous signals, the external signals can be alternatively classified into deterministic or non-deterministic. As the deterministic signal is available before recognizing the utterance, sub-modules can be applied in the 1st-pass decoding. The signal can be obtained through user or system setting (user check non-native box, user check male/female box; system set microphone type, bluetooth connection, modularization user ID (MUID), location, etc.). The deterministic signal can also be inferred. For example, a detected location change at 60 mile/hr can be used to infer a driving mode. A name/phonebook/search history can be used to infer a gender/age. A GPS data signal can be used to activate a location dependent sub-module.

Huang [0035] teaches the signal can also be processed using a nondeterministic algorithm. A nondeterministic algorithm is an algorithm that, even for the same input, can exhibit different behaviors on different runs, as opposed to a deterministic algorithm. As the non-deterministic signal can utilize online computation, context specific sub-modules can be applied in the 2nd-pass decoding when a non-deterministic signal is used. The signal can be obtained through online computation and inference, (e.g. iCluster, sCluster, noise-level (SNR), gender/age detection, accent detection.)

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DELOMIA L GILLIARD whose telephone number is (571)272-1681. The examiner can normally be reached 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DELOMIA L GILLIARD/
Primary Examiner, Art Unit 2661