Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/03/2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Response to Arguments
Applicant’s arguments, see Remarks page 8, filed 08/24/2022, with respect to the rejection of claims 6, 11, and 17 under 35 USC § 112(b) have been fully considered and are not persuasive. The network parameters W, W*, U, and U* are defined as GRU network parameters, and a GRU network has more than four network parameters. Furthermore, the claimed invention does not disclose that the network parameters in question are training weights for the GRU network, how many of the parameters are used, or what their values are.

Applicant’s arguments, see Remarks pages 9-13, filed 08/24/2022, with respect to the rejection of independent claim 1 under 35 USC § 103 have been fully considered and are not persuasive.
Regarding the amended limitation of independent claim 1, wherein the obtaining the user behavior sample in the preset time period comprises: sorting the at least two applications according to a usage frequency of the at least two applications in the preset time period; determining at least two target applications according to a sorting result; and determining the association record of usage timing according to usage status of the at least two target applications as the user behavior sample: Merry teaches [0155-0166] sorting apps by the average pre-fetch benefit score for each app, which is calculated from its Pre-fetchBenefitHistory, a list of scores calculated from the number of cache hits. Apps are then selected for pre-fetching based on the sorted pre-fetch benefit scores. Those cache hits represent the number of times the app was used and are considered by the examiner to be the claimed usage frequency. Merry also discloses an association between apps and certain data describing the apps’ usage timing, such as LastSwitchTime and LastPreFetchTime. The examiner considers such data to be the claimed usage timing.
Regarding the limitation of independent claim 1, training a preset gated recurrent unit (GRU) neural network model according to the plurality of association record groups of usage timing: because Tan’s model uses labeled data to train algorithms that classify data or predict outcomes, it is considered to be a supervised learning model. Tan and Merry both use prediction models, for session-based recommendations and for pre-fetching applications, respectively.
Regarding the amended limitation of independent claim 1, determining a number of units on an input layer of the application prediction model according to a vector dimension of each of the association record groups of usage timing: Merry teaches a prediction model that uses a number of inputs that represent data about the apps. The examiner considers that number of inputs to be the claimed number of units on an input layer of the application prediction model.
Dependent claims 3-5, 9, and 21 depend on amended claim 1 and are rejected based on the 35 USC § 103 rejection below.

Applicant’s arguments, see Remarks page 13, filed 08/24/2022, with respect to the rejection of independent claim 12 under 35 USC § 103 have been fully considered and are not persuasive. Independent claim 12 recites apparatus elements similar to those described above with respect to amended claim 1. As such, amended claim 12 is rejected under 35 USC § 103 for at least the same reasons as claim 1 and as shown below.
Dependent claims 13-16 and 22 depend on amended claim 12 and are rejected based on the 35 USC § 103 rejection shown below.

Applicant’s arguments, see Remarks page 13, filed 08/24/2022, with respect to the rejection of dependent claims 6 and 17 under 35 USC § 103 have been fully considered and are not persuasive. The claims are rejected under 35 USC § 103 as shown below.

Applicant’s arguments, see Remarks page 14, filed 08/24/2022, with respect to the rejection of dependent claim 19 under 35 USC § 103 have been fully considered and are not persuasive. The claim is rejected under 35 USC § 103 as shown below.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 6, 11, and 17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claims 6, 11, and 17: the network parameters W, W*, U, and U* are indefinite. The instant specification does not disclose a definition or value of the network parameters W, W*, U, and U*. Since one of ordinary skill in the art would not be able to determine the definition or value of said network parameters, it would be impossible to ascertain how a determination could be made based on such indefinite parameters. In the interest of further examination, the network parameters W, W*, U, and U* are interpreted as having the same definition and values as in the prior art.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7, 9-10, 12-16, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Merry (US20140373032A1) in view of Tan (Improved Recurrent Neural Networks for Session-based Recommendations).

Regarding claim 1 Merry teaches An application prediction method, performed by a processor executing instructions stored on a memory, wherein the method comprises: obtaining a user behavior sample in a preset time period, wherein the user behavior sample comprises an association record of usage timing of at least two applications, wherein the association record of usage timing comprises a usage record of the at least two applications and a usage timing relationship of the at least two applications. ([0160] 1. Retrieve last switch time and last pre-fetch time for each app that uses the feature. The examiner notes that Merry teaches [Fig. 6] retrieving the usage record of multiple applications at certain times.)
Grouping the association record of usage timing to obtain a plurality of association record groups of usage timing ([0109] In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6, one manner of creating cases may be affected by taking groups of (e.g., 3 or any desired number) adjacent application usage periods.)
Generating an application prediction model. ([0118] The adaptive predictive engine may continue to process these cases---e.g., to provide predictions for the Prediction Window (which is shown as a desired time period past the current time). The examiner considers that providing predictions requires generating a prediction model.)
wherein the obtaining the user behavior sample in the preset time period comprises: sorting the at least two applications according to a usage frequency of the at least two applications in the preset time period ([0163] 4. Calculate the average pre-fetch benefit score for each app based on its Pre-fetchBenefitHistory. Sort apps by average benefit score. The examiner notes that Merry teaches calculating a score for each app based on the number of times it was successfully cached or pre-fetched [0158].)
determining at least two target applications according to a sorting result ([0166] 7. If the limit determined in step 5 has not yet been reached, launch pre-fetch background tasks in order of the calculated average pre-fetch benefit score for as many apps as possible up to the limit. The examiner notes that Merry teaches pre-launching multiple apps based on the order of their sorted pre-fetch benefit scores).
determining the association record of usage timing according to usage status of the at least two target applications as the user behavior sample. ([0155-0157] To aid the pre-fetch process, the following information may be tracked by the pre-fetch service for each app that uses pre-fetch: 1. LastSwitchTime-the timestamp at which the app was last switched to (launched or resumed). 2. LastPre-fetchTime-the timestamp at which the last pre-fetch for the app completed. The examiner notes that Merry teaches tracking the usage data of multiple applications to aid in the pre-fetch process.)
wherein the method further comprises: determining a number of units on an input layer of the application prediction model according to a vector dimension of each of the association record groups of usage timing ([0135-0139] In some embodiments, this prediction model may use the following inputs: 1- Foreground switches - this tells when an app is foreground versus background. The present system may have the ability to know when the classic desktop is up. 2- User away as provided by typically provided. 3-  Log off as available as subscribe events today. 4- App install/uninstall. The examiner notes that Merry teaches a prediction model that uses multiple inputs. The examiner also notes that Merry inherently determines the number of inputs the prediction model input layer requires to process those inputs. The examiner interprets this number of inputs to be the claimed number of units on an input layer of the application prediction model).
determining a number of units on an output layer of the application prediction model according to a number of the applications. ([0112] Prediction engine module may receive activity data of a given app's lifecycle (e.g., the number of times an app is activated by a user, the time of day of activation, length of time of activation, and the like). These uses of an app may form a set of "cases" of use of an app. Each case may be assessed a calculated, predicted and/or estimated probability of future and/or potential activation. The examiner notes that Merry teaches a model that predicts probabilities for a set of cases for apps. The examiner also notes that Merry inherently determines a number of outputs to produce those probabilities of cases for each app. Finally, the examiner interprets this number of outputs to be the claimed number of units on an output layer of the application prediction model).
However, Merry fails to explicitly teach training a preset gated recurrent unit (GRU) neural network model according to the plurality of association record groups of usage timing. 
On the other hand, Tan teaches training a preset gated recurrent unit (GRU) neural network model according to the plurality of association record groups of usage timing. ([Page. 4, Para. 7] We used one recurrent (GRU) layer in all our models as we found that additional layers did not improve performance. The GRU was set at 100 and 1000 hidden units for each model. The models are defined and trained in Keras [3] and Theano [26] on a GeForce GTX Titan Black GPU. The examiner notes that Tan teaches training an initialized model [page 3, para. 6] which is selected to be a GRU NN model [Page 4, Para 7]. The examiner also notes that Merry teaches training a model based on past user behavior to predict future behavior [0131]. The examiner also notes that Merry and Tan are considered to be analogous because they are in the same field of supervised learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate training a preset gated recurrent unit (GRU) neural network model according to the plurality of association record groups of usage timing as taught by Tan to improve model performance compared to other types of neural networks [Page 2, Para. 10].)
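
The sorting, target-selection, and layer-sizing limitations of claim 1 can be illustrated with a minimal sketch. It is offered only as an aid to reading the claim language. The function names, the launch log, and the flattened status vector are hypothetical and appear in neither the claims nor the cited references, and treating the length of a flattened record group as its "vector dimension" is an assumption.

```python
from collections import Counter


def select_target_apps(launch_log, k):
    """Sort apps by usage frequency in the log and keep the k most used."""
    counts = Counter(launch_log)
    return [app for app, _ in counts.most_common(k)]


def layer_sizes(record_group, target_apps):
    """Assumed sizing rule: input units = vector dimension of one record
    group, output units = number of target applications."""
    return len(record_group), len(target_apps)


# Hypothetical launch log for the preset time period.
launch_log = ["mail", "maps", "mail", "chat", "mail", "chat", "game"]
targets = select_target_apps(launch_log, 2)

# Hypothetical flattened usage statuses: 3 sampling instants x 2 apps.
group = [1, 0, 0, 1, 1, 0]
print(targets, layer_sizes(group, targets))
```

In this sketch the most used apps become the target applications, and the input and output layer sizes follow directly from the record-group length and the number of target applications.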

Regarding claim 3, Merry teaches The method according to claim 1, wherein the determining the association record of usage timing according to the usage status of the at least two target applications comprises: sampling a usage log of the at least two target applications in accordance with a preset sampling period to determine whether the at least two target applications are in a usage state at sampling instants ([Page 6, Fig. 6] The examiner notes that Merry teaches sampling of case 1 during 00:11-07:24. During this sampling, the usage status of three applications A, B, and C is examined.)
associating the usage status of the at least two target applications according to the sampling instants and the usage status so as to determine the association record of usage timing. ([0109] In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6., one manner of creating cases may be affected by taking groups of ( e.g., 3 or any desired number) adjacent application usage periods. It will be appreciated that it is also possible to create cases using other groupings such as current app switch, previous app switch, and any period that falls within the prediction window after the app switch. The examiner notes that Merry teaches associating usage times and applications according to the sampling over certain time intervals according to [Fig. 6].)
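
The sampling limitation of claim 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the usage log is assumed to be a list of (app, begin, finish) intervals in seconds, the intervals are treated as half-open, and the function name and data are hypothetical. It is not asserted to be the method of Merry's FIG. 6.

```python
def sample_usage(usage_log, target_apps, start, end, period):
    """Sample the log every `period` seconds in [start, end) and record,
    for each sampling instant, whether each target app is in use (1) or not (0)."""
    records = []
    t = start
    while t < end:
        status = tuple(
            int(any(app == name and begin <= t < finish
                    for name, begin, finish in usage_log))
            for app in target_apps
        )
        records.append((t, status))  # association of instant and usage status
        t += period
    return records


# Hypothetical usage log: app A, then app B, then app A again.
log = [("A", 0, 100), ("B", 100, 250), ("A", 250, 300)]
for instant, status in sample_usage(log, ["A", "B"], 0, 300, 100):
    print(instant, status)
```

Each returned pair associates a sampling instant with the usage status of every target application, which is one way to read the claimed association record of usage timing.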

Regarding claim 4, Merry teaches The method according to claim 3. However, Merry fails to explicitly teach wherein the training the preset GRU neural network model according to the plurality of association record groups of usage timing comprises: training the preset GRU neural network model according to the usage status corresponding to the sampling instants in the plurality of association record groups of usage timing.
On the other hand, Tan teaches wherein the training the preset GRU neural network model according to the plurality of association record groups of usage timing comprises: training the preset GRU neural network model according to the usage status corresponding to the sampling instants in the plurality of association record groups of usage timing. ([Page. 4, Para. 7] We used one recurrent (GRU) layer in all our models as we found that additional layers did not improve performance. The GRU was set at 100 and 1000 hidden units for each model. The models are defined and trained in Keras [3] and Theano [26] on a GeForce GTX Titan Black GPU. The examiner notes that Tan teaches training an initialized model [page 3, para. 6] which is selected to be a GRU NN model [Page 4, Para 7]. The examiner also notes that Merry teaches training a model based on past user behavior to predict future behavior [0131]. The examiner also notes that Merry and Tan are considered to be analogous because they are in the same field of supervised learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein the training the preset GRU neural network model according to the plurality of association record groups of usage timing comprises: training the preset GRU neural network model according to the usage status corresponding to the sampling instants in the plurality of association record groups of usage timing as taught by Tan to improve model performance compared to other types of neural networks [Page 2, Para. 10].)

Regarding claim 5, Merry teaches The method according to claim 4, wherein the grouping the association record of usage timing to obtain the plurality of association record groups of usage timing comprises: using an association record of usage timing of applications corresponding to first n sampling instants as a first association record group of usage timing, using an association record of usage timing of applications corresponding to the second to the (n+1)th sampling instants as a second association record group of usage timing, and so on, to obtain m-n+1 association record groups of usage timing, wherein n is a natural number greater than or equal to 2, and m is a natural number greater than 3. ([0109] In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6, one manner of creating cases may be affected by taking groups of (e.g., 3 or any desired number) adjacent application usage periods. It will be appreciated that it is also possible to create cases using other groupings such as current app switch, previous app switch, and any period that falls within the prediction window after the app switch. The examiner notes that Merry teaches associating usage times and multiple applications (e.g. applications A, B, and C) according to the sampling of the usage log over certain time intervals [Fig. 6].)
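
The grouping recited in claim 5 is a sliding window of width n over m sampling instants, which yields m-n+1 groups. A minimal sketch follows. The function name and the sample records are hypothetical, and the sketch illustrates only the claim language, not any implementation disclosed in Merry.

```python
def group_records(records, n):
    """Slide a window of n consecutive sampling instants over the records.

    With m records this yields m - n + 1 groups: records 1..n,
    records 2..n+1, and so on.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if len(records) < n:
        raise ValueError("need at least n records")
    return [records[i:i + n] for i in range(len(records) - n + 1)]


# Hypothetical usage statuses at m = 5 sampling instants.
records = ["s1", "s2", "s3", "s4", "s5"]
groups = group_records(records, 3)
print(len(groups))  # m - n + 1 = 3
print(groups[0], groups[1])
```

For m = 5 and n = 3 the sketch produces three groups, and consecutive groups overlap in n-1 sampling instants.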

Regarding claim 9 Merry teaches The method according to claim 1, further comprising: obtaining usage status of at least two applications running on a terminal at instant t, and usage status of the at least two applications running on the terminal corresponding to instants t-1 to t-n, wherein, n is a natural number greater than or equal to 2 ([0109]  In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6., one manner of creating cases may be affected by taking groups of ( e.g., 3 or any desired number) adjacent application usage periods. It will be appreciated that it is also possible to create cases using other groupings such as current app switch, previous app switch, and any period that falls within the prediction window after the app switch.)
inputting the usage status of the at least two applications to the application prediction model to obtain probabilities to start the at least two applications, ([0110] To determine the probability of "App X" being switched to in the next prediction window, the predictor may iterate over all of the cases and classify each of them based on their properties. The examiner notes that Merry teaches training a model based on past user behavior to predict future behavior [0131]. The examiner also notes that Merry teaches [Fig. 6] that when iterating over all of the cases, the usage time and status of each application in the case is considered.)
determining an application to be started corresponding to instant t+1 according to the probabilities to start the at least two applications ([0110] To determine the probability of "App X" being switched to in the next prediction window, the predictor may iterate over all of the cases and classify each of them based on their properties.)
preloading the application to be started. ([0129] The same sort of description may be applied to each App B, C and D in a similar fashion. These rate curves may then be applied by the pre-launch module according to some rules and/or heuristics---e.g., certain apps have a switch rate over some threshold may be pre-launched).
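
The selection step of claim 9 can be sketched as picking the application with the highest predicted start probability for instant t+1. This is a minimal sketch: the function name, the probability values, and the optional threshold are hypothetical. The threshold only loosely mirrors the switch-rate threshold mentioned in Merry [0129] and is not recited in the claim.

```python
def app_to_preload(probabilities, threshold=0.0):
    """Return the app with the highest predicted start probability,
    or None if that probability falls below the (assumed) threshold."""
    app, p = max(probabilities.items(), key=lambda item: item[1])
    return app if p >= threshold else None


# Hypothetical model output for instant t+1.
probabilities = {"mail": 0.2, "maps": 0.7, "chat": 0.1}
print(app_to_preload(probabilities))
print(app_to_preload(probabilities, threshold=0.8))
```

The first call selects the most probable application for preloading, and the second shows that no application is selected when none reaches the assumed threshold.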

Regarding claim 12, Merry teaches An application prediction apparatus, comprising a processor and a memory storing instructions thereon, the processor when executing the instructions, being configured to: obtain a user behavior sample in the preset time period, wherein the user behavior sample comprises the association record of usage timing of the at least two applications. ([0160] 1. Retrieve last switch time and last pre-fetch time for each app that uses the feature. The examiner notes that Merry teaches [Fig. 6] retrieving the usage record of multiple applications at certain times).
Group the association record of usage timing to obtain the plurality of association record groups of usage timing ([0109] In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6., one manner of creating cases may be affected by taking groups of ( e.g., 3 or any desired number) adjacent application usage periods.)
Generate the application prediction model. ([0118] The adaptive predictive engine may continue to process these cases---e.g., to provide predictions for the Prediction Window (which is shown as a desired time period past the current time). The examiner considers that providing predictions requires generating a prediction model.)
wherein the processor is further configured to: sort the at least two applications according to a usage frequency of the at least two applications in the preset time period ([0163] 4. Calculate the average pre-fetch benefit score for each app based on its Pre-fetchBenefitHistory. Sort apps by average benefit score. The examiner notes that Merry teaches calculating a score for each app based on the number of times it was successfully cached or pre-fetched [0158].)
determine at least two target applications according to a sorting result ([0166] 7. If the limit determined in step 5 has not yet been reached, launch pre-fetch background tasks in order of the calculated average pre-fetch benefit score for as many apps as possible up to the limit. The examiner notes that Merry teaches pre-launching multiple apps based on the order of their sorted pre-fetch benefit scores).
determine the association record of usage timing according to usage status of the at least two target applications as the user behavior sample. ([0155-0157] To aid the pre-fetch process, the following information may be tracked by the pre-fetch service for each app that uses pre-fetch: 1. LastSwitchTime-the timestamp at which the app was last switched to (launched or resumed). 2. LastPre-fetchTime-the timestamp at which the last pre-fetch for the app completed. The examiner notes that Merry teaches tracking the usage data of multiple applications to aid in the pre-fetch process.)
wherein the processor is further configured to: determine a number of units on an input layer of the application prediction model according to a vector dimension of each of the association record groups of usage timing ([0135-0139] In some embodiments, this prediction model may use the following inputs: 1- Foreground switches - this tells when an app is foreground versus background. The present system may have the ability to know when the classic desktop is up. 2- User away as provided by typically provided. 3-  Log off as available as subscribe events today. 4- App install/uninstall. The examiner notes that Merry teaches a prediction model that uses multiple inputs. The examiner also notes that Merry inherently determines the number of inputs the prediction model input layer requires to process those inputs. The examiner interprets this number of inputs to be the claimed number of units on an input layer of the application prediction model).
determine a number of units on an output layer of the application prediction model according to a number of the applications. ([0112] Prediction engine module may receive activity data of a given app's lifecycle (e.g., the number of times an app is activated by a user, the time of day of activation, length of time of activation, and the like). These uses of an app may form a set of "cases" of use of an app. Each case may be assessed a calculated, predicted and/or estimated probability of future and/or potential activation. The examiner notes that Merry teaches a model that predicts probabilities for a set of cases for apps. The examiner also notes that Merry inherently determines a number of outputs to produce those probabilities of cases for each app. Finally, the examiner interprets this number of outputs to be the claimed number of units on an output layer of the application prediction model).
However, Merry fails to explicitly teach training the preset GRU neural network model according to the plurality of association record groups of usage timing.
On the other hand, Tan teaches training the preset GRU neural network model according to the plurality of association record groups of usage timing. ([Page 4, Para. 7] We used one recurrent (GRU) layer in all our models as we found that additional layers did not improve performance. The GRU was set at 100 and 1000 hidden units for each model. The models are defined and trained in Keras [3] and Theano [26] on a GeForce GTX Titan Black GPU. The examiner notes that Tan teaches training an initialized model [Page 3, Para. 6] which is selected to be a GRU NN model [Page 4, Para. 7]. The examiner also notes that Merry teaches training a model based on past user behavior to predict future behavior [0131]. The examiner also notes that Merry and Tan are considered to be analogous because they are in the same field of supervised learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate training the preset GRU neural network model according to the plurality of association record groups of usage timing as taught by Tan to improve model performance compared to other types of neural networks [Page 2, Para. 10].)

Regarding claim 14 Merry teaches The apparatus according to claim 12, wherein the processor is further configured to: sample a usage log of the at least two target applications in accordance with a preset sampling period to determine whether the at least two target applications are in a usage state at sampling instants ([Page 6, Fig. 6] The examiner notes that Merry teaches sampling of case 1 during 00:11-07:24. During this sampling, the usage status of three applications A, B, and C is examined.)
associate the usage status of the at least two target applications according to the sampling instants and the usage status so as to determine the association record of usage timing. ([0109] In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6., one manner of creating cases may be affected by taking groups of ( e.g., 3 or any desired number) adjacent application usage periods. It will be appreciated that it is also possible to create cases using other groupings such as current app switch, previous app switch, and any period that falls within the prediction window after the app switch. The examiner notes that Merry teaches associating usage times and applications according to the sampling over certain time intervals according to [Fig. 6].)

Regarding claim 15, Merry teaches The apparatus according to claim 14. However, Merry fails to explicitly teach wherein the processor is further configured to: train the preset GRU neural network model according to the usage status corresponding to the sampling instants in the plurality of association record groups of usage timing.
On the other hand, Tan teaches wherein the processor is further configured to: train the preset GRU neural network model according to the usage status corresponding to the sampling instants in the plurality of association record groups of usage timing. ([Page. 4, Para. 7] We used one recurrent (GRU) layer in all our models as we found that additional layers did not improve performance. The GRU was set at 100 and 1000 hidden units for each model. The models are defined and trained in Keras [3] and Theano [26] on a GeForce GTX Titan Black GPU. The examiner notes that Tan teaches training an initialized model [page 3, para. 6] which is selected to be a GRU NN model [Page 4, Para 7]. The examiner also notes that Merry teaches training a model based on past user behavior to predict future behavior [0131]. The examiner also notes that Merry and Tan are considered to be analogous because they are in the same field of supervised learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein the processor is further configured to: train the preset GRU neural network model according to the usage status corresponding to the sampling instants in the plurality of association record groups of usage timing as taught by Tan to improve model performance compared to other types of neural networks [Page 2, Para. 10].)

Regarding claim 16, Merry teaches The apparatus according to claim 15, wherein the processor is further configured to: use an association record of usage timing of applications corresponding to first n sampling instants as a first association record group of usage timing, use an association record of usage timing of applications corresponding to the second to the (n+1)th sampling instants as a second association record group of usage timing, and so on, to obtain m-n+1 association record groups of usage timing, wherein n is a natural number greater than or equal to 2, and m is a natural number greater than 3. ([0109] In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6, one manner of creating cases may be affected by taking groups of (e.g., 3 or any desired number) adjacent application usage periods. It will be appreciated that it is also possible to create cases using other groupings such as current app switch, previous app switch, and any period that falls within the prediction window after the app switch. The examiner notes that Merry teaches associating usage times and multiple applications (e.g. applications A, B, and C) according to the sampling of the usage log over certain time intervals [Fig. 6].)

Regarding claim 20 Merry teaches The apparatus according to claim 12, wherein the processor is further configured to: obtain usage status of at least two applications running on a terminal at instant t, and usage status of the at least two applications running on the terminal corresponding to instants t-1 to t-n, wherein, n is a natural number greater than or equal to 2 ([0109]  In reference to FIG. 6, the adaptive predictor may operate on groupings of application usage periods, referred to as "cases". As shown in FIG. 6., one manner of creating cases may be affected by taking groups of ( e.g., 3 or any desired number) adjacent application usage periods. It will be appreciated that it is also possible to create cases using other groupings such as current app switch, previous app switch, and any period that falls within the prediction window after the app switch.)
input the usage status of the at least two applications to the application prediction model to obtain probabilities to start the at least two applications, ([0110] To determine the probability of "App X" being switched to in the next prediction window, the predictor may iterate over all of the cases and classify each of them based on their properties. The examiner notes that Merry teaches training a model based on past user behavior to predict future behavior [0131]. The examiner also notes that Merry teaches [Fig. 6] that when iterating over all of the cases, the usage time and status of each application in the case is considered.)
determine an application to be started corresponding to instant t+1 according to the probabilities to start the at least two applications ([0110] To determine the probability of "App X" being switched to in the next prediction window, the predictor may iterate over all of the cases and classify each of them based on their properties.)
preloading the application to be started. ([0129] The same sort of description may be applied to each App B, C and D in a similar fashion. These rate curves may then be applied by the pre-launch module according to some rules and/or heuristics---e.g., certain apps have a switch rate over some threshold may be pre-launched).
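As a hedged illustration of the last two limitations of claim 20, the selection step can be sketched as choosing the application with the highest start probability and preloading it only when that probability clears a threshold, in the spirit of the threshold rule quoted from Merry [0129]. The names `pick_app_to_preload`, `threshold`, and the probability values are assumptions made for this sketch.

```python
from typing import Dict, Optional


def pick_app_to_preload(start_probs: Dict[str, float],
                        threshold: float = 0.0) -> Optional[str]:
    """Return the most probable app, or None if it falls below the threshold."""
    if not start_probs:
        return None
    app, prob = max(start_probs.items(), key=lambda item: item[1])
    return app if prob >= threshold else None


# Hypothetical model output for instant t+1.
probs = {"mail": 0.15, "browser": 0.60, "camera": 0.25}
chosen = pick_app_to_preload(probs, threshold=0.5)
skipped = pick_app_to_preload(probs, threshold=0.7)
print(chosen, skipped)
```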

Claims 6 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Merry (US20140373032A1) in view of Tan (Improved Recurrent Neural Networks for Session-based Recommendations), further in view of Nivison (Development of a Deep RNN Controller for Flight Applications), further in view of Wikipedia (Hyperbolic function).

Regarding claim 6, Merry teaches the method according to claim 1. However, Merry fails to explicitly teach wherein the application prediction model comprises a reset gate, an update gate z, a candidate status unit k, and an output status unit h, which are respectively calculated by the following formula:

    [Equation image: media_image1.png]

wherein xt indicates an application used at instant t in the association record of usage timing, each of W, W*, U and U* indicate network parameters for learning, wherein *∈{r, z}, zt indicates an update gate at instant t, rt indicates a reset gate at instant t, ĥt indicates a candidate status unit at instant t, ht indicates an output status unit at instant t, h(t-1) indicates an output status unit at instant t-1, σ indicates a Sigmoid Function of

    [Equation image: media_image2.png]

Θ indicates vector bitwise multiplying. Merry also fails to explicitly teach a formula of tanh function is:

    [Equation image: media_image3.png]

On the other hand, Nivison teaches wherein the application prediction model comprises a reset gate, an update gate z, a candidate status unit k, and an output status unit h, which are respectively calculated by the following formula:

    [Equation image: media_image1.png]

wherein xt indicates an application used at instant t in the association record of usage timing, each of W, W*, U and U* indicate network parameters for learning, wherein *∈{r, z}, zt indicates an update gate at instant t, rt indicates a reset gate at instant t, ĥt indicates a candidate status unit at instant t, ht indicates an output status unit at instant t, h(t-1) indicates an output status unit at instant t-1, σ indicates a Sigmoid Function of

    [Equation image: media_image2.png]

Θ indicates vector bitwise multiplying. ([Page 9, Algo. 1]

    [Equation image: media_image4.png]

The examiner notes that Nivison teaches GRU defining formulas that use constants b1, b2, and b3. When those constants are set to zero, the formulas taught by Nivison become similar to the GRU formulas in the claimed invention. The examiner notes that the claimed invention does not define what the network parameters are, therefore, the examiner interprets the claimed network parameters to be the same as the ones taught by Nivison [Page 9, Algo 1]. The examiner also notes that Merry and Nivison are considered to be analogous because they are in the same field of supervised learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein the application prediction model comprises a reset gate, an update gate z, a candidate status unit k, and an output status unit h, which are respectively calculated by the following formula:

    [Equation image: media_image1.png]

wherein xt indicates an application used at instant t in the association record of usage timing, each of W, W*, U and U* indicate network parameters for learning, wherein *∈{r, z}, zt indicates an update gate at instant t, rt indicates a reset gate at instant t, ĥt indicates a candidate status unit at instant t, ht indicates an output status unit at instant t, h(t-1) indicates an output status unit at instant t-1, σ indicates a Sigmoid Function of

    [Equation image: media_image2.png]

Θ indicates vector bitwise multiplying as taught by Nivison to develop robust models for agility and speed modeling using data containing disturbances [Page 4, Para. 2].)
Furthermore, Wikipedia teaches a formula of tanh function is:

    [Equation image: media_image3.png]

([Page 4, Para. 3] 
    [Equation image: media_image5.png]

The examiner notes that Merry and Wikipedia are considered to be analogous because they are in the same field of mathematical computations. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate a formula of tanh function is:

    [Equation image: media_image3.png]

as taught by Wikipedia to incorporate solutions to nonlinear boundary problems [Page 5, Para. 3].)
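Because the formula images are not reproduced in this text, the following sketch may help a reader follow the comparison between the claimed formulas and Nivison's Algorithm 1 with its constants b1, b2, and b3 set to zero. It assumes one common bias-free GRU form: r_t = σ(W_r x_t + U_r h_(t-1)), z_t = σ(W_z x_t + U_z h_(t-1)), ĥ_t = tanh(W x_t + U (r_t Θ h_(t-1))), and h_t = (1 - z_t) Θ h_(t-1) + z_t Θ ĥ_t. That form, and in particular which term z_t weights, is an assumption; it is not confirmed as the exact formula of the claims or of Nivison. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_from_exp(x: float) -> float:
    """tanh from its exponential definition: (e^x - e^(-x)) / (e^x + e^(-x))."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for plain nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gru_step(x: Vector, h_prev: Vector,
             W: Matrix, U: Matrix,
             W_r: Matrix, U_r: Matrix,
             W_z: Matrix, U_z: Matrix) -> Vector:
    """One bias-free GRU step (assumed form, see the lead-in)."""
    # Reset gate r_t and update gate z_t.
    r = [sigmoid(a + b) for a, b in zip(matvec(W_r, x), matvec(U_r, h_prev))]
    z = [sigmoid(a + b) for a, b in zip(matvec(W_z, x), matvec(U_z, h_prev))]
    # Candidate state uses the element-wise product r_t * h_(t-1).
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a + b) for a, b in zip(matvec(W, x), matvec(U, gated))]
    # Output state blends the previous state and the candidate state.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


# Hypothetical 2x2 all-zero parameters: both gates become sigmoid(0) = 0.5.
Z = [[0.0, 0.0], [0.0, 0.0]]
h_next = gru_step([1.0, 0.0], [0.4, -0.2], Z, Z, Z, Z, Z, Z)
print(h_next)
```

With all parameters zero, each gate equals 0.5 and the candidate state is tanh(0) = 0, so the new state is half of the previous state. The helper `tanh_from_exp` restates the standard exponential definition of tanh and agrees with `math.tanh`.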

Regarding claim 17, Merry teaches the apparatus according to claim 12. However, Merry fails to explicitly teach wherein the application prediction model comprises a reset gate, an update gate z, a candidate status unit k, and an output status unit h, which are respectively calculated by the following formula:

    [Equation image: media_image1.png]

wherein xt indicates an application used at instant t in the association record of usage timing, each of W, W*, U and U* indicate network parameters for learning, wherein *∈{r, z}, zt indicates an update gate at instant t, rt indicates a reset gate at instant t, ĥt indicates a candidate status unit at instant t, ht indicates an output status unit at instant t, h(t-1) indicates an output status unit at instant t-1, σ indicates a Sigmoid Function of

    [Equation image: media_image2.png]

Θ indicates vector bitwise multiplying. Merry also fails to explicitly teach a formula of tanh function is:

    [Equation image: media_image3.png]
On the other hand, Nivison teaches wherein the application prediction model comprises a reset gate, an update gate z, a candidate status unit k, and an output status unit h, which are respectively calculated by the following formula:

    [Equation image: media_image1.png]

wherein xt indicates an application used at instant t in the association record of usage timing, each of W, W*, U and U* indicate network parameters for learning, wherein *∈{r, z}, zt indicates an update gate at instant t, rt indicates a reset gate at instant t, ĥt indicates a candidate status unit at instant t, ht indicates an output status unit at instant t, h(t-1) indicates an output status unit at instant t-1, σ indicates a Sigmoid Function of

    [Equation image: media_image2.png]

Θ indicates vector bitwise multiplying. ([Page 9, Algo. 1] 

    [Equation image: media_image4.png]

The examiner notes that Nivison teaches GRU defining formulas that use constants b1, b2, and b3. When those constants are set to zero, the formulas taught by Nivison become similar to the GRU formulas in the claimed invention. The examiner notes that the claimed invention does not define what the network parameters are, therefore, the examiner interprets the claimed network parameters to be the same as the ones taught by Nivison [Page 9, Algo 1]. The examiner also notes that Merry and Nivison are considered to be analogous because they are in the same field of supervised learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein the application prediction model comprises a reset gate, an update gate z, a candidate status unit k, and an output status unit h, which are respectively calculated by the following formula:

    [Equation image: media_image1.png]

wherein xt indicates an application used at instant t in the association record of usage timing, each of W, W*, U and U* indicate network parameters for learning, wherein *∈{r, z}, zt indicates an update gate at instant t, rt indicates a reset gate at instant t, ĥt indicates a candidate status unit at instant t, ht indicates an output status unit at instant t, h(t-1) indicates an output status unit at instant t-1, σ indicates a Sigmoid Function of

    [Equation image: media_image2.png]

Θ indicates vector bitwise multiplying as taught by Nivison to develop robust models for agility and speed modeling using data containing disturbances [Page 4, Para. 2].)
Furthermore, Wikipedia teaches a formula of tanh function is:

    [Equation image: media_image3.png]

([Page 4, Para. 3] 
    [Equation image: media_image5.png]

The examiner notes that Merry and Wikipedia are considered to be analogous because they are in the same field of mathematical computations. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate a formula of tanh function is:

    [Equation image: media_image3.png]

as taught by Wikipedia to incorporate solutions to nonlinear boundary problems [Page 5, Para. 3].)

Claims 8 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Merry (US20140373032A1) in view of Tan (Improved Recurrent Neural Networks for Session-based Recommendations), further in view of Kantar (Analysis of wind speed distributions Wind distribution function).

Regarding claim 8, Merry teaches the method according to claim 1. However, Merry fails to explicitly teach wherein, an error function adopted by the application prediction model is a cross entropy loss function:

    [Equation image: media_image6.png]

wherein, yk represents a standard value of usage status of the applications, ŷk represents a prediction value of the usage status of the applications, C=M+1, wherein, M represents a number of the applications, and J represents a cross entropy of the application prediction model.
On the other hand, Kantar teaches wherein, an error function adopted by the application prediction model is a cross entropy loss function:

    [Equation image: media_image6.png]

wherein, yk represents a standard value of usage status of the applications, ŷk represents a prediction value of the usage status of the applications, C=M+1, wherein, M represents a number of the applications, and J represents a cross entropy of the application prediction model. ([Page 964, Eq. 3]

    [Equation image: media_image7.png]
 
The examiner notes that Kantar teaches a cross entropy loss function that uses pi as the probability of occurrence of state i and qi as the prior probability of occurrence of state i of a system. The examiner notes that the claimed invention does not define what the standard value and predicted value of usage are. The examiner interprets Kantar’s probabilities of occurrence and their relations to be the claimed standard and predicted usage status of the application. The examiner also notes that Merry and Kantar are considered to be analogous because they are in the same field of computational analysis. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein, an error function adopted by the application prediction model is a cross entropy loss function:

    [Equation image: media_image6.png]

wherein, yk represents a standard value of usage status of the applications, ŷk represents a prediction value of the usage status of the applications, C=M+1, wherein, M represents a number of the applications, and J represents a cross entropy of the application prediction model as taught by Kantar to allow the inclusion of previous information and to cover the maximum entropy (MaxEnt) principle, which generates better fit results [Page 1, Para. 1].)
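Since the loss formula image is not reproduced here, the sketch below assumes the standard categorical cross entropy J = -Σ_{k=1..C} yk·log(ŷk), which is consistent with the symbols defined in the claim (yk, ŷk, C, J) but is not confirmed as its exact form. The function name, the clipping constant `eps`, and the sample values are hypothetical, and the meaning of the extra class in C = M + 1 is left open.

```python
import math
from typing import Sequence


def cross_entropy(y_true: Sequence[float], y_pred: Sequence[float],
                  eps: float = 1e-12) -> float:
    """Categorical cross entropy: J = -sum_k y_k * log(yhat_k)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Clip predictions away from zero so log() is always defined.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, y_pred))


# Hypothetical example with M = 3 applications, so C = M + 1 = 4 classes.
y_true = [0.0, 1.0, 0.0, 0.0]   # standard (target) values, one-hot
y_pred = [0.1, 0.7, 0.1, 0.1]   # predicted values from the model
loss = cross_entropy(y_true, y_pred)
print(loss)
```

For a one-hot target the sum reduces to the negative log of the probability assigned to the true class, here -log(0.7).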

Regarding claim 19, Merry teaches the apparatus according to claim 12. However, Merry fails to explicitly teach wherein, an error function adopted by the application prediction model is a cross entropy loss function:

    [Equation image: media_image6.png]

wherein, yk represents a standard value of usage status of the applications, ŷk represents a prediction value of the usage status of the applications, C=M+1, wherein, M represents a number of the applications, and J represents a cross entropy of the application prediction model.
On the other hand, Kantar teaches wherein, an error function adopted by the application prediction model is a cross entropy loss function:

    [Equation image: media_image6.png]

wherein, yk represents a standard value of usage status of the applications, ŷk represents a prediction value of the usage status of the applications, C=M+1, wherein, M represents a number of the applications, and J represents a cross entropy of the application prediction model. ([Page 964, Eq. 3]

    [Equation image: media_image7.png]
 
The examiner notes that Kantar teaches a cross entropy loss function that uses pi as the probability of occurrence of state i and qi as the prior probability of occurrence of state i of a system. The examiner notes that the claimed invention does not define what the standard value and predicted value of usage are. The examiner interprets Kantar’s probabilities of occurrence and their relations to be the claimed standard and predicted usage status of the application. The examiner also notes that Merry and Kantar are considered to be analogous because they are in the same field of computational analysis. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein, an error function adopted by the application prediction model is a cross entropy loss function:

    [Equation image: media_image6.png]

wherein, yk represents a standard value of usage status of the applications, ŷk represents a prediction value of the usage status of the applications, C=M+1, wherein, M represents a number of the applications, and J represents a cross entropy of the application prediction model as taught by Kantar to allow the inclusion of previous information and to cover the maximum entropy (MaxEnt) principle, which generates better fit results [Page 1, Para. 1].)

Claims 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Merry (US20140373032A1) in view of Tan (Improved Recurrent Neural Networks for Session-based Recommendations), further in view of Bar (US 2007/0201641 A1).

Regarding claim 21, Merry teaches the method according to claim 1. However, Merry fails to teach further comprising: filtering out a usage record of an application from a history usage record of the at least two applications in the preset time period upon determining that usage time of the application is less than a preset time threshold.
On the other hand, Bar teaches further comprising: filtering out a usage record of an application from a history usage record of the at least two applications in the preset time period upon determining that usage time of the application is less than a preset time threshold ([0026] The processing could include, for example, filtering out service usage log entries that are associated with incoming calls, with outgoing calls that were not answered, and the like. The examiner considers an unanswered phone call to be the claimed application with a usage time that is less than a preset time threshold. The examiner also notes that Merry and Bar are considered to be analogous because they are in the same field of computer networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate further comprising: filtering out a usage record of an application from a history usage record of the at least two applications in the preset time period upon determining that usage time of the application is less than a preset time threshold as taught by Bar to create user data records [0026]).
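As a rough illustration of the filtering limitation discussed above, the following sketch drops any usage record whose duration is below a preset threshold. The `UsageRecord` fields, the function name, and the 5-second threshold are hypothetical; neither the claims nor Bar specify a record layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UsageRecord:
    app: str            # application identifier (hypothetical field)
    duration_s: float   # how long the app was used, in seconds


def drop_short_usage(records: List[UsageRecord],
                     min_duration_s: float) -> List[UsageRecord]:
    """Keep only records whose usage time is not less than the threshold."""
    return [r for r in records if r.duration_s >= min_duration_s]


history = [
    UsageRecord("mail", 42.0),
    UsageRecord("camera", 1.5),    # likely an accidental launch
    UsageRecord("browser", 310.0),
]
kept = drop_short_usage(history, min_duration_s=5.0)
print([r.app for r in kept])
```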

Regarding claim 22, Merry teaches the apparatus according to claim 12. However, Merry fails to teach wherein the processor is further configured to: filter out a usage record of an application from a history usage record of the at least two applications in the preset time period upon determining that usage time of the application is less than a preset time threshold.
On the other hand, Bar teaches wherein the processor is further configured to: filter out a usage record of an application from a history usage record of the at least two applications in the preset time period upon determining that usage time of the application is less than a preset time threshold ([0026] The processing could include, for example, filtering out service usage log entries that are associated with incoming calls, with outgoing calls that were not answered, and the like. The examiner considers an unanswered phone call to be the claimed application with a usage time that is less than a preset time threshold. The examiner also notes that Merry and Bar are considered to be analogous because they are in the same field of computer networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Merry’s application prediction model to incorporate wherein the processor is further configured to: filter out a usage record of an application from a history usage record of the at least two applications in the preset time period upon determining that usage time of the application is less than a preset time threshold as taught by Bar to create user data records [0026]).




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Apacible - US20080005736A1.
Apacible teaches a probabilistic and/or decision-theoretic reasoning model of application usage to predict application use.
Wang - US20150161139A1.
Wang teaches techniques to improve reasonability of displaying the data objects in the search result and provide more accurate result.
Martens - US20140189538A1.
Martens teaches the selection of an application based on the interaction between a user and a device.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAMCY ALGHAZZY whose telephone number is (571) 272-8824.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS can be reached on (571) 272-2589.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SHAMCY ALGHAZZY/           Examiner, Art Unit 2128        

/OMAR F FERNANDEZ RIVAS/           Supervisory Patent Examiner, Art Unit 2128