DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments have been fully considered but are moot in light of a new rejection. 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3-4, 6-11, 13, 15-16, and 20-22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Audhkhasi et al. (US 20170270100 A1) in view of Wolf (US 2019/0087677 A1) further in view of Sundermeyer, Martin, Hermann Ney, and Ralf Schlüter. "From feedforward to recurrent LSTM neural networks for language modeling.".
Regarding claim 1, Audhkhasi teaches An electronic device comprising: a processor (Audhkhasi Paragraph [0007] teaches one or more processors), 
and at least one input interface configured to receive one or more input sequence items (Audhkhasi Paragraph [0005] “The method comprises configuring the data processing system with an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches receiving an input sequence); 
wherein the processor is configured to: implement an artificial neural network (Audhkhasi Paragraph [0005] teaches implementing a neural network); 
wherein the artificial neural network does not internally maintain any record of context beyond the context that is inherent as a result of the training of the neural network (Audhkhasi [0052] “The recent success of semantic word embeddings in NLP has occurred in parallel with the advent of neural network language models (NNLMs). Models such as feed-forward NNLMs (FNNLMs), recurrent NNLMs (RNNLMs), and uni-/bi-directional long-short term memory (LSTM) RNNLMs are used in almost all state-of-the-art automatic speech recognition (ASR) systems in combination with the traditional N-gram LMs that do not learn a continuous distributed word representation”)
and generate one or more predicted next items in the current sequence of items using the artificial neural network by providing items from the current sequence of items as a side input to an input layer of the artificial neural network (Audhkhasi Paragraph [0005] “an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches predicting the next word in a current sequence based on an input sequence and side input (i.e. external word embedding matrix)), 
Audhkhasi, however, fails to teach the rest of the claim limitations, including the unigram limitations.
Wolf teaches generate a unigram count vector including counts of respective ones of the input sequence items, received at the at least one input interface prior to a current sequence of items of the input sequence items ([0061] “when the lexicon includes words in the English language and the one of the words is, say, "BABY", it can be associated with a frequency profile including a set of one or more attributes selected from a list consisting of at least the following attributes: (i) the unigram "B" appears twice, (ii) the unigram "B" appears one time in the first half of the word, (iii) the unigram "B" appears one time in the second half of the word” which shows a unigram frequency count received at an input interface, the user interface) 
wherein the counts of respective ones of the input sequence items include a cumulative count of input instances by a user, at the at least one input interface, or a particular unigram (previous citation, [0061] “[…](v) the unigram "A" appears once in the first half of the word, (vi) the unigram "Y" appears once, (vii) the unigram "Y" appears once in the second half of the word, (viii) the unigram "Y" appears at the end of the word, (ix) the bigram "BA" appears once, (x) the bigram "BA" appears once in the first half of the word, (xi) the bigram "BY" appears once, (xii) the bigram "BY" appears once in the second half of the word, (xiii) the bigram "AB" appears once, (xiv) the bigram "AB" appears once in the middle segment of the word, (xv) the trigram "BAB" appears once, (xvi) the trigram "BAB" appears once at the first three quarters of the word, (xvii) the trigram "ABY" appears once, etc” which shows counting the frequency of each unigram i.e. the cumulative count of input instances)
generate one or more predicted next time items… using... the unigram count vector as a side input ([0073] “Typically, a subset of attributes can comprise a rank of an n-gram (e.g., unigram, bigram, trigram, etc.), a segmentation level of the input image patch (halves, thirds, quarters, fifths, etc.), and a location of a segment of the input image patch (first half, second half, first third etc.) containing the n-gram[…]Unlike conventional CNNs that do not include parallel branches or include branches only during training, CNN 30 of the present embodiments includes a plurality of branches that are utilized both during training and during prediction phase.” which shows using the unigram for predicting)
Audhkhasi and Wolf are analogous in the art of language predictive modeling. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Audhkhasi’s system to keep track of previous input sequences in a unigram count vector. Doing so would allow the system to use previous input to further enhance predictions for the current sequence.
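For illustration only, the frequency profile quoted from Wolf [0061] for the word "BABY" can be reproduced with a minimal sketch. The helper name ngram_counts is hypothetical and does not appear in any cited reference; the sketch only counts contiguous n-grams and does not model Wolf's segment-position attributes.

```python
from collections import Counter


def ngram_counts(word, n):
    """Count every contiguous n-gram of length n in the given word."""
    return Counter(word[i:i + n] for i in range(len(word) - n + 1))


# Counts for the example word quoted from Wolf [0061].
unigrams = ngram_counts("BABY", 1)  # B twice, A once, Y once
bigrams = ngram_counts("BABY", 2)   # BA, AB, BY once each
trigrams = ngram_counts("BABY", 3)  # BAB, ABY once each
```

Under these assumptions the unigram "B" is counted twice and each listed bigram and trigram is counted once, consistent with the quoted attributes.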
Neither reference, however, teaches the remaining limitations. Sundermeyer, however, teaches wherein the unigram count vector includes parameters for long-term context, the parameters comprising a record of uses of one or more unigrams over a time period including uses occurring before a current session (pg. 519 left col. last ¶ into right col. ¶1 “Word classes can be derived from the training data in an unsupervised fashion in various ways. E.g., in [14], word classes were obtained based on unigram frequencies. In [33] and [34], a perplexity-based approach was proposed for obtaining word classes.”)
 wherein the unigram count vector further includes historical unigram inputs of the user (pg. 520 ¶ above §C “On the other hand, by evaluating the RNN equations word by word for an entire sequence, the output layer result for a word depends on the entire sequence of history words . The activations of the recurrent hidden layer then act as a memory that automatically stores previous word information”)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Audhkhasi and Wolf with that of Sundermeyer, since a combination of known methods would yield predictable results. As shown in Sundermeyer, it is known to count the frequency of unigrams, which also includes historical data. By using this data in context with neural networks such as those of the previously cited references, one can have better training and modeling/classification.
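As a hedged sketch of the claimed limitation discussed above, a unigram count vector that accumulates uses across sessions, including uses before the current session, might look like the following. The vocabulary VOCAB, the helper names, and the sample sessions are assumptions introduced for illustration and are not taken from the application or the cited references.

```python
from collections import Counter

# Hypothetical fixed vocabulary; each position of the vector maps to one unigram.
VOCAB = ["the", "cat", "sat", "mat"]


def update_counts(counts, session_items):
    """Add the in-vocabulary items of one session to the running counts."""
    counts.update(item for item in session_items if item in VOCAB)
    return counts


def count_vector(counts):
    """Lay the cumulative counts out in vocabulary order."""
    return [counts[word] for word in VOCAB]


history = Counter()
update_counts(history, ["the", "cat", "sat"])  # an earlier session
update_counts(history, ["the", "mat"])         # a later session
side_input = count_vector(history)             # cumulative count per unigram
```

Here side_input holds one cumulative count per vocabulary entry, so "the" contributes 2 because it was entered in both sessions.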
Regarding claim 4, Audhkhasi in view of Wolf teaches the limitations as described in claim 1 above. Audhkhasi further teaches wherein the processor is configured to generate the one or more next predicted items in the current sequence of items by providing first and second items from the current sequence of items and the side input as inputs to the input layer of the artificial neural network (Audhkhasi Paragraph [0073] “If the NL processing system determines that the end of the sequence of words is not reached, operation returns to block 901 to continue to receive the sequence of words” teaches that a second input sequence can be received by continuing to process the sequence of words).
Regarding claim 6, Audhkhasi in view of Wolf teaches the limitations as described in claim 1 above. Audhkhasi further teaches wherein the artificial neural network is a fixed context neural network (Audhkhasi Paragraph [0047] “The neural net architecture might be feed-forward, recurrent, or another type of neural network language model” teaches the neural network can be feed-forward which is a type of fixed-context neural network)
Regarding claim 7, Audhkhasi in view of Wolf teaches the limitations as described in claim 1 above. Audhkhasi further teaches wherein the processor is configured to generate the one or more predicted next items in the current sequence of items by further providing one or more additional input sequence to the input layer of the artificial neural network (Audhkhasi Paragraph [0073-0074] “The NL processing system processes the current word in the sequence of words based on the predicted current word (block 903). Then, the NL processing system determines whether the end of the sequence is reached (block 904). If the NL processing system determines that the end of the sequence of words is not reached, operation returns to block 901 to continue to receive the sequence of words” teaches that additional sequence items can be input to the neural network to generate the next predicted item)
Regarding claim 8, Audhkhasi in view of Wolf teaches the limitations as described in claim 7 above. Audhkhasi further teaches wherein the input sequence item and one or more additional sequence items are consecutive previous items from the current sequence of items (Audhkhasi Paragraph [0073] “If the NL processing system determines that the end of the sequence of words is not reached, operation returns to block 901 to continue to receive the sequence of words” teaches that a consecutive sequence of words can be processed).
Regarding claim 9, Audhkhasi in view of Wolf teaches the limitations as described in claim 1 above. Audhkhasi further teaches wherein the current sequence of items and the side input are concatenated to form an input vector that is provided to the input layer of the artificial neural network (Audhkhasi Paragraph [0059] “that concatenates the D.sub.G-dimensional external word embeddings such as the Glove semantic embeddings G to the history and prediction word embeddings R.sub.H and R.sub.P” teaches that the external word embeddings can be concatenated to the history embeddings which are the sequence items).
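The concatenation recited in claim 9 can be sketched as follows, purely for illustration. The one-hot encoding of sequence items, the vocabulary, and the function names are assumptions; Audhkhasi [0059] concatenates word embeddings rather than one-hot vectors.

```python
def one_hot(index, size):
    """Return a list of zeros of the given size with a 1 at index."""
    vec = [0] * size
    vec[index] = 1
    return vec


def build_input_vector(current_items, side_input, vocab):
    """Concatenate encoded sequence items with the side input."""
    vec = []
    for item in current_items:
        vec.extend(one_hot(vocab.index(item), len(vocab)))
    vec.extend(side_input)  # side input appended after the sequence items
    return vec


vocab = ["the", "cat", "sat", "mat"]  # hypothetical vocabulary
input_vector = build_input_vector(["the", "cat"], [2, 1, 1, 1], vocab)
```

The resulting single vector would then be provided to the input layer; its length is the number of sequence items times the vocabulary size, plus the length of the side input.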
Regarding claim 10, Audhkhasi in view of Wolf teaches the limitations as described in claim 1 above. Audhkhasi further teaches wherein the artificial neural network is a recurrent neural network (Audhkhasi Paragraph [0047] “The neural net architecture might be feed-forward, recurrent, or another type of neural network language model” teaches the neural network can be a recurrent neural network).
Regarding claim 11, Audhkhasi in view of Wolf teaches the limitations as described in claim 10 above. Audhkhasi further teaches wherein the processor is configured to generate one or more predicted next items in the sequence of items by: processing the side input with the artificial neural network by providing the side input to the input layer of the artificial neural network to initialise the artificial neural network (Audhkhasi Paragraph [0062] “However, the weights of the hidden to output layer connections are viewed as a word embedding matrix R.sub.P. Hence, the EWE-NNLM expands the hidden layer by the external embedding size D.sub.G and adds new connections between these new hidden neurons and the output layer with weight matrix G.sup.T” teaches that the architecture of the neural network is initialized based on the side input); and processing the current sequence of items with the artificial neural network by providing the current sequence of items to the input layer of the initialized artificial neural network to generate the one or more predicted next items in the current sequence of items (Audhkhasi Paragraph [0005] “an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches predicting the next word in a sequence based on an input sequence).
Regarding claim 13, Audhkhasi in view of Wolf teaches the limitations as described in claim 12 above. Audhkhasi further teaches wherein the processor is configured to generate the one or more next predicted items in the sequence by providing a first and second items from the current sequence of items as the input to the input layer of the artificial neural network (Audhkhasi Paragraph [0073] “If the NL processing system determines that the end of the sequence of words is not reached, operation returns to block 901 to continue to receive the sequence of words” teaches that a first and second input sequence can be received by continuing to process the sequence of words).
Regarding claim 15, Audhkhasi teaches a processor (Audhkhasi Paragraph [0007] teaches one or more processors), 
and at least one input interface configured to receive input sequence items (Audhkhasi Paragraph [0005] “The method comprises configuring the data processing system with an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches receiving an input sequence);  
wherein the processor is configured to: implement an artificial neural network (Audhkhasi Paragraph [0005] teaches implementing a neural network);
estimate an initial state of the artificial neural network by applying a side input to an input layer of the artificial neural network, (Audhkhasi Paragraph [0062] “However, the weights of the hidden to output layer connections are viewed as a word embedding matrix R.sub.P. Hence, the EWE-NNLM expands the hidden layer by the external embedding size D.sub.G and adds new connections between these new hidden neurons and the output layer with weight matrix G.sup.T” teaches that the architecture of the neural network is initialized based on the side input and Audhkhasi Paragraph [0005] “The external word embedding neural network language model combines an external embedding matrix both with a history word embedding matrix and a prediction word embedding matrix of the external word embedding neural network language model. The method further comprises receiving a sequence of input words by the data processing system. The method further comprises applying a plurality of previous words in the sequence of input words as inputs to the external word embedding neural network language model” teaches an external word embedding matrix (i.e. side input) that maintains a history of input sequences); 
wherein the artificial neural network does not internally maintain any record of context beyond the context that is inherent as a result of the training of the neural network (Audhkhasi [0052] “The recent success of semantic word embeddings in NLP has occurred in parallel with the advent of neural network language models (NNLMs). Models such as feed-forward NNLMs (FNNLMs), recurrent NNLMs (RNNLMs), and uni-/bi-directional long-short term memory (LSTM) RNNLMs are used in almost all state-of-the-art automatic speech recognition (ASR) systems in combination with the traditional N-gram LMs that do not learn a continuous distributed word representation”)

and generate one or more predicted next items in the current sequence of items using the artificial neural network by providing the current sequence of items, received at the at least one input interface, to the input layer of the artificial neural network (Audhkhasi Paragraph [0005] “an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches predicting the next word in a sequence based on an input sequence and side input explained below).
and generate one or more predicted next items in the current sequence of items using the artificial neural network by providing items from the current sequence of items as a side input to an input layer of the artificial neural network (Audhkhasi Paragraph [0005] “an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches predicting the next word in a current sequence based on an input sequence and side input (i.e. external word embedding matrix)), 
Audhkhasi, however, fails to teach the rest of the claim limitations, including the unigram limitations.
Wolf teaches generate a unigram count vector including counts of respective ones of the input sequence items, received at the at least one input interface prior to a current sequence of items of the input sequence items ([0061] “when the lexicon includes words in the English language and the one of the words is, say, "BABY", it can be associated with a frequency profile including a set of one or more attributes selected from a list consisting of at least the following attributes: (i) the unigram "B" appears twice, (ii) the unigram "B" appears one time in the first half of the word, (iii) the unigram "B" appears one time in the second half of the word” which shows a unigram frequency count received at an input interface, the user interface)
wherein the counts of respective ones of the input sequence items include a cumulative count of input instances by a user, at the at least one input interface, or a particular unigram (previous citation, [0061] “[…](v) the unigram "A" appears once in the first half of the word, (vi) the unigram "Y" appears once, (vii) the unigram "Y" appears once in the second half of the word, (viii) the unigram "Y" appears at the end of the word, (ix) the bigram "BA" appears once, (x) the bigram "BA" appears once in the first half of the word, (xi) the bigram "BY" appears once, (xii) the bigram "BY" appears once in the second half of the word, (xiii) the bigram "AB" appears once, (xiv) the bigram "AB" appears once in the middle segment of the word, (xv) the trigram "BAB" appears once, (xvi) the trigram "BAB" appears once at the first three quarters of the word, (xvii) the trigram "ABY" appears once, etc” which shows counting the frequency of each unigram i.e. the cumulative count of input instances)
generate one or more predicted next time items… using... the unigram count vector as a side input ([0073] “Typically, a subset of attributes can comprise a rank of an n-gram (e.g., unigram, bigram, trigram, etc.), a segmentation level of the input image patch (halves, thirds, quarters, fifths, etc.), and a location of a segment of the input image patch (first half, second half, first third etc.) containing the n-gram[…]Unlike conventional CNNs that do not include parallel branches or include branches only during training, CNN 30 of the present embodiments includes a plurality of branches that are utilized both during training and during prediction phase.” which shows using the unigram for predicting)
Audhkhasi and Wolf are analogous in the art of language predictive modeling. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Audhkhasi’s system to keep track of previous input sequences in a unigram count vector. Doing so would allow the system to use previous input to further enhance predictions for the current sequence.
Neither reference, however, teaches the unigram count vector including historical inputs or the remaining limitations. Sundermeyer, however, teaches wherein the unigram count vector includes parameters for long-term context, the parameters comprising a record of uses of one or more unigrams over a time period including uses occurring before a current session (pg. 519 left col. last ¶ into right col. ¶1 “Word classes can be derived from the training data in an unsupervised fashion in various ways. E.g., in [14], word classes were obtained based on unigram frequencies. In [33] and [34], a perplexity-based approach was proposed for obtaining word classes.”)
 wherein the unigram count vector further includes historical unigram inputs of the user (pg. 520 ¶ above §C “On the other hand, by evaluating the RNN equations word by word for an entire sequence, the output layer result for a word depends on the entire sequence of history words . The activations of the recurrent hidden layer then act as a memory that automatically stores previous word information”)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Audhkhasi and Wolf with that of Sundermeyer, since a combination of known methods would yield predictable results. As shown in Sundermeyer, it is known to count the frequency of unigrams, which also includes historical data. By using this data in context with neural networks such as those of the previously cited references, one can have better training and modeling/classification.

Regarding claim 16, Audhkhasi in view of Wolf teaches the limitations as described in claim 15 above. Audhkhasi further teaches wherein the artificial neural network is a recurrent neural network (Audhkhasi Paragraph [0047] “The neural net architecture might be feed-forward, recurrent, or another type of neural network language model” teaches the neural network can be a recurrent neural network).
Regarding claim 17, Audhkhasi in view of Wolf teaches the limitations as described in claim 16 above. Wolf further teaches wherein the processor estimates an initial state of the artificial neural network by estimating values for a recurrent hidden vector of the recurrent neural network based on the side input (Wolf [0096] “Instead of relying on the probabilities at the output layers of the CNN, CCA is optionally and preferably applied to a representation vector obtained from one or more of the hidden layers, namely below the output layers”). 
Regarding claim 18, Audhkhasi in view of Wolf teaches the limitations as described in claim 17 above. Audhkhasi further teaches wherein the input layer of the artificial neural network comprises a side input layer configured to receive the side input and a main input layer configured to receive the current sequence of items (Audhkhasi Paragraph [0071] “The system expands the NNLM with the external word embeddings to form a EWE-NNLM (block 805)” teaches expanding the neural network to include the side input which can be viewed as a side input layer to allow the external word embeddings seen in Figure 7).
Regarding claim 20, Audhkhasi teaches A computer-implemented method comprising: 
receiving input sequence items using an input interface (Audhkhasi Paragraph [0005] “The method comprises configuring the data processing system with an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches receiving an input sequence);
implementing, at a processor, an artificial neural network (Audhkhasi Paragraph [0005] teaches implementing a neural network); 
estimate an initial state of the artificial neural network by applying a side input to an input layer of the artificial neural network, (Audhkhasi Paragraph [0062] “However, the weights of the hidden to output layer connections are viewed as a word embedding matrix R.sub.P. Hence, the EWE-NNLM expands the hidden layer by the external embedding size D.sub.G and adds new connections between these new hidden neurons and the output layer with weight matrix G.sup.T” teaches that the architecture of the neural network is initialized based on the side input and Audhkhasi Paragraph [0005] “The external word embedding neural network language model combines an external embedding matrix both with a history word embedding matrix and a prediction word embedding matrix of the external word embedding neural network language model. The method further comprises receiving a sequence of input words by the data processing system. The method further comprises applying a plurality of previous words in the sequence of input words as inputs to the external word embedding neural network language model” teaches an external word embedding matrix (i.e. side input) that maintains a history of input sequences);  
wherein the artificial neural network does not internally maintain any record of context beyond the context that is inherent as a result of the training of the neural network (Audhkhasi [0052] “The recent success of semantic word embeddings in NLP has occurred in parallel with the advent of neural network language models (NNLMs). Models such as feed-forward NNLMs (FNNLMs), recurrent NNLMs (RNNLMs), and uni-/bi-directional long-short term memory (LSTM) RNNLMs are used in almost all state-of-the-art automatic speech recognition (ASR) systems in combination with the traditional N-gram LMs that do not learn a continuous distributed word representation”)
and generate one or more predicted next items in the current sequence of items using the artificial neural network by providing the current sequence of items, received at the at least one input interface, to the input layer of the artificial neural network (Audhkhasi Paragraph [0005] “an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches predicting the next word in a sequence based on an input sequence and side input explained below).
and generate one or more predicted next items in the current sequence of items using the artificial neural network by providing items from the current sequence of items as a side input to an input layer of the artificial neural network (Audhkhasi Paragraph [0005] “an external word embedding neural network language model that accepts as input a sequence of words and predicts a current word based on the sequence of words” teaches predicting the next word in a current sequence based on an input sequence and side input (i.e. external word embedding matrix)), 
Audhkhasi, however, fails to teach the rest of the claim limitations, including the unigram limitations.
Wolf teaches generate a unigram count vector including counts of respective ones of the input sequence items, received at the at least one input interface prior to a current sequence of items of the input sequence items ([0061] “when the lexicon includes words in the English language and the one of the words is, say, "BABY", it can be associated with a frequency profile including a set of one or more attributes selected from a list consisting of at least the following attributes: (i) the unigram "B" appears twice, (ii) the unigram "B" appears one time in the first half of the word, (iii) the unigram "B" appears one time in the second half of the word” which shows a unigram frequency count received at an input interface, the user interface)
wherein the counts of respective ones of the input sequence items include a cumulative count of input instances by a user, at the at least one input interface, or a particular unigram (previous citation, [0061] “[…](v) the unigram "A" appears once in the first half of the word, (vi) the unigram "Y" appears once, (vii) the unigram "Y" appears once in the second half of the word, (viii) the unigram "Y" appears at the end of the word, (ix) the bigram "BA" appears once, (x) the bigram "BA" appears once in the first half of the word, (xi) the bigram "BY" appears once, (xii) the bigram "BY" appears once in the second half of the word, (xiii) the bigram "AB" appears once, (xiv) the bigram "AB" appears once in the middle segment of the word, (xv) the trigram "BAB" appears once, (xvi) the trigram "BAB" appears once at the first three quarters of the word, (xvii) the trigram "ABY" appears once, etc” which shows counting the frequency of each unigram i.e. the cumulative count of input instances)
generate one or more predicted next time items… using... the unigram count vector as a side input ([0073] “Typically, a subset of attributes can comprise a rank of an n-gram (e.g., unigram, bigram, trigram, etc.), a segmentation level of the input image patch (halves, thirds, quarters, fifths, etc.), and a location of a segment of the input image patch (first half, second half, first third etc.) containing the n-gram[…]Unlike conventional CNNs that do not include parallel branches or include branches only during training, CNN 30 of the present embodiments includes a plurality of branches that are utilized both during training and during prediction phase.” which shows using the unigram for predicting)
Audhkhasi and Wolf are analogous in the art of language predictive modeling. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Audhkhasi’s system to keep track of previous input sequences in a unigram count vector. Doing so would allow the system to use previous input to further enhance predictions for the current sequence.
Neither reference, however, teaches the remaining limitations. Sundermeyer, however, teaches wherein the unigram count vector includes parameters for long-term context, the parameters comprising a record of uses of one or more unigrams over a time period including uses occurring before a current session (pg. 519 left col. last ¶ into right col. ¶1 “Word classes can be derived from the training data in an unsupervised fashion in various ways. E.g., in [14], word classes were obtained based on unigram frequencies. In [33] and [34], a perplexity-based approach was proposed for obtaining word classes.”)
 wherein the unigram count vector further includes historical unigram inputs of the user (pg. 520 ¶ above §C “On the other hand, by evaluating the RNN equations word by word for an entire sequence, the output layer result for a word depends on the entire sequence of history words . The activations of the recurrent hidden layer then act as a memory that automatically stores previous word information”)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Audhkhasi and Wolf with that of Sundermeyer, since a combination of known methods would yield predictable results. As shown in Sundermeyer, it is known to count the frequency of unigrams, which also includes historical data. By using this data in context with neural networks such as those of the previously cited references, one can have better training and modeling/classification.
Claims 3, 5 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Audhkhasi et al. (US 20170270100 A1) in view of Wolf (US 2019/0087677 A1) further in view of Sundermeyer, Martin, Hermann Ney, and Ralf Schlüter. "From feedforward to recurrent LSTM neural networks for language modeling." and Rastrow et al. (US 9600764 B1).
Regarding claim 3, Audhkhasi in view of Wolf teaches the limitations as described in claim 1 above. Audhkhasi further teaches wherein the processor is configured to generate one or more subsequent predicted items in the current sequence of items (Audhkhasi Paragraph [0072] “The NL processing system predicts a current word using the EWE-NNLM (block 902)” teaches generating a predicted word for the sequence).
Audhkhasi, Wolf, and Sundermeyer fail to teach the one or more subsequent predicted items occurring after the one or more predicted next items in the current sequence of items.
Rastrow teaches the one or more subsequent predicted items occurring after the one or more predicted next items in the current sequence of items (Rastrow Col. 4 lines 23-29 “data indicating the predicted label for the previous position in the sequence. For example, the data indicating the predicted label for the previous position may be an array with an element for each possible label, where each element is set to 0 except the element corresponding to the predicted previous label, which is set to 1” teaches that data can be input to a neural network including the most recently predicted item).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Audhkhasi and Wolf and Sundermeyer to include the technique taught by Rastrow to submit a previously predicted output as input to the neural network. This would provide the benefit of being able to continue to predict the sequence based on the previous predictions.
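For illustration only, the technique described in the quoted passage of Rastrow (an array with one element per possible label, set to 0 except for the element of the previously predicted label, which is set to 1, supplied as input for the next position) can be sketched as below. This is a minimal sketch under stated assumptions and is not Rastrow's implementation; `one_hot`, `predict_sequence`, and `toy_model` are hypothetical names, and `toy_model` is a stand-in for a trained neural network.

```python
from typing import Callable, List, Sequence


def one_hot(index: int, num_labels: int) -> List[int]:
    """Array with one element per label: all 0 except the given label, set to 1."""
    if not 0 <= index < num_labels:
        raise ValueError("label index out of range")
    vec = [0] * num_labels
    vec[index] = 1
    return vec


def predict_sequence(
    model: Callable[[Sequence[int]], Sequence[float]],
    first_label: int,
    num_labels: int,
    steps: int,
) -> List[int]:
    """Repeatedly feed the previously predicted label back in as input."""
    predictions: List[int] = []
    previous = first_label
    for _ in range(steps):
        scores = model(one_hot(previous, num_labels))
        # Take the highest-scoring label as the prediction for this position.
        previous = max(range(num_labels), key=lambda i: scores[i])
        predictions.append(previous)
    return predictions


def toy_model(prev_one_hot: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a network: favors the label after the previous one."""
    n = len(prev_one_hot)
    prev = list(prev_one_hot).index(1)
    return [1.0 if i == (prev + 1) % n else 0.0 for i in range(n)]
```

With the stand-in model, `predict_sequence(toy_model, 0, 4, 5)` produces each prediction from the one before it, which is the sense in which subsequent predicted items occur after the predicted next item.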
Regarding claim 5, Audhkhasi in view of Wolf teaches the limitations as described in claim 4 above. 
Audhkhasi and Wolf and Sundermeyer fail to teach wherein the second item from the current input sequence is the previously predicted next item in the sequence output by the artificial neural network.
Rastrow teaches wherein the second item from the current input sequence is the previously predicted next item in the sequence output by the artificial neural network (Rastrow Col. 4 lines 23-29 “data indicating the predicted label for the previous position in the sequence. For example, the data indicating the predicted label for the previous position may be an array with an element for each possible label, where each element is set to 0 except the element corresponding to the predicted previous label, which is set to 1” teaches that data can be input to a neural network including the most recently predicted item).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Audhkhasi and Wolf and Sundermeyer to include the technique taught by Rastrow to submit a previously predicted output as input to the neural network. This would provide the benefit of being able to continue to predict the sequence based on the previous predictions.
Regarding claim 14, applicant is directed to the citation for claim 5 above.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN W FIGUEROA whose telephone number is (571)272-4623. The examiner can normally be reached Monday-Friday, 10AM-6PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MIRANDA HUANG, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

KEVIN W FIGUEROA
Primary Patent Examiner
Art Unit 2124



/Kevin W Figueroa/Primary Examiner, Art Unit 2124