DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . In the event the determination of the status of the application as subject to AIA  35 U.S.C. §102 and §103 (or as subject to pre-AIA  35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

Priority
Examiner acknowledges Applicant's claim for benefit as a 371 of PCT/GB2016/052022 filed 7/5/2016 and for priority based on GB1511887 filed 7/7/2015 in Great Britain. 

Title
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. Examiner believes that the title of the invention is imprecise. A descriptive title indicative of the invention will help in proper indexing, classifying, searching, etc. See MPEP §606.01. However, the title of the invention should be limited to 500 characters. Examiner suggests including the aspect(s) of the claims which Applicant believes to be novel or nonobvious over the prior art. Examiner suggests the title: “NEURAL NETWORK HAVING THE SAME NUMBER OF NODES IN THE INPUT LAYER AND ONE HIDDEN LAYER WHICH USES THE SAME MATRIX FOR ENCODING AND DECODING”.
Response to Arguments
Applicant's arguments filed 1/6/2022 have been fully considered but they are not persuasive. Regarding page 1, Applicant argues that the title has been amended to reflect the contents of claim 1.
	Examiner believes that the title is still imprecise, and suggests the title: “NEURAL NETWORK HAVING THE SAME NUMBER OF UNITS IN INPUT LAYER AS ONE HIDDEN LAYER AND USING THE SAME MATRIX FOR ENCODING AND DECODING”.

Claim Objections
Response to Arguments
Applicant’s arguments, see page 8, filed 1/6/2022, with respect to the objection to claim 5 have been fully considered and are persuasive.  The objection to claim 5 has been withdrawn. 

Claim Rejections - 35 USC § 101
Response to Arguments
Applicant’s arguments, see pages 8-13, filed 1/6/2022, with respect to §101 have been fully considered and are persuasive.  The rejection of claims 1-15 under §101 has been withdrawn. 

PRIOR ART
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Specifically, applicant’s arguments are focused on the concept that “a single matrix (e.g., the claimed encoding matrix) is used as both the encoding matrix and the decoding matrix for the artificial neural network” which is taught by a new reference which was necessitated by this amendment.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. §103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. §102(b)(2)(C) for any potential 35 U.S.C. §102(a)(2) prior art against the later invention.
Claim(s) 1, 5-7, 11, and 14-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Deoras (US 2015/066496) in view of
Makhzani (“k-Sparse Autoencoders”).

Claim 1 (Independent)
Deoras discloses: An electronic device comprising:
a processor, at least one input interface, and an artificial neural network, comprising an input layer, an output layer and at least first and second hidden layers (e.g. §Abstract or ¶30: DNN or RNN comprises an input layer, a plurality of hidden layers, and an output layer or ¶¶50-56: weights between the input layer and the first hidden layer … weights learned … recurrent connections … output layer or Figures 8, 9, 11 and the associated disclosure); 
wherein the processor is configured to generate one or more predicted next items in a sequence of items (e.g. ¶6: labels predicted assigned to respective previous words in the sequence can be used for predicting the label for the word currently being considered or ¶60: For each word in the sequence of words … context nodes …to perform tasks, such as sequence prediction) based on an input sequence item received at the at least one input interface (e.g. ¶60: the respective raw input is propagated in a standard feed-forward fashion … context nodes … can maintain and learn a state summarizing past inputs … to perform tasks, such as sequence prediction) by: 
retrieving a context vector corresponding to the input sequence item [using stored data] (e.g. ¶60: context nodes maintain a copy of the previous values of the hidden nodes, since these propagated to the recurrent connections from t−1 before updating rule is applied at t. Therefore, the ST-DNN 802, when using the architecture set forth in FIG. 9, can maintain and learn a state summarizing past inputs, allowing the ST-DNN 802 to perform tasks, such as sequence prediction; Also see ¶¶62-64); 
processing the context vector with the artificial neural network (e.g. ¶39: context at the particular time … associated with an observed input sequence of words or ¶60: For each word in the sequence of words … context nodes maintain a copy of the previous values … can maintain and learn a state summarizing past inputs … to perform tasks, such as sequence prediction; Also see ¶¶52-57, 62-64);
generating an output vector at least by transforming the output of the second hidden layer of the artificial neural network using [the stored data], wherein the output vector corresponds to one or more predicted next items (e.g. ¶52: output layer … output vector … label or ¶58: assigning semantic labels to words in a sequence of words … outputting … sequence of labels to be assigned to the respective sequence of words); and 
outputting the output vector (e.g. ¶52: output or ¶58: outputting). 
Deoras fails to explicitly recite:
encoding matrix; 
wherein the number of units of the second hidden layer is equal to the number of units of the input layer.
Makhzani discloses:
wherein the number of units of the second hidden layer is equal to the number of units of the input layer (e.g. §1: k largest coefficients of its input vector or §4.6: two hidden layers … αk largest hidden codes, where k = 25, α = 3 in MNIST and k = 150, α = 2 in NORB in both hidden layers);
retrieving a context vector corresponding to the input sequence item from an encoding matrix (e.g. §3.2: P is the encoder weight matrix and W is the decoder weight matrix … assuming P = W^T (i.e., the autoencoder has tied weights)); 
generating an output vector at least by transforming the output of the second hidden layer of the artificial neural network using the encoding matrix, wherein the output vector corresponds to one or more predicted next items (e.g. §3.2: P is the encoder weight matrix and W is the decoder weight matrix … assuming P = W^T (i.e., the autoencoder has tied weights)).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Deoras to incorporate a second hidden layer having the same number of nodes as an input layer and the encoder matrix to be used to encode and decode (“tied weights” where the transpose of the encoding matrix is used to decode) as taught by Makhzani for the benefit of better classification while being simple to train and very fast (Makhzani e.g. §Abstract). 
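For illustration only, the "tied weights" arrangement referred to in the rationale above can be sketched as follows. This is a minimal sketch under stated assumptions: the matrix E, its sizes and values, and the function names are hypothetical and are not taken from Deoras, Makhzani, or the application. It shows one matrix serving both to retrieve a context vector (a row of E) and to transform a hidden-layer output back into per-item scores, which only works when that hidden output has the same length as the context vector.

```python
# Illustrative sketch only: E, its sizes and values are hypothetical.

def matvec(m, v):
    """Multiply matrix m (rows x cols) by column vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Encoding matrix E: one row per vocabulary item (3 items), vector size 2.
E = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

def encode(item_index):
    # Encoding: the context vector for an item is the matching row of E
    # (equivalent to applying the transpose of E to a 1-of-N vector).
    return E[item_index]

def decode(hidden_output):
    # Decoding: the same matrix E scores every item against the hidden
    # output, so hidden_output must have the same length as a row of E.
    return matvec(E, hidden_output)

scores = decode([0.5, 2.0])  # one score per vocabulary item
```

In this sketch the encoder and decoder share a single stored matrix, so no separate decoding matrix is held in memory.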

Claim 15 (Independent)
Deoras discloses: A computer-implemented method for generating one or more predicted next items in a sequence of items based on an input sequence item (§Abstract or ¶30: DNN or RNN comprises an input layer, a plurality of hidden layers, and an output layer or ¶¶50-56: weights between the input layer and the first hidden layer … weights learned … recurrent connections … output layer or Figures 8, 9, 11 and the associated disclosure), the method comprising: 
receiving, at an electronic device, an input sequence item (e.g. ¶60: the respective raw input is propagated in a standard feed-forward fashion … context nodes … can maintain and learn a state summarizing past inputs … to perform tasks, such as sequence prediction); 
retrieving, from [stored data], a context vector corresponding to the input sequence item (e.g. ¶60: context nodes maintain a copy of the previous values of the hidden nodes, since these propagated to the recurrent connections from t−1 before updating rule is applied at t. Therefore, the ST-DNN 802, when using the architecture set forth in FIG. 9, can maintain and learn a state summarizing past inputs, allowing the ST-DNN 802 to perform tasks, such as sequence prediction; Also see ¶¶62-64); 
processing the context vector with an artificial neural network, wherein the artificial neural network comprises an input layer, a first hidden layer, a second hidden layer and an output layer (e.g. ¶39: context at the particular time … associated with an observed input sequence of words or ¶60: For each word in the sequence of words … context nodes maintain a copy of the previous values … can maintain and learn a state summarizing past inputs … to perform tasks, such as sequence prediction; Also see ¶¶52-57, 62-64); 
retrieving an output vector at the output layer of the artificial neural network by transforming the output of the second hidden layer of the artificial neural network using at least some of [the stored data], wherein the output vector corresponds to one or more predicted next items (e.g. ¶52: output layer … output vector … label or ¶58: assigning semantic labels to words in a sequence of words … outputting … sequence of labels to be assigned to the respective sequence of words); and 
outputting the output vector (e.g. ¶52: output or ¶58: outputting).
Deoras fails to explicitly recite:
encoding matrix; 
wherein the number of units of the second hidden layer of the artificial neural network is equal to the number of units of the input layer of the artificial neural network.
Makhzani discloses:
retrieving, from an encoding matrix, a context vector corresponding to the input sequence item (e.g. §3.2: P is the encoder weight matrix and W is the decoder weight matrix … assuming P = W^T (i.e., the autoencoder has tied weights)); 
generating an output vector at least by transforming the output of the second hidden layer of the artificial neural network using the encoding matrix, wherein the output vector corresponds to one or more predicted next items (e.g. §3.2: P is the encoder weight matrix and W is the decoder weight matrix … assuming P = W^T (i.e., the autoencoder has tied weights));
wherein the number of units of the second hidden layer is equal to the number of units of the input layer of the artificial neural network (e.g. §1: k largest coefficients of its input vector or §4.6: two hidden layers … αk largest hidden codes, where k = 25, α = 3 in MNIST and k = 150, α = 2 in NORB in both hidden layers).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Deoras to incorporate a second hidden layer having the same number of nodes as an input layer and the encoder matrix to be used to encode and decode (“tied weights” where the transpose of the encoding matrix is used to decode) as taught by Makhzani for the benefit of better classification while being simple to train and very fast (Makhzani e.g. §Abstract). 

Claim 5
Deoras discloses: wherein the processor is configured to process the context vector with the artificial neural network by:
providing the context vector to the input layer of the artificial neural network (e.g. ¶¶52-55: input vector w(t) or ¶¶56-61: input vector x(t); Also see ¶¶62-64 ); 
multiplying the contents of the input layer with a first weight matrix W0 to generate a first result, and providing the first result to the first hidden layer of the artificial neural network (e.g. ¶¶52-55: weight matrices U, W, V or ¶¶56-61: U and V … weight matrices; Also see ¶¶62-64); 
processing the input to the first hidden layer with the nodes of the first hidden layer to produce an output of the first hidden layer (e.g. ¶¶52-55 or ¶¶56-61: V …weight matrices; Also see ¶¶62-64); 
multiplying the output of the first hidden layer with a second weight matrix W1 to generate a second result, and providing the second result to the second hidden layer of the artificial neural network (e.g. ¶¶52-55: weight matrices or ¶¶56-61: W … weight matrices; Also see ¶¶62-64); and
processing the input to the second hidden layer with the nodes of the second hidden layer to produce an output of the second hidden layer (e.g. ¶¶52-55: output y(t) or ¶¶56-61: c(t); Also see ¶¶62-64). 
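For illustration only, the sequence of steps recited in claim 5 can be sketched as a two-hidden-layer forward pass. This is a minimal sketch under stated assumptions: the tanh activation, the matrix shapes, and all function names are hypothetical choices and are not taken from Deoras or the application. W0 and W1 stand for the first and second weight matrices named in the claim.

```python
import math

def matvec(m, v):
    """Multiply matrix m (rows x cols) by column vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def activate(v):
    # Assumed node function; the claim does not name a specific activation.
    return [math.tanh(x) for x in v]

def forward(context, W0, W1):
    first_result = matvec(W0, context)   # input layer contents times W0
    h1 = activate(first_result)          # output of the first hidden layer
    second_result = matvec(W1, h1)       # first hidden output times W1
    h2 = activate(second_result)         # output of the second hidden layer
    return h1, h2

# Hypothetical shapes: input size 2, first hidden size 3, second hidden size 2.
W0 = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
W1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
h1, h2 = forward([1.0, 0.0], W0, W1)
```

With these shapes the second hidden layer has as many units as the input layer, which is the size relationship addressed in the rejection of claim 1.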

Claim 6
Deoras discloses: wherein the artificial neural network further comprises a recurrent hidden vector (e.g. ¶¶52-55: hidden layer in the middle with recurrent connections … layer (vector) or ¶60: recurrent connections; Also see ¶65 or ¶70). 

Claim 7
Deoras discloses: wherein the processor is configured to concatenate the contents of the input layer with the recurrent hidden vector prior to processing the contents of the input layer with the hidden layers of the artificial neural network (e.g. ¶47: n-gram lexical features can be represented as a concatenation of n “one of N coded” binary vectors, where N is the size of the lexical vocabulary or ¶64: recurrence can be combined with the idea of an input window. This can be achieved by feeding the network with concatenation of the t previous time steps vectors … in addition to the use of word context windows; Also see ¶7 or ¶56).

Claim 11
Deoras discloses:
context vector that corresponds to the input received at the at least one input interface (¶60: context nodes maintain a copy of the previous values of the hidden nodes, since these propagated to the recurrent connections from t−1 before updating rule is applied at t. Therefore, the ST-DNN 802, when using the architecture set forth in FIG. 9, can maintain and learn a state summarizing past inputs, allowing the ST-DNN 802 to perform tasks, such as sequence prediction; Also see ¶¶62-64).
Deoras fails to explicitly recite:
retrieving a row or column of the encoding matrix.
Makhzani discloses:
wherein the processor is configured to retrieve the context vector by retrieving a row or column of the encoding matrix (e.g. §1: columns of W; Also see §3.2 or §3.3).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Deoras to incorporate columns of the encoding matrix as taught by Makhzani for the benefit of better classification while being simple to train and very fast (Makhzani e.g. §Abstract).

Claim 14
Deoras discloses: wherein the processor is further configured to retrieve an output class prediction from the output layer of the artificial neural network, wherein the output class prediction defines a group of one or more sequence items (e.g. ¶63: output layer outputs a probability distribution over a sequence of labels that are respectively to be assigned; Also see §Abstract or ¶¶52-61).

Claim Rejections - 35 USC § 103
Claim(s) 2, 8-10, and 12-13 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Deoras (US 2015/066496) in view of
Makhzani (“k-Sparse Autoencoders”) further in view of
MacPherson (US 2009/0063557).

Claim 2
The combination of Deoras and Makhzani fails to explicitly recite:
locations of items in the multi-dimensional space. 
MacPherson discloses:
wherein the stored data is either values of parameters of a character-compositional model which is a predictor configured to compute a location of an item in a multi-dimensional space from individual characters of the item, or the stored data is item embeddings being locations of items in the multi-dimensional space (e.g. ¶¶108-110: Context Driven Topologies … the more associations each object accumulates, the more this changes the edges, or texture, of each of the multidimensional boundaries [FIG. 4] … Context is the measurement used in the time dependent topologies). 
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Deoras and Makhzani to incorporate data embedded/stored in multi-dimensional space as taught by MacPherson for the benefit of consolidated representations of groups of data associated with context information (MacPherson e.g. §Abstract). 

Claim 8
Deoras discloses: wherein the recurrent hidden vector comprises data indicative of a previous state of the artificial neural network (e.g. ¶7: RNN can be fed with a concatenation of a threshold number of previous steps in time (vectors)).

Claim 9
Deoras discloses: wherein the processor is configured to update the recurrent hidden vector according to the output of the first hidden layer (e.g. ¶60: parameter updating rule is applied taking into account the influence of past states through the recurrent connections. Accordingly, context nodes maintain a copy of the previous values of the hidden nodes, since these propagated to the recurrent connections from t−1 before updating rule is applied at t; Also see ¶¶67-69).

Claim 10
Deoras discloses: wherein the processor is configured to update the recurrent hidden vector by replacing the recurrent hidden vector with the output of the first hidden layer (e.g. ¶60: parameter updating rule is applied taking into account the influence of past states through the recurrent connections. Accordingly, context nodes maintain a copy of the previous values of the hidden nodes, since these propagated to the recurrent connections from t−1 before updating rule is applied at t or ¶¶67-69: update … all values must be recomputed from the beginning of the word sequence in order to obtain ‘correct’ predictions that are consistent with the current model training parameters … history of values can be kept as an approximation). 

Claim 12
Deoras discloses: wherein the processor is configured to produce a 1-of-N vector corresponding to the input sequence item and to retrieve the context vector by transforming the 1-of-N vector using the encoding matrix (e.g. ¶¶52-61: input layer (vector) w(t) represents an input word at time t encoded using 1-of-N coding … hidden layer maintains representation of the word sequence history … retain word ordering information using representations concatenated in sequence in a given context window … hidden layers used in connection with outputting probability over labels while incorporating long-term context dependencies when outputting a probability distribution over a sequence of labels).

Claim 13
Deoras discloses: wherein transforming the 1-of-N vector comprises multiplying the 1-of-N vector and the encoding matrix using matrix multiplication (e.g. ¶¶52-61: layers are connected with weights denoted by the matrices U, W, and V … input layer (vector) w(t) represents an input word at time t encoded using 1-of-N coding … activation computation [using Equation 6 and Equation 7] … assigning semantic labels … Mathematically represented as [Equations 8, 9, 10] where U and V are weight matrices ). 
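For illustration only, the relationship between the 1-of-N multiplication of claims 12-13 and the row retrieval of claim 11 can be shown with a small worked example. This is a minimal sketch under stated assumptions: the matrix E, its values, and the function names are hypothetical and are not taken from Deoras, Makhzani, or the application.

```python
def one_of_n(index, n):
    """Build a 1-of-N vector: 1.0 at position index, 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(n)]

def vecmat(v, m):
    """Multiply row vector v (length N) by matrix m (N x d), giving length d."""
    return [sum(v[i] * m[i][j] for i in range(len(m)))
            for j in range(len(m[0]))]

# Hypothetical encoding matrix: 3 items, vector size 2.
E = [[0.1, 0.2],
     [0.3, 0.4],
     [0.5, 0.6]]

# Matrix multiplication with the 1-of-N vector for item 1 ...
context_by_product = vecmat(one_of_n(1, 3), E)
# ... selects the same values as reading row 1 of the matrix directly.
context_by_lookup = E[1]
```

Because every entry of the 1-of-N vector except one is zero, the product reduces to selecting a single row, so the two retrieval routes yield the same context vector in this sketch.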

Claim Rejections - 35 USC § 103
Claim(s) 3 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Deoras (US 2015/066496) in view of
Makhzani (“k-Sparse Autoencoders”) further in view of
Ji (“Visual Exploration of Neural Document Embeddings in Information Retrieval: Semantics and Feature Selection”) further in view of
Liu (“Efficient Lattice Rescoring Using Recurrent Neural Network Language Models”).

Claim 3
The combination of Deoras and Makhzani and MacPherson fails to explicitly recite:
a cache.
Liu discloses:
where the encoding matrix is held in a cache and wherein the processor is configured to compute an item embedding corresponding to the input sequence item from the character-compositional model and to add the item embedding to the cache (e.g. §3.1: full preceding history … contexts of RNNLM probabilities … truncated contexts … approximated RNNLM state for the complete history … shared n-gram style … in practice operates as a state cache that stores the RNNLM probabilities associated with histories). 
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Deoras and Makhzani and MacPherson to incorporate a cache as taught by Liu for the benefit of integration into beam search based decoders (Liu e.g. §3.2). 

Claim Rejections - 35 USC § 103
Claim(s) 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Deoras (US 2015/066496) in view of
Makhzani (“k-Sparse Autoencoders”) further in view of
Liu (“Efficient Lattice Rescoring Using Recurrent Neural Network Language Models”).

Claim 4
The combination of Deoras and Makhzani fails to explicitly recite:
a cache.
Liu discloses:
where the processor is configured to retrieve the context vector corresponding to the input sequence by accessing an item embedding from a cache (e.g. §3.1: cache that stores the RNNLM probabilities associated with n-gram histories derived … can be computed on-the-fly by request and accessed via the cache; Also see §3.2). 
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Deoras and Makhzani to incorporate a cache as taught by Liu for the benefit of integration into beam search based decoders (Liu e.g. §3.2). 
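For illustration only, retrieving an item embedding through a cache that is filled on request can be sketched as follows. This is a minimal sketch under stated assumptions: the class EmbeddingCache, the function toy_embedding, and the two-number embedding are hypothetical stand-ins and do not reproduce the cache of Liu, which stores RNNLM probabilities, or any structure of the application.

```python
class EmbeddingCache:
    """Hypothetical cache: computes an embedding once, then reuses it."""

    def __init__(self, compute_fn):
        self._compute = compute_fn   # called only when an item is missing
        self._store = {}             # item -> embedding
        self.misses = 0              # number of on-request computations

    def get(self, item):
        if item not in self._store:
            self.misses += 1
            self._store[item] = self._compute(item)
        return self._store[item]

def toy_embedding(item):
    # Hypothetical stand-in for a model that builds a vector from the
    # characters of an item: [number of characters, number of vowels].
    vowels = sum(1 for ch in item if ch in "aeiou")
    return [float(len(item)), float(vowels)]

cache = EmbeddingCache(toy_embedding)
first = cache.get("hello")    # computed on request and stored
second = cache.get("hello")   # served from the cache
```

In this sketch the second request does not recompute the embedding, which is the general benefit a cache provides regardless of what values it holds.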

Examiner’s Note
The Examiner respectfully requests that Applicant, in preparing responses, fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention. It is noted, REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including non-preferred embodiments (see MPEP 2123). The Examiner has cited particular locations in the reference(s) as applied to the claim(s) above for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim(s), typically other passages and figures will apply as well.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any prior art made of record on the attached PTO-892 and not relied upon is considered pertinent to applicant's disclosure.
Applicant is reminded that in amending in response to a rejection of claims, the patentable novelty must be clearly shown in view of the state of the art disclosed by the references cited and the objections made.  Applicant must also show how the amendments avoid such references and objections.  See 37 CFR §1.111(c).  Additionally when amending, in their remarks Applicant should particularly cite to the supporting paragraphs in the original disclosure for the amendments.

Correspondence Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN J BUSS whose telephone number is (571)272-5831.  The examiner can normally be reached on Monday, Tuesday, Thursday 9A-5P ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on 571-270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
As detailed in MPEP 502.03, communications via Internet e-mail are at the discretion of the applicant.  Without a written authorization by applicant in place, the USPTO will not respond via Internet e-mail to any Internet correspondence which contains information subject to the confidentiality requirement as set forth in 35 U.S.C. 122. A paper copy of such correspondence will be placed in the appropriate patent application. Examiner suggests filing PTO/SB/439 if applicant desires the examiner to be able to communicate by email.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 


/B.B./
Examiner, Art Unit 2127

/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121