DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 03/14/2019 and 03/24/2019 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) use(s) a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “input interface” and “processing unit” in claim 1. Sufficient hardware structure and its corresponding function in the claim appear to be supported in paragraphs [0050], [0051], and [0092] of the instant application’s written description.
Because this/these claim limitation(s) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Shi et al. (“An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition”) in view of Horvath et al. ("Cellular neural network friendly convolutional neural networks—CNNs with CNNs.").
Regarding Claim 1,
Shi teaches an artificial intelligence device for keywords detection comprising: 
a bus (pg. 2301, section 3.2; Experiments are carried out on a workstation with a 2.50 GHz Intel(R) Xeon(R) E5-2609 CPU, 64 GB RAM and an NVIDIA(R) Tesla(TM) K40 GPU. The computer components contain a bus.); 
an input interface operatively connecting to the bus for receiving an input string of texts (fig. 2; pg. 2299, section 2; At the bottom of CRNN, the convolutional layers automatically extract a feature sequence from each input image. Images contain a list of characters (i.e. string). pg. 2301, section 3.2; For example, an image containing 10 characters is typically of size 100 × 32, from which a feature sequence with 25 frames can be generated.); 
a processing unit operatively connecting to the bus for forming a two-dimensional (2-D) symbol using a 2-D symbol creation application module installed thereon (pg. 2301, section 3.2; Experiments are carried out on a workstation with a 2.50 GHz Intel(R) Xeon(R) E5-2609 CPU, 64 GB RAM and an NVIDIA(R) Tesla(TM) K40 GPU. Networks are trained with ADADELTA [27], setting the parameter ρ to 0.9. For training efficiency, we first train our model on rescaled training images, whose sizes are 100 × 32, for about 250k iterations. Then, we continue the training with variable-size images for another 50k iterations. In this stage, we first sort all training images by their aspect ratios. In each iteration, we randomly pick a batch of consecutive images and rescale them to have height 32 and their median aspect ratio. By doing this, we achieve efficient variable-size training, with negligible distortions on training images. Images (i.e. 2-D symbols).), the 2-D symbol being a matrix of NxN pixels of data for containing the input string of texts, where N is a positive integer (pg. 2301, section 3.2; For example, an image containing 10 characters is typically of size 100 × 32, from which a feature sequence with 25 frames can be generated.); and 
operatively connecting to the bus, a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit loaded with a deep learning model for detecting whether the input string of texts contains one of a list of keywords in a category of interest (pg. 2301-2302, section 3.3; Unlike [6], CRNN is not limited to recognize a word in a known dictionary, and able to handle random strings (e.g., telephone numbers), sentences or other scripts like Chinese words. Therefore, the results of CRNN are competitive on all the testing datasets.), filter coefficients (pg. 2302, section 3.3; In CRNN, all layers have weight-sharing connections, and the fully-connected layers are not needed. Weights (i.e filter coefficients).) of a plurality of ordered convolutional layers (pg. 2299, section 2; The network architecture of CRNN, as shown in Fig. 1, consists of three components, including the convolutional layers, the recurrent layers, and a transcription layer, from bottom to top.) in the deep learning model being trained using a keyword detection training dataset (pg. 2301, section 3.1; The dataset contains 8 millions training images and their corresponding ground truth words.) with an image classification technique (pg. 2298, section 1; Some other approaches (such as [6]) treat scene text recognition as an image classification problem, and assign a class label to each English word (90K words in total).).
Shi does not explicitly disclose
a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit
However, Horvath teaches
a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit (pg. 145, Abs.; This paper discusses the development and evaluation of a Cellular Neural Network (CeNN) friendly deep learning network for solving the MNIST digit recognition problem.)
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Shi with the Cellular neural network of Horvath.
The results have been shown to improve the energy efficiency, performance, and accuracy over a traditional convolutional neural network. Studies show application-level speedups of 3.7X and energy savings of 6.3X. (pg. 145; Thus, while there are growing efforts to improve the energy efficiency, performance, and accuracy of inference applications, systematic studies to compare energy, delay, and accuracy tradeoffs for a common application (i.e., when different computational models, devices, etc. are employed) are limited… as a case study, we use CeNNs to realize convolutional neural networks (CoNNs) that are becoming more ubiquitous and are relevant to numerous application spaces and problems [8], (ii) apply a CeNN-friendly CoNN to a standard inference problem). 
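For illustration only, the variable-size training step quoted from Shi in the rejection of claim 1 (sorting images by aspect ratio, picking a batch of consecutive images, and rescaling them to height 32 and the batch's median aspect ratio) may be sketched as follows. The function name batch_target_size, the (width, height) tuple representation, and the rounding of the target width are assumptions made for this sketch and are not taken from Shi.

```python
import random
import statistics

def batch_target_size(sizes, batch_size, height=32, rng=None):
    """Pick a run of consecutive images (after sorting by aspect ratio)
    and return that batch plus the common (width, height) to rescale to.

    `sizes` is a list of hypothetical (width, height) tuples.
    """
    rng = rng or random.Random()
    # Sort by aspect ratio (width / height), as in the quoted passage.
    ordered = sorted(sizes, key=lambda wh: wh[0] / wh[1])
    # Randomly choose where the run of consecutive images starts.
    start = rng.randrange(0, len(ordered) - batch_size + 1)
    batch = ordered[start:start + batch_size]
    # The median aspect ratio of the batch decides the common width.
    median_ratio = statistics.median(w / h for w, h in batch)
    target_width = max(1, round(height * median_ratio))  # rounding is an assumption
    return batch, (target_width, height)
```

Because every image in a batch is rescaled to the batch's own median aspect ratio, images of similar shape are grouped together, which is consistent with the quoted statement that distortions on the training images are negligible.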
Regarding Claim 7,
Shi and Horvath teach the artificial intelligence device for keywords detection of claim 1. Shi further teaches that the device further comprises a display unit operatively connecting to the bus (pg. 2301, section 3.2; Experiments are carried out on a workstation with a 2.50 GHz Intel(R) Xeon(R) E5-2609 CPU, 64 GB RAM and an NVIDIA(R) Tesla(TM) K40 GPU.).
Regarding Claim 10,
Shi and Horvath teach the artificial intelligence device for keywords detection of claim 1. Horvath further teaches that the device further comprises a memory operatively connected to the bus for providing data storage for the processing unit (pg. 149, section V; Each CeNN is connected to an analog memory array of the same size, read and writes occur in parallel, and operations per Fig. 2 can proceed in parallel. For analog memories, we use the design from [30]. We used Hspice to simulate the analog memory schematics and to determine read/write delay.). 
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Shi with the Cellular neural network of Horvath.
The results have been shown to improve the energy efficiency, performance, and accuracy over a traditional convolutional neural network. Studies show application-level speedups of 3.7X and energy savings of 6.3X. (pg. 145; Thus, while there are growing efforts to improve the energy efficiency, performance, and accuracy of inference applications, systematic studies to compare energy, delay, and accuracy tradeoffs for a common application (i.e., when different computational models, devices, etc. are employed) are limited… as a case study, we use CeNNs to realize convolutional neural networks (CoNNs) that are becoming more ubiquitous and are relevant to numerous application spaces and problems [8], (ii) apply a CeNN-friendly CoNN to a standard inference problem). 

Claims 2, 3, and 5 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shi/Horvath, as applied above, and further in view of Hammond et al. (US-20110209150-A1), Thong et al. (US-20030110035-A1), and Gigliotti et al. (US-8700991-B1).
Regarding Claim 2,
Shi and Horvath teach the artificial intelligence device for keywords detection of claim 1. Shi further teaches wherein the keyword detection training dataset is created by following operations: 
forming a first group of two-dimensional (2-D) symbols to graphically represent the second set and the first group of 2-D symbols being associated with the category of interest (pg. 2298; For example, the algorithms in [4], [5] first detect individual characters and then recognize these detected characters with DCNN models, which are trained using labeled character images. Such methods often require training a strong character detector for accurately detecting and cropping each character out from the original word image. And pg. 2301, section 3; The datasets and setting for training and testing are given in Section 3.1, the detailed settings of CRNN for scene text images is provided in Section 3.2, and the results with the comprehensive comparisons are reported in Section 3.3. Training dataset (first group).); 
forming a second group of 2-D symbols to graphically represent the third set and the second group of 2-D symbols being assigned with a category of uninterested (pg. 2298; For example, the algorithms in [4], [5] first detect individual characters and then recognize these detected characters with DCNN models, which are trained using labeled character images. Such methods often require training a strong character detector for accurately detecting and cropping each character out from the original word image. And pg. 2301, section 3; The datasets and setting for training and testing are given in Section 3.1, the detailed settings of CRNN for scene text images is provided in Section 3.2, and the results with the comprehensive comparisons are reported in Section 3.3. testing dataset (second group).); and 
Shi and Horvath do not explicitly disclose
defining and receiving the list of keywords from a user of the artificial intelligence device for keywords detection; 
optionally modifying the list of keywords by adding one or more items for increasing robustness during training of a deep learning model for keywords detection; 
deriving a list of to-be-excluded items from the list of keywords for avoiding false alarms or confusions during training of the deep learning model; 
gathering a first set of general texts of various topics unrelated to the category of interest; 
expanding each sample or record of the first set to include all possible shorter samples; 
creating a second set of texts by inserting or replacing a randomly selected item from the list of keywords into each of the first set at a randomly chosen location within said each of the first set; 
creating a third set of texts by inserting or replacing a randomly selected item from the list of to-be-excluded into each of the first set at a randomly chosen location within said each of the first set; 
creating the keyword detection training dataset by combining the first group and the second group of the 2-D symbols.
However, Hammond teaches
defining and receiving the list of keywords from a user of the artificial intelligence device for keywords detection (para [0002] The present disclosure generally relates to automatic method and system for forming queries to retrieve information, and more specifically, to method and system to automatically generate keywords, phrases and other entities representing the content and/or context of an active task being manipulated by a user and to retrieve information based on the representation. And para [0019]-para [0021]); 
optionally modifying the list of keywords by adding one or more items for increasing robustness during training of a deep learning model for keywords detection (para [0113] For example, if the user is a teacher, one or more terms related to teaching, education, school, etc. may be added to the query, such as "curriculum," "syllabi," "class schedule," etc. According to another example, as a search condition, the user may designate a certain number of keywords that must be used in searching information.); 
deriving a list of to-be-excluded items from the list of keywords for avoiding false alarms or confusions during training of the deep learning model (para [0082] The information retrieval system 100 maintains, or has access to, a stop list including the most commonly occurring words that provide little information about the subject of the user's document. Words included in the stop list are not good search terms because the information resources 108 will often remove them automatically. The stop list may be created by a linguistic expert, by an automatic analysis (such as statistical), or by the user or by a combination of all three. Stop list (i.e. to-be-excluded items).); 
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify text recognition of Shi and Horvath with the keyword detection of Hammond.
Doing so would allow for automatically generating keywords that are relevant to the context of the task being manipulated. Automation helps improve the performance of an information retrieval system (para [0019]). 
Thong teaches
gathering a first set of general texts of various topics unrelated to the category of interest (para [0040]-[0041] In step 102 the user 22 speaks the query or input 23 into a microphone attached to the speech detection system 20. The fusion system 24 receives the spoken input 23 and records this input 23 as recorded input speech. In step 104 the subword decoder 34 processes the recorded input speech into a sequence of subword units or phonemes. Input query (i.e. a first set of general text).); 
expanding each sample or record of the first set to include all possible shorter samples (para [0041] In step 104 the subword decoder 34 processes the recorded input speech into a sequence of subword units or phonemes. For example, the spoken query 23 may be the words "Alan Alda." The subword decoder 34 may process this input speech into the following sequence of phonemes: "eh l ah n ah l ae d ah". This sequence of phonemes is a representation of the spoken input 23 as decoded by the subword decoder 34. ); 
creating a second set of texts by inserting or replacing a… selected item from the list of keywords into each of the first set at a… chosen location within said each of the first set (fig. 2; para [0055] Subword units 78-8 is an insertions 80 that would need to be inserted into the reference pattern 72-1 in order to have a match between the input pattern 76-1 and the reference pattern 72-1 (as well as other changes). And para [0049] The first entry 62-1 is the word match at the top of the word recognition list produced by the first vocabulary look-up 32 in step 110. 110 (second set).); 
creating a third set of texts by inserting or replacing a… selected item from the list of to-be-excluded into each of the first set at a… chosen location within said each of the first set (para [0055] Subword units 78-8 is an insertions 80 that would need to be inserted into the reference pattern 72-1 in order to have a match between the input pattern 76-1 and the reference pattern 72-1 (as well as other changes). And para [0042] In step 106 the second (subword detection) vocabulary look up 36 produces an ordered list of words by comparing the subword (or phoneme) sequence to a vocabulary such as the vocabulary contained in the pronunciation dictionary 38. 106 (third set).); 
creating the keyword detection training dataset by combining the first group and the second group (para [0048] In step 112 the list fusion module 46 combines the lists of words from the vocabulary look-up modules 32, 36 into one final list 48 (e.g., N-best list).).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of extracting keywords to generate queries of Hammond with the transforming spoken utterances into queries of Thong.
Doing so would allow for convenient and efficient natural language voice interface. This provides a convenient replacement for typing and accurately recognizes the spoken words to perform a query (para [0001]).
While Thong discloses inserting strings into a second set and a third set, Thong does not disclose randomly inserting the randomly selected strings.
Gigliotti (US 8700991 B1) teaches
creating a second set of texts by inserting or replacing a randomly selected item from the list of keywords into each of the first set at a randomly chosen location within said each of the first set (Col. 5 lines 4-11; One exemplary obfuscation algorithm may insert random text from one or more external sources (e.g., online dictionaries or other content repositories) at random locations within a content item. Another exemplary obfuscation algorithm may insert random text from a content item at random locations within the content item.); 
creating a third set of texts by inserting or replacing a randomly selected item from the list of to-be-excluded into each of the first set at a randomly chosen location within said each of the first set (Col. 5 lines 4-11; One exemplary obfuscation algorithm may insert random text from one or more external sources (e.g., online dictionaries or other content repositories) at random locations within a content item. Another exemplary obfuscation algorithm may insert random text from a content item at random locations within the content item.); 
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify general text of Hammond and Thong with the method of inserting random text of Gigliotti.
Doing so would allow for creating an obfuscated version of a content item. The proposed modification can protect the online content from illegal piracy (abs.).
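For illustration only, the text-set operations recited in claim 2 (expanding each sample of the first set into shorter samples, then inserting a randomly selected item at a randomly chosen location to form the second and third sets) may be sketched as follows. The function names, the reading of "all possible shorter samples" as word-level prefixes, and the choice to insert at word boundaries (rather than replace) are assumptions made for this sketch; they are not drawn from the claims, the specification, or any cited reference.

```python
import random

def expand_shorter(text):
    """All word-level prefixes of a text (one assumed reading of
    'all possible shorter samples')."""
    words = text.split()
    return [" ".join(words[:n]) for n in range(1, len(words) + 1)]

def insert_random_item(text, items, rng):
    """Insert a randomly selected item at a randomly chosen word boundary."""
    words = text.split()
    position = rng.randint(0, len(words))  # inclusive: may append at the end
    item = rng.choice(items)
    return " ".join(words[:position] + [item] + words[position:])

def build_text_sets(first_set, keywords, excluded, seed=0):
    """Return (second_set, third_set) built from the expanded first set."""
    rng = random.Random(seed)
    expanded = [s for text in first_set for s in expand_shorter(text)]
    # Second set: samples carrying a keyword (category of interest).
    second_set = [insert_random_item(s, keywords, rng) for s in expanded]
    # Third set: samples carrying a to-be-excluded item (category of uninterested).
    third_set = [insert_random_item(s, excluded, rng) for s in expanded]
    return second_set, third_set
```

A hypothetical call such as build_text_sets(["the weather is nice"], ["invoice"], ["voice"]) yields four samples in each set, one per prefix of the input text, each containing the inserted item as a separate word.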
Regarding Claim 3,
Shi, Horvath, Hammond, Thong, and Gigliotti teach the artificial intelligence device for keywords detection of claim 2. Shi further teaches wherein said forming the first group of 2-D symbols is based on a squared word format (pg. 2301; section 3.2; For example, an image containing 10 characters is typically of size 100 × 32, from which a feature sequence with 25 frames can be generated. This length exceeds the lengths of most English words.).
Regarding Claim 5,
Shi, Horvath, Hammond, Thong, and Gigliotti teach the artificial intelligence device for keywords detection of claim 2. Shi further teaches wherein said forming the second group of 2-D symbols is based on a squared word format (pg. 2301; section 3.2; For example, an image containing 10 characters is typically of size 100 × 32, from which a feature sequence with 25 frames can be generated. This length exceeds the lengths of most English words.).

Claims 4 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shi/Horvath/Hammond/Thong/Gigliotti, as applied above, and further in view of Qun-Ting et al. ("One-way hash function based on hyper-chaotic cellular neural network.").
Regarding Claim 4,
Shi, Horvath, Hammond, Thong, and Gigliotti teach the artificial intelligence device for keywords detection of claim 3. 
Shi, Horvath, Hammond, Thong, and Gigliotti do not explicitly disclose
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word
However, Qun-Ting teaches
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word (“One-way hash function based on hyper-chaotic cellular neural network” pg. 2392, section 3.3; In the second experiment, the hash value for a paragraph of message randomly chosen is generated and stored in ASCII format similarly.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Shi, Horvath, Hammond, Thong, and Gigliotti with the Cellular neural network of Qun-Ting.
Doing so would allow for improving the algorithm runtime speed. Training and execution of the neural network may be completed in less time (pg. 2392; The executive speed of the algorithm proposed is proportional to the length of the plaintext nearly. The total iterative times is N0+L/4, while N0 is the initial iterative times, L indicates the length of the plaintext. So the iterative time required is very little when the plaintext is short.).
Regarding Claim 6,
Shi, Horvath, Hammond, Thong, and Gigliotti teach the artificial intelligence device for keywords detection of claim 5. 
Shi, Horvath, Hammond, Thong, and Gigliotti do not explicitly disclose
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word
However, Qun-Ting teaches
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word (“One-way hash function based on hyper-chaotic cellular neural network” pg. 2392, section 3.3; In the second experiment, the hash value for a paragraph of message randomly chosen is generated and stored in ASCII format similarly.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Shi, Horvath, Hammond, Thong, and Gigliotti with the Cellular neural network of Qun-Ting.
Doing so would allow for improving the algorithm runtime speed. Training and execution of the neural network may be completed in less time (pg. 2392; The executive speed of the algorithm proposed is proportional to the length of the plaintext nearly. The total iterative times is N0+L/4, while N0 is the initial iterative times, L indicates the length of the plaintext. So the iterative time required is very little when the plaintext is short.).
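For illustration only, one possible reading of the limitation recited in claims 4 and 6, converting a word to a square format based on the number of letters in the word, may be sketched as follows. The function name square_layout, the choice of the smallest square that holds all letters, the row-by-row fill order, and the blank padding are all assumptions made for this sketch; the Office action does not define the format, and the sketch does not represent the disclosure of the instant application or of any cited reference.

```python
import math

def square_layout(word, pad=" "):
    """Arrange the letters of `word` row by row in the smallest
    side x side grid that holds them, padding unused cells."""
    n = len(word)
    if n == 0:
        return []
    # Smallest integer side with side * side >= n (ceiling of sqrt(n)).
    side = math.isqrt(n - 1) + 1
    padded = word.ljust(side * side, pad)
    return [padded[r * side:(r + 1) * side] for r in range(side)]
```

Under these assumptions, a four-letter word fills a 2 x 2 grid exactly, while a seven-letter word occupies a 3 x 3 grid with two padded cells.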

Claims 8 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shi/Horvath, as applied above, and further in view of Kent et al. (US-5179705-A).
Regarding Claim 8,
Shi and Horvath teach the artificial intelligence device for keywords detection of claim 1. Horvath further teaches wherein the CNN based integrated circuit (pg. 146, section III, Thus, while most CeNN hardware implementations lack the layered structure of CoNNs, by using local memory (commonly available on every realized CeNN chip), a cascade of said operations can be realized by re-using the result of each previous processing layer [16].) comprises… each CNN processing engine comprising: 
a CNN processing block configured for simultaneously performing convolutional operations of the 2-D symbol (pg. 145, section II; An input image may be fed into a sequence of convolution and pooling layers.) and the filter coefficients of a plurality of ordered convolutional layers of the deep learning model (pg. 145, section II; A convolution layer is further broken down into different feature maps. Here, each unit is then connected to regions of feature maps of prior layers via filter banks (where a filter bank is a set of weights).); 
a first set of memory buffers operatively coupling to the CNN processing block for storing the 2-D symbol (pg. 149, section V; Each CeNN is connected to an analog memory array of the same size, read and writes occur in parallel, and operations per Fig. 2 can proceed in parallel. For analog memories, we use the design from [30]. We used Hspice to simulate the analog memory schematics and to determine read/write delay.); and 
a second set of memory buffers operatively coupling to the CNN processing block for storing the filter coefficients (pg. 149, section V; Each CeNN is connected to an analog memory array of the same size, read and writes occur in parallel, and operations per Fig. 2 can proceed in parallel. For analog memories, we use the design from [30]. We used Hspice to simulate the analog memory schematics and to determine read/write delay. Memories/ memory array is at least two memories.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Shi with the Cellular neural network of Horvath.
The results have been shown to improve the energy efficiency, performance, and accuracy over a traditional convolutional neural network. Studies show application-level speedups of 3.7X and energy savings of 6.3X. (pg. 145; Thus, while there are growing efforts to improve the energy efficiency, performance, and accuracy of inference applications, systematic studies to compare energy, delay, and accuracy tradeoffs for a common application (i.e., when different computational models, devices, etc. are employed) are limited… as a case study, we use CeNNs to realize convolutional neural networks (CoNNs) that are becoming more ubiquitous and are relevant to numerous application spaces and problems [8], (ii) apply a CeNN-friendly CoNN to a standard inference problem). 
Shi and Horvath do not explicitly disclose 
a plurality of CNN processing engines operatively coupled to at least one input/output data bus, the plurality of CNN processing engines being connected in a loop with a clock-skew circuit,
However, Kent teaches 
a plurality of CNN processing engines operatively coupled to at least one input/output data bus (Col. 1, lines 14-18; In order that a plurality of different portions of a computer system may share a common resource, such as a common bus or a common memory, it is necessary to provide an arbitration system so that only one portion uses the resource at a given time.), the plurality of CNN processing engines being connected in a loop with a clock-skew circuit (col. 7 lines 41-43; Another advantage of this asynchronous arbitration and semaphore capability relates to clock skew in large systems.),
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN circuit of Shi and Horvath with the asynchronous clock skew system of Kent.
Doing so would allow the neural network to function in an asynchronous manner. Neural networks are inherently highly distributed, and use of a global clock may constrain the behavior of the network (col. 8 lines 1-10).
Regarding Claim 9,
Shi, Horvath, and Kent teach the artificial intelligence device for keywords detection of claim 8. Horvath further teaches wherein the CNN based integrated circuit further performs pooling operations and activation operations (pg. 145, section II; As discussed in [8], a typical CoNN consists of a series of convolution, pooling, and non-linear activation stages.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Shi with the cellular neural network of Horvath.
The results have been shown to improve the energy efficiency, performance, and accuracy over a traditional convolutional neural network. Studies show application-level speedups of 3.7X and energy savings of 6.3X. (pg. 145; Thus, while there are growing efforts to improve the energy efficiency, performance, and accuracy of inference applications, systematic studies to compare energy, delay, and accuracy tradeoffs for a common application (i.e., when different computational models, devices, etc. are employed) are limited… as a case study, we use CeNNs to realize convolutional neural networks (CoNNs) that are becoming more ubiquitous and are relevant to numerous application spaces and problems [8], (ii) apply a CeNN-friendly CoNN to a standard inference problem). 

Claims 11, 12, 14, and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Hammond et al. (US-20110209150-A1), Thong et al. (US-20030110035-A1), Gigliotti et al. (US-8700991-B1), and Shi et al. (“An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition”).
Regarding Claim 11,
Hammond teaches a method implemented in a computing system for enabling an artificial intelligence device for keywords detection comprising: 
receiving a list of keywords in a category of interest (para [0002] The present disclosure generally relates to automatic method and system for forming queries to retrieve information, and more specifically, to method and system to automatically generate keywords, phrases and other entities representing the content and/or context of an active task being manipulated by a user and to retrieve information based on the representation. And para [0019]-para [0021]); 
optionally modifying the list of keywords by adding one or more items for increasing robustness during training of a deep learning model for keywords detection (para [0113] For example, if the user is a teacher, one or more terms related to teaching, education, school, etc. may be added to the query, such as "curriculum," "syllabi," "class schedule," etc. According to another example, as a search condition, the user may designate a certain number of keywords that must be used in searching information.); 
deriving a list of to-be-excluded items from the list of keywords for avoiding false alarms or confusions during training of the deep learning model (para [0082] The information retrieval system 100 maintains, or has access to, a stop list including the most commonly occurring words that provide little information about the subject of the user's document. Words included in the stop list are not good search terms because the information resources 108 will often remove them automatically. The stop list may be created by a linguistic expert, by an automatic analysis (such as statistical), or by the user or by a combination of all three. Stop list (i.e. to-be-excluded items).); 
Hammond does not explicitly disclose
gathering a first set of general texts of various topics unrelated to the category of interest; 
expanding each sample or record of the first set to include all possible shorter samples; 
creating a second set of texts by inserting or replacing a randomly selected item from the list of keywords into each of the first set at a randomly chosen location within said each of the first set; 
forming a first group of two-dimensional (2-D) symbols to graphically represent the second set and the first group of 2-D symbols being associated with the category of interest; 
creating a third set of texts by inserting or replacing a randomly selected item from the list of to-be-excluded into each of the first set at a randomly chosen location within said each of the first set; 
forming a second group of 2-D symbols to graphically represent the third set and the second group of 2-D symbols being assigned with a category of uninterested; and 
creating the keyword detection training dataset by combining the first group and the second group of the 2-D symbols.
However, Thong (US 20030110035 A1) teaches
gathering a first set of general texts of various topics unrelated to the category of interest (para [0040]-[0041] In step 102 the user 22 speaks the query or input 23 into a microphone attached to the speech detection system 20. The fusion system 24 receives the spoken input 23 and records this input 23 as recorded input speech. In step 104 the subword decoder 34 processes the recorded input speech into a sequence of subword units or phonemes. Input query (i.e. a first set of general text).); 
expanding each sample or record of the first set to include all possible shorter samples (para [0041] In step 104 the subword decoder 34 processes the recorded input speech into a sequence of subword units or phonemes. For example, the spoken query 23 may be the words "Alan Alda." The subword decoder 34 may process this input speech into the following sequence of phonemes: "eh l ah n ah l ae d ah". This sequence of phonemes is a representation of the spoken input 23 as decoded by the subword decoder 34. ); 
creating a second set of texts by inserting or replacing a… selected item from the list of keywords into each of the first set at a… chosen location within said each of the first set (fig. 2; para [0055] Subword units 78-8 is an insertions 80 that would need to be inserted into the reference pattern 72-1 in order to have a match between the input pattern 76-1 and the reference pattern 72-1 (as well as other changes). And para [0049] The first entry 62-1 is the word match at the top of the word recognition list produced by the first vocabulary look-up 32 in step 110. 110 (second set).); 
creating a third set of texts by inserting or replacing a… selected item from the list of to-be-excluded into each of the first set at a… chosen location within said each of the first set (para [0055] Subword units 78-8 is an insertions 80 that would need to be inserted into the reference pattern 72-1 in order to have a match between the input pattern 76-1 and the reference pattern 72-1 (as well as other changes). And para [0042] In step 106 the second (subword detection) vocabulary look up 36 produces an ordered list of words by comparing the subword (or phoneme) sequence to a vocabulary such as the vocabulary contained in the pronunciation dictionary 38. 106 (third set).); 
creating the keyword detection training dataset by combining the first group and the second group…(para [0048] In step 112 the list fusion module 46 combines the lists of words from the vocabulary look-up modules 32, 36 into one final list 48 (e.g., N-best list).). 
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of extracting keywords to generate queries of Hammond with the transformation of spoken utterances into queries of Thong.
Doing so would allow for a convenient and efficient natural language voice interface. This provides a convenient replacement for typing and accurately recognizes the spoken words to perform a query (para [0001]).
While Thong discloses inserting strings into a second set and a third set, Thong does not disclose randomly inserting the randomly selected strings.
Gigliotti (US 8700991 B1) teaches
creating a second set of texts by inserting or replacing a randomly selected item from the list of keywords into each of the first set at a randomly chosen location within said each of the first set (Col. 5 lines 4-11; One exemplary obfuscation algorithm may insert random text from one or more external sources (e.g., online dictionaries or other content repositories) at random locations within a content item. Another exemplary obfuscation algorithm may insert random text from a content item at random locations within the content item.); 
creating a third set of texts by inserting or replacing a randomly selected item from the list of to-be-excluded into each of the first set at a randomly chosen location within said each of the first set (Col. 5 lines 4-11; One exemplary obfuscation algorithm may insert random text from one or more external sources (e.g., online dictionaries or other content repositories) at random locations within a content item. Another exemplary obfuscation algorithm may insert random text from a content item at random locations within the content item.); 
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the general text of Hammond and Thong with the method of inserting random text of Gigliotti.
Doing so would allow for creating an obfuscated version of a content item. The proposed modification can protect the online content from illegal piracy (abs.).
	Shi teaches
forming a first group of two-dimensional (2-D) symbols to graphically represent the second set and the first group of 2-D symbols being associated with the category of interest (pg. 2298; For example, the algorithms in [4], [5] first detect individual characters and then recognize these detected characters with DCNN models, which are trained using labeled character images. Such methods often require training a strong character detector for accurately detecting and cropping each character out from the original word image. And pg. 2301, section 3; The datasets and setting for training and testing are given in Section 3.1, the detailed settings of CRNN for scene text images is provided in Section 3.2, and the results with the comprehensive comparisons are reported in Section 3.3. Training dataset (first group).); 
forming a second group of 2-D symbols to graphically represent the third set and the second group of 2-D symbols being assigned with a category of uninterested (pg. 2298; For example, the algorithms in [4], [5] first detect individual characters and then recognize these detected characters with DCNN models, which are trained using labeled character images. Such methods often require training a strong character detector for accurately detecting and cropping each character out from the original word image. And pg. 2301, section 3; The datasets and setting for training and testing are given in Section 3.1, the detailed settings of CRNN for scene text images is provided in Section 3.2, and the results with the comprehensive comparisons are reported in Section 3.3. testing dataset (second group).); and 
	It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine learning model for keyword speech recognition of Hammond, Thong, and Gigliotti with the convolutional neural network of Shi.
Doing so would allow for higher levels of abstraction, which have been shown to achieve significant performance improvements in the task of speech recognition. Replacing the machine learning model of Hammond, Thong, and Gigliotti with the CNN of Shi would improve the performance of the keyword recognition (pg. 2300, section 2.2; The deep structure allows higher level of abstractions than a shallow one, and has achieved significant performance improvements in the task of speech recognition [23].).
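For illustration, the text-level steps recited in claim 11 (expanding each general text into shorter samples, then inserting a randomly selected item at a randomly chosen location) can be sketched in Python. This is a minimal sketch under stated assumptions and is not the implementation of the applicant or of any cited reference: "all possible shorter samples" is assumed to mean every contiguous word span, insertion is assumed to occur at word boundaries, and the function names, label strings, and sample inputs are hypothetical. The formation of 2-D symbols is omitted.

```python
import random


def expand_to_shorter_samples(text):
    """Return every contiguous word span of the text, including the text itself.

    Assumption: "all possible shorter samples" means contiguous word spans.
    """
    words = text.split()
    spans = []
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            spans.append(" ".join(words[start:end]))
    return spans


def insert_item(text, items, rng):
    """Insert one randomly selected item at a randomly chosen word boundary."""
    words = text.split()
    position = rng.randint(0, len(words))  # randint is inclusive on both ends
    item = rng.choice(items)
    return " ".join(words[:position] + [item] + words[position:])


def build_text_sets(general_texts, keywords, excluded, seed=0):
    """Build labeled second and third text sets from the expanded first set."""
    rng = random.Random(seed)
    first_set = [s for t in general_texts for s in expand_to_shorter_samples(t)]
    # Second set: a keyword inserted into each sample (hypothetical label "interest").
    second_set = [(insert_item(s, keywords, rng), "interest") for s in first_set]
    # Third set: a to-be-excluded item inserted (hypothetical label "uninterested").
    third_set = [(insert_item(s, excluded, rng), "uninterested") for s in first_set]
    return second_set + third_set


if __name__ == "__main__":
    for sample, label in build_text_sets(["the weather is nice"], ["pizza"], ["plaza"]):
        print(label, "|", sample)
```

In the claim, each resulting text would then be rendered as a 2-D symbol before the two groups are combined; the sketch stops at the labeled text stage.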
Regarding Claim 12,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 11. Shi further teaches wherein said forming the first group of 2-D symbols is based on a squared word format (pg. 2301; section 3.2; For example, an image containing 10 characters is typically of size 100 × 32, from which a feature sequence with 25 frames can be generated. This length exceeds the lengths of most English words.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine learning model for keyword speech recognition of Hammond, Thong, and Gigliotti with the convolutional neural network of Shi.
Doing so would allow for higher levels of abstraction, which have been shown to achieve significant performance improvements in the task of speech recognition. Replacing the machine learning model of Hammond, Thong, and Gigliotti with the CNN of Shi would improve the performance of the keyword recognition (pg. 2300, section 2.2; The deep structure allows higher level of abstractions than a shallow one, and has achieved significant performance improvements in the task of speech recognition [23].).
Regarding Claim 14,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 11. Shi further teaches wherein said forming the second group of 2-D symbols is based on a squared word format (pg. 2301; section 3.2; For example, an image containing 10 characters is typically of size 100 × 32, from which a feature sequence with 25 frames can be generated. This length exceeds the lengths of most English words.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine learning model for keyword speech recognition of Hammond, Thong, and Gigliotti with the convolutional neural network of Shi.
Doing so would allow for higher levels of abstraction, which have been shown to achieve significant performance improvements in the task of speech recognition. Replacing the machine learning model of Hammond, Thong, and Gigliotti with the CNN of Shi would improve the performance of the keyword recognition (pg. 2300, section 2.2; The deep structure allows higher level of abstractions than a shallow one, and has achieved significant performance improvements in the task of speech recognition [23].).
Regarding Claim 16,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 11. Gigliotti further teaches wherein the first set of general texts are gathered from a publicly available source (Col. 5 lines 4-11; One exemplary obfuscation algorithm may insert random text from one or more external sources (e.g., online dictionaries or other content repositories) at random locations within a content item.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the general text of Hammond and Thong with the method of inserting random text of Gigliotti.
Doing so would allow for creating an obfuscated version of a content item. The proposed modification can protect the online content from illegal piracy (abs.).
Regarding Claim 17,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 11. Thong further teaches wherein each of the first set of general texts includes a plurality of natural language words (para [0003] The N-best list may then be re-ordered using additional sources of information, such as natural language processing.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of extracting keywords to generate queries of Hammond with the transformation of spoken utterances into queries of Thong.
Doing so would allow for a convenient and efficient natural language voice interface. This provides a convenient replacement for typing and accurately recognizes the spoken words to perform a query (para [0001]).
Regarding Claim 18,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 17. Shi further teaches wherein the plurality of natural language words contains more than one natural languages (pg. 2298; Some other approaches (such as [6]) treat scene text recognition as an image classification problem, and assign a class label to each English word (90K words in total). It turns out a large trained model with a huge number of classes, which is difficult to be generalized to other types of sequence-like objects, such as Chinese text, musical scores, etc., because the numbers of basic combinations of such kind of sequences can be greater than 1 million.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine learning model for keyword speech recognition of Hammond, Thong, and Gigliotti with the convolutional neural network of Shi.
Doing so would allow for higher levels of abstraction, which have been shown to achieve significant performance improvements in the task of speech recognition. Replacing the machine learning model of Hammond, Thong, and Gigliotti with the CNN of Shi would improve the performance of the keyword recognition (pg. 2300, section 2.2; The deep structure allows higher level of abstractions than a shallow one, and has achieved significant performance improvements in the task of speech recognition [23].).

Claims 13 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Hammond/Thong/Gigliotti/Shi, as applied above, and further in view of Qun-Ting et al. ("One-way hash function based on hyper-chaotic cellular neural network.").
Regarding Claim 13,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 12. Hammond, Thong, Gigliotti, and Shi do not explicitly disclose 
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word.
However, Qun-Ting teaches
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word (“One-way hash function based on hyper-chaotic cellular neural network” pg. 2392, section 3.3; In the second experiment, the hash value for a paragraph of message randomly chosen is generated and stored in ASCII format similarly.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Hammond, Thong, Gigliotti, and Shi with the cellular neural network of Qun-Ting.
Doing so would allow for improving the algorithm runtime speed. Training and execution of the neural network may be completed in less time (pg. 2392; The executive speed of the algorithm proposed is proportional to the length of the plaintext nearly. The total iterative times is N0+L/4, while N0 is the initial iterative times, L indicates the length of the plaintext. So the iterative time required is very little when the plaintext is short.).
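As a worked illustration of the quoted relation, the total number of iterations N0 + L/4 can be computed as follows. This is a minimal sketch: the function name and the sample values (N0 = 100, L = 40) are hypothetical and are not taken from Qun-Ting, and the sketch assumes plain division because the quoted passage does not state how a non-integer L/4 is handled.

```python
def total_iterations(initial_iterations, plaintext_length):
    """Total iterations per the quoted relation N0 + L/4.

    Assumption: plain (non-integer) division; the quoted passage does not
    specify rounding when L is not a multiple of 4.
    """
    return initial_iterations + plaintext_length / 4


if __name__ == "__main__":
    # Hypothetical values: N0 = 100 initial iterations, L = 40 characters.
    print(total_iterations(100, 40))  # 110.0
```

With these hypothetical values the plaintext contributes only 10 of the 110 iterations, consistent with the quoted statement that little iterative time is required when the plaintext is short.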
Regarding Claim 15,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 14.
	Hammond, Thong, Gigliotti, and Shi do not explicitly disclose 
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word.
However, Qun-Ting teaches
wherein the squared word format converts each word in Latin-alphabet based languages to a square format based on number of alphabet in said each word (“One-way hash function based on hyper-chaotic cellular neural network” pg. 2392, section 3.3; In the second experiment, the hash value for a paragraph of message randomly chosen is generated and stored in ASCII format similarly.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Hammond, Thong, Gigliotti, and Shi with the cellular neural network of Qun-Ting.
Doing so would allow for improving the algorithm runtime speed. Training and execution of the neural network may be completed in less time (pg. 2392; The executive speed of the algorithm proposed is proportional to the length of the plaintext nearly. The total iterative times is N0+L/4, while N0 is the initial iterative times, L indicates the length of the plaintext. So the iterative time required is very little when the plaintext is short.).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Hammond/Thong/Gigliotti/Shi, as applied above, and further in view of Wu et al. (US-10572760-B1).
Regarding Claim 19,
Hammond, Thong, Gigliotti, and Shi teach the method of claim 11. 
	Hammond, Thong, Gigliotti, and Shi do not explicitly disclose
wherein the image classification technique comprises a binary classification that contains the category of interest and the category of uninterested.
However, Wu (US 10572760 B1) teaches
wherein the image classification technique comprises a binary classification that contains the category of interest and the category of uninterested (Col. 2 lines 55- In one embodiment, the image (e.g., the pre-processed image, or the original image) is received by the detection and classification service 105, which is configured to identify objects of interest (e.g., text lines) and to classify objects identified in the image (e.g., as text or non-text).).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Hammond, Thong, Gigliotti, and Shi with the text recognition of Wu.
Doing so would allow for reducing the difference between the predicted output of the CNN and the expected result. Minimizing the deviation leads to an improved accuracy for classifying text objects (Col. 5 lines 58-64).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Shen et al (US-20200218857-A1) – discloses inserting keywords at random locations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217. The examiner can normally be reached Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen, can be reached at (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/H.N./Examiner, Art Unit 2121
/NICHOLAS KLICOS/Primary Examiner, Art Unit 2145