DETAILED ACTION
Status of Claims 
Applicant’s Amendment filed on 01/27/2021 has been considered.
Claims 1-20 are currently pending and have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendment, filed 01/27/2021, has been entered. Claims 1-6, 8-9, 11, 13-14, 16-18 and 20 have been amended.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/27/2021 has been entered.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
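3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.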


Claims 1, 3, 9, 11, 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Hamada et al. (US 2016/0125048 A1), as previously cited and hereinafter Hamada, in view of Moser et al. (US 2013/0144754 A1), as previously cited and hereinafter Moser, in further view of Deoras et al. (US 2015/0066496 A1), newly cited and hereinafter Deoras.

Regarding claim 1, Hamada discloses a system comprising:
	-one or more processors (Hamada, see at least: [0161] - “the item recommendation can have the hardware configuration of a commonly-used computer that includes a processor circuit such as a central processing unit (CPU) 51”); and 
-one or more non-transitory storage devices storing computing instructions configured to run on the one or more processors (Hamada, see at least: [0161] - “the item recommendation can have the hardware configuration of a commonly-used computer that includes a processor circuit such as a central processing unit (CPU) 51, memory devices such as a read only memory (ROM) 52 and a random access memory (RAM) 53, an input-output interface (I/F) 54 to which a display panel or various operating devices are connected, a communication I/F 55 that performs communication by establishing connection with a network, and a bus 56 that interconnects the constituent elements”) and perform: 
-receiving an online query from a user, the online query comprising natural language of the user (Hamada, see at least: [0037] and [0114] - “semantic analysis engine 10 receives a natural language request D1 [i.e. receiving an online query comprising natural language of the user] in which a user demand is specified, and outputs a search tag group D2 and a context tag group D4” and “In an identical manner, the context tag generator 13 according to the second embodiment too performs an operation for generating the context vector D6 having a particular number of dimensions from arbitrary phrases included in the natural language request D1”); 
-obtaining a first set of rules (1) comprising a neural network having hidden layers, (2) defining an intent of the online query as at least one of a product-related intent or a non-product-related intent, and (3) operating as a function of at least a word vector (Hamada, see at least: [0030], [0037], and [0114] - “From the keyword “farewell party”, the recommended candidates are narrowed down to stores belonging to the store categories of Japanese-style bar, bar, restaurant, cafe, and florist shop (for the purpose of buying a gift) [i.e. define an intent of the online query as at least one of a product-related intent or a non-product-related intent]” and “The semantic analysis engine 10 receives a natural language request D1 in which a user demand is specified, and outputs a search tag group D2 and a context tag group D4. Each search tag included in the search tag group D2 represents an inquiry information piece that explains the required nature of the store [i.e. define an intent of the online query as at least one of a product-related intent or a non-product-related intent]. Each context tag included in the context tag group D4 represents an inquiry information piece that explains the situation on the user side” and “In an identical manner, the context tag generator 13 according to the second embodiment too performs an operation for generating the context vector D6 having a particular number of dimensions from arbitrary phrases included in the natural language request D1. The only differences are as follows: semantic analysis as well as syntactic analysis is performed; a statistical method called Matrix-Vector RNNs (described later) is implemented for generating the context vector D6 [i.e. obtaining a first set of rules that define an intent of the online query and operate as a function of at least a word vector]”); 
-determining the intent of the online query based on the first set of rules (Hamada, see at least: [0114] - “the context tag generator 13 according to the second embodiment too performs an operation for generating the context vector D6 [i.e. intent of online query] having a particular number of dimensions from arbitrary phrases included in the natural language request D1. The only differences are as follows: semantic analysis as well as syntactic analysis is performed; a statistical method called Matrix-Vector RNNs (described later) is implemented for generating the context vector D6 [i.e. determining the intent of the online query based on the first set of rules]”); 
-in response to determining the intent is the product-related intent (Hamada, see at least: [0030] - “From the keyword “farewell party”, the recommended candidates are narrowed down to stores belonging to the store categories of Japanese-style bar, bar, restaurant, cafe, and florist shop (for the purpose of buying a gift) [i.e. determining the intent is the product-related intent]”): 
-obtaining a second set of rules that define an entity prediction from the online query (Hamada, see at least: [0030], [0058], and [0133] - “From the keyword “farewell party”, the recommended candidates are narrowed down to stores belonging to the store categories of Japanese-style bar, bar, restaurant, cafe, and florist shop (for the purpose of buying a gift) [i.e. determining the intent is the product-related intent]” and “FIG. 6 is a flowchart for explaining an exemplary sequence of operations for generating the candidate item group D3 [i.e. obtaining a second set of rules that define an entity 2 [i.e. in response to determining the intent is the product-related intent]” and “in the second embodiment, the configuration is such that the candidate extractor 21 narrows down the stores representing candidate items using the search tag group D2 [i.e. obtaining a second set of rules that define an entity prediction], and then the ranker 22A performs ranking with respect to the candidate item group D3”); 
-predicting an entity from the online query based on the second set of rules, the entity comprising a store name (Hamada, see at least: [0060], [0133], and [0111] - “from among the search tags included in the search tag group D3, with respect to the search tags having the store name or the category name as the application destination attribute, the candidate extractor 21 generates a condition specifying that the column value of the application destination attribute is in strict accordance with the search tags [i.e. predicting an entity from the online query based on the second set of rules, the entity comprising a store name]” and “in the second embodiment, the configuration is such that the candidate extractor 21 narrows down the stores representing candidate items using the search tag group D2 [i.e. predicting an entity from the online query based on the second set of rules, the entity comprising a store name], and then the ranker 22A performs ranking with respect to the candidate item group D3” and “the ranker 22 explained in the second embodiment is replaced with a ranker 22A, which refers to a context vector D6 instead of referring to the context tag group D4 according to the first embodiment, which refers to a usage log DB 105A instead of referring to the usage log DB 105, but which performs the learning operation and the estimation operation in an identical manner to the ranker 22 according to the first embodiment”); 
mapping the entity predicted from the online query to product metadata associated with one or more products (Hamada, see at least: [0062], [0133], [0043], [0111] and Fig. 13 - “the candidate extractor 21 searches the store DB 103 [i.e. mapping the entity predicted from the online query] based on the condition generated at Step S203, and outputs the obtained group of store records as the candidate item group D3 [i.e. to metadata associated with one or more products]” and “in the second embodiment, the configuration is such that the candidate extractor 21 narrows down the stores representing candidate items using the search tag group D2, and then the ranker 22A performs ranking with respect to the candidate item group D3” and “The store name column stores therein the store names of all stores managed in the item recommendation device according to the first embodiment….The other attributes column has a multi-label format in which zero or more labels are listed from among predetermined other attributes indicating the features of the stores. It is desirable that the labels of other attributes include labels from various perspectives such as the service contents, the product features [i.e. product metadata associated with one or more products], and the ambience” and “the ranker 22 explained in the second embodiment is replaced with a ranker 22A, which refers to a context vector D6 instead of referring to the context tag group D4 according to the first embodiment, which refers to a usage log DB 105A instead of referring to the usage log DB 105, but which performs the learning operation and the estimation operation in an identical manner to the ranker 22 according to the first embodiment”); and 
-coordinating displaying product information of the one or more stores (Hamada, see at least: [0107] - “the recommended-item list D5 with ranking is obtained in which the inner context information taken from the natural language request D1 and the preference 5 [i.e. information of the one or more stores], any arbitrary method can be implemented [i.e. coordinating a display]”); and 
-in response to determining the intent is the non-product-related intent (Hamada, see at least: [0030] - “From the keyword “farewell party”, the recommended candidates are narrowed down to stores belonging to the store categories of Japanese-style bar, bar, restaurant, cafe, and florist shop (for the purpose of buying a gift) [i.e. determining the intent is the non-product related intent]”): 
-determining, using the first set of rules, that the non-product-related intent comprises a location (Hamada, see at least: [0030] and [0114] - “From the keyword “farewell party”, the recommended candidates are narrowed down to stores belonging to the store categories of Japanese-style bar, bar, restaurant, cafe, and florist shop (for the purpose of buying a gift) [i.e. determining that the non-product related intent comprises a location]” and “In an identical manner, the context tag generator 13 according to the second embodiment too performs an operation for generating the context vector D6 having a particular number of dimensions from arbitrary phrases included in the natural language request D1. The only differences are as follows: semantic analysis as well as syntactic analysis is performed; a statistical method called Matrix-Vector RNNs (described later) is implemented for generating the context vector D6 [i.e. determining using the first set of rules]”).
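Hamada does not explicitly disclose receiving an online query from an electronic device of a user; the entity comprising one or more of a product name, a product attribute, a product price range, an average customer review, or a product brand; coordinating displaying product information of the one or more products on the electronic device of the user; and coordinating displaying a message on the electronic device of the user informing the user that a user support assistant has been notified of the request to file the complaint.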
Moser, however, teaches matching buyers with related sellers (i.e. abstract), including the known technique of receiving an online query from an electronic device of a user (Moser, see at least: [0066] and [0108] - “The buyer enters information associated with a desired item (step 700) [i.e. receiving an online query], and a search is conducted (step 705) within GWarehouse database 710” and “Alerts may be sent to buyers and sellers via PCs 3570, tablets 3575, mobile phones 3580 or other means for electronic communication [i.e. electronic device of a user]”),
the known technique of an entity comprising one or more of a product name, a product attribute, a product price range, an average customer review, or a product brand (Moser, see at least: [0075] - “FIG. 13 shows that a potential buyer is using the search field to enter a product he/she is looking for, in this example an electron microscope [i.e. entity]” and Fig. 14 displays the search results including the product name, type of microscope (i.e. 150x, 200x, or 160x) and the price [i.e. comprising one or more of a product name, a product attribute, a product price range, an average customer review, or a product brand]),
the known technique of coordinating displaying product information of the one or more products on the electronic device of the user (Moser, see at least: [0108] - “Alerts may be sent to buyers and sellers via PCs 3570, tablets 3575, mobile phones 3580 or other means for electronic communication [i.e. on the electronic device of the user]” and Fig. 14 displays the search results including the product name, type of microscope (i.e. 150x, 200x, or 160x) and the price [i.e. coordinating displaying product information of the one or more products]), and 

It would have been recognized that applying the known technique of the acts of receiving an online query from an electronic device of a user; the entity comprising one or more of a product name, a product attribute, a product price range, an average customer review, or a product brand; coordinating displaying product information of the one or more products on the electronic device of the user; and coordinating displaying a message on the electronic device of the user informing the user that a user support assistant has been notified of the request to file the complaint, as taught by Moser, to the teachings of Hamada would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into a similar system. Further, adding the modification of the acts of receiving an online query from an electronic device of a user; the entity comprising one or more of a product name, a product attribute, a product price range, an average customer review, or a product brand; coordinating displaying product information of the one or more products on the electronic device of the user; and coordinating displaying a message on the electronic device of the user informing the user that a user support assistant has been notified of the request to file the 

		Hamada in view of Moser does not explicitly teach the first set of rules comprising a neural network having a plurality of hidden layers and operating as a function of at least (ii) a total number of the plurality of hidden layers; (iii) a respective hidden dimension vector for each respective hidden layer of the plurality of hidden layers; and (iv) one or more training parameter matrices inserted into a sigmoid function.
		Deoras, however, teaches analyzing a natural language request (i.e. [0023]), including the known technique of a first set of rules comprising a neural network having a plurality of hidden layers (Deoras, see at least: [0030] - “The DNN 126, [i.e. first set of rules comprising a neural network] as will be shown in greater detail herein, comprises an input layer, a plurality of hidden layers [i.e. having a plurality of hidden layers], and an output layer. For instance, a number of hidden layers in the DNN 126 may be between one and ten hidden layers”) and
		the known technique of operating as a function of at least (i) a total number of the plurality of hidden layers (Deoras, see at least: [0042] and [0045] - “The deep neural network 126 comprises a plurality of hidden layers 512-516 [i.e. the plurality of hidden layers], wherein each of the hidden layers 512-516 comprises a respective plurality of nodes (e.g., neurons)” and “as shown in FIG. 5, the DNN 126 can be formed by stacking multiple RBMs on top of one another. Thus, input to an ith RBM is output of an i−1th RBM. The ith stacked RBM can be represented by RBMi, and the weight parameters for such layer can be denoted by θi. Thus, once RBM1 is constructed and pre-trained, the posterior distribution over hidden vectors P(h|v; θi) can be obtained, and h can be sampled, which then becomes input for the second RBM layer: RBM2 [i.e. operating as a function of at least a total number of the plurality of hidden layers]”);
		(ii) a respective hidden dimension vector for each respective hidden layer of the plurality of hidden layers (Deoras, see at least: [0042] and [0045] - “The deep neural network 126 comprises a plurality of hidden layers 512-516 [i.e. the plurality of hidden layers], wherein each of the hidden layers 512-516 comprises a respective plurality of nodes (e.g., neurons)” and “as shown in FIG. 5, the DNN 126 can be formed by stacking multiple RBMs on top of one another. Thus, input to an ith RBM is output of an i−1th RBM. The ith stacked RBM can be represented by RBMi, and the weight parameters for such layer can be denoted by θi. Thus, once RBM1 is constructed and pre-trained, the posterior distribution over hidden vectors P(h|v; θi) [i.e. a respective hidden dimension vector for each respective hidden layer] can be obtained, and h can be sampled, which then becomes input for the second RBM layer: RBM2 [i.e. for each respective hidden layer of the plurality of hidden layers]”); and 
		(iii) one or more training parameter matrices inserted into a sigmoid function (Deoras, see at least: [0042] and [0061] - “a node in the input layer 502 is coupled to all nodes in the hidden layer 512 by way of respective weighted edges, wherein weight assigned to the edges are learned during training [i.e. training parameter matrices]” and “Mathematically, the dynamics of the Elman type of recurrent neural network can be represented as follows… h(t)=f(Ux(t)+Vh(t−1)), (8) where the sigmoid function at the hidden layer is used: f(x) = 1/(1+e^(−x))… [i.e. inserted into a sigmoid function] where U and V are weight matrices [i.e. at least one or more training parameter matrices inserted into a sigmoid function] between the raw input and the hidden nodes, and between the connection nodes in the hidden nodes, respectively, while W is the output weight matrix” see also [0046] and [0052]). These known techniques are applicable to the 
It would have been recognized that applying the known techniques of a first set of rules comprising a neural network having a plurality of hidden layers and operating as a function of at least (i) a total number of the plurality of hidden layers; (ii) a respective hidden dimension vector for each respective hidden layer of the plurality of hidden layers; and (iii) one or more training parameter matrices inserted into a sigmoid function, as taught by Deoras, to the teachings of Hamada in view of Moser would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into a similar system. Further, adding the modification of a first set of rules comprising a neural network having a plurality of hidden layers and operating as a function of at least (i) a total number of the plurality of hidden layers; (ii) a respective hidden dimension vector for each respective hidden layer of the plurality of hidden layers; and (iii) one or more training parameter matrices inserted into a sigmoid function, as taught by Deoras, into the system of Hamada in view of Moser would have been recognized by those of ordinary skill in the art as resulting in an improved system that would allow suitable semantic labels to be assigned for a variety of domains/intents (see Deoras, [0038]).
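
For illustration only, the recurrence quoted above from Deoras [0061], h(t) = f(Ux(t) + Vh(t−1)) with f the sigmoid function, can be sketched as follows. This is a minimal sketch and is not code from Deoras, Hamada, or the present application; the function names, the weight values in U and V, and the input sequence are hypothetical and chosen only to show training parameter matrices being passed through a sigmoid function at a hidden layer.

```python
import math


def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def elman_step(U, V, x_t, h_prev):
    """One hidden-state update: h(t) = f(U x(t) + V h(t-1))."""
    ux = matvec(U, x_t)     # contribution of the current input
    vh = matvec(V, h_prev)  # contribution of the previous hidden state
    return [sigmoid(a + b) for a, b in zip(ux, vh)]


def run_sequence(U, V, inputs, h0):
    """Apply elman_step over a sequence and collect every hidden state."""
    states = []
    h = h0
    for x_t in inputs:
        h = elman_step(U, V, x_t, h)
        states.append(h)
    return states


# Hypothetical weights and inputs (assumptions, not values from any reference).
U = [[0.5, -0.25], [0.1, 0.3]]   # input-to-hidden weight matrix
V = [[0.2, 0.0], [0.0, 0.2]]     # hidden-to-hidden weight matrix
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = run_sequence(U, V, inputs, [0.0, 0.0])

for t, h in enumerate(states, start=1):
    print(t, [round(v, 4) for v in h])
```

Because the sigmoid maps every real number into the open interval (0, 1), each entry of each hidden vector in this sketch lies strictly between 0 and 1 regardless of the weights chosen.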

Regarding claim 3, the combination of Hamada/Moser/Deoras teaches the system of claim 1. Hamada further discloses:
-wherein: the one or more non-transitory storage devices storing the computing instructions are further configured to run on the one or more processors and perform, after determining the intent is the product-related intent: obtaining a third set of rules that define a personalization of the one or more products for display on the electronic device of the user based on at least one of an online order history of the user, an in-store purchase history of the user, or an online browsing history of the user (Hamada, see at least: [0084], [0085], [0101], [0107], [0111] and Fig. 1 - “The usage log DB 105 is a database for storing therein a usage log (history information) that represents the store usage history regarding each registered user” and “Each record corresponds to a single store visit by a user [i.e. an in-store purchase history of the user],” and “the ranker 22 generates the feature vector fi based on various parameters of the current user that are stored in the user DB 104 and based on the context tag group D4 of the record stored in the usage log DB 105, and puts the feature vector fi in Equation (3) [i.e. obtaining a third set of rules that define a personalization of the one or more products]” and “the recommended-item list D5 [i.e. one or more products for display on the electronic device of the user] with ranking is obtained in which the inner context information taken from the natural language request D1 and the preference determined from the usage history of the users and the stores are reflected” and “the ranker 22 explained in the second embodiment is replaced with a ranker 22A, which refers to a context vector D6 instead of referring to the context tag group D4 according to the first embodiment, which refers to a usage log DB 105A instead of referring to the usage log DB 105, but which performs the learning operation and the estimation operation in an identical manner to the ranker 22 according to the first embodiment” Examiner notes that Fig. 1 indicates that the context-recognition-type recommendation engine comes after the semantic analysis engine [i.e. after determining the intent is the product-related intent]); and 
-mapping the entity predicted from the online query to the product metadata associated with the one or more products comprises mapping the entity predicted from the online query to the product metadata associated with the one or more products based on the third set of rules (Hamada, see at least: [0107] and [0111] - “the recommended-item list D5 with ranking is obtained in which the inner context information taken from the natural language request D1 [i.e. mapping the entity predicted from the online query to the product metadata associated with the one or more products] and the preference determined from the usage history of the users [i.e. based on the third set of rules] and the stores are reflected” and “the ranker 22 explained in the second embodiment is replaced with a ranker 22A, which refers to a context vector D6 instead of referring to the context tag group D4 according to the first embodiment, which refers to a usage log DB 105A instead of referring to the usage log DB 105, but which performs the learning operation and the estimation operation in an identical manner to the ranker 22 according to the first embodiment”).

Claims 9 and 11 recite limitations directed towards a method (i.e. [0002]). The limitations recited in claims 9 and 11 are parallel in nature to those addressed above for claims 1 and 3, respectively, and are therefore rejected for those same reasons set forth above in claims 1 and 3, respectively.

Claim 17 recites limitations directed towards a system. The limitations recited in claim 17 are parallel in nature to those addressed above for claim 1 and are therefore rejected for those same reasons set forth above in claim 1.

Regarding claim 19, the combination of Hamada/Moser/Deoras teaches the system of claim 17. Hamada further discloses:
mapping the entity predicted from the online query to the product metadata associated with the one or more products comprises mapping the entity predicted from the online query to the product metadata associated with the one or more products based on at least one of an online order history of the user, an in-store purchase history of the user, or a browsing history of the user (Hamada, see at least: [0084], [0085], [0101] and [0107] - “The usage log DB 105 is a database for storing therein a usage log (history information) that represents the store usage history regarding each registered user” and “Each record corresponds to a single store visit by a user [i.e. an in-store purchase history of the user],” and “the ranker 22 generates the feature vector fi based on various parameters of the current user that are stored in the user DB 104 and based on the context tag group D4 of the record stored in the usage log DB 105, and puts the feature vector fi in Equation (3) [i.e. obtaining a third set of rules that define a personalization of the one or more products]” and “the recommended-item list D5 with ranking is obtained in which the inner context information taken from the natural language request D1 [i.e. mapping the entity predicted from the online query to the product metadata associated with the one or more products] and the preference determined from the usage history of the users [i.e. based on in-store purchase history of the user] and the stores are reflected”).

Claims 2, 4, 10, 12, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hamada, in view of Moser, in further view of Deoras in further view of Burlik et al. (US 2017/0177710 A1), as previously cited and hereinafter Burlik.

Regarding claim 2, the combination of Hamada/Moser/Deoras teaches the system of claim 1. Hamada further discloses:
 (Hamada, see at least: Examiner notes that Fig. 1 indicates that the context-recognition-type recommendation engine comes after the semantic analysis engine [i.e. after determining the intent is the product-related intent]): 
-determining whether the one or more products are available for purchase at a physical store (Hamada, see at least: [0041] and [0043] - “store narrowing-down operation includes narrowing down the stores (candidate items), which represent the recommended candidates, based on the natural language request D1 in which the user demand is expressed” and “The other attributes column has a multi-label format in which zero or more labels are listed from among predetermined other attributes indicating the features of the stores. It is desirable that the labels of other attributes include labels from various perspectives such as the service contents, the product features [i.e. determining whether the one or more products are available for purchase at a physical store], and the ambience”).
Moser further teaches the product information comprises at least one of a price for the one or more products at the physical store or a location of the one or more products at the physical store (Moser, see at least: [0075] - “FIG. 13 shows that a potential buyer is using the search field to enter a product he/she is looking for, in this example an electron microscope” and Fig. 14 displays the search results including the price [i.e. product information comprises at least one of a price for the one or more products at the physical store]). It would have been obvious to one of ordinary skill in the art before the effective filing date to combine Hamada with Moser for the reasons identified with respect to claim 1.


Burlik, however, teaches analyzing natural language input (i.e. abstract), including the known technique of determining a physical store that is closest to the electronic device when the online query is received from the electronic device of the user (Burlik, see at least: [0096] - “user A may state "Where is the closest coffee shop"[i.e. closest to the electronic device of the user], whereupon the computation platform 109 may parse and map user A's statement. The computation platform 109 may cause first query processing level and second query processing level to determine nearby coffee shops. Once the nearby coffee shops are determined, the computation platform 109 may cause the third query processing level to calculate routes to the closest coffee shops, whereupon the route to the closest coffee shop is displayed to the user [i.e. determining a physical store that is closest to the electronic device]”) and the physical store being closest to the electronic device when the online query is received from the electronic device of the user (Burlik, see at least: [0096] - “user A may state "Where is the closest coffee shop"[i.e. closest to the electronic device of the user], whereupon the computation platform 109 may parse and map user A's statement. The computation platform 109 may cause first query processing level and second query processing level to determine nearby coffee shops. Once the nearby coffee shops are determined, the computation platform 109 may cause the third query processing level to calculate routes to the closest coffee shops, whereupon the route to the closest coffee shop is displayed to the user”). This known technique is applicable to the system of the combination of Hamada/Moser/Deoras as they both share characteristics and capabilities, namely, they are directed to analyzing natural language input.


Regarding claim 4, the combination of Hamada/Moser/Deoras teaches the system of claim 1. Hamada further discloses:
-wherein: the one or more non-transitory storage devices storing the computing instructions are further configured to run on the one or more processors and perform, after determining the intent is the product-related intent (Hamada, see at least: Examiner notes that Fig. 1 indicates that the context-recognition-type recommendation engine comes after the semantic analysis engine [i.e. after determining the intent is the product-related intent]): 
1 in which the user demand is expressed” and “The other attributes column has a multi-label format in which zero or more labels are listed from among predetermined other attributes indicating the features of the stores. It is desirable that the labels of other attributes include labels from various perspectives such as the service contents, the product features [i.e. determining whether the one or more products are available for purchase at a physical store], and the ambience”)
-obtaining a third set of rules that define a personalization of the one or more products for display on the electronic device of the user based on at least one of an online order history of the user, an in-store purchase history of the user, or an online browsing history of the user (Hamada, see at least: [0084], [0085], [0101], [0107] and [0111] - “The usage log DB 105 is a database for storing therein a usage log (history information) that represents the store usage history regarding each registered user” and “Each record corresponds to a single store visit by a user [i.e. an in-store purchase history of the user],” and “the ranker 22 generates the feature vector fi based on various parameters of the current user that are stored in the user DB 104 and based on the context tag group D4 of the record stored in the usage log DB 105, and puts the feature vector fi in Equation (3) [i.e. obtaining a third set of rules that define a personalization of the one or more products]” and “the recommended-item list D5 [i.e. one or more products for display on the electronic device of the user] with ranking is obtained in which the inner context information taken from the natural language request D1 and the preference determined from the usage history of the users and the stores are reflected” and “the ranker 22 explained in the second embodiment is replaced with a ranker 22A, which refers to a context vector D6 instead of referring to the context tag group D4 according to the first embodiment, which refers to a usage log DB 105A instead of referring to the usage log DB 105, but which performs the learning operation and the estimation operation in an identical manner to the ranker 22 according to the first embodiment”); and
-mapping the entity predicted from the online query to the product metadata associated with the one or more products comprises mapping the entity predicted from the online query to the product metadata associated with the one or more products based on the third set of rules (Hamada, see at least: [0107] and [0111] - “the recommended-item list D5 with ranking is obtained in which the inner context information taken from the natural language request D1 [i.e. mapping the entity predicted from the online query to the product metadata associated with the one or more products] and the preference determined from the usage history of the users [i.e. based on the third set of rules] and the stores are reflected” and “the ranker 22 explained in the second embodiment is replaced with a ranker 22A, which refers to a context vector D6 instead of referring to the context tag group D4 according to the first embodiment, which refers to a usage log DB 105A instead of referring to the usage log DB 105, but which performs the learning operation and the estimation operation in an identical manner to the ranker 22 according to the first embodiment”).
Moser further teaches Docket No. WMT-16-011-US/0548222the product information comprises at least one of a price for the one or more products at the physical store or a location of the one or more products at the physical store (Moser, see at least: [0075] - “FIG. 13 shows that a potential buyer is using the search field to enter a product he/she is looking for, in this example an electron microscope” and Fig. 14 displays the search results including the price [i.e. product information comprises at least 

The combination of Hamada/Moser/Deoras does not explicitly teach determining a physical store that is closest to the electronic device when the online query is received from the electronic device of the user and the physical store being closest to the electronic device when the online query is received from the electronic device of the user.
Burlik, however, teaches analyzing natural language input (i.e. abstract), including the known technique of determining a physical store that is closest to the electronic device when the online query is received from the electronic device of the user (Burlik, see at least: [0096] - “user A may state "Where is the closest coffee shop"[i.e. closest to the electronic device when the online query is received from the electronic device of the user], whereupon the computation platform 109 may parse and map user A's statement. The computation platform 109 may cause first query processing level and second query processing level to determine nearby coffee shops. Once the nearby coffee shops are determined, the computation platform 109 may cause the third query processing level to calculate routes to the closest coffee shops, whereupon the route to the closest coffee shop is displayed to the user [i.e. determining a physical store that is closest to the electronic device]”) and the physical store being closest to the electronic device when the online query is received from the electronic device of the user (Burlik, see at least: [0096] - “user A may state "Where is the closest coffee shop"[i.e. closest to the electronic device when the online query is received from the electronic device of the user], whereupon the computation platform 109 may parse and map user A's statement. The computation platform 109 may cause first query 
It would have been recognized that applying the known technique of the act of determining a physical store that is closest to the electronic device when the online query is received from the electronic device of the user and the physical store being closest to the electronic device when the online query is received from the electronic device of the user, as taught by Burlik, to the teachings of the combination of Hamada/Moser/Deoras would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar systems. Further, adding the modification of the act of determining a physical store that is closest to the electronic device when the online query is received from the electronic device of the user and the physical store being closest to the electronic device when the online query is received from the electronic device of the user, as taught by Burlik, into the system of the combination of Hamada/Moser/Deoras would have been recognized by those of ordinary skill in the art as resulting in an improved system that would allow the user to interact with the devices without having to limit the user's verbiage to a limited set of directions by processing the natural language input in different levels (see Burlik, [0031]).

Claims 10 and 12 recite limitations directed towards a method. The limitations recited in claims 10 and 12 are parallel in nature to those addressed above for claims 2 and 4, respectively, and are therefore rejected for those same reasons set forth above in claims 2 and 4, respectively.

Claim 18 recites limitations directed towards a system. The limitations recited in claim 18 are parallel in nature to those addressed above for claim 2 and are therefore rejected for those same reasons set forth above in claim 2.

Regarding claim 20, the combination of Hamada/Moser/Deoras teaches the system of claim 17. Hamada further discloses:
-wherein: the one or more non-transitory storage devices storing the computing instructions are further configured to run on the one or more processors and perform, after determining the intent is the product-related intent (Hamada, see at least: Examiner notes that Fig. 1 indicates that the context-recognition-type recommendation engine comes after the semantic analysis engine [i.e. after determining the intent is the product-related intent]):
-determining whether the one or more products are available for purchase at a physical store (Hamada, see at least: [0041] and [0043] - “store narrowing-down operation includes narrowing down the stores (candidate items), which represent the recommended candidates, based on the natural language request D1 in which the user demand is expressed” and “The other attributes column has a multi-label format in which zero or more labels are listed from among predetermined other attributes indicating the features of the stores. It is desirable that the labels of other attributes include labels from various perspectives such as the service contents, the 
-determining a personalization of the one or more products to the user based on at least one of an online order history of the user, an in-store purchase history of the user, or a browsing history of the user (Hamada, see at least: [0084], [0085], [0101] and [0107] - “The usage log DB 105 is a database for storing therein a usage log (history information) that represents the store usage history regarding each registered user” and “Each record corresponds to a single store visit by a user [i.e. an in-store purchase history of the user],” and “the ranker 22 generates the feature vector fi based on various parameters of the current user that are stored in the user DB 104 and based on the context tag group D4 of the record stored in the usage log DB 105, and puts the feature vector fi in Equation (3) [i.e. determining a personalization of the one or more products to the user]” and “the recommended-item list D5 with ranking is obtained in which the inner context information taken from the natural language request D1 and the preference determined from the usage history of the users [i.e. based on at least one of an online order history of the user, an in-store purchase history of the user, or a browsing history of the user] and the stores are reflected”); and 
-mapping the entity predicted from the online query to the product metadata associated with the one or more products comprises mapping the entity predicted from the online query to the product metadata associated with the one or more products based on the personalization of the one or more products (Hamada, see at least: [0107] - “the recommended-item list D5 with ranking is obtained in which the inner context information taken from the natural language request D1 [i.e. mapping the entity predicted from the online query to the product metadata associated with the one or more products] and the preference determined from the usage history 
Moser further teaches the product information comprises at least one of a price for the one or more products at the physical store or a location of the one or more products at the physical store (Moser, see at least: [0075] - “FIG. 13 shows that a potential buyer is using the search field to enter a product he/she is looking for, in this example an electron microscope” and Fig. 14 displays the search results including the price [i.e. product information comprises at least one of a price for the one or more products at the physical store]). It would have been obvious to one of ordinary skill in the art before the effective filing date to combine Hamada with Moser for the reasons identified with respect to claim 17.

The combination of Hamada/Moser/Deoras does not explicitly teach determining a physical store that is closest to the electronic device of the user and the physical store being closest to the electronic device of the user.
Burlik, however, teaches analyzing natural language input (i.e. abstract), including the known technique of determining a physical store that is closest to the electronic device of the user (Burlik, see at least: [0096] - “user A may state "Where is the closest coffee shop"[i.e. closest to the electronic device of the user], whereupon the computation platform 109 may parse and map user A's statement. The computation platform 109 may cause first query processing level and second query processing level to determine nearby coffee shops. Once the nearby coffee shops are determined, the computation platform 109 may cause the third query processing level to calculate routes to the closest coffee shops, whereupon the route to the closest coffee shop is displayed to the user [i.e. determining a physical store that is closest to the electronic 
It would have been recognized that applying the known technique of the act of determining a physical store that is closest to the electronic device of the user and the physical store being closest to the electronic device of the user, as taught by Burlik, to the teachings of the combination of Hamada/Moser/Deoras would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar systems. Further, adding the modification of the act of determining a physical store that is closest to the electronic device of the user and the physical store being closest to the electronic device of the user, as taught by Burlik, into the system of the combination of Hamada/Moser/Deoras would have been recognized by those of ordinary skill in the art as resulting in an improved system that would allow the user to interact with the devices without having to limit the user's verbiage to a limited set of directions by processing the natural language input in different levels (see Burlik, [0031]).

Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Hamada, in view of Moser, in further view of Deoras, and in further view of Byron et al. (US 2017/0169017 A1), as previously cited and hereinafter Byron.

Regarding claim 5, the combination of Hamada/Moser/Deoras teaches the system of claim 1.
The combination of Hamada/Moser/Deoras does not explicitly teach wherein predicting the entity from the online query based on the second set of rules comprises predicting a plurality of entities and a respective confidence level for each respective entity of the plurality of entities from the online query based on the second set of rules. 
Byron, however, teaches natural language processing (i.e. [0070]), including the known technique of predicting the entity from the online query based on a set of rules comprises predicting a plurality of entities and a respective confidence level for each respective entity of the plurality of entities from the online query based on the set of rules (Byron, see at least: [0020] and [0022] - “In response to a natural language query, a cognitive computing system analyzes the unstructured data in its corpus using NLP to understand grammar and context of each information item, and it presents candidate answers [i.e. predicting the entity from the online query] and/or solutions to the user ranked by certainty of correctness” and “when a user asks a new complex question of the system, it searches the corpus to find a plurality of potential answers. It also collects evidence within the corpus, such as how many sources agree on a particular possible answer, and rates the quality of the evidence according to a scoring process [i.e. based on set of rules]. Finally, potential answers [i.e. predicting a plurality of entities] which meet a threshold of confidence of being correct [i.e. a respective confidence level for each 
It would have been recognized that applying the known technique of predicting the entity from the online query based on a set of rules comprises predicting a plurality of entities and a respective confidence level for each respective entity of the plurality of entities from the online query based on the set of rules, as taught by Byron, to the teachings of the combination of Hamada/Moser/Deoras would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar systems. Further, adding the modification of the act of predicting the entity from the online query based on a set of rules comprises predicting a plurality of entities and a respective confidence level for each respective entity of the plurality of entities from the online query based on the set of rules, as taught by Byron, into the system of the combination of Hamada/Moser/Deoras would have been recognized by those of ordinary skill in the art as resulting in an improved system that would allow a search computer system to detect disagreement (see Byron, [0015]).

	Claim 13 recites limitations directed towards a method. The limitations recited in claim 13 are parallel in nature to those addressed above for claim 5, and are therefore rejected for those same reasons set forth above in claim 5.

Allowable Subject Matter
Dependent Claims 6-8 and 14-16
Dependent claims 6-8 and 14-16 are allowed over 35 USC § 103. Upon review of the evidence at hand, it is hereby concluded that the totality of the evidence, alone or in combination, neither anticipates, reasonably teaches, nor renders obvious the below noted features of the applicant’s invention. 
The allowable features over 35 USC § 103 for claims 6 and 14 are as follows:
-wherein the first set of rules comprises: converting the online query into a sequence of word vectors $\{x_t\}^{T}$, where x is the word vector, t is an index of the plurality of hidden layers, T is the total number of the plurality of hidden layers, and each respective hidden layer of the plurality of hidden layers comprises a respective hidden dimension vector $\{h_{t-1}\}^{T}$ input for the respective hidden dimension vector, at least one respective corresponding word vector $\{x_t\}^{T}$ input, and a respective output $\{h_t\}^{T}$ as an input for a next respective hidden layer of the plurality of hidden layers, where h is a hidden dimension; 
applying a formula $h_t = \sigma(h_{t-1}H + x_tW)$, where H and W are respective training parameter matrices of the one or more training parameter matrices used to learn from training data, and σ is a sigmoid function $\sigma(z) = 1/(1+e^{-z})$, where z is a dummy variable and e is an exponential function; 
inserting the output from a last hidden layer $h_T$ into a softmax function to obtain a probability over a plurality of intents as $y = s(h_T U + b)$, where f is a final output after softmax transformation, s is the softmax function $s(z_i) = e^{z_i} / \sum_{j=1}^{n} e^{z_j}$, n is a total number of the plurality of intents, i and j are respective indexes for summation, and matrices $W \in R^{d \times h}$, $H \in R^{h \times h}$, $U \in R^{h \times n}$ and $b \in R^{1 \times h}$ are model parameters that are determined by model training, where R is defined as a real number set; 
manually labeling each respective training example with a respective intent of the plurality of intents; 
predicting a probability for each position in y to be 1 with the softmax function, where y is a one-hop vector that represents a true label of the online query, only a position corresponding to the respective intent, as manually labeled, takes value 1, and a remainder of positions in the one-hop vector take value 0; 
optimizing the model parameters by minimizing cross-entropy loss using an equation comprising: $-\frac{1}{N}\sum_{k=1}^{N}\sum_{i=1}^{n} y_i^{(k)} \log\left(\hat{y}_j^{(k)}\right)$, where N is a number of training examples that effectively minimizes the error of model output ŷ compared with ground-truth label y; 
using a stochastic gradient descent method for batch parameter optimization; and 
calculating ŷ using optimized parameters and a highest probability intent of the plurality of intents, the highest probability intent having a highest probability given by ŷ as a final predicted intent to make a prediction after training.
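
By way of illustration only, the forward computation recited above may be sketched in Python as follows. This is a minimal sketch that is not taken from the application or from any cited reference. The function names (`predict_intent`, `cross_entropy`, `vec_mat`) are hypothetical, the initial hidden state is assumed to be a zero vector, and the bias b is assumed to have length n so that it can be added to the product of the last hidden state and U. Training by stochastic gradient descent is omitted.

```python
import math


def sigmoid(z):
    """Sigmoid function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(z):
    """Softmax over a list of scores (shifted by the max for stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def vec_mat(v, M):
    """Row vector v (length r) times matrix M (r rows, c columns)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def predict_intent(word_vectors, W, H, U, b):
    """Return a probability for each intent.

    W is d x h, H is h x h, U is h x n, b has length n (assumption).
    """
    h = [0.0] * len(H)  # assumed zero initial hidden state
    for x_t in word_vectors:
        # h_t = sigmoid(h_{t-1} H + x_t W)
        pre = [a + c for a, c in zip(vec_mat(h, H), vec_mat(x_t, W))]
        h = [sigmoid(p) for p in pre]
    # softmax over (h_T U + b)
    logits = [a + c for a, c in zip(vec_mat(h, U), b)]
    return softmax(logits)


def cross_entropy(y_true, y_hat):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(y * math.log(p) for y, p in zip(y_true, y_hat))
```

Under these assumptions, the final predicted intent is the index of the largest value returned by `predict_intent`, and averaging `cross_entropy` over the N labeled training examples gives the loss that the recited optimization would minimize.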

The allowable features over 35 USC § 103 for claims 7 and 15 are as follows:
-wherein the second set of rules comprises: determining context of a word in the online query by representing the context as a window comprising the word concatenated with additional words neighboring the word on both sides as $x^{(t)} = [x_{t-k}, \ldots, x_t, \ldots, x_{t+k}]$, where x is the word vector, t is an index of the plurality of hidden layers, $\{x_i\}_{i=t-k}^{t+k}$ is a word vector representation of word $x_t$ and the additional words, and k is a word window size;
predicting the entity of the word $x_t$; 
calculating a hidden layer $h = \tanh(x^{(t)}W + b_1)$ to get a respective probability of each respective entity of a plurality of entities by using an equation comprising $\hat{y} = s(hU + b_2)$, where tanh is the hyperbolic tangent function, W is one or more first matrix variables for training, $b_1$ and $b_2$ are respective intercept variables for training, ŷ is an output from a logistic function, h is an output from a tanh function, s is a softmax function, and a ground-truth label of the entity for the word is represented as a one-hop vector y; and 
optimizing parameters $W \in R^{(2k+1) \times h}$, $b_1 \in R^{h}$, $U \in R^{h \times n}$, $b_2 \in R^{n}$ by minimizing a cross-entropy loss using a stochastic gradient descent method and a batch optimization, where U is one or more second matrix variables for training, R is defined as a real number set, and h and n are dimensions of a matrix.
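
By way of illustration only, the window-based entity prediction recited above may be sketched in Python as follows. This is a minimal sketch that is not taken from the application or from any cited reference. The function names (`window`, `predict_entity`, `vec_mat`) are hypothetical. Positions that fall outside the query are assumed to be padded with zero vectors, and W is assumed to have one row per element of the concatenated window, that is, (2k+1) times the word vector length. Training is omitted.

```python
import math


def softmax(z):
    """Softmax over a list of scores (shifted by the max for stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def vec_mat(v, M):
    """Row vector v (length r) times matrix M (r rows, c columns)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def window(word_vectors, t, k):
    """Concatenate the word vectors at positions t-k .. t+k.

    Positions outside the query are zero-padded (assumption).
    """
    d = len(word_vectors[0])
    ctx = []
    for i in range(t - k, t + k + 1):
        if 0 <= i < len(word_vectors):
            ctx.extend(word_vectors[i])
        else:
            ctx.extend([0.0] * d)
    return ctx


def predict_entity(word_vectors, t, k, W, b1, U, b2):
    """Return a probability for each entity label of the word at position t."""
    x_win = window(word_vectors, t, k)
    # h = tanh(x^(t) W + b1)
    h = [math.tanh(a + c) for a, c in zip(vec_mat(x_win, W), b1)]
    # softmax over (h U + b2)
    return softmax([a + c for a, c in zip(vec_mat(h, U), b2)])
```

Under these assumptions, each value returned by `predict_entity` can be read as a per-entity probability for the word at position t, and the entity with the largest value is the predicted entity for that word.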

The allowable features over 35 USC § 103 for claims 8 and 16 are as follows:
-the one or more non-transitory storage devices storing computing instructions are further configured to run on the one or more processors and perform, after determining the intent is the product related intent:
	determining a physical store that is closest to the electronic device of the user;
	determining whether the one or more products are available for purchase in the physical store that is closest to the electronic device of the user; and
	obtaining a third set of rules that define a personalization of the one or more products for display on the electronic device of the user based on at least one of an online order history of the user, an in-store purchase history of the user, or an online browsing history of the user;
	mapping the entity predicted from the online query to the product metadata associated with the one or more products comprises mapping the entity predicted from the online query to the product metadata associated with the one or more products based on the third set of rules;
	the product information comprises at least one of a price for the one or more products at the physical store or a location of the one or more products at the physical store based on the third set of rules;
	predicting the entity from the online query based on the second set of rules comprises predicting a plurality of entities and a confidence level for each entity of the plurality of entities from the online query based on the second set of rules;
	the first set of rules comprises:
	converting the online query into a sequence of word vectors $\{x_t\}^{T}$, where x is the word vector, t is an index of the plurality of hidden layers, T is the total number of the plurality of hidden layers, and each respective hidden layer of the plurality of hidden layers comprises a respective hidden dimension vector $\{h_{t-1}\}^{T}$ input for the respective hidden dimension vector, at least one corresponding word vector $\{x_t\}^{T}$ input, and a respective output $\{h_t\}^{T}$ as an input for a next respective hidden layer of the plurality of hidden layers, where h is a hidden dimension;
	applying a formula $h_t = \sigma(h_{t-1}H + x_tW)$, where H and W are respective training parameter matrices of the one or more training parameter matrices used to learn from training data, and σ is a sigmoid function $\sigma(z) = 1/(1+e^{-z})$, where z is a dummy variable and e is an exponential function; 
inserting the output from a last hidden layer $h_T$ into a softmax function to obtain a probability over a plurality of intents as $y = s(h_T U + b)$, where f is a final output after softmax transformation, s is the softmax function $s(z_i) = e^{z_i} / \sum_{j=1}^{n} e^{z_j}$, n is a total number of the plurality of intents, i and j are indexes for summation, and matrices $W \in R^{d \times h}$, $H \in R^{h \times h}$, $U \in R^{h \times n}$ and $b \in R^{1 \times h}$ are model parameters that are determined by model training, where R is defined as a real number set; 
manually labeling each respective training example with a respective intent of the plurality of intents; 
predicting a probability for each position in y to be 1 with the softmax function, where y is a one-hop vector that represents a true label of the online query, only a position corresponding to the respective intent, as manually labeled, takes value 1, and a remainder of positions in the one-hop vector take value 0; 
optimizing the model parameters by minimizing cross-entropy loss using an equation comprising: $-\frac{1}{N}\sum_{k=1}^{N}\sum_{i=1}^{n} y_i^{(k)} \log\left(\hat{y}_j^{(k)}\right)$, where N is a number of training examples that effectively minimizes the error of model output ŷ compared with ground-truth label y; 
using a stochastic gradient descent method for batch parameter optimization; and 
calculating ŷ using optimized parameters and a highest probability intent of the plurality of intents, the highest probability intent having a highest probability given by ŷ as a final predicted intent to make a prediction after training;
the second set of rules comprises:
determining context of a word in the online query by representing the context as a window comprising the word concatenated with additional words neighboring the word on both sides as $x^{(t)} = [x_{t-k}, \ldots, x_t, \ldots, x_{t+k}]$, where x is a word vector, t is an index of the plurality of hidden layers, $\{x_i\}_{i=t-k}^{t+k}$ is a word vector representation of word $x_t$ and the additional words, k is a word window size;
predicting the entity of the word $x_t$;
calculating a hidden layer $h = \tanh(x^{(t)}W + b_1)$ to get a respective probability of each respective entity of a plurality of entities by using an equation comprising $\hat{y} = s(hU + b_2)$, where tanh is the hyperbolic tangent function, W is one or more first matrix variables for training, $b_1$ and $b_2$ are respective intercept variables for training, ŷ is an output from a logistic function, h is an output from a tanh function, s is the softmax function, and a ground-truth label of the entity for the word is represented as a one-hop vector y; and
optimizing parameters $W \in R^{(2k+1) \times h}$, $b_1 \in R^{h}$, $U \in R^{h \times n}$, $b_2 \in R^{n}$ by minimizing a cross-entropy loss using a stochastic gradient descent method and a batch optimization, where U is one or more second matrix variables for training, R is defined as a real number set, and h and n are dimensions of a matrix.

The closest prior art is Hamada (US 2016/0125048 A1) and Deoras (US 2015/0066496 A1). Hamada discloses utilizing a mathematical model based on Matrix-Vector RNNs (MV-RNN) as well as Word2Vec (see Hamada, [0117]-[0129]), and Deoras teaches an Elman type of recurrent neural network which includes a sigmoid function at the hidden layers and a softmax function (see Deoras, [0061]). However, neither Hamada nor Deoras teaches matrices $W \in R^{d \times h}$, $H \in R^{h \times h}$, $U \in R^{h \times n}$ and $b \in R^{1 \times h}$ being model parameters that are determined by model training, where R is defined as a real number set, or optimizing parameters $W \in R^{(2k+1) \times h}$, $b_1 \in R^{h}$, $U \in R^{h \times n}$, $b_2 \in R^{n}$.
Examiner further emphasizes the claims as a whole and hereby asserts that the totality of the evidence fails to set forth, either explicitly or implicitly, an appropriate rationale for further modification of the evidence at hand to arrive at the claimed invention. The combination of features as claimed would not be obvious to one of ordinary skill in the art as combining various references from the totality of evidence to reach the combination of features as claimed would be a substantial reconstruction of Applicant’s claimed invention relying on improper hindsight bias.


Response to Arguments
Rejections under 35 U.S.C. §103
		Applicant’s arguments have been considered but are moot because the arguments do not apply to the current combination of references being used.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
-Kurata et al. (US 2017/0061330 A1) teaches utilizing sigmoid activation in a label prediction layer of a plurality of hidden layers in a neural network.
-He et al. (US 2017/0293638 A1) teaches weight matrix parameters and bias vector parameters being input in a sigmoid function.
-Gao et al. (US 2016/0342895 A1) teaches a sigmoid function used as an activation function in LSTM neural networks. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARIELLE E WEINER whose telephone number is (571)272-9007.  The examiner can normally be reached on M-F 8:30-5:00.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jason Dunham, can be reached at 571-272-8109.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ARIELLE E WEINER/            Examiner, Art Unit 3684