DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Step 1: Claims 1-19 are directed to a method. Thus, each independent claim, on its face, is directed to one of the statutory categories of 35 U.S.C. § 101. However, claims 1-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 2A-Prong 1: Claim 1 as a whole recites a method of organizing human interactions. The claimed invention is a method that initializes a set of parameters, provides a command input, produces an output, and utilizes the output to cause an action, which is a method of managing interactions between people. Claim 16 as a whole recites a method of organizing human interactions. The claimed invention is a method that produces and provides a command input and produces an output. The mere nominal recitation of a generic content server and a generic computer-based learning model does not take the claim out of the methods of organizing human interactions grouping. Thus, the claims recite an abstract idea.
Step 2A-Prong 2: The claim as a whole merely describes how to generally "apply" the concept of initializing, providing, producing, and utilizing information in a computer environment. The claimed computer components are recited at a high level of generality and are merely invoked as tools to perform generic computer functions. Simply implementing the abstract idea on a generic computer is not a practical application of the abstract idea.

Step 2B: As noted previously, the claim as a whole merely describes how to generally "apply" the concept of initializing, providing, producing, and utilizing information in a computer environment. Thus, even when viewed as a whole, nothing in the claim adds significantly more (i.e., an inventive concept) to the abstract idea. The claim is ineligible.

Dependent claims 2 and 3: These claims recite limitations that further define the abstract idea noted in claim 1. In addition, they recite the additional element of receiving feedback data from a feedback sensor, which is recited at a high level of generality (i.e., as a general means of gathering network traffic data for use in the comparison step) and amounts to mere data gathering, which is a form of insignificant extra-solution activity. The sensor is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to the abstract idea.

Dependent claim 4: This claim recites limitations that further define the abstract idea noted in claim 3. In addition, it recites the additional element of the computer-based learning module for producing output. The computer-based learning module is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to the abstract idea.
Dependent claim 5: This claim recites limitations that further define the abstract idea noted in claim 4. In addition, it recites the additional element of storing information in computer-based memory. The computer-based memory is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to the abstract idea.

Dependent claims 6-15: These claims recite limitations that further define the abstract idea noted in the above claims. The computer-based learning module is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to the abstract idea.
Dependent claims 17-19: These claims recite limitations that further define the abstract idea noted in claim 16 above. These claims do not contain any further additional elements per Step 2A, Prong 2. Therefore, they are considered patent ineligible for the reasons given above.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-5, 11-17 and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Mohammed (US Pub. No. 2018/0374138 A1).

With respect to claim 1, Mohammed teaches a method comprising:
initializing a set of parameters for a computer-based learning model (Fig. 4 discloses initializing an action-value function Q with random weights for episode = 1 [set of parameters], and paragraph [0023] discloses initializing, or facilitating performance of, certain actions with or in the online environment);
providing a command input into the computer-based learning model as part of a trial, wherein the command input calls for producing (paragraph [0023] discloses that the agent can be configured to receive an instruction or recommendation of a deep reinforcement learning system and perform an action in the online environment (e.g., present certain purchase recommendations [produce] to selected users via a website, email, or mobile application) based on the received instruction) a specified reward within a specified amount of time in an environment external to the computer-based learning model (paragraph [0026] discloses an online environment using an artificial intelligence (AI) system, such as a deep reinforcement learning system, which is configured to leverage delayed and partial rewards);
producing an output with the computer-based learning model based on the command input (paragraph [0031] discloses that, in response to the observation, the deep reinforcement learning system determines or calculates rewards; generally, a reward is a numeric value that characterizes user action performance in the online environment and the timing of the user action [producing output]); and
utilizing the output to cause an action in the environment external to the computer-based learning model (paragraph [0031] discloses that a reward [output] is a numeric value that characterizes a user action performed in the online environment and the timing of the user action; each reward is a function of the observation made in the online environment and time, and the reward is used by the deep reinforcement learning system to select a particular action to be performed by the agent in response to the observation; and paragraph [0032] discloses that the deep reinforcement learning system instructs the agent to perform one or more actions selected from a predetermined set of actions depending on the reward).

With respect to claim 2, Mohammed teaches the elements of claim 1. Furthermore, Mohammed teaches the method further comprising: receiving feedback data from one or more feedback sensors in the external environment after the action (Fig. 2, 215, discloses a feedback API to report, and paragraph [0043] discloses that reinforcement learning system 105 selects or determines actions to be performed by agent 110, which interfaces with online environment 115, based on rewards ..., calculates a reward based on at least one observation, and selects one or more actions to be performed by agent 110 based on the calculated reward ..., where an observation can refer to user feedback in response to displaying a purchase recommendation).

With respect to claim 3, Mohammed teaches the elements of claim 2. Furthermore, Mohammed teaches the method wherein the feedback data comprises data that represents an actual reward produced in the external environment by the action (paragraph [0043] discloses that an observation can refer to user feedback in response to displaying a purchase recommendation ..., and paragraph [0044] discloses that deep reinforcement learning system 105 can again determine or calculate a next reward).
With respect to claim 4, Mohammed teaches the elements of claim 3. Furthermore, Mohammed teaches the method wherein the output produced by the computer-based learning model depends on the set of parameters for the computer-based learning model (paragraph [0043] discloses calculating a reward based on at least one observation and selecting one or more actions [parameters]).
With respect to claim 5, Mohammed teaches the elements of claim 4. Furthermore, Mohammed teaches the method further comprising storing a copy of the set of parameters in computer-based memory (paragraph [0079] discloses that memory can be configured to store information within computing system 600 during operation ..., and that memory 620 can store instructions to perform the method for delivering purchase recommendations).

With respect to claim 11, Mohammed teaches the elements of claim 1. Furthermore, Mohammed teaches the method wherein the computer-based learning model is an artificial neural network (paragraph [0026] discloses an artificial intelligence (AI) system, a neural network, and paragraph [0033] discloses that the deep reinforcement learning system may use one or more neural networks or AI systems).
With respect to claim 12, Mohammed teaches the elements of claim 1. Furthermore, Mohammed teaches the method wherein the specified reward in the specified amount of time indicated in the command input represents something other than simply an optimization of reward and time (paragraph [0046] discloses that, obviously, other time parameters can be used …).

With respect to claim 13, Mohammed teaches the elements of claim 1. Furthermore, Mohammed teaches the method wherein the command input represents something other than a simple desire to produce a specific total reward in a specific amount of time (paragraph [0076] discloses sending an instruction or command to agent 110 to perform the selected action).


With respect to claim 14, Mohammed teaches the elements of claim 1. Furthermore, Mohammed teaches the method further comprising producing the command input to match an already observed event (paragraph [0070] discloses receiving a current observation characterizing the user action or online environment 115 based on the interaction of the user with the online environment 115 and at least one purchase recommendation [input matching an already observed event]).

With respect to claim 15, Mohammed teaches the elements of claim 14. Furthermore, Mohammed teaches the method wherein the already observed event already produced the specified reward in the specified amount of time (paragraph [0046] discloses a time-decay reward, and paragraph [0049] discloses a time for purchase intent ..., where the reward is calculated based upon two or more observations).

With respect to claim 16, Mohammed teaches a method comprising:
producing a command input into the computer-based learning model, wherein the command input calls for producing an event that matches an event that the computer-based model already observed (Fig. 5, 515 and 520, discloses receiving a current observation characterizing an online environment based on interaction of the user with the online environment, determining a reward value based on the current observation, and selecting or identifying an action to be performed by an agent based on the reward value [match an event]; paragraph [0066] discloses delivering a purchase recommendation [producing an event match]; see also paragraphs [0071]-[0072]);
providing a command input into the computer-based learning model (paragraph [0023] discloses that the agent can be configured to receive an instruction or recommendation of a deep reinforcement learning system and perform an action in the online environment (e.g., present certain purchase recommendations [produce] to selected users via a website, email, or mobile application) based on the received instruction); and
producing an output with the computer-based learning model based on the command input (paragraph [0031] discloses that, in response to the observation, the deep reinforcement learning system determines or calculates rewards; generally, a reward is a numeric value that characterizes user action performance in the online environment and the timing of the user action [producing output]).
 	
With respect to claim 17, Mohammed teaches the elements of claim 16. Furthermore, Mohammed teaches the method wherein the command input calls for producing a specified reward within a specified amount of time in an environment external to the computer-based learning model, and wherein the already observed event produced the specified reward in the specified amount of time (Fig. 5, 515 and 520, discloses receiving a current observation characterizing an online environment based on interaction of the user with the online environment, determining a reward value based on the current observation, and selecting or identifying an action to be performed by an agent based on the reward value [match an event], and paragraph [0066] discloses Table 1 or a value calculated with discounting .. reward).

With respect to claim 19, Mohammed teaches the elements of claim 16. Furthermore, Mohammed teaches the method further comprising utilizing the output to cause an action in the environment external to the computer-based learning model.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim(s) 6-8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Mohammed (US Pub. No. 2018/0374138 A1) in view of Hsieh et al. (US Pub. No. 2018/0144214 A1).

With respect to claim 6, Mohammed teaches the elements of claim 5. Furthermore, Mohammed teaches a virtual environment that can react or be modified in response to user actions, inputs, or interactions (paragraph [0022]). Mohammed fails to teach adjusting the set of parameters in the copy to produce an adjusted set of parameters.

However, Hsieh teaches adjusting the set of parameters in the copy to produce an adjusted set of parameters (paragraph [0134] discloses that parameters represent variables to be adjusted). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the user action, input, or interaction [parameters] of Mohammed with the adjusting of parameters during the learning process of Hsieh in order to represent a generic input-to-output mapping so that the output can be determined with high accuracy (see Hsieh, paragraph [0136]).

With respect to claim 7, Mohammed teaches the elements of claim 6. Furthermore, Mohammed teaches a virtual environment that can react or be modified in response to user actions, inputs, or interactions (paragraph [0022]). Mohammed fails to teach wherein the set of parameters in the copy are adjusted using supervised learning based on actual prior command inputs to the computer-based learning model and actual resulting feedback data.
However, Hsieh teaches wherein the set of parameters in the copy are adjusted using supervised learning based on actual prior command inputs to the computer-based learning model and actual resulting feedback data (paragraph [0069] discloses that supervised deep learning machines can also be used to reduce susceptibility to false classification, paragraph [0134] discloses that parameters represent variables to be adjusted, and paragraph [0136] discloses adjusting parameters during the learning process). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the user action, input, or interaction [parameters] of Mohammed with the supervised deep learning machine used during the learning process of Hsieh in order to reduce susceptibility to false classification (see Hsieh, paragraph [0136]).


With respect to claim 8, Mohammed teaches the elements of claim 7. Furthermore, Mohammed teaches the method further comprising: generating an output from a received input in accordance with current values of a respective set of parameters (paragraph [0034]). Mohammed fails to teach periodically replacing the set of parameters used by the computer-based learning model to produce outputs with the adjusted set of parameters.
However, Hsieh teaches periodically replacing the set of parameters used by the computer-based learning model to produce outputs with the adjusted set of parameters (paragraph [0127] discloses that the training model can be one or more of the factors 1520-1540 to be processed and train an updated model to adjust settings, adjust output, request input, etc., periodically and/or otherwise upon reaching a threshold, satisfying a criterion, etc.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the user action, input, or interaction [parameters] of Mohammed with the adjusting of parameters during the learning process of Hsieh in order to represent a generic input-to-output mapping so that the output can be determined with high accuracy (see Hsieh, paragraph [0136]).
With respect to claim 18, Mohammed teaches the elements of claim 16. Furthermore, Mohammed teaches the method further comprising: mapping the command input to an action that matches an observed action from the observed event (paragraph [0048] discloses that an observation characterizing one or more user actions is collected or identified by the deep reinforcement learning system). Mohammed fails to teach that the mapping to the observed action is performed through supervised learning.
However, Hsieh teaches observing an action through supervised learning (paragraph [0061] discloses that observable features include objects and quantifiable regularities learned by the machine during supervised learning). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the user action, input, or interaction [parameters] of Mohammed with the supervised deep learning machine used during the learning process of Hsieh in order to reduce susceptibility to false classification (see Hsieh, paragraph [0136]).




Claim(s) 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Mohammed (US Pub. No. 2018/0374138 A1) in view of Hsieh et al. (US Pub. No. 2018/0144214 A1) and further in view of Schmidt et al. (US Pub. No. 2019/0213099 A1).

With respect to claim 9, Mohammed teaches the elements of claim 8. Furthermore, Mohammed teaches initializing an action value and sequence (Fig. 4) and initializing or facilitating performance of certain actions with or in the online environment (paragraph [0023]), and Hsieh teaches that the training deep learning network of the factory 1520, 1520, 1540 can be initialized using random numbers for all layers, .., which can be initialized to zero (paragraph [0129]). Neither Mohammed nor Hsieh teaches initializing a value in a timer for the trial prior to producing the output to cause the action in the external environment and incrementing the value in the timer to a current value if the trial is not complete after causing the action in the external environment.
However, Schmidt teaches initializing a value in a timer for the trial prior to producing the output to cause the action in the external environment; and incrementing the value in the timer to a current value if the trial is not complete after causing the action in the external environment (paragraph [0018] discloses that time stamps including the time at which each call occurred can be divided into a number of shorter sequences that each correspond to a particular time interval and shorter sequence .., and paragraph [0021] discloses a time stamp indicating a time at which the respective system call was made; the sequence of system calls can be divided into a plurality of time interval system call sequences, each time interval system call sequence including system calls corresponding to a particular time interval). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the initialized values of Mohammed and Hsieh with the time stamp indicating a time at which the respective system call was made of Schmidt in order to measure the particular time interval, which can correspond to a time interval for the resource usage statistics of the one or more resources in the computing system (see Schmidt, paragraph [0021]).
With respect to claim 10, Mohammed teaches the elements of claim 9. Furthermore, Mohammed teaches initializing an action value and sequence (Fig. 4) and initializing or facilitating performance of certain actions with or in the online environment (paragraph [0023]), and Hsieh teaches that the training deep learning network of the factory 1520, 1520, 1540 can be initialized using random numbers for all layers, .., which can be initialized to zero (paragraph [0129]). Neither Mohammed nor Hsieh teaches updating a time associated with adjusting the set of parameters in the copy to match the current value.

However, Schmidt teaches updating a time associated with adjusting the set of parameters in the copy to match the current value (paragraph [0018] discloses that time stamps including the time at which each call occurred can be divided into a number of shorter sequences that each correspond to a particular time interval and shorter sequence .., and paragraph [0021] discloses a time stamp indicating a time at which the respective system call was made; the sequence of system calls can be divided into a plurality of time interval [update time] system call sequences, each time interval system call sequence including system calls corresponding to a particular time interval). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the initialized values of Mohammed and Hsieh with the time stamp indicating a time at which the respective system call was made of Schmidt in order to measure the particular time interval, which can correspond to a time interval for the resource usage statistics of the one or more resources in the computing system (see Schmidt, paragraph [0021]).

Prior art on the record:

Mohammed (US Pub. No. 2018/0374138 A1) discloses systems, methods, and computer-readable media for delivering recommendations to personalize user experience, optimize online advertising, and maximize revenue for online merchants.

Hsieh et al. (US Pub. No. 2018/0144214 A1) discloses methods and apparatus to automatically generate an image quality metric for an image. An example method includes automatically processing a first medical image using a deployed learning network model to generate an image quality metric for the first medical image, the deployed learning network model generated from a digital learning and improvement factory including a training network, wherein the training network is tuned using a set of labeled reference medical images of a plurality of image.
Schmidt et al. (US Pub. No. 2019/0213099 A1) discloses a method for monitoring resources in a computing system having system information, which includes transforming, via representation learning, variable-size information into fixed-size information, creating a machine learning neural network model, and training the machine learning model to predict future resource usage of an application.



Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SABA DAGNEW whose telephone number is (571)270-3271. The examiner can normally be reached 9-6:45.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Waseem Ashraf, can be reached at (571) 270-3948. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SABA DAGNEW/
Primary Examiner, Art Unit 3682