DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is responsive to the Application filed on 01/14/2019. Claims 1-20 are pending in the case. Claims 1, 11, and 20 are independent claims.

Claim Objections
Claims 5, 6, 15, and 16 are objected to because of the following informalities:
Claims 5 and 15 do not include an “and” in front of the last limitation.
Claims 6 and 16 recite “across the across” which seems to be a typographical error. At minimum “the across” does not have a sufficient antecedent basis.
Appropriate correction is required.

Specification
The disclosure is objected to because of the following informalities:
Paragraph 28 recites “and then stores the task distributions 150 in the task database 120.” However, both figure 1 and paragraphs 28-31 refer to element 120 as the artifact database.
Paragraph 82 recites “tasks 152.” However, there appears to be no item labeled 152 in the figures.
Paragraph 81 recites “the ranked pattern list 740.” However, there appears to be no item labeled 740 in the figures.
Appropriate correction is required.

Claim Rejections - 35 U.S.C. § 101
35 U.S.C. § 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.

Independent claim 1 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
	Step 1:
The claim is directed towards the statutory category of a process.
Step 2A Prong 1:
The claim recites a mental process. The mental process recited is:
A… method for automatically recommending workflows for software-based tasks, the method comprising:
computing an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user;
applying a first set of commands associated with the target user to a trained… learning model to determine a target distribution of weights applied to a set of tasks, wherein the trained… learning model maps different sets of commands to different distributions of weights applied to the set of tasks;
determining a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights;
generating a recommendation that specifies the first training item….
	Under the broadest reasonable interpretation, these limitations are process steps that cover mental processes including an observation, evaluation, judgment or opinion that could be performed in the human mind or with the aid of pencil and paper but for the recitation of a generic computer component. If a claim, under its broadest reasonable interpretation, covers a mental process but for the recitation of generic computer components, then it falls within the "Mental Process" grouping of abstract ideas. A person would readily be able to perform this process either mentally or with the assistance of pen and paper. See MPEP § 2106.04(a)(2).
Step 2A Prong 2: 
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). 
The following limitations are merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f): computer-implemented method; and machine-learning model.
The following limitations are adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP § 2106.05(g): transmitting the recommendation to a user to assist the user in performing a particular task.
A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the judicial exception. See MPEP § 2106.04(d). 
Step 2B:
The claimed invention does not recite any additional elements/limitations that amount to significantly more. 
The following limitations are merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f): computer-implemented method; and machine-learning model.
The following limitations are adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP § 2106.05(g): transmitting the recommendation to a user to assist the user in performing a particular task. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “receiving and transmitting data over a network” is a well‐understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim).
The claimed invention recites an abstract idea without significantly more.

Dependent claim 2 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is:
partitioning the plurality of training items across the set of tasks based on different distributions of weights applied to the set of tasks to generate a plurality of task sets;
for each task set included in the plurality of task sets, performing one or more frequent pattern mining operations on at least one set of commands to generate a distribution of frequencies across a task-specific set of command patterns; and
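setting the set of command patterns equal to the union of the task-specific sets of command patterns.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more.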


Dependent claim 3 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is:
generating a plurality of task-specific frequency distributions across different task-specific sets of command patterns based on different distributions of weights applied to the set of tasks, a Frequent Pattern Growth algorithm, and different sets of commands associated with the plurality of training items; and
setting the set of command patterns equal to the union of the different task-specific sets of command patterns.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 4 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
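The claim recites a mental process. The mental process recited is: performing one or more bi-term topic modeling operations based on at least two sets of commands associated with the plurality of training items to generate the trained machine-learning model.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more.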


Dependent claim 5 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: computing the expected distribution of frequencies comprises:
computing a plurality of similarity scores based on the different distributions of frequencies across the set of command patterns, wherein each similarity score is associated with both the target user and a different user included in a plurality of users;
combining the different distributions of frequencies based on the similarity scores to generate the expected distribution of frequencies.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 
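For illustration only, the two steps recited in claim 5 (computing similarity scores, then combining the per-user distributions according to those scores) can be sketched in Python. The cosine similarity measure, the score-weighted average, and every name below (`cosine_similarity`, `expected_distribution`, the sample vectors) are assumptions made for this sketch; the claim language does not specify a particular similarity measure or combination rule.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two frequency vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def expected_distribution(target, others):
    """Combine other users' distributions, weighted by similarity to the target."""
    scores = [cosine_similarity(target, other) for other in others]
    total = sum(scores)
    if total == 0:
        # No similar users: return an all-zero distribution (assumed fallback).
        return [0.0] * len(target)
    return [
        sum(score * other[i] for score, other in zip(scores, others)) / total
        for i in range(len(target))
    ]


# Hypothetical frequencies over three command patterns.
target = [0.5, 0.5, 0.0]
others = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
expected = expected_distribution(target, others)
```

In this hypothetical input, the second user shares no command patterns with the target user, so that user's distribution receives zero weight in the combination.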

Dependent claim 6 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: determining the first training item comprises:
determining a first command pattern from the set of command patterns based on the expected distribution of frequencies and a first distribution of frequencies across the across the set of command patterns that is associated with the target user;
performing one or more filtering operations on the plurality of training items based on the first command pattern to determine a set of matching training items; and
performing at least one of a ranking and a filtering operation on the set of matching training items based on the target distribution of weights to determine the first training item.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 7 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: generating the recommendation comprises:
determining that a first popularity score associated with the first training item is greater than a second popularity score associated with a second training item; and
adding the first training item but not the second training item to a list associated with the recommendation.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 8 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.

The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 9 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: the first training item comprises a video, a document, a tutorial, or a website.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 10 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: the particular task is included in the set of tasks.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Independent claim 11 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
	Step 1:
The claim is directed towards the statutory category of an article of manufacture.
Step 2A Prong 1:
The claim recites a mental process. The mental process recited is:
… automatically recommend workflows for software-based tasks by performing the steps of:
computing an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user;
applying a first set of commands associated with the target user to a trained… learning model to determine a target distribution of weights applied to a set of tasks, wherein the trained… learning model maps different sets of commands to different distributions of weights applied to the set of tasks;
determining a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights;
generating a recommendation that specifies the first training item….
	Under the broadest reasonable interpretation, these limitations are process steps that cover mental processes including an observation, evaluation, judgment or opinion that could be performed in the human mind or with the aid of pencil and paper but for the recitation of a generic computer component. If a claim, under its broadest reasonable interpretation, covers a mental process but for the recitation of generic computer components, then it falls within the "Mental Process" grouping of abstract ideas. A person would readily be able to perform this process either mentally or with the assistance of pen and paper. See MPEP § 2106.04(a)(2).
Step 2A Prong 2: 
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). 
The following limitations are merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f): one or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to perform the method; and machine-learning model.
The following limitations are adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP § 2106.05(g): transmitting the recommendation to a user to assist the user in performing a particular task.
A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the judicial exception. See MPEP § 2106.04(d). 
Step 2B:
The claimed invention does not recite any additional elements/limitations that amount to significantly more. 
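The following limitations are merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f): one or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to perform the method; and machine-learning model.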
The following limitations are adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP § 2106.05(g): transmitting the recommendation to a user to assist the user in performing a particular task. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “receiving and transmitting data over a network” is a well‐understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim).
The claimed invention recites an abstract idea without significantly more.

Dependent claim 12 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is:
partitioning the plurality of training items across the set of tasks based on different distributions of weights applied to the set of tasks to generate a plurality of task sets;
for each task set included in the plurality of task sets, performing one or more frequent pattern mining operations on at least one set of commands to generate a distribution of frequencies across a task-specific set of command patterns; and
setting the set of command patterns equal to the union of the task-specific sets of command patterns.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 13 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: determining the set of command patterns based on a Frequent Pattern Growth algorithm and at least two sets of commands, wherein each set of commands is associated with a different training item included in the plurality of training items.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 14 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: performing one or more topic modeling operations based on at least two sets of commands associated with the plurality of training items to generate the trained machine-learning model.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 15 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: computing the expected distribution of frequencies comprises:
computing a plurality of similarity scores based on the different distributions of frequencies across the set of command patterns, wherein each similarity score is associated with both the target user and a different user included in a plurality of users;
combining the different distributions of frequencies based on the similarity scores to generate the expected distribution of frequencies.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 16 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: determining the first training item comprises:
determining a first command pattern from the set of command patterns based on the expected distribution of frequencies and a first distribution of frequencies across the across the set of command patterns that is associated with the target user;
performing one or more filtering operations on the plurality of training items based on the first command pattern to determine a set of matching training items; and
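performing at least one of a ranking and a filtering operation on the set of matching training items based on the target distribution of weights to determine the first training item.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more.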


Dependent claim 17 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: generating the recommendation comprises performing one or more ranking operations on the first training item and at least one other training item based on a popularity metric.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 18 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites a mental process. The mental process recited is: the first set of commands associated with the target user includes at least two subsets of commands, wherein each subset of commands is associated with a different session associated with a different discrete portion of work.
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Dependent claim 19 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.

The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). The claimed invention does not recite any additional elements/limitations that amount to significantly more. 

Independent claim 20 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.
	Step 1:
The claim is directed towards the statutory category of an apparatus.
Step 2A Prong 1:
The claim recites a mental process. The mental process recited is: 
…compute an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user;
apply a first set of commands associated with the target user to a trained… learning model to determine a target distribution of weights applied to a set of tasks, wherein the trained… learning model maps different sets of commands to different distributions of weights applied to the set of tasks;
determine a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights;
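generate a recommendation that specifies the first training item….
	Under the broadest reasonable interpretation, these limitations are process steps that cover mental processes including an observation, evaluation, judgment or opinion that could be performed in the human mind or with the aid of pencil and paper but for the recitation of a generic computer component. If a claim, under its broadest reasonable interpretation, covers a mental process but for the recitation of generic computer components, then it falls within the "Mental Process" grouping of abstract ideas. A person would readily be able to perform this process either mentally or with the assistance of pen and paper. See MPEP § 2106.04(a)(2).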

Step 2A Prong 2: 
The claimed invention does not recite any additional elements that integrate the judicial exception into a practical application. Refer to MPEP §2106.04(d). 
The following limitations are merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f): a system, comprising: one or more memories storing instructions; and one or more processors that are coupled to the one or more memories and, when executing the instructions, are configured to perform the method; and machine-learning model.
The following limitations are adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP § 2106.05(g): transmitting the recommendation to a user to assist the user in performing a particular task.
A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the judicial exception. See MPEP § 2106.04(d). 
Step 2B:
The claimed invention does not recite any additional elements/limitations that amount to significantly more. 
The following limitations are merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f): a system, comprising: one or more memories storing instructions; and one or more processors that are coupled to the one or more memories and, when executing the instructions, are configured to perform the method; and machine-learning model.
The following limitations are adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP § 2106.05(g): transmitting the recommendation to a user to assist the user in performing a particular task. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “receiving and transmitting data over a network” is a well‐understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim).
The claimed invention recites an abstract idea without significantly more.

Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

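The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.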

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 C.F.R. § 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.

Claims 1, 6-8, 10, 11, 16-18, and 20 are rejected under 35 U.S.C. § 103 as being unpatentable over Damevski et al. (Damevski, Kostadin, Hui Chen, David C. Shepherd, Nicholas A. Kraft, and Lori Pollock. "Predicting future developer behavior in the IDE using topic models." IEEE Transactions on Software Engineering 44, no. 11 (2017): 1100-1111, hereinafter Damevski) in view of Gasparic et al. (Gasparic, Marko, Tural Gurbanov, and Francesco Ricci. "Context-aware integrated development environment command recommender systems." In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 688-693. IEEE, 2017, hereinafter Gasparic).

As to independent claim 1, Damevski teaches:
A computer-implemented method for automatically recommending workflows for software-based tasks, the method comprising (Abstract, "software command recommender");
applying a first set of commands associated with the target user to a trained machine-learning model to determine a target distribution of weights applied to a set of tasks, wherein the trained machine-learning model maps different sets of commands to different distributions of weights applied to the set of tasks (Page 1103, Section 5.1, “To build the initial LDA model, we decompose past developer interaction with an IDE into a set of interaction sessions, delimited by a period of inactivity of at least 5 minutes. We choose this interval with the goal of ensuring that, most of the time, a development task (e.g., structured navigation, debugging) does not span two sessions, which we validate empirically by sampling and examining interaction traces.” Teaches applying IDE commands associated with a developer to an LDA model (machine learning model). Page 1103, Section 5.1, “A topic, denoted as β, is a probability distribution over a fixed vocabulary. Specifically, if we assume K topics are associated with the corpus, the topics are β = {β1; β2; ... ; βK}. The K topics are thus defined by their Probability Mass Functions (PMFs)....” Teaches that the LDA model determines a probability distribution of weights for the IDE commands. Page 1105, Section 6.1, “For evaluation, we use developers’ interaction traces for Microsoft Visual Studio and ABB Robot Studio. Visual Studio is a well known general purpose IDE, while Robot Studio is a popular IDE intended for robotics development that supports both simulation and physical robot programming and uses a programming language called RAPID. Both datasets are large and representative.” Teaches a computer-based implementation as the data is obtained from programming IDEs. Page 1103, Section 5.1, “So, in applying LDA to interaction traces, a window of interactions corresponds to a document, an interaction message corresponds to a word, and developer intention corresponds to a topic. In the following description, we use the interaction data specific terms (message, window, topic), when describing the LDA model,” and “A topic, denoted as β, is a probability distribution over a fixed vocabulary. Specifically, if we assume K topics are associated with the corpus, the topics are β = {β1; β2; ... ; βK}. The K topics are thus defined by their Probability Mass Functions (PMFs)....” Teaches that different commands are mapped to a different probability distribution (distribution of weights) by using the trained LDA model);…
generating a recommendation that specifies the first training item (Page 1105, Section 5.3, “Trained in this way, the Temporal LDA model can be used as part of the IDE, to improve how recommendations are generated online, during a developer’s use of the environment. The model can be updated at various frequencies and with different subsets of the interaction datasets produced, depending on assumptions of its quality, computational cost, and the desire to tailor it to an individual developer or, more broadly, to all developers.” Teaches generating a recommendation. Page 1102, Section 3: “Here, we examine how in certain important ways, IDE interaction logs indeed mimic natural language text, which inspired our investigation into this modeling technique for command recommendation generation.” Teaches that the system recommends a command (training item)); and
transmitting the recommendation to a user to assist the user in performing a particular task (Page 1105, Section 5.3, “Trained in this way, the Temporal LDA model can be used as part of the IDE, to improve how recommendations are generated online, during a developer’s use of the environment. The model can be updated at various frequencies and with different subsets of the interaction datasets produced, depending on assumptions of its quality, computational cost, and the desire to tailor it to an individual developer or, more broadly, to all developers.” Teaches transmitting the command recommendation to the developer to assist with programming).
Damevski does not appear to expressly teach computing an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user; and determining a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights.
Gasparic teaches computing an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user (Page 689, Section III, "The recommendation score of a command a for a user u is defined to be P(a∣u), which is the probability of observing the usage of a." Page 689, Section III, "input data for the recommendation algorithm was collected by logging the IDE interactions of first year bachelor students at the Free University of Bozen-Bolzano, during the first ten weeks of the Introduction to Programming course. The data collection was completely anonymous. The data set contains 199,220 command execution records. Each record is a tuple < u, a, t, c >, where t is the timestamp and c is the context in which u executed a. Overall, we detected 113 different user identifiers and 219 different commands." Page 689, Section III, "A user u can be described by a set of contexts Cu that were detected when she executed commands. The probability P(a|u), that a can be executed by u, if she knows a, is estimated as P(a|Cu), which is the probability to observe the execution of a in the population of users that know and use a, given a set of contexts in which u worked."); and determining a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights (Page 691, "we generated the top-5 recommendations, for each different week of usage, by using the data observed in the past". Page 692, "The context-aware usefulness metric AR@N calculates the average relevance of the top-N recommendations for each user u ∈ U. If Rec@Nu is the set of top-N recommended commands, then AR@N can be defined as follows:... Higher values indicate that more useful commands are recommended by the algorithm.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).

As to dependent claim 6, Damevski further teaches determining the first training item comprises:
determining a first command pattern from the set of command patterns based on the expected distribution of frequencies and a first distribution of frequencies across the across the set of command patterns that is associated with the target user (Page 1102, "When we examine a smaller unit of the log, such as an hour of one developer’s work, we find that the number of interaction types is small, consisting of usually highly regular and repetitive patterns");
performing one or more filtering operations on the plurality of training items based on the first command pattern to determine a set of matching training items (Page 1107, "Only those newly discovered commands that occur more than once in the trace are used, filtering out spurious command uses").
Gasparic further teaches:
performing at least one of a ranking and a filtering operation on the set of matching training items based on the target distribution of weights to determine the first training item (Page 691, "we generated the top-5 recommendations, for each different week of usage, by using the data observed in the past". Page 692, "The context-aware usefulness metric AR@N calculates the average relevance of the top-N recommendations for each user u ∈ U. If Rec@Nu is the set of top-N recommended commands, then AR@N can be defined as follows:... Higher values indicate that more useful commands are recommended by the algorithm.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).

As to dependent claim 7, Gasparic further teaches generating the recommendation comprises: determining that a first popularity score associated with the first training item is greater than a second popularity score associated with a second training item (Page 692, “calculates the average relevance of the top-N recommendations for each user u ∈ U.” Page 692, "Higher values indicate that more useful commands are recommended by the algorithm." Teaches comparing the command’s (training item) average relevance (popularity score) and determining if one is greater than the other by sorting them); and adding the first training item but not the second training item to a list associated with the recommendation (Page 692, “calculates the average relevance of the top-N recommendations.” Teaches adding only the top N commands to a list, based on their average relevance).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).
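For illustration only, the top-N selection relied upon above can be sketched as follows. This is a minimal sketch and not code from Gasparic: the function name top_n_commands, the example command names, and the scores are hypothetical, and the score is treated as an opaque per-command relevance value.

```python
from typing import Dict, List


def top_n_commands(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n highest-scoring commands, best first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [command for command, _ in ranked[:n]]


# Hypothetical relevance scores for three candidate commands.
example_scores = {"rename": 0.42, "extract_method": 0.17, "format": 0.31}

# Only the two highest-scoring commands are added to the list;
# the lower-scoring command is left out.
top_two = top_n_commands(example_scores, 2)
```

Under these assumed scores, a command enters the list only when its score exceeds that of the commands left out, which mirrors the comparison of a first and a second score discussed above.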

As to dependent claim 8, Damevski further teaches the first set of commands includes both a first command associated with a first software application and a second command associated with a second software application (Page 1105, Section 6.1, “For evaluation, we use developers’ interaction traces for Microsoft Visual Studio and ABB Robot Studio. Visual Studio is a well known general purpose IDE, while Robot Studio is a popular IDE intended for robotics development that supports both simulation and physical robot programming and uses a programming language called RAPID. Both datasets are large and representative.” Teaches that the training data includes commands from both the Visual Studio and Robot Studio applications).

As to dependent claim 10, Damevski further teaches the particular task is included in the set of tasks (Page 1102, Section 3.1, “When we examine a smaller unit of the log, such as an hour of one developer’s work, we find that the number of interaction types is small, consisting of usually highly regular and repetitive patterns. This is expected, as within a small period of time, a developer is likely focusing on a specific task and interacting with a small subset of the development environment which consists of relatively few interactions.” Teaches that a developer works on a specific task within a time period. Page 1103, Section 5.1, “To build the initial LDA model, we decompose past developer interaction with an IDE into a set of interaction sessions, delimited by a period of inactivity of at least 5 minutes. We choose this interval with the goal of ensuring that, most of the time, a development task (e.g., structured navigation, debugging) does not span two sessions, which we validate empirically by sampling and examining interaction traces.” Teaches that the specific development task is in a set of interaction sessions).
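For illustration only, the session decomposition quoted from Damevski (sessions delimited by a period of inactivity of at least 5 minutes) can be sketched as follows. This is a minimal sketch under stated assumptions, not Damevski's implementation: the Event type, the function name split_sessions, and the sample log are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical event record: (timestamp, command name).
Event = Tuple[datetime, str]


def split_sessions(
    events: List[Event], gap: timedelta = timedelta(minutes=5)
) -> List[List[Event]]:
    """Split an interaction log into sessions.

    A new session starts whenever the time since the previous event
    is at least `gap` (5 minutes by default).
    """
    sessions: List[List[Event]] = []
    for event in sorted(events, key=lambda e: e[0]):
        if sessions and event[0] - sessions[-1][-1][0] < gap:
            sessions[-1].append(event)  # still within the current session
        else:
            sessions.append([event])  # inactivity gap reached: new session
    return sessions


# Hypothetical log: two bursts of activity separated by exactly 5 minutes.
log = [
    (datetime(2019, 1, 14, 9, 0), "Build"),
    (datetime(2019, 1, 14, 9, 2), "Debug.Start"),
    (datetime(2019, 1, 14, 9, 7), "Edit.Find"),
    (datetime(2019, 1, 14, 9, 8), "Edit.Replace"),
]
sessions = split_sessions(log)
```

With this assumed log, the 5-minute gap between the second and third events starts a second session, so each session groups the commands of one stretch of activity.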

As to independent claim 11, Damevski teaches:
One or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to automatically recommend workflows for software-based tasks by performing the steps of (Abstract, "software command recommender"):…
applying a first set of commands associated with the target user to a trained machine-learning model to determine a target distribution of weights applied to a set of tasks, wherein the trained machine-learning model maps different sets of commands to different distributions of weights applied to the set of tasks (Page 1103, Section 5.1, "To build the initial LDA model, we decompose past developer interaction with an IDE into a set of interaction sessions, delimited by a period of inactivity of at least 5 minutes. We choose this interval with the goal of ensuring that, most of the time, a development task (e.g., structured navigation, debugging) does not span two sessions, which we validate empirically by sampling and examining interaction traces." Teaches applying IDE commands associated with a developer to an LDA model (machine learning model). Page 1103, Section 5.1, "A topic, denoted as β, is a probability distribution over a fixed vocabulary. Specifically, if we assume K topics are associated with the corpus, the topics are β = {β1; β2; ... ; βK}. The K topics are thus defined by their Probability Mass Functions (PMFs)....” Teaches that the LDA model determines a probability distribution of weights for the IDE commands. Page 1105, Section 6.1, “For evaluation, we use developers’ interaction traces for Microsoft Visual Studio and ABB Robot Studio. Visual Studio is a well known general purpose IDE, while Robot Studio is a popular IDE intended for robotics development that supports both simulation and physical robot programming and uses a programming language called RAPID. Both datasets are large and representative.” Teaches a computer based implementation as the data is obtained from programming IDEs. Page 1103, Section 5.1, “So, in applying LDA to interaction traces, a window of interactions corresponds to a document, an interaction message corresponds to a word, and developer intention corresponds to a topic. In the following description, we use the interaction data specific terms (message, window, topic), when describing the LDA model,” and “A topic, denoted as β, is a probability distribution over a fixed vocabulary. Specifically, if we assume K topics are associated with the corpus, the topics are β = {β1; β2; ... ; βK}. The K topics are thus defined by their Probability Mass Functions (PMFs)....” Teaches that different commands are mapped to a different probability distribution (distribution of weights) by using the trained LDA model);…
generating a recommendation that specifies the first training item (Page 1105, Section 5.3, “Trained in this way, the Temporal LDA model can be used as part of the IDE, to improve how recommendations are generated online, during a developer’s use of the environment. The model can be updated at various frequencies and with different subsets of the interaction datasets produced, depending on assumptions of its quality, computational cost, and the desire to tailor it to an individual developer or, more broadly, to all developers.” Teaches generating a recommendation. Page 1102, Section 3: “Here, we examine how in certain important ways, IDE interaction logs indeed mimic natural language text, which inspired our investigation into this modeling technique for command recommendation generation.” Teaches that the system recommends a command (training item)); and
transmitting the recommendation to a user to assist the user in performing a particular task (Page 1105, Section 5.3, “Trained in this way, the Temporal LDA model can be used as part of the IDE, to improve how recommendations are generated online, during a developer’s use of the environment. The model can be updated at various frequencies and with different subsets of the interaction datasets produced, depending on assumptions of its quality, computational cost, and the desire to tailor it to an individual developer or, more broadly, to all developers.” Teaches transmitting the command recommendation to the developer to assist with programming).
Damevski does not appear to expressly teach computing an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user; and determining a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights.
Gasparic teaches computing an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user (Page 689, Section III, "The recommendation score of a command a for a user u is defined to be P(a∣u), which is the probability of observing the usage of a." Page 689, Section III, "input data for the recommendation algorithm was collected by logging the IDE interactions of first year bachelor students at the Free University of Bozen-Bolzano, during the first ten weeks of the Introduction to Programming course. The data collection was completely anonymous. The data set contains 199,220 command execution records. Each record is a tuple < u, a, t, c >, where t is the timestamp and c is the context in which u executed a. Overall, we detected 113 different user identifiers and 219 different commands." Page 689, Section III, "A user u can be described by a set of contexts Cu that were detected when she executed commands. The probability P(a|u), that a can be executed by u, if she knows a, is estimated as P(a|Cu), which is the probability to observe the execution of a in the population of users that know and use a, given a set of contexts in which u worked."); and determining a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights (Page 691, "we generated the top-5 recommendations, for each different week of usage, by using the data observed in the past". Page 692, "The context-aware usefulness metric AR@N calculates the average relevance of the top-N recommendations for each user u ∈ U. If Rec@Nu is the set of top-N recommended commands, then AR@N can be defined as follows:... Higher values indicate that more useful commands are recommended by the algorithm.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).

As to dependent claim 16, Damevski further teaches determining the first training item comprises:
determining a first command pattern from the set of command patterns based on the expected distribution of frequencies and a first distribution of frequencies across the across the set of command patterns that is associated with the target user (Page 1102, "When we examine a smaller unit of the log, such as an hour of one developer’s work, we find that the number of interaction types is small, consisting of usually highly regular and repetitive patterns");
performing one or more filtering operations on the plurality of training items based on the first command pattern to determine a set of matching training items (Page 1107, "Only those newly discovered commands that occur more than once in the trace are used, filtering out spurious command uses"); and
Gasparic further teaches:
performing at least one of a ranking and a filtering operation on the set of matching training items based on the target distribution of weights to determine the first training item (Page 691, "we generated the top-5 recommendations, for each different week of usage, by using the data observed in the past". Page 692, "The context-aware usefulness metric AR@N calculates the average relevance of the top-N recommendations for each user u ∈ U. If Rec@Nu is the set of top-N recommended commands, then AR@N can be defined as follows:... Higher values indicate that more useful commands are recommended by the algorithm.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).

As to dependent claim 17, Gasparic further teaches generating the recommendation comprises performing one or more ranking operations on the first training item and at least one other training item based on a popularity metric (Page 692, “calculates the average relevance of the top-N recommendations for each user u ∈ U.” Page 692, "Higher values indicate that more useful commands are recommended by the algorithm." Teaches comparing the command’s (training item) average relevance (popularity score) and determining if one is greater than the other by sorting them).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).

As to dependent claim 18, Damevski further teaches the first set of commands associated with the target user includes at least two subsets of commands, wherein each subset of commands is associated with a different session associated with a different discrete portion of work (Page 1102, Section 3.1, “When we examine a smaller unit of the log, such as an hour of one developer’s work, we find that the number of interaction types is small, consisting of usually highly regular and repetitive patterns. This is expected, as within a small period of time, a developer is likely focusing on a specific task and interacting with a small subset of the development environment which consists of relatively few interactions.” Teaches that a developer works on a specific task within a time period. Page 1103, Section 5.1, “To build the initial LDA model, we decompose past developer interaction with an IDE into a set of interaction sessions, delimited by a period of inactivity of at least 5 minutes. We choose this interval with the goal of ensuring that, most of the time, a development task (e.g., structured navigation, debugging) does not span two sessions, which we validate empirically by sampling and examining interaction traces.” Teaches that the specific development task is in a set of interaction sessions).

As to independent claim 20, Damevski teaches:
A system, comprising:
one or more memories storing instructions; and
one or more processors that are coupled to the one or more memories and, when executing the instructions, are configured to (Abstract, "software command recommender"):…
apply a first set of commands associated with the target user to a trained machine-learning model to determine a target distribution of weights applied to a set of tasks, wherein the trained machine-learning model maps different sets of commands to different distributions of weights applied to the set of tasks (Page 1103, Section 5.1, "To build the initial LDA model, we decompose past developer interaction with an IDE into a set of interaction sessions, delimited by a period of inactivity of at least 5 minutes. We choose this interval with the goal of ensuring that, most of the time, a development task (e.g., structured navigation, debugging) does not span two sessions, which we validate empirically by sampling and examining interaction traces." Teaches applying IDE commands associated with a developer to an LDA model (machine learning model). Page 1103, Section 5.1, "A topic, denoted as β, is a probability distribution over a fixed vocabulary. Specifically, if we assume K topics are associated with the corpus, the topics are β = {β1; β2; ... ; βK}. The K topics are thus defined by their Probability Mass Functions (PMFs)....” Teaches that the LDA model determines a probability distribution of weights for the IDE commands. Page 1105, Section 6.1, “For evaluation, we use developers’ interaction traces for Microsoft Visual Studio and ABB Robot Studio. Visual Studio is a well known general purpose IDE, while Robot Studio is a popular IDE intended for robotics development that supports both simulation and physical robot programming and uses a programming language called RAPID. Both datasets are large and representative.” Teaches a computer based implementation as the data is obtained from programming IDEs. Page 1103, Section 5.1, “So, in applying LDA to interaction traces, a window of interactions corresponds to a document, an interaction message corresponds to a word, and developer intention corresponds to a topic. In the following description, we use the interaction data specific terms (message, window, topic), when describing the LDA model,” and “A topic, denoted as β, is a probability distribution over a fixed vocabulary. Specifically, if we assume K topics are associated with the corpus, the topics are β = {β1; β2; ... ; βK}. The K topics are thus defined by their Probability Mass Functions (PMFs)....” Teaches that different commands are mapped to a different probability distribution (distribution of weights) by using the trained LDA model);…
generate a recommendation that specifies the first training item (Page 1105, Section 5.3, “Trained in this way, the Temporal LDA model can be used as part of the IDE, to improve how recommendations are generated online, during a developer’s use of the environment. The model can be updated at various frequencies and with different subsets of the interaction datasets produced, depending on assumptions of its quality, computational cost, and the desire to tailor it to an individual developer or, more broadly, to all developers.” Teaches generating a recommendation. Page 1102, Section 3: “Here, we examine how in certain important ways, IDE interaction logs indeed mimic natural language text, which inspired our investigation into this modeling technique for command recommendation generation.” Teaches that the system recommends a command (training item)); and
transmit the recommendation to a user to assist the user in performing a particular task (Page 1105, Section 5.3, “Trained in this way, the Temporal LDA model can be used as part of the IDE, to improve how recommendations are generated online, during a developer’s use of the environment. The model can be updated at various frequencies and with different subsets of the interaction datasets produced, depending on assumptions of its quality, computational cost, and the desire to tailor it to an individual developer or, more broadly, to all developers.” Teaches transmitting the command recommendation to the developer to assist with programming).
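Damevski does not appear to expressly teach compute an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user; and determine a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights.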
Gasparic teaches compute an expected distribution of frequencies across a set of command patterns based on different distributions of frequencies across the set of command patterns, wherein the expected distribution of frequencies is associated with a target user, and each different distribution of frequencies is associated with a different user (Page 689, Section III, "The recommendation score of a command a for a user u is defined to be P(a∣u), which is the probability of observing the usage of a." Page 689, Section III, "input data for the recommendation algorithm was collected by logging the IDE interactions of first year bachelor students at the Free University of Bozen-Bolzano, during the first ten weeks of the Introduction to Programming course. The data collection was completely anonymous. The data set contains 199,220 command execution records. Each record is a tuple < u, a, t, c >, where t is the timestamp and c is the context in which u executed a. Overall, we detected 113 different user identifiers and 219 different commands." Page 689, Section III, "A user u can be described by a set of contexts Cu that were detected when she executed commands. The probability P(a|u), that a can be executed by u, if she knows a, is estimated as P(a|Cu), which is the probability to observe the execution of a in the population of users that know and use a, given a set of contexts in which u worked."); and determine a first training item from a plurality of training items based on the expected distribution of frequencies and the target distribution of weights (Page 691, "we generated the top-5 recommendations, for each different week of usage, by using the data observed in the past". Page 692, "The context-aware usefulness metric AR@N calculates the average relevance of the top-N recommendations for each user u ∈ U. If Rec@Nu is the set of top-N recommended commands, then AR@N can be defined as follows:... Higher values indicate that more useful commands are recommended by the algorithm.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski to include the command recommendation techniques of Gasparic to provide more relevant recommendations (see Gasparic at abstract).

Claims 2 and 12 are rejected under 35 U.S.C. § 103 as being unpatentable over Damevski in view of Gasparic and Adar et al. (Adar, Eytan, Mira Dontcheva, and Gierad Laput. "CommandSpace: modeling the relationships between tasks, descriptions and features." In Proceedings of the 27th annual ACM symposium on User interface software and technology, pp. 167-176. 2014, hereinafter Adar).

As to dependent claim 2, the rejection of claim 1 is incorporated. Damevski further teaches: partitioning the plurality of training items across the set of tasks based on different distributions of weights applied to the set of tasks to generate a plurality of task sets (Page 1103, Section 5.1, "we further divide the sessions into a succession of fixed-size windows, where each window is a sequence of m commands and events"); and for each task set included in the plurality of task sets, performing one or more frequent pattern mining operations on at least one set of commands to generate a distribution of frequencies across a task-specific set of command patterns (Page 1103, Section 5.1, "Using shorter windows, rather than the sessions, also fosters better temporal locality in the model. We train both the initial LDA model with windows as documents as well as use windows for the Temporal LDA model and prediction").
Damevski as modified by Gasparic does not appear to expressly teach setting the set of command patterns equal to the union of the task-specific sets of command patterns.
Adar teaches setting the set of command patterns equal to the union of the task-specific sets of command patterns (Page 174, "if the union of all features used in a tutorial would be a good signal for what the tutorial was about. We created a summative vector for all the commands in each tutorial, and used this new vector to query into the tutorial space").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the command space modeling of Adar to better bridge the gap between the language of an application and the language people use to describe what they want to accomplish (see Adar at introduction).

As to dependent claim 12, the rejection of claim 11 is incorporated. Damevski further teaches: partitioning the plurality of training items across the set of tasks based on different distributions of weights applied to the set of tasks to generate a plurality of task sets (Page 1103, Section 5.1, "we further divide the sessions into a succession of fixed-size windows, where each window is a sequence of m commands and events"); and for each task set included in the plurality of task sets, performing one or more frequent pattern mining operations on at least one set of commands to generate a distribution of frequencies across a task-specific set of command patterns (Page 1103, Section 5.1, "Using shorter windows, rather than the sessions, also fosters better temporal locality in the model. We train both the initial LDA model with windows as documents as well as use windows for the Temporal LDA model and prediction").
Damevski as modified by Gasparic does not appear to expressly teach setting the set of command patterns equal to the union of the task-specific sets of command patterns.
Adar teaches setting the set of command patterns equal to the union of the task-specific sets of command patterns (Page 174, "if the union of all features used in a tutorial would be a good signal for what the tutorial was about. We created a summative vector for all the commands in each tutorial, and used this new vector to query into the tutorial space").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the command space modeling of Adar to better bridge the gap between the language of an application and the language people use to describe what they want to accomplish (see Adar at introduction).

Claims 4 and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over Damevski in view of Gasparic and Yan et al. (Yan, Xiaohui, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng. "A biterm topic model for short texts." In Proceedings of the 22nd international conference on World Wide Web, pp. 1445-1456. 2013, hereinafter Yan).

As to dependent claim 4, the rejection of claim 1 is incorporated.
Damevski as modified by Gasparic does not appear to expressly teach performing one or more bi-term topic modeling operations based on at least two sets of commands associated with the plurality of training items to generate the trained machine-learning model.
Yan teaches performing one or more bi-term topic modeling operations based on at least two sets of commands associated with the plurality of training items to generate the trained machine-learning model (Page 1445, “In this paper, we propose a novel way for modeling topics in short texts, referred as biterm topic model (BTM). Specifically, in BTM we learn the topics by directly modeling the generation of word co-occurrence patterns (i.e. biterms) in the whole corpus.” Teaches performing bi-term topic modeling on short-texts to generate the biterm topic model).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the bi-term topic modeling of Yan to use the aggregated patterns in the whole corpus for learning topics to solve the problem of sparse word co-occurrence patterns at document-level (see Yan at page 1445).


As to dependent claim 14, the rejection of claim 11 is incorporated.
Damevski as modified by Gasparic does not appear to expressly teach performing one or more topic modeling operations based on at least two sets of commands associated with the plurality of training items to generate the trained machine-learning model.
Yan teaches performing one or more topic modeling operations based on at least two sets of commands associated with the plurality of training items to generate the trained machine-learning model (Page 1445, “In this paper, we propose a novel way for modeling topics in short texts, referred as biterm topic model (BTM). Specifically, in BTM we learn the topics by directly modeling the generation of word co-occurrence patterns (i.e. biterms) in the whole corpus.” Teaches performing bi-term topic modeling on short-texts to generate the biterm topic model).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the bi-term topic modeling of Yan to use the aggregated patterns in the whole corpus for learning topics to solve the problem of sparse word co-occurrence patterns at document-level (see Yan at page 1445).

Claims 5, 15, and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over Damevski in view of Gasparic and Li et al. (Li, Wei, Justin Matejka, Tovi Grossman, Joseph A. Konstan, and George Fitzmaurice. "Design and evaluation of a command recommendation system for software applications." ACM Transactions on Computer-Human Interaction (TOCHI) 18, no. 2 (2011): 1-35, hereinafter Li).

As to dependent claim 5, the rejection of claim 1 is incorporated.
Damevski as modified by Gasparic does not appear to expressly teach computing the expected distribution of frequencies comprises: computing a plurality of similarity scores based on the different distributions of frequencies across the set of command patterns, wherein each similarity score is associated with both the target user and a different user included in a plurality of users; combining the different distributions of frequencies based on the similarity scores to generate the expected distribution of frequencies.
Li teaches computing the expected distribution of frequencies comprises: computing a plurality of similarity scores based on the different distributions of frequencies across the set of command patterns, wherein each similarity score is associated with both the target user and a different user included in a plurality of users (Page 11, “Rather than matching users based on their command usage, our item-based collaborative filtering algorithm matches the active user’s commands to similar commands. The steps of the algorithms are described below. Defining User Vectors. We first define a vector Vi for each command ci in the n dimensional user-space. Similar to user-based approach, each cell, Vi(j), contains the cf-iuf value for each user uj. Build a Command-to-Command Similarity Matrix. Next, we generate a command-to-command similarity matrix, M. Mik is defined for each pair of commands i and k as: Mik = cos(Vi, Vk)... For the active user, uj, we create an “active list” L, which contains all of the commands that the active user has used. Lj = {ci | cfij > 0}... Next, we define a similarity score, si, for each command ci which is not in the active user’s active list: si = average(Mik, ∀ck ∈ L).” Teaches comparing each command vector’s cf-iuf value to compute similarity scores, with each similarity score being associated with a different command (training item). Page 9, “With those two metrics we can compute the cf-iuf as: cf-iufij = cfij · iufij. A high weight in cf-iuf is obtained when a command is used frequently by a particular user, but is used by a relatively small portion of the overall population.” Teaches that the cf-iuf value is a distribution of weight); combining the different distributions of frequencies based on the similarity scores to generate the expected distribution of frequencies (Page 11, "Build a Command-to-Command Similarity Matrix. Next, we generate a command-to-command similarity matrix, M. Mik is defined for each pair of commands i and k as: Mik = cos(Vi, Vk).").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the command recommendation techniques of Li to better aid command awareness in complex software applications (see Li at abstract).
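For illustration only, the item-based collaborative filtering steps quoted from Li (command vectors, cosine similarity matrix, active list, averaged similarity score) can be sketched as follows. This is a minimal sketch under stated assumptions: the cf-iuf weights are supplied as precomputed inputs, a command is treated as used by a user when its weight for that user is greater than zero (a stand-in for the quoted condition that the command frequency is greater than zero), and the names cosine and similarity_scores and all sample values are hypothetical rather than taken from Li.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_scores(weights, active_user):
    """weights[i][j] is the assumed cf-iuf weight of command i for user j.

    Returns {command index: score} for every command outside the active list.
    """
    n = len(weights)
    # Command-to-command similarity matrix: M[i][k] = cos(V_i, V_k).
    M = [[cosine(weights[i], weights[k]) for k in range(n)] for i in range(n)]
    # Active list: commands the active user has used (assumption: weight > 0).
    active = [i for i in range(n) if weights[i][active_user] > 0]
    scores = {}
    for i in range(n):
        if i in active:
            continue
        # Score is the average similarity to the commands in the active list.
        scores[i] = sum(M[i][k] for k in active) / len(active) if active else 0.0
    return scores


# Hypothetical weights: 3 commands (rows) by 3 users (columns).
weights = [[1, 0, 1], [1, 0, 0], [0, 1, 1]]
scores = similarity_scores(weights, active_user=1)
```

In this hypothetical data the active user (index 1) has used only command 2, so commands 0 and 1 are scored by their cosine similarity to command 2.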

As to dependent claim 15, the rejection of claim 11 is incorporated.
Damevski as modified by Gasparic does not appear to expressly teach computing the expected distribution of frequencies comprises: computing a plurality of similarity scores based on the different distributions of frequencies across the set of command patterns, wherein each similarity score is associated with both the target user and a different user included in a plurality of users; combining the different distributions of frequencies based on the similarity scores to generate the expected distribution of frequencies.
Li teaches computing the expected distribution of frequencies comprises: computing a plurality of similarity scores based on the different distributions of frequencies across the set of command patterns, wherein each similarity score is associated with both the target user and a different user included in a plurality of users (Page 11, “Rather than matching users based on their command usage, our item-based collaborative filtering algorithm matches the active user’s commands to similar commands. The steps of the algorithms are described below. Defining User Vectors. We first define a vector Vi for each command ci in the n dimensional user-space. Similar to user-based approach, each cell, Vi(j), contains the cf-iuf value for each user uj. Build a Command-to-Command Similarity Matrix. Next, we generate a command-to-command similarity matrix, M. Mik is defined for each pair of commands i and k as: Mik = cos(Vi, Vk)... For the active user, uj, we create an “active list” L, which contains all of the commands that the active user has used. Lj = {ci | cfij > 0}... Next, we define a similarity score, si, for each command ci which is not in the active user’s active list: si = average(Mik, ∀ck ∈ L).” Teaches comparing each command vector’s cf-iuf value to compute similarity scores, with each similarity score being associated with a different command (training item). Page 9, “With those two metrics we can compute the cf-iuf as: cf-iufij = cfij · iufij. A high weight in cf-iuf is obtained when a command is used frequently by a particular user, but is used by a relatively small portion of the overall population.” Teaches that the cf-iuf value is a distribution of weight); combining the different distributions of frequencies based on the similarity scores to generate the expected distribution of frequencies (Page 11, "Build a Command-to-Command Similarity Matrix. Next, we generate a command-to-command similarity matrix, M. Mik is defined for each pair of commands i and k as: Mik = cos(Vi, Vk).").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the command recommendation techniques of Li to better aid command awareness in complex software applications (see Li at abstract).

As to dependent claim 19, the rejection of claim 11 is incorporated.
Damevski as modified by Gasparic does not appear to expressly teach the particular task is not included in the set of tasks.
Li teaches the particular task is not included in the set of tasks (Page 12, "In addition to testing our user-based and item-based collaborative filtering algorithms, we also implemented and evaluated Linton's algorithm [Linton and Schaefer 2000]. The algorithm suggests the top commands, as averaged across the whole user population, that a user has not used." Teaches that the specific task is not associated with the set of tasks because the Linton algorithm generates recommendations for a task from the entire user population, not the specific user. Therefore, the recommended command for a specific task is generated from the set of tasks of the global user population that does not include the user's specific task).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the command recommendation techniques of Li to better aid command awareness in complex software applications (see Li at abstract).
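For illustration only, the behavior attributed to Linton's algorithm in the quoted passage (suggesting the top commands, averaged across the whole user population, that a user has not used) can be sketched as follows. This is a minimal sketch, assuming usage is recorded as per-user command counts and that the population average is the mean count per user. The function name linton_recommend, the data layout, and the sample values are hypothetical and are not taken from Li or from Linton and Schaefer.

```python
def linton_recommend(usage, user, top_n=3):
    """Recommend the most-used commands, averaged over all users, that `user` has not used.

    usage maps each user to a dict of {command: use count} (assumed layout).
    """
    n_users = len(usage)
    totals = {}
    for counts in usage.values():
        for cmd, count in counts.items():
            totals[cmd] = totals.get(cmd, 0) + count
    # Average use of each command across the whole user population.
    averages = {cmd: total / n_users for cmd, total in totals.items()}
    # Commands the target user has already used are excluded.
    used = {cmd for cmd, count in usage.get(user, {}).items() if count > 0}
    candidates = [cmd for cmd in averages if cmd not in used]
    # Highest population average first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda cmd: (-averages[cmd], cmd))
    return candidates[:top_n]


# Hypothetical usage logs for three users.
usage = {
    "u1": {"copy": 5, "paste": 4},
    "u2": {"copy": 3, "trim": 6, "mirror": 1},
    "u3": {"trim": 2, "paste": 1},
}
recommended = linton_recommend(usage, "u1")
```

The ranking depends only on the population-wide averages; the target user's own history is used solely to filter out commands already used.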

Claim 9 is rejected under 35 U.S.C. § 103 as being unpatentable over Damevski in view of Gasparic and Khan et al. (Khan, Md Adnan Alam, Volodymyr Dziubak, and Andrea Bunt. "Exploring personalized command recommendations based on information found in Web documentation." In Proceedings of the 20th International Conference on Intelligent User Interfaces, pp. 225-235. 2015, hereinafter Khan).

As to dependent claim 9, the rejection of claim 1 is incorporated.
Damevski as modified by Gasparic does not appear to expressly teach the first training item comprises a video, a document, a tutorial, or a website.
Khan teaches the first training item comprises a video, a document, a tutorial, or a website (Page 225, “In this work, we propose an alternative approach to personalized command recommendations that uses command-to-task mappings mined from online documentation,” and, “Our results suggest that web documentation can be leveraged to generate recommendations for commands that are relevant to the task at hand.” Teaches that the recommended command (training item) is obtained from web documentation (a web site)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the software command recommender of Damevski as modified by Gasparic to include the personalized command recommendations of Khan to make personalized recommendations for task-relevant commands (see Khan at page 225).

Allowable over Prior Art
Claims 3 and 13 are allowable over the cited prior art. The following is a statement of reasons for the indication of allowable subject matter: the claimed invention requires that the set of command patterns is based on a Frequent Pattern Growth algorithm and at least two sets of commands, and that each set of commands is associated with a different training item included in the plurality of training items, which does not appear to be present in the cited prior art.

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Minder et al. (U.S. Pat. App. Pub. No. 2014/0074545) discusses recommendation systems and processes for generating recommendations within the context of a socially-enabled human workflow system. Workflow data, such as social graphs, organization graphs, collaboration graphs, content data, utilization data, ratings data, and the like, are accessed and associated with a user requesting a recommendation. A user similarity score, task similarity score, goal similarity score, and content similarity score are determined. Recommendations based at least in part on one or more of the .

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Casey R. Garner whose telephone number is 571-272-2467. The examiner can normally be reached on Monday to Friday, 8am to 5pm, Eastern Time.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/Casey R. Garner/
Examiner, Art Unit 2123