DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 7, 11-12 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Klamik (US PGPUB 2007/0050677; hereinafter “Klamik”) in view of Cohen et al. (US Patent 9,069,904; hereinafter “Cohen”), Brafman et al. (US PGPUB 2019/0065357; hereinafter “Brafman”) and Qin et al. (US PGPUB 2020/0183821; hereinafter “Qin”).
Claim 1: (Currently Amended)
Klamik teaches a computer-implemented method, comprising:
performing a plurality of training test runs for an application under test (AUT), each training test run involving execution of test code against a corresponding instance of the AUT, each training test run resulting in corresponding training test results ([0021] “Testing software 120 itself also runs simultaneously with the software 121, 122 or hardware 123 under test.” [0037] “testing software conducts one or more tests and thereby produces one or more logs”);
simplifying the training test results for each training test run by removing noisy information from the training test results ([0037] “After testing software conducts one or more tests and thereby produces one or more logs such as the exemplary log of FIG. 4, additional processing can be conducted using the data in the log. With an algorithm to detect when the computer was executing what was intended (the test) vs. what was not intended (the noise), the noise can be identified and accommodated for when presenting test results to an analyst.” [0053] “Once adequate test data is obtained, the test data may be further processed 606 by removing any noise from the results”);
performing a first test run for the AUT by executing the test code against a corresponding instance of the AUT, thereby generating first test results ([0037] “testing software conducts one or more tests and thereby produces one or more logs”); and
simplifying the first test results by removing noisy information from the first test results ([0037] “After testing software conducts one or more tests and thereby produces one or more logs such as the exemplary log of FIG. 4, additional processing can be conducted using the data in the log. With an algorithm to detect when the computer was executing what was intended (the test) vs. what was not intended (the noise), the noise can be identified and accommodated for when presenting test results to an analyst.” [0053] “Once adequate test data is obtained, the test data may be further processed 606 by removing any noise from the results”).

With further regard to Claim 1, Klamik does not teach the following; however, Cohen teaches:
generating a training test run representation for each of the training test runs using the corresponding simplified training test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
clustering the training test run representations into a plurality of test run clusters (Col. 8 Ln. 49: “the clustering module 906 may receive the runs 905 after they undergo processing. For example, the runs may be represented as vectors of features on which a clustering algorithm utilized by the clustering module 906 may operate.”);
identifying each of a subset of the test run clusters (Col. 36 Ln. 27: “runs of test scenarios may be partitioned into clusters based on one or more values from the runs of test scenarios. For example, runs that involve a same start and/or end test step may be placed in the same cluster (e.g., runs that start from the same screen ID and end with an error are placed in the same cluster).” Col. 38 Ln. 56: “clustering of runs of test scenarios to clusters that include similar runs may be based on one or more of the following logged activities: … messages returned from the executed transactions (e.g., valid, warning, or error messages).”);
generating a first test run representation using the simplified first test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
determining that the first test run representation corresponds to a first one of the subset of test run clusters; and labeling the first test run with a label (Col. 41 Ln. 1: “the clustering of the runs of test scenarios to clusters that include similar runs may be done utilizing a classifier that is trained to assign test scenarios to predetermined classes. Optionally, the classifier is trained on labeled training data that includes training data that includes representations of runs of test scenarios (e.g., feature vectors) and labels corresponding to clusters to which the runs are assigned. If the labels in the training data are assigned according to some (possibly arbitrary) notion of similarity between test scenarios, clusters of test scenarios that have the same label assigned by the classifier are likely to contain runs that are similar according to the notion of similarity.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik with the test run representation generating as taught by Cohen in order “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein such information is known to aid in the process of software development and testing by reducing the burden on the software developer to make the similarity determination themselves.
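
For illustration only, the binary feature-vector representation quoted from Cohen (Col. 38 Ln. 67) can be sketched as follows. This is a minimal sketch and not code from any cited reference; the activity vocabulary and the function name run_to_vector are hypothetical and are introduced here only to show how a run's logged activities map to 1/0 positions.

```python
from typing import Iterable, List, Sequence

# Hypothetical activity vocabulary; the quoted passage does not enumerate one.
VOCABULARY: Sequence[str] = (
    "open_login_screen",
    "submit_form",
    "warning_message",
    "error_message",
)


def run_to_vector(logged_activities: Iterable[str],
                  vocabulary: Sequence[str] = VOCABULARY) -> List[int]:
    """Return 1 in each position whose activity occurred during the run, else 0."""
    seen = set(logged_activities)
    return [1 if activity in seen else 0 for activity in vocabulary]


if __name__ == "__main__":
    run = ["open_login_screen", "submit_form", "error_message"]
    print(run_to_vector(run))  # [1, 1, 0, 1]
```

Each position of the resulting vector corresponds to one activity, so two runs that performed the same activities yield identical vectors regardless of the order in which the activities were logged.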

With further regard to Claim 1, Klamik in view of Cohen does not teach the following; however, Brafman teaches:
wherein the identified subset of the test run clusters represent training test runs exhibiting inconsistent failure behavior in which successive test runs using a same AUT and a same test code result in one or more passes and one or more fails ([0018] “With respect to problematic tests, this process may identify anomalous tests (e.g., tests which are operating incorrectly). The categories of ‘bad’ tests that may be detected may include oscillating tests, where, by examining a brief history of tests (last X runs—e.g., 8), the number of transitions from passed to failed, and vice versa, may be counted. If the number of transitions are greater than a threshold (e.g., 4), the test may be marked as oscillating (e.g., randomly changing status).”); and
wherein the first test run label is a reliability label ([0057] “the predetermined hypotheses determination module 124 is to assign labels based on rules defined by users … For example, rules may specify that ‘when more than 50% tests fail, assign ‘environment’ label’,” wherein the “environment label” is the “reliability label”. [0064] “Referring to FIG. 2, the ‘type of problem (label)’ may apply to the code issues, environment issues, and test issues”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen with the identification and labeling of inconsistent failure behavior as taught by Brafman in order “to provide additional hints in the form of hypotheses to resolve a particular failure” (Brafman [0015]).
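
For illustration only, the oscillating-test check quoted from Brafman [0018] can be sketched as follows. The window of 8 runs and the threshold of 4 transitions are the example values given in the quoted passage; the function names and the boolean encoding of results (True for pass, False for fail) are assumptions made for this sketch.

```python
from typing import Sequence


def count_transitions(history: Sequence[bool]) -> int:
    """Count pass-to-fail and fail-to-pass changes between consecutive runs."""
    return sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)


def is_oscillating(history: Sequence[bool],
                   window: int = 8,
                   threshold: int = 4) -> bool:
    """Mark a test as oscillating when the last `window` runs contain more
    than `threshold` status transitions."""
    recent = history[-window:]
    return count_transitions(recent) > threshold


if __name__ == "__main__":
    # True = pass, False = fail (encoding assumed for this sketch).
    flaky = [True, False, True, True, False, True, False, True]
    stable = [True] * 7 + [False]
    print(is_oscillating(flaky), is_oscillating(stable))
```

In the example, the first history contains six transitions in eight runs and exceeds the threshold, while the second contains a single transition and does not.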

With further regard to Claim 1, Klamik in view of Cohen and Brafman does not teach the following; however, Qin teaches:
identifying one or more differences between the first test run representation and one or more previous test run representations ([0027] “a test result for a target program may include running results of a plurality of test cases, and the running results may include ‘pass’ or ‘failure’.” [0036] “FIG. 3 illustrates an exemplary process 300 for identifying flaky tests.” [0037] “At 310, historical running data of a group of test cases may be obtained.” [0038] “running curves may be generated, based on the historical running data, for representing running history of test cases… The running curve 400 may have two types of amplitudes, wherein the higher amplitude indicates a running result of failure, and the lower amplitude indicates a running result of pass. For example, the 3rd, 7th, 13th, 14th, 17th to 20th, and 26th test runs have a running result of failure, while other test runs have a running result of pass.”); 
quantifying the one or more differences; translating the quantified one or more differences to a representation of how likely the first test run is reliable or unreliable ([0046] “In FIG. 5, since the failure region number of the running curve 500 is five, which is above the failure region number threshold ‘two’, there is a high possibility that Test Case 1 is a flaky test in terms of the feature of ‘failure region number’.” [0053] “Although four types of features of ‘failure region number’, ‘continuous failure number’, ‘failure distance’, and ‘transition time’ are discussed above, it should be appreciated that the statistical analysis at 330 is not limited to these features, but can be performed based on any equivalents or any other features of the test cases obtained through data mining.” [0054] “In an implementation, determination results based on the features may be weighted and summed, and this test case may be determined as a flaky test if the sum meets a predefined condition.”); and 
the reliability label indicating the representation of how likely the first run is reliable or unreliable ([0056] “At 340, a determination result of one or more flaky tests that are identified by the statistical analysis may be output.” [0057] “FIG. 8 illustrates an exemplary process 800 for identifying flaky tests.” [0060] “At 860, a determination result of one or more flaky tests that are identified through verification may be output,” wherein the “determination result” in Qin is used to set the “reliability label” as taught above in Brafman.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen and Brafman with the translation of quantified differences to a representation of how likely a test is reliable or unreliable, i.e. “flaky”, as taught by Qin since “it would be beneficial to identify flaky tests, and this may help avoiding influence on a final test result of the target program” (Qin [0030]).
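
For illustration only, the "failure region number" feature and the weighted sum of feature determinations quoted from Qin [0046] and [0054] can be sketched as follows. The failing run positions are those listed in the quoted passage of Qin [0038] and the failure region threshold of two is the value quoted from [0046]; the total of 30 runs, the weights, the second feature's determination, and the function names are hypothetical, since the quoted text does not specify them.

```python
from typing import Dict, Sequence


def failure_regions(failed: Sequence[bool]) -> int:
    """Count maximal runs of consecutive failures in a running history."""
    regions = 0
    previous_failed = False
    for current_failed in failed:
        if current_failed and not previous_failed:
            regions += 1
        previous_failed = current_failed
    return regions


def weighted_sum(determinations: Dict[str, bool],
                 weights: Dict[str, float]) -> float:
    """Weight and sum per-feature determinations (True counts as 1.0)."""
    return sum(weights[name] * (1.0 if flagged else 0.0)
               for name, flagged in determinations.items())


if __name__ == "__main__":
    failing_runs = {3, 7, 13, 14, 17, 18, 19, 20, 26}  # positions quoted from Qin [0038]
    total_runs = 30  # assumed; the quoted text does not give the total
    history = [run in failing_runs for run in range(1, total_runs + 1)]

    region_count = failure_regions(history)
    determinations = {
        "failure_region_number": region_count > 2,  # threshold "two" from Qin [0046]
        "other_feature": False,                     # hypothetical placeholder
    }
    weights = {"failure_region_number": 0.6, "other_feature": 0.4}  # hypothetical
    print(region_count, weighted_sum(determinations, weights))
```

The resulting sum can then be compared against a predefined condition, which the quoted passage leaves unspecified, to decide whether the test case is treated as flaky.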

Claim 2:
Klamik in view of Cohen, Brafman and Qin teaches the method of claim 1. Klamik in view of Brafman and Qin does not teach the following; however, Cohen further teaches
wherein the clustering of the training test run representations is done using an unsupervised learning technique (Col. 40 Ln. 58: “Those skilled in the art may recognize that various clustering algorithms and/or approaches may be used to cluster runs of test scenarios into clusters that include similar runs of test scenarios. For example, the clustering may be done using hierarchical clustering approaches (e.g., bottom-up or top-down approaches) or using partition-based approached (e.g., k-mean algorithms),” wherein “K-Mean algorithms” are well-known in the art to be a type of “unsupervised learning technique”.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Brafman and Qin with the unsupervised learning technique as taught by Cohen in order to further aid in a process “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein the use of an unsupervised machine learning technique, i.e. a K-Means algorithm, serves to reduce the burden on software developers as compared to manually making similarity determinations.
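
For illustration only, a partition-based clustering of run vectors in the manner of a k-means algorithm can be sketched as follows. This is a simplified textbook version written for this example and is not code from Cohen; the deterministic choice of the first k points as initial centroids, the fixed iteration count, and the sample vectors are assumptions.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: Sequence[Sequence[float]], k: int,
            iterations: int = 10) -> List[int]:
    """Return a cluster index for each point using plain k-means.

    Initial centroids are the first k points (an assumption that keeps the
    sketch deterministic); no labels are supplied, so the grouping is
    unsupervised.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


if __name__ == "__main__":
    runs = [(1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 0), (0, 0, 0, 1)]
    print(k_means(runs, k=2))  # [0, 1, 0, 1]
```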

Claim 3:
Klamik in view of Cohen, Brafman and Qin teaches the method of claim 1. Klamik in view of Brafman and Qin does not teach the following; however, Cohen teaches the method further comprising
associating one of a plurality of labels with each of the training test run representations, wherein the clustering of the training test run representations is done using the labels and a supervised learning technique (Col. 41 Ln. 1: “the clustering of the runs of test scenarios to clusters that include similar runs may be done utilizing a classifier that is trained to assign test scenarios to predetermined classes. Optionally, the classifier is trained on labeled training data that includes training data that includes representations of runs of test scenarios (e.g., feature vectors) and labels corresponding to clusters to which the runs are assigned.” Col. 41 Ln. 25: “labels assigned to runs of test scenarios may be generated and/or assigned … automatically, e.g., by a procedure that analyzes a test scenario to detect attributes describing it (e.g., what modules and/or procedures it involves). Those skilled in the art may recognize that there are many algorithms, and/or machine learning-based approaches, that may be used to train a classifier of runs of test scenarios using labeled training data.” Col. 40 Ln. 66: “a semi-supervised clustering approach may be used such as an Expectation-Maximization (EM) algorithm.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Brafman and Qin with the supervised learning technique as taught by Cohen in order to further aid in a process “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein the use of a supervised machine learning technique, i.e. an Expectation-Maximization (EM) algorithm, serves to reduce the burden on software developers as compared to manually making similarity determinations.

Claim 4:
Klamik in view of Cohen, Brafman and Qin teaches the method of claim 3. Klamik in view of Brafman and Qin does not teach the following; however, Cohen teaches the method further comprising
transmitting representations of the training test results to a remote developer device for presentation in a user interface of the remote developer device (Col. 12 Ln. 48: “the user interface 924 may initiate the instantiation of the manipulated test scenario template. For example, the user interface 924 may present a first screen belonging to the test scenario template and prompt a user to take a certain action to advance execution.” Col. 31 Ln. 40: “users perform at least part of their interaction with a software system via a user interface that includes a display that displays screens. Optionally, a screen may refer to a presentation of a certain form through which a user may access, modify and/or enter data.” Col. 58 Ln. 25: “The embodiments may also be practiced in a distributed computing environment where tasks are performed by remote processing devices that are linked through a communication network.”); and
receiving feedback from the remote developer device, the feedback being generated using the user interface and relating to the training test results; wherein associating the labels with the training test run representations is done based on the feedback (Col. 41 Ln. 25: “labels assigned to runs of test scenarios may be generated and/or assigned manually (e.g., by a tester running a test)”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Brafman and Qin with the remote developer communication as taught by Cohen since “examining the run of the test scenario may reveal a behavior of the system with respect to the certain test step, transaction, command, or procedure” (Cohen Col. 33 Ln. 12).

Claim 7:
Klamik in view of Cohen, Brafman and Qin teaches the method of claim 1. Klamik in view of Brafman and Qin does not teach the following; however, Cohen further teaches
wherein each of the first and training test results includes state information for the corresponding test run, the method further comprising collecting the state information for each of the first and training test runs using a control/capture service (CCS) that is independent of the test code, the state information for each test run representing one or more states associated with one or more testing resources allocated for the test run (Col. 32 Ln. 8: “monitoring a user may involve utilization of internal state data of the software system; data that may not have been directly provided by the user and may also not be directly provided to the user (e.g., memory content, database activities, and/or network traffic).” Col. 32 Ln. 23: “the monitoring module is, and/or utilizes, a software module that interacts with the software system on which the test scenarios are run, in order to obtain data related to activity of the user on the software system … the monitoring module may be implemented, at least in part, separately from the software system. For example, the monitoring module may include programs that are not part of the software system (e.g., not included in a distribution of the software system),” wherein the “monitoring module” is the “CCS”. Col. 32 Ln. 67: “a run of a test scenario may include state information about systems involved in running the test scenario (e.g., the state of certain system resources, and/or performance data such as CPU load or network congestion).”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Brafman and Qin with the state information as taught by Cohen since “This data may be used to identify runs of test scenarios that describe test steps taken by a user and a result of executing the test steps on the software system” (Cohen Col. 31 Ln. 27).

Claim 11:
Klamik in view of Cohen, Brafman and Qin teaches the method of claim 1 and Klamik further teaches
performing a second test run for the AUT by executing the test code against a corresponding instance of the AUT, thereby generating second test results ([0037] “testing software conducts one or more tests and thereby produces one or more logs”); and 
simplifying the second test results by removing noisy information from the second test results ([0037] “After testing software conducts one or more tests and thereby produces one or more logs such as the exemplary log of FIG. 4, additional processing can be conducted using the data in the log. With an algorithm to detect when the computer was executing what was intended (the test) vs. what was not intended (the noise), the noise can be identified and accommodated for when presenting test results to an analyst.” [0053] “Once adequate test data is obtained, the test data may be further processed 606 by removing any noise from the results”).

With further regard to Claim 11, Klamik in view of Brafman and Qin does not teach the following; however, Cohen further teaches
generating a second test run representation using the simplified second test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
determining that the second test run representation does not correspond to any of the test run clusters (Col. 39 Ln. 20: “a cluster of similar runs includes runs that are represented by similar vectors. Optionally, similar vectors may be characterized in various ways. In one example, similar vectors are vectors whose average pairwise similarity is above a predetermined threshold (e.g., the threshold may be 0.5). Optionally, the average pairwise similarity is determined by computing the average of the dot product of each pair of vectors. In another example, similar vectors are vectors that are all similar to a certain representative vector; e.g., the vectors all within a sphere of a certain Euclidean distance from the representative,” wherein a vector having a pairwise similarity below the “predetermined threshold” indicates that the associated test run does not correspond to any of the existing test run clusters.); and
labeling the second test run as a new type of failure (Col. 41 Ln. 25: “Optionally, labels assigned to runs of test scenarios may be generated and/or assigned … automatically, e.g., by a procedure that analyzes a test scenario to detect attributes describing it (e.g., what modules and/or procedures it involves),” wherein labeling a test run as a “new type of failure” occurs when there is no existing label which meets the test run similarity threshold, as discussed above, and as such a new label is automatically generated and assigned.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Brafman and Qin with the further test run representation generating as taught by Cohen in order “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein such information is known to aid in the process of software development and testing by reducing the burden on the software developer to make the similarity determination themselves.

Claim 12:
Klamik in view of Cohen, Brafman and Qin teaches the method of claim 1 and Klamik further teaches
performing a second test run for the AUT by executing the test code against a corresponding instance of the AUT, thereby generating second test results ([0037] “testing software conducts one or more tests and thereby produces one or more logs”); and
simplifying the second test results by removing noisy information from the second test results ([0037] “After testing software conducts one or more tests and thereby produces one or more logs such as the exemplary log of FIG. 4, additional processing can be conducted using the data in the log. With an algorithm to detect when the computer was executing what was intended (the test) vs. what was not intended (the noise), the noise can be identified and accommodated for when presenting test results to an analyst.” [0053] “Once adequate test data is obtained, the test data may be further processed 606 by removing any noise from the results”).

With further regard to Claim 12, Klamik in view of Brafman and Qin does not teach the following; however, Cohen further teaches
generating a second test run representation using the simplified second test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
determining that the second test run representation does not correspond to any of the test run clusters; and forming a new cluster including the second test run representation (Col. 39 Ln. 20: “a cluster of similar runs includes runs that are represented by similar vectors. Optionally, similar vectors may be characterized in various ways. In one example, similar vectors are vectors whose average pairwise similarity is above a predetermined threshold (e.g., the threshold may be 0.5). Optionally, the average pairwise similarity is determined by computing the average of the dot product of each pair of vectors. In another example, similar vectors are vectors that are all similar to a certain representative vector; e.g., the vectors all within a sphere of a certain Euclidean distance from the representative,” wherein a vector having a pairwise similarity below the “predetermined threshold” indicates that the associated test run does not correspond to any of the existing test run clusters and as such constitutes a new cluster, which will include any further vectors that are found to be above the similarity threshold when compared to the vector of the test run stored in the new cluster.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Brafman and Qin with the further test run representation generating as taught by Cohen in order “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein such information is known to aid in the process of software development and testing by reducing the burden on the software developer to make the similarity determination themselves.
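
For illustration only, the threshold-based reading applied to Claims 11 and 12 above (a run joins an existing cluster when its average dot-product similarity to that cluster exceeds the threshold, and otherwise forms a new cluster) can be sketched as follows. The threshold of 0.5 is the example value quoted from Cohen Col. 39 Ln. 20; normalizing vectors to unit length before taking dot products, the function names, and the sample vectors are assumptions made for this sketch.

```python
import math
from typing import List, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (assumed here so dot products lie in [0, 1])."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def average_similarity(vector: Sequence[float],
                       cluster: Sequence[Sequence[float]]) -> float:
    """Average dot product between the vector and each member of the cluster."""
    unit = normalize(vector)
    return sum(dot(unit, normalize(member)) for member in cluster) / len(cluster)


def assign_or_create(vector: Sequence[float],
                     clusters: List[List[List[float]]],
                     threshold: float = 0.5) -> int:
    """Add the vector to the most similar cluster above the threshold,
    or start a new cluster if none qualifies. Returns the cluster index."""
    best_index, best_score = None, threshold
    for index, cluster in enumerate(clusters):
        score = average_similarity(vector, cluster)
        if score > best_score:
            best_index, best_score = index, score
    if best_index is None:
        clusters.append([list(vector)])
        return len(clusters) - 1
    clusters[best_index].append(list(vector))
    return best_index


if __name__ == "__main__":
    clusters = [[[1, 1, 0, 0, 0]], [[0, 0, 1, 1, 0]]]
    print(assign_or_create([1, 1, 1, 0, 0], clusters))  # joins cluster 0
    print(assign_or_create([0, 0, 0, 0, 1], clusters))  # forms new cluster 2
```

In this sketch a run that lands in a newly created cluster is the one that would receive a newly generated label, consistent with the interpretation stated in the rejection of Claim 11.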

Claim 24:
With regard to Claim 24, this claim is equivalent in scope to Claims 1 and 11 rejected above, merely having a different independent claim type, and as such Claim 24 is rejected on the same grounds and for the same reasons as discussed above with regard to Claims 1 and 11.
With further regard to Claim 24, the claim recites additional elements not specifically addressed in the rejection of Claims 1 and 11. The Klamik reference also discloses these additional elements of Claim 24, for example, wherein the computer program product comprises one or more non-transitory computer-readable media having computer program instructions stored therein, the computer program instructions being configured such that, when executed by one or more processors, the computer program instructions cause the one or more processors to perform operations ([0055] “the invention may be implemented in the general context of computer-executable instructions, such as program modules, being executed by a computer,” see also Fig. 1 showing Processor 110.).

Claims 5 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Klamik in view of Cohen, Brafman and Qin as applied to Claim 1 above, and further in view of Kumar et al. (US PGPUB 2011/0289489; hereinafter “Kumar”).
Claim 5:
Klamik in view of Cohen, Brafman and Qin teaches all the limitations of claim 1 as described above. Klamik in view of Cohen, Brafman and Qin does not teach the following; however, Kumar teaches:
wherein the AUT is a web application, wherein each of the first and training test runs involves interaction of a web browser with the corresponding instance of the AUT ([0032] “each test environment 145 in a web compatibility testing system 100 may include a unique combination of a web browser and an operating system on which the browser may run. Thus, a plurality of test environments 145 may be included that have different installed versions of browsers and operating systems to facilitate compatibility testing of an application under test 105 in different web application scenarios.”)
as controlled by the test code using a web application automation driver ([0052] “the test case 110-A illustrated in the Figure is written in the Selenese language of the Selenium web application testing system,” wherein “Selenium” is a type of “web application automation driver”, as indicated by the Applicant’s specification Paragraph [0031].), and
wherein each of the first and training test results corresponds to a test log representing application of test commands of the test code using the web application automation driver ([0079] “FIG. 9 illustrates an exemplary execution log file 170 including step results 165 from the simulation of the user actions 115-A through 115-E of the test cases 110-A and 110-B.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen, Brafman and Qin with the web application testing as taught by Kumar in order to “provide a framework in which a set of compatibility test cases may be recorded once against an application under test and stored in a simple format in a common repository” (Kumar [0017]).

Claim 10:
Klamik in view of Cohen, Brafman and Qin teaches all the limitations of claim 1 as described above. Klamik in view of Cohen, Brafman and Qin does not teach the following; however, Kumar teaches for each of the first and training test runs:
receiving a request for initiation of the test run ([0059] “As illustrated in FIG. 5, the user interface 175 may include a URL selector 505 interface element providing for the selection or input of a base URL 510 at which to begin the test.”);
allocating one or more resources for the test run, the one or more resources including a virtual computing environment (VCE) instance, establishing a communication link between the VCE instance and the corresponding instance of the AUT ([0030] “A test environment 145 may include hardware and supporting software required for the execution of an application under test 105. A test environment 145 may be a standalone computing device, such as a personal computer, or may be a virtualized instance of a computing device created by way of a virtualization software package. Accordingly, test environments 145 may be implemented as a combination of hardware and software, and may include one or more software applications or processes for causing one or more computer processors to perform the operations of the test environment 145 described herein.”);
receiving a plurality of test commands resulting from execution of the test code ([0041] “The simulator 155 may further be configured to run test cases 110 or user actions 115 in the test environments 145 by specifying the requested test environments 145 by session identifier 160 and network identifier.”);
applying the test commands to the corresponding instance of the AUT using the VCE instance and the communication link ([0069] “When the execute 535 interface element is selected, the simulator 155 may receive one or more messages from the user interface 175 configured to cause the simulator 155 to retrieve the selected user action 115 from the repository 140, and simulate the user action 115 in each of the launched test environments 145.”);
receiving the test results for the test run with the VCE instance via the communication link ([0038] “The agent may return step results 165 based on the status of execution of the user action 115, and the controller may receive the step results 165. These step results 165 may indicate the result of the simulated user actions 115 and verifications 125.”);
correlating the test commands and the test results thereby generating a correlated data set ([0017] “These user actions may be selected and simulated on the application under test, and compared against an expected result.”); and
storing the correlated data set ([0021] “The repository 140 may further be configured to store step results 165 in log files 170, the step results 165 being based on the simulation of the test cases 110 by the simulator 155.” [0038] “The simulator 155 may further send the step results 165 to the repository 140 for storage in log files 170. In some instances, the step results 165 include a screen capture of the application under test 105 after the execution of each associated user action 115.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen, Brafman and Qin with the testing in a virtual computing environment as taught by Kumar in order “to efficiently simulate compatibility situations” (Kumar [0020]).

Claims 6 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Klamik in view of Cohen, Brafman and Qin as applied to Claims 1 and 7 above, and further in view of Lundstrom (US PGPUB 2017/0017566; hereinafter “Lundstrom”).
Claim 6:	
Klamik in view of Cohen, Brafman and Qin teaches all the limitations of claim 1 as described above. Klamik in view of Cohen, Brafman and Qin does not teach the following, however, Lundstrom teaches:
wherein the AUT is a native application configured to operate with a mobile device operating system (OS) ([0017] “when the test processor 100 detects that a mobile device 102 has been connected thereto, it initiates the testing process by issuing a request to the application build server 104, for instance, via a network 106, to install the latest build of the application subject to testing onto the mobile device 102.”), 
wherein each of the first and training test runs involves interaction of the test code with the corresponding instance of the AUT using a mobile device automation driver ([0021] “In step 306, the test processor 100 launches the device platform's (e.g., specific to the device's operating system) and corresponding application's test process or a test run.” [0025] “The TTS test run contains a selenium code abstraction layer 508, which then passes the calls through the Appium layer 510. The Appium layer 510 translates the Selenium calls to Native xcode calls for the layer 512 which communicates with the physical device layer 51,” wherein “Appium” is a type of “mobile device automation driver”, as indicated by the Applicant’s specification Paragraph [0031].), and 
wherein each of the first and training test results corresponds to a test log representing application of test commands of the test code using the mobile device automation driver ([0021] “the test progress may be continuously recorded in the test log file and displayed via the visualization workstation 108 (FIG. 1).”). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen, Brafman and Qin with the mobile device application testing as taught by Lundstrom in order to “provide a new approach to automated software testing where, instead of running at the end of development cycle, tests are performed on a continuous and modular basis” (Lundstrom [0016]).

Claim 9:	
Klamik in view of Cohen, Brafman and Qin teaches all the limitations of claim 7 as described above. Klamik in view of Cohen, Brafman and Qin does not teach the following, however, Lundstrom teaches:
wherein the AUT is a native application configured to operate with a mobile device operating system (OS) ([0017] “when the test processor 100 detects that a mobile device 102 has been connected thereto, it initiates the testing process by issuing a request to the application build server 104, for instance, via a network 106, to install the latest build of the application subject to testing onto the mobile device 102.”), 
wherein each of the first and training test runs involves interaction of the test code with the corresponding instance of the AUT ([0021] “In step 306, the test processor 100 launches the device platform's (e.g., specific to the device's operating system) and corresponding application's test process or a test run.” [0025] “The TTS test run contains a selenium code abstraction layer 508, which then passes the calls through the Appium layer 510. The Appium layer 510 translates the Selenium calls to Native xcode calls for the layer 512 which communicates with the physical device layer 51.”), and
wherein the state information includes one or more of a state of the native application, a state of the mobile device OS, or a state of an emulator emulating the mobile device OS ([0021] “the test progress may be continuously recorded in the test log file and displayed via the visualization workstation 108 (FIG. 1). Upon re-connection of device 102 to the test processor 100, step 312, resources are re-allocated and the test run resumes from the application state that was being tested when the device was unplugged.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen, Brafman and Qin with the mobile device application testing as taught by Lundstrom since “This achieves modular testing by not requiring the test run to traverse the entire state model for the application at one time” (Lundstrom [0021]).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Klamik in view of Cohen, Brafman and Qin as applied to Claim 7 above, and further in view of De Angelis et al. (US PGPUB 2015/0169434; hereinafter “De Angelis”).
Claim 8:	
Klamik in view of Cohen, Brafman and Qin teaches all the limitations of claim 7 as described above. Klamik in view of Cohen, Brafman and Qin does not teach the following, however, De Angelis teaches:
wherein the AUT is a web application ([0028] “embodiments may be used for white-box testing web applications in production environments.”), 
wherein each of the first and training test runs involves interaction of a web browser with the corresponding instance of the AUT as controlled by the test code ([0033] “Using web browser (also referred to simply as "browser") 104, the user may access web server 108 to run application 112. Upon user access, application 112 is downloaded or otherwise made accessible to browser 104 and is executed (e.g., executing application 114). For purposes of performing tests upon executing application 114, some of the test components 118 may also be downloaded to client computer 102 and run in browser 104, e.g., as executing test framework 116.”),
wherein the state information includes a state of the web browser, and wherein the CCS is configured to apply control commands to the web browser and to receive the state of the web browser from the web browser via an application programming interface associated with the web browser ([0006] “Most of the existing frameworks for web browser graphical user interface tests (e.g., Selenium) are divided into a component that offers an application programming interface (API) to define the interaction and to check the expected result, as well as a component that executes the defined interaction in the web browser. The component that executes the defined test interactions use the native automation APIs provided by the browsers. These native APIs of the browser allow checking the state of the document object model (DOM) structure in the browser.” [0061] “At operation 220, results of the executed test are observed. For example, the document object model (DOM) structure and internal state of the application may be examined. The results may include values or variables at defined stages of processing of the application, results of assertions and/or validation statements included in the test case code, etc.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Klamik in view of Cohen, Brafman and Qin with the web application testing as taught by De Angelis in order “to check the internal state of an application and to implement white-box tests” (De Angelis [0006]).

Claims 13-17, 19 and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Klamik in view of Kumar, Cohen, Brafman and Qin.
Claim 13: (Currently Amended)
Klamik teaches a system, comprising one or more computing devices comprising: 
a memory device; and one or more processors in communication with the memory device (Fig. 1: Processor 110. [0055] “the invention may be implemented in the general context of computer-executable instructions, such as program modules, being executed by a computer… program modules may be located in both local and remote computer storage media including memory storage devices.”), 
the one or more processors configured to:
perform a plurality of training test runs for an application under test (AUT), each training test run involving execution of test code against a corresponding instance of the AUT ([0021] “Testing software 120 itself also runs simultaneously with the software 121, 122 or hardware 123 under test.”); and
perform a first test run for the AUT by executing the test code against a corresponding instance of the AUT, thereby generating first test results ([0037] “testing software conducts one or more tests and thereby produces one or more logs”).
 
With further regard to Claim 13, Klamik does not teach the following, however, Kumar teaches wherein, for each of the training test runs, the one or more processors are configured to:
allocate one or more resources for the training test run, the one or more resources including a virtual computing environment (VCE) instance; and establish a communication link between the VCE instance and the corresponding instance of the AUT ([0030] “A test environment 145 may include hardware and supporting software required for the execution of an application under test 105. A test environment 145 may be a standalone computing device, such as a personal computer, or may be a virtualized instance of a computing device created by way of a virtualization software package. Accordingly, test environments 145 may be implemented as a combination of hardware and software, and may include one or more software applications or processes for causing one or more computer processors to perform the operations of the test environment 145 described herein.”);
receive a plurality of test commands resulting from execution of the test code ([0041] “The simulator 155 may further be configured to run test cases 110 or user actions 115 in the test environments 145 by specifying the requested test environments 145 by session identifier 160 and network identifier.”);
apply the test commands to the corresponding instance of the AUT using the VCE instance and the communication link ([0069] “When the execute 535 interface element is selected, the simulator 155 may receive one or more messages from the user interface 175 configured to cause the simulator 155 to retrieve the selected user action 115 from the repository 140, and simulate the user action 115 in each of the launched test environments 145.”); and
receive training test results for the training test run with the VCE instance via the communication link ([0038] “The agent may return step results 165 based on the status of execution of the user action 115, and the controller may receive the step results 165. These step results 165 may indicate the result of the simulated user actions 115 and verifications 125.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik with the testing in a virtual computing environment as taught by Kumar in order “to efficiently simulate compatibility situations” (Kumar [0020]).

With further regard to Claim 13, Klamik in view of Kumar does not teach the following, however, Cohen teaches:
generate a training test run representation for each of the training test runs (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
cluster the training test run representations into a plurality of test run clusters (Col. 8 Ln. 49: “the clustering module 906 may receive the runs 905 after they undergo processing. For example, the runs may be represented as vectors of features on which a clustering algorithm utilized by the clustering module 906 may operate.”);
identify each of a subset of the test run clusters (Col. 36 Ln. 27: “runs of test scenarios may be partitioned into clusters based on one or more values from the runs of test scenarios. For example, runs that involve a same start and/or end test step may be placed in the same cluster (e.g., runs that start from the same screen ID and end with an error are placed in the same cluster).” Col. 38 Ln. 56: “clustering of runs of test scenarios to clusters that include similar runs may be based on one or more of the following logged activities: … messages returned from the executed transactions (e.g., valid, warning, or error messages).”);
generate a first test run representation using the first test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
determine that the first test run representation corresponds to a first one of the subset of test run clusters; and label the first test run with a label (Col. 41 Ln. 1: “the clustering of the runs of test scenarios to clusters that include similar runs may be done utilizing a classifier that is trained to assign test scenarios to predetermined classes. Optionally, the classifier is trained on labeled training data that includes training data that includes representations of runs of test scenarios (e.g., feature vectors) and labels corresponding to clusters to which the runs are assigned. If the labels in the training data are assigned according to some (possibly arbitrary) notion of similarity between test scenarios, clusters of test scenarios that have the same label assigned by the classifier are likely to contain runs that are similar according to the notion of similarity.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar with the test run representation generating as taught by Cohen in order “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein such information is known to aid in the process of software development and testing by reducing the burden on the software developer to make the similarity determination themselves.
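For illustration only, the vector representation quoted from Cohen (Col. 38 Ln. 67) can be sketched as below. This is a minimal sketch and not code from any cited reference; the function name `run_to_vector` and the activity names in `VOCABULARY` are hypothetical and chosen solely for the example.

```python
from typing import Iterable, List, Sequence


def run_to_vector(logged_activities: Iterable[str],
                  vocabulary: Sequence[str]) -> List[int]:
    """Encode one test run as a binary feature vector.

    Position i holds 1 if the i-th activity in the vocabulary was
    logged during the run, and 0 otherwise.
    """
    seen = set(logged_activities)
    return [1 if activity in seen else 0 for activity in vocabulary]


# Hypothetical activity vocabulary; the order fixes the vector positions.
VOCABULARY = ["login_screen", "submit_order", "error_message"]

run_a = run_to_vector(["login_screen", "error_message"], VOCABULARY)
run_b = run_to_vector(["login_screen", "submit_order"], VOCABULARY)
```

Vectors of this form could then be passed to a clustering algorithm or a trained classifier of the kind Cohen describes.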

With further regard to Claim 13, Klamik in view of Kumar and Cohen does not teach the following, however, Brafman teaches:
wherein the identified subset of the test run clusters represent training test runs exhibiting inconsistent failure behavior ([0018] “With respect to problematic tests, this process may identify anomalous tests (e.g., tests which are operating incorrectly). The categories of ‘bad’ tests that may be detected may include oscillating tests, where, by examining a brief history of tests (last X runs—e.g., 8), the number of transitions from passed to failed, and vice versa, may be counted. If the number of transitions are greater than a threshold (e.g., 4), the test may be marked as oscillating (e.g., randomly changing status).”); and
wherein the first test run label is a reliability label ([0057] “the predetermined hypotheses determination module 124 is to assign labels based on rules defined by users … For example, rules may specify that ‘when more than 50% tests fail, assign ‘environment’ label’,” wherein the “environment label” is the “reliability label”. [0064] “Referring to FIG. 2, the ‘type of problem (label)’ may apply to the code issues, environment issues, and test issues”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar and Cohen with the reliability label as taught by Brafman in order “to provide additional hints in the form of hypotheses to resolve a particular failure” (Brafman [0015]).
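For illustration only, the oscillating-test check quoted from Brafman [0018] can be sketched as below. This is a minimal sketch under stated assumptions and not Brafman's implementation: the function names are hypothetical, a history is assumed to be a list of booleans (True for passed), and the defaults of 8 runs and a threshold of 4 simply reuse the example values in the quoted passage.

```python
from typing import Sequence


def count_transitions(history: Sequence[bool]) -> int:
    """Count pass-to-fail and fail-to-pass changes between consecutive runs."""
    return sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)


def is_oscillating(history: Sequence[bool],
                   window: int = 8,
                   threshold: int = 4) -> bool:
    """Mark a test as oscillating if the last `window` runs change status
    more than `threshold` times."""
    recent = history[-window:]
    return count_transitions(recent) > threshold


# True = passed, False = failed (hypothetical histories).
alternating = [True, False, True, False, True, False, True, False]
regressed = [True, True, True, True, False, False, False, False]
```

Under these assumptions, `alternating` has seven transitions and is marked as oscillating, while `regressed` has a single transition and is not.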

With further regard to Claim 13, Klamik in view of Kumar, Cohen and Brafman does not teach the following, however, Qin teaches:
identify one or more differences between the first test run representation and one or more previous test run representations ([0027] “a test result for a target program may include running results of a plurality of test cases, and the running results may include ‘pass’ or ‘failure’.” [0036] “FIG. 3 illustrates an exemplary process 300 for identifying flaky tests.” [0037] “At 310, historical running data of a group of test cases may be obtained.” [0038] “running curves may be generated, based on the historical running data, for representing running history of test cases… The running curve 400 may have two types of amplitudes, wherein the higher amplitude indicates a running result of failure, and the lower amplitude indicates a running result of pass. For example, the 3rd, 7th, 13th, 14th, 17th to 20th, and 26th test runs have a running result of failure, while other test runs have a running result of pass.”);
quantify the one or more differences; translate the quantified one or more differences to a representation of how likely the first test run is reliable or unreliable ([0046] “In FIG. 5, since the failure region number of the running curve 500 is five, which is above the failure region number threshold ‘two’, there is a high possibility that Test Case 1 is a flaky test in terms of the feature of ‘failure region number’.” [0053] “Although four types of features of ‘failure region number’, ‘continuous failure number’, ‘failure distance’, and ‘transition time’ are discussed above, it should be appreciated that the statistical analysis at 330 is not limited to these features, but can be performed based on any equivalents or any other features of the test cases obtained through data mining.” [0054] “In an implementation, determination results based on the features may be weighted and summed, and this test case may be determined as a flaky test if the sum meets a predefined condition.”); and 
the reliability label indicating the representation of how likely the first run is reliable or unreliable ([0056] “At 340, a determination result of one or more flaky tests that are identified by the statistical analysis may be output.” [0057] “FIG. 8 illustrates an exemplary process 800 for identifying flaky tests.” [0060] “At 860, a determination result of one or more flaky tests that are identified through verification may be output,” wherein the “determination result” in Qin is used to set the “reliability label” as taught above in Brafman.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Cohen and Brafman with the translation of quantified differences to a representation of how likely a test is reliable or unreliable, i.e. “flaky”, as taught by Qin since “it would be beneficial to identify flaky tests, and this may help avoiding influence on a final test result of the target program” (Qin [0030]).
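For illustration only, the "failure region number" feature and the weighted summation quoted from Qin [0046] and [0054] can be sketched as below. This is a minimal sketch and not Qin's implementation. It assumes that a failure region is a maximal run of consecutive failed test runs; the function names, feature weights, and 26-run history length are hypothetical.

```python
from typing import Dict, Sequence


def failure_region_count(failed: Sequence[bool]) -> int:
    """Count maximal runs of consecutive failures (assumed definition)."""
    regions = 0
    previous_failed = False
    for is_failed in failed:
        if is_failed and not previous_failed:
            regions += 1  # a new failure region starts here
        previous_failed = is_failed
    return regions


def flaky_score(votes: Dict[str, bool], weights: Dict[str, float]) -> float:
    """Weight and sum the per-feature determinations (True = looks flaky)."""
    return sum(weights[name] * (1.0 if vote else 0.0)
               for name, vote in votes.items())


# Failure positions (1-based) taken from the passage quoted from Qin [0038].
FAILED_RUNS = {3, 7, 13, 14, 17, 18, 19, 20, 26}
history = [run in FAILED_RUNS for run in range(1, 27)]

regions = failure_region_count(history)
votes = {"failure_region_number": regions > 2,  # threshold "two" from [0046]
         "continuous_failure_number": False}    # hypothetical second feature
weights = {"failure_region_number": 0.6, "continuous_failure_number": 0.4}
score = flaky_score(votes, weights)
```

The resulting score could be compared against a predefined condition and mapped to a label; the particular condition is left open here because the quoted passages do not specify one.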

Claim 14: 
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 13. Klamik in view of Kumar, Brafman and Qin does not teach the following, however, Cohen further teaches:
wherein the one or more processors are configured to cluster the training test run representations using an unsupervised learning technique (Col. 40 Ln. 58: “Those skilled in the art may recognize that various clustering algorithms and/or approaches may be used to cluster runs of test scenarios into clusters that include similar runs of test scenarios. For example, the clustering may be done using hierarchical clustering approaches (e.g., bottom-up or top-down approaches) or using partition-based approached (e.g., k-mean algorithms),” wherein “K-Mean algorithms” are well-known in the art to be a type of “unsupervised learning technique”.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Brafman and Qin with the unsupervised learning technique as taught by Cohen in order to further aid in a process “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein the use of an unsupervised machine learning technique, i.e. a K-Means algorithm, serves to reduce the burden on software developers as compared to manually making similarity determinations.

Claim 15: 
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 13. Klamik in view of Kumar, Brafman and Qin does not teach the following, however, Cohen further teaches:
wherein the one or more processors are further configured to associate one of a plurality of labels with each of the training test run representations, and wherein the one or more computing devices are configured to cluster the training test run representations using the labels and a supervised learning technique (Col. 41 Ln. 1: “the clustering of the runs of test scenarios to clusters that include similar runs may be done utilizing a classifier that is trained to assign test scenarios to predetermined classes. Optionally, the classifier is trained on labeled training data that includes training data that includes representations of runs of test scenarios (e.g., feature vectors) and labels corresponding to clusters to which the runs are assigned.” Col. 41 Ln. 25: “labels assigned to runs of test scenarios may be generated and/or assigned … automatically, e.g., by a procedure that analyzes a test scenario to detect attributes describing it (e.g., what modules and/or procedures it involves). Those skilled in the art may recognize that there are many algorithms, and/or machine learning-based approaches, that may be used to train a classifier of runs of test scenarios using labeled training data.” Col. 40 Ln. 66: “a semi-supervised clustering approach may be used such as an Expectation-Maximization (EM) algorithm.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Brafman and Qin with the supervised learning technique as taught by Cohen in order to further aid in a process “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein the use of a supervised machine learning technique, i.e. an Expectation-Maximization (EM) algorithm, serves to reduce the burden on software developers as compared to manually making similarity determinations.

Claim 16: 
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 15. Klamik in view of Kumar, Brafman and Qin does not teach the following, however, Cohen further teaches:
wherein the one or more processors are further configured to:
transmit representations of the training test results to a remote developer device for presentation in a user interface of the remote developer device (Col. 12 Ln. 48: “the user interface 924 may initiate the instantiation of the manipulated test scenario template. For example, the user interface 924 may present a first screen belonging to the test scenario template and prompt a user to take a certain action to advance execution.” Col. 31 Ln. 40: “users perform at least part of their interaction with a software system via a user interface that includes a display that displays screens. Optionally, a screen may refer to a presentation of a certain form through which a user may access, modify and/or enter data.” Col. 58 Ln. 25: “The embodiments may also be practiced in a distributed computing environment where tasks are performed by remote processing devices that are linked through a communication network.”); and
receive feedback from the remote developer device, the feedback being generated using the user interface and relating to the training test results; wherein the one or more processors are configured to associate the labels with the training test run representations based on the feedback (Col. 41 Ln. 25: “labels assigned to runs of test scenarios may be generated and/or assigned manually (e.g., by a tester running a test)”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Brafman and Qin with the remote developer communication as taught by Cohen since “examining the run of the test scenario may reveal a behavior of the system with respect to the certain test step, transaction, command, or procedure” (Cohen Col. 33 Ln. 12).

Claim 17:	
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 13. Klamik in view of Cohen, Brafman and Qin does not teach the following, however, Kumar further teaches:
wherein the AUT is a web application, wherein each of the first and training test runs involves interaction of a web browser with the corresponding instance of the AUT ([0032] “each test environment 145 in a web compatibility testing system 100 may include a unique combination of a web browser and an operating system on which the browser may run. Thus, a plurality of test environments 145 may be included that have different installed versions of browsers and operating systems to facilitate compatibility testing of an application under test 105 in different web application scenarios.”)
as controlled by the test code using a web application automation driver ([0052] “the test case 110-A illustrated in the Figure is written in the Selenese language of the Selenium web application testing system,” wherein “Selenium” is a type of “web application automation driver”, as indicated by the Applicant’s specification Paragraph [0031].), and
wherein each of the first and training test results corresponds to a test log representing application of test commands of the test code using the web application automation driver ([0079] “FIG. 9 illustrates an exemplary execution log file 170 including step results 165 from the simulation of the user actions 115-A through 115-E of the test cases 110-A and 110-B.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Cohen, Brafman and Qin with the AUT being a web application as taught by Kumar in order “to facilitate compatibility testing of an application under test 105 in different web application scenarios” (Kumar [0032]).

Claim 19:
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 13. Klamik in view of Kumar, Brafman and Qin does not teach the following, however, Cohen further teaches:
wherein each of the first and training test results includes state information for the corresponding test run, and wherein the one or more processors are further configured to collect the state information for each of the first and training test runs using a control/capture service (CCS) that is independent of the test code, the state information for each test run representing one or more states associated with one or more testing resources allocated for the test run (Col. 32 Ln. 8: “monitoring a user may involve utilization of internal state data of the software system; data that may not have been directly provided by the user and may also not be directly provided to the user (e.g., memory content, database activities, and/or network traffic).” Col. 32 Ln. 23: “the monitoring module is, and/or utilizes, a software module that interacts with the software system on which the test scenarios are run, in order to obtain data related to activity of the user on the software system … the monitoring module may be implemented, at least in part, separately from the software system. For example, the monitoring module may include programs that are not part of the software system (e.g., not included in a distribution of the software system),” wherein the “monitoring module” is the “CCS”. Col. 32 Ln. 67: “a run of a test scenario may include state information about systems involved in running the test scenario (e.g., the state of certain system resources, and/or performance data such as CPU load or network congestion).”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Brafman and Qin with the state information as taught by Cohen since “This data may be used to identify runs of test scenarios that describe test steps taken by a user and a result of executing the test steps on the software system” (Cohen Col. 31 Ln. 27).

Claim 22: 
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 13 and Klamik further teaches wherein the one or more processors are further configured to:
perform a second test run for the AUT by executing the test code against a corresponding instance of the AUT, thereby generating second test results ([0037] “testing software conducts one or more tests and thereby produces one or more logs”).

With further regard to Claim 22, Klamik in view of Kumar, Brafman and Qin does not teach the following, however, Cohen further teaches:
generate a second test run representation using the second test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
determine that the second test run representation does not correspond to any of the test run clusters (Col. 39 Ln. 20: “a cluster of similar runs includes runs that are represented by similar vectors. Optionally, similar vectors may be characterized in various ways. In one example, similar vectors are vectors whose average pairwise similarity is above a predetermined threshold (e.g., the threshold may be 0.5). Optionally, the average pairwise similarity is determined by computing the average of the dot product of each pair of vectors. In another example, similar vectors are vectors that are all similar to a certain representative vector; e.g., the vectors all within a sphere of a certain Euclidean distance from the representative,” wherein vectors having a pairwise similarity below the “predetermined threshold” indicate that the associated test run does not correspond to any of the existing test run clusters.); and
label the second test run as a new type of failure (Col. 41 Ln. 25: “Optionally, labels assigned to runs of test scenarios may be generated and/or assigned … automatically, e.g., by a procedure that analyzes a test scenario to detect attributes describing it (e.g., what modules and/or procedures it involves),” wherein labelling a test run as a “new type of failure” occurs when there is no existing label which meets the test run similarity threshold, as discussed above, and as such a new label is automatically generated and assigned.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Brafman and Qin with the further test run representation generating as taught by Cohen in order “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein such information is known to aid in the process of software development and testing by reducing the burden on the software developer to make the similarity determination themselves.
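For illustration only, the vector representation and threshold comparison discussed above may be sketched as follows. The sketch assumes binary activity vectors (Cohen Col. 38 Ln. 67) and the example threshold of 0.5 (Cohen Col. 39 Ln. 20). The activity names, the use of a normalized dot product (cosine similarity), the function names and the label strings are hypothetical and are not taken from any cited reference or from the claims.

```python
from math import sqrt

# Hypothetical activity vocabulary; each position in a vector maps to one activity.
ACTIVITIES = ["login", "open_form", "submit", "timeout", "crash"]


def to_vector(logged_activities):
    """Return a binary vector: 1 where the activity occurred in the run, else 0."""
    seen = set(logged_activities)
    return [1 if activity in seen else 0 for activity in ACTIVITIES]


def similarity(u, v):
    """Normalized dot product (cosine similarity). The normalization is an assumption."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_similarity(vector, members):
    """Average similarity between one vector and every member of a cluster."""
    return sum(similarity(vector, m) for m in members) / len(members)


def label_run(vector, clusters, threshold=0.5):
    """Return the label of the best cluster above the threshold, else a new-failure label."""
    best_label, best_score = None, threshold
    for label, members in clusters.items():
        score = average_similarity(vector, members)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_label is not None else "new type of failure"


# Hypothetical clusters built from earlier (training) runs.
clusters = {
    "timeout failure": [
        to_vector(["login", "open_form", "timeout"]),
        to_vector(["login", "timeout"]),
    ],
    "submit failure": [to_vector(["login", "open_form", "submit"])],
}
```

Under these assumptions, a second run whose vector shares no activities with any existing cluster falls below the threshold for every cluster and receives the new-failure label, while a run resembling an existing cluster receives that cluster's label.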

Claim 23: 
Klamik in view of Kumar, Cohen, Brafman and Qin teaches the system of claim 13 and Klamik further teaches wherein the one or more processors are further configured to:
perform a second test run for the AUT by executing the test code against a corresponding instance of the AUT, thereby generating second test results ([0037] “testing software conducts one or more tests and thereby produces one or more logs”); and
simplify the second test results by removing noisy information from the second test results ([0037] “After testing software conducts one or more tests and thereby produces one or more logs such as the exemplary log of FIG. 4, additional processing can be conducted using the data in the log. With an algorithm to detect when the computer was executing what was intended (the test) vs. what was not intended (the noise), the noise can be identified and accommodated for when presenting test results to an analyst.” [0053] “Once adequate test data is obtained, the test data may be further processed 606 by removing any noise from the results”).

With further regard to Claim 23, Klamik in view of Kumar, Brafman and Qin does not teach the following; however, Cohen further teaches:
generate a second test run representation using the simplified second test results (Col. 38 Ln. 54: “Logged activities related to running test scenarios may also be utilized for the purpose of clustering and/or determining similarity between runs of test scenarios.” Col. 38 Ln. 67: “logged activities may be represented as feature values that may be put in a vector corresponding to a run. For example, if a certain activity is performed during a run, a vector corresponding to the run has 1 in a certain position, and otherwise there is a 0 in the certain position,” wherein the “vector” is the “representation”.);
determine that the second test run representation does not correspond to any of the test run clusters; and form a new cluster including the second test run representation (Col. 39 Ln. 20: “a cluster of similar runs includes runs that are represented by similar vectors. Optionally, similar vectors may be characterized in various ways. In one example, similar vectors are vectors whose average pairwise similarity is above a predetermined threshold (e.g., the threshold may be 0.5). Optionally, the average pairwise similarity is determined by computing the average of the dot product of each pair of vectors. In another example, similar vectors are vectors that are all similar to a certain representative vector; e.g., the vectors all within a sphere of a certain Euclidean distance from the representative,” wherein vectors having a pairwise similarity below the “predetermined threshold” indicate that the associated test run does not correspond to any of the existing test run clusters and as such constitutes a new cluster which will include any further vectors which are found to be above the similarity threshold when compared to the vector of the test run stored in the new cluster.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Brafman and Qin with the further test run representation generating as taught by Cohen in order “to determine similarity of the runs to each other” (Cohen Col. 39 Ln. 55), wherein such information is known to aid in the process of software development and testing by reducing the burden on the software developer to make the similarity determination themselves.
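For illustration only, the formation of a new cluster when a representation matches no existing cluster may be sketched using the representative-vector example quoted above (vectors within a certain Euclidean distance of a representative). The radius value, the dictionary layout, the first-match assignment and the function name are hypothetical and are not drawn from Cohen or from the claims.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign_or_create(vector, clusters, radius=1.0):
    """Add the vector to the first cluster whose representative lies within
    the radius; otherwise form a new cluster with this vector as its
    representative. Returns the index of the cluster that was used.

    `clusters` is a list of dicts with keys "representative" and "members"
    (a hypothetical layout chosen for this sketch).
    """
    for index, cluster in enumerate(clusters):
        if dist(vector, cluster["representative"]) <= radius:
            cluster["members"].append(vector)
            return index
    # No existing cluster matched, so the run seeds a new cluster.
    clusters.append({"representative": vector, "members": [vector]})
    return len(clusters) - 1
```

In this sketch, a later vector that lies within the radius of the new cluster's representative would join that cluster rather than seeding another one. Assigning to the first matching cluster, rather than the nearest, is a simplification.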

Claims 18 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Klamik in view of Kumar, Cohen, Brafman and Qin as applied to Claims 13 and 19 above, and further in view of Lundstrom.
Claim 18:		
Klamik in view of Kumar, Cohen, Brafman and Qin teaches all the limitations of claim 13 as described above. Klamik in view of Kumar, Cohen, Brafman and Qin does not teach the following; however, Lundstrom teaches:
wherein the AUT is a native application configured to operate with a mobile device operating system (OS) ([0017] “when the test processor 100 detects that a mobile device 102 has been connected thereto, it initiates the testing process by issuing a request to the application build server 104, for instance, via a network 106, to install the latest build of the application subject to testing onto the mobile device 102.”), 
wherein each of the first and training test runs involves interaction of the test code with the corresponding instance of the AUT using a mobile device automation driver ([0021] “In step 306, the test processor 100 launches the device platform's (e.g., specific to the device's operating system) and corresponding application's test process or a test run.” [0025] “The TTS test run contains a selenium code abstraction layer 508, which then passes the calls through the Appium layer 510. The Appium layer 510 translates the Selenium calls to Native xcode calls for the layer 512 which communicates with the physical device layer 51,” wherein “Appium” is a type of “mobile device automation driver”, as indicated by the Applicant’s specification Paragraph [0031].), and 
wherein each of the first and training test results corresponds to a test log representing application of test commands of the test code using the mobile device automation driver ([0021] “the test progress may be continuously recorded in the test log file and displayed via the visualization workstation 108 (FIG. 1).”). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Cohen, Brafman and Qin with the mobile device application testing as taught by Lundstrom in order to “provide a new approach to automated software testing where, instead of running at the end of development cycle, tests are performed on a continuous and modular basis” (Lundstrom [0016]).

Claim 21:	
Klamik in view of Kumar, Cohen, Brafman and Qin teaches all the limitations of claim 19 as described above. Klamik in view of Kumar, Cohen, Brafman and Qin does not teach the following; however, Lundstrom teaches:
wherein the AUT is a native application configured to operate with a mobile device operating system (OS) ([0017] “when the test processor 100 detects that a mobile device 102 has been connected thereto, it initiates the testing process by issuing a request to the application build server 104, for instance, via a network 106, to install the latest build of the application subject to testing onto the mobile device 102.”), 
wherein each of the first and training test runs involves interaction of the test code with the corresponding instance of the AUT ([0021] “In step 306, the test processor 100 launches the device platform's (e.g., specific to the device's operating system) and corresponding application's test process or a test run.” [0025] “The TTS test run contains a selenium code abstraction layer 508, which then passes the calls through the Appium layer 510. The Appium layer 510 translates the Selenium calls to Native xcode calls for the layer 512 which communicates with the physical device layer 51.”), and
wherein the state information includes one or more of a state of the native application, a state of the mobile device OS, or a state of an emulator emulating the mobile device OS ([0021] “the test progress may be continuously recorded in the test log file and displayed via the visualization workstation 108 (FIG. 1). Upon re-connection of device 102 to the test processor 100, step 312, resources are re-allocated and the test run resumes from the application state that was being tested when the device was unplugged.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Cohen, Brafman and Qin with the mobile device application testing as taught by Lundstrom since “This achieves modular testing by not requiring the test run to traverse the entire state model for the application at one time” (Lundstrom [0021]).

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Klamik in view of Kumar, Cohen, Brafman and Qin as applied to Claim 19 above, and further in view of De Angelis.
Claim 20:	
Klamik in view of Kumar, Cohen, Brafman and Qin teaches all the limitations of claim 19 as described above. Klamik in view of Kumar, Cohen, Brafman and Qin does not teach the following; however, De Angelis teaches:
wherein the AUT is a web application ([0028] “embodiments may be used for white-box testing web applications in production environments.”), 
wherein each of the first and training test runs involves interaction of a web browser with the corresponding instance of the AUT as controlled by the test code ([0033] “Using web browser (also referred to simply as "browser") 104, the user may access web server 108 to run application 112. Upon user access, application 112 is downloaded or otherwise made accessible to browser 104 and is executed (e.g., executing application 114). For purposes of performing tests upon executing application 114, some of the test components 118 may also be downloaded to client computer 102 and run in browser 104, e.g., as executing test framework 116.”),
wherein the state information includes a state of the web browser, and wherein the CCS is configured to apply control commands to the web browser and to receive the state of the web browser from the web browser via an application programming interface associated with the web browser ([0006] “Most of the existing frameworks for web browser graphical user interface tests (e.g., Selenium) are divided into a component that offers an application programming interface (API) to define the interaction and to check the expected result, as well as a component that executes the defined interaction in the web browser. The component that executes the defined test interactions use the native automation APIs provided by the browsers. These native APIs of the browser allow checking the state of the document object model (DOM) structure in the browser.” [0061] “At operation 220, results of the executed test are observed. For example, the document object model (DOM) structure and internal state of the application may be examined. The results may include values or variables at defined stages of processing of the application, results of assertions and/or validation statements included in the test case code, etc.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Klamik in view of Kumar, Cohen, Brafman and Qin with the web application testing as taught by De Angelis in order “to check the internal state of an application and to implement white-box tests” (De Angelis [0006]).

Response to Arguments
Applicant's arguments, see Pages 11-13 of the Remarks filed July 14, 2022, with respect to the rejections under 35 U.S.C. 103 of Claims 1-24 have been fully considered but they are not persuasive. With respect to the Applicant’s argument that the newly amended language of Claims 1, 13 and 24 is not taught by the previously cited prior art, this argument has been fully considered but is moot in view of the newly cited Qin reference as discussed above in the respective rejections.

Conclusion
The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is as follows:
Luo et al. (“An Empirical Analysis of Flaky Tests,” 2014) discusses an extensive study of flaky tests, wherein it is noted that several open-source testing frameworks have annotations to label flaky tests that require a few reruns upon failure.
Gao (“Quantifying Flakiness and Minimizing Its Effects on Software Testing,” 2017) discusses a systematic approach to quantitatively analyze and minimize the effects of flakiness, wherein an entropy-based metric is introduced to quantify the flakiness of different layers of test outputs and the manner in which Google labels tests flaky is analyzed.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOANNE GONZALES MACASIANO whose telephone number is (571)270-7749. The examiner can normally be reached Monday to Thursday, 10:30 AM to 6:00 PM Eastern Standard Time.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hyung S. Sough, can be reached at (571) 272-6799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/J.G.M/Examiner, Art Unit 2194
/S. SOUGH/
SPE, AU 2192/2194