DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
In response to the amendment filed 8/4/2022, claims 1-20 are pending.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Claims 1, 9 and 15 recite a method for selecting test items.
Re Claims 1, 9 and 15, the claims recite the following steps: initiate a spoken language test to be taken by the user; select a first test item for the spoken language test from a test item bank based on a first item difficulty probability distribution; cause the user device to deliver the first test item to the user; receive first spoken response data from the user via the user device responsive to the first test item; generate a second item difficulty probability distribution based on the first spoken response data; select a second test item for the spoken language test from the test item bank based on the second item difficulty probability distribution; cause the user device to deliver the second test item to the user; receive second spoken response data from the user via the user device responsive to the second test item; determine that an end condition has been met; and responsive to determining that the end condition has been met, end the spoken language test. The steps above are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting computer structure (“a server”, “a processor”, “a memory”), nothing in the claim elements precludes the steps from practically being performed in the mind. For example, a teacher may readily perform the recited steps above. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
This judicial exception is not integrated into a practical application. The claims merely include instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. Specifically, the computer-implemented steps are claimed to perform their basic functions of initiating a test session, selecting a test item and updating the test session, which were known in the pre-computer world. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Claims 1, 9 and 15 do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computer structure to perform the claimed steps amounts to no more than mere instructions to apply the exception using a generic computer component. These structures are used only for data gathering and manipulation and, as such, represent only insignificant pre-solution activity. Viewed as a whole, these additional claim elements do not provide meaningful limitations to transform the abstract idea into a patent eligible application of the abstract idea such that the claims amount to significantly more than the abstract idea itself. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible. Dependent claims 2 – 8, 10 – 14 and 16 – 20 inherit the deficiencies of their respective parent claims through their dependencies and do not recite additional limitations sufficient to direct the claims to more than the claimed abstract idea, and are thus rejected for the same reasons.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 4 – 5 and 12 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the enablement requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention. 
Claims 4 and 12 recite “the random number exceeds a predetermined probability threshold”. As one of ordinary skill in the art would have understood, a probability value is usually represented as a percentage; for example, there is a 50% or .50 chance (probability) of rain today. It is unclear how a random number is to be compared with a percentage value (the probability threshold). The specification does not clearly explain how the random number is used together with the probability threshold value.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4-9 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Gawlick et al. (US 2015/0161901 A1) in view of Fujimori et al. (US 2005/0256663 A1).
Re claims 1, 9:
1. Gawlick teaches a system (Gawlick, Abstract) comprising: 
a server that is in electronic communication with a user device associated with a user (Gawlick, [0032], “data associated with any one of the computer adaptive testing modules, test script administration process or interim ability estimation process may be stored on the datastore. The datastore may additionally be configured to receive and/or forward some or all of the stored data”; [0062], “account the examinee's performance”), the server comprising: 
a processor; and a memory device configured to store computer-readable instructions, which, when executed (Gawlick, [0013]), cause the processor to: 
initiate a language test to be taken by the user (Gawlick, fig. 1, “Administer computer implemented test battery”; [0022]; [0065], “math”, “writing”); 
select a first test item for the language test from a test item bank based on first test item selection parameters (Gawlick, fig. 1, “Select a first test section from test battery”; Abstract, “An ability estimate is calculated from an earlier section(s) of the at least two or more sections and an initial item and subsequent items for a subsequent section are selected from the plurality of test items based upon the ability estimate(s) from the earlier section(s)”; [0038], “closest match between ability and item difficulty”); 
cause the user device to deliver the first test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); 
receive first response data from the user via the user device responsive to the first test item (Gawlick, [0013], “receiving the examinee’s response”); 
perform an analysis of the first response data (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”); 
produce second test item selection parameters by modifying the first test item selection parameters based on the analysis of the first response data (Gawlick, fig. 1, “Inform selection of the initial test item for the next test section based upon examinee's ability estimate from the first selected section”; [0014], “One or more test items from the plurality of test items are selected to include in a subsequent test section to the one test section of the at least two or more test sections based upon the initial ability estimate from at least the one previous test section”); 
select a second test item for the language test from the test item bank based on the second test item selection parameters (Gawlick, fig. 1, “Inform selection of the initial test item for the next test section based upon examinee's ability estimate from the first selected section”; [0014], “One or more test items from the plurality of test items are selected to include in a subsequent test section to the one test section of the at least two or more test sections based upon the initial ability estimate from at least the one previous test section” ; [0038], “closest match between ability and item difficulty”); 
cause the user device to deliver the second test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); 
receive second response data from the user via the user device responsive to the second test item (Gawlick, [0013], “receiving the examinee’s response”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”); 
determine that an end condition has been met; and responsive to determining that the end condition has been met, end the language test (Gawlick, [0038], “This process continues until the test reaches its full length or satisfies certain termination rules”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”; [0058], “Selection length refers to the number of items contained in a subset of items meeting content and statistical requirements, from which an item is selected to present”; each test section includes a fixed number of items; the test section ends when the termination rule has been met).
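
For illustration only, the sequence of steps mapped above (select an item, deliver it, receive a response, re-select, and end at a fixed length) can be sketched as follows. This is a minimal sketch and not a characterization of the code of Gawlick or of the application: the function name `run_adaptive_test`, the callback `answer_fn`, and the fixed `step` update of the ability estimate are hypothetical assumptions. Gawlick describes maximum likelihood and EAP estimation rather than a fixed step.

```python
def run_adaptive_test(item_bank, answer_fn, max_items=3,
                      start_ability=0.0, step=0.5):
    """Hypothetical sketch of a fixed-length adaptive test.

    item_bank: dict mapping item id -> difficulty value.
    answer_fn: callable(item id) -> True if the response is scored correct.
    """
    ability = start_ability
    remaining = dict(item_bank)  # copy so the caller's bank is not modified
    delivered = []
    # End condition: a predetermined number of items has been delivered.
    while remaining and len(delivered) < max_items:
        # Select the unused item whose difficulty is closest to the estimate.
        item = min(remaining, key=lambda k: abs(remaining[k] - ability))
        del remaining[item]
        correct = answer_fn(item)  # deliver the item and score the response
        # Assumed simplification: move the estimate by a fixed step.
        ability += step if correct else -step
        delivered.append(item)
    return delivered, ability
```

In this sketch the second item depends on the scored response to the first item, and the test ends once `max_items` items have been delivered.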

9. Gawlick teaches a system (Gawlick, Abstract) comprising:
a server that is in electronic communication with a user device associated with a user (Gawlick, [0032], “data associated with any one of the computer adaptive testing modules, test script administration process or interim ability estimation process may be stored on the datastore. The datastore may additionally be configured to receive and/or forward some or all of the stored data”; [0062], “account the examinee's performance”), the server comprising: 
a processor; and a memory device configured to store computer readable instructions (Gawlick, [0013]), which, when executed, cause the processor to: 
initiate a language test to be taken by the user (Gawlick, fig. 1, “Administer computer implemented test battery”; [0022]; [0065], “math”, “writing”); 
generate a first random number (Gawlick, [0024], “The first item can then be selected at random from a subset of items (where the subset may be specified to contain from one to n items)”); 
select a first test item for the language test based on test item selection parameters and the first random number (Gawlick, [0024], “The first item can then be selected at random from a subset of items (where the subset may be specified to contain from one to n items)”), the first test item selection parameters defining a first difficulty range, wherein the first test item has a first difficulty value that is within the first difficulty range (Gawlick, fig. 1, “Select a first test section from test battery”; Abstract, “An ability estimate is calculated from an earlier section(s) of the at least two or more sections and an initial item and subsequent items for a subsequent section are selected from the plurality of test items based upon the ability estimate(s) from the earlier section(s)”; [0038], “closest match between ability and item difficulty”; [0024], “selecting the item to present ( e.g., one whose difficulty is closest to the ability estimate, one with the most information at the ability estimate)”; a close match (instead of an exact match) means there is a range of acceptable values for item difficulty); 
cause the user device to deliver the first test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); 
receive first response data from the user via the user device responsive to the first test item (Gawlick, [0013], “receiving the examinee’s response”); 
perform an analysis of the first response data; update the test item selection parameters by increasing the first difficulty range to a second difficulty range based on the analysis of the first response data (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”); 
generate a second random number (Gawlick, [0064], “randomly select the next item”); 
select a second test item for the language test having a second difficulty value within the second difficulty range based on the second random number (Gawlick, fig. 1, “Inform selection of the initial test item for the next test section based upon examinee's ability estimate from the first selected section”; [0014], “One or more test items from the plurality of test items are selected to include in a subsequent test section to the one test section of the at least two or more test sections based upon the initial ability estimate from at least the one previous test section”); 
cause the user device to deliver the second test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); 
receive second response data from the user via the user device responsive to the second test item (Gawlick, [0013], “receiving the examinee’s response”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”); 
determine that a first end condition has been met; and end the language test (Gawlick, [0038], “This process continues until the test reaches its full length or satisfies certain termination rules”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”; [0058], “Selection length refers to the number of items contained in a subset of items meeting content and statistical requirements, from which an item is selected to present”; each test section includes a fixed number of items; the test section ends when the termination rule has been met). 

Gawlick does not explicitly disclose initiating a spoken language test to be taken by the user; instead, Gawlick teaches an adaptive test that generates test items for subject matter such as math and writing (Gawlick, [0065], “math”, “writing”).

Fujimori et al. (US 2005/0256663 A1) teaches a test system realized in an English ability test (Fujimori, Abstract). Fujimori further teaches a spoken response; specifically, Fujimori teaches:
initiate a spoken language test to be taken by the user (Fujimori, [0088], “returns an answer in the uttered voice format (step 404)”; [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”); 
select a first test item for the spoken language test from a test item bank based on first test item selection parameters (Fujimori, fig. 1, “Problem database 105”; [0084], “the database storing a number of problems for which the difficulty level”; fig. 4, 402; [0008], “a problem database accessible by the test management server and storing a plurality of problems for which an item parameter including a difficulty level and identifiability is estimated in advance”); 
cause the user device to deliver the first test item to the user (Fujimori, fig. 1, “Problem database 105”; [0084], “the database storing a number of problems for which the difficulty level”; fig. 4, 402); 
receive first spoken response data from the user via the user device responsive to the first test item (Fujimori, [0088], “returns an answer in the uttered voice format (step 404)”; [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”); 
perform an analysis of the first spoken response data (Fujimori, [0085], “a testee uses a microphone (voice input/output device 309 shown in FIG. 3) provided for the personal computer, and inputs an answer in response to a presented question or by voice, and the contents of the utterance is processed as a target to be evaluated”; [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”); 
produce second test item selection parameters by modifying the first test item selection parameters based on the analysis of the first spoken response data (Fujimori, [0008], “With the configuration, the test system presents n problems to one testee, and the ability θ of the testee is evaluated from the response of the testee to the presented n problems. The test management server includes: (1) means for selecting n problems … from the problem database in response to a request transmitted from the first computer, and transmitting the selected problems to the first computer; (2) answer storage means for storing an answer returned from the first computer in response to the problem selected from the problem database and transmitted to the first computer;”); 
select a second test item for the spoken language test from the test item bank based on the second test item selection parameters; cause the user device to deliver the second test item to the user; receive second spoken response data from the user via the user device responsive to the second test item … end the spoken language test  (Fujimori, figs. 5 – 6, “1st trial”, “2nd trial”, “3rd trial”).

The substitution of one known element (the spoken language test shown in Fujimori) for another (the math and writing tests shown in Gawlick) would have been obvious to one of ordinary skill in the art at the time of the invention, since the substitution of the subject matter shown in Fujimori would have yielded predictable results, namely, providing a test in a presentation format for which it is hard to determine whether an answer is correct or wrong, such as the writing and speaking portions of a foreign language test (Fujimori, Abstract).

Gawlick does not explicitly disclose a difficulty range. Fujimori teaches a difficulty range (Fujimori, [0127], “the estimated values of θ and difficulty level are in the range of -3.5 to 3.5, and the range of the estimated value of the identifiability is 0.02 to 2.0”). Therefore, in view of Fujimori, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system / method as taught by Gawlick, by providing a difficulty range as taught by Fujimori, since it was known in the art to classify a subset of items with similar difficulty levels in one category.

Re claims 2, 10:
2. The system of claim 1, wherein the first spoken response data comprises recorded speech data, and wherein performing the analysis of the first spoken response data comprises executing a speech recognition algorithm to identify and extract words from the recorded speech data.  10. The system of claim 9, wherein the first spoken response data comprises recorded speech data, and wherein performing the analysis of the first spoken response data comprises executing a speech recognition algorithm to identify and extract words from the recorded speech data (Fujimori, [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”).  

Re claim 5:
5. The system of claim 4, wherein selecting the second test item comprises: 
randomly selecting the second test item from a group of test items of the test item bank, wherein the group of test items includes only test items having difficulty values within the second difficulty range, wherein the difficulty values of the test items of the group of test items are calculated using the item response theory model (Gawlick, [0024], “The first item can then be selected at random from a subset of items (where the subset may be specified to contain from one to n items). There are various methods for selecting the item to present ( e.g., one whose difficulty is closest to the ability estimate, one with the most information at the ability estimate)”; [0076], “the item can be randomly selected from the subset of three items satisfying the test blueprint and having the best match between item difficulty and ability estimates. Additionally, in concert with the individualized initial ability estimates for the subsequent section”).
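
For illustration only, random selection from a group limited to items whose difficulty values fall within a range, as recited in claim 5, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `select_in_range`, the dictionary layout of the item bank, and the inclusive range bounds are hypothetical and are not taken from Gawlick or from the application.

```python
import random


def select_in_range(item_bank, low, high, rng=random):
    """Hypothetical sketch: pick one item at random from the group of
    items whose difficulty lies within [low, high] (bounds assumed inclusive).

    item_bank: dict mapping item id -> difficulty value.
    Returns an item id, or None when no item falls within the range.
    """
    # Build the group so that it includes only items inside the range.
    group = sorted(i for i, d in item_bank.items() if low <= d <= high)
    if not group:
        return None
    return rng.choice(group)
```

Passing a seeded `random.Random` instance as `rng` makes the selection repeatable, which may be convenient when checking the behavior.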

Re claims 8, 13:
8. The system of claim 1, wherein determining that the end condition has been met comprises determining that a predetermined number of test items have been delivered. 13. The system of claim 9, wherein determining that the first end condition has been met comprises determining that a predetermined number of test items have been delivered (Gawlick, [0038], “This process continues until the test reaches its full length or satisfies certain termination rules”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”; [0058], “Selection length refers to the number of items contained in a subset of items meeting content and statistical requirements, from which an item is selected to present”; each test section includes a fixed number of items; the test section ends when the termination rule has been met).

Re claim 14:
14. The system of claim 9, wherein the computer-readable instructions, when executed, further cause the processor to: 
determine that the first end condition has been met by determining that a first predetermined number of test items have been delivered during a first stage of the spoken language, wherein the first difficulty range has a predefined association with the first stage (Gawlick, fig. 1, “Select a first test section from test battery”; Abstract, “An ability estimate is calculated from an earlier section(s) of the at least two or more sections and an initial item and subsequent items for a subsequent section are selected from the plurality of test items based upon the ability estimate(s) from the earlier section(s)”; [0038], “closest match between ability and item difficulty”); 
responsive to determining that the first predetermined number of test items have been delivered: end the first stage; initiate a second stage of the spoken language test, and update the test item selection parameters to include a third difficulty range having a predefined association with the second stage (Gawlick, [0013], “Each test section has a set of test items selected from a plurality of test items. An item selection process is also provided. At least one section from the plurality of test sections is administered to an examinee using the computer”; fig. 1, “Inform selection of the initial test item for the next test section based upon examinee's ability estimate from the first selected section”);
generate a third random number; select a third test item for the spoken language having a third difficulty value within the third difficulty range based on the third random number (Gawlick, [0024], “The first item can then be selected at random from a subset of items (where the subset may be specified to contain from one to n items)”); 
cause the user device to deliver the third test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); and 
determine that a second end condition has been met by determining that a second predetermined number of test items have been delivered, wherein ending the spoken language test is performed responsive to determining that the second end condition has been met (Gawlick, [0038], “This process continues until the test reaches its full length or satisfies certain termination rules”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”; [0058], “Selection length refers to the number of items contained in a subset of items meeting content and statistical requirements, from which an item is selected to present”; each test section includes a fixed number of items; the test section ends when the termination rule has been met).

Claims 3, 4, 6-7, 11-12 and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gawlick et al. (US 2015/0161901 A1) in view of Fujimori et al. (US 2005/0256663 A1) and Brown et al. (US 2016/0217701 A1).
Re claims 3, 11, 17:
3. The system of claim 1, wherein performing the analysis of the first spoken response data comprises: generating a score based on the first spoken response data; updating an item response theory model based on the score.  11. The system of claim 9, wherein performing analysis of the first spoken response data comprises: generating a score based on the first spoken response data; updating an item response theory model based on the score, wherein the first difficulty value and the second difficulty value are determined based on the item response theory model; updating a level associated with the item response theory model; responsive to updating the level, determining a change in the level; and generating a reward value based on the change in the level, wherein the test item selection parameters are updated based on the score and the reward value.  17. The system of claim 15, wherein the computer-readable instructions, when executed, further cause the processor to: generate a score based on the first response data; update an item response theory model based on the score, wherein a first difficulty value of the first test item and a second difficulty value of the second test item are determined based on the item response theory model; update a level associated with the item response theory model; responsive to updating the level, determine a change in the level; and generate a reward value based on the change in the level, wherein the second item difficulty probability distribution is generated based on the score and the reward value (Gawlick, fig. 1, “Select a first test section from test battery”; [0024], “The examinee responds to the selected item. The examinee's response to the first item is scored and used to update the ability estimate for the examinee”; Abstract, “An ability estimate is calculated from an earlier section(s) of the at least two or more sections and an initial item and subsequent items for a subsequent section are selected from the plurality of test items based upon the ability estimate(s) from the earlier section(s)”; [0038], “closest match between ability and item difficulty”). 

Gawlick does not explicitly disclose updating a confidence level associated with the item response theory model, or a reward value. Brown teaches the missing limitation (Brown, figs. 4 – 6; [0057]; [0073], “If the posterior estimate exceeds the difficulty level Z of the question with a sufficient confidence level, e.g., 0.95 (step 530), then the assessment agent 122 may determine that the student has mastered questions for skill i having that level of difficulty Z”; [0100]). Brown further teaches a reward value (Brown, figs. 4 – 6, “skill j to be acquired by the student”; skill j is the reward value). Therefore, in view of Brown, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system/method described in Gawlick, by providing a confidence level and a reward value as taught by Brown, since the posterior estimate of student ability exceeds the skill acquisition probability with a confidence level greater than a sufficient confidence level (Brown, [0100]).

Gawlick does not explicitly disclose that item selection includes a probability distribution. Brown et al. (US 2016/0217701 A1) teaches systems and methods for analyzing student learning and calibrating the difficulty of questions on a test or examination. Brown teaches generating a second item difficulty probability distribution based on the first response data (Brown, [0018]; [0034]; [0058]; [0060]; [0062]). Therefore, in view of Brown, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system/method described in Gawlick, by providing the difficulty probability distribution as taught by Brown, since, for example, an expected value or estimate of a probability distribution may provide an estimate of difficulty of a question, and the variance of the distribution may provide a certainty for that estimate (Brown, [0018]).
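
For illustration only, the relationship noted above, in which the expected value of a probability distribution serves as a difficulty estimate and its variance indicates the certainty of that estimate, can be sketched for a discrete distribution as follows. This is a minimal sketch under stated assumptions: the function name `mean_and_variance` and the representation of the distribution as a mapping from difficulty value to weight are hypothetical and are not taken from Brown.

```python
def mean_and_variance(dist):
    """Hypothetical sketch: summarize a discrete difficulty distribution.

    dist: dict mapping difficulty value -> non-negative weight
          (weights are normalized here, so they need not sum to 1).
    Returns (expected value, variance).
    """
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution must have positive total weight")
    # Expected value: the point estimate of difficulty.
    mean = sum(d * w for d, w in dist.items()) / total
    # Variance: a smaller value indicates a more certain estimate.
    var = sum(w * (d - mean) ** 2 for d, w in dist.items()) / total
    return mean, var
```

A distribution concentrated on a single difficulty value yields a variance of zero, which corresponds to the most certain estimate in this sketch.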

Re claim 4:
4. The system of claim 3, wherein the first test item selection parameters comprise a first difficulty range, wherein the second test item selection parameters comprise a second difficulty range, and wherein the computer-readable instructions, when executed, further cause the processor to: 
generate a random number (Gawlick, [0024], “The first item can then be selected at random from a subset of items (where the subset may be specified to contain from one to n items)”); determine that the random number exceeds a predetermined probability threshold (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”; [0034], “The expected a posteriori (EAP) estimation method can also be used at every step to obtain an examinee's interim ability which may then be used for selecting subsequent items in subsequent sections of a test battery”; [0043] – [0046], i.e., [0044], “Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP). The ML method involves maximizing the likelihood function over the range ... Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP) …” [0044], “P(Uij | 0) is the probability that an examinee i with ability Ɵ”); and  increasing the first difficulty range of the first test item selection parameters to the second difficulty range of the second test item selection parameters (Gawlick, [0073], “Based on ability estimates, the subsequent item was selected from a subset of items satisfying test blueprint and having close match between item difficulty and examinee ability”).

Gawlick does not explicitly disclose updating a reward value, where the reward value is based on a confidence level.  Brown teaches the missing limitation (Brown, figs. 4 – 6; [0057]; [0073], “If the posterior estimate exceeds the difficulty level Z of the question with a sufficient confidence level, e.g., 0.95 (step 530), then the assessment agent 122 may determine that the student has mastered questions for skill i having that level of difficulty Z”; [0100]).  Brown further teaches a reward value (Brown, figs. 4 – 6, “skill j to be acquired by the student”; skill j is the reward value).  Therefore, in view of Brown, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system/method described in Gawlick by providing a confidence level and a reward value as taught by Brown, since the posterior estimate of student ability exceeds the skill acquisition probability with a confidence level greater than a sufficient confidence level (Brown, [0100]).
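
For illustration only, the sequence recited in claim 4 (generate a random number, compare it with a predetermined probability threshold, and increase the difficulty range) may be sketched as follows.  The function name, the representation of a difficulty range as a (low, high) pair, and the fixed upward shift are assumptions made for this sketch; they are not taken from the claims, Gawlick, or Brown.

```python
import random


def maybe_increase_range(difficulty_range, probability_threshold, step, rng):
    """Hypothetical helper: shift a (low, high) difficulty range upward
    when a freshly drawn random number exceeds the threshold."""
    draw = rng.random()  # uniform draw in [0.0, 1.0)
    if draw > probability_threshold:
        low, high = difficulty_range
        # Assumption: "increasing" the range means shifting both bounds up.
        return (low + step, high + step), draw
    # Threshold not exceeded: the first difficulty range is kept as-is.
    return difficulty_range, draw


rng = random.Random(0)
second_range, draw = maybe_increase_range((-1.0, 0.0), 0.3, 0.5, rng)
print(draw, second_range)
```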

Re claims 6 - 7:
6. The system of claim 3, wherein the first test item selection parameters comprise a first probability, wherein the second test item selection parameters comprise a second probability, wherein updating the item response theory model comprises updating a user skill level of the user based on the score, and wherein the computer-readable instructions, when executed, further cause the processor to: 
responsive to updating the user skill level, generate the second probability based on the updated user skill level and the reward value (Gawlick, [0021], “estimate an examinee' s ability using the examinee' s response information relating to a specific set of operational test items”; [0024], “The examinee' s response to the first item is scored and used to update the ability estimate for the examinee”).

7. The system of claim 6, wherein selecting the second test item comprises: 
selecting the second test item from a group of test items of the test item bank according to the second probability, such that a probability of selecting a given test item of the group of test items having a difficulty value determined by the item response theory model is defined by the second probability (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”; [0034], “The expected a posteriori (EAP) estimation method can also be used at every step to obtain an examinee's interim ability which may then be used for selecting subsequent items in subsequent sections of a test battery”; [0043] – [0046], i.e., [0044], “Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP). The ML method involves maximizing the likelihood function over the range ... Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP) …” [0044], “P(Uij | 0) is the probability that an examinee i with ability Ɵ”).

Gawlick does not explicitly disclose item selection that includes a probability distribution.  Brown et al. (US 2016/0217701 A1) teaches systems and methods for analyzing student learning and calibrating the difficulty of questions on a test or examination.  Brown teaches generating a second item difficulty probability distribution based on the first response data (Brown, [0018]; [0034]; [0058]; [0060]; [0062]).  Therefore, in view of Brown, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system/method described in Gawlick by providing the difficulty probability distribution as taught by Brown, since, for example, an expected value or estimate of a probability distribution may provide an estimate of the difficulty of a question, and the variance of the distribution may provide a certainty for that estimate (Brown, [0018]).
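
For illustration only, selecting a test item "according to the second probability" as recited in claim 7 may be sketched as a weighted random draw in which items whose difficulty is close to a target are more likely to be chosen.  The Gaussian-shaped weighting, the function names, and the parameters target and spread are assumptions made for this sketch and do not appear in the claims or in the cited references.

```python
import math
import random


def selection_weights(difficulties, target, spread):
    """Hypothetical weighting: normalized bell-shaped weights centered on
    the target difficulty, so the weights form a probability distribution."""
    raw = [math.exp(-((d - target) ** 2) / (2 * spread ** 2)) for d in difficulties]
    total = sum(raw)
    return [w / total for w in raw]


def pick_item(item_ids, difficulties, target, spread, rng):
    """Draw one item id with probability given by selection_weights."""
    weights = selection_weights(difficulties, target, spread)
    return rng.choices(item_ids, weights=weights, k=1)[0]


rng = random.Random(1)
print(pick_item(["item_a", "item_b", "item_c"], [-1.0, 0.0, 2.0], 0.1, 1.0, rng))
```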

Re claim 12:
12. The system of claim 11, wherein the computer-readable instructions, when executed, further cause the processor to: 
determine that the reward value exceeds a predetermined threshold (Brown, figs. 4 – 6; [0057]; [0073], “If the posterior estimate exceeds the difficulty level Z of the question with a sufficient confidence level, e.g., 0.95 (step 530), then the assessment agent 122 may determine that the student has mastered questions for skill i having that level of difficulty Z”; [0100]); and 
determine that the first random number exceeds a predetermined probability threshold, wherein the second difficulty range is generated responsive to determining that the reward value exceeds the predetermined threshold and that the first random number exceeds the predetermined probability threshold (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”; [0034], “The expected a posteriori (EAP) estimation method can also be used at every step to obtain an examinee's interim ability which may then be used for selecting subsequent items in subsequent sections of a test battery”; [0043] – [0046], i.e., [0044], “Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP). The ML method involves maximizing the likelihood function over the range ... Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP) …” [0044], “P(Uij | 0) is the probability that an examinee i with ability Ɵ”).

Re claim 15:
15. Gawlick teaches a system (Gawlick, Abstract) comprising:
a server that is in electronic communication with a user device associated with a user (Gawlick, [0032], “data associated with any one of the computer adaptive testing modules, test script administration process or interim ability estimation process may be stored on the datastore. The datastore may additionally be configured to receive and/or forward some or all of the stored data”; [0062], “account the examinee's performance”), the server comprising: 
a processor; and a memory device configured to store computer-readable instructions, which, when executed (Gawlick, [0013]), cause the processor to: 
initiate a language test to be taken by the user (Gawlick, fig. 1, “Administer computer implemented test battery”; [0022]; [0065], “math”, “writing”); 
select a first test item for the language test from a test item bank based on a first item difficulty (Gawlick, [0034], “The expected a posteriori (EAP) estimation method can also be used at every step to obtain an examinee's interim ability which may then be used for selecting subsequent items in subsequent sections of a test battery”; [0043] – [0046], i.e., [0044], “Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP). The ML method involves maximizing the likelihood function over the range ... Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP) …” [0044], “P(Uij | 0) is the probability that an examinee i with ability Ɵ”); 
cause the user device to deliver the first test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); 
receive first response data from the user via the user device responsive to the first test item (Gawlick, [0013], “receiving the examinee’s response”);
generate a second item difficulty based on the first response data; select a second test item for the language test from the test item bank based on the second item difficulty (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”; [0034], “The expected a posteriori (EAP) estimation method can also be used at every step to obtain an examinee's interim ability which may then be used for selecting subsequent items in subsequent sections of a test battery”; [0043] – [0046], i.e., [0044], “Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP). The ML method involves maximizing the likelihood function over the range ... Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP) …” [0044], “P(Uij | 0) is the probability that an examinee i with ability Ɵ”); 
cause the user device to deliver the second test item to the user (Gawlick, [0032]; [0033]; “a workstation, computer network, or other like electronic device, examinees' input, answers or responses are received in response to the plurality of test items”); 
receive second response data from the user via the user device responsive to the second test item (Gawlick, [0013], “receiving the examinee’s response”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”); 
determine that an end condition has been met; and responsive to determining that the end condition has been met, end the language test (Gawlick, [0038], “This process continues until the test reaches its full length or satisfies certain termination rules”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”; [0058], “Selection length refers to the number of items contained in a subset of items meeting content and statistical requirements, from which an item is selected to present”; each test section includes a fixed number of items; the test section ends when the termination rule has been met).
Gawlick does not explicitly disclose initiating a spoken language test to be taken by the user; instead, Gawlick teaches an adaptive test that generates test items for subject matter such as math and writing (Gawlick, [0065], “math”, “writing”).

Fujimori et al. (US 2005/0256663 A1) teaches a test system realized in an English ability test (Fujimori, Abstract).  Fujimori further teaches a spoken response; specifically, Fujimori teaches:
initiate a spoken language test to be taken by the user (Fujimori, [0088], “returns an answer in the uttered voice format (step 404)”; [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”); 
select a first test item for the spoken language test from a test item bank based on first test item selection parameters (Fujimori, fig. 1, “Problem database 105”; [0084], “the database storing a number of problems for which the difficulty level”; fig. 4, 402; [0008], “a problem database accessible by the test management server and storing a plurality of problems for which an item parameter including a difficulty level and identifiability is estimated in advance”); 
cause the user device to deliver the first test item to the user (Fujimori, fig. 1, “Problem database 105”; [0084], “the database storing a number of problems for which the difficulty level”; fig. 4, 402); 
receive first spoken response data from the user via the user device responsive to the first test item (Fujimori, [0088], “returns an answer in the uttered voice format (step 404)”; [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”); 
perform an analysis of the first spoken response data (Fujimori, [0085], “a testee uses a microphone (voice input/output device 309 shown in FIG. 3) provided for the personal computer, and inputs an answer in response to a presented question or by voice, and the contents of the utterance is processed as a target to be evaluated”; [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”); 
produce second test item selection parameters by modifying the first test item selection parameters based on the analysis of the first spoken response data (Fujimori, [0008], “With the configuration, the test system presents n problems to one testee, and the ability θ of the testee is evaluated from the response of the testee to the presented n problems. The test management server includes: (1) means for selecting n problems … from the problem database in response to a request transmitted from the first computer, and transmitting the selected problems to the first computer; (2) answer storage means for storing an answer returned from the first computer in response to the problem selected from the problem database and transmitted to the first computer;”); 
select a second test item for the spoken language test from the test item bank based on the second test item selection parameters; cause the user device to deliver the second test item to the user; receive second spoken response data from the user via the user device responsive to the second test item … end the spoken language test  (Fujimori, figs. 5 – 6, “1st trial”, “2nd trial”, “3rd trial”).

The substitution of one known element (the spoken language test shown in Fujimori) for another (the math and writing tests shown in Gawlick) would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, since the substitution of the subject matter shown in Fujimori would have yielded predictable results, namely, providing a test in a presentation format for which it is hard to determine whether an answer is correct or wrong, such as the writing and speaking portions of a foreign language test (Fujimori, Abstract).

Gawlick does not explicitly disclose a difficulty range.  Fujimori teaches a difficulty range (Fujimori, [0127], “the estimated values of θ and difficulty level are in the range of -3.5 to 3.5, and the range of the estimated value of the identifiability is 0.02 to 2.0”).  Therefore, in view of Fujimori, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system/method taught by Gawlick by providing a difficulty range as taught by Fujimori, since it was known in the art to classify a subset of items with similar difficulty levels in one category.

Gawlick does not explicitly disclose a difficulty probability distribution.  Brown et al. (US 2016/0217701 A1) teaches systems and methods for analyzing student learning and calibrating the difficulty of questions on a test or examination.  Brown teaches generating a second item difficulty probability distribution based on the first response data (Brown, [0018]; [0034]; [0058]; [0060]; [0062]).  Therefore, in view of Brown, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system/method described in Gawlick by providing the difficulty probability distribution as taught by Brown, since, for example, an expected value or estimate of a probability distribution may provide an estimate of the difficulty of a question, and the variance of the distribution may provide a certainty for that estimate (Brown, [0018]).
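
For illustration only, the point drawn from Brown, [0018], that the expected value of a probability distribution may serve as a difficulty estimate and its variance as a measure of certainty, may be sketched for a discrete distribution as follows.  The discrete representation, the function name, and the example values are assumptions made for this sketch and are not taken from Brown.

```python
def mean_and_variance(levels, probabilities):
    """Return the expected value and variance of a discrete distribution
    over difficulty levels (probabilities are assumed to sum to 1)."""
    mean = sum(l * p for l, p in zip(levels, probabilities))
    variance = sum(p * (l - mean) ** 2 for l, p in zip(levels, probabilities))
    return mean, variance


# Hypothetical distribution over three difficulty levels.
estimate, uncertainty = mean_and_variance([1.0, 2.0, 3.0], [0.25, 0.5, 0.25])
print(estimate, uncertainty)
```

A narrower distribution over the same levels yields a smaller variance, which corresponds to greater certainty in the difficulty estimate.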

Re claim 16:
16. The system of claim 15, wherein the first spoken response data comprises recorded speech data, and wherein the computer-readable instructions, when executed, further cause the processor to execute a speech recognition algorithm to identify and extract words from the recorded speech data (Fujimori, [0078], “(f) fluency and (g) pronunciation are the evaluation items to be set”). 

Re claims 18 – 19:
18. The system of claim 17, wherein updating the item response theory model comprises: updating a user skill level of the user based on the score, wherein the second item difficulty probability distribution is generated based on the user skill level and the reward value (Gawlick, [0021], “estimate an examinee' s ability using the examinee' s response information relating to a specific set of operational test items”; [0024], “The examinee' s response to the first item is scored and used to update the ability estimate for the examinee”; Brown, [0018]; [0034]; [0058]; [0060]; [0062]).

19. The system of claim 18, wherein a probability of the second test item being selected is defined by the second probability distribution based on the second difficulty value of the second test item (Gawlick, [0013], “at least one subsequent test section is informed based on scores from the one test section or previous test sections”; [0014], “An initial ability estimate for the examinee's responses to the set of the plurality of test items in the one test section is calculated”; [0034], “The expected a posteriori (EAP) estimation method can also be used at every step to obtain an examinee's interim ability which may then be used for selecting subsequent items in subsequent sections of a test battery”; [0043] – [0046], i.e., [0044], “Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP). The ML method involves maximizing the likelihood function over the range ... Two estimation methods commonly used in CAT are maximum likelihood (ML) and expected a posteriori (EAP) …” [0044], “P(Uij | 0) is the probability that an examinee i with ability Ɵ”; Brown, [0018]; [0034]; [0058]; [0060]; [0062]).

Re claim 20:
20. The system of claim 15, wherein determining that the end condition has been met comprises determining that a predetermined number of test items have been delivered to the user via the user device (Gawlick, [0038], “This process continues until the test reaches its full length or satisfies certain termination rules”; [0070], “This fixed-length adaptive battery contains two sections. Section 1 consists of 28 items. Section 2 consists of 30 items”; [0058], “Selection length refers to the number of items contained in a subset of items meeting content and statistical requirements, from which an item is selected to present”; each test section includes a fixed number of items; the test section ends when the termination rule has been met).

Response to Arguments
Applicant's arguments filed 8/4/2022 have been fully considered but they are not persuasive. 
Applicant argues:
Applicant's specification further details systems and methods for addressing these specific technical challenges present in the field of spoken language tests using automatic speech recognition.
The Office respectfully submits that Applicant's claims do not include additional elements that, either alone or in combination, are sufficient to claim a practical application.  To the extent that elements such as "speech recognition" are claimed, they merely add insignificant extra-solution activity to the judicial exception (e.g., pre-solution activity of data gathering and post-solution activity of presenting data) and/or do no more than generally link the use of a judicial exception to a particular technological environment or field of use.

Applicant argues: 
There is no teaching or suggestion of modifying test item selection parameters that are used to select test items for a spoken language test in the cited portions of Gawlick. Rather, the system in Gawlick generates separate ability assessments that are associated with the user, not with test items. The ability assessments in Gawlick are not modified, but rather new ability assessments are generated at different stages of the testing process.
The Office respectfully submits that Gawlick teaches an examinee’s ability level (Gawlick, [0021], “estimate an examinee' s ability”) and an item’s difficulty level (Gawlick, [0062]).  Gawlick suggests that the ability estimate of a student may be updated based on the student’s response to the item.  See Gawlick, [0024], “The examinee's response to the first item is scored and used to update the ability estimate for the examinee”; [0075], “After administering the second item, ability is estimated based on that updated prior ability distribution and responses to the first two items. In this way, the prior and ability estimates can be consecutively updated after each subsequent item is administered until the fixed test length is reached”.  Furthermore, the claims require “update the test item selection parameters by increasing the first difficulty range to a second difficulty range based on the analysis of the first spoken response data.”  Gawlick teaches an “Adaptive Testing Process” where “more accurate ability estimates throughout the testing process …  at each step further improves selection efficiency and ensures a closer match between difficulty of selected items and examinee abilities.”  Gawlick explicitly suggests that items are continuously selected based on the examinee’s ability as updated from previous responses ([0075], “the prior and ability estimates can be consecutively updated after each subsequent item is administered”).
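
For illustration only, the consecutive updating of a prior and an ability estimate after each administered item, as described in the passages of Gawlick quoted above, may be sketched with an expected a posteriori (EAP) estimate computed over a grid of ability values.  The one-parameter logistic (Rasch) response model, the uniform starting prior, the grid, and all function names are assumptions made for this sketch; Gawlick's actual model and parameters are not reproduced here.

```python
import math


def rasch_p(theta, b):
    """Assumed response model: probability of a correct answer for
    ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap_update(grid, prior, b, correct):
    """Multiply the prior by the likelihood of one scored response,
    renormalize, and return (posterior, EAP ability estimate)."""
    likelihood = [rasch_p(t, b) if correct else 1.0 - rasch_p(t, b) for t in grid]
    posterior = [p * l for p, l in zip(prior, likelihood)]
    total = sum(posterior)
    posterior = [p / total for p in posterior]
    eap = sum(t * p for t, p in zip(grid, posterior))
    return posterior, eap


grid = [-3.0 + 0.5 * i for i in range(13)]   # ability values from -3 to 3
prior = [1.0 / len(grid)] * len(grid)        # uniform starting prior (assumption)
prior, ability = eap_update(grid, prior, 0.0, True)    # first item answered correctly
prior, ability = eap_update(grid, prior, 0.0, False)   # second item answered incorrectly
print(ability)
```

Each posterior becomes the prior for the next item, so the estimate is refined after every response until a termination rule is met.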

Applicant argues: 
These cited portions of Gawlick discuss randomly selecting items, but do not discuss use of random number to perform this random selection. Moreover, with respect to the difficulty ranges recited in claim 9, the Office Action notes on page 7 that a "close match (instead exact match) means there is a range of acceptable values for item difficulty." However, this assertion is not supported by any evidence in Gawlick, as Gawlick does not discuss any difficulty ranges. The discussion of determining a closest match between user ability and item difficulty in the cited portions of Gawlick is not the same as the language in claim 9 reciting changing difficulty ranges based on random numbers.
The Office respectfully submits that Gawlick states “The first item can then be selected at random from a subset of items (where the subset may be specified to contain from one to n items). There are various methods for selecting the item to present (e.g., one whose difficulty is closest to the ability estimate, one with the most information at the ability estimate).”  It is inherent that a random number (from 1 to n) has to be generated to select an item from a subset.
The newly cited reference Fujimori teaches a difficulty range (Fujimori, [0127], “the estimated values of θ and difficulty level are in the range of -3.5 to 3.5, and the range of the estimated value of the identifiability is 0.02 to 2.0”).  
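
For illustration only, selecting an item at random from a subset of n items by generating a random number from 1 to n may be sketched as follows.  The function name and item identifiers are hypothetical, and the sketch is not asserted to be Gawlick's implementation.

```python
import random


def pick_from_subset(subset, rng):
    """Generate a random integer k in 1..n and return (k, k-th item)."""
    n = len(subset)
    k = rng.randint(1, n)       # inclusive on both ends
    return k, subset[k - 1]     # convert the 1-based draw to a 0-based index


rng = random.Random(3)
print(pick_from_subset(["item_1", "item_2", "item_3"], rng))
```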

Applicant argues: 
Claim 15 recites, in part "generate a second item difficulty probability distribution based on the first spoken response data; [and] select a second test item for the spoken language test from the test item bank based on the second item difficulty probability distribution" (emphasis added). Applicant submits that Gawlick and Gray, either alone or in combination, fail to teach or suggest at least these features recited in claim 15.
Brown et al. (US 2016/0217701 A1) teaches the difficulty probability distribution as required in claim 15. 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACK YIP whose telephone number is (571)270-5048. The examiner can normally be reached Monday through Friday, 9:00 AM - 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, XUAN THAI can be reached on (571) 272-7147. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JACK YIP/Primary Examiner, Art Unit 3715