Notice of Pre-AIA or AIA Status
	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
	Claims 1-20 have been examined.

Claim Rejections - 35 U.S.C. § 101
35 U.S.C. § 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

	Claims 1-20 are rejected under 35 U.S.C. § 101. The claimed invention is directed to abstract ideas (“mental steps” and “mathematical concepts”) without significantly more.
	The claims recite:
hierarchical prediction

hierarchical prediction domain

prediction inputs

hierarchical predictive relationships

prediction nodes

hierarchical predictive position

analysis

predictive output

Claim 1
	Step 1 inquiry: Does this claim fall within a statutory category?

	The preamble of the claim recites “1. A computer-implemented method for generating a predictive output based at least in part on one or more prediction inputs, the computer-implemented comprising…” The claim is therefore a “method” (or “process”), which is a statutory category of invention. The answer to the inquiry is: “YES”.

Step 2A (Prong One) inquiry:

	Are there limitations in Claim 1 that recite abstract ideas?

	YES. The following limitations in Claim 1 recite abstract ideas that fall within at least one of the groupings of abstract ideas enumerated in the 2019 PEG. Specifically, they are “mental steps” and “mathematical concepts”:

obtaining access to an online machine learning model, wherein (i) the online machine learning model is configured to perform an online machine learning analysis associated with a hierarchical prediction domain based at least in part on one or more prediction inputs to generate one or more structurally hierarchical predictions, (ii) the hierarchical prediction domain is associated with one or more hierarchical predictive relationships among a plurality of prediction nodes, (iii) the one or more hierarchical predictive relationships define, for each of the plurality of prediction nodes, a corresponding hierarchical predictive position, and (iv) each structurally hierarchical prediction is determined based at least in part on the hierarchical predictive position of the corresponding prediction node;

performing the online predictive machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally hierarchical predictions for the one or more prediction inputs; and

generating, based at least in part on the one or more structurally hierarchical predictions, the predictive output, wherein the predictive output indicates a selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions.
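
For illustration only, the kind of computation described by the limitations quoted above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Applicant's disclosed implementation: the claim does not specify how a node's hierarchical predictive position enters the prediction, so the rule used here (a node's score may not exceed its parent's score, and the deepest node meeting a cutoff is selected) is an assumption, as are the names `PARENT`, `RAW`, `node_depths`, `hierarchical_scores`, `select_node`, and the cutoff value.

```python
# Hypothetical hierarchy: each prediction node maps to its parent (None = root).
PARENT = {"root": None, "a": "root", "b": "root", "a1": "a"}
# Hypothetical raw per-node scores, e.g. as produced by some upstream model.
RAW = {"root": 0.9, "a": 0.4, "b": 0.7, "a1": 0.8}


def node_depths(parent):
    """Return each node's depth in the hierarchy (root has depth 0)."""
    depths = {}

    def depth(node):
        if node not in depths:
            p = parent[node]
            depths[node] = 0 if p is None else depth(p) + 1
        return depths[node]

    for node in parent:
        depth(node)
    return depths


def hierarchical_scores(parent, raw):
    """Cap each node's raw score by its parent's final score (assumed rule)."""
    depths = node_depths(parent)
    final = {}
    # Visit shallow nodes first so a parent's final score exists before its children.
    for node in sorted(parent, key=lambda n: depths[n]):
        p = parent[node]
        final[node] = raw[node] if p is None else min(raw[node], final[p])
    return final


def select_node(parent, raw, cutoff=0.5):
    """Select the deepest node whose final score meets the cutoff, or None."""
    depths = node_depths(parent)
    final = hierarchical_scores(parent, raw)
    candidates = [n for n in final if final[n] >= cutoff]
    if not candidates:
        return None
    # Prefer deeper nodes; break ties by higher final score.
    return max(candidates, key=lambda n: (depths[n], final[n]))
```

In this sketch, node `a1` has a high raw score but is capped by its parent `a`, so its position in the hierarchy changes the outcome.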

Step 2A (Prong Two) inquiry:

Are there additional elements or a combination of elements in the claim that apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that it is more than a drafting effort designed to monopolize the exception?

Applicant’s claims contain the following “additional elements”:
	(1) online
	(2) A machine learning model

	The term “online” is a broad term which is described at a high level. Applicant’s Specification recites:

[0245] Embodiments of the subject matter described herein can be implemented in a computing system that includes a back-end component, e.g., as an information/data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client device having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital information/data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

[0246] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits information/data (e.g., a Hypertext Markup Language (HTML) page) to a client device (e.g., for purposes of displaying information/data to and receiving user input from a user interacting with the client device). Information/data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

This “online” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “machine learning model” is a broad term which is described at a high level. Applicant’s Specification recites:

[0106] FIGS. 4A and 4B depict two example ensemble architectures utilizing all of the four above-mentioned ML models. However, a person of ordinary skill in the art will recognize that the four mentioned ML models can be utilized individually or in any particular combination of two or more of the four mentioned ML models. Furthermore, a person of ordinary skill in the art will recognize that, if two or more ML models are utilized to generate a prediction (e.g., all four mentioned ML models are utilized to generate a prediction), the four ML models may be organized in accordance with any ensemble architecture, including an ensemble architecture that is different from either or both of the ensemble architectures depicted in FIGS. 4A and 4B. Moreover, a person of ordinary skill in the art will recognize that one or more of each of the four mentioned ML models may be utilized in combination of one or more other ML models in accordance with various ensemble architectures to generate a multi-model prediction framework. Thus, the depiction of example ensemble architectures in FIGS. 4A and 4B, and the accompanying description of the noted example ensemble architectures provided herein, is not meant to be limiting as to the scope of the present invention. 

[0107] FIG. 4A is an operational flow diagram for an ensemble architecture 410 with an online ML model for processing structured input data, a co-occurrence analysis ML model for processing structured input data, a structured fusion ML model for combining structure-based predictions, and an unstructured fusion ML model for combining structure-based predictions and non-structure-based predictions, where the ensemble architecture 410 performs a structured fusion before performing an unstructured fusion. As depicted in the ensemble architecture 410, the online learning unit 111 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with an online ML model to generate one or more online learning predictions. Moreover, the co-occurrence analysis unit 112 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with a co-occurrence analysis ML model to generate one or more co-occurrence analysis predictions. In some embodiments, both the one or more online learning predictions and the one or more co-occurrence analysis predictions are structure-based predictions, i.e., predictions generated based on structured input data.

This “machine learning model” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	The answer to the inquiry is “NO”; no additional elements integrate the claimed abstract idea into a practical application.

Step 2B inquiry:
Does the claim provide an inventive concept, i.e., does the claim recite additional element(s) or a combination of elements that amount to significantly more than the judicial exception in the claim?

Applicant’s claims contain the following “additional elements”: 
	(1) online
	(2) A machine learning model

	The term “online” is a broad term which is described at a high level. Applicant’s Specification recites:

[0245] Embodiments of the subject matter described herein can be implemented in a computing system that includes a back-end component, e.g., as an information/data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client device having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital information/data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

[0246] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits information/data (e.g., a Hypertext Markup Language (HTML) page) to a client device (e.g., for purposes of displaying information/data to and receiving user input from a user interacting with the client device). Information/data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “machine learning model” is a broad term which is described at a high level. Applicant’s Specification recites:

[0106] FIGS. 4A and 4B depict two example ensemble architectures utilizing all of the four above-mentioned ML models. However, a person of ordinary skill in the art will recognize that the four mentioned ML models can be utilized individually or in any particular combination of two or more of the four mentioned ML models. Furthermore, a person of ordinary skill in the art will recognize that, if two or more ML models are utilized to generate a prediction (e.g., all four mentioned ML models are utilized to generate a prediction), the four ML models may be organized in accordance with any ensemble architecture, including an ensemble architecture that is different from either or both of the ensemble architectures depicted in FIGS. 4A and 4B. Moreover, a person of ordinary skill in the art will recognize that one or more of each of the four mentioned ML models may be utilized in combination of one or more other ML models in accordance with various ensemble architectures to generate a multi-model prediction framework. Thus, the depiction of example ensemble architectures in FIGS. 4A and 4B, and the accompanying description of the noted example ensemble architectures provided herein, is not meant to be limiting as to the scope of the present invention. 

[0107] FIG. 4A is an operational flow diagram for an ensemble architecture 410 with an online ML model for processing structured input data, a co-occurrence analysis ML model for processing structured input data, a structured fusion ML model for combining structure-based predictions, and an unstructured fusion ML model for combining structure-based predictions and non-structure-based predictions, where the ensemble architecture 410 performs a structured fusion before performing an unstructured fusion. As depicted in the ensemble architecture 410, the online learning unit 111 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with an online ML model to generate one or more online learning predictions. Moreover, the co-occurrence analysis unit 112 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with a co-occurrence analysis ML model to generate one or more co-occurrence analysis predictions. In some embodiments, both the one or more online learning predictions and the one or more co-occurrence analysis predictions are structure-based predictions, i.e., predictions generated based on structured input data.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	Therefore, the answer to the inquiry is “NO”; no additional elements provide an inventive concept that is significantly more than the claimed abstract ideas.

	Claim 1 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 2
	Claim 2 recites:

obtaining access to a co-occurrence analysis machine learning model, wherein (i) the co-occurrence analysis machine learning model is configured to perform a co-occurrence machine learning analysis associated with the hierarchical prediction domain based at least in part on the one or more prediction inputs to generate one or more structurally non-hierarchically predictions, (ii) the co-occurrence predictive analysis machine learning model is associated with a predictive co-occurrence score between each feature-node pair of a predictive feature set of one or more predictive feature sets and a prediction node of the plurality nodes, and (iii) each of the one or more structurally non-hierarchical predictions is determined without regard to the hierarchical predictive position of the corresponding prediction node;

performing the co-occurrence machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally non-hierarchical predictions for the one or more prediction inputs; and

generating the predictive output based at least in part on the one or more structurally non-hierarchical predictions in addition to the one or more structurally hierarchical predictions.

	Applicant’s Claim 2 merely recites non-hierarchical predictions and hierarchical predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 2 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
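
For illustration only, a co-occurrence score between feature-node pairs of the kind recited in Claim 2 can be sketched as follows. This is a minimal sketch under stated assumptions, not Applicant's disclosed implementation: the claim does not define how the score is computed, so the conditional frequency used here (the fraction of records containing a feature that are also labeled with a node) is an assumption, as are the names `fit_cooccurrence` and `predict_nodes` and the record layout. Consistent with limitation (iii), the sketch ignores any hierarchy among the nodes.

```python
from collections import Counter, defaultdict


def fit_cooccurrence(records):
    """Score each (feature, node) pair by how often the node accompanies the feature.

    `records` is an iterable of (features, nodes) pairs. The score is the
    fraction of records containing the feature that are also labeled with the
    node (an assumed definition, for illustration only).
    """
    feature_counts = Counter()
    pair_counts = Counter()
    for features, nodes in records:
        for f in set(features):
            feature_counts[f] += 1
            for n in set(nodes):
                pair_counts[(f, n)] += 1
    return {(f, n): c / feature_counts[f] for (f, n), c in pair_counts.items()}


def predict_nodes(scores, features):
    """Rank nodes by summed co-occurrence score, without regard to any hierarchy."""
    present = set(features)
    totals = defaultdict(float)
    for (f, n), s in scores.items():
        if f in present:
            totals[n] += s
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
```

Given hypothetical records such as `[(["fever", "cough"], ["A"]), (["fever"], ["B"]), (["cough"], ["A"])]`, calling `predict_nodes(scores, ["cough"])` ranks node `A` first because every record containing `cough` is labeled `A`.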

Claim 3
	Claim 3 recites:

obtaining one or more structurally non-hierarchically predictions, wherein each of the one or more structurally non-hierarchical predictions is determined without regard to the hierarchical predictive position of the corresponding prediction node;

obtaining access to a structured fusion machine learning model, wherein the structured fusion machine learning model is configured to perform a structured fusion machine learning analysis based at least in part on the one or more the structurally non-hierarchically predictions and the one or more structurally non-hierarchically predictions to generate one or more structure-based predictions;

performing the structured fusion machine learning analysis to generate the one or more structure-based predictions; and

generating the predictive output based at least in part on the one or more structure-based predictions.

	Applicant’s Claim 3 merely recites machine learning and predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 3 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 4
	Claim 4 recites:

the one or more prediction inputs comprise one or more unstructured prediction inputs; and

generating the predictive output further comprises:

generating one or more structure-based predictions based at least in part on the one or more structurally hierarchical predictions;

generating one or more non-structure-based predictions based at least in part on the one or more unstructured prediction inputs;

obtaining an unstructured fusion machine learning model, wherein (i) the unstructured fusion machine learning model is configured to perform an unstructured fusion machine learning analysis based at least in part on the one or more structure-based predictions and the one or more non-structure-based predictions to generate one or more unstructured-fused predictions; and (ii) the unstructured fusion machine learning analysis comprises retraining one or more structured machine learning models comprising the online machine learning model based at least in part on the one or more non-structure-based predictions;

performing the unstructured fusion machine learning analysis to generate the one or more non-structure-based predictions; and

generating the predictive output based at least in part on the unstructured-fused predictions.

	Applicant’s Claim 4 merely recites fused predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 4 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 5
	Claim 5 recites:

the one or more prediction inputs comprise one or more medical feature inputs for a patient profile;

the predictive output comprises at least one human phenotype ontology label prediction for the patient profile; and

the one or more predictive hierarchical relationships comprise one or more human phenotype ontology dependency relationships.

	Applicant’s Claim 5 merely recites inputs, labels, and relationships. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 5 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 6
	Claim 6 recites:

wherein the online machine learning model is a follow-the-regularized leader machine learning model.

	Applicant’s Claim 6 merely recites a type of machine learning model. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 6 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
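
For illustration only, a follow-the-regularized-leader model of the general type named in Claim 6 can be sketched as follows. This is a minimal sketch of the generic per-coordinate FTRL-Proximal update for binary logistic regression, not Applicant's disclosed implementation; the class name `FTRLProximal`, the hyperparameter defaults, and the choice of logistic loss are all assumptions.

```python
import math


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for binary logistic regression (illustrative)."""

    def __init__(self, dim, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # accumulated adjusted gradients
        self.n = [0.0] * dim  # accumulated squared gradients

    def _weight(self, i):
        """Closed-form weight for coordinate i from the accumulated state."""
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0  # L1 regularization keeps this coordinate at zero
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        """Return the predicted probability of the positive class."""
        s = sum(self._weight(i) * xi for i, xi in enumerate(x))
        s = max(-35.0, min(35.0, s))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """Process one (x, y) example online; y is 0 or 1."""
        p = self.predict(x)
        for i, xi in enumerate(x):
            if xi == 0.0:
                continue
            g = (p - y) * xi  # gradient of the logistic loss for coordinate i
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p
```

The model is "online" in the sense that each call to `update` consumes a single example and adjusts the state immediately, with no separate batch training pass.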

Claim 7
	Claim 7 recites:

the selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions comprises a threshold number of selected prediction nodes; and

the threshold number of selected prediction nodes is determined based at least in part on one or more online machine learning parameters of the online machine learning model.

	Applicant’s Claim 7 merely recites prediction nodes. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 7 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 8
	Step 1 inquiry: Does this claim fall within a statutory category?

	The preamble of the claim recites “8. An apparatus comprising…” The claim is therefore an “apparatus”, which is a statutory category of invention. The answer to the inquiry is: “YES”.

Step 2A (Prong One) inquiry:

	Are there limitations in Claim 8 that recite abstract ideas?

	YES. The following limitations in Claim 8 recite abstract ideas that fall within at least one of the groupings of abstract ideas enumerated in the 2019 PEG. Specifically, they are “mental steps” and “mathematical concepts”:

obtain access to an online machine learning model, wherein (i) the online machine learning model is configured to perform an online machine learning analysis associated with a hierarchical prediction domain based at least in part on one or more prediction inputs to generate one or more structurally hierarchical predictions, (ii) the hierarchical prediction domain is associated with one or more hierarchical predictive relationships among a plurality of prediction nodes, (iii) the one or more hierarchical predictive relationships define, for each of the plurality of prediction nodes, a corresponding hierarchical predictive position, and (iv) each structurally hierarchical prediction is determined based at least in part on the hierarchical predictive position of the corresponding prediction node;

perform the online predictive machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally hierarchical predictions for the one or more prediction inputs; and

generate, based at least in part on the one or more structurally hierarchical predictions, the predictive output, wherein the predictive output indicates a selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions.

Step 2A (Prong Two) inquiry:

Are there additional elements or a combination of elements in the claim that apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that it is more than a drafting effort designed to monopolize the exception?

Applicant’s claims contain the following “additional elements”:
	(1) online
	(2) A machine learning model
	(3) A processor
	(4) A memory

	The term “online” is a broad term which is described at a high level. Applicant’s Specification recites:

[0245] Embodiments of the subject matter described herein can be implemented in a computing system that includes a back-end component, e.g., as an information/data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client device having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital information/data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

[0246] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits information/data (e.g., a Hypertext Markup Language (HTML) page) to a client device (e.g., for purposes of displaying information/data to and receiving user input from a user interacting with the client device). Information/data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

This “online” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “machine learning model” is a broad term which is described at a high level. Applicant’s Specification recites:

[0106] FIGS. 4A and 4B depict two example ensemble architectures utilizing all of the four above-mentioned ML models. However, a person of ordinary skill in the art will recognize that the four mentioned ML models can be utilized individually or in any particular combination of two or more of the four mentioned ML models. Furthermore, a person of ordinary skill in the art will recognize that, if two or more ML models are utilized to generate a prediction (e.g., all four mentioned ML models are utilized to generate a prediction), the four ML models may be organized in accordance with any ensemble architecture, including an ensemble architecture that is different from either or both of the ensemble architectures depicted in FIGS. 4A and 4B. Moreover, a person of ordinary skill in the art will recognize that one or more of each of the four mentioned ML models may be utilized in combination of one or more other ML models in accordance with various ensemble architectures to generate a multi-model prediction framework. Thus, the depiction of example ensemble architectures in FIGS. 4A and 4B, and the accompanying description of the noted example ensemble architectures provided herein, is not meant to be limiting as to the scope of the present invention. 

[0107] FIG. 4A is an operational flow diagram for an ensemble architecture 410 with an online ML model for processing structured input data, a co-occurrence analysis ML model for processing structured input data, a structured fusion ML model for combining structure-based predictions, and an unstructured fusion ML model for combining structure-based predictions and non-structure-based predictions, where the ensemble architecture 410 performs a structured fusion before performing an unstructured fusion. As depicted in the ensemble architecture 410, the online learning unit 111 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with an online ML model to generate one or more online learning predictions. Moreover, the co-occurrence analysis unit 112 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with a co-occurrence analysis ML model to generate one or more co-occurrence analysis predictions. In some embodiments, both the one or more online learning predictions and the one or more co-occurrence analysis predictions are structure-based predictions, i.e., predictions generated based on structured input data.

This “machine learning model” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “processor” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0089] As shown in FIG. 2, in one embodiment, the classification computing entity 106 may include or be in communication with one or more processing elements 205 (also referred to as processors, processing circuitry, and/or similar terms used herein interchangeably) that communicate with other elements within the classification computing entity 106 via a bus, for example. As will be understood, the processing element 205 may be embodied in a number of different ways. For example, the processing element 205 may be embodied as one or more Complex Programmable Logic Devices (CPLDs), microprocessors, multi-core processors, coprocessing entities, Application-Specific Instruction-Set Processors (ASIPs), microcontrollers, and/or controllers. Further, the processing element 205 may be embodied as one or more other processing devices or circuitry. The term circuitry may refer to an entirely hardware embodiment or a combination of hardware and computer program products. Thus, the processing element 205 may be embodied as integrated circuits, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Programmable Logic Arrays (PLAs), hardware accelerators, other circuitry, and/or the like. As will therefore be understood, the processing element 205 may be configured for a particular use or configured to execute instructions stored in volatile or non-volatile media or otherwise accessible to the processing element 205. As such, whether configured by hardware or computer program products, or by a combination thereof, the processing element 205 may be capable of performing steps or operations according to embodiments of the present invention when configured accordingly.


	[0243] The processes and logic flows described herein can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input information/data and generating output. Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and information/data from a read-only memory, a random access memory, or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive information/data from or transfer information/data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Devices suitable for storing computer program instructions and information/data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

This “processor” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “memory” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0077] In one embodiment, a non-volatile computer-readable storage medium may include a floppy disk, flexible disk, hard disk, solid-state storage (SSS) (e.g., a solid state drive (SSD), solid state card (SSC), solid state module (SSM), enterprise flash drive, magnetic tape, or any other non-transitory magnetic medium, and/or the like. A non-volatile computer-readable storage medium may also include a punch card, paper tape, optical mark sheet (or any other physical medium with patterns of holes or other optically recognizable indicia), compact disc read only memory (CD-ROM), compact disc-rewritable (CD-RW), digital versatile disc (DVD), Blu-ray disc (BD), any other non-transitory optical medium, and/or the like. Such a non-volatile computer-readable storage medium may also include read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory (e.g., Serial, NAND, NOR, and/or the like), multimedia memory cards (MMC), secure digital (SD) memory cards, SmartMedia cards, CompactFlash (CF) cards, Memory Sticks, and/or the like. Further, a non-volatile computer-readable storage medium may also include conductive-bridging random access memory (CBRAM), phase-change random access memory (PRAM), ferroelectric random-access memory (FeRAM), non-volatile random-access memory (NVRAM), magnetoresistive random-access memory (MRAM), resistive random-access memory (RRAM), Silicon-Oxide-Nitride-Oxide-Silicon memory (SONOS), floating junction gate random access memory (FJG RAM), Millipede memory, racetrack memory, and/or the like.

	[0078] In one embodiment, a volatile computer-readable storage medium may include random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), fast page mode dynamic random access memory (FPM DRAM), extended data-out dynamic random access memory (EDO DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), double data rate type two synchronous dynamic random access memory (DDR2 SDRAM), double data rate type three synchronous dynamic random access memory (DDR3 SDRAM), Rambus dynamic random access memory (RDRAM), Twin Transistor RAM (TTRAM), Thyristor RAM (T-RAM), Zero-capacitor (Z-RAM), Rambus in-line memory module (RIMM), dual in-line memory module (DIMM), single in-line memory module (SIMM), video random access memory (VRAM), cache memory (including various levels), flash memory, register memory, and/or the like. It will be appreciated that where embodiments are described to use a computer-readable storage medium, other types of computer-readable storage media may be substituted for or used in addition to the computer-readable storage media described above.

This “memory” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	The answer to the inquiry is “NO”: no additional elements integrate the claimed abstract idea into a practical application.

Step 2B inquiry:
Does the claim provide an inventive concept, i.e., does the claim recite additional element(s) or a combination of elements that amount to significantly more than the judicial exception in the claim?

Applicant’s claims contain the following “additional elements”: 
	(1) online
	(2) A machine learning model
	(3) A processor
	(4) A memory

	“Online” is a broad term which is described at a high level. Applicant’s Specification recites:

[0245] Embodiments of the subject matter described herein can be implemented in a computing system that includes a back-end component, e.g., as an information/data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client device having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital information/data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

[0246] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits information/data (e.g., a Hypertext Markup Language (HTML) page) to a client device (e.g., for purposes of displaying information/data to and receiving user input from a user interacting with the client device). Information/data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “machine learning model” is a broad term which is described at a high level. Applicant’s Specification recites:

[0106] FIGS. 4A and 4B depict two example ensemble architectures utilizing all of the four above-mentioned ML models. However, a person of ordinary skill in the art will recognize that the four mentioned ML models can be utilized individually or in any particular combination of two or more of the four mentioned ML models. Furthermore, a person of ordinary skill in the art will recognize that, if two or more ML models are utilized to generate a prediction (e.g., all four mentioned ML models are utilized to generate a prediction), the four ML models may be organized in accordance with any ensemble architecture, including an ensemble architecture that is different from either or both of the ensemble architectures depicted in FIGS. 4A and 4B. Moreover, a person of ordinary skill in the art will recognize that one or more of each of the four mentioned ML models may be utilized in combination of one or more other ML models in accordance with various ensemble architectures to generate a multi-model prediction framework. Thus, the depiction of example ensemble architectures in FIGS. 4A and 4B, and the accompanying description of the noted example ensemble architectures provided herein, is not meant to be limiting as to the scope of the present invention. 

[0107] FIG. 4A is an operational flow diagram for an ensemble architecture 410 with an online ML model for processing structured input data, a co-occurrence analysis ML model for processing structured input data, a structured fusion ML model for combining structure-based predictions, and an unstructured fusion ML model for combining structure-based predictions and non-structure-based predictions, where the ensemble architecture 410 performs a structured fusion before performing an unstructured fusion. As depicted in the ensemble architecture 410, the online learning unit 111 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with an online ML model to generate one or more online learning predictions. Moreover, the co-occurrence analysis unit 112 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with a co-occurrence analysis ML model to generate one or more co-occurrence analysis predictions. In some embodiments, both the one or more online learning predictions and the one or more co-occurrence analysis predictions are structure-based predictions, i.e., predictions generated based on structured input data.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “processor” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0089] As shown in FIG. 2, in one embodiment, the classification computing entity 106 may include or be in communication with one or more processing elements 205 (also referred to as processors, processing circuitry, and/or similar terms used herein interchangeably) that communicate with other elements within the classification computing entity 106 via a bus, for example. As will be understood, the processing element 205 may be embodied in a number of different ways. For example, the processing element 205 may be embodied as one or more Complex Programmable Logic Devices (CPLDs), microprocessors, multi-core processors, coprocessing entities, Application-Specific Instruction-Set Processors (ASIPs), microcontrollers, and/or controllers. Further, the processing element 205 may be embodied as one or more other processing devices or circuitry. The term circuitry may refer to an entirely hardware embodiment or a combination of hardware and computer program products. Thus, the processing element 205 may be embodied as integrated circuits, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Programmable Logic Arrays (PLAs), hardware accelerators, other circuitry, and/or the like. As will therefore be understood, the processing element 205 may be configured for a particular use or configured to execute instructions stored in volatile or non-volatile media or otherwise accessible to the processing element 205. As such, whether configured by hardware or computer program products, or by a combination thereof, the processing element 205 may be capable of performing steps or operations according to embodiments of the present invention when configured accordingly.

	[0243] The processes and logic flows described herein can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input information/data and generating output. Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and information/data from a read-only memory, a random access memory, or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive information/data from or transfer information/data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Devices suitable for storing computer program instructions and information/data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “memory” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0077] In one embodiment, a non-volatile computer-readable storage medium may include a floppy disk, flexible disk, hard disk, solid-state storage (SSS) (e.g., a solid state drive (SSD), solid state card (SSC), solid state module (SSM), enterprise flash drive, magnetic tape, or any other non-transitory magnetic medium, and/or the like. A non-volatile computer-readable storage medium may also include a punch card, paper tape, optical mark sheet (or any other physical medium with patterns of holes or other optically recognizable indicia), compact disc read only memory (CD-ROM), compact disc-rewritable (CD-RW), digital versatile disc (DVD), Blu-ray disc (BD), any other non-transitory optical medium, and/or the like. Such a non-volatile computer-readable storage medium may also include read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory (e.g., Serial, NAND, NOR, and/or the like), multimedia memory cards (MMC), secure digital (SD) memory cards, SmartMedia cards, CompactFlash (CF) cards, Memory Sticks, and/or the like. Further, a non-volatile computer-readable storage medium may also include conductive-bridging random access memory (CBRAM), phase-change random access memory (PRAM), ferroelectric random-access memory (FeRAM), non-volatile random-access memory (NVRAM), magnetoresistive random-access memory (MRAM), resistive random-access memory (RRAM), Silicon-Oxide-Nitride-Oxide-Silicon memory (SONOS), floating junction gate random access memory (FJG RAM), Millipede memory, racetrack memory, and/or the like.

	[0078] In one embodiment, a volatile computer-readable storage medium may include random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), fast page mode dynamic random access memory (FPM DRAM), extended data-out dynamic random access memory (EDO DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), double data rate type two synchronous dynamic random access memory (DDR2 SDRAM), double data rate type three synchronous dynamic random access memory (DDR3 SDRAM), Rambus dynamic random access memory (RDRAM), Twin Transistor RAM (TTRAM), Thyristor RAM (T-RAM), Zero-capacitor (Z-RAM), Rambus in-line memory module (RIMM), dual in-line memory module (DIMM), single in-line memory module (SIMM), video random access memory (VRAM), cache memory (including various levels), flash memory, register memory, and/or the like. It will be appreciated that where embodiments are described to use a computer-readable storage medium, other types of computer-readable storage media may be substituted for or used in addition to the computer-readable storage media described above.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	Therefore, the answer to the inquiry is “NO”: no additional elements provide an inventive concept that is significantly more than the claimed abstract ideas.

	Claim 8 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 9
	Claim 9 recites:

obtaining access to a co-occurrence analysis machine learning model, wherein (i) the co-occurrence analysis machine learning model is configured to perform a co-occurrence machine learning analysis associated with the hierarchical prediction domain based at least in part on the one or more prediction inputs to generate one or more structurally non-hierarchically predictions, (ii) the co-occurrence predictive analysis machine learning model is associated with a predictive co-occurrence score between each feature-node pair of a predictive feature set of one or more predictive feature sets and a prediction node of the plurality nodes, and (iii) each of the one or more structurally non-hierarchical predictions is determined without regard to the hierarchical predictive position of the corresponding prediction node;

performing the co-occurrence machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally non-hierarchical predictions for the one or more prediction inputs; and

generating the predictive output based at least in part on the one or more structurally non-hierarchical predictions in addition to the one or more structurally hierarchical predictions.

	Applicant’s Claim 9 merely recites non-hierarchical predictions and hierarchical predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 9 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
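
	For illustration only, the co-occurrence analysis recited in Claim 9 can be pictured with a minimal sketch. The sketch assumes that a “predictive co-occurrence score” is the fraction of training records containing a feature that also carry a given prediction node, and that a prediction sums those scores with no reference to any hierarchy. The function names, the scoring formula, and the sample records are hypothetical and are not taken from Applicant’s Specification.

```python
from collections import Counter, defaultdict


def fit_cooccurrence(records):
    """Return {(feature, node): score} from (features, nodes) training pairs.

    Assumed score: share of records containing `feature` that also
    carry `node`. This formula is an illustrative assumption.
    """
    pair_counts = Counter()
    feature_counts = Counter()
    for features, nodes in records:
        for f in set(features):
            feature_counts[f] += 1
            for n in set(nodes):
                pair_counts[(f, n)] += 1
    return {pair: c / feature_counts[pair[0]] for pair, c in pair_counts.items()}


def predict_flat(scores, features):
    """Rank nodes by summed co-occurrence score, ignoring hierarchy entirely."""
    totals = defaultdict(float)
    for (f, n), s in scores.items():
        if f in features:
            totals[n] += s
    # Highest score first; ties broken by node name for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: feature sets paired with node labels.
records = [
    ({"fever", "cough"}, {"A"}),
    ({"fever"}, {"B"}),
    ({"cough"}, {"A"}),
]
scores = fit_cooccurrence(records)
ranking = predict_flat(scores, {"fever", "cough"})
```

	Under these assumptions every step is counting, division, and addition over labeled records.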

Claim 10
	Claim 10 recites:

obtaining one or more structurally non-hierarchically predictions, wherein each of the one or more structurally non-hierarchical predictions is determined without regard to the hierarchical predictive position of the corresponding prediction node;

obtaining access to a structured fusion machine learning model, wherein the structured fusion machine learning model is configured to perform a structured fusion machine learning analysis based at least in part on the one or more the structurally non-hierarchically predictions and the one or more structurally non-hierarchically predictions to generate one or more structure-based predictions;

performing the structured fusion machine learning analysis to generate the one or more structure-based predictions; and

generating the predictive output based at least in part on the one or more structure-based predictions.

	Applicant’s Claim 10 merely recites machine learning and predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 10 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 11
	Claim 11 recites:

the one or more prediction inputs comprise one or more unstructured prediction inputs; and

generating the predictive output further comprises:

generating one or more structure-based predictions based at least in part on the one or more structurally hierarchical predictions;

generating one or more non-structure-based predictions based at least in part on the one or more unstructured prediction inputs;

obtaining an unstructured fusion machine learning model, wherein (i) the unstructured fusion machine learning model is configured to perform an unstructured fusion machine learning analysis based at least in part on the one or more structure-based predictions and the one or more non-structure-based predictions to generate one or more unstructured-fused predictions; and (ii) the unstructured fusion machine learning analysis comprises retraining one or more structured machine learning models comprising the online machine learning model based at least in part on the one or more non-structure-based predictions;

performing the unstructured fusion machine learning analysis to generate the one or more non-structure-based predictions; and

generating the predictive output based at least in part on the unstructured-fused predictions.

	Applicant’s Claim 11 merely recites fused predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 11 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 12
	Claim 12 recites:

the one or more prediction inputs comprise one or more medical feature inputs for a patient profile;

the predictive output comprises at least one human phenotype ontology label prediction for the patient profile; and

the one or more predictive hierarchical relationships comprise one or more human phenotype ontology dependency relationships.

	Applicant’s Claim 12 merely recites inputs, labels, and relationships. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 12 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
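
	For illustration only, the following sketch shows one way dependency relationships among prediction nodes could define a hierarchical position and constrain a prediction. It assumes a single-parent tree (an actual ontology may allow several parents per node) and assumes a rule that a node's score may not exceed its parent's score. The node names, the rule, and the helper names are hypothetical and are not taken from Applicant’s Specification.

```python
def depth_of(node, parent):
    """Hierarchical position: number of edges from `node` up to the root."""
    depth = 0
    while parent.get(node) is not None:
        node = parent[node]
        depth += 1
    return depth


def enforce_hierarchy(raw_scores, parent):
    """Cap each node's score by its parent's adjusted score.

    Nodes are visited shallowest first, so a parent is always
    adjusted before any of its children.
    """
    adjusted = {}
    for node in sorted(raw_scores, key=lambda n: depth_of(n, parent)):
        p = parent.get(node)
        cap = adjusted.get(p, 1.0) if p is not None else 1.0
        adjusted[node] = min(raw_scores[node], cap)
    return adjusted


# Hypothetical three-level hierarchy: root -> abnormality_A -> abnormality_A1
parent = {"root": None, "abnormality_A": "root", "abnormality_A1": "abnormality_A"}
raw = {"root": 0.9, "abnormality_A": 0.4, "abnormality_A1": 0.7}
adjusted = enforce_hierarchy(raw, parent)
```

	In this sketch the child node's raw score of 0.7 is lowered to 0.4 because its parent scored 0.4, which is a comparison of numbers along a tree.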

Claim 13
	Claim 13 recites:

wherein the online machine learning model is a follow-the-regularized leader machine learning model.

	Applicant’s Claim 13 merely recites a type of machine learning model. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 13 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
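
	For illustration only, a follow-the-regularized-leader model of the kind named in Claim 13 can be sketched as the widely published per-coordinate FTRL-Proximal update for logistic regression. The class name, the default hyperparameters, and the toy training data below are assumptions chosen for the example; nothing here is asserted to be Applicant’s implementation.

```python
import math


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for binary logistic regression (sketch)."""

    def __init__(self, dim, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # accumulated adjusted gradients
        self.n = [0.0] * dim  # accumulated squared gradients

    def _weight(self, i):
        # Closed-form weight; the L1 term zeroes small coordinates.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        """x is a sparse example given as {feature_index: value}."""
        s = sum(self._weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example x with label y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of log loss for coordinate i
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


# Hypothetical toy stream: feature 0 signals label 1, feature 1 signals label 0.
model = FTRLProximal(dim=2)
for _ in range(50):
    model.update({0: 1.0}, 1)
    model.update({1: 1.0}, 0)
```

	Each update consists of arithmetic on two running sums per coordinate, performed one example at a time.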

Claim 14
	Claim 14 recites:

the selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions comprises a threshold number of selected prediction nodes; and

the threshold number of selected prediction nodes is determined based at least in part on one or more online machine learning parameters of the online machine learning model.

	Applicant’s Claim 14 merely recites prediction nodes. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 14 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
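
	For illustration only, selecting a threshold number of prediction nodes can be sketched as a top-k selection in which k is read from the model's parameters. The parameter key "max_selected_nodes", its default value, and the function names are hypothetical; the claim does not specify how the threshold number is derived.

```python
def threshold_from_params(params, default=3):
    """Read the threshold number k from model parameters (assumed key name)."""
    return int(params.get("max_selected_nodes", default))


def select_nodes(scores, k):
    """Return the k highest-scoring nodes; ties are broken by node name."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:k]]


# Hypothetical scores and parameters.
node_scores = {"a": 0.2, "b": 0.9, "c": 0.5}
k = threshold_from_params({"max_selected_nodes": 2})
selected = select_nodes(node_scores, k)
```

	Under these assumptions the selection is a sort followed by truncation.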

Claim 15
	Step 1 inquiry: Does this claim fall within a statutory category?

	The preamble of the claim recites “15. A non-transitory computer storage medium comprising instructions configured to cause one or more processors to at least at least perform operations configured to at least…” The claim thus recites a “computer storage medium”, which is NOT a proper “computer readable medium”. It therefore fails to recite a proper computerized “product of manufacture” and fails to recite a statutory category of invention. Therefore, the answer to the inquiry is: “NO”.

Step 2A (Prong One) inquiry:

	Are there limitations in Claim 15 that recite abstract ideas?

	YES. The following limitations in Claim 15 recite abstract ideas that fall within at least one of the groupings of abstract ideas enumerated in the 2019 PEG. Specifically, they are “mental steps” and “mathematical concepts”:

obtain access to an online machine learning model, wherein (i) the online machine learning model is configured to perform an online machine learning analysis associated with a hierarchical prediction domain based at least in part on one or more prediction inputs to generate one or more structurally hierarchical predictions, (ii) the hierarchical prediction domain is associated with one or more hierarchical predictive relationships among a plurality of prediction nodes, (iii) the one or more hierarchical predictive relationships define, for each of the plurality of prediction nodes, a corresponding hierarchical predictive position, and (iv) each structurally hierarchical prediction is determined based at least in part on the hierarchical predictive position of the corresponding prediction node;

perform the online predictive machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally hierarchical predictions for the one or more prediction inputs; and

generate, based at least in part on the one or more structurally hierarchical predictions, the predictive output, wherein the predictive output indicates a selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions.

Step 2A (Prong Two) inquiry:

Are there additional elements or a combination of elements in the claim that apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that it is more than a drafting effort designed to monopolize the exception?

Applicant’s claims contain the following “additional elements”:
	(1) online
	(2) A machine learning model
	(3) A processor
	(4) A memory

	“Online” is a broad term which is described at a high level. Applicant’s Specification recites:

[0245] Embodiments of the subject matter described herein can be implemented in a computing system that includes a back-end component, e.g., as an information/data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client device having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital information/data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

[0246] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits information/data (e.g., a Hypertext Markup Language (HTML) page) to a client device (e.g., for purposes of displaying information/data to and receiving user input from a user interacting with the client device). Information/data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

This “online” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “machine learning model” is a broad term which is described at a high level. Applicant’s Specification recites:

[0106] FIGS. 4A and 4B depict two example ensemble architectures utilizing all of the four above-mentioned ML models. However, a person of ordinary skill in the art will recognize that the four mentioned ML models can be utilized individually or in any particular combination of two or more of the four mentioned ML models. Furthermore, a person of ordinary skill in the art will recognize that, if two or more ML models are utilized to generate a prediction (e.g., all four mentioned ML models are utilized to generate a prediction), the four ML models may be organized in accordance with any ensemble architecture, including an ensemble architecture that is different from either or both of the ensemble architectures depicted in FIGS. 4A and 4B. Moreover, a person of ordinary skill in the art will recognize that one or more of each of the four mentioned ML models may be utilized in combination of one or more other ML models in accordance with various ensemble architectures to generate a multi-model prediction framework. Thus, the depiction of example ensemble architectures in FIGS. 4A and 4B, and the accompanying description of the noted example ensemble architectures provided herein, is not meant to be limiting as to the scope of the present invention. 

[0107] FIG. 4A is an operational flow diagram for an ensemble architecture 410 with an online ML model for processing structured input data, a co-occurrence analysis ML model for processing structured input data, a structured fusion ML model for combining structure-based predictions, and an unstructured fusion ML model for combining structure-based predictions and non-structure-based predictions, where the ensemble architecture 410 performs a structured fusion before performing an unstructured fusion. As depicted in the ensemble architecture 410, the online learning unit 111 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with an online ML model to generate one or more online learning predictions. Moreover, the co-occurrence analysis unit 112 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with a co-occurrence analysis ML model to generate one or more co-occurrence analysis predictions. In some embodiments, both the one or more online learning predictions and the one or more co-occurrence analysis predictions are structure-based predictions, i.e., predictions generated based on structured input data.

This “machine learning model” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “processor” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0089] As shown in FIG. 2, in one embodiment, the classification computing entity 106 may include or be in communication with one or more processing elements 205 (also referred to as processors, processing circuitry, and/or similar terms used herein interchangeably) that communicate with other elements within the classification computing entity 106 via a bus, for example. As will be understood, the processing element 205 may be embodied in a number of different ways. For example, the processing element 205 may be embodied as one or more Complex Programmable Logic Devices (CPLDs), microprocessors, multi-core processors, coprocessing entities, Application-Specific Instruction-Set Processors (ASIPs), microcontrollers, and/or controllers. Further, the processing element 205 may be embodied as one or more other processing devices or circuitry. The term circuitry may refer to an entirely hardware embodiment or a combination of hardware and computer program products. Thus, the processing element 205 may be embodied as integrated circuits, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Programmable Logic Arrays (PLAs), hardware accelerators, other circuitry, and/or the like. As will therefore be understood, the processing element 205 may be configured for a particular use or configured to execute instructions stored in volatile or non-volatile media or otherwise accessible to the processing element 205. As such, whether configured by hardware or computer program products, or by a combination thereof, the processing element 205 may be capable of performing steps or operations according to embodiments of the present invention when configured accordingly.

	[0243] The processes and logic flows described herein can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input information/data and generating output. Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and information/data from a read-only memory, a random access memory, or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive information/data from or transfer information/data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Devices suitable for storing computer program instructions and information/data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

This “processor” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	A “memory” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0077] In one embodiment, a non-volatile computer-readable storage medium may include a floppy disk, flexible disk, hard disk, solid-state storage (SSS) (e.g., a solid state drive (SSD), solid state card (SSC), solid state module (SSM), enterprise flash drive, magnetic tape, or any other non-transitory magnetic medium, and/or the like. A non-volatile computer-readable storage medium may also include a punch card, paper tape, optical mark sheet (or any other physical medium with patterns of holes or other optically recognizable indicia), compact disc read only memory (CD-ROM), compact disc-rewritable (CD-RW), digital versatile disc (DVD), Blu-ray disc (BD), any other non-transitory optical medium, and/or the like. Such a non-volatile computer-readable storage medium may also include read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory (e.g., Serial, NAND, NOR, and/or the like), multimedia memory cards (MMC), secure digital (SD) memory cards, SmartMedia cards, CompactFlash (CF) cards, Memory Sticks, and/or the like. Further, a non-volatile computer-readable storage medium may also include conductive-bridging random access memory (CBRAM), phase-change random access memory (PRAM), ferroelectric random-access memory (FeRAM), non-volatile random-access memory (NVRAM), magnetoresistive random-access memory (MRAM), resistive random-access memory (RRAM), Silicon-Oxide-Nitride-Oxide-Silicon memory (SONOS), floating junction gate random access memory (FJG RAM), Millipede memory, racetrack memory, and/or the like.

	[0078] In one embodiment, a volatile computer-readable storage medium may include random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), fast page mode dynamic random access memory (FPM DRAM), extended data-out dynamic random access memory (EDO DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), double data rate type two synchronous dynamic random access memory (DDR2 SDRAM), double data rate type three synchronous dynamic random access memory (DDR3 SDRAM), Rambus dynamic random access memory (RDRAM), Twin Transistor RAM (TTRAM), Thyristor RAM (T-RAM), Zero-capacitor (Z-RAM), Rambus in-line memory module (RIMM), dual in-line memory module (DIMM), single in-line memory module (SIMM), video random access memory (VRAM), cache memory (including various levels), flash memory, register memory, and/or the like. It will be appreciated that where embodiments are described to use a computer-readable storage medium, other types of computer-readable storage media may be substituted for or used in addition to the computer-readable storage media described above.

This “memory” limitation represents “insignificant extra-solution activity”. (See, M.P.E.P. § 2106.05(I)(A)).

	The answer to the inquiry is “NO”, no additional elements integrate the claimed abstract idea into a practical application.

Step 2B inquiry:
Does the claim provide an inventive concept, i.e., does the claim recite additional element(s) or a combination of elements that amount to significantly more than the judicial exception in the claim?

Applicant’s claims contain the following “additional elements”: 
	(1) Online
	(2) A machine learning model
	(3) A processor
	(4) A memory

	“Online” is a broad term which is described at a high level. Applicant’s Specification recites:

[0245] Embodiments of the subject matter described herein can be implemented in a computing system that includes a back-end component, e.g., as an information/data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client device having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital information/data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

[0246] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits information/data (e.g., a Hypertext Markup Language (HTML) page) to a client device (e.g., for purposes of displaying information/data to and receiving user input from a user interacting with the client device). Information/data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “machine learning model” is a broad term which is described at a high level. Applicant’s Specification recites:

[0106] FIGS. 4A and 4B depict two example ensemble architectures utilizing all of the four above-mentioned ML models. However, a person of ordinary skill in the art will recognize that the four mentioned ML models can be utilized individually or in any particular combination of two or more of the four mentioned ML models. Furthermore, a person of ordinary skill in the art will recognize that, if two or more ML models are utilized to generate a prediction (e.g., all four mentioned ML models are utilized to generate a prediction), the four ML models may be organized in accordance with any ensemble architecture, including an ensemble architecture that is different from either or both of the ensemble architectures depicted in FIGS. 4A and 4B. Moreover, a person of ordinary skill in the art will recognize that one or more of each of the four mentioned ML models may be utilized in combination of one or more other ML models in accordance with various ensemble architectures to generate a multi-model prediction framework. Thus, the depiction of example ensemble architectures in FIGS. 4A and 4B, and the accompanying description of the noted example ensemble architectures provided herein, is not meant to be limiting as to the scope of the present invention. 

[0107] FIG. 4A is an operational flow diagram for an ensemble architecture 410 with an online ML model for processing structured input data, a co-occurrence analysis ML model for processing structured input data, a structured fusion ML model for combining structure-based predictions, and an unstructured fusion ML model for combining structure-based predictions and non-structure-based predictions, where the ensemble architecture 410 performs a structured fusion before performing an unstructured fusion. As depicted in the ensemble architecture 410, the online learning unit 111 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with an online ML model to generate one or more online learning predictions. Moreover, the co-occurrence analysis unit 112 retrieves the structured input data 121 from the storage subsystem and processes the structured input data 121 in accordance with a co-occurrence analysis ML model to generate one or more co-occurrence analysis predictions. In some embodiments, both the one or more online learning predictions and the one or more co-occurrence analysis predictions are structure-based predictions, i.e., predictions generated based on structured input data.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “processor” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0089] As shown in FIG. 2, in one embodiment, the classification computing entity 106 may include or be in communication with one or more processing elements 205 (also referred to as processors, processing circuitry, and/or similar terms used herein interchangeably) that communicate with other elements within the classification computing entity 106 via a bus, for example. As will be understood, the processing element 205 may be embodied in a number of different ways. For example, the processing element 205 may be embodied as one or more Complex Programmable Logic Devices (CPLDs), microprocessors, multi-core processors, coprocessing entities, Application-Specific Instruction-Set Processors (ASIPs), microcontrollers, and/or controllers. Further, the processing element 205 may be embodied as one or more other processing devices or circuitry. The term circuitry may refer to an entirely hardware embodiment or a combination of hardware and computer program products. Thus, the processing element 205 may be embodied as integrated circuits, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Programmable Logic Arrays (PLAs), hardware accelerators, other circuitry, and/or the like. As will therefore be understood, the processing element 205 may be configured for a particular use or configured to execute instructions stored in volatile or non-volatile media or otherwise accessible to the processing element 205. As such, whether configured by hardware or computer program products, or by a combination thereof, the processing element 205 may be capable of performing steps or operations according to embodiments of the present invention when configured accordingly.


	[0243] The processes and logic flows described herein can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input information/data and generating output. Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and information/data from a read-only memory, a random access memory, or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive information/data from or transfer information/data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Devices suitable for storing computer program instructions and information/data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	A “memory” is a broad term which is described at a high level. Applicant’s Specification recites:

	[0077] In one embodiment, a non-volatile computer-readable storage medium may include a floppy disk, flexible disk, hard disk, solid-state storage (SSS) (e.g., a solid state drive (SSD), solid state card (SSC), solid state module (SSM)), enterprise flash drive, magnetic tape, or any other non-transitory magnetic medium, and/or the like. A non-volatile computer-readable storage medium may also include a punch card, paper tape, optical mark sheet (or any other physical medium with patterns of holes or other optically recognizable indicia), compact disc read only memory (CD-ROM), compact disc-rewritable (CD-RW), digital versatile disc (DVD), Blu-ray disc (BD), any other non-transitory optical medium, and/or the like. Such a non-volatile computer-readable storage medium may also include read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory (e.g., Serial, NAND, NOR, and/or the like), multimedia memory cards (MMC), secure digital (SD) memory cards, SmartMedia cards, CompactFlash (CF) cards, Memory Sticks, and/or the like. Further, a non-volatile computer-readable storage medium may also include conductive-bridging random access memory (CBRAM), phase-change random access memory (PRAM), ferroelectric random-access memory (FeRAM), non-volatile random-access memory (NVRAM), magnetoresistive random-access memory (MRAM), resistive random-access memory (RRAM), Silicon-Oxide-Nitride-Oxide-Silicon memory (SONOS), floating junction gate random access memory (FJG RAM), Millipede memory, racetrack memory, and/or the like.

	[0078] In one embodiment, a volatile computer-readable storage medium may include random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), fast page mode dynamic random access memory (FPM DRAM), extended data-out dynamic random access memory (EDO DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), double data rate type two synchronous dynamic random access memory (DDR2 SDRAM), double data rate type three synchronous dynamic random access memory (DDR3 SDRAM), Rambus dynamic random access memory (RDRAM), Twin Transistor RAM (TTRAM), Thyristor RAM (T-RAM), Zero-capacitor (Z-RAM), Rambus in-line memory module (RIMM), dual in-line memory module (DIMM), single in-line memory module (SIMM), video random access memory (VRAM), cache memory (including various levels), flash memory, register memory, and/or the like. It will be appreciated that where embodiments are described to use a computer-readable storage medium, other types of computer-readable storage media may be substituted for or used in addition to the computer-readable storage media described above.

Therefore, the claim as a whole does not amount to significantly more than the exception itself (i.e., there is no inventive concept in the claim). (See, M.P.E.P. § 2106.05(II)).

	Therefore, the answer to the inquiry is “NO”, no additional elements provide an inventive concept that amounts to significantly more than the claimed abstract ideas.

	Claim 15 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 16
	Claim 16 recites:

obtaining access to a co-occurrence analysis machine learning model, wherein (i) the co-occurrence analysis machine learning model is configured to perform a co-occurrence machine learning analysis associated with the hierarchical prediction domain based at least in part on the one or more prediction inputs to generate one or more structurally non-hierarchically predictions, (ii) the co-occurrence predictive analysis machine learning model is associated with a predictive co-occurrence score between each feature-node pair of a predictive feature set of one or more predictive feature sets and a prediction node of the plurality nodes, and (iii) each of the one or more structurally non-hierarchical predictions is determined without regard to the hierarchical predictive position of the corresponding prediction node;

performing the co-occurrence machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally non-hierarchical predictions for the one or more prediction inputs; and

generating the predictive output based at least in part on the one or more structurally non-hierarchical predictions in addition to the one or more structurally hierarchical predictions.

	Applicant’s Claim 16 merely recites non-hierarchical predictions and hierarchical predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 16 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
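
	For illustration only, the kind of feature-node co-occurrence scoring recited in Claim 16 may be sketched as follows. This is a minimal sketch under stated assumptions and is not Applicant's disclosed implementation: the score is assumed to be the fraction of training records containing a feature that are also labelled with a node, and the names cooccurrence_scores, flat_predict, and the sample records are hypothetical. Note that the scoring never consults a node's position in any hierarchy.

```python
from collections import Counter


def cooccurrence_scores(records):
    """Return {(feature, node): score}, where score estimates P(node | feature).

    Assumption: a simple count ratio stands in for the claimed
    "predictive co-occurrence score".
    """
    pair_counts = Counter()
    feature_counts = Counter()
    for features, nodes in records:
        for f in set(features):
            feature_counts[f] += 1
            for n in set(nodes):
                pair_counts[(f, n)] += 1
    return {pair: c / feature_counts[pair[0]] for pair, c in pair_counts.items()}


def flat_predict(scores, features, nodes):
    """Score each node by averaging its co-occurrence scores over the inputs.

    The node hierarchy is deliberately ignored (a "flat" prediction).
    """
    features = list(features)
    if not features:
        return {n: 0.0 for n in nodes}
    return {
        n: sum(scores.get((f, n), 0.0) for f in features) / len(features)
        for n in nodes
    }


# Hypothetical training records: (observed features, labelled nodes).
records = [
    ({"f1", "f2"}, {"A"}),
    ({"f1"}, {"A", "B"}),
    ({"f2"}, {"B"}),
]
scores = cooccurrence_scores(records)
prediction = flat_predict(scores, ["f1"], ["A", "B"])
```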

Claim 17
	Claim 17 recites:

obtaining one or more structurally non-hierarchically predictions, wherein each of the one or more structurally non-hierarchical predictions is determined without regard to the hierarchical predictive position of the corresponding prediction node;

obtaining access to a structured fusion machine learning model, wherein the structured fusion machine learning model is configured to perform a structured fusion machine learning analysis based at least in part on the one or more the structurally non-hierarchically predictions and the one or more structurally non-hierarchically predictions to generate one or more structure-based predictions;

performing the structured fusion machine learning analysis to generate the one or more structure-based predictions; and

generating the predictive output based at least in part on the one or more structure-based predictions.

	Applicant’s Claim 17 merely recites machine learning and predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 17 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
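
	For illustration only, combining two sets of per-node predictions into a single set of structure-based predictions may be sketched as follows. This is a minimal sketch under stated assumptions: a fixed convex combination stands in for the claimed structured fusion machine learning model, whose form is not specified in the quoted claim language, and the function name fuse and its weight parameter are hypothetical.

```python
def fuse(hierarchical, flat, weight=0.5):
    """Blend two {node: score} maps with a fixed weight (an assumption;
    a trained fusion model would learn how to combine them)."""
    nodes = set(hierarchical) | set(flat)
    return {
        n: weight * hierarchical.get(n, 0.0) + (1.0 - weight) * flat.get(n, 0.0)
        for n in nodes
    }


# Hypothetical per-node scores from the two upstream analyses.
fused = fuse({"A": 1.0, "B": 0.0}, {"A": 0.5, "B": 0.5})
```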

Claim 18
	Claim 18 recites:

the one or more prediction inputs comprise one or more unstructured prediction inputs; and

generating the predictive output further comprises:

generating one or more structure-based predictions based at least in part on the one or more structurally hierarchical predictions;

generating one or more non-structure-based predictions based at least in part on the one or more unstructured prediction inputs;

obtaining an unstructured fusion machine learning model, wherein (i) the unstructured fusion machine learning model is configured to perform an unstructured fusion machine learning analysis based at least in part on the one or more structure-based predictions and the one or more non-structure-based predictions to generate one or more unstructured-fused predictions; and (ii) the unstructured fusion machine learning analysis comprises retraining one or more structured machine learning models comprising the online machine learning model based at least in part on the one or more non-structure-based predictions;

performing the unstructured fusion machine learning analysis to generate the one or more non-structure-based predictions; and

generating the predictive output based at least in part on the unstructured-fused predictions.

	Applicant’s Claim 18 merely recites fused predictions. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 18 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.

Claim 19
	Claim 19 recites:

the one or more prediction inputs comprise one or more medical feature inputs for a patient profile;

the predictive output comprises at least one human phenotype ontology label prediction for the patient profile; and

the one or more predictive hierarchical relationships comprise one or more human phenotype ontology dependency relationships.

	Applicant’s Claim 19 merely recites inputs, labels, and relationships. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 19 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
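
	For illustration only, ontology dependency relationships of the kind recited in Claim 19, and the hierarchical position they define for each node, may be sketched as follows. This is a minimal sketch under stated assumptions: the hierarchy is assumed to be a tree stored as a child-to-parent map, a node's position is assumed to be its depth, and the node names are hypothetical placeholders rather than actual human phenotype ontology identifiers.

```python
def ancestors(node, parent):
    """Return the ancestors of `node`, nearest first, by walking the parent map."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def depth(node, parent):
    """Hierarchical position of a node, taken here to be its distance from the root."""
    return len(ancestors(node, parent))


def close_under_ancestors(labels, parent):
    """If a label is predicted, every more general ancestor label is implied."""
    closed = set(labels)
    for label in labels:
        closed.update(ancestors(label, parent))
    return closed


# Hypothetical three-level hierarchy (child -> parent).
parent = {
    "neuro_abnormality": "phenotypic_abnormality",
    "seizure": "neuro_abnormality",
}
implied = close_under_ancestors({"seizure"}, parent)
```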

Claim 20
	Claim 20 recites:

wherein the online machine learning model is a follow-the-regularized leader machine learning model.

	Applicant’s Claim 20 merely recites a type of machine learning model. It does not integrate the abstract idea into a practical application, nor does it amount to significantly more than the abstract idea. (See, M.P.E.P. § 2106.05(a)(II).)
	Claim 20 is, therefore, NOT ELIGIBLE subject matter under 35 U.S.C. § 101.
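
	For illustration only, a follow-the-regularized-leader model of the type named in Claim 20 may be sketched as follows. This is a minimal sketch under stated assumptions: it uses the per-coordinate FTRL-Proximal variant for logistic regression, which the claim does not specify, and the class name, hyperparameter values, and sample stream are hypothetical. Each call to update processes one example as it arrives, which is what makes the procedure an online one.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse inputs."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.1, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # accumulated adjusted gradients
        self.n = defaultdict(float)  # accumulated squared gradients

    def _weight(self, i):
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0  # the L1 term keeps weakly supported weights at exactly zero
        sign = 1.0 if z > 0 else -1.0
        scale = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / scale

    def predict(self, x):
        """Probability of the positive label for a sparse input {feature: value}."""
        s = sum(self._weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """Consume one labelled example (y is 0 or 1) and adjust the state."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for coordinate i
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


# Hypothetical stream: feature "a" signals label 1, feature "b" signals label 0.
model = FTRLProximal()
for _ in range(50):
    model.update({"a": 1.0}, 1)
    model.update({"b": 1.0}, 0)
```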

Claim Rejections - 35 U.S.C. § 103
	In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

	The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

	Claims 1, 8, and 15 are rejected under 35 U.S.C. § 103 as being unpatentable over Fu, et al., CNN With Coarse-to-Fine Layer for Hierarchical Classification, IET Comput. Vis., Vol. 12 Iss. 6, 2018, pp. 892-899, in view of Teng, Time-ordered Online Training of Convolutional Neural Networks, Doctoral Thesis, Carnegie Mellon University, 2019, pp. 1-132 in their entireties. Specifically:

Claim 1
	Claim 1's “obtaining access to an online machine learning model, wherein (i) the online machine learning model is configured to perform an online machine learning analysis associated with a hierarchical prediction domain based at least in part on one or more prediction inputs to generate one or more structurally hierarchical predictions, (ii) the hierarchical prediction domain is associated with one or more hierarchical predictive relationships among a plurality of prediction nodes, (iii) the one or more hierarchical predictive relationships define, for each of the plurality of prediction nodes, a corresponding hierarchical predictive position, and (iv) each structurally hierarchical prediction is determined based at least in part on the hierarchical predictive position of the corresponding prediction node;” is primarily taught by Fu, et al., page 892, left column, last full paragraph, where it recites:

In this paper, we introduce Bayesian techniques into the hierarchical classification of CNNs and propose a hierarchical CNN architecture called coarse-to-fine CNN. A coarse-to-fine CNN is simple, only with a coarse-to-fine layer on the top of a generic CNN. The coarse-to-fine layer is inspired by the Bayesian equation. One of the layer's inputs is the result of coarse classifications; it will directly affect the fine classification, which is the layer's output, in both training phase and testing phase. Compare to other related works, this operation is more consistent with human understandable concepts. In this paper, we are interested only in tree-based hierarchical class structure (a node can only have one parent node). DAG based (a node can have more than one parent node) is out of the scope of this paper.

	Further, “hierarchical predictions” are primarily taught by Fu, et al., page 898, left column, last full paragraph, where it recites:

In this paper, we propose a novel hierarchical CNN architecture, called coarse-to-fine CNN. It is an end-to-end model, with a simpler structure than other related hierarchical CNNs. The model outputs multiple hierarchical predictions from coarse to fine simultaneously and the coarse prediction affects the fine prediction directly using a coarse-to-fine layer, which is inspired by the Bayesian equation. The proposed coarse-to-fine layer has a clear forward and backward propagations. Our models improve over the corresponding baseline CNNs on several benchmark datasets. Besides, we reveal that the performance of coarse-to-fine CNNs is influenced by different category trees. For future work, it is important to explore the optimal structure of category tree for coarse-to-fine CNNs.
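
	For illustration only, the way a coarse prediction can directly affect a fine prediction through a Bayesian relationship may be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the layer of Fu, et al.: it applies the product rule P(fine) = P(fine | coarse) x P(coarse) over a tree in which each fine class has exactly one coarse parent, and the class names and probabilities are hypothetical.

```python
def coarse_to_fine(coarse_probs, conditional_probs):
    """Combine coarse-class probabilities with per-group conditional
    probabilities to obtain a probability for every fine class."""
    fine = {}
    for coarse, p_coarse in coarse_probs.items():
        for label, p_cond in conditional_probs[coarse].items():
            fine[label] = p_coarse * p_cond
    return fine


# Hypothetical two-level category tree with made-up probabilities.
coarse = {"animal": 0.75, "vehicle": 0.25}
conditional = {
    "animal": {"cat": 0.75, "dog": 0.25},
    "vehicle": {"car": 1.0},
}
fine = coarse_to_fine(coarse, conditional)
best = max(fine, key=fine.get)
```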

	The claimed “online machine learning model” is not expressly taught by Fu, et al. The “online machine learning model” is, however, taught by Teng, page iv, second full paragraph, where it recites:

In this work, we tackle the problem of online, real-time training of a deep learning algorithm. We refer to this type of problem as time-ordered online training (ToOT), of which the UAS use case is an example. We begin with the observation that in human students, learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. They may have help from a mentor or teacher, but must use this assistance wisely and effectively. For a real-time learning algorithm onboard a UAS, this means being able to learn from an incoming video feed while minimizing the human annotations and time required to do so, and moving around the environment to augment its learning. We define a metric, incremental training benefit (ITB) per annotation, that seeks to capture the value extracted for each annotation provided by the user.

	Rationale - It would have been obvious for one of ordinary skill in the art to substitute the online data source of Teng for the unspecified data source of Fu, et al. because it predictably allows the system to learn from an incoming feed while minimizing the human annotations and time required to do so.

	Claim 1's “performing the online predictive machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally hierarchical predictions for the one or more prediction inputs; and” is not expressly taught by Fu, et al. The “online machine learning model” is, however, taught by Teng, page iv, second full paragraph, where it recites:

In this work, we tackle the problem of online, real-time training of a deep learning algorithm. We refer to this type of problem as time-ordered online training (ToOT), of which the UAS use case is an example. We begin with the observation that in human students, learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. They may have help from a mentor or teacher, but must use this assistance wisely and effectively. For a real-time learning algorithm onboard a UAS, this means being able to learn from an incoming video feed while minimizing the human annotations and time required to do so, and moving around the environment to augment its learning. We define a metric, incremental training benefit (ITB) per annotation, that seeks to capture the value extracted for each annotation provided by the user.

	Rationale - It would have been obvious for one of ordinary skill in the art to substitute the online data source of Teng for the unspecified data source of Fu, et al. because it predictably allows the system to learn from an incoming feed while minimizing the human annotations and time required to do so.

	Claim 1's “generating, based at least in part on the one or more structurally hierarchical predictions, the predictive output, wherein the predictive output indicates a selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions.” is primarily taught by Fu, et al., page 892, right column, first partial paragraph, where it recites:

…techniques into the design, making the coarse-to-fine CNNs able to learn the hierarchical category tree. It needs much less additional parameters to convert a traditional flat CNN to perform a coarse-to-fine task than other hierarchical CNN architectures. We validate our model on MNIST, CIFAR-10, and CIFAR-100 datasets, showing clear benefits.

	Further, it is taught by Fu, et al., page 894, right column, first full paragraph, where it recites:

We evaluate coarse-to-fine CNNs on the benchmark datasets MNIST (in Section 4.2), CIFAR-10 (in Section 4.3), and CIFAR-100 (in Section 4.4) by comparing the performance of coarse-to-fine CNN models with their corresponding baseline traditional CNN models. While all the above coarse-to-fine CNNs are only about 2-level problems, an additional experiment on CIFAR-10 is carried in Section 4.5 to see that our model also can be applied to multi-level learning. Section 4.6 provides visual results of the classification of the proposed architecture; it gives a better understanding of how coarse-to-fine CNNs work. Finally, we will have a summary of all the experiments at the end of this section. Our construction of the category trees follows the principle of [24]: they rely on the fact that there exist natural groupings amongst object instances such as shape, colour, scale and location.

Claim 8
	Claim 8's “obtain access to an online machine learning model, wherein (i) the online machine learning model is configured to perform an online machine learning analysis associated with a hierarchical prediction domain based at least in part on one or more prediction inputs to generate one or more structurally hierarchical predictions, (ii) the hierarchical prediction domain is associated with one or more hierarchical predictive relationships among a plurality of prediction nodes, (iii) the one or more hierarchical predictive relationships define, for each of the plurality of prediction nodes, a corresponding hierarchical predictive position, and (iv) each structurally hierarchical prediction is determined based at least in part on the hierarchical predictive position of the corresponding prediction node;” is primarily taught by Fu, et al., page 892, left column, last full paragraph, where it recites:

In this paper, we introduce Bayesian techniques into the hierarchical classification of CNNs and propose a hierarchical CNN architecture called coarse-to-fine CNN. A coarse-to-fine CNN is simple, only with a coarse-to-fine layer on the top of a generic CNN. The coarse-to-fine layer is inspired by the Bayesian equation. One of the layer's inputs is the result of coarse classifications; it will directly affect the fine classification, which is the layer's output, in both training phase and testing phase. Compare to other related works, this operation is more consistent with human understandable concepts. In this paper, we are interested only in tree-based hierarchical class structure (a node can only have one parent node). DAG based (a node can have more than one parent node) is out of the scope of this paper.

	Further, “hierarchical predictions” are primarily taught by Fu, et al., page 898, left column, last full paragraph, where it recites:

In this paper, we propose a novel hierarchical CNN architecture, called coarse-to-fine CNN. It is an end-to-end model, with a simpler structure than other related hierarchical CNNs. The model outputs multiple hierarchical predictions from coarse to fine simultaneously and the coarse prediction affects the fine prediction directly using a coarse-to-fine layer, which is inspired by the Bayesian equation. The proposed coarse-to-fine layer has a clear forward and backward propagations. Our models improve over the corresponding baseline CNNs on several benchmark datasets. Besides, we reveal that the performance of coarse-to-fine CNNs is influenced by different category trees. For future work, it is important to explore the optimal structure of category tree for coarse-to-fine CNNs.

	The claimed “online machine learning model” is not expressly taught by Fu, et al. The “online machine learning model” is, however, taught by Teng, page iv, second full paragraph, where it recites:

In this work, we tackle the problem of online, real-time training of a deep learning algorithm. We refer to this type of problem as time-ordered online training (ToOT), of which the UAS use case is an example. We begin with the observation that in human students, learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. They may have help from a mentor or teacher, but must use this assistance wisely and effectively. For a real-time learning algorithm onboard a UAS, this means being able to learn from an incoming video feed while minimizing the human annotations and time required to do so, and moving around the environment to augment its learning. We define a metric, incremental training benefit (ITB) per annotation, that seeks to capture the value extracted for each annotation provided by the user.

	Rationale - It would have been obvious for one of ordinary skill in the art to substitute the online data source of Teng for the unspecified data source of Fu, et al. because it predictably allows the system to learn from an incoming feed while minimizing the human annotations and time required to do so.

	Claim 8's “perform the online predictive machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally hierarchical predictions for the one or more prediction inputs; and” is not expressly taught by Fu, et al. The “online machine learning model” is, however, taught by Teng, page iv, second full paragraph, where it recites:

In this work, we tackle the problem of online, real-time training of a deep learning algorithm. We refer to this type of problem as time-ordered online training (ToOT), of which the UAS use case is an example. We begin with the observation that in human students, learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. They may have help from a mentor or teacher, but must use this assistance wisely and effectively. For a real-time learning algorithm onboard a UAS, this means being able to learn from an incoming video feed while minimizing the human annotations and time required to do so, and moving around the environment to augment its learning. We define a metric, incremental training benefit (ITB) per annotation, that seeks to capture the value extracted for each annotation provided by the user.

	Rationale - It would have been obvious for one of ordinary skill in the art to substitute the online data source of Teng for the unspecified data source of Fu, et al. because it predictably allows the system to learn from an incoming feed while minimizing the human annotations and time required to do so.

           Claim 8's “generate, based at least in part on the one or more structurally hierarchical predictions, the predictive output, wherein the predictive output indicates a selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions.” is primarily taught by Fu, et al., page 892, right column, first partial paragraph, where it recites:

…techniques into the design, making the coarse-to-fine CNNs able to learn the hierarchical category tree. It needs much less additional parameters to convert a traditional flat CNN to perform a coarse-to-fine task than other hierarchical CNN architectures. We validate our model on MNIST, CIFAR-10, and CIFAR-100 datasets, showing clear benefits.

	Further, it is taught by Fu, et al., page 894, right column, first full paragraph, where it recites:

We evaluate coarse-to-fine CNNs on the benchmark datasets MNIST (in Section 4.2), CIFAR-10 (in Section 4.3), and CIFAR-100 (in Section 4.4) by comparing the performance of coarse-to-fine CNN models with their corresponding baseline traditional CNN models. While all the above coarse-to-fine CNNs are only about 2-level problems, an additional experiment on CIFAR-10 is carried in Section 4.5 to see that our model also can be applied to multi-level learning. Section 4.6 provides visual results of the classification of the proposed architecture; it gives a better understanding of how coarse-to-fine CNNs work. Finally, we will have a summary of all the experiments at the end of this section. Our construction of the category trees follows the principle of [24]: they rely on the fact that there exist natural groupings amongst object instances such as shape, colour, scale and location.

Claim 15
           Claim 15's “obtain access to an online machine learning model, wherein (i) the online machine learning model is configured to perform an online machine learning analysis associated with a hierarchical prediction domain based at least in part on one or more prediction inputs to generate one or more structurally hierarchical predictions, (ii) the hierarchical prediction domain is associated with one or more hierarchical predictive relationships among a plurality of prediction nodes, (iii) the one or more hierarchical predictive relationships define, for each of the plurality of prediction nodes, a corresponding hierarchical predictive position, and (iv) each structurally hierarchical prediction is determined based at least in part on the hierarchical predictive position of the corresponding prediction node;” is primarily taught by Fu, et al., page 892, left column, last full paragraph, where it recites:

In this paper, we introduce Bayesian techniques into the hierarchical classification of CNNs and propose a hierarchical CNN architecture called coarse-to-fine CNN. A coarse-to-fine CNN is simple, only with a coarse-to-fine layer on the top of a generic CNN. The coarse-to-fine layer is inspired by the Bayesian equation. One of the layer's inputs is the result of coarse classifications; it will directly affect the fine classification, which is the layer's output, in both training phase and testing phase. Compare to other related works, this operation is more consistent with human understandable concepts. In this paper, we are interested only in tree-based hierarchical class structure (a node can only have one parent node). DAG based (a node can have more than one parent node) is out of the scope of this paper.

	Further, “hierarchical predictions” are primarily taught by Fu, et al., page 898, left column, last full paragraph, where it recites:

In this paper, we propose a novel hierarchical CNN architecture, called coarse-to-fine CNN. It is an end-to-end model, with a simpler structure than other related hierarchical CNNs. The model outputs multiple hierarchical predictions from coarse to fine simultaneously and the coarse prediction affects the fine prediction directly using a coarse-to-fine layer, which is inspired by the Bayesian equation. The proposed coarse-to-fine layer has a clear forward and backward propagations. Our models improve over the corresponding baseline CNNs on several benchmark datasets. Besides, we reveal that the performance of coarse-to-fine CNNs is influenced by different category trees. For future work, it is important to explore the optimal structure of category tree for coarse-to-fine CNNs.

	The claimed “online machine learning model” is not expressly taught by Fu, et al. The “online machine learning model” is, however, taught by Teng, page iv, second full paragraph, where it recites:

In this work, we tackle the problem of online, real-time training of a deep learning algorithm. We refer to this type of problem as time-ordered online training (ToOT), of which the UAS use case is an example. We begin with the observation that in human students, learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. They may have help from a mentor or teacher, but must use this assistance wisely and effectively. For a real-time learning algorithm onboard a UAS, this means being able to learn from an incoming video feed while minimizing the human annotations and time required to do so, and moving around the environment to augment its learning. We define a metric, incremental training benefit (ITB) per annotation, that seeks to capture the value extracted for each annotation provided by the user.

	Rationale - It would have been obvious for one of ordinary skill in the art to substitute the online data source of Teng for the unspecified data source of Fu, et al. because it predictably allows the system to learn from an incoming feed while minimizing the human annotations and time required to do so.

           Claim 15's “perform the online predictive machine learning analysis based at least in part on the one or more prediction inputs to generate the one or more structurally hierarchical predictions for the one or more prediction inputs; and” is not expressly taught by Fu, et al. The “online machine learning model” is, however, taught by Teng, page iv, second full paragraph, where it recites:

In this work, we tackle the problem of online, real-time training of a deep learning algorithm. We refer to this type of problem as time-ordered online training (ToOT), of which the UAS use case is an example. We begin with the observation that in human students, learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. They may have help from a mentor or teacher, but must use this assistance wisely and effectively. For a real-time learning algorithm onboard a UAS, this means being able to learn from an incoming video feed while minimizing the human annotations and time required to do so, and moving around the environment to augment its learning. We define a metric, incremental training benefit (ITB) per annotation, that seeks to capture the value extracted for each annotation provided by the user.

	Rationale - It would have been obvious for one of ordinary skill in the art to substitute the online data source of Teng for the unspecified data source of Fu, et al. because it predictably allows the system to learn from an incoming feed while minimizing the human annotations and time required to do so.

           Claim 15's “generate, based at least in part on the one or more structurally hierarchical predictions, the predictive output, wherein the predictive output indicates a selected node of the plurality of prediction nodes for at least some of the structurally hierarchical predictions.” is primarily taught by Fu, et al., page 892, right column, first partial paragraph, where it recites:

…techniques into the design, making the coarse-to-fine CNNs able to learn the hierarchical category tree. It needs much less additional parameters to convert a traditional flat CNN to perform a coarse-to-fine task than other hierarchical CNN architectures. We validate our model on MNIST, CIFAR-10, and CIFAR-100 datasets, showing clear benefits.

	Further, it is taught by Fu, et al., page 894, right column, first full paragraph, where it recites:

We evaluate coarse-to-fine CNNs on the benchmark datasets MNIST (in Section 4.2), CIFAR-10 (in Section 4.3), and CIFAR-100 (in Section 4.4) by comparing the performance of coarse-to-fine CNN models with their corresponding baseline traditional CNN models. While all the above coarse-to-fine CNNs are only about 2-level problems, an additional experiment on CIFAR-10 is carried in Section 4.5 to see that our model also can be applied to multi-level learning. Section 4.6 provides visual results of the classification of the proposed architecture; it gives a better understanding of how coarse-to-fine CNNs work. Finally, we will have a summary of all the experiments at the end of this section. Our construction of the category trees follows the principle of [24]: they rely on the fact that there exist natural groupings amongst object instances such as shape, colour, scale and location.

Conclusion
	Any inquiries concerning this communication or earlier communications from the examiner should be directed to Wilbert L. Starks, Jr., who may be reached Monday through Friday, between 8:00 a.m. and 5:00 p.m. EST, via telephone at (571) 272-3691 or by email at Wilbert.Starks@uspto.gov.

	If you need to send an Official facsimile transmission, please send it to (571) 273-8300. 

	If attempts to reach the examiner are unsuccessful, the Examiner’s Supervisor (SPE), Kakali Chaki, may be reached at (571) 272-3719.

	Hand-delivered responses should be delivered to the Receptionist at the Customer Service Window, Randolph Building, 401 Dulany Street, Alexandria, VA 22313, located on the first floor of the south side of the Randolph Building.

	Finally, information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Moreover, status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have any questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) toll-free at 1-866-217-9197.

            /WILBERT L STARKS/
            Primary Examiner, Art Unit 2122

WLS
11 SEP 2022