DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-20 are pending, of which claims 1, 11, 19 and 20 are in independent form.
Claims 1-20 are rejected under 35 U.S.C. 103.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-4, 6, 11, 15, 16, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rolfe; Jason T. et al. (US 20200401916 A1) [Rolfe1] in view of Wang; Jian et al. (US 20180150785 A1) [Wang] in view of Rolfe; Jason Tyler et al. (US 20190244680 A1) [Rolfe2].

	Regarding claims 1, 11, 19 and 20, Rolfe1 discloses, a training apparatus for training a network including a first network and a second network, the network being configured to infer a feature of an input graph (At 1606, after the network has been trained at 1604 ¶ [0220]), the training apparatus comprising one or more memories and one or more processors, wherein the one or more processors (see Fig. 1) are configured to: 
merge, by the first network, first hidden vectors of first nodes of the input graph and a second hidden vector of a second node coupled to each of the first nodes, based on the first hidden vectors, the second hidden vector and information on coupling between the first nodes (The VAE may then produce a latent representation for each node of the graph-representable input objects by combining the hidden state vectors of the multiple spanning trees (e.g., by addition, concatenation, averages, maximums) ¶ [0278], [0279], [0296], [0297]); 
the feature of the input graph (FIG. 11 is a flowchart of an example method for training an example generative machine learning model based on graph-representable inputs. At act 1102 a processor receives linear representations of a graph-representable input object, as described above. At act 1104 the processor determines a plurality of spanning trees based on a plurality of tree traversals of the input object, as described above. At act 1106 the processor encodes the plurality of spanning trees to obtain a plurality of hidden vectors corresponding to the spanning trees, as described above. Such encoding may comprise, for example, processing each of the spanning trees with an RNN (such as an LSTM). At act 1108 the hidden vectors are combined, e.g., by addition, concatenation, maximization, or another suitable technique. In some implementations, the combination technique is commutative ¶ [0296]); 
calculate a loss of the feature of the input graph (This confluence of features tends to avoid pathological regions of the latent space by tending to limit the entreating of non-existent connections, compensate for the loss of representational richness caused by the shift to a non-hierarchical approximating posterior distribution, and induce some slight overfitting ¶ [0374]).
However, Rolfe1 does not explicitly disclose: update, by the first network, the first hidden vectors and the second hidden vector, based on a result of the merging; extract, from the second network, [the feature of the input graph], based on the updated first hidden vectors and the updated second hidden vector.
	Wang discloses, update, by the first network, the first hidden vectors and the second hidden vector, based on a result of the merging (The method 600 continues at operation 608 with determining if the changes in regression coefficients and hidden feature vector are below predetermined thresholds. For example, there may be predetermined thresholds for change in the regression coefficients and the change in the hidden feature vector ¶ [0067], [0090], [0099], [0104]); 
extract, from the second network, [the feature of the input graph], based on the updated first hidden vectors and the updated second hidden vector (The method 600 continues at operation 608 with determining if the changes in regression coefficients and hidden feature vector are below predetermined thresholds. For example, there may be predetermined thresholds for change in the regression coefficients and the change in the hidden feature vector ¶ [0067], [0090], [0099], [0104]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Wang’s system would have allowed Rolfe1 to update, by the first network, the first hidden vectors and the second hidden vector, based on a result of the merging, and to extract, from the second network, [the feature of the input graph], based on the updated first hidden vectors and the updated second hidden vector. The motivation to combine is apparent in the Rolfe1 reference, because there is a need to improve machine learning techniques for vector modelling within a network service, including software-configured computerized variants of such special-purpose machines and improvements to such variants, and to the technologies by which such special-purpose machines become improved compared to other special-purpose machines that facilitate generating machine learned vector models to identify latent vectors within a database and generate recommendations using the latent vectors.
However, neither Rolfe1 nor Wang explicitly discloses: update at least one of the first network or the second network, based on the calculated loss.
Rolfe2 discloses, update at least one of the first network or the second network, based on the calculated loss (This allows the use of the gradient of the decoder to estimate the change in the loss function, since the gradient of the decoder captures the effect of small changes in the location of a selected packet in the latent space ¶ [0216], [0219]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Rolfe2’s system would have allowed Rolfe1 and Wang to update at least one of the first network or the second network, based on the calculated loss. The motivation to combine is apparent in the Rolfe1 and Wang references, because there is a need to advantageously implement machine learning over large datasets.
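For reference, the sequence of steps recited in claims 1, 11, 19 and 20 (merge, update, extract, calculate a loss, update a network) can be pictured with the following minimal sketch. It is offered only as a reading aid for the claim language and is not drawn from the application or from any cited reference. All function names, the scalar parameters w and v, the squared-error loss and the finite-difference gradient are assumptions chosen for brevity.

```python
# Illustrative only: every name and numeric choice here is hypothetical.

def merge(h_nodes, h_super, adj):
    """First network, merge step: combine the first-node hidden vectors and the
    second-node hidden vector using the coupling (adjacency) between first nodes."""
    n = len(h_nodes)
    dim = len(h_super)
    merged_nodes = []
    for i in range(n):
        # The second node is coupled to every first node, so start from it.
        acc = list(h_super)
        for j in range(n):
            if adj[i][j]:
                acc = [a + b for a, b in zip(acc, h_nodes[j])]
        merged_nodes.append(acc)
    # The second node aggregates all first nodes.
    merged_super = [sum(h[k] for h in h_nodes) for k in range(dim)]
    return merged_nodes, merged_super


def update(h_nodes, h_super, merged_nodes, merged_super, w):
    """First network, update step: new hidden = old hidden + w * merged."""
    new_nodes = [[h + w * m for h, m in zip(hv, mv)]
                 for hv, mv in zip(h_nodes, merged_nodes)]
    new_super = [h + w * m for h, m in zip(h_super, merged_super)]
    return new_nodes, new_super


def readout(h_nodes, h_super, v):
    """Second network: extract a scalar graph feature from the updated vectors."""
    dim = len(h_super)
    pooled = [sum(h[k] for h in h_nodes) + h_super[k] for k in range(dim)]
    return v * sum(pooled)


def loss_fn(w, v, h_nodes, h_super, adj, target):
    """Squared error between the extracted feature and a target value."""
    m_nodes, m_super = merge(h_nodes, h_super, adj)
    u_nodes, u_super = update(h_nodes, h_super, m_nodes, m_super, w)
    return (readout(u_nodes, u_super, v) - target) ** 2


def train_step(w, v, h_nodes, h_super, adj, target, lr=0.01, eps=1e-6):
    """Update the parameters of both networks from the calculated loss,
    using finite-difference gradients in place of backpropagation."""
    base = loss_fn(w, v, h_nodes, h_super, adj, target)
    grad_w = (loss_fn(w + eps, v, h_nodes, h_super, adj, target) - base) / eps
    grad_v = (loss_fn(w, v + eps, h_nodes, h_super, adj, target) - base) / eps
    return w - lr * grad_w, v - lr * grad_v, base


if __name__ == "__main__":
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]      # three first nodes in a path
    h_nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    h_super = [0.0, 0.0]                          # second node, coupled to all
    w, v = 0.1, 0.1
    for step in range(20):
        w, v, loss = train_step(w, v, h_nodes, h_super, adj, 1.0)
        print(step, loss)
```

In this sketch the parameter w stands in for the first network and v for the second network; a practical implementation would use weight matrices and automatic differentiation.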

Regarding claim 2, the combination of Rolfe1, Wang and Rolfe2 discloses, wherein the one or more processors are further configured to: calculate a weight with respect to the first hidden vector and a weight with respect to the second hidden vector (Wang: In instances, where the vector component 230 comprises the first sub-component and the second sub-component, the first sub-component cooperates to iterate the first regression coefficients and the second regression coefficients while the second sub-component holds constant the values for the first hidden feature vector and the second hidden feature vector ¶ [0062]-[0064]); 
merge the first hidden vector and the second hidden vector based on the calculated weights to create a first update hidden vector and a second update hidden vector (Rolfe1: The VAE may then produce a latent representation for each node of the graph-representable input objects by combining the hidden state vectors of the multiple spanning trees (e.g., by addition, concatenation, averages, maximums) ¶ [0278], [0279], [0296], [0297]); and 
update the first hidden vector based on the first update hidden vector and update the second hidden vector based on the second update hidden vector (Rolfe2: This allows the use of the gradient of the decoder to estimate the change in the loss function, since the gradient of the decoder captures the effect of small changes in the location of a selected packet in the latent space ¶ [0216], [0219]).

Regarding claims 3 and 16, the combination of Rolfe1, Wang and Rolfe2 discloses, wherein the one or more processors are further configured to update a ratio at which the first hidden vector and the second hidden vector are merged (Wang: The hidden similarity value may also be determined as a ratio of common elements, a semantic distance, a Hamming distance, or any other value indicating a similarity between the set of common elements and the attributes of the specified member profile. In some embodiments, the hidden similarity value ¶ [0067]).

Regarding claim 4, the combination of Rolfe1, Wang and Rolfe2 discloses, wherein the merging includes a gate operation (Rolfe1: The digital processor(s) 106 may be any logic processing unit or circuitry (e.g., integrated circuits), such as one or more central processing units (“CPUs”), graphics processing units (“GPUs”), digital signal processors (“DSPs”), application-specific integrated circuits (“ASICs”), programmable gate arrays (“FPGAs”)).
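For reference, one common reading of a "gate operation" in a neural network merge is a sigmoid-weighted blend of two hidden vectors, where the gate value also fixes the ratio at which the vectors are merged. The sketch below illustrates that reading only. It is an assumption made for illustration, not taken from the application or any cited reference, and the names gated_merge, w_first, w_second and bias are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_merge(h_first, h_second, w_first=0.0, w_second=0.0, bias=0.0):
    """Blend two hidden vectors element-wise with a learned gate.

    gate = sigmoid(w_first * a + w_second * b + bias)
    merged = gate * a + (1 - gate) * b
    """
    merged = []
    for a, b in zip(h_first, h_second):
        gate = sigmoid(w_first * a + w_second * b + bias)
        merged.append(gate * a + (1.0 - gate) * b)
    return merged


if __name__ == "__main__":
    # With all gate parameters at zero the gate is 0.5, i.e. a plain average.
    print(gated_merge([2.0, 4.0], [0.0, 0.0]))
```

Training the gate parameters changes the merge ratio, which is how a gate of this kind would relate to the ratio update recited in claims 3 and 16.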

Regarding claim 6, the combination of Rolfe1, Wang and Rolfe2 discloses, wherein the one or more processors are further configured to: update the first hidden vector based on the first hidden vector updated in a preceding layer and on a result of the merging (Rolfe2: This allows the use of the gradient of the decoder to estimate the change in the loss function, since the gradient of the decoder captures the effect of small changes in the location of a selected packet in the latent space ¶ [0216], [0219]); and 
update the second hidden vector based on the second hidden vector updated in the preceding layer and on the result of the merging (Rolfe2: This allows the use of the gradient of the decoder to estimate the change in the loss function, since the gradient of the decoder captures the effect of small changes in the location of a selected packet in the latent space ¶ [0216], [0219]).

Regarding claim 15, the combination of Rolfe1, Wang and Rolfe2 discloses, wherein the input graph is a chemical molecular graph, and the one or more processors are further configured to extract, as the feature amount of the second node, at least one of: a number of aromatic rings within the input graph, a name of the input graph, or a chirality of the input graph (Rolfe1: The generative machine learning model may receive linear representations of non-linear graph-representable objects; for example, it may receive string representations (e.g., in SMILES or InChI formats) of molecules (which may comprise non-linearities such as loops, e.g. in the form of aromatic rings) ¶ [0278]. SMILES strings can be translated into multiple spanning trees by reconstructing a molecular graph and attempting various tree traversals ¶ [0279], [0294], [0295]). 


Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Rolfe; Jason T. et al. (US 20200401916 A1) [Rolfe1] in view of Wang; Jian et al. (US 20180150785 A1) [Wang] in view of Rolfe; Jason Tyler et al. (US 20190244680 A1) [Rolfe2] in view of Eyuboglu; Vedat et al. (US 11140695 B1) [Eyuboglu].

Regarding claim 5, the combination of Rolfe1, Wang and Rolfe2 teaches all the elements of claim 1.
However, none of Rolfe1, Wang or Rolfe2 explicitly discloses wherein the one or more processors are further configured to: create: a first message to be transmitted from each of the first nodes updated in a preceding layer to a first node coupled thereto; a second message to be transmitted from each of the first nodes updated in the preceding layer to the second node; a third message to be transmitted from the second node updated in the preceding layer to the first node; and a fourth message to be transmitted from the second node updated in the preceding layer to the second node; and merge the first hidden vector and the second hidden vector, based at least in part on the first message, the second message, the third message and the fourth message.
Eyuboglu discloses, wherein the one or more processors are further configured to: create: a first message to be transmitted from each of the first nodes updated in a preceding layer to a first node coupled thereto; a second message to be transmitted from each of the first nodes updated in the preceding layer to the second node; a third message to be transmitted from the second node updated in the preceding layer to the first node; and a fourth message to be transmitted from the second node updated in the preceding layer to the second node; and merge the first hidden vector and the second hidden vector, based at least in part on the first message, the second message, the third message and the fourth message (Upon scheduling transmissions in a time interval up to time t, the first mesh node sends a control message to the second mesh node at a time t≤t. Upon receiving the control message from the first mesh node, the second mesh node schedules transmissions in a time interval up to time t [col. 2, ll. 51-56]. Also see [col. 3, ll. 15-50], [col. 5, ll. 50-col. 7, ll. 36]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Eyuboglu’s system would have allowed the combination of Rolfe1, Wang and Rolfe2 to implement the limitation wherein the one or more processors are further configured to: create: a first message to be transmitted from each of the first nodes updated in a preceding layer to a first node coupled thereto; a second message to be transmitted from each of the first nodes updated in the preceding layer to the second node; a third message to be transmitted from the second node updated in the preceding layer to the first node; and a fourth message to be transmitted from the second node updated in the preceding layer to the second node; and merge the first hidden vector and the second hidden vector, based at least in part on the first message, the second message, the third message and the fourth message. The motivation to combine is apparent in the Rolfe1, Wang and Rolfe2 references, because there is a need to improve mesh network systems and methods for cellular networks that are spectrally efficient and can be deployed easily in higher frequency bands.
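For reference, the four messages recited in claim 5 can be pictured with the following minimal sketch. It follows the claim language only and is not drawn from the application or any cited reference. The function names and the choice of plain summation as the message and merge functions are assumptions.

```python
# Illustrative only: names and the use of summation are hypothetical.

def create_messages(h_nodes, h_super, adj):
    """Build the four messages from the hidden vectors of the preceding layer."""
    n = len(h_nodes)
    dim = len(h_super)
    # First message: from each first node to every first node coupled to it.
    node_to_node = [
        [sum(h_nodes[j][k] for j in range(n) if adj[i][j]) for k in range(dim)]
        for i in range(n)
    ]
    # Second message: from each first node to the second node.
    node_to_super = [sum(h[k] for h in h_nodes) for k in range(dim)]
    # Third message: from the second node to each first node.
    super_to_node = list(h_super)
    # Fourth message: from the second node to itself.
    super_to_super = list(h_super)
    return node_to_node, node_to_super, super_to_node, super_to_super


def merge_messages(node_to_node, node_to_super, super_to_node, super_to_super):
    """Merge per destination: first nodes receive messages 1 and 3,
    the second node receives messages 2 and 4."""
    merged_nodes = [[a + b for a, b in zip(m, super_to_node)] for m in node_to_node]
    merged_super = [a + b for a, b in zip(node_to_super, super_to_super)]
    return merged_nodes, merged_super


if __name__ == "__main__":
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # three first nodes in a path
    msgs = create_messages([[1.0], [2.0], [3.0]], [10.0], adj)
    print(merge_messages(*msgs))
```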


Claim(s) 7, 8, 10, 12-14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Rolfe; Jason T. et al. (US 20200401916 A1) [Rolfe1] in view of Wang; Jian et al. (US 20180150785 A1) [Wang] in view of Rolfe; Jason Tyler et al. (US 20190244680 A1) [Rolfe2] in view of YIN; Jianwei et al. (US 20200110777 A1) [Yin].

Regarding claims 7, 12 and 13, the combination of Rolfe1, Wang and Rolfe2 discloses, wherein the one or more processors are further configured to: calculate the first hidden vectors based on a feature amount of each first node (Rolfe1: The VAE may then produce a latent representation for each node of the graph-representable input objects by combining the hidden state vectors of the multiple spanning trees (e.g., by addition, concatenation, averages, maximums) ¶ [0278]); and 
calculate the second hidden vector based on a feature amount of the second node (Rolfe2: This allows the use of the gradient of the decoder to estimate the change in the loss function, since the gradient of the decoder captures the effect of small changes in the location of a selected packet in the latent space ¶ [0216], [0219]).
However, none of Rolfe1, Wang or Rolfe2 explicitly discloses: extract the information on the coupling between the first nodes from the input graph.
Yin discloses, extract the information on the coupling between the first nodes from the input graph (The present invention concentrates the connection information in the adjacency matrix into the diagonal region of the adjacency matrix, and further uses the filter matrix to extract the subgraph structure of the graph in the diagonal direction, greatly reducing the computational complexity. The graph feature extraction system with the connection information regularization module has a much smaller amount of computation than the graph feature extraction system without the module ¶ [0028], [0062], [0218] and [0247]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Yin’s system would have allowed the combination of Rolfe1, Wang and Rolfe2 to extract the information on the coupling between the first nodes from the input graph. The motivation to combine is apparent in the Rolfe1, Wang and Rolfe2 references, because there is a need to effectively extract the larger subgraph structure in the graph, and thus obtain a very good feature representation of the graph.

Regarding claim 8, the combination of Rolfe1, Wang, Rolfe2 and Yin discloses, wherein the one or more processors are further configured to extract observation information regarding the input graph as the feature amount of the second node (Yin: The present invention concentrates the connection information in the adjacency matrix into the diagonal region of the adjacency matrix, and further uses the filter matrix to extract the subgraph structure of the graph in the diagonal direction, greatly reducing the computational complexity. The graph feature extraction system with the connection information regularization module has a much smaller amount of computation than the graph feature extraction system without the module ¶ [0028], [0062], [0218] and [0247]).

Regarding claims 10 and 18, the combination of Rolfe1, Wang, Rolfe2 and Yin discloses, wherein the information on the coupling includes at least one of an adjacency matrix between the first nodes or a tensor expressing coupling between the first nodes (Yin: A method and system of graph feature extraction and graph classification based on adjacency matrix is provided. The invention first concentrates the connection information elements in the adjacency matrix into a specific diagonal region of the adjacency matrix which reduces the non-connection information elements in advance [Abstract]. Also see ¶ [0014]-[0019]).
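For reference, the two forms of coupling information recited in claims 10 and 18 can be pictured as follows: an adjacency matrix records whether two first nodes are coupled, and a tensor adds one such matrix per kind of coupling. This is a minimal sketch under assumed conventions (undirected edges, integer edge kinds); the function names are hypothetical and nothing here is taken from the cited references.

```python
# Illustrative only: names and data layout are hypothetical.

def adjacency_matrix(num_nodes, edges):
    """edges: iterable of (i, j, kind). Returns a symmetric 0/1 matrix."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j, _kind in edges:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def coupling_tensor(num_nodes, edges, num_kinds):
    """Returns tensor[kind][i][j] = 1 when nodes i and j share an edge of that kind."""
    tensor = [[[0] * num_nodes for _ in range(num_nodes)] for _ in range(num_kinds)]
    for i, j, kind in edges:
        tensor[kind][i][j] = 1
        tensor[kind][j][i] = 1
    return tensor


if __name__ == "__main__":
    # Three nodes: edge 0-1 of kind 0 and edge 1-2 of kind 1.
    edges = [(0, 1, 0), (1, 2, 1)]
    print(adjacency_matrix(3, edges))
    print(coupling_tensor(3, edges, 2))
```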

Regarding claim 14, the combination of Rolfe1, Wang, Rolfe2 and Yin discloses, wherein the one or more processors are further configured to extract, as the feature amount of the second node, at least one of: a number of the first nodes of the input graph, a number of kinds of the first nodes, a number of kinds of edges mutually coupling the first nodes, a diameter of the input graph, a graph Laplacian of the input graph, a normalized graph Laplacian of the input graph, or a number of connected sub-graphs in the input graph when the input graph is split into more than two sub-graphs which are not connected each other (Yin: the degree of the vertex is the number of edges associated with the vertex; the out-degree of the vertex is the number of edges start from the vertex and point to other vertices; the in-degree of the vertex is the number of edges starting from other vertices and point to the vertex ¶ [0071], [0074]).
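For reference, several of the graph-level quantities listed in claim 14 (number of first nodes, number of kinds of first nodes, diameter, graph Laplacian, number of connected sub-graphs) can be computed from an adjacency matrix as sketched below. This is a minimal illustration with hypothetical names, assuming an undirected graph without self-loops; the diameter is taken here as the longest shortest path within any connected sub-graph.

```python
from collections import deque


def bfs_distances(adj, start):
    """Shortest hop count from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, connected in enumerate(adj[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def graph_features(adj, node_kinds):
    """Compute a few graph-level features from a 0/1 adjacency matrix."""
    n = len(adj)
    all_dist = [bfs_distances(adj, s) for s in range(n)]
    # Count connected sub-graphs by sweeping over unseen start nodes.
    seen = set()
    components = 0
    for s in range(n):
        if s not in seen:
            components += 1
            seen.update(all_dist[s])
    diameter = max(max(d.values()) for d in all_dist)
    degree = [sum(1 for x in row if x) for row in adj]
    # Graph Laplacian L = D - A.
    laplacian = [
        [(degree[i] if i == j else 0) - (1 if adj[i][j] else 0) for j in range(n)]
        for i in range(n)
    ]
    return {
        "num_nodes": n,
        "num_node_kinds": len(set(node_kinds)),
        "diameter": diameter,
        "laplacian": laplacian,
        "num_components": components,
    }


if __name__ == "__main__":
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(graph_features(path, ["C", "C", "O"]))
```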


Claim(s) 9 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Rolfe; Jason T. et al. (US 20200401916 A1) [Rolfe1] in view of Wang; Jian et al. (US 20180150785 A1) [Wang] in view of Rolfe; Jason Tyler et al. (US 20190244680 A1) [Rolfe2] in view of SRIDHARAN; SRINIVAS et al. (US 20180322386 A1) [Sridharan].

Regarding claims 9 and 17, the combination of Rolfe1, Wang and Rolfe2 teaches all the limitations of claim 1.
However, none of Rolfe1, Wang or Rolfe2 explicitly discloses wherein the second node coupled to each of the first nodes is a node that is virtually created.
Sridharan discloses, wherein the second node coupled to each of the first nodes is a node that is virtually created (A negotiation can be performed such that a memory address within each node is associated with a virtual address within a distributed virtual address space 1730. In one embodiment, a specific physical address range in each node is mapped to the virtual addresses associated with the node, such that the same physical addresses in each node is mapped to the distributed virtual address space 1730. The distributed virtual address mapping is exchanged between nodes, such that each node is aware of the address range for each other node ¶ [0227], [0232]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Sridharan’s system would have allowed the combination of Rolfe1, Wang and Rolfe2 to implement the limitation wherein the second node coupled to each of the first nodes is a node that is virtually created. The motivation to combine is apparent in the Rolfe1, Wang and Rolfe2 references, because there is a need to improve data processing and more particularly to data processing via a general-purpose graphics processing unit.

Conclusion
The examiner requests, in response to this Office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this Office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD S ROSTAMI whose telephone number is (571)270-1980. The examiner can normally be reached Mon-Fri from 9 a.m. to 5 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain T Alam, can be reached at (571)272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





9/20/2022
/MOHAMMAD S ROSTAMI/               Primary Examiner, Art Unit 2154