DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.

Claim Interpretation.
Claim 1, line 12, (and other claims) recites “hidden features”.  The Examiner notes that the Written Opinion submitted by Applicant characterizes this term as “vague and unclear” (see the Written Opinion at section 1.2).  However, the application describes this term in the context of the invention and uses the term in examples (e.g., see pages 20-21 of the application as filed).  The term is also known in the art of neural networks (see the discussion of the art under “Allowable Subject Matter”).  Therefore, the Examiner is of the opinion that the term is sufficiently definite.

Claim Rejections - 35 USC § 112 - Indefinite
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-6, 8, 9, 11, 14, 16-18, 24, and 26-31 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1, line 4, recites “the properties of one device”.  It is not clear if this is a reference to the first device (e.g., see line 1: “properties of a first device”; in which case line 4 should be “the properties of the first [[one]] device”, or similar language) or if it is a reference to one of the plurality of second devices (e.g., see line 2: “the properties of a plurality of second devices”; in which case line 4 should be “the properties of one of the second devices”, or similar language), or if it is a reference to a device different than the first and second devices (in which case the term should be distinguished from the first and second devices, such as “a third device”).  In any event, clarification is required.
Claims 2-6, 8, 9, 11, 14, 16, and 17 are rejected because they depend from claim 1 and fail to further limit the scope in a manner to overcome the rejections.
Claim 18, line 4, recites “the properties of one device”.  This is rejected for the reasons discussed in claim 1.
Claim 24 is rejected because it depends from claim 18 and fails to further limit the scope in a manner to overcome the rejection.
Claim 26, line 5, recites “the properties of one device”.  This is rejected for the reasons discussed in claim 1.
Claims 27 and 28 are rejected because they depend from claim 26 and fail to further limit the scope in a manner to overcome the rejection.
Claim 29, line 5, recites “the properties of one device”.  This is rejected for the reasons discussed in claim 1.
Claims 30 and 31 are rejected because they depend from claim 29 and fail to further limit the scope in a manner to overcome the rejection.

Allowable Subject Matter
Claims 1-6, 8, 9, 11, 14, 16-18, 24, and 26-31 would be allowable if rewritten or amended to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action.  The following is a statement of reasons for the indication of allowable subject matter.
US 2016/0086087 (Ghouti) teaches machine learning that predicts parameters (e.g., impurities or components) in gas compositions in a multistage separator.  See, for example:  
[0009] The method of predicting gas compositions relates to predicting gas composition in a multistage separator. Particularly, solutions to the regression problem of gas composition prediction are developed using extreme learning machines (ELMs) for defining the optimal predictor weights and non-negative matrix factorization to extract parts-based features from a set of properties of a reservoir.

[0072] The present invention relates to a method of predicting gas compositions in a multistage separator, particularly using an extreme learning machine in combination with an optimal feature extractor based on non-negative matrix decomposition (NMF) algorithms. Particularly, solutions to the regression problem of gas composition prediction are developed using extreme learning machines (ELMs) for defining the optimal predictor weights and non-negative matrix factorization to extract parts-based features from a set of properties of a reservoir.
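
For illustration only (this sketch is not taken from Ghouti), the non-negative matrix factorization referred to in [0009] and [0072] can be outlined as follows.  The sketch assumes the commonly used multiplicative update rules for a squared-error objective; Ghouti may use a different NMF variant.  The function names and the sample matrix are hypothetical, and the extreme learning machine stage is not shown.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W*H, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive starting values keep every later update non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical data: 4 samples (rows) by 3 non-negative properties (columns).
v = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 1.0, 2.0],
    [6.0, 2.0, 4.0],
]
w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

In this sketch, each row of W holds the reduced, parts-based features for one sample; a downstream predictor would take those rows as its input.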

FIG. 1 illustrates a multi-stage separator (e.g., see [0034]) with a Reservoir R, a first stage separator (Stage 1) receiving the Oil (Oil 1) and a tank holding the gas G1 extracted from Oil 1, a second stage separator (Stage 2) receiving oil (Oil 2) into a holding tank after processing in Stage 1, and a tank holding the gas G2 extracted from Oil 2, and a third stage separator (Stock Tank) receiving oil (Oil 3) into a holding tank after processing in Stage 2, and a tank holding the gas G3 extracted from Oil 3.  See [0007].  

    [Image media_image1.png: FIG. 1 of Ghouti]


More specifically, FIG. 2 illustrates a Multi-Layer perceptron (MLP) Artificial Neural Network (ANN).  

    [Image media_image2.png: FIG. 2 of Ghouti]

See also FIG. 4.  In other words, it illustrates a conventional multi-layer neural network.  FIG. 3 illustrates a more detailed view of a neuron with a sigmoidal activation function according to the teachings of the application.

    [Image media_image3.png: FIG. 3 of Ghouti]
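
For illustration only, the operation of a neuron with a sigmoidal activation function, and of a conventional multi-layer network built from such neurons, can be sketched as follows.  The function names, weights, and inputs are hypothetical and are not taken from Ghouti.

```python
import math


def sigmoid(z):
    """Logistic (sigmoidal) activation; maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


def mlp_forward(inputs, layers):
    """Feed the inputs through each layer in turn.

    Each layer is a (weights, biases) pair, with one weight vector and one
    bias per neuron in that layer.
    """
    activations = inputs
    for weights, biases in layers:
        activations = [neuron(activations, w, b) for w, b in zip(weights, biases)]
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden_layer = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output_layer = ([[1.0, -1.0, 0.5]], [0.2])
prediction = mlp_forward([0.6, 0.9], [hidden_layer, output_layer])
print(prediction)
```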

FIG. 8 illustrates a method of implementing the ANN including the target device (e.g., the Multistage Separator), Data Storage, and Feature Selection using Non-Negative Matrix decomposition (NMF) algorithms.

    [Image media_image4.png: FIG. 8 of Ghouti]


The Examiner notes that the Written Opinion submitted by Applicant cites Ghouti as an “X” reference and cites claim 1 of Ghouti against claim 1 of the present application.  While claim 1 of Ghouti, and Ghouti as a whole, teaches the general subject matter, it fails to teach the particulars of the claimed subject matter.  For example, claim 1, step a of Ghouti is cited against the first paragraph of claim 1 in the present application.  While that step does teach obtaining data related to the property of the subject analysis, it fails to teach predicting properties of a first device as recited in the preamble of claim 1 of the present application.  As illustrated above, Ghouti teaches a gas reservoir, but this is not a first device as recited in the present application.  Although the various stages of the multistage separator in FIG. 1 of Ghouti are arguably a plurality of second devices, Ghouti does not teach obtaining data related to the properties of the stages of the separator; rather, it obtains data related to the properties of the Oil and Gas within the separator.  Similar differences in the remaining elements of claim 1 are also present.  In other words, while Ghouti teaches the general subject matter of ANN and prediction, it lacks the particulars in the claim.
While Ghouti teaches to organize data in a matrix and factorize the matrix (e.g., see [0072]: “… Particularly, solutions to the regression problem of gas composition prediction are developed using extreme learning machines (ELMs) for defining the optimal predictor weights and non-negative matrix factorization to extract parts-based features from a set of properties of a reservoir.”), it fails to teach the particular implementation recited in the second, third, and fourth paragraphs of claim 1.
These differences are made clearer when the claimed subject matter of the present application and the teachings of Ghouti are considered in the context of the respective teachings.  Ghouti teaches in the field of oil and gas separation (e.g., the petroleum industry; see [0002]), while the present application is directed to optical data communications (e.g., see page 1, second through fourth paragraphs under “Background”).  These fundamental differences in teachings and claimed subject matter help to explain the difficulty in mapping the teachings from Ghouti to the present claims.
US 2017/0140278 (Gupta) teaches analyzing a big data set using Machine Learning Algorithms.  See, for example:
[0019] FIG. 1 is a block diagram of a DSS for configuring a data processing system for analyzing a big data dataset in accordance with some embodiments of the inventive subject matter. A DSS big data environment advisor data processing system 105 is configured to receive a big data dataset comprising new active data along with a prediction request to predict the performance of a data processing system with respect to one or more performance parameters in analyzing the new active data. The big data environment advisor data processing system 105 may generate the performance prediction based on historical job metadata corresponding to previous big data datasets that have been analyzed and based on various machine learning algorithms that have been used in predicting the performance of analyzing previous big data datasets, which have had their accuracy evaluated based on actual results.
FIG. 2 illustrates a generic data processing system that may be used to implement the Machine Learning Algorithm.

    [Image media_image5.png: FIG. 2 of Gupta]

FIG. 3 is a block diagram that illustrates a software/hardware architecture for configuring a data processing system.

    [Image media_image6.png: FIG. 3 of Gupta]

In particular, a prediction is made as to which machine learning algorithm will be best suited for predicting particular performance parameters of the system.  See, for example:
[0025] The data classification module 325 may be configured to collect metadata corresponding to the analysis jobs performed previously on other big data datasets by various data processing systems and data processing system configurations including the data processing system target for a current active data dataset. The algorithm mapping module 330 may be configured to select a machine learning algorithm form a plurality of machine learning algorithm that may be the most accurate in determining a prediction for the performance of a data processing system in analyzing a current active data dataset. This selection may be made based on one or more previous predictions with respect to various data processing systems and data processing system configurations. The prediction engine module 335 may be configured to generate a prediction of the performance of a data processing system with respect to one or more performance parameters in response to a request identifying the one or more performance parameters and new active data forming part of a big data dataset to be analyzed. The prediction engine module 335 may select a group of historical metadata (i.e., metadata for data that has already been analyzed by one or more data processing systems) that most closely matches the metadata of the new active data to be analyzed from the data classification module 325 and may select a machine learning algorithm that is the most efficient at generating a prediction for the particular performance parameter(s) from the algorithm mapping module 330. The prediction engine module 335 may then apply the particular machine learning algorithm received from the algorithm mapping module 330 to the group of historical metadata to build a prediction model, which may be an equation, graph, or other mechanism for specifying a relationship between the data points in the group of historical metadata. 
The prediction model may then be applied to the metadata of the new active data to generate a prediction of the level of performance with respect to one or more performance parameters in analyzing the new active data on the data processing system. The data center management interface module 340 may be configured to communicate changes to a configuration of a data processing system based on the prediction generated by the prediction engine module 335. The DSS big data environment advisor data processing system 105 may be integrated as part of a data center management system or may be a stand-alone system that communicates with a data center management system over a network or suitable communication connection.

FIG. 4 illustrates functional relationships between the modules of FIG. 3.

    [Image media_image7.png: FIG. 4 of Gupta]

In particular, the Algorithm Mapping Module 330 is illustrated within the broken line box in the upper right, and the Data Classification Module 325 is illustrated within the broken line box at the bottom.  With regard to algorithm mapping, see:
[0031] … The algorithm mapping module 330 may provide to the prediction engine 335 the machine learning algorithm that has resulted in the most accurate predictions for a particular performance parameter at block 435. The algorithm mapping module 330 may also provide one or more default machine learning algorithms when no historical prediction accuracy data is available for a particular performance parameter. Various machine learning algorithms can be used in accordance with embodiments of the inventive subject matter, including, but not limited to, kernel density estimation, K-means, kernel principal components analysis, linear regression, neighbors, non-negative matrix factorization, support vector machines, dimensionality reduction, fast singular value decomposition, and decision tree.
While this teaches selecting the algorithm with the most accurate predictions for a particular performance parameter, it fails to teach “the data relating to the properties of one device comprise respective values of a first parameter at specific values of a second parameter”.  Similarly, while Gupta teaches organizing data in a matrix and factorizing the matrix (e.g., see [0031]: “Various machine learning algorithms can be used in accordance with embodiments of the inventive subject matter, including, but not limited to, … non-negative matrix factorization …”), it fails to teach the particular implementation recited in the second, third, and fourth paragraphs of claim 1.
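
For illustration only, the selection described in [0031] of Gupta (the algorithm with the most accurate past predictions for a performance parameter, or a default when no history exists) can be sketched as follows.  The data layout, the use of mean absolute error as the accuracy measure, the parameter names, and the choice of default are assumptions made for this sketch and are not taken from Gupta.

```python
def mean_error(errors):
    """Average of the recorded absolute prediction errors."""
    return sum(errors) / len(errors)


def select_algorithm(history, parameter, default="linear regression"):
    """Pick the algorithm with the lowest mean past error for a parameter.

    history maps a performance parameter name to a dict that maps an
    algorithm name to a list of past absolute prediction errors.
    The default is returned when no accuracy history is available.
    """
    records = {
        name: errors
        for name, errors in history.get(parameter, {}).items()
        if errors  # ignore algorithms with no recorded predictions
    }
    if not records:
        return default
    return min(records, key=lambda name: mean_error(records[name]))


# Hypothetical accuracy history for one performance parameter.
history = {
    "job_runtime": {
        "linear regression": [4.0, 5.0, 6.0],
        "decision tree": [2.0, 3.0, 1.0],
        "support vector machines": [3.0, 3.5],
    },
}
best = select_algorithm(history, "job_runtime")
fallback = select_algorithm(history, "memory_usage")
print(best, fallback)
```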
US 2018/0174046 (Xiao) teaches that neural networks can be used to reveal hidden features in data.  See, for example:
[0004] Neural network is a large-scale, multi-parameter optimization tool. Depending on a lot of training data, neural network can learn hidden features that are difficult to summarize in the data, thus completing a number of complex tasks, such as face detection, picture classification, object detection, action tracking, natural language translation. Neural network has been widely used in the field of artificial intelligence. At present, the most widely used neural network in target detection, such as pedestrian detection, is convolutional neural network. There are two main problems that plague the current pedestrian target detection method: first, generation of a large number of “false positive” detection results, that is, a non-target area is marked as a target; second, incapability of automatically detecting some targets from the neural network due to light, target gestures and other effects. This is because during training and detection of the neural network for target detection, a position of the target in the picture is always generated directly, without fully considering division of this process and iterative training for the network, nor considering other factors that can assist in training and improving detection accuracy.

US 2018/0266680 (Arabi) teaches that neural networks can be used to reveal hidden features in data.  See, for example:
[0095] Lastly, an example of a method for monitoring a burning operation is generally represented by flowchart 310 in FIG. 20. In this embodiment, fluid is ignited (block 312) and image data indicative of the combustion of the fluid is acquired (block 314). From the foregoing description, it will be appreciated that the ignited fluid could include a well effluent having oil or gas routed to a burner 282 and that the image data can be acquired with one or more cameras 292. The method also includes detecting (block 316) features in the acquired image data, such as through the techniques described above, and comparing (block 318) the detected features to examples (e.g., reference images) or thresholds, which may be stored in a database 320. The status of a burning operation (e.g., normal operation, flame degradation, or flame out) can be determined from the comparison and, in at least some instances, an indication of the status is provided to an operator (block 322). The burning operation may then be controlled based on the identified burning status (block 324), such as described above. Machine learning may be used to train and refine operation of the burner monitoring and control functionalities described herein. By way of example, unsupervised machine learning may facilitate feature identification using information from past scenarios (e.g., alarms, state of operation at time of alarm, and how problem was solved) and may enable detection of hidden features in acquired image data.

US 2019/0122081 (Shin) teaches that hidden features were known in neural networks.  See, for example:
[0039] In accordance with an embodiment of the present invention, if M neural networks having an L layer are given, an equation for feature sharing is defined as follows.

    [Image media_image8.png: equation referenced in [0039] of Shin]

[0040] wherein W indicates weight of a neural network, h indicates a hidden feature, a indicates a Bernoulli random feature, and ϕ indicates an activation function.

The following art is generally related to the field of Machine Learning.
US 2014/0357312 (Davis) teaches the general operation of machine learning, including the use of a learning or training algorithm to adjust the weights of the ANN.  See, for example:
[0855] Embodiments of present technology can also employ neuromorphic processing techniques (sometimes termed "machine learning," "deep learning," or "neural network technology"). As is familiar to artisans, such techniques employ large arrays of artificial neurons--interconnected to mimic biological synapses. These methods employ programming that is different than the traditional, von Neumann, model. In particular, connections between the circuit elements are weighted according to correlations in data that the processor has previously learned (or been taught).

[0857] Associated with each connection within the ANN is a weight, which is used by the input neuron in calculating the weighted sum of its inputs. The learning (or training) process is embodied in these weights, which are not chosen directly by the ANN designer, In general, this learning process involves determining the set of connection weights in the network that optimizes the output of the ANN is some respect. Two main types of learning, supervised and unsupervised, involve using a training algorithm to repeatedly present input data from a training set to the ANN and adjust the connection weights accordingly. In supervised learning, the training set includes the desired ANN outputs corresponding to each input data instance, while training sets for unsupervised learning contain only input data. In a third type of learning, called reinforcement learning, the ANN adapts on-line as it is used in an application. Combinations of learning types can be used; in feed-forward ANNs, a popular approach is to first use unsupervised learning for the input and interior layers and then use supervised learning to train the weights in the output layer.
In other words, it was known in the field of machine learning to apply an automatic learning algorithm to train the system based on a set of training data related to the desired function of the machine learning system.  
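
For illustration only, supervised learning of the kind summarized above (repeatedly presenting training inputs and adjusting the connection weights toward the desired outputs) can be sketched with a single neuron.  The perceptron update rule and the logical-AND training set used here are stand-ins chosen for brevity; they are not taken from Davis.

```python
def predict(weights, bias, inputs):
    """Step-activated neuron: output 1 when the weighted sum exceeds zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(training_set, epochs=20, learning_rate=1.0):
    """Adjust the weights after each training example, for a fixed number of epochs.

    training_set is a list of (inputs, desired_output) pairs.
    """
    weights = [0.0] * len(training_set[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in training_set:
            error = target - predict(weights, bias, inputs)
            # Move each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training set with desired outputs: the logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
for inputs, target in and_gate:
    print(inputs, "->", predict(weights, bias, inputs), "(desired:", target, ")")
```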
US 2017/0163337 (Djukic) teaches that it was known to use machine learning for signal processing in optical communication devices.  See, for example, Djukic at [0034]:
[0034] In one or more embodiments, the channel summary is processed with respect to a machine learning algorithm operating on a network element. In particular, a channel summary and/or channel property may be classified according to various categories before further processing. This classification may be performed using support vector machines or k-nearest neighbor algorithms. Thus, the channel properties determined in Step 310 may be associated with various historical properties of a particular optical channel, such as by appending historical property information to data regarding the channel properties. Accordingly, channel summaries and/or channel properties may be used with various statistical techniques, such as histogram binning for a probability density function (PDF), and/or mean and variance estimation for parameteric PDF estimation.
In other words, Djukic teaches that it was known to use machine learning, including a classifier, for controlling the signal processing or operation of elements in optical communication systems.  
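
For illustration only, a k-nearest neighbor classification of channel properties, together with a mean and variance estimate, can be sketched as follows.  The two features (a signal-to-noise figure and an error-rate figure), the class labels, and all values are hypothetical and are not taken from Djukic.

```python
import math
import statistics
from collections import Counter


def knn_classify(samples, query, k=3):
    """Label a query point by majority vote among its k nearest samples.

    samples is a list of (feature_vector, label) pairs.
    """
    neighbors = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical historical channel properties: (signal-to-noise, error-rate figure).
history = [
    ((20.0, 1.0), "healthy"),
    ((21.0, 1.2), "healthy"),
    ((19.5, 0.9), "healthy"),
    ((12.0, 3.0), "degraded"),
    ((11.0, 3.5), "degraded"),
    ((13.0, 2.8), "degraded"),
]

label_low = knn_classify(history, (12.5, 3.1))
label_high = knn_classify(history, (20.5, 1.1))

# Mean and variance of the first feature, as used for parametric PDF estimation.
snr_values = [features[0] for features, _ in history]
snr_mean = statistics.mean(snr_values)
snr_variance = statistics.pvariance(snr_values)
print(label_low, label_high, snr_mean, snr_variance)
```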
US 2015/0063159 (Bonawitz) teaches that artificial neural networks were well-known forms of machine learning.  See, for example:
[0136] In further examples, predictions about balloons failures and reassignment decisions may be improved over time as more information about past balloons becomes available. In some examples, a computing system may apply a machine-learning process to improve associations between certain predicted failure modes and corresponding tasks within the network. In addition to the general techniques discussed herein, the computing device may apply any of a number of well-known machine learning processes such as an artificial neural network (ANN), SVM (Support Vector Machines), Genetic Algorithms, Bayesian inference, Bayes Nets, a Reinforcement Learning method, regression analysis, or a Decision Tree, for instance. After performing such a machine-learning process, a computing system may then be able to conclude that certain correlations between predicted failure modes and assigned tasks are inaccurate or could be more accurate, and then update the predictions and/or assignments accordingly.

The paper by Khan entitled “An Optical Communication’s Perspective on Machine Learning and Its Applications” (Khan) teaches the general goal of machine learning (ML) is, when given a data set, to solve two main types of problems: (a) functional description of given data and (b) classification of data by deriving appropriate decision boundaries.  FIG. 3 illustrates an example of dynamic network resources allocation and link capacity maximization via cross-layer optimization in SDNs:

    [Image media_image9.png: FIG. 3 of Khan]

Regarding the classification of data, this depends on how the different classes of data are distributed across the variable space.  See, for example, FIG. 4:

    [Image media_image10.png: FIG. 4 of Khan]

It teaches the use of Artificial Neural Networks as shown in FIG. 5, specifically ANN learning processes (FIG. 6) and decision boundaries for appropriate data classification (FIG. 7).

    [Image media_image11.png]

Khan teaches the use of Support Vector Machines (SVMs), an ML technique that transforms data into a higher-dimensional space called “feature space”, wherein data belonging to two different classes can be separated more easily by a simple planar decision boundary (i.e., a hyperplane).  FIGS. 9 and 10 illustrate mapping a data set to a higher-dimensional feature space.  FIG. 9 shows a data set undergoing a non-linear transformation when going from a 2D data space to a 3D data space, thereby allowing for linear separation in 3D space when the data was linearly inseparable in 2D space.  FIG. 10 shows the use of a nonlinear kernel function.
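
For illustration only, the effect of mapping 2D data into a 3D feature space can be sketched as follows.  This is not a full SVM and is not taken from Khan: the mapping (appending the squared distance from the origin as a third coordinate), the fixed threshold, and the sample points are assumptions chosen so that one class surrounds the other in 2D.

```python
def feature_map(point):
    """Map a 2D point (x1, x2) to 3D by appending x1^2 + x2^2."""
    x1, x2 = point
    return (x1, x2, x1 * x1 + x2 * x2)


def classify(point, threshold=1.0):
    """Separate the classes with the plane z = threshold in the 3D feature space."""
    return "outer" if feature_map(point)[2] > threshold else "inner"


# Hypothetical data: the "outer" class surrounds the "inner" class in 2D.
inner_points = [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.4), (0.0, 0.5)]
outer_points = [(1.5, 0.0), (0.0, -1.8), (-1.2, 1.2), (1.1, -1.4)]

for point in inner_points + outer_points:
    print(point, "->", feature_map(point), classify(point))
```

In this sketch the plane is fixed by hand; an SVM would instead choose the separating hyperplane from the training data.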

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARREN WOLF whose telephone number is (571)270-3378. The examiner can normally be reached Monday through Friday, 6:00 AM to 2:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KENNETH N. VANDERPUYE can be reached on 571-272-3078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DARREN E WOLF/Primary Examiner, Art Unit 2636