DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. JP2017-026630, filed on 02/16/2017.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/13/2018 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
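A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.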

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 and 4-7 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al., "Show, attend and tell: Neural image caption generation with visual attention," International Conference on Machine Learning, PMLR (2015) (“Xu”), in view of Krause et al., "A Hierarchical Approach for Generating Descriptive Image Paragraphs," arXiv preprint arXiv:1611.06607 (2016) (“Krause”), in view of Yang et al., "Stacked attention networks for image question answering," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016) (“Yang”), and further in view of Wang et al., "Hierarchical attention network for action recognition in videos," arXiv preprint arXiv:1607.06416 (2016) (“Wang”).
Regarding claim 1, Xu teaches a text preparation apparatus comprising: a storage device; and a processor configured to operate in accordance with a program stored in the storage device (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.” Note: It is being interpreted that the NVIDIA Titan Black GPU represents both a processor and a storage device), wherein the processor is configured to: perform encoding processing to generate feature vectors from input measured data on a plurality of variables (Xu, pg. 3, sec. 3.1.1 Encoder: Convolutional Features, fig. 1, “We use a convolutional neural network in order to extract a set of feature vectors which we refer to as annotation vectors. The extractor produces L vectors, each of which is a D-dimensional representation corresponding to a part of the image… $a = \{a_1, \ldots, a_L\},\ a_i \in \mathbb{R}^D$
…[i]n order to obtain a correspondence between the feature vectors and portions of the 2-D image, we extract features from a lower convolutional layer…[t]his allows the decoder to selectively focus on certain parts of an image by weighting a subset of all the feature vectors.”); and perform decoding processing to determine a text consistent with the measured data from the feature vectors, wherein the feature vectors include a first feature vector representing features extracted from the entirety of the measured data and feature vector sets of measured data on individual variables, wherein each feature vector in a feature vector set represents a feature of a part of the measured data on the corresponding variable (Xu, pg. 3, sec. 3.1.2 Decoder: Long Short-Term Memory Network, fig. 1, fig. 2, “We use a long short-term memory (LSTM) network… that produces a caption by generating one word at every time step conditioned on a context vector, the previous hidden state and the previously generated words… In simple terms, the context vector $\hat{z}_t$ is a dynamic representation of the relevant part of the image input at time t. We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$, i=1,…,L corresponding to the features extracted at different image locations.” Note: It is being interpreted that the context vector $\hat{z}_t$ represents a first feature vector representing features extracted from the entirety of the measured data and the annotation vectors, $a_i$
, i=1,…,L represents feature vector sets of measured data on individual variables, wherein each feature vector in a feature vector set represents a feature of a ); generate a first vector set from a state vector of a previous step and the feature vector sets (Xu, pg. 3, sec. 3.1.2 Decoder: Long Short-Term Memory Network, fig. 1 (3), fig. 2, “We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$, [from] i=1,…,L corresponding to the features extracted at different image locations…[t]he weight $\alpha_i$ of each annotation vector $a_i$ is computed by an attention model $f_{att}$ for which we use a multilayer perceptron conditioned on the previous hidden state $h_{t-1}$… $e_{ti} = f_{att}(a_i, h_{t-1})$, $\alpha_{ti} = \frac{\exp(e_{ti})}{\sum_{k=1}^{L} \exp(e_{tk})}$ … $\hat{z}_t = \phi(\{a_i\}, \{\alpha_i\})$.” Note: It is being interpreted that $\hat{z}_t$ represents the first vector set, $h_{t-1}$ represents the state vector of a previous step, and each $a_i$
                     from i=1,…, L represents the feature vectors), each vector of the first vector set being generated based on similarity degrees between individual vectors in one of the feature vector sets and the state vector (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig.1 (3), fig. 2, “We define a mechanism                         
                            ϕ
                        
                     that computes                         
                            
                                
                                    
                                        
                                            z
                                        
                                        ^
                                    
                                
                                
                                    t
                                
                            
                        
                     from the annotation vectors                        
                            
                                
                                    a
                                
                                
                                    i
                                
                            
                        
                    ,[from] i=1,…, L corresponding to the features extracted at different image locations…[t]he weight                         
                            
                                
                                    α
                                
                                
                                    i
                                
                            
                        
                     of each annotation vector                         
                            
                                
                                    a
                                
                                
                                    i
                                
                            
                        
                     is computed by an attention model                         
                            
                                
                                    f
                                
                                
                                    a
                                    t
                                    t
                                
                            
                        
                     for which we use a multilayer perceptron conditioned on the previous hidden state                         
                            
                                
                                    h
                                
                                
                                    t
                                    -
                                    1
                                
                            
                        
                    …                        
                            
                                
                                    e
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    f
                                
                                
                                    a
                                    t
                                    t
                                
                            
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                    ,
                                     
                                    
                                        
                                            h
                                        
                                        
                                            t
                                            -
                                            1
                                        
                                    
                                
                            
                            ,
                             
                             
                            
                                
                                    α
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    
                                        
                                            exp
                                        
                                        ⁡
                                        
                                            
                                                
                                                    
                                                        
                                                            e
                                                        
                                                        
                                                            t
                                                            i
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                                =
                                                1
                                            
                                            
                                                L
                                            
                                        
                                        
                                            e
                                            x
                                            p
                                            ⁡
                                            (
                                            
                                                
                                                    e
                                                
                                                
                                                    t
                                                    k
                                                
                                            
                                            )
                                        
                                    
                                
                            
                        
                    …                        
                            
                                
                                    
                                        
                                            z
                                        
                                        ^
                                    
                                
                                
                                    t
                                
                            
                            =
                            ϕ
                            (
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                
                            
                            ,
                            
                                
                                    
                                        
                                            α
                                        
                                        
                                            i
                                        
                                    
                                
                            
                            )
                        
                    .” Note: It is being interpreted that                        
                             
                            
                                
                                    a
                                
                                
                                    i
                                
                            
                        
 from i=1,…, L represents feature vectors, $h_{t-1}$ represents the state vector from the measured data, and $e_{ti} = f_{att}(a_i, h_{t-1})$ represents similarity degrees between individual feature vectors and the state vector in the first vector set). 
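For illustration only, the scoring relation $e_{ti} = f_{att}(a_i, h_{t-1})$ cited above can be sketched as follows. This is a minimal sketch, not Xu's implementation: it assumes $f_{att}$ is a one-hidden-layer perceptron, and the parameter names `W_a`, `W_h`, and `w_out`, together with all numeric values, are hypothetical.

```python
import math

def f_att(a_i, h_prev, W_a, W_h, w_out):
    """Assumed MLP score: w_out . tanh(W_a a_i + W_h h_prev). Parameters are hypothetical."""
    hidden = [
        math.tanh(sum(wa * x for wa, x in zip(row_a, a_i))
                  + sum(wh * x for wh, x in zip(row_h, h_prev)))
        for row_a, row_h in zip(W_a, W_h)
    ]
    return sum(w * z for w, z in zip(w_out, hidden))

def attention_scores(annotations, h_prev, W_a, W_h, w_out):
    """Return one score e_ti per feature vector a_i, i = 1..L, for a fixed state h_prev."""
    return [f_att(a_i, h_prev, W_a, W_h, w_out) for a_i in annotations]

# Toy values (illustrative only): L = 3 feature vectors of size 2, state vector of size 1.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
h_prev = [0.5]
W_a = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[0.0], [0.0]]
w_out = [1.0, 0.0]
scores = attention_scores(annotations, h_prev, W_a, W_h, w_out)
```

Each entry of `scores` corresponds to one feature vector, so identical feature vectors receive identical scores for the same state vector.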
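Xu does not teach: and wherein, in the decoding processing, the processor is configured to: perform first-layer recurrent neural network processing for phrase types to be used in the text and second-layer recurrent neural network processing for words appropriate for each of the phrase types; determine a phrase appropriate for each of the phrase types based on outputs of the second-layer recurrent neural network processing in the first-layer recurrent neural network processing.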
However, Krause teaches: and wherein, in the decoding processing, the processor is configured to: perform first-layer recurrent neural network processing for phrase types to be used in the text and second-layer recurrent neural network processing for words appropriate for each of the phrase types (Krause, pg. 4, sec. 4.3, fig. 2, “The pooled region vector… is given as input to a hierarchical neural language model composed of two modules: a sentence RNN and a word RNN.” Note: It is being interpreted that the sentence RNN represents the first-layer recurrent neural network processing for phrase types to be used in the text and the word RNN represents the second-layer recurrent neural network processing for words appropriate for each of the phrase types); determine a phrase appropriate for each of the phrase types based on outputs of the second-layer recurrent neural network processing (Krause, pg. 4, sec. 4.3, fig. 2, “The sentence RNN is responsible for deciding the number of sentences S…and for producing a P-dimensional topic vector for each of these sentences. Given a topic vector for a sentence, the word RNN generates the words of that sentence.”) in the first-layer recurrent neural network processing (Krause, pg. 4, sec. 4.3, fig. 2, “The pooled region vector… is given as input to a hierarchical neural language model composed of two modules: a sentence RNN and a word RNN.” Note: It is being interpreted that the sentence RNN represents the first-layer recurrent neural network processing).
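Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Krause to teach: and wherein, in the decoding processing, the processor is configured to: perform first-layer recurrent neural network processing for phrase types to be used in the text and second-layer recurrent neural network processing for words appropriate for each of the phrase types.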
Xu does not teach: generate a second vector based on similarity degrees between individual vectors in the first vector set and the state vector.  
However, Yang teaches: generate a second vector based on similarity degrees between individual vectors in the first vector set and the state vector (Yang, pg. 24, sec. 3.3 Stacked Attention Networks, fig. 1(a), fig. 1(b), “Therefore, we iterate the above query-attention process using multiple attention layers, each extracting more fine-grained visual attention information for answer prediction. Formally, the SANs take the following formula: for the k-th attention layer we compute: $h_A^k = \tanh(W_{I,A}^k v_I \oplus (W_{Q,A}^k u^{k-1} + b_A^k))$, $p_I^k = \mathrm{softmax}(W_p^k h_A^k + b_p^k)$ … $\tilde{v}_I^k = \sum_i p_i^k v_i$, $u^k = \tilde{v}_I^k + u^{k-1}$.” Note: It is being interpreted that $u^k$ represents the second vector, $h_A^k$ represents the similarity degree, $v_I$ represents the individual vectors in the first vector set, and $u^{k-1}$ represents the state vector).
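As an illustrative aid, one attention layer of the quoted Yang formula can be sketched as follows. This is a minimal sketch under stated assumptions, not Yang's implementation: the $\oplus$ is read as adding the query term to every region's projection, the same parameters are reused for both layers for brevity (the quoted formula uses per-layer parameters), and the function names and all numeric values are hypothetical.

```python
import math

def matvec(W, x):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_layer(regions, u_prev, W_I, W_Q, b_A, w_P, b_P):
    """One layer: returns the attention weights p^k and the refined vector u^k."""
    # Query term W_Q u^{k-1} + b_A, shared by all regions.
    q = [x + b for x, b in zip(matvec(W_Q, u_prev), b_A)]
    # h_A^k for each region v_i: tanh(W_I v_i + q).
    h_A = [[math.tanh(x + qi) for x, qi in zip(matvec(W_I, v_i), q)]
           for v_i in regions]
    # p^k = softmax(w_P . h_A^k + b_P), one weight per region.
    p = softmax([sum(w * h for w, h in zip(w_P, h_i)) + b_P for h_i in h_A])
    # v~^k = sum_i p_i^k v_i, then u^k = v~^k + u^{k-1}.
    v_tilde = [sum(p_i * v_i[j] for p_i, v_i in zip(p, regions))
               for j in range(len(u_prev))]
    u_next = [vt + u for vt, u in zip(v_tilde, u_prev)]
    return p, u_next

# Toy values (illustrative only): two region vectors and a query vector of size 2.
regions = [[1.0, 0.0], [0.0, 1.0]]
u = [1.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(2):  # two stacked layers; parameters shared here as a simplification
    p, u = attention_layer(regions, u, identity, identity, [0.0, 0.0], [1.0, 1.0], 0.0)
```

In this sketch `u` plays the role of $u^k$, the vector the examiner maps to the claimed second vector, and `p` holds the per-region weights from which it is formed.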
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Yang to teach: generate a second vector based on similarity degrees between individual vectors in the first vector set and the state vector.  The motivation to do so would be to focus on specific image regions that are relevant (Yang, pg. 22, “[T]he stacked attention model, which locates, via multi-step reasoning, the image regions that are relevant… [t]he higher-level attention layer gives a sharper attention distribution focusing on the regions that are more relevant to the answer.”).
Xu does not teach: and input the second vector to a given step. 
However, Wang teaches: and input the second vector to a given step (Wang, pg. 2, fig. 1, fig. 3; as Figure 1 illustrates, the brown blocks titled ATTN output their attention weight vectors into the blue blocks titled WA that are part of the LSTM for Appearance). 
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Wang to teach: and input the second vector to a given step.  The motivation to do so would be to focus on two different types of regions of an image frame (Wang, pg. 2, sec. 1 Introduction, “A soft attention is adopted on the spatial-temporal input features with LSTM to learn the important regions in a frame and the crucial frames in the videos.”).
Regarding dependent claim 4, Xu in view of Krause and in view of Yang and further in view of Wang teaches the text preparation apparatus according to claim 1, wherein the processor is configured (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.”) to learn parameters in the encoding processing and the decoding processing from a plurality of training data pairs, wherein each pair of the plurality of training data pairs is composed of measured data on a plurality of variables and a text (Krause, pg. 5, sec. 4.4 Training and Sampling, “Training data consists of pairs (x, y), with x an image and y a ground-truth paragraph description for that image, where y has S sentences, the ith sentence has $N_i$ words....” Note: It is being interpreted that x represents the measured data on a plurality of variables and y represents the text), and wherein the processor is configured to (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.”) determine relations between the phrase types and the variables based on similarity degrees between individual vectors in the first vector set and the state vector in the learning (Yang, pg. 24, sec. 3.3 Stacked Attention Networks, fig. 1(a), fig. 1(b), “Therefore, we iterate the above query-attention process using multiple attention layers, each extracting more fine-grained visual attention information for answer prediction. Formally, the SANs take the following formula: for the k-th attention layer we compute: $h_A^k = \tanh(W_{I,A}^k v_I \oplus (W_{Q,A}^k u^{k-1} + b_A^k))$, $p_I^k = \mathrm{softmax}(W_p^k h_A^k + b_p^k)$ … $\tilde{v}_I^k = \sum_i p_i^k v_i$, $u^k = \tilde{v}_I^k + u^{k-1}$.” Note: It is being interpreted that $u^k$ represents the second vector, $h_A^k$ represents the similarity degree, $v_I$ represents the individual vectors in the first vector set, and $u^{k-1}$ represents the state vector) from the plurality of training data pairs (Krause, pg. 5, sec. 4.4 Training and Sampling, “Training data consists of pairs (x, y), with x an image and y a ground-truth paragraph description for that image, where y has S sentences, the ith sentence has $N_i$ words....”).
Regarding dependent claim 5, Xu in view of Krause and in view of Yang and further in view of Wang teaches the text preparation apparatus according to claim 1, wherein the processor is configured to (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.”) learn parameters in the encoding processing and the decoding processing from a plurality of training data pairs, wherein each pair of the plurality of training data pairs is composed of measured data on a plurality of variables and a text (Krause, pg. 5, sec. 4.4 Training and Sampling, “Training data consists of pairs (x, y), with x an image and y a ground-truth paragraph description for that image, where y has S sentences, the ith sentence has $N_i$ words....” Note: It is being interpreted that x represents the measured data on a plurality of variables and y represents the text), and wherein the processor is configured to (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.”) determine a feature pattern relevant to a phrase consistent with a state vector in measured data on a variable represented by a feature vector set, based on similarity degrees between individual vectors in the feature vector set and the state vector in the learning (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1 (3), fig. 2, “We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$, [from] i=1,…, L corresponding to the features extracted at different image locations…[t]he weight $\alpha_i$ of each annotation vector $a_i$ is computed by an attention model $f_{att}$ for which we use a multilayer perceptron conditioned on the previous hidden state $h_{t-1}$ …
                            
                                
                                    e
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    f
                                
                                
                                    a
                                    t
                                    t
                                
                            
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                    ,
                                     
                                    
                                        
                                            h
                                        
                                        
                                            t
                                            -
                                            1
                                        
                                    
                                
                            
                            ,
                             
                             
                            
                                
                                    α
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    
                                        
                                            exp
                                        
                                        ⁡
                                        
                                            
                                                
                                                    
                                                        
                                                            e
                                                        
                                                        
                                                            t
                                                            i
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                                =
                                                1
                                            
                                            
                                                L
                                            
                                        
                                        
                                            e
                                            x
                                            p
                                            ⁡
                                            (
                                            
                                                
                                                    e
                                                
                                                
                                                    t
                                                    k
                                                
                                            
                                            )
                                        
                                    
                                
                            
                        
… $\hat{z}_t = \phi(\{a_i\}, \{\alpha_i\})$
.” Note: It is being interpreted that $\hat{z}_t$ represents a feature pattern relevant to a phrase, $a_i$ from i=1,…, L represents feature vector sets, $h_{t-1}$ represents the state vector in measured data, and $e_{ti} = f_{att}(a_i, h_{t-1})$ represents similarity degrees between individual feature vectors and the state vector in the first vector set) from the plurality of training data pairs (Krause, pg. 5, sec. 4.4 Training and Sampling, “Training data consists of pairs (x, y), with x an image and y a ground-truth paragraph description for that image, where y has S sentences, the ith sentence has $N_i$ words....”).
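The attention computation quoted from Xu above (scores $e_{ti}$, softmax weights $\alpha_{ti}$, and context vector $\hat{z}_t$) can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration and are not taken from the cited passages: the attention model $f_{att}$ is taken to be a one-hidden-layer perceptron with tanh activation, the parameter names `W_a`, `W_h` and `w` are hypothetical, and $\phi$ is taken to be the weighted sum of the annotation vectors (the soft attention case), since the quoted text leaves the form of $\phi$ open.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def attention_scores(annotations, h_prev, W_a, W_h, w):
    """e_ti = f_att(a_i, h_{t-1}); here assumed to be w . tanh(W_a a_i + W_h h_{t-1})."""
    proj_h = matvec(W_h, h_prev)  # shared across all annotation vectors
    scores = []
    for a_i in annotations:
        proj_a = matvec(W_a, a_i)
        hidden = [math.tanh(pa + ph) for pa, ph in zip(proj_a, proj_h)]
        scores.append(sum(wk * hk for wk, hk in zip(w, hidden)))
    return scores

def softmax(scores):
    """alpha_ti = exp(e_ti) / sum_k exp(e_tk), shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def context_vector(annotations, weights):
    """z_hat_t = phi({a_i}, {alpha_i}); assumed here to be sum_i alpha_i * a_i."""
    dim = len(annotations[0])
    return [sum(alpha * a[d] for alpha, a in zip(weights, annotations))
            for d in range(dim)]

# Toy inputs (hypothetical): L = 3 annotation vectors of dimension D = 2.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_prev = [0.5, -0.5]                 # previous hidden state h_{t-1}
W_a = [[1.0, 0.0], [0.0, 1.0]]       # hypothetical projection of a_i
W_h = [[1.0, 0.0], [0.0, 1.0]]       # hypothetical projection of h_{t-1}
w = [1.0, 1.0]                       # hypothetical output weights

e = attention_scores(annotations, h_prev, W_a, W_h, w)
alpha = softmax(e)
z_hat = context_vector(annotations, alpha)
print(e, alpha, z_hat)
```

In this sketch each score depends on exactly one annotation vector together with the previous hidden state, and the softmax normalizes the L scores so that the weights sum to one before they are combined into a single context vector.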
 method for a text preparation apparatus to prepare a text, the text preparation apparatus including a storage device and a processor configured to operate in accordance with a program stored in the storage device (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.” Note: It is being interpreted that the NVIDIA Titan Black GPU represents both a processor and a storage device), and the method comprising: performing, by the processor, encoding processing to generate feature vectors from input measured data on a plurality of variables (Xu, pg. 3, sec. 3.1.1 Encoder: Convolutional Features, fig. 1, “We use a convolutional neural network in order to extract a set of feature vectors which we refer to as annotation vectors. The extractor produces L vectors, each of which is a D-dimensional representation corresponding to a part of the image… $a = \{a_1, \ldots, a_L\},\ a_i \in \mathbb{R}^D$
…[i]n order to obtain a correspondence between the feature vectors and portions of the 2-D image, we extract features from a lower convolutional layer…[t]his allows the decoder to selectively focus on certain parts of an image by weighting a subset of all the feature vectors.”); and performing, by the processor, decoding processing to determine a text consistent with the measured data from the feature vectors, wherein the feature vectors include a first feature vector representing features extracted from the entirety of the measured data and feature vector sets of measured data on individual variables, wherein each feature vector in a feature vector set represents a feature of a part of the measured data on the corresponding variable (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1, fig. 2, “We use a long short-term memory (LSTM) network… that produces a caption by generating one word at every time step conditioned on a context vector, the previous hidden state and the previously generated words… In simple terms, the context vector $\hat{z}_t$
 is a dynamic representation of the relevant part of the image input at time t. We define a mechanism $\phi$
 that computes $\hat{z}_t$ from the annotation vectors $a_i$
, i=1,…,L corresponding to the features extracted at different image locations.” Note: It is being interpreted that the context vector $\hat{z}_t$ represents a first feature vector representing features extracted from the entirety of the measured data and the annotation vectors, $a_i$
, i=1,…, L represents feature vector sets of measured data on individual variables, wherein each feature vector in a feature vector set represents a feature of a part of the measured data on the corresponding variable); generating, by the processor, a first vector set from a state vector of a previous step and the feature vector sets (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1 (3), fig. 2, “We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$
, [from] i=1,…, L corresponding to the features extracted at different image locations…[t]he weight $\alpha_i$ of each annotation vector $a_i$ is computed by an attention model $f_{att}$ for which we use a multilayer perceptron conditioned on the previous hidden state $h_{t-1}$
… $e_{ti} = f_{att}(a_i, h_{t-1})$, $\alpha_{ti} = \frac{\exp(e_{ti})}{\sum_{k=1}^{L} \exp(e_{tk})}$
… $\hat{z}_t = \phi(\{a_i\}, \{\alpha_i\})$
.” Note: It is being interpreted that $\hat{z}_t$ represents the first vector set, $h_{t-1}$ represents the state vector of a previous step, and each $a_i$
 from i=1,…, L represents the feature vectors), each vector of the first vector set being generated based on similarity degrees between individual vectors in one of the feature vector sets and the state vector (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1 (3), fig. 2, “We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$
, [from] i=1,…, L corresponding to the features extracted at different image locations…[t]he weight $\alpha_i$ of each annotation vector $a_i$ is computed by an attention model $f_{att}$ for which we use a multilayer perceptron conditioned on the previous hidden state $h_{t-1}$
… $e_{ti} = f_{att}(a_i, h_{t-1})$,
                             
                             
                            
                                
                                    α
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    
                                        
                                            exp
                                        
                                        ⁡
                                        
                                            
                                                
                                                    
                                                        
                                                            e
                                                        
                                                        
                                                            t
                                                            i
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                                =
                                                1
                                            
                                            
                                                L
                                            
                                        
                                        
                                            e
                                            x
                                            p
                                            ⁡
                                            (
                                            
                                                
                                                    e
                                                
                                                
                                                    t
                                                    k
                                                
                                            
                                            )
                                        
                                    
                                
                            
                        
                    …                        
                            
                                
                                    
                                        
                                            z
                                        
                                        ^
                                    
                                
                                
                                    t
                                
                            
                            =
                            ϕ
                            (
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                
                            
                            ,
                            
                                
                                    
                                        
                                            α
                                        
                                        
                                            i
                                        
                                    
                                
                            
                            )
                        
                    .” Note: It is being interpreted that                        
                             
                            
                                
                                    a
                                
                                
                                    i
                                
                            
                        
                     from i=1,…, L represents feature vectors,                         
                            
                                
                                    h
                                
                                
                                    t
                                    -
                                    1
                                
                            
                        
                     represents the state vector  from the measured data, and                         
                            
                                
                                    e
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    f
                                
                                
                                    a
                                    t
                                    t
                                
                            
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                    ,
                                     
                                    
                                        
                                            h
                                        
                                        
                                            t
                                            -
                                            1
                                        
                                    
                                
                            
                        
                     represents similarity degrees between individual feature vectors and the state vector in the first vector set). 
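The relationship among the quoted quantities can be illustrated with a minimal sketch. It is not taken from Xu. The dot-product `f_att` is a hypothetical stand-in for the multilayer perceptron named in the quotation, and `phi` is assumed here to be the weighted sum of the annotation vectors, which the quoted passage does not itself state. All function and variable names are illustrative.

```python
import math

def softmax(scores):
    """Normalize scores into weights that sum to 1 (the alpha_ti formula)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def f_att(a_i, h_prev):
    """Hypothetical stand-in for the attention model: a plain dot product.
    The quoted passage uses a multilayer perceptron here instead."""
    return sum(x * y for x, y in zip(a_i, h_prev))

def soft_attention(annotations, h_prev):
    """Return (alpha, z_hat) for one decoding step."""
    # e_ti: one score per annotation vector, conditioned on the previous state
    e = [f_att(a_i, h_prev) for a_i in annotations]
    alpha = softmax(e)
    dim = len(annotations[0])
    # Assumed phi: weighted sum of the annotation vectors under alpha
    z_hat = [sum(w * a_i[d] for w, a_i in zip(alpha, annotations))
             for d in range(dim)]
    return alpha, z_hat

annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # L = 3 vectors, D = 2
h_prev = [2.0, 0.0]                                  # previous hidden state
alpha, z_hat = soft_attention(annotations, h_prev)
```

In this toy input the first and third annotation vectors score equally against `h_prev`, so they receive equal weights, and the second vector receives the smallest weight.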
Xu does not teach: and wherein, the decoding processing includes: performing, by the processor, first-layer recurrent neural network processing for phrase types to be used in the text and second-layer recurrent neural network processing for words appropriate for each of the phrase types; determining, by the processor, a phrase appropriate for each of the phrase types based on outputs of the second-layer recurrent neural network processing; in the first-layer recurrent neural network processing. 
However, Krause teaches: and wherein, the decoding processing includes: performing, by the processor, first-layer recurrent neural network processing for phrase types to be used in the text and second-layer recurrent neural network processing for words appropriate for each of the phrase types (Krause, pg. 4, sec. 4.3, fig. 2, “The pooled region vector… is given as input to a hierarchical neural language model composed of two modules: a sentence RNN and a word RNN.” Note: It is being interpreted that the sentence RNN represents the first-layer recurrent neural network processing for phrase types to be used in the text and the word RNN represents the second-layer recurrent neural network processing for words appropriate for each of the phrase types); determining, by the processor, a phrase appropriate for each of the phrase types based on outputs of the second-layer recurrent neural network processing (Krause, pg. 4, sec. 4.3, fig. 2, “The sentence RNN is responsible for deciding the number of sentences S…and for producing a P-dimensional topic vector for each of these sentences. Given a topic vector for a sentence, the word RNN generates the words of that sentence.”); in the first-layer recurrent neural network processing (Krause, pg. 4, sec. 4.3, “The pooled region vector… is given as input to a hierarchical neural language model composed of two modules: a sentence RNN and a word RNN.” Note: It is being interpreted that the sentence RNN represents the first-layer recurrent neural network processing).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s method in view of Krause to teach: and wherein, the decoding processing includes: performing, by the processor, first-layer recurrent neural network processing for phrase types to be used in the text and second-layer recurrent neural network processing for words appropriate for each of the phrase types; determining, by the processor, a phrase appropriate for each of the phrase types based on outputs of the second-layer recurrent neural network processing; in the first-layer recurrent neural network processing. The motivation to do so would be to generate longer sentences that avoid the vanishing gradient problem (Krause, pg. 2, sec. 2 Related Work, “In order to generate a paragraph description, a model must reason about long term linguistic structures spanning multiple sentences. Due to vanishing gradients, recurrent neural networks trained with stochastic gradient descent often struggle to learn long term dependencies…Another solution is a hierarchical recurrent network, where the architecture is designed such that different parts of the model operate on different time scales.”).
Xu does not teach: generating, by the processor, a second vector based on similarity degrees between individual vectors in the first vector set and the state vector.  
However, Yang teaches: generating, by the processor, a second vector based on similarity degrees between individual vectors in the first vector set and the state vector (Yang, pg. 24, sec. 3.3 Stacked Attention Networks, fig. 1(a), fig. 1(b), “Therefore, we iterate the above query-attention process using multiple attention layers, each extracting more fine-grained $h_A^k = \tanh(W_{I,A}^k v_I \oplus (W_{Q,A}^k u^{k-1} + b_A^k))$, $p_I^k = \mathrm{softmax}(W_p^k h_A^k + b_p^k)$ … $\tilde{v}_I^k = \sum_i p_i^k v_i$, $u^k = \tilde{v}_I^k + u^{k-1}$.” Note: It is being interpreted that $u^k$ represents the second vector, $h_A^k$ represents the similarity degree, $v_I$ represents the individual vectors in the first vector set, and $u^{k-1}$ represents the state vector).
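The quoted attention-layer formulas can be traced with a minimal sketch. It is an illustrative reading, not Yang's implementation. It assumes that ⊕ adds the query term to the projection of every region vector and that the region vectors and the query vector share one dimension so that the final sum is defined. The weights, shapes and names below are hypothetical toy values.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def attention_layer(regions, u_prev, W_I, W_Q, b_A, w_P, b_P):
    """One attention layer: returns (p, u_next).
    regions holds the region vectors v_i; u_prev is the query u^{k-1}."""
    # Query term W_Q u^{k-1} + b_A, shared by all regions
    q = [x + b for x, b in zip(matvec(W_Q, u_prev), b_A)]
    scores = []
    for v_i in regions:
        # h_A for region i: tanh(W_I v_i + q), the assumed meaning of the ⊕ term
        h_i = [math.tanh(x + y) for x, y in zip(matvec(W_I, v_i), q)]
        # One scalar score per region: w_P . h_i + b_P
        scores.append(sum(w * h for w, h in zip(w_P, h_i)) + b_P)
    p = softmax(scores)  # p_I^k: attention distribution over regions
    dim = len(u_prev)
    # v_tilde: attention-weighted sum of the region vectors
    v_tilde = [sum(p_i * v_i[d] for p_i, v_i in zip(p, regions))
               for d in range(dim)]
    # u^k = v_tilde + u^{k-1}
    u_next = [vt + up for vt, up in zip(v_tilde, u_prev)]
    return p, u_next

identity = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.0, 1.0]]   # two toy region vectors
u0 = [1.0, 0.0]                      # initial query vector
p1, u1 = attention_layer(regions, u0, identity, identity, [0.0, 0.0], [1.0, 1.0], 0.0)
p2, u2 = attention_layer(regions, u1, identity, identity, [0.0, 0.0], [1.0, 1.0], 0.0)
```

Calling the layer twice, with the output of the first call as the query of the second, mirrors the iteration over multiple attention layers described in the quotation.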
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s method in view of Yang to teach: generating, by the processor, a second vector based on similarity degrees between individual vectors in the first vector set and the state vector. The motivation to do so would be to focus on specific image regions that are relevant (Yang, pg. 22, “[T]he stacked attention model, which locates, via multi-step reasoning, the image regions that are relevant… [t]he higher-level attention layer gives a sharper attention distribution focusing on the regions that are more relevant to the answer.”).
Xu does not teach: and inputting, by the processor, the second vector to a given step. 
However, Wang teaches: and inputting, by the processor, the second vector to a given step (Wang, pg. 2, fig. 1, fig. 3. Note: Figure 1 illustrates brown blocks titled ATTN outputting their attention weight vectors into blue blocks titled WA that are part of the LSTM for Appearance).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s method in view of Wang to teach: and inputting, by the processor, the second vector to a given step.  The motivation to do so would be to focus on two different types of regions of an image frame (Wang, pg.2, sec. 1 Introduction,  “A soft attention is adopted on the spatial-temporal input features with LSTM to learn the important regions in a frame and the crucial frames in the videos.”).
encoder configured to generate feature vectors from input measured data on a plurality of variables (Xu, pg. 3, sec. 3.1.1 Encoder: Convolutional Features, fig. 1, “We use a convolutional neural network in order to extract a set of feature vectors which we refer to as annotation vectors. The extractor produces L vectors, each of which is a D-dimensional representation corresponding to a part of the image… $a = \{a_1, \ldots, a_L\}$, $a_i \in \mathbb{R}^D$ …[i]n order to obtain a correspondence between the feature vectors and portions of the 2-D image, we extract features from a lower convolutional layer…[t]his allows the decoder to selectively focus on certain parts of an image by weighting a subset of all the feature vectors.”); and a decoder configured to determine a text consistent with the measured data from the feature vectors, wherein the feature vectors include a first feature vector representing features extracted from the entirety of the measured data and feature vector sets of measured data on individual variables, wherein each feature vector in a feature vector set represents a feature of a part of the measured data on the corresponding variable (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1, fig. 2, “We use a long short-term memory (LSTM) network… that produces a caption by generating one word at every time step conditioned on a context vector, the previous hidden state and the previously generated words… In simple terms, the context vector $\hat{z}_t$ is a dynamic representation of the relevant part of the image input at time t. We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$, i = 1,…,L corresponding to the features extracted at different image locations.” Note: It is being interpreted that the context vector $\hat{z}_t$ represents a first feature vector representing features extracted from the entirety of the measured data and the annotation vectors $a_i$, i = 1,…,L, represent feature vector sets of measured data on individual variables, wherein each feature vector in a feature vector set represents a feature of a part of the measured data on the corresponding variable); generate a first vector set from a state vector of a previous step and the feature vector sets (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1 (3), fig. 2, “We define a mechanism $\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$, [from] i = 1,…,L corresponding to the features extracted at different image locations…[t]he weight $\alpha_i$ of each annotation vector $a_i$ is computed by an attention model $f_{att}$ for which we use a multilayer perceptron conditioned on the previous hidden state $h_{t-1}$
                    …                        
                            
                                
                                    e
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    f
                                
                                
                                    a
                                    t
                                    t
                                
                            
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                    ,
                                     
                                    
                                        
                                            h
                                        
                                        
                                            t
                                            -
                                            1
                                        
                                    
                                
                            
                            ,
                             
                             
                            
                                
                                    α
                                
                                
                                    t
                                    i
                                
                            
                            =
                            
                                
                                    
                                        
                                            exp
                                        
                                        ⁡
                                        
                                            
                                                
                                                    
                                                        
                                                            e
                                                        
                                                        
                                                            t
                                                            i
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                                =
                                                1
                                            
                                            
                                                L
                                            
                                        
                                        
                                            e
                                            x
                                            p
                                            ⁡
                                            (
                                            
                                                
                                                    e
                                                
                                                
                                                    t
                                                    k
                                                
                                            
                                            )
                                        
                                    
                                
                            
                        
                    …                        
                            
                                
                                    
                                        
                                            z
                                        
                                        ^
                                    
                                
                                
                                    t
                                
                            
                            =
                            ϕ
                            (
                            
                                
                                    
                                        
                                            a
                                        
                                        
                                            i
                                        
                                    
                                
                            
                            ,
                            
                                
                                    
                                        
                                            α
                                        
                                        
                                            i
                                        
                                    
                                
                            
                            )
                        
.” Note: It is being interpreted that $\hat{z}_t$ represents the first vector set, $h_{t-1}$ represents the state vector of a previous step, and each $a_i$, for $i = 1, \dots, L$, represents the feature vectors), each vector of the first vector set being generated based on similarity degrees between individual vectors in one of the feature vector sets and the state vector (Xu, pg. 3, sec. 3.1.2. Decoder: Long Short-Term Memory Network, fig. 1 (3), fig. 2, “We define a mechanism
$\phi$ that computes $\hat{z}_t$ from the annotation vectors $a_i$, [from] $i = 1, \dots, L$ corresponding to the features extracted at different image locations…[t]he weight
$\alpha_i$ of each annotation vector $a_i$ is computed by an attention model $f_{att}$ for which we use a multilayer perceptron conditioned on the previous hidden state $h_{t-1}$… $e_{ti} = f_{att}(a_i, h_{t-1})$, $\alpha_{ti} = \frac{\exp(e_{ti})}{\sum_{k=1}^{L} \exp(e_{tk})}$ … $\hat{z}_t = \phi(\{a_i\}, \{\alpha_i\})$
.” Note: It is being interpreted that $a_i$, for $i = 1, \dots, L$, represents feature vectors, $h_{t-1}$ represents the state vector from the measured data, and $e_{ti} = f_{att}(a_i, h_{t-1})$ represents similarity degrees between individual feature vectors and the state vector in the first vector set).
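For illustration only, the computation quoted from Xu above can be sketched as follows. This is a minimal sketch, not Xu's implementation: the one-hidden-layer form of `f_att`, the weight names `W_a`, `W_h`, `w_out`, and the toy values are all assumptions, and the mechanism $\phi$ is assumed here to be the weighted sum of the annotation vectors, which the quoted passage does not itself specify.

```python
import math

def softmax(scores):
    """Normalize scores e_t1..e_tL into weights alpha_t1..alpha_tL."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def f_att(a_i, h_prev, W_a, W_h, w_out):
    """Hypothetical one-hidden-layer perceptron scoring a_i against h_prev."""
    hidden = [
        math.tanh(sum(wa * x for wa, x in zip(row_a, a_i))
                  + sum(wh * y for wh, y in zip(row_h, h_prev)))
        for row_a, row_h in zip(W_a, W_h)
    ]
    return sum(w * u for w, u in zip(w_out, hidden))

def soft_attention(annotations, h_prev, W_a, W_h, w_out):
    """Return (alpha, z_hat); z_hat is the alpha-weighted sum (assumed phi)."""
    e = [f_att(a, h_prev, W_a, W_h, w_out) for a in annotations]
    alpha = softmax(e)
    dim = len(annotations[0])
    z_hat = [sum(al * a[d] for al, a in zip(alpha, annotations))
             for d in range(dim)]
    return alpha, z_hat

# Toy values (assumed): L = 3 annotation vectors of dimension 2.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_prev = [0.5, -0.5]
W_a = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[1.0, 0.0], [0.0, 1.0]]
w_out = [1.0, 1.0]
alpha, z_hat = soft_attention(annotations, h_prev, W_a, W_h, w_out)
```

In this sketch the scores $e_{ti}$ play the role of the similarity degrees, the weights $\alpha_{ti}$ sum to one over the $L$ annotation vectors, and $\hat{z}_t$ has the same dimension as each $a_i$.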
Xu does not teach: wherein, the decoder includes a first-layer recurrent neural network for phrase types to be used in the text and second-layer recurrent neural network for words appropriate for each of the phrase types, and wherein the decoder is configured to: determine a 
However, Krause teaches: wherein, the decoder includes a first-layer recurrent neural network for phrase types to be used in the text and second-layer recurrent neural network for words appropriate for each of the phrase types (Krause, pg. 4, sec. 4.3, fig. 2, “The pooled region vector… is given as input to a hierarchical neural language model composed of two modules: a sentence RNN and a word RNN.” Note: It is being interpreted that the sentence RNN represents the first-layer recurrent neural network for phrase types to be used in the text and the word RNN represents the second-layer recurrent neural network for words appropriate for each of the phrase types), and wherein the decoder is configured to: determine a phrase appropriate for each of the phrase types based on outputs of the second-layer recurrent neural network (Krause, pg. 4, sec. 4.3, fig. 2, “The sentence RNN is responsible for deciding the number of sentences S…and for producing a P-dimensional topic vector for each of these sentences. Given a topic vector for a sentence, the word RNN generates the words of that sentence.”); in the first-layer recurrent neural network (Krause, pg. 4, sec. 4.3, fig. 2, “The pooled region vector… is given as input to a hierarchical neural language model composed of two modules: a sentence RNN and a word RNN.” Note: It is being interpreted that the sentence RNN represents the first-layer recurrent neural network).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Krause to teach: wherein, the decoder includes a first-layer recurrent neural network for phrase types to be used in the text and second-layer recurrent neural network for words appropriate for each of the phrase types, and wherein the decoder is configured to: determine a phrase appropriate for each 
Xu does not teach: generate a second vector based on similarity degrees between individual vectors in the first vector set and the state vector.  
However, Yang teaches: generate a second vector based on similarity degrees between individual vectors in the first vector set and the state vector (Yang, pg. 24, sec. 3.3 Stacked Attention Networks, fig. 1(a), fig. 1(b), “Therefore, we iterate the above query-attention process using multiple attention layers, each extracting more fine-grained visual attention information for answer prediction. Formally, the SANs take the following formula: for the k-th attention layer we compute: $h_A^k = \tanh(W_{I,A}^k v_I \oplus (W_{Q,A}^k u^{k-1} + b_A^k))$, $p_I^k = \mathrm{softmax}(W_p^k h_A^k + b_p^k)$ … $\tilde{v}_I^k = \sum_i p_i^k v_i$, $u^k = \tilde{v}_I^k + u^{k-1}$
                    .” Note: It is being interpreted that                         
                            
                                
                                    u
                                
                                
                                    k
                                
                            
                        
                     represents the second vector,                        
                             
                            
                                
                                    h
                                
                                
                                    A
                                
                                
                                    k
                                
                            
                        
                     represents the similarity degree,                          
                            
                                
                                    v
                                
                                
                                    I
                                
                            
                        
                      represents the individual vectors in the first vector set and                         
                             
                            
                                
                                    u
                                
                                
                                    k
                                    -
                                    1
                                
                            
                        
                     which represents the state vector).
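For illustration only, the interpretation above can be sketched in a few lines of Python. This is a minimal sketch of one attention layer of the general form quoted from Yang, not code taken from any cited reference or from the application. The function names (`softmax`, `matvec`, `attention_step`) and the weight names (`W_I`, `W_Q`, `b_A`, `w_P`, `b_P`) are hypothetical, and the sketch assumes the region vectors and the state vector have the same dimension.

```python
import math


def softmax(scores):
    """Normalize raw scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_step(regions, u_prev, W_I, W_Q, b_A, w_P, b_P):
    """One attention layer (hypothetical parameter names).

    regions : individual vectors v_i of the first vector set
    u_prev  : state vector u^{k-1}
    Returns (u_k, p), where u_k is the second vector and p holds the
    attention weights over the regions.
    """
    # Project the state vector once; it is shared across all regions.
    q = [a + b for a, b in zip(matvec(W_Q, u_prev), b_A)]

    scores = []
    for v in regions:
        # h_A for this region: combines the region vector with the state.
        h = [math.tanh(a + b) for a, b in zip(matvec(W_I, v), q)]
        scores.append(sum(w * hi for w, hi in zip(w_P, h)) + b_P)

    p = softmax(scores)

    # Weighted sum of region vectors (assumes same dimension as u_prev).
    dim = len(u_prev)
    v_tilde = [sum(p_i * r[j] for p_i, r in zip(p, regions)) for j in range(dim)]

    # Second vector: attended vector plus the previous state vector.
    u_k = [a + b for a, b in zip(v_tilde, u_prev)]
    return u_k, p
```

In this sketch the final line corresponds to the quoted sum ending in $u^{k-1}$: the attended region vector is added to the previous state vector to produce the second vector.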
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Yang to teach: generate a second vector based on similarity degrees between individual vectors in the first vector set and the state vector. The motivation to do so would be to focus on specific image regions (Yang, “[t]he higher-level attention layer gives a sharper attention distribution focusing on the regions that are more relevant to the answer.”).
Xu does not teach: and input the second vector to a given step. 
However, Wang teaches: and input the second vector to a given step (Wang, pg. 2, fig. 1, fig. 3. Figure 1 illustrates brown blocks titled ATTN outputting their attention weight vectors into blue blocks titled WA that are part of the LSTM for Appearance).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Wang to teach: and input the second vector to a given step. The motivation to do so would be to focus on two different types of regions of an image frame (Wang, pg. 2, sec. 1 Introduction, “A soft attention is adopted on the spatial-temporal input features with LSTM to learn the important regions in a frame and the crucial frames in the videos.”).
Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. "Show, attend and tell: Neural image caption generation with visual attention." International conference on machine learning. PMLR, (2015, “Xu”) in view of Krause et al. "A Hierarchical Approach for Generating Descriptive Image Paragraphs." arXiv preprint arXiv:1611.06607 (2016, “Krause”), in view of Yang et al. "Stacked attention networks for image question answering." Proceedings of the IEEE conference on computer vision and pattern recognition. (2016, “Yang”), in view of Wang et al. "Hierarchical attention network for action recognition in videos." arXiv preprint arXiv:1607.06416 (2016, “Wang”), and further in view of Tong et al. "Production Estimation for Shale Wells with Sentiment-Based Features from Geology Reports." 2015 IEEE International Conference on Data Mining Workshop (ICDMW). IEEE, (2015, “Tong”).
Regarding dependent claim 2, Xu in view of Krause and in view of Yang and further in view of Wang teaches the text preparation apparatus according to claim 1. 
Xu in view of Krause and in view of Yang and further in view of Wang does not teach: wherein the measured data is data acquired in drilling an oil well, wherein the text is a geology report in drilling the oil well, and wherein the phrase types are rock properties.
However, Tong teaches: wherein the measured data is data acquired in drilling an oil well, wherein the text is a geology report in drilling the oil well (Tong, pg. 1314, sec. A Experimental Data, “In this experiment, we focus on three categories of information, which are well summarization information, well trajectory, and geology report in scanned reports. Well summarizations record the basic information about each well, including wellhead location, well type, total measured depth, and formation tops, etc. Note that formation tops data record depths of a number of specific formations.”), and wherein the phrase types are rock properties (Tong, pg. 1311, sec. A Phrase Extraction, “We extract three categories of featured phrases. i.e., oil stain, porosity, and fluorescence cut. The main reason for extracting them is that they are key rock properties that a geologist examines to evaluate the quality of a reservoir.”).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xu’s apparatus in view of Krause and in view of Yang and in view of Wang and further in view of Tong to teach: wherein the measured data is data acquired in drilling an oil well, wherein the text is a geology report in drilling the oil well, and wherein the phrase types are rock properties. The motivation to do so would be to incorporate sentiment analysis to better analyze an expert’s opinion on a given matter (Tong, pg. “In the present work, the geology report is studied to determine how it can contribute to the FE process. We mainly focus on extracting information from the geology report, in which the geologist gives an opinion on a number of specific properties of a sample rock. The opinion is identified as a sentiment by using sentiment analysis.”).
Regarding dependent claim 3, Xu in view of Krause and in view of Yang and in view of Wang and further in view of Tong teaches the text preparation apparatus according to claim 2, wherein the processor is configured to (Xu, pg. 6, sec. 4.3 Training Procedure, “On our largest dataset (MS COCO), our soft attention model took less than 3 days to train on an NVIDIA Titan Black GPU.”) learn parameters in the encoding processing and the decoding processing from a plurality of training data pairs, and wherein each pair of the plurality of training data pairs (Krause, pg. 5, sec. 4.4 Training and Sampling, “Training data consists of pairs (x, y), with x an image and y a ground-truth paragraph description for that image, where y has S sentences, the ith sentence has $N_{i}$ words....” Note: It is being interpreted that x represents the measured data on a plurality of variables and y represents the text), is composed of measured data on a plurality of variables in a certain depth range and a geology report of the certain depth range (Tong, pg. 1314, sec. A Experimental Data, “In this experiment, we focus on three categories of information, which are well summarization information, well trajectory, and geology report in scanned reports. Well summarizations record the basic information about each well, including wellhead location, well type, total measured depth, and formation tops, etc. Note that formation tops data record depths of a number of specific formations.”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20180143966 A1 (details a spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning)
US 9858524 B2 (details methods, systems, and apparatus of obtaining an input image and feeding the image into a Convolutional Neural Network encoder and then into a LSTM decoder to generate a natural language description of the input image)

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ADAM CLARK STANDKE whose telephone number is (571)270-1806.  The examiner can normally be reached between the hours of 9:30AM-6:30PM (Eastern Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications 






/ADAM CLARK STANDKE/Examiner, Art Unit 2122                                                                                                                                                                                                        
/ERIC NILSSON/Primary Examiner, Art Unit 2122