DETAILED ACTION
Response to Arguments
Applicant’s arguments with respect to claims 1, 8, and 14 have been considered but are moot because the prior art reference of Doan has been replaced with the prior art reference of Bernstein in view of Applicant’s new amendments. 
Bernstein is directed to an automatic schema matcher. See Abstract. Accordingly, Bernstein teaches Applicant’s amended limitation at ¶ 0036, stating that “[c]ommon types of schemas can include extensible markup language (XML) schemas, relational (e.g., structured query language (SQL)) schemas, ontology schemas (e.g., resource description framework (RDF) schema or web ontology language (OWL)), and object-oriented (e.g., common language runtime (CLR)) schemas. As illustrated in FIG. 1, given two schemas (e.g., 106, 108), the systems and methods described herein can facilitate automatically developing a mapping from the first schema 106 to the second schema 108.”
Bernstein further teaches the limitation at ¶¶ 0062-0066, stating that “[p]hase two can rank the candidates by scoring each candidate match based on textual similarity, structural similarity and type… [a]s well, each candidate's total similarity to the selected element can be computed as a weighted sum of textual and structural similarity.”
This leads to the following claim mapping of Bernstein to Applicant’s amended limitation:
mapping from relational (e.g., structured query language (SQL)) to extensible markup language (XML) schemas, where phase two can use a combination of text, structure and type to calculate the similarity as illustrated by fig. 1 [wherein each similarity score of the plurality of similarity scores represents a similarity between a field of the non-hierarchical text data schema and a single level XDM of the plurality of single level XDMs of the XDM field]; and
mapping from relational (e.g., structured query language (SQL)) to extensible markup language (XML) schemas, where each candidate's total similarity to the selected element can be computed as a weighted sum of textual and structural similarity [computing a probability of a match between the input field and the XDM field by multiplying the plurality of similarity scores].¹


In regard to Applicant’s argument that there is no prima facie case that claims 1, 8, and 14 are obvious in view of Collins, Hainaut, Ebraheem, and Doan, Examiner must once again respectfully point out that Applicant’s arguments with respect to claims 1, 8, and 14 have been considered but are moot because the prior art reference of Doan has been replaced with the prior art reference of Bernstein in view of Applicant’s new amendments.
As detailed above, Bernstein teaches Applicant’s amended limitation:
 wherein each similarity score of the plurality of similarity scores represents a similarity between a field of the non-hierarchical text data schema and a single level XDM of the plurality of single level XDMs of the XDM field; 
computing a probability of a match between the input field and the XDM field by multiplying the plurality of similarity scores. 
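For illustration only, the computation recited in the limitation above can be sketched as follows. The sketch assumes each similarity score is a value in [0, 1], one per single level XDM on the path; the function name `match_probability` and the range check are hypothetical and are not drawn from the claims or from any cited reference.

```python
import math


def match_probability(level_scores):
    """Multiply per-level similarity scores into one match probability.

    level_scores: one similarity score per single level XDM on the path
    (assumed here to lie in [0, 1]; that range is an assumption).
    """
    if not level_scores:
        raise ValueError("at least one similarity score is required")
    for score in level_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
    # The product shrinks whenever any single level matches poorly.
    return math.prod(level_scores)
```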
Accordingly, the current Office Action has shown on pages 5-11 that the cited references of Collins, Hainaut, Ebraheem, and Bernstein teach Applicant’s limitation. In this case, unlike the prior art of Doan, the prior art of Bernstein does teach mapping non-hierarchical data (for example, relational schemas) to hierarchical data (for example, XML schemas). Accordingly, the 35 U.S.C. § 103 rejections of claims 1, 8, and 14 and of claims 2-5, 7, 9-12, 15-18, and 20 are not withdrawn.


 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/10/2022 has been entered.
 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-12, and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Collins, Samuel Robert, et al. "XML schema mappings for heterogeneous database access." Information and Software Technology 44.4 (2002) (“Collins”) in view of Hainaut et al. Hierarchical Data Model. Encyclopedia of Database Systems. Springer, Boston, MA. (2009) (“Hainaut”) and in view of Ebraheem, et al. "Distributed representations of tuples for entity resolution." Proceedings of the VLDB Endowment 11.11 (2018) (“Ebraheem”) and further in view of Bernstein et al. US 2007/0055655 A1 (“Bernstein”).
Regarding claim 1, Collins teaches a method comprising:
	identifying a first sequence of vectors representing an input field of a non-hierarchical text data schema (Collins, pg. 253, right-column, “This section deals with mapping schemas of databases stored in the network data model…The network data model uses two basic constructs: Record Types and Set Types. A record contains data items which may be of type…a vector (a list).” Collins teaches that the network data model uses two basic constructs, Record Types and Set Types, and that a record contains data items which may be of type a vector (a list) (i.e., identifying a first sequence of vectors representing an input field of a non-hierarchical text data schema));
	identifying a second sequence of vectors representing a hierarchical standard data model (XDM) field (Collins, pgs. 253-254, right-column, “Step 2a: The next step of the algorithm is to create a complex type for the individual records of the various record types. Therefore, for each record type RT with data items D1, …, Dn, create a complex type RT-RecType and include D1, …, Dn as elements. If Dx is a vector observe Step 2b… For the data item D, if D is a vector, add an anonymous complex type to the element D4. Insert element D-Item and set the minOccurs attribute to 0 and the maxOccurs attribute to unbounded.” Collins teaches that, for each record type RT with data items D1, …, Dn, a complex type RT-RecType is created that includes D1, …, Dn as elements, and that, if Dx is a vector, an anonymous complex type is added to the element D4 and an element D-Item is inserted with the minOccurs attribute set to 0 and the maxOccurs attribute set to unbounded (i.e., identifying a second sequence of vectors representing a hierarchical standard data model (XDM) field));
	converting the input field to an XDM schema by mapping the input field to the XDM field (Collins, pg. 255, right-column, sec. 3.2.1. Sample network Database, “The XML schema for a sample network data model is shown below.” Collins teaches an XML schema for a sample network data model in which the sample network data model of RT1 and RT2 and its data items are converted into an XML schema and fields (i.e., converting the input field to an XDM schema by mapping the input field to the XDM field)) based on the probability of the match (Collins, pg. 252, left-column, “Wrapper-mediator systems are sophisticated applications that abstract the data source from the users. In addition, they translate queries into the terms of the data sources and integrate the results…This research is similar to the work done with mediators and wrappers as well as global schemas. Our approach utilizes XML schemas the unifying data model.” Collins teaches wrapper-mediator systems (i.e., based on the probability of the match)).
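For illustration only, Steps 2a and 2b as quoted from Collins can be sketched as follows. This is a minimal sketch under stated assumptions, not Collins's implementation: the function name `record_type_to_complex_type`, the `(name, is_vector)` input shape, the `xs:sequence` wrapper, and the `xs:string` element type are all hypothetical choices that do not appear in the quoted passage.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def record_type_to_complex_type(record_name, data_items):
    """Build an XSD complexType for one record type.

    data_items: list of (name, is_vector) pairs (hypothetical input shape).
    """
    # Step 2a: a complex type named RT-RecType holding D1..Dn as elements.
    complex_type = ET.Element(
        f"{{{XS}}}complexType", {"name": f"{record_name}-RecType"}
    )
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for name, is_vector in data_items:
        if not is_vector:
            # Assumption: scalar data items are typed as xs:string.
            ET.SubElement(
                sequence, f"{{{XS}}}element", {"name": name, "type": "xs:string"}
            )
            continue
        # Step 2b: a vector item gets an anonymous complex type containing
        # a repeating D-Item element (minOccurs=0, maxOccurs=unbounded).
        element = ET.SubElement(sequence, f"{{{XS}}}element", {"name": name})
        anonymous = ET.SubElement(element, f"{{{XS}}}complexType")
        inner = ET.SubElement(anonymous, f"{{{XS}}}sequence")
        ET.SubElement(
            inner,
            f"{{{XS}}}element",
            {
                "name": f"{name}-Item",
                "type": "xs:string",
                "minOccurs": "0",
                "maxOccurs": "unbounded",
            },
        )
    return complex_type
```

Calling `ET.tostring(record_type_to_complex_type("RT1", [("D1", False), ("D2", True)]), encoding="unicode")` serializes the resulting fragment for inspection.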
Collins does not teach: wherein the XDM field represents to a path including a plurality of single level XDMs in an XDM tree hierarchy. 
However, Hainaut teaches wherein the XDM field represents to a path including a plurality of single level XDMs in an XDM tree hierarchy (Hainaut, pg. 1296; see also fig. 6 at pg. 1299 (detailing the partial transformation of an Entity-relationship schema into a forest of trees hierarchy schema). Figure 1(a) details that the transformation T1 creates a hierarchical data model with a tree hierarchy with three functional relationships Ra, Rc, and Rb pointing to blocks A, C, and B that are on the same level in the tree hierarchy. Hainaut teaches the functional relationships Ra, Rc, and Rb of Figure 1(a) (i.e., wherein the XDM field represents to a path) pointing to blocks A, C, and B on the same level in the tree hierarchy (i.e., including a plurality of single level XDMs in an XDM tree hierarchy)).
Accordingly, one of ordinary skill in the art would modify Collins’s method in view of Hainaut. The motivation to do so would be to have a database schema designed to handle high-speed transactional data (Hainaut, pg. 1295, “IMS [/Hierarchical data models are]…now a complex and powerful data management and data communication environment mostly used by data intensive batch and On-Line Transaction Processing (OLTP) applications.”).
Collins does not teach: generating an input field vector based on the first sequence of vectors using a sequence model; generating a plurality of single level XDM vectors based on the second sequence of vectors using the sequence model, wherein each single level XDM vector of the plurality of single level XDM vectors represents a single level XDM of the plurality of single level XDMs of the XDM field; computing a plurality of similarity scores based on the input field vector and the plurality of single level XDM vectors.
However, Ebraheem teaches generating an input field vector based on the first sequence of vectors using a sequence model (Ebraheem, pg. 6 left column; see also fig. 3 at pg. 3 (detailing the construction of a composed vector from a word embedding lookup layer and LSTM cells of a RNN); see also Algorithm 2 at pg. 3 (detailing the construction of a distributed vector representation v(t) of a given tuple). As figure 5 details, the word vectors for tuple t outputted by the pre-trained embedding lookup layer are passed to a Composition layer to generate an attribute level vector. Ebraheem teaches that the word vectors for tuple t outputted by the pre-trained embedding lookup layer are passed to a Composition layer to generate an attribute level vector (i.e., generating a single level XDM field vector based on the second sequence of vectors) using LSTM cells of a RNN (i.e., using the sequence model));
generating a plurality of single level XDM vectors based on the second sequence of vectors using the sequence model, wherein each single level XDM vector of the plurality of single level XDM vectors represents a single level XDM of the plurality of single level XDMs of the XDM field (Ebraheem, pg. 6 left column; see also fig. 3 at pg. 3 (detailing the construction of a composed vector from a word embedding lookup layer and LSTM cells of a RNN); see also Algorithm 2 at pg. 3 (detailing the construction of a distributed vector representation v(t) of a given tuple). As figure 5 details, the word vectors for tuple t′ outputted by the pre-trained embedding lookup layer are passed to a Composition layer to generate an attribute level vector. Ebraheem teaches that the word vectors for tuple t′ outputted by the pre-trained embedding lookup layer are passed to a Composition layer to generate an attribute level vector using LSTM cells of a RNN (i.e., generating a plurality of single level XDM vectors based on the second sequence of vectors using the sequence model, wherein each single level XDM vector of the plurality of single level XDM vectors represents a single level XDM of the plurality of single level XDMs of the XDM field));
computing a plurality of similarity scores based on the input field vector and the plurality of single level XDM vectors (Ebraheem, pg. 4 left column, sec. Computing Distributional Similarity, “Given the distributed representation of a pair of tuples t and t′, the next step is to compute the similarity between their distributed representations v(t) and v(t′)…or the distributed representations computed by LSTM, each vector has x dimensions, we can use methods including subtracting (vector difference) or multiplying (hadamard product) the corresponding entries of the two vectors, which will result in a x-dimensional similarity vector.” Ebraheem teaches the similarity vector (i.e., computing a similarity score), the distributed representation v(t) (i.e., based on the input field vector), and the distributed representation v(t′) (i.e., and the single level XDM field vector)).
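For illustration only, the two element-wise operations named in the quoted Ebraheem passage (vector difference and Hadamard product) can be sketched as follows. The function name `similarity_vector` and the `method` parameter are hypothetical, and plain Python lists stand in for the LSTM-produced representations v(t) and v(t′).

```python
def similarity_vector(v_t, v_t_prime, method="hadamard"):
    """Combine two x-dimensional vectors into an x-dimensional similarity vector.

    method: "difference" subtracts corresponding entries;
            "hadamard" multiplies corresponding entries.
    """
    if len(v_t) != len(v_t_prime):
        raise ValueError("both vectors must have the same number of dimensions")
    if method == "difference":
        return [a - b for a, b in zip(v_t, v_t_prime)]
    if method == "hadamard":
        return [a * b for a, b in zip(v_t, v_t_prime)]
    raise ValueError(f"unknown method: {method}")
```

Either operation preserves the dimensionality of its inputs, which matches the quoted statement that the result is an x-dimensional similarity vector.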
Accordingly, one of ordinary skill in the art would modify Collins’s method in view of Ebraheem. The motivation to do so would be to include recent advances in deep learning, such as word embeddings, to capture similarities between databases regarding entity resolution (Ebraheem, pg. 1, “With the recent advances in deep learning, in particular distributed representation of words (a.k.a. word embeddings), we present a novel ER system, called DeepER, that achieves good accuracy, high efficiency, as well as ease-of-use (i.e., much less human efforts). For accuracy, we use sophisticated composition methods, namely uni-and bi-directional recurrent neural networks (RNNs) with long short term memory (LSTM) hidden units, to convert each tuple to a distributed representation (i.e., a vector), which can in turn be used to effectively capture similarities….”).
Collins does not teach: wherein each similarity score of the plurality of similarity scores represents a similarity between a field of the non-hierarchical text data schema and a single level XDM of the plurality of single level XDMs of the XDM field; computing a probability of a match between the input field and the XDM field by multiplying the plurality of similarity scores. 
However, Bernstein teaches: wherein each similarity score of the plurality of similarity scores represents a similarity between a field of the non-hierarchical text data schema and a single level XDM of the plurality of single level XDMs of the XDM field; computing a probability of a match between the input field and the XDM field by multiplying the plurality of similarity scores (Bernstein, para. 0036, see also fig. 1, 3, and 5-7, “A schema can be a template for data instances. Common types of schemas can include extensible markup language (XML) schemas, relational (e.g., structured query language (SQL)) schemas, ontology schemas (e.g., resource description framework (RDF) schema or web ontology language (OWL)), and object-oriented (e.g., common language runtime (CLR)) schemas. As illustrated in FIG. 1, given two schemas (e.g., 106, 108), the systems and methods described herein can facilitate automatically developing a mapping from the first schema 106 to the second schema 108.” & see Bernstein, paras. 0062-0066, “Phase two can use a combination of text, structure and type to calculate the similarity… [p]hase two can rank the candidates by scoring each candidate match based on textual similarity, structural similarity and type… [a]s well, each candidate's total similarity to the selected element can be computed as a weighted sum of textual and structural similarity. Moreover, each candidate's similarity scores can be normalized to a value in [0,1] based on the maximum value of each kind of score.” Bernstein teaches: mapping from relational (e.g., structured query language (SQL)) to extensible markup language (XML) schemas, where phase two can use a combination of text, structure and type to calculate the similarity as illustrated by fig. 1 (i.e., wherein each similarity score of the plurality of similarity scores represents a similarity between a field of the non-hierarchical text data schema and a single level XDM of the plurality of single level XDMs of the XDM field); and mapping from relational (e.g., structured query language (SQL)) to extensible markup language (XML) schemas, where each candidate's total similarity to the selected element can be computed as a weighted sum of textual and structural similarity (i.e., computing a probability of a match between the input field and the XDM field by multiplying the plurality of similarity scores)).
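For illustration only, the phase-two scoring quoted from Bernstein (normalizing each kind of score to [0,1] by its maximum, then ranking candidates by a weighted sum of textual and structural similarity) can be sketched as follows. The function name `rank_candidates`, the input shape, and the equal default weights are hypothetical; the quoted paragraphs do not state particular weight values.

```python
def rank_candidates(candidates, text_weight=0.5, structure_weight=0.5):
    """Rank candidate matches by a weighted sum of normalized scores.

    candidates: dict mapping candidate name -> (textual, structural) raw scores.
    The weights are placeholder assumptions, not values from Bernstein.
    """
    if not candidates:
        return []
    # Normalize each kind of score by its maximum; fall back to 1.0 when
    # every raw score of that kind is zero, to avoid dividing by zero.
    max_text = max(t for t, _ in candidates.values()) or 1.0
    max_struct = max(s for _, s in candidates.values()) or 1.0
    totals = {
        name: text_weight * (t / max_text) + structure_weight * (s / max_struct)
        for name, (t, s) in candidates.items()
    }
    # Highest total similarity first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```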
Accordingly, one of ordinary skill in the art would modify Collins’s method in view of Bernstein. The motivation to do so would be to use artificial intelligence to automatically perform schema actions (Bernstein, para. 0017, “In yet another aspect thereof, an artificial intelligence component is provided that employs a probabilistic and/or statistical-based analysis to predict or infer an action that a user desires to be automatically performed.”).
Regarding claim 2, Collins in view of Hainaut and in view of Ebraheem and in view of Bernstein teaches the method of claim 1, wherein the input field comprises text corresponding to a name of the input field and text corresponding to a description of the input field (Ebraheem, pg. 2, sec. 2.1 Entity Resolution, “Let T be a set of entities with n tuples and m attributes {A_1, …, A_m}. Note that these entities can come from one table or multiple tables (with aligned attributes).” Ebraheem teaches m attributes {A_1, …, A_m} (i.e., wherein the input field comprises text corresponding to a name of the input field and text corresponding to a description of the input field)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Collins with the above teachings of Ebraheem for the same rationale stated at Claim 1.
Regarding claim 3, Collins in view of Hainaut and in view of Ebraheem and in view of Bernstein teaches the method of claim 1, wherein the input field consists of text corresponding to a description of the input field (Ebraheem, pg. 2, sec. 2.1 Entity Resolution, “Let T be a set of entities with n tuples and m attributes {A_1, …, A_m}. Note that these entities can come from one table or multiple tables (with aligned attributes).” Ebraheem teaches m attributes {A_1, …, A_m} (i.e., wherein the input field consists of text corresponding to a description of the input field)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Collins with the above teachings of Ebraheem for the same rationale stated at Claim 1.
Regarding claim 4, Collins in view of Hainaut and in view of Ebraheem and in view of Bernstein teaches the method of claim 1, wherein identifying the first sequence of vectors comprises processing the input field with a Global Vectors for Word Representation (GloVe) algorithm and wherein identifying the second sequence of vectors comprises processing the XDM field with the GloVe algorithm (Ebraheem, pg. 4 left column (see also fig. 5), “Algorithm 2 gives the overall compositional process. For each word token in an attribute, we first look up its Glove vector. Then we use a [‘]shared[’] LSTM-RNN to compose each attribute value in a tuple into a vector. This results in a vector v(t) of d dimensions.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Collins with the above teachings of Ebraheem for the same rationale stated at Claim 1.
Regarding claim 5, Collins in view of Hainaut and in view of Ebraheem and in view of Bernstein teaches the method of claim 1, wherein the sequence model is a long short-term memory (LSTM) model (Ebraheem, pg. 4 left column (see also fig. 5), “Algorithm 2 gives the overall compositional process. For each word token in an attribute, we first look up its Glove vector. Then we use a [‘]shared[’] LSTM-RNN to compose each attribute value in a tuple into a vector. This results in a vector v(t) of d dimensions.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Collins with the above teachings of Ebraheem for the same rationale stated at Claim 1.
Referring to independent claims 8 and 14, they are rejected on the same basis as independent claim 1 since they are analogous claims.
Referring to dependent claims 9-12 and 15-18, they are also rejected on the same basis as dependent claims 2-5 since they are analogous claims.   
Claims 7 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Collins, Samuel Robert, et al. "XML schema mappings for heterogeneous database access." Information and Software Technology 44.4 (2002) (“Collins”) in view of Hainaut et al. Hierarchical Data Model. Encyclopedia of Database Systems. Springer, Boston, MA. (2009) (“Hainaut”) and in view of Ebraheem, et al. "Distributed representations of tuples for entity resolution." Proceedings of the VLDB Endowment 11.11 (2018) (“Ebraheem”) and in view of Bernstein et al. US 2007/0055655 A1 (“Bernstein”) and further in view of Kang, et al. "Interactive entity resolution in relational data: A visual analytic tool and its evaluation." IEEE transactions on visualization and computer graphics 14.5 (2008) (“Kang”).
Regarding claim 7, Collins in view of Hainaut and in view of Ebraheem and in view of Bernstein teaches the method of claim 1, but does not teach: further comprising providing, via a user interface, an option for a user to override the mapping of the input field.
However, Kang teaches: further comprising providing, via a user interface, an option for a user to override the mapping of the input field (Kang, pgs. 1001-1002 right column; see also fig. 1 at pg. 1001 (detailing the user interface with the merge duplicates tab and mark detect tab that a user can use to override the similarity criterion), “Users begin the entity resolution process by loading one or multiple data files depending on their task (deduplication or data integration). Before searching for potential duplicates, users need to define a similarity metric, which describes what information should be used to determine if two records may match…Users can resolve the potential duplicate authors in three ways: 1) merge the potential duplicate authors, 2) mark them as distinct authors to exclude from further search results, and 3) leave them for later or other users’ decision.” Kang teaches that users can resolve the potential duplicate authors in three ways: 1) merge the potential duplicate authors, 2) mark them as distinct authors to exclude from further search results, and 3) leave them for later or other users’ decision (i.e., providing, via a user interface, an option for a user to override the mapping of the input field)).
Accordingly, one of ordinary skill in the art would modify Collins’s method in view of Hainaut, in view of Ebraheem, in view of Bernstein, and further in view of Kang to teach: providing, via a user interface, an option for a user to override the mapping of the input field. The motivation to do so would be to allow users to visually inspect entity resolutions through the use of visual analysis (Kang, pg. 1001 left column, sec. 3 Interface Design Principles, “The challenge of entity resolution in large relational data requires an interface that provides tight integration of statistical data mining algorithms, meaningful presentation of carefully selected subnetworks, and ready access to rich details to confirm or refute user conjectures. Our interface design provides simple access to sophisticated entity resolution algorithms and enables users to flexibly apply sequences of actions to identify duplicates effectively. In addition, users are provided with a simple network visualization, which displays the relational context between potential duplicates and allows users to make quick resolution decisions based on the context.”).
Referring to dependent claim 20, it is also rejected on the same basis as dependent claim 7 since they are analogous claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
1. He et al. US 10789229 B2 (details an enterprise data mapper in which the table corpus is grouped into co-occurrence statistics to produce a candidate hierarchical tree of information)
2. Jiang et al. US 2019/0130309 A1 (details an automatic mapper that includes an index builder, column search with scoring and ranking and mapping generation process comprised of a deep learning scorer, statistical scorer and a rule based scorer)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Adam Clark Standke whose telephone number is (571)270-1806. The examiner can normally be reached 10AM-7PM M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Adam Clark Standke
Assistant Examiner
Art Unit 2129



/MICHAEL J HUNTLEY/
Supervisory Patent Examiner, Art Unit 2129

¹ Italics represent what Bernstein teaches, and bold represents Applicant’s amended limitation.