DETAILED ACTION
Response to Arguments
Applicant argues, with respect to claim 9, that Ma in view of Yang does not teach the claim elements of equidistant embedding layers configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category in a reduced dimension dataset in a reduced dimension space from a dimension space that is maintained from a dataset. See page 13 of Applicant’s Remarks submitted on 04/15/2022. Examiner respectfully disagrees.
Yang teaches on pages three and four of the prior art the claim elements of equidistant embedding layers configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category. See also pages nineteen and twenty-one of the Current Office Action for additional information.  
Yang details that to prevent trivial low-dimensional representations such as all-zero vectors, a stacked autoencoder is used as a decoding network $g(\cdot; Z)$ to map the $h_i$’s back to the data domain and requires that $g(h_i; Z)$ and $x_i$ match each other well under least squares-based measures [equidistant embedding layers]. And to enforce this matching between the two different domains (i.e. the data domain of $\mathbb{R}^{M}$ and the latent domain of $\mathbb{R}^{R}$) the following cost function is minimized: $\min_{W, Z, M, \{s_i\}} \sum_{i}^{N} \left( l\big(g(f(x_i)), x_i\big) + \frac{\lambda}{2} \left\| f(x_i) - M s_i \right\|_2^2 \right)$ where the function $l(\cdot): \mathbb{R}^{M} \to \mathbb{R}$ is the least-square loss that measures the reconstruction error [configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category].
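For illustration only, the cost function quoted from Yang can be evaluated numerically for fixed parameters as in the minimal sketch below. The names `dcn_cost`, `encode`, and `decode` are hypothetical, the toy encoder and decoder stand in for the stacked autoencoder, and the sketch assumes that each $s_i$ is a one-hot assignment vector selecting one column of $M$; none of these details are taken from the record.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def matvec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dcn_cost(X, encode, decode, M, S, lam):
    """Evaluate sum_i ( l(g(f(x_i)), x_i) + (lam / 2) * ||f(x_i) - M s_i||_2^2 ).

    M is a list of rows (latent_dim x K); column k is treated as centroid k.
    S holds one assignment vector s_i per sample (assumed one-hot here).
    """
    total = 0.0
    for x, s in zip(X, S):
        h = encode(x)                               # f(x_i): latent representation
        reconstruction = squared_l2(decode(h), x)   # l(g(f(x_i)), x_i)
        centroid = matvec(M, s)                     # M s_i: the assigned centroid
        total += reconstruction + 0.5 * lam * squared_l2(h, centroid)
    return total


# Toy stand-ins (assumptions): f keeps the first coordinate, g pads with zero.
encode = lambda x: [x[0]]
decode = lambda h: [h[0], 0.0]

X = [[1.0, 0.0], [3.0, 1.0]]
M = [[1.0, 2.0]]        # one latent dimension, K = 2 centroids
S = [[1, 0], [0, 1]]    # sample 0 -> centroid 0, sample 1 -> centroid 1
cost = dcn_cost(X, encode, decode, M, S, lam=2.0)
```

The first term penalizes reconstruction error in the data domain and the second penalizes the distance between each latent point and its assigned centroid, with $\lambda$ weighting the two.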
While Yang does not teach the limitation “in a reduced dimension dataset in a reduced dimension space from a dimension space that is maintained from a dataset,” the newly added prior art of Denisiuk teaches this limitation by stating, on pages 21-23, that “binary Hamming s-dimensional set [from a dimension space that is maintained from a dataset] can be embedded into s − 1 dimensional sphere [in a reduced dimension dataset in a reduced dimension space].”
Accordingly, Yang in view of Denisiuk teaches the claim elements of equidistant embedding layers configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category in a reduced dimension dataset in a reduced dimension space from a dimension space that is maintained from a dataset, as recited in claim 9.
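As a hedged illustration of the geometry referred to in the Denisiuk passage, the sketch below shows one elementary way to picture it and is not asserted to be Denisiuk's construction. It assumes binary vectors of length s are read as the vertices of the unit cube in s-dimensional Euclidean space; all such vertices lie at the same distance from the cube center, i.e. on a sphere of dimension s − 1. The function names are hypothetical.

```python
import math
from itertools import product


def hamming(n, n_prime):
    """Normalized Hamming metric: fraction of positions where the vectors differ."""
    s = len(n)
    return sum(1 for a, b in zip(n, n_prime) if a != b) / s


def distance_from_center(v):
    """Euclidean distance from a binary vector to the cube center (1/2, ..., 1/2)."""
    return math.sqrt(sum((c - 0.5) ** 2 for c in v))


s = 3
vertices = list(product([0, 1], repeat=s))

# Every vertex of the s-dimensional binary cube sits at the same radius,
# so the whole set lies on one (s - 1)-dimensional sphere.
radii = {round(distance_from_center(v), 12) for v in vertices}

# Distinct values of a single nominal attribute differ in exactly that one
# position, so any two of them are the same Hamming distance apart.
colors = [("red",), ("green",), ("blue",)]
pairwise = {hamming(a, b) for a in colors for b in colors if a != b}
```

Here `radii` holds a single value, the common radius of all cube vertices, and `pairwise` holds a single value, the common distance between distinct nominal values.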
Furthermore, Applicant’s arguments with respect to claims 1-8 and 17-20, and the other arguments with respect to claims 9-16, have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in Applicant’s arguments.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-8 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ma, "Entire space multi-task model: An effective approach for estimating post-click conversion rate." The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (2018) (“Ma”) in view of Denisiuk, "A variant of the k-means clustering algorithm for continuous-nominal data." Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015. Springer, Cham, (2016) (“Denisiuk”).
Regarding claim 1, Ma teaches a method implemented by at least one computing device, the method comprising: receiving, by the at least one computing device, a dataset corresponding to a plurality of tasks (Ma, pgs. 3-4, “We collect traffic logs [a dataset] from Taobao’s recommender system [receiving, by the at least one computing device]… [i]n CTCVR task, all models calculate pCTCVR by pCTR × pCVR, where: i) pCVR is estimated by each model respectively, ii) pCTR is estimated with a same independently trained CTR network…[b]oth of the two tasks [plurality of tasks] split the first 1/2 data in the time sequence to be training set while the rest to be test set.”); extracting, by the at least one computing device, complementary information from the plurality of tasks (Ma, pgs. 2-3, see also fig. 2, “[E]mbedding layer maps large scale sparse inputs into low dimensional representation vectors. It contributes most of the parameters of deep network and learning of which needs huge volume of training samples. In ESMM, embedding dictionary of CVR network is shared with that of CTR network [extracting, by the at least one computing device, complementary information from the plurality of tasks]. It follows a feature representation transfer learning paradigm. Training samples with all impressions for CTR task is relatively much richer than CVR task. This parameter sharing mechanism enables CVR network in ESMM to learn from un-clicked impressions and provides great help for alleviating the data sparsity trouble.”); training, by the at least one computing device, a machine learning model based on the reduced dimension dataset and the complementary information (Ma, pgs. 2-4, see also fig. 2, fig. 3, “To be fair, all competitors including ESMM [machine learning model] share the same network structure and hyper parameters…which i) uses ReLU activation function, ii) sets the dimension of embedding vector to be 18 [reduced dimension dataset and the complementary information], iii) sets dimensions of each layers in MLP network to be 360 × 200 × 80 × 2, iv) uses adam solver with parameter $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$ [training, by the at least one computing device].”).
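The parameter sharing and the pCTCVR = pCTR × pCVR relationship quoted from Ma can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not Ma's implementation: the embedding values, feature ids, weights, and the single logistic unit standing in for each MLP tower are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical shared embedding dictionary: one small vector per feature id.
shared_embedding = {"user_7": [0.2, -0.1], "item_3": [0.4, 0.3]}


def embed(feature_ids):
    """Concatenate the shared embedding vectors of the given feature ids."""
    vec = []
    for fid in feature_ids:
        vec.extend(shared_embedding[fid])
    return vec


def tower(weights, bias, x):
    """Stand-in for an MLP tower: a single logistic unit (an assumption)."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


x = embed(["user_7", "item_3"])                  # both towers read the same embedding
p_ctr = tower([0.5, 0.5, 0.5, 0.5], 0.0, x)      # click-through rate estimate
p_cvr = tower([1.0, -1.0, 1.0, -1.0], -0.5, x)   # post-click conversion rate estimate
p_ctcvr = p_ctr * p_cvr                          # pCTCVR = pCTR x pCVR
```

Because both towers consume the same embedding lookup, updates driven by the richer CTR samples also change the representation seen by the CVR tower, which is the sharing mechanism the quoted passage describes.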
Ma does not teach: the dataset describing a dimension space having a plurality of categories and a plurality of features included within respective categories of the plurality of categories, the plurality of features having equidistant relationships, one to another, within respective said categories in the dimension space; generating a reduced dimension dataset in a reduced dimension space that is reduced in comparison with the dimension space of the dataset, the reduced dimension dataset maintaining the equidistant relationships, one to another of the plurality of features, respectively, within the respective said categories from the dataset.
However, Denisiuk teaches: the dataset describing a dimension space having a plurality of categories and a plurality of features included within respective categories of the plurality of categories, the plurality of features having equidistant relationships, one to another, within respective said categories in the dimension space(Denisiuk, pg. 19, see also algorithm 2, “Let                         
                            
                                
                                    A
                                
                                
                                    1
                                
                            
                            ,
                            ⋯
                            ,
                            
                                
                                    A
                                
                                
                                    s
                                
                            
                        
                     be finite set of nominal values,                         
                            H
                            =
                            
                                
                                    A
                                
                                
                                    1
                                
                            
                            ×
                            ⋯
                            ×
                            
                                
                                    A
                                
                                
                                    s
                                
                            
                            .
                        
                     The Hamming metric on                         
                            H
                        
                     is defined as follows: for each two vectors                         
                            n
                            ,
                            
                                
                                    n
                                
                                
                                    '
                                
                            
                            ∈
                            H
                            ,
                             
                        
                                             
                            
                                
                                    d
                                
                                
                                    H
                                
                            
                            
                                
                                    n
                                    ,
                                    
                                        
                                            n
                                        
                                        
                                            '
                                        
                                    
                                
                            
                            =
                            
                                
                                    s
                                
                                
                                    -
                                    1
                                
                            
                            |
                            
                                
                                    i
                                    =
                                    1
                                    ,
                                    …
                                    ,
                                    s
                                
                                
                                    
                                        
                                            n
                                        
                                        
                                            i
                                        
                                    
                                    ≠
                                    
                                        
                                            n
                                        
                                        
                                            i
                                        
                                        
                                            '
                                        
                                    
                                
                            
                        
                     where                         
                            n
                            =
                            
                                
                                    
                                        
                                            n
                                        
                                        
                                            1
                                        
                                    
                                    ,
                                    .
                                    .
                                    .
                                    ,
                                    
                                        
                                            n
                                        
                                        
                                            s
                                        
                                    
                                
                            
                            ,
                             
                            
                                
                                    n
                                
                                
                                    '
                                
                            
                            =
                            (
                            
                                
                                    n
                                
                                
                                    1
                                
                                
                                    '
                                
                            
                            ,
                            …
                            ,
                            
                                
                                    n
                                
                                
                                    s
                                
                                
                                    '
                                
                            
                            )
                        
                    …[t]o represent continuous data we use points from the standard p-dimensional Euclidean metric space (                        
                            
                                
                                    R
                                
                                
                                    p
                                
                            
                            ,
                             
                            
                                
                                    d
                                
                                
                                    E
                                
                            
                        
                    ). The nominal part of record is represented as a point at the Hamming metric space                         
                            H
                            =
                            
                                
                                    A
                                
                                
                                    1
                                
                            
                            ×
                            ⋯
                            ×
                            
                                
                                    A
                                
                                
                                    s
                                
                            
                        
                     The set X =                         
                            
                                
                                    R
                                
                                
                                    p
                                
                            
                        
                    ×                         
                            H
                        
                     is considered as a set of all records. Every record r ∈ X is a pair of continuous and nominal data, r = (x, n).” Denisiuk teaches:  The set X =                         
                            
                                
                                    R
                                
                                
                                    p
                                
                            
                        
                    ×                         
                            H
                        
                     where                         
                            H
                            =
                            
                                
                                    A
                                
                                
                                    1
                                
                            
                            ×
                            ⋯
                            ×
                            
                                
                                    A
                                
                                
                                    s
                                
                            
                        
                     and the set X is considered as a set of all records. Every record r ∈ X is a pair of continuous and nominal data, r = (x, n) (i.e. the dataset describing a dimension space having a plurality of categories and a plurality of features included within respective categories of the plurality of categories) The Hamming metric on                         
                            H
                        
                     is defined as follows: for each two vectors                         
                            n
                            ,
                            
                                
                                    n
                                
                                
                                    '
                                
                            
                            ∈
                            H
                            ,
                             
                        
                                             
                            
                                
                                    d
                                
                                
                                    H
                                
                            
                            
                                
                                    n
                                    ,
                                    
                                        
                                            n
                                        
                                        
                                            '
                                        
                                    
                                
                            
                            =
                            
                                
                                    s
                                
                                
                                    -
                                    1
                                
                            
                            |
                            
                                
                                    i
                                    =
                                    1
                                    ,
                                    …
                                    ,
                                    s
                                
                                
                                    
                                        
                                            n
                                        
                                        
                                            i
                                        
                                    
                                    ≠
                                    
                                        
                                            n
                                        
                                        
                                            i
                                        
                                        
                                            '
                                        
                                    
                                
                            
                        
where $n = (n_1, \ldots, n_s)$, $n' = (n'_1, \ldots, n'_s)$
(i.e. the plurality of features having equidistant relationships, one to another, within respective said categories in the dimension space)); generating, a reduced dimension dataset in a reduced dimension space that is reduced in comparison with the dimension space of the dataset, the reduced dimension dataset maintaining the equidistant relationships, one to another of the plurality of features, respectively, within the respective said categories from the dataset (Denisiuk, pgs. 21-23, see also algorithm 2, “We show that binary Hamming s-dimensional set can be embedded into s − 1 dimensional sphere…[l]et $|A_j| = a_j$ for $j = 1, \ldots, s$. Then $A = A_1 \times \cdots \times A_s$
with the standard Hamming metrics $d_H(n, n') = s^{-1}\,\bigl|\{\, i = 1, \ldots, s \mid n_i \neq n'_i \,\}\bigr|$
can be embedded into the standard unit sphere $S^{q-1}$ ($q = \sum_{j=1}^{s} a_j - s$) …[t]he embedding of a record $r = (x, n)$ into the cylinder $\mathbb{R}^p \times S^{q-1}$
is defined as follows: $(x, n) \mapsto (x, y) = (x, \phi_s(n)) = (x, y),$
where $\phi_s : n = (n_1, \ldots, n_s) \longmapsto (e_1^1, \ldots, e_{k_1}^1, \ldots, e_1^s, \ldots, e_{k_s}^s)$
where $k_j = a_j - 1$ and $e_1^j, \ldots, e_{k_j}^j$ are the coordinates of $n_j$ on the corresponding simplex ($j = 1, \ldots, s$), $\|y\| = 1$
. The distance between two records on the cylinder is defined by the following formula: $\operatorname{dist}^2\bigl((x, y), (x', y')\bigr) = \sum_{j=1}^{p} (x_j - x'_j)^2 + \varkappa \arccos^2 (y, y')$
.” Denisiuk teaches: the standard Hamming metrics can be embedded into the standard unit sphere $S^{q-1}$ ($q = \sum_{j=1}^{s} a_j - s$)
such that a record $r = (x, n)$ is embedded into a cylinder of $\mathbb{R}^p \times S^{q-1}$
dimensions (i.e. generating, a reduced dimension dataset); we show that binary Hamming s-dimensional set can be embedded into s − 1 dimensional sphere (i.e. in a reduced dimension space that is reduced in comparison with the dimension space of the dataset); the embedding of a record $r = (x, n)$ into the cylinder $\mathbb{R}^p \times S^{q-1}$
(i.e. the reduced dimension dataset) $(x, n) \longmapsto (x, y) = (x, \phi_s(n)) = (x, y),$
where $\phi_s : n = (n_1, \ldots, n_s) \longmapsto (e_1^1, \ldots, e_{k_1}^1, \ldots, e_1^s, \ldots, e_{k_s}^s)$
                     where                         
                            
                                
                                    k
                                
                                
                                    j
                                
                            
                            =
                            
                                
                                    a
                                
                                
                                    j
                                
                            
                            -
                            1
                        
                     and                         
                            
                                
                                    e
                                
                                
                                    1
                                
                                
                                    j
                                
                            
                            ,
                            …
                            ,
                            
                                
                                    e
                                
                                
                                    
                                        
                                            k
                                        
                                        
                                            j
                                        
                                    
                                
                                
                                    j
                                
                            
                        
                     are the coordinates of                         
                            
                                
                                    n
                                
                                
                                    j
                                
                            
                        
                     on the corresponding simplex (                        
                            j
                            =
                            1
                            ,
                            …
                            ,
                             
                            s
                            )
                        
                    ,                         
                            
                                
                                    y
                                
                            
                            =
                            1
                        
                     and                         
                            d
                            i
                            s
                            
                                
                                    t
                                
                                
                                    2
                                
                            
                            
                                
                                    
                                        
                                            x
                                            ,
                                            y
                                        
                                    
                                    ,
                                     
                                    
                                        
                                            
                                                
                                                    x
                                                
                                                
                                                    '
                                                
                                            
                                            ,
                                             
                                            y
                                            '
                                        
                                    
                                
                            
                            =
                            
                                
                                    ∑
                                    
                                        j
                                        =
                                        1
                                    
                                    
                                        p
                                    
                                
                                
                                    
                                        
                                            
                                                
                                                    
                                                        
                                                            x
                                                        
                                                        
                                                            j
                                                        
                                                    
                                                    -
                                                    
                                                        
                                                            x
                                                        
                                                        
                                                            j
                                                        
                                                        
                                                            '
                                                        
                                                    
                                                
                                            
                                        
                                        
                                            2
                                        
                                    
                                    +
                                    ϰ
                                    
                                        
                                            
                                                
                                                    arccos
                                                
                                                
                                                    2
                                                
                                            
                                        
                                        ⁡
                                        
                                            (
                                            
                                                
                                                    y
                                                    ,
                                                     
                                                    
                                                        
                                                            y
                                                        
                                                        
                                                            '
                                                        
                                                    
                                                
                                            
                                            )
                                        
                                    
                                
                            
                        
                     represents the distance metric between two records (i.e. maintaining the equidistant relationships, one to another of the plurality of features, respectively, within the respective said categories from the dataset)).
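For illustration only, the distance metric quoted above from Denisiuk can be sketched in Python as follows. The function name `cylinder_dist2` and the default `kappa=1.0` are hypothetical, and the sketch assumes that the arccos term is applied to the dot product of the two unit vectors y and y'; it is not Denisiuk's implementation.

```python
import math


def cylinder_dist2(x, y, x2, y2, kappa=1.0):
    """Squared distance between two embedded records (x, y) and (x2, y2).

    x, x2 : numeric parts of the records (length p).
    y, y2 : embedded nominal parts, assumed to be unit vectors on the sphere.
    kappa : weighting factor on the spherical term (hypothetical default).
    Assumption: arccos is taken of the dot product of y and y2.
    """
    euclidean_part = sum((a - b) ** 2 for a, b in zip(x, x2))
    dot = sum(a * b for a, b in zip(y, y2))
    dot = max(-1.0, min(1.0, dot))  # clamp to guard against rounding error
    return euclidean_part + kappa * math.acos(dot) ** 2
```

When the nominal parts coincide the spherical term vanishes, so the result reduces to the ordinary squared Euclidean distance between the numeric parts.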
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Ma with the teachings of Denisiuk. The motivation to do so would be to find an embedding space for nominal data that best approximates the original metric space with the least amount of distortion (Denisiuk, pg. 18, “To perform the k-means algorithm, one should be able to measure a distance between two records of data and to compute a centroid of a finite set of data records. The embedding of the considered dataset into a metric space equipped with a method of computing centroids is the core idea of this paper. We search for a relevant space by embedding the Hamming metric space of nominal data into a Riemannian manifold with possibly small distortion. The classical approach, representing nominal values as equidistant vertexes of a simplex, can be considered as embedding of the Hamming metric space into the Euclidean space. In general, isometric embedding of the Hamming metric into the Euclidean space is not possible [due to distortion]…[t]wo- and three-dimensional examples suggest that embedding of nominal values into a sphere has a distortion that is less than distortion of classical embedding into Euclidean space…[w]e give a quantitative measure of this distortion improvement… which is at least 75% better than the distortion of embedding into the Euclidean space.”).
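As a rough illustration of the "classical approach" mentioned in the quotation, the following Python sketch represents each nominal value as a vertex of a simplex and checks that all vertices are pairwise equidistant. It assumes one-hot coordinates (one dimension per value) rather than the reduced $k_j = a_j - 1$ coordinates described by Denisiuk, and the function names are hypothetical.

```python
import math
from itertools import combinations


def simplex_vertices(num_values):
    """Map nominal value i to the i-th standard basis vector (one-hot).

    Assumption: one-hot coordinates are used for simplicity; they form the
    vertices of a regular simplex, so every pair is the same distance apart.
    """
    return [[1.0 if i == j else 0.0 for j in range(num_values)]
            for i in range(num_values)]


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]
```

For a category with four values, `pairwise_distances(simplex_vertices(4))` yields six distances that are all equal to the square root of 2, which is the equidistant relationship at issue.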
Regarding claim 2, Ma in view of Denisiuk teaches the method of claim 1, further comprising: receiving, by the at least one computing device, another dataset that corresponds to a particular task from the plurality of tasks, the another dataset describing the plurality of categories and the plurality of features (Ma, pgs. 3-4, “[F]or CTCVR task, impressions with click and conversion events occurred simultaneously are labeled y&z = 1, otherwise y&z = 0.” See also Table 1, which details the statistics of the experimental dataset, in which the dataset has the categories and features of #impression, #click, #conversion); generating, by the at least one computing device, a prediction of an outcome of the particular task based on the another dataset and the trained machine learning model (Ma, pgs. 3-4, see also table 2 and fig. 3, “CTCVR prediction task which estimates pCTCVR on dataset with all impressions. [This] [t]ask…aims to compare different CVR modeling methods over entire input space, which reflects the model performance corresponding to SSB problem. In CTCVR task, all models calculate pCTCVR by pCTR × pCVR, where: i) pCVR is estimated by each model respectively, ii) pCTR is estimated with a same independently trained CTR network…tasks split the first 1/2 data in the time sequence to be training set while the rest to be test set. Area under the ROC curve (AUC) is adopted as performance metrics. All experiments are repeated 10 times and averaged results are reported.”); and outputting, by the at least one computing device, the generated prediction of the outcome of the particular task (Ma, pgs. 3-4, see also table 2 and fig. 3, “Compared with BASE model the ESMM [model] achieves…[on the] CTCVR task with full samples, [a]…3.25% AUC gain.” Note: The AUC (i.e. area under the ROC curve) is a performance metric that measures the prediction accuracy of the model on a given task; the higher the AUC, the more accurate the model’s predictions).
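To make the AUC metric referenced above concrete, the following is a minimal Python sketch, not Ma's evaluation code. It uses the standard pairwise interpretation of AUC (the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half); the function name `auc` and the sample values in the note below are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels : iterable of 0/1 ground-truth labels.
    scores : iterable of predicted probabilities, same length as labels.
    A (positive, negative) pair counts 1 if the positive is scored higher,
    0.5 on a tie, and 0 otherwise.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, with labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.4, 0.3]`, three of the four positive/negative pairs are ordered correctly, giving an AUC of 0.75.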
Regarding claim 3, Ma in view of Denisiuk teaches the method of claim 1, wherein the plurality of tasks comprises at least a first task and a second task (Ma, pg. 2, see also fig. 2, “Post-click CVR modeling is to estimate the probability of pCVR = p(z = 1|y = 1, x). Two associated probabilities are: post-view click-through rate (CTR) with pCTR = p([y] = 1|x) [a first task] and post-view click&conversion rate (CTCVR) with pCTCVR = p(y = 1, z = 1|x) [a second task].”), and wherein the training the machine learning model comprises training a first machine learning model corresponding to the first task and training a second machine learning model corresponding to the second task (Ma, pg. 3, see also fig. 2, “The loss function of ESMM is defined as Eq.(3). It consists of two loss terms from CTR and CTCVR tasks which are calculated over samples of all impressions, without using the loss of CVR task. $L(\theta_{cvr}, \theta_{ctr}) = \sum_{i=1}^{N} l(y_i, f(x_i; \theta_{ctr}))$ [training a first machine learning model corresponding to the first task] $+ \sum_{i=1}^{N} l(y_i \,\&\, z_i, f(x_i; \theta_{ctr}) \times f(x_i; \theta_{cvr}))$ [training a second machine learning model corresponding to the second task] where $\theta_{ctr}$ and $\theta_{cvr}$ are the parameters of CTR and CVR networks and $l(\cdot)$ is cross-entropy loss function.”).
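The two-term loss quoted above from Ma's Eq. (3) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Ma's code: the network outputs f(x; θ_ctr) and f(x; θ_cvr) are passed in as precomputed probabilities, and the names `cross_entropy`, `esmm_loss`, and the clipping constant `eps` are hypothetical.

```python
import math


def cross_entropy(label, prob, eps=1e-12):
    """Binary cross-entropy l(label, prob) for a single sample."""
    prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def esmm_loss(y, z, p_ctr, p_cvr):
    """Sum of the CTR term and the CTCVR term over all impressions.

    y, z   : lists of 0/1 click and conversion labels.
    p_ctr  : predicted click probabilities, standing in for f(x_i; theta_ctr).
    p_cvr  : predicted conversion probabilities, standing in for f(x_i; theta_cvr).
    """
    ctr_term = sum(cross_entropy(yi, pc) for yi, pc in zip(y, p_ctr))
    ctcvr_term = sum(cross_entropy(yi & zi, pc * pv)
                     for yi, zi, pc, pv in zip(y, z, p_ctr, p_cvr))
    return ctr_term + ctcvr_term
```

Note that no term compares `p_cvr` to a label directly, which mirrors the statement that the loss of the CVR task is not used.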
Regarding claim 4, Ma in view of Denisiuk teaches the method of claim 3, wherein the first task is a supervised task and the second task is one of a supervised task or an unsupervised task (Ma, pg. 2, as Figure 2 states: “Architecture overview of ESMM for CVR modeling. In ESMM, two auxiliary tasks of CTR and CTCVR are introduced which: i) help to model CVR over entire input space, ii) provide feature representation transfer learning. ESMM mainly consists of two sub-networks: CVR network illustrated in the left part of this figure [the first task is a supervised task] and CTR network in the right part [the second task is one of a supervised task]. Embedding parameters of CTR and CVR network are shared. CTCVR takes the product of outputs from CTR and CVR network as the output.”).1  
Regarding claim 5, Ma in view of Denisiuk teaches the method of claim 3, wherein the first machine learning model and the second machine learning model are trained from a neural network that includes shared layers corresponding to the first task and the second task, exclusive layers corresponding to the first task, and exclusive layers corresponding to the second task (Ma, pg. 2, as Figure 2 states: “Architecture overview of ESMM for CVR modeling. In ESMM, two auxiliary tasks of CTR and CTCVR are introduced which: i) help to model CVR over entire input space, ii) provide feature representation transfer learning. ESMM mainly consists of two sub-networks: CVR network illustrated in the left part of this figure and CTR network in the right part. Embedding parameters of CTR and CVR network are shared [first machine learning model and the second machine learning model are trained from a neural network that includes shared layers corresponding to the first task and the second task]. CTCVR takes the product of outputs from CTR and CVR network as the output [exclusive layers corresponding to the first task, and exclusive layers corresponding to the second task].”).  
Regarding claim 6, Ma in view of Denisiuk teaches the method of claim 3, wherein the second machine learning model is trained utilizing information corresponding to the first task (Ma, pg. 3, see also fig. 2, “embedding layer maps large scale sparse inputs into low dimensional representation vectors. It contributes most of the parameters of deep network and learning of which needs huge volume of training samples. In ESMM, embedding dictionary of CVR network is shared with that of CTR network [the second machine learning model is trained utilizing information corresponding to the first task]. It follows a feature representation transfer learning paradigm. Training samples with all impressions for CTR task is relatively much richer than CVR task. This parameter sharing mechanism enables CVR network in ESMM to learn from un-clicked impressions....”).  
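The shared-embedding arrangement described in the passages above (a shared embedding dictionary, task-exclusive layers, and a CTCVR output formed as the product of the CTR and CVR outputs) can be sketched in Python as follows. This is a deliberately simplified sketch: each task-exclusive tower is assumed to be a single linear layer followed by a sigmoid rather than the multi-layer networks of Ma's Figure 2, embeddings are pooled by summation, and the class and attribute names are hypothetical.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


class SharedEmbeddingModel:
    """Two task towers reading from one shared embedding dictionary."""

    def __init__(self, embedding, w_ctr, w_cvr):
        self.embedding = embedding  # shared layer: feature id -> vector
        self.w_ctr = w_ctr          # exclusive weights for the CTR tower
        self.w_cvr = w_cvr          # exclusive weights for the CVR tower

    def _embed(self, feature_ids):
        # Sum-pool the shared embeddings of all active features (assumption).
        pooled = [0.0] * len(self.w_ctr)
        for fid in feature_ids:
            for k, value in enumerate(self.embedding[fid]):
                pooled[k] += value
        return pooled

    def predict(self, feature_ids):
        h = self._embed(feature_ids)
        p_ctr = sigmoid(sum(w * v for w, v in zip(self.w_ctr, h)))
        p_cvr = sigmoid(sum(w * v for w, v in zip(self.w_cvr, h)))
        # CTCVR is the product of the two tower outputs.
        return p_ctr, p_cvr, p_ctr * p_cvr
```

Because both towers read from the same `embedding` dictionary, any update made to it while training on the CTR task also changes the input seen by the CVR tower, which is the sense in which the second model uses information from the first task.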
Regarding claim 7, Ma in view of Denisiuk teaches the method of claim 1, wherein the reduced dimension dataset includes an equidistant embedding that enforces, for each respective category of the plurality of categories, the equidistant relationship among the plurality of features included within the respective category(Denisiuk, pgs. 21-23, see also algorithm 2, “We show that binary Hamming s-dimensional set can be embedded into s − 1 dimensional sphere…[t]he embedding of a record                         
                            r
                            =
                            (
                            x
                            ,
                            n
                            )
                        
                     into the cylinder                         
                            
                                
                                    R
                                
                                
                                    p
                                
                            
                            ×
                             
                            
                                
                                    S
                                
                                
                                    q
                                    -
                                    1
                                
                            
                        
                     is defined as follows:                         
                            
                                
                                    x
                                    ,
                                    n
                                
                            
                            ↦
                             
                            
                                
                                    x
                                    ,
                                    y
                                
                            
                            =
                            
                                
                                    x
                                    ,
                                    
                                        
                                            ϕ
                                        
                                        
                                            s
                                        
                                    
                                    
                                        
                                            n
                                        
                                    
                                
                            
                            =
                            
                                
                                    x
                                    ,
                                    y
                                
                            
                            ,
                        
                    where                         
                            
                                
                                    ϕ
                                
                                
                                    s
                                
                            
                            :
                            n
                            =
                            
                                
                                    
                                        
                                            n
                                        
                                        
                                            1
                                        
                                    
                                    ,
                                    …
                                    ,
                                    
                                        
                                            n
                                        
                                        
                                            s
                                        
                                    
                                
                            
↦ (e_1^1, …, e_{k_1}^1, …, e_1^s, …, e_{k_s}^s)
where $k_j = a_j - 1$ and $e_1^j, \ldots, e_{k_j}^j$ are the coordinates of $n_j$ on the corresponding simplex ($j = 1, \ldots, s$), $|y| = 1$. The distance between two records on the cylinder is defined by the following formula: $\operatorname{dist}^2\bigl((x, y), (x', y')\bigr) = \sum_{j=1}^{p} (x_j - x_j')^2 + \varkappa \arccos^2\bigl((y, y')\bigr)$."
Denisiuk teaches: [t]he embedding of a record $r = (x, n)$ into the cylinder $\mathbb{R}^p \times S^{q-1}$, $(x, n) \mapsto (x, y) = (x, \phi_s(n)) = (x, y)$, where $\phi_s : n = (n_1, \ldots, n_s) \mapsto (e_1^1, \ldots, e_{k_1}^1, \ldots, e_1^s, \ldots, e_{k_s}^s)$ where $k_j = a_j - 1$ and $e_1^j, \ldots, e_{k_j}^j$ are the coordinates of $n_j$ on the corresponding simplex ($j = 1, \ldots, s$), $|y| = 1$ (i.e. includes an equidistant embedding) and $\operatorname{dist}^2\bigl((x, y), (x', y')\bigr) = \sum_{j=1}^{p} (x_j - x_j')^2 + \varkappa \arccos^2\bigl((y, y')\bigr)$ represents the distance metric between two records (i.e. that enforces, for each respective category of the plurality of categories, the equidistant relationship among the plurality of features included within the respective category)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Ma with the above teachings of Denisiuk for the same rationale stated at Claim 1.
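For illustration only, the distance metric quoted above from Denisiuk can be sketched in a few lines of Python. The sketch is not taken from Denisiuk or from the claims. It assumes that (y, y') denotes the Euclidean inner product of two unit vectors and that ϰ is a positive weighting constant; the names `cylinder_dist_sq`, `VERTICES`, and `PAIRWISE`, and the three-value example category, are hypothetical.

```python
import math


def cylinder_dist_sq(x, y, x_prime, y_prime, kappa=1.0):
    """Squared distance between two records on the cylinder R^p x S^(q-1).

    x, x_prime: numeric parts of the two records (length p).
    y, y_prime: embedded categorical parts, assumed to be unit vectors.
    kappa: assumed positive weight on the spherical term.
    """
    numeric_term = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    # Assumption: (y, y') is the Euclidean inner product.
    inner = sum(a * b for a, b in zip(y, y_prime))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    inner = max(-1.0, min(1.0, inner))
    return numeric_term + kappa * math.acos(inner) ** 2


# Hypothetical category with a = 3 values, so k = a - 1 = 2 coordinates:
# the vertices of a regular simplex (a triangle) inscribed in the unit circle.
VERTICES = [
    (1.0, 0.0),
    (-0.5, math.sqrt(3) / 2),
    (-0.5, -math.sqrt(3) / 2),
]

# With identical numeric parts, compute the squared distance for every
# pair of distinct category values.
PAIRWISE = [
    cylinder_dist_sq([0.0], VERTICES[i], [0.0], VERTICES[j])
    for i in range(3)
    for j in range(i + 1, 3)
]
```

Under these assumptions the three values in `PAIRWISE` agree up to rounding, which is the equidistant relationship among the values of a single category that the rejection attributes to the simplex embedding.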
Regarding claim 8, Ma in view of Denisiuk teaches the method of claim 1, wherein the generating the reduced dimension dataset is performed as a part of the training the machine learning model (Ma, pgs. 3-4, see also fig. 2, fig. 3, “In ESMM [machine learning model], pCVR is just an intermediate variable…pCTR and pCTCVR are the main factors ESMM [machine learning model] actually estimated over entire space. The multiplication form enables the three associated and co-trained estimators to exploit the sequential patte[r]n of data and communicate information with each other during training… ESMM [machine learning model] share the same network structure… sets the dimension of embedding vector to be 18 [generating the reduced dimension dataset], iii) sets dimensions of each layers in MLP network to be 360 × 200 × 80 × 2, iv) uses adam solver with parameter β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸ [is performed as a part of the training the machine learning model].”).
Regarding claim 17, Ma teaches at least one computing device including a processing system and at least one computer-readable storage medium (Ma, pgs. 3-4, “We collect traffic logs from Taobao’s recommender system [a processing system] and release a 1% random sampling version of the whole dataset, whose size still reaches 38GB(without compression) [at least one computer-readable storage medium].”), the at least one computing device comprising: means for receiving a first dataset corresponding to at least one task (Ma, pg. 2, “For this work, we collect traffic logs from Taobao’s recommender system. The full dataset consists of 8.9 billion[] samples with sequential labels of click and conversion.”); means for receiving an input corresponding to the at least one task (Ma, pg. 2, “For this work, we collect traffic logs from Taobao’s recommender system. The full dataset consists of 8.9 billion[] samples with sequential labels of click and conversion.”); means for generating a prediction of an outcome of the at least one task based on the input and the trained machine learning model (Ma, pgs. 3-4, see also table 2 and fig. 3, “CTCVR prediction task which estimates pCTCVR on dataset with all impressions. [This] [t]ask…aims to compare different CVR modeling methods over entire input space, which reflects the model performance corresponding to SSB problem. In CTCVR task, all models calculate pCTCVR by pCTR × pCVR, where: i) pCVR is estimated by each model respectively, ii) pCTR is estimated with a same independently trained CTR network…tasks split the first 1/2 data in the time sequence to be training set while the rest to be test set. Area under the ROC curve (AUC) is adopted as performance metrics. All experiments are repeated 10 times and averaged results are reported.”); and means for outputting the generated prediction of the outcome of the at least one task (Ma, pgs. 3-4, see also table 2 and fig. 3, “Compared with BASE model the ESMM [model] achieves…[on the] CTCVR task with full samples, [a]…3.25% AUC gain.” Note: The AUC (i.e. area under the ROC curve) is a performance metric that measures the prediction accuracy of the model on a given task; the higher the AUC, the more accurate the model’s predictions).
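As a purely illustrative aid, the two quantities relied on above, the product pCTCVR = pCTR × pCVR and the AUC metric, can be sketched as follows. This is not Ma's implementation. The function names `pctcvr` and `auc` and the sample values are hypothetical, and the AUC is computed here by its pairwise-ranking definition (the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half).

```python
def pctcvr(pctr, pcvr):
    """Element-wise product pCTCVR = pCTR * pCVR for paired estimates."""
    return [ctr * cvr for ctr, cvr in zip(pctr, pcvr)]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical pCTR and pCVR estimates for four impressions.
SCORES = pctcvr([0.9, 0.2, 0.8, 0.5], [0.5, 0.5, 0.25, 0.5])
# Hypothetical labels: 1 = click followed by conversion, 0 = otherwise.
LABELS = [1, 0, 1, 0]
```

With these made-up values, three of the four positive/negative pairs are ranked correctly, so `auc(LABELS, SCORES)` evaluates to 0.75. A higher value indicates that the predicted pCTCVR separates converting impressions from the rest more reliably.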
Ma does not teach: the first dataset describing a dimension space having a plurality of categories and a plurality of features included within respective categories of the plurality of categories, the plurality of features having equidistant relationships, one to another, within respective said categories in the dimension space; means for training a machine learning model that enforces the equidistant relationships, in a reduced dimension space, of the plurality of features within respective said categories. 
However, Denisiuk teaches: the first dataset describing a dimension space having a plurality of categories and a plurality of features included within respective categories of the plurality of categories, the plurality of features having equidistant relationships, one to another, within respective said categories in the dimension space (Denisiuk, pg. 19, see also algorithm 2, “Let $A_1, \cdots, A_s$ be finite set of nominal values, $H = A_1 \times \cdots \times A_s$. The Hamming metric on $H$ is defined as follows: for each two vectors $n, n' \in H$, $d_H(n, n') = s^{-1} \left| \{\, i = 1, \ldots, s : n_i \neq n_i' \,\} \right|$
                     where                         
                            n
                            =
                            
                                
                                    
                                        
                                            n
                                        
                                        
                                            1
                                        
                                    
                                    ,
                                    .
                                    .
                                    .
                                    ,
                                    
                                        
                                            n
                                        
                                        
                                            s
                                        
                                    
                                
                            
                            ,
                             
                            
                                
                                    n
                                
                                
                                    '
                                
                            
                            =
                            (
                            
                                
                                    n
                                
                                
                                    1
                                
                                
                                    '
                                
                            
                            ,
                            …
                            ,
                            
                                
                                    n
                                
                                
                                    s
                                
                                
                                    '
                                
                            
                            )
                        
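As an illustration of the quoted definition, the following minimal Python sketch computes the normalized Hamming distance between two nominal records. The function name `hamming_distance` and the tuple representation of a record are assumptions made for this example and do not appear in Denisiuk.

```python
def hamming_distance(n, n_prime):
    """Normalized Hamming distance: the fraction of positions at which two
    nominal records differ, i.e. s^-1 * |{i : n_i != n_i'}|."""
    if len(n) != len(n_prime):
        raise ValueError("records must have the same number of nominal attributes")
    s = len(n)
    if s == 0:
        raise ValueError("records must have at least one nominal attribute")
    mismatches = sum(1 for a, b in zip(n, n_prime) if a != b)
    return mismatches / s


# Two records that differ in one of three nominal attributes (hypothetical data).
d = hamming_distance(("red", "small", "round"), ("red", "large", "round"))
```

Under this metric, any two distinct values of a single nominal attribute contribute the same amount (1/s) to the distance, which is the equidistance property the cited passage is relied upon for.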
…[t]o represent continuous data we use points from the standard p-dimensional Euclidean metric space $(\mathbb{R}^p, d_E)$. The nominal part of record is represented as a point at the Hamming metric space $H = A_1 \times \cdots \times A_s$ The set $X = \mathbb{R}^p \times H$
 is considered as a set of all records. Every record r ∈ X is a pair of continuous and nominal data, r = (x, n).” Denisiuk teaches: The set $X = \mathbb{R}^p \times H$ where $H = A_1 \times \cdots \times A_s$
 and the set X is considered as a set of all records. Every record r ∈ X is a pair of continuous and nominal data, r = (x, n) (i.e. the first dataset describing a dimension space having a plurality of categories and a plurality of features included within respective categories of the plurality of categories). The Hamming metric on $H$ is defined as follows: for each two vectors $n, n' \in H$, $d_H(n, n') = s^{-1}\,|\{\, i = 1, \dots, s : n_i \neq n_i' \,\}|$ where $n = (n_1, \dots, n_s)$, $n' = (n_1', \dots, n_s')$
 (i.e. the plurality of features having equidistant relationships, one to another, within respective said categories in the dimension space)); means for training a machine learning model that enforces the equidistant relationships, in a reduced dimension space, of the plurality of features within respective said categories (Denisiuk, pgs. 21-23, see also algorithm 2, “We show that binary Hamming s-dimensional set can be embedded into s − 1 dimensional sphere…[t]he embedding of a record $r = (x, n)$ into the cylinder $\mathbb{R}^p \times S^{q-1}$ is defined as follows: $(x, n) \mapsto (x, y) = (x, \phi_s(n)) = (x, y)$, where $\phi_s : n = (n_1, \dots, n_s) \mapsto (e_1^1, \dots, e_{k_1}^1, \dots, e_1^s, \dots, e_{k_s}^s)$ where $k_j = a_j - 1$ and $e_1^j, \dots, e_{k_j}^j$ are the coordinates of $n_j$ on the corresponding simplex $(j = 1, \dots, s)$, $|y| = 1$
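The quoted embedding can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Denisiuk's code: the helper names `simplex_vertices` and `embed_nominal`, the explicit lists of admissible values (`alphabets`), the particular regular-simplex construction, and the division by the square root of s used to obtain |y| = 1 are all assumptions made for the example.

```python
import math


def simplex_vertices(a):
    """Return `a` unit vectors in R^(a-1) that form a regular simplex,
    so every pair of vertices is the same distance apart."""
    if a < 2:
        raise ValueError("an attribute needs at least two admissible values")
    k = a - 1
    verts = [[0.0] * k for _ in range(a)]
    for i in range(k):
        # Choose coordinate i of vertex i so that the vertex has unit norm.
        verts[i][i] = math.sqrt(1.0 - sum(c * c for c in verts[i][:i]))
        # Choose coordinate i of later vertices so each dot product is -1/k.
        for j in range(i + 1, a):
            dot = sum(verts[i][m] * verts[j][m] for m in range(i))
            verts[j][i] = (-1.0 / k - dot) / verts[i][i]
    return verts


def embed_nominal(n, alphabets):
    """Map a nominal record n = (n_1, ..., n_s) to a unit vector y by
    concatenating the simplex coordinates of each attribute value."""
    if len(n) != len(alphabets):
        raise ValueError("one alphabet is required per nominal attribute")
    y = []
    for value, alphabet in zip(n, alphabets):
        y.extend(simplex_vertices(len(alphabet))[alphabet.index(value)])
    scale = math.sqrt(len(n))  # each block has unit norm (assumed scaling)
    return [c / scale for c in y]


# Hypothetical record with a three-valued and a two-valued attribute.
y = embed_nominal(("b", "x"), [["a", "b", "c"], ["x", "y"]])
```

In this sketch an attribute with a_j values occupies k_j = a_j − 1 coordinates, matching the quoted dimension count, and the values of one attribute remain mutually equidistant after embedding.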
 …[t]o compute a centroid $(\bar{x}, \bar{y})$ we should minimize the following expression: $\sum_{j=1}^{N} \sum_{k=1}^{p} \left(x_k - x_k^{(j)}\right)^2 + \varkappa \sum_{j=1}^{N} \arccos^2\left(y, y^{(j)}\right)$
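The minimization in the quoted expression can be sketched as follows. Algorithm 2 of Denisiuk is not reproduced in this action, so the code below is a generic substitute: the Euclidean term is minimized by the coordinate-wise mean, and the angular term is reduced by a standard gradient-style iteration on the unit sphere. The function names, the unit step size, and the reading of arccos(y, y^(j)) as the angle between y and y^(j) are assumptions made for the example.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _angle(u, v):
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, _dot(u, v))))


def objective(x, y, xs, ys, kappa):
    """Sum of squared Euclidean deviations plus kappa times the sum of
    squared angles between y and each embedded nominal part."""
    squared = sum((xk - xjk) ** 2 for xj in xs for xk, xjk in zip(x, xj))
    angular = sum(_angle(y, yj) ** 2 for yj in ys)
    return squared + kappa * angular


def centroid(xs, ys, steps=200, tol=1e-12):
    """Return (x_bar, y_bar) for records with continuous parts xs and
    unit-vector nominal parts ys. kappa > 0 does not change the minimizer
    because the two terms involve disjoint variables."""
    n_records = len(xs)
    x_bar = [sum(col) / n_records for col in zip(*xs)]
    # Start from the normalized Euclidean mean of the unit vectors.
    y = [sum(col) for col in zip(*ys)]
    norm = math.sqrt(_dot(y, y))
    if norm == 0.0:
        raise ValueError("cannot initialize: the unit vectors cancel out")
    y = [c / norm for c in y]
    for _ in range(steps):
        # Descent direction: average of the tangent vectors pointing from y
        # toward each y_j, with length equal to the angle between them.
        direction = [0.0] * len(y)
        for yj in ys:
            t = max(-1.0, min(1.0, _dot(y, yj)))
            theta = math.acos(t)
            sin_theta = math.sin(theta)
            if sin_theta < tol:
                continue  # y_j coincides with y or is antipodal; skip it
            coef = theta / sin_theta / n_records
            for i in range(len(y)):
                direction[i] += coef * (yj[i] - t * y[i])
        step = math.sqrt(_dot(direction, direction))
        if step < tol:
            break
        # Move along the great circle in that direction, then renormalize.
        y = [math.cos(step) * yi + math.sin(step) * di / step
             for yi, di in zip(y, direction)]
        norm = math.sqrt(_dot(y, y))
        y = [c / norm for c in y]
    return x_bar, y


# Hypothetical two-record example.
x_bar, y_bar = centroid([[0.0, 0.0], [2.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Because the continuous and nominal parts enter separate terms, the two halves of the centroid can be computed independently, which is why only the second term needs an iterative method.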
…[f]or the second term we suggest to use a gradient method. This is summarized in Algorithm 2.” Denisiuk teaches: to compute a centroid $(\bar{x}, \bar{y})$ we should minimize the following expression:
                             
                            
                                
                                    ∑
                                    
                                        j
                                        =
                                        1
                                    
                                    
                                        N
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                                =
                                                1
                                            
                                            
                                                p
                                            
                                        
                                        
                                            
                                                
                                                    
                                                        
                                                            
                                                                
                                                                    x
                                                                
                                                                
                                                                    k
                                                                
                                                            
                                                            -
                                                            
                                                                
                                                                    x
                                                                
                                                                
                                                                    k
                                                                
                                                                
                                                                    (
                                                                    j
                                                                    )
                                                                
                                                            
                                                        
                                                    
                                                
                                                
                                                    2
                                                
                                            
                                            +
                                            ϰ
                                            
                                                
                                                    ∑
                                                    
                                                        j
                                                        =
                                                        1
                                                    
                                                    
                                                        N
                                                    
                                                
                                                
                                                    
                                                        
                                                            
                                                                
                                                                    arccos
                                                                
                                                                
                                                                    2
                                                                
                                                            
                                                        
                                                        ⁡
                                                        
                                                            (
                                                            
                                                                
                                                                    y
                                                                    ,
                                                                     
                                                                    
                                                                        
                                                                            y
                                                                        
                                                                        
                                                                            (
                                                                            j
                                                                            )
                                                                        
                                                                    
                                                                
                                                            
                                                            )
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                            
                        
                     and for the second term we suggest to use a gradient method as summarized in Algorithm 2 (i.e. means for training a machine learning model that enforces the equidistant relationships) we show that binary Hamming s-dimensional dataset can be embedded into s − 1 dimensional sphere (i.e. in a reduced dimension space, of the plurality of features within respective said categories)). 
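For illustration only, the centroid computation quoted above can be sketched as follows. The sketch assumes that arccos(y, y^(j)) denotes the geodesic distance arccos⟨y, y^(j)⟩ between unit vectors, minimizes the first term in closed form with the arithmetic mean, and uses a Riemannian gradient step on the sphere for the second term. It is not asserted to be Denisiuk's Algorithm 2; the function names, step count, and step size are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [c / n for c in v]

def objective(x, y, xs, ys, kappa):
    """Squared Euclidean distances plus kappa times squared geodesic distances."""
    euclid = sum((xk - xjk) ** 2 for xj in xs for xk, xjk in zip(x, xj))
    geo = sum(math.acos(max(-1.0, min(1.0, dot(y, yj)))) ** 2 for yj in ys)
    return euclid + kappa * geo

def log_map(y, q):
    """Tangent vector at y pointing toward q, with length equal to the geodesic distance."""
    c = max(-1.0, min(1.0, dot(y, q)))
    theta = math.acos(c)
    u = [qi - c * yi for qi, yi in zip(q, y)]  # part of q orthogonal to y
    n = math.sqrt(dot(u, u))
    if theta < 1e-12 or n < 1e-12:
        return [0.0] * len(y)
    return [theta * ui / n for ui in u]

def exp_map(y, v):
    """Move from y along the great circle in tangent direction v."""
    n = math.sqrt(dot(v, v))
    if n < 1e-12:
        return list(y)
    return normalize([math.cos(n) * yi + math.sin(n) * vi / n for yi, vi in zip(y, v)])

def centroid(xs, ys, steps=100, lr=1.0):
    """Assumed scheme: mean for the continuous part, gradient steps for the spherical part."""
    n = len(xs)
    x_bar = [sum(col) / n for col in zip(*xs)]
    # Starting point assumes the spherical points do not sum to the zero vector.
    y_bar = normalize([sum(col) for col in zip(*ys)])
    for _ in range(steps):
        # The negative gradient of sum_j arccos^2 is proportional to sum_j log_y(y_j).
        logs = [log_map(y_bar, yj) for yj in ys]
        y_bar = exp_map(y_bar, [lr * sum(col) / n for col in zip(*logs)])
    return x_bar, y_bar

# Hypothetical records: a continuous part (xs) and a nominal part embedded on a sphere (ys).
xs = [[0.0, 0.0], [2.0, 4.0], [4.0, 2.0]]
ys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
x_bar, y_bar = centroid(xs, ys)
```

Because the two terms involve separate variables, the weight ϰ changes the value of the objective but not the location of the minimizer in this sketch.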
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Ma with the teachings of Denisiuk. The motivation to do so would be to find an embedding space for nominal data that best approximates the original metric space with the least amount of distortion (Denisiuk, pg. 18, “To perform the k-means algorithm, one should be able to measure a distance between two records of data and to compute a centroid of a finite set of data records. The embedding of the considered dataset into a metric space equipped with a method of computing centroids is the core idea of this paper. We search for a relevant space by embedding the Hamming metric space of nominal data into a Riemannian manifold with possibly small distortion. The classical approach, representing nominal values as equidistant vertexes of a simplex, can be considered as embedding of the Hamming metric space into the Euclidean space. In general, isometric embedding of the Hamming metric into the Euclidean space is not possible [due to distortion]…[t]wo- and three-dimensional examples suggest that embedding of nominal values into a sphere has a distortion that is less than distortion of classical embedding into Euclidean space…[w]e give a quantitative measure of this distortion improvement… which is at least 75% better than the distortion of embedding into the Euclidean space.”).
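As a minimal illustration of the "classical approach" referenced in the quoted passage, one-hot encoding places each nominal value at a vertex of a simplex, so every pair of distinct values is separated by the same Euclidean distance (the square root of 2). The category values and helper names below are hypothetical and are not taken from Denisiuk.

```python
import itertools
import math

def one_hot(index, size):
    """Return the unit vector that represents the nominal value at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def pairwise_distances(vectors):
    """Euclidean distance between every unordered pair of vectors."""
    return [math.dist(a, b) for a, b in itertools.combinations(vectors, 2)]

# Hypothetical nominal feature with four values.
colors = ["red", "green", "blue", "black"]
vertices = [one_hot(i, len(colors)) for i in range(len(colors))]
distances = pairwise_distances(vertices)

# Every pair of distinct values is equally far apart.
is_equidistant = all(abs(d - distances[0]) < 1e-12 for d in distances)
```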
Regarding claim 19, Ma in view of Denisiuk teaches the at least one computing device of claim 18, wherein the first task is a supervised task and the second task is an unsupervised task (Ma, pg. 2, as Figure 2 states: “Architecture overview of ESMM for CVR modeling. In ESMM, two auxiliary tasks of CTR and CTCVR are introduced which: i) help to model CVR over entire input space, ii) provide feature representation transfer learning. ESMM mainly consists of two sub-networks: CVR network illustrated in the left part of this figure [the first task is a supervised task] and CTR network in the right part. Embedding parameters of CTR and CVR network are shared [the second task is an unsupervised task]. CTCVR takes the product of outputs from CTR and CVR network as the output.” Examiner note: the process of implementing an embedding is being interpreted as an unsupervised task).
Referring to dependent claims 18 and 20, they are rejected on the same basis as dependent claims 2 and 4 since they are analogous claims.
Claims 9-16 are rejected under 35 U.S.C. 103 as being unpatentable over Ma, "Entire space multi-task model: An effective approach for estimating post-click conversion rate." The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (2018) (“Ma”) in view of Yang, "Towards k-means-friendly spaces: Simultaneous deep learning and clustering." International Conference on Machine Learning. PMLR (2017) (“Yang”) and further in view of Denisiuk, "A variant of the k-means clustering algorithm for continuous-nominal data." Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015. Springer, Cham (2016) (“Denisiuk”).
Regarding claim 9, Ma teaches at least one computing device including a processing system and at least one computer-readable storage medium (Ma, pgs. 3-4, “We collect traffic logs from Taobao’s recommender system [a processing system] and release a 1% random sampling version of the whole dataset, whose size still reaches 38GB(without compression) [at least one computer-readable storage medium].”), the at least one computing device comprising:
shared layers of the neural network, the shared layers configured to extract feature interactions between the features described in the dataset corresponding to at least one of a first task and a second task; exclusive layers of the neural network corresponding to the first task, the exclusive layers corresponding to the first task configured to utilize the extracted feature interactions to generate a first machine learning model corresponding to the first task; and exclusive layers of the neural network corresponding to the second task, the exclusive layers corresponding to the second task configured to utilize the extracted feature interactions to generate a second machine learning model corresponding to the second task (Ma, pg. 2, as Figure 2 states: “Architecture overview of ESMM for CVR modeling. In ESMM, two auxiliary tasks of CTR and CTCVR are introduced which: i) help to model CVR over entire input space, ii) provide feature representation transfer learning. ESMM mainly consists of two sub-networks: CVR network illustrated in the left part of this figure [which outputs pCVR = p(z = 1|y = 1, x)] and CTR network in the right part of this figure [which outputs pCTR = p(y = 1|x)]. Embedding parameters of CTR and CVR network are shared [shared layers of the neural network, the shared layers configured to extract feature interactions between the features described in the dataset corresponding to at least one of a first task and a second task].
CTCVR takes the product of outputs from CTR [exclusive layers of the neural network corresponding to the first task, the exclusive layers corresponding to the first task configured to utilize the extracted feature interactions to generate a first machine learning model corresponding to the first task] and CVR [and exclusive layers of the neural network corresponding to the second task, the exclusive layers corresponding to the second task configured to utilize the extracted feature interactions to generate a second machine learning model corresponding to the second task] network as the output.”).
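For illustration only, the arrangement described in the quoted Figure 2 caption (a shared embedding feeding separate CTR and CVR sub-networks, with CTCVR taken as the product of their outputs) might be sketched as below. Each sub-network is reduced to a single logistic unit, and every identifier, weight, and feature id is hypothetical rather than taken from Ma.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical shared embedding table: one set of parameters used by both sub-networks.
shared_embedding = {"user_42": [0.2, -0.1], "item_7": [0.5, 0.3]}

def embed(feature_ids):
    """Look up and concatenate the shared embeddings of the input features."""
    return [v for fid in feature_ids for v in shared_embedding[fid]]

def tower(inputs, weights, bias):
    """Task-specific sub-network, reduced here to one logistic unit."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def esmm_forward(feature_ids, ctr_params, cvr_params):
    h = embed(feature_ids)            # shared representation
    p_ctr = tower(h, *ctr_params)     # pCTR  = p(y = 1 | x)
    p_cvr = tower(h, *cvr_params)     # pCVR  = p(z = 1 | y = 1, x)
    return p_ctr, p_cvr, p_ctr * p_cvr  # pCTCVR is the product of the two outputs

# Hypothetical task-specific parameters (weights, bias) for each sub-network.
ctr_params = ([0.4, -0.2, 0.1, 0.3], -1.0)
cvr_params = ([-0.3, 0.5, 0.2, 0.1], -2.0)
p_ctr, p_cvr, p_ctcvr = esmm_forward(["user_42", "item_7"], ctr_params, cvr_params)
```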
Ma does not teach: equidistant embedding layers of a neural network, the equidistant embedding layers configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category.
However, Yang teaches: equidistant embedding layers of a neural network(Yang, pgs. 3-4, “We are motivated to model the relationship between the observable data                         
                            
                                
                                    x
                                
                                
                                    i
                                
                            
                        
                     and its clustering-friendly latent representation                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                     using a nonlinear mapping, i.e.,                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                            =
                            f
                            
                                
                                    
                                        
                                            x
                                        
                                        
                                            i
                                        
                                    
                                    ;
                                     
                                    W
                                
                            
                            ,
                             
                             
                            f
                            
                                
                                    ⋅
                                    ;
                                     
                                    W
                                
                            
                            :
                            
                                
                                    R
                                
                                
                                    M
                                
                            
                             
                            →
                            
                                
                                    R
                                
                                
                                    R
                                
                            
                             
                        
                    where                         
                            f
                            
                                
                                    ⋅
                                    ;
                                     
                                    W
                                
                            
                        
                     denotes the mapping function and                          
                            W
                        
                     denote the set of parameters. In this work, we propose to employ a DNN as our mapping function [where R                         
                            ≪
                        
                     M]… [t]o prevent trivial low-dimensional representations such as all-zero vectors, SAE [stacked autoencoder] uses a decoding network g(·;                        
                            Z
                        
                    ) to map the                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ’s back to the data domain and requires that g(                        
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ;                        
                            Z
                        
                    ) and                         
                            
                                
                                    x
                                
                                
                                    i
                                
                            
                        
                     match each other well under…least squares-based measures.” Yang teaches: g(                        
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ;                        
                            Z
                        
                    ) and                         
                            
                                
                                    x
                                
                                
                                    i
                                
                            
                        
                     match each other well under least squares-based measures  (i.e. equidistant embedding)                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                            =
                            f
                            
                                
                                    
                                        
                                            x
                                        
                                        
                                            i
                                        
                                    
                                    ;
                                     
                                    W
                                
                            
                        
                     where                         
                            f
                            
                                
                                    
                                        
                                            x
                                        
                                        
                                            i
                                        
                                    
                                    ;
                                     
                                    W
                                
                            
                        
                     is a DNN and SAE uses a decoding network g(                        
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ;                        
                            Z
                        
                    ) to map the                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ’s back to the data domain (i.e. layers of a neural network)); the equidistant embedding layers configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category(Yang, pgs. 3-4, “We are motivated to model the relationship between the observable data                         
                            
                                
                                    x
                                
                                
                                    i
                                
                            
                        
                     and its clustering-friendly latent representation                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                     using a nonlinear mapping, i.e.,                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                            =
                            f
                            
                                
                                    
                                        
                                            x
                                        
                                        
                                            i
                                        
                                    
                                    ;
                                     
                                    W
                                
                            
                            ,
                             
                             
                            f
                            
                                
                                    ⋅
                                    ;
                                     
                                    W
                                
                            
                            :
                            
                                
                                     
                                    R
                                
                                
                                    M
                                
                            
                             
                            →
                            
                                
                                    R
                                
                                
                                    R
                                
                            
                             
                        
                    where                         
                            f
                            
                                
                                    ⋅
                                    ;
                                     
                                    W
                                
                            
                        
                     denotes the mapping function and                          
                            W
                        
                     denote the set of parameters. In this work, we propose to employ a DNN as our mapping function [where R                         
                            ≪
                        
                     M]… [t]o prevent trivial low-dimensional representations such as all-zero vectors, SAE [stacked autoencoder] uses a decoding network g(·;                        
                            Z
                        
                    ) to map the                         
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ’s back to the data domain and requires that g(                        
                            
                                
                                    h
                                
                                
                                    i
                                
                            
                        
                    ;                        
                            Z
                        
                    ) and                         
                            
                                
                                    x
                                
                                
                                    i
                                
                            
                        
                     match each other well under…least squares-based measures By the above reasoning, we come up with the following cost function:                         
                            
                                
                                    
                                        
                                            
                                                
                                                    
                                                    
                                                        
                                                            
                                                                
                                                                    min
                                                                
                                                                
                                                                    W
                                                                    ,
                                                                    Z
                                                                    ,
                                                                     
                                                                    M
                                                                    ,
                                                                     
                                                                    {
                                                                    
                                                                        
                                                                            s
                                                                        
                                                                        
                                                                            i
                                                                        
                                                                    
                                                                    }
                                                                     
                                                                
                                                            
                                                        
                                                        ⁡
                                                        
                                                            
                                                                
                                                                    ∑
                                                                    
                                                                        i
                                                                    
                                                                    
                                                                        N
                                                                    
                                                                
                                                                
                                                                    (
                                                                    l
                                                                    
                                                                        
                                                                            g
                                                                            
                                                                                
                                                                                    f
                                                                                    
                                                                                        
                                                                                            
                                                                                                
                                                                                                    x
                                                                                                
                                                                                                
                                                                                                    i
                                                                                                
                                                                                            
                                                                                        
                                                                                    
                                                                                
                                                                            
, x_i) + \frac{\lambda}{2} \| f(x_i) - M s_i \|_2^2 )
… [t]he function $l(\cdot): \mathbb{R}^M \to \mathbb{R}$ is a certain loss function that measures the reconstruction error. In this work, we adopt the least-squares loss….” Yang teaches that $g(h_i; Z)$ and $x_i$ match each other well under least squares-based measures, where $h_i = f(x_i; W)$ represents the latent lower-dimensional representation created by a DNN and $g(h_i; Z)$, which maps the $h_i$’s back to the data domain of $x_i$, is created by a stacked autoencoder (i.e. the equidistant embedding layers). The following cost function:
                            
                                
                                    
                                        
                                            
                                                
                                                    
                                                    
                                                        
                                                            
                                                                
$\min_{W, Z, M, \{s_i\}} \sum_{i}^{N} \left( l\big(g(f(x_i)), x_i\big) + \frac{\lambda}{2} \| f(x_i) - M s_i \|_2^2 \right)$
contains the function $l(\cdot): \mathbb{R}^M \to \mathbb{R}$, which is a certain loss function that measures the reconstruction error between $g(h_i; Z)$ and $x_i$, where $x_i$ represents the data samples of the dataset (i.e. configured to enforce, for each respective category of a plurality of categories described in a dataset, an equidistant relationship among a plurality of features included within the respective category)).
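For illustration only, the cost function quoted above from Yang can be evaluated numerically as in the following minimal sketch. The names `joint_cost`, `encode`, `decode`, `centroids`, and `assignments` are hypothetical and do not appear in Yang; the sketch further assumes that each $s_i$ selects a single column of $M$, so that $M s_i$ is one centroid, and that $l$ is the least-squares loss.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as the least-squares loss l(a, b)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def joint_cost(
    xs: Sequence[Vector],
    encode: Callable[[Vector], Vector],   # stands in for f(x_i; W)
    decode: Callable[[Vector], Vector],   # stands in for g(h_i; Z)
    centroids: Sequence[Vector],          # columns of M (assumption: one per cluster)
    assignments: Sequence[int],           # index selected by s_i (assumption: one-hot s_i)
    lam: float,
) -> float:
    """Sum over i of l(g(f(x_i)), x_i) + (lam / 2) * ||f(x_i) - M s_i||_2^2."""
    total = 0.0
    for x, k in zip(xs, assignments):
        h = encode(x)                               # latent representation h_i
        reconstruction = squared_l2(decode(h), x)   # reconstruction error term
        clustering = squared_l2(h, centroids[k])    # distance to assigned centroid
        total += reconstruction + 0.5 * lam * clustering
    return total


# Toy data: the "encoder" keeps the first coordinate, the "decoder" pads with zero.
encode = lambda x: [x[0]]
decode = lambda h: [h[0], 0.0]
xs = [[1.0, 0.0], [3.0, 1.0]]
centroids = [[1.0], [2.0]]
assignments = [0, 1]
cost = joint_cost(xs, encode, decode, centroids, assignments, lam=2.0)
```

In this toy case the first sample is reconstructed exactly and sits on its centroid, so only the second sample contributes to the cost.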
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Ma with the teachings of Yang. The motivation to do so would be to optimize DR (dimensionality reduction) using a non-linear deep learning framework to find a more optimal lower-dimensional space for K-means clustering (Yang, pg. 2, see also Fig. 1, “In this work, we propose a joint DR and K-means clustering framework, where the DR part is implemented through learning a DNN, rather than a linear model… our objective is well-motivated: by better modeling the data transformation process with a more general model, a much more K-means friendly latent space can be learned… the kind of performance that can be expected using our proposed method can be seen in Fig. 1, where we generate four clusters of 2-D data which are well separated in the 2-D Euclidean space… [o]ne can see that the proposed algorithm outputs reduced-dimension data that are most suitable for applying K-means.”).
Ma does not teach: in a reduced dimension dataset in a reduced dimension space from a dimension space that is maintained from a dataset.
However, Denisiuk teaches: in a reduced dimension dataset in a reduced dimension space from a dimension space that is maintained from a dataset (Denisiuk, pgs. 21-23, see also algorithm 2, “We show that binary Hamming s-dimensional set can be embedded into s − 1 dimensional sphere…[l]et $|A_j| = a_j$ for $j = 1, \ldots, s$. Then $A = A_1 \times \cdots \times A_s$
with the standard Hamming metrics $d_H(n, n') = s^{-1} \left| \{ i = 1, \ldots, s : n_i \neq n_i' \} \right|$ can be embedded into the standard unit sphere $S^{q-1}$ ($q = \sum_{j=1}^{s} a_j - s$)
…[t]he embedding of a record $r = (x, n)$ into the cylinder $\mathbb{R}^p \times S^{q-1}$ is defined as follows: $(x, n) \mapsto (x, y) = (x, \phi_s(n)) = (x, y)$,
where $\phi_s : n = (n_1, \ldots, n_s) \mapsto (e_1^1, \ldots, e_{k_1}^1, \ldots, e_1^s, \ldots, e_{k_s}^s)$ where $k_j = a_j - 1$ and $e_1^j, \ldots, e_{k_j}^j$ are the coordinates of $n_j$ on the corresponding simplex ($j = 1, \ldots, s$) and $\|y\| = 1$
                     .” Denisiuk teaches: we show that binary Hamming s-dimensional dataset can be embedded into s − 1 dimensional sphere  (i.e. in a reduced dimension space from a dimension space that is maintained from a dataset)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Ma with the teachings of Denisiuk. The motivation to do so would be to find an embedding space for nominal data that best approximates the original metric space with the least amount of distortion (Denisiuk, pg. 18, “To perform the k-means algorithm, one should be able to measure a distance between two records of data and to compute a centroid of a finite set of data records. The embedding of the considered dataset into a metric space equipped with a method of computing centroids is the core idea of this paper. We search for a relevant space by embedding the Hamming metric space of nominal data into a Riemannian manifold with possibly small distortion. The classical approach, representing nominal values as equidistant vertexes of a simplex, can be considered as embedding of the Hamming metric space into the Euclidean space. In general, isometric embedding of the Hamming metric into the Euclidean space is not possible [due to distortion]…[t]wo- and three-dimensional examples suggest that embedding of nominal values into a sphere has a distortion that is less than distortion of classical embedding into Euclidean space…[w]e give a quantitative measure of this distortion improvement… which is at least 75% better than the distortion of embedding into the Euclidean space.”).
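For illustration only, the “classical approach” quoted above (nominal values represented as equidistant vertices of a simplex) can be sketched as follows. This is a minimal sketch and is not taken from Denisiuk or Ma: the function names are hypothetical, and it uses one-hot vectors with a coordinates per attribute rather than the $k_j = a_j - 1$ coordinates recited in the quotation, which is a simplifying assumption.

```python
import itertools
import math


def simplex_vertices(num_values):
    """Return one one-hot vector per nominal value (hypothetical helper).

    The one-hot vectors are the vertices of a regular simplex, so every
    pair of distinct nominal values is the same Euclidean distance apart.
    """
    return [
        [1.0 if i == j else 0.0 for j in range(num_values)]
        for i in range(num_values)
    ]


def pairwise_distances(points):
    """Euclidean distance between every unordered pair of points."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


# A nominal attribute with four possible values (assumed example).
vertices = simplex_vertices(4)
distances = pairwise_distances(vertices)

# Equidistance: all pairwise distances coincide.
all_equal = all(math.isclose(d, distances[0]) for d in distances)
```

Each one-hot vector also has unit norm, so in this simplified form the vertices lie on a unit sphere; the distortion comparison quoted from Denisiuk is not reproduced here.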
Regarding claim 10, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, the at least one computer-readable storage medium storing processor-executable instructions that, responsive to execution by the processing system, cause the processing system to perform operations comprising: receiving a second dataset corresponding to the first task (Ma, pgs. 3-4, “[F]or CTCVR task, impressions with click and conversion events occurred simultaneously are labeled y&z = 1, otherwise y&z = 0.” See also Table 1, which details the statistics of the experimental dataset, in which the dataset has the categories and features of #impression, #click, #conversion); generating a prediction of an outcome of the first task based on the second dataset and the first machine learning model (Ma, pgs. 3-4, see also table 2 and fig. 3, “CTCVR prediction task which estimates pCTCVR on dataset with all impressions. [This] [t]ask…aims to compare different CVR modeling methods over entire input space, which reflects the model performance corresponding to SSB problem. In CTCVR task, all models calculate pCTCVR by pCTR × pCVR, where: i) pCVR is estimated by each model respectively, ii) pCTR is estimated with a same independently trained CTR network…tasks split the first 1/2 data in the time sequence to be training set while the rest to be test set. Area under the ROC curve (AUC) is adopted as performance metrics. All experiments are repeated 10 times and averaged results are reported.”); and outputting the generated prediction of the outcome of the first task (Ma, pgs. 3-4, see also table 2 and fig. 3, “Compared with BASE model the ESMM [model] achieves…[on the] CTCVR task with full samples, [a]…3.25% AUC gain.” Note: The AUC (i.e. area under the ROC curve) is a performance metric that measures the prediction accuracy of the model on a given task; the higher the AUC, the more accurate the model’s predictions).
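For illustration only, the calculation quoted from Ma (pCTCVR computed as pCTR × pCVR and evaluated by AUC) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the probability values, and the labels are hypothetical and do not come from Ma's dataset or code, and AUC is computed here by direct pairwise comparison rather than by any method Ma describes.

```python
def pctcvr(pctr, pcvr):
    """Element-wise product pCTR x pCVR, one value per impression."""
    return [ctr * cvr for ctr, cvr in zip(pctr, pcvr)]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as one half.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical per-impression estimates (assumed values, not from Ma).
pctr_estimates = [0.9, 0.2, 0.6, 0.1]
pcvr_estimates = [0.5, 0.1, 0.4, 0.3]
# Label 1 means click and conversion both occurred (y&z = 1).
ctcvr_labels = [1, 0, 1, 0]

predictions = pctcvr(pctr_estimates, pcvr_estimates)
score = auc(ctcvr_labels, predictions)
```

In this toy input every positive impression outscores every negative one, so the pairwise AUC is at its maximum; real data would generally yield a value between 0.5 and 1.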
Regarding claim 11, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, wherein the first machine learning model and the second machine learning model are generated concurrently (Ma, pgs. 3-4, see also fig. 2, “ESMM share the same idea to model CVR over entire space which involve three networks of CVR [the first machine learning model], CTR [the second machine learning model] and CTCVR… ESMM co-train the three networks [are generated concurrently]…”).
Regarding claim 12, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, wherein the first task is a supervised task and the second task is an unsupervised task (Ma, pg. 2, as figure 2 states: “Architecture overview of ESMM for CVR modeling. In ESMM, two auxiliary tasks of CTR and CTCVR are introduced which: i) help to model CVR over entire input space, ii) provide feature representation transfer learning. ESMM mainly consists of two sub-networks: CVR network illustrated in the left part of this figure [the first task is a supervised task] and CTR network in the right part. Embedding parameters of CTR and CVR network are shared [the second task is an unsupervised task]. CTCVR takes the product of outputs from CTR and CVR network as the output.” Examiner note: the process of implementing an embedding is being interpreted as an unsupervised task).
Regarding claim 13, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, wherein the first task is a supervised task and the second task is a supervised task (Ma, pg. 2, as figure 2 states: “Architecture overview of ESMM for CVR modeling. In ESMM, two auxiliary tasks of CTR and CTCVR are introduced which: i) help to model CVR over entire input space, ii) provide feature representation transfer learning. ESMM mainly consists of two sub-networks: CVR network illustrated in the left part of this figure [the first task is a supervised task] and CTR network in the right part [the second task is a supervised task]. Embedding parameters of CTR and CVR network are shared. CTCVR takes the product of outputs from CTR and CVR network as the output.”).
Regarding claim 14, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, wherein the extracting feature interactions includes determining complementary information beneficial to the first task and the second task (Ma, pg. 3, see also fig. 2, “embedding layer maps large scale sparse inputs into low dimensional representation vectors. It contributes most of the parameters of deep network and learning of which needs huge volume of training samples. In ESMM, embedding dictionary of CVR network is shared with that of CTR network. It follows a feature representation transfer learning paradigm. Training samples with all impressions for CTR task is relatively much richer than CVR task. This parameter sharing mechanism enables CVR network in ESMM to learn from un-clicked impressions and provides great help for alleviating the data sparsity trouble.”).
Regarding claim 15, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, wherein the extracting feature interactions includes determining a feature interaction beneficial to the first task, and wherein the generating the second machine learning model corresponding to the second task includes utilizing the determined feature interaction (Ma, pg. 3, see also fig. 2, “embedding layer maps large scale sparse inputs into low dimensional representation vectors. It contributes most of the parameters of deep network and learning of which needs huge volume of training samples. In ESMM, embedding dictionary of CVR network is shared with that of CTR network. It follows a feature representation transfer learning paradigm. Training samples with all impressions for CTR task is relatively much richer than CVR task. This parameter sharing mechanism enables CVR network in ESMM to learn from un-clicked impressions and provides great help for alleviating the data sparsity trouble.”).
Regarding claim 16, Ma in view of Yang and in view of Denisiuk teaches the at least one computing device of claim 9, wherein the generating the first machine learning model and the generating the second machine learning model are subject to the same training criterion (Ma, pgs. 2-4, see also fig. 2, fig. 3, “To be fair, all competitors including ESMM share the same network structure and hyper parameters…which i) uses ReLU activation function, ii) sets the dimension of embedding vector to be 18, iii) sets dimensions of each layers in MLP network to be 360 × 200 × 80 × 2, iv) uses adam solver with parameter β1 = 0.9, β2 = 0.999, ϵ = 10^−8.”).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Adam Clark Standke whose telephone number is (571)270-1806. The examiner can normally be reached 10AM-10PM M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Adam Clark Standke
Assistant Examiner
Art Unit 2129



/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129

        1 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.