Detailed Action
This action is in response to claims filed February 7, 2022 for application 16/314,457 filed December 31, 2018. Claims 1, 4, 5, 9, 12, 13, 17, 21, and 22 are amended. Claims 1-20 are pending. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 11/11/2019 and 12/14/2020 were filed. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 1, 2, 3, 9, 10, 11, 17, 18, 21 are rejected under 35 U.S.C. 103 as being unpatentable over Commons (US 9015093 B1) in view of Zhan (“Urban link travel time estimation using large-scale taxi data with partial information”) in view of Dong (WO2018099480A1) in view of Lint (“Accurate freeway travel time prediction with state-space neural networks under missing data”).

Regarding claim 1, 
Commons teaches a computing system for estimating travel time and distance, comprising: one or more processors; and a memory storing instructions that, when executed by the one or more processors, cause the computing system to: (Commons, col. 53, ln. 26-30, “The computer system 400 may be used to implement the techniques described herein. According to one embodiment, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406.”). 
train a neural network model … to obtain a trained model, (Commons, claim 1, “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”; “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”).
wherein: the neural network model comprises a first module and a second module; (Commons, col. 30, ln. 1-3, “FIG. 2 is a block diagram of an embodiment of the stacked neural network of the present invention comprising three architecturally distinct, ordered neural networks.”; col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module, and elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module)
the first module comprises a first number of neuron layers; (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module).
the first module is … (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module).
the second module comprises a second number of neuron layers; (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module). and
the second module is configured to obtain information of a last layer of the first module … (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module, and elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module).
Commons does not explicitly teach 
obtain a vehicle trip dataset comprising an origin, a destination, a time-of-day, a trip time, and a trip distance associated with each of a plurality of trips, wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; 
preprocess the vehicle trip dataset to: discretize the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; and 
convert the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells;
with the preprocessed vehicle trip dataset;  
… configured to obtain the origin and the destination as first inputs to estimate a travel distance; 
… the time-of-day as second inputs to estimate a travel time.
Zhan, however, teaches 
obtain a vehicle trip dataset comprising an origin, a destination, …, a trip time, and a trip distance associated with each of a plurality of trips; (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; (Zhan, pg. 43, section 3, para. 1; “The data records each trip origin and destination GPS coordinate”). (GPS coordinates include geo-coordinates).
with the … vehicle trip dataset; (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
… configured to obtain the origin and the destination as first inputs to estimate a travel distance; (Zhan, pg. 39, section 2, para. 1, “The taxi trip origin and destination points are first mapped to the nearest links in the network. Instead of using all possible paths between each origin and destination points, we use k-shortest path algorithm to construct 20 shortest paths for each OD nodal pair of a trip, referred as the reasonable path set.”; pg. 47, section 5, para. 1, “The proposed model treats the path taken as latent, constructs a reasonable path set”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons to obtain a vehicle trip dataset as taught by Zhan in order to obtain inputs for estimating travel distance. The motivation to do so is that the generated path data sets serve as the basis for the link travel time estimation process. (“the generated reasonable path sets serve as the basis for the link travel time estimation process.” (Zhan, pg. 39, section 2, para. 1)).
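For illustration only, the "reasonable path set" quoted from Zhan can be pictured as the k cheapest loop-free paths between a mapped origin and destination. The sketch below is a plain best-first enumeration over a made-up toy network. It is not taken from Zhan or from the record, and it is not necessarily the k-shortest path algorithm Zhan used. The function name, node labels, and link lengths are all hypothetical.

```python
import heapq

def k_shortest_simple_paths(graph, source, target, k):
    """Return up to k loop-free paths from source to target, cheapest first.

    graph: dict mapping node -> dict of neighbor -> non-negative link length.
    """
    # Each heap entry is (cost so far, path so far); the cheapest partial
    # path is always expanded first, so complete paths come out in cost order.
    heap = [(0.0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbor, length in graph.get(node, {}).items():
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                heapq.heappush(heap, (cost + length, path + [neighbor]))
    return found

# Hypothetical toy network; node names and link lengths are made up.
toy_network = {
    "O": {"A": 1.0, "B": 2.0},
    "A": {"D": 2.0, "B": 0.5},
    "B": {"D": 1.0},
    "D": {},
}
paths = k_shortest_simple_paths(toy_network, "O", "D", k=3)
```

Zhan's quoted passage uses 20 paths per origin-destination pair; the toy call above uses k=3 only to keep the example small. The enumeration is exponential in the worst case and is meant to show the idea, not to scale to a city network.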
Commons/Zhan does not explicitly teach
preprocess the vehicle trip dataset to: discretize the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; and convert the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells;
[the] preprocessed [vehicle trip dataset];
Dong, however, teaches 
preprocess the vehicle trip dataset to: discretize the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”). and 
convert the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; (Dong, [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”; 
[0041] (Formula 6; image not reproduced)).
[the] preprocessed [vehicle trip dataset]; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is to facilitate the search for similar trajectories. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
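For illustration only, the two preprocessing steps discussed above (discretizing a GPS point into a spatial cell, then replacing the point with the cell's own geo-coordinates) can be sketched with a uniform latitude/longitude grid. This is an assumed, simplified grid, not Dong's Formula 6 and not the applicant's implementation. The function names, the cell size, and the sample point are hypothetical.

```python
import math

def discretize(lat, lon, cells_per_degree=100):
    """Map a GPS point to (row, col) indices of a uniform lat/lon grid (assumed grid)."""
    return (math.floor(lat * cells_per_degree), math.floor(lon * cells_per_degree))

def cell_center(cell, cells_per_degree=100):
    """Return the geo-coordinates of the center of a grid cell."""
    row, col = cell
    return ((row + 0.5) / cells_per_degree, (col + 0.5) / cells_per_degree)

origin = (40.7512, -73.9876)   # hypothetical pickup point
cell = discretize(*origin)     # spatial cell containing the point
snapped = cell_center(cell)    # geo-coordinates of that cell
```

Every point that falls in the same cell is represented by the same coordinates after this step, which is the sense in which the continuous coordinates become discrete.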
Commons/Zhan/Dong does not explicitly teach
a time-of-day;  
… the time-of-day as second inputs to estimate a travel time.
Lint, however, teaches 
a time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer).
… the time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.) as second inputs to estimate a travel time. (Lint, Fig. 2). (Context layer x(t) in Fig. 2 could act as the input for the time-of-day. Time t can be interpreted as the time-of-day).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan/Dong to use the time information to obtain a travel time as taught by Lint. The motivation to do so is that it is useful to know the start time for accurate travel time estimation (Lint, pg. 353, section 3.2, para. 1).
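For illustration only, one possible reading of the two-module arrangement recited in claim 1 is sketched below: a first module receives the origin and destination and outputs a distance estimate, and a second module receives the first module's last hidden layer together with the time-of-day and outputs a time estimate. The sketch is not drawn from Commons, Lint, or the application. The layer sizes, the ReLU activation, the choice of which layer counts as the "last layer", and the random untrained weights are all assumptions, so the numeric outputs are meaningless.

```python
import random

def dense(inputs, weights, biases, relu=True):
    """One fully connected layer: out[j] = biases[j] + sum_i inputs[i] * weights[i][j]."""
    outputs = []
    for j, bias in enumerate(biases):
        total = bias + sum(x * row[j] for x, row in zip(inputs, weights))
        outputs.append(max(0.0, total) if relu else total)
    return outputs

def init_layer(n_in, n_out, rng):
    """Random placeholder parameters; a real model would learn these from trip data."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

rng = random.Random(0)
# First module: origin and destination coordinates -> hidden layer -> distance.
m1_hidden = init_layer(4, 8, rng)
m1_out = init_layer(8, 1, rng)
# Second module: last hidden layer of the first module plus time-of-day -> time.
m2_hidden = init_layer(8 + 1, 8, rng)
m2_out = init_layer(8, 1, rng)

def estimate(origin, destination, time_of_day):
    """Forward pass through both modules; returns (distance, travel_time)."""
    first_inputs = list(origin) + list(destination)
    last_layer = dense(first_inputs, *m1_hidden)
    distance = dense(last_layer, *m1_out, relu=False)[0]
    second_inputs = last_layer + [time_of_day]
    hidden = dense(second_inputs, *m2_hidden)
    travel_time = dense(hidden, *m2_out, relu=False)[0]
    return distance, travel_time

# Hypothetical normalized inputs: cell coordinates and a scaled time-of-day.
distance, travel_time = estimate((0.1, 0.2), (0.3, 0.4), 0.5)
```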

Regarding claim 2, 
Commons/Zhan/Dong/Lint teaches the computing system of claim 1, (and thus the rejection of claim 1 is incorporated) wherein the memory further stores instructions that, when executed by the one or more processors, cause the computing system to: (Commons, col. 53, ln. 26-30, “The computer system 400 may be used to implement the techniques described herein. According to one embodiment, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406.”). 
Zhan further teaches receive a query comprising a query origin, a query destination, (Zhan, “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”) and 
Lint further teaches a query time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.); and
Zhan further teaches apply the trained model to the received query to estimate the travel time and the travel distance from the query origin to the query destination for the query time-of-day. (Zhan, “In this study, a new model is proposed to use the limited information provided in the taxi GPS data to estimate urban link travel times.”; “The proposed model treats the path taken as latent, constructs a reasonable path set, formulates an MNL model to compute the probability of a path being taken by the driver”) 

Regarding claim 3, 
Commons/Zhan/Dong/Lint teaches the computing system of claim 1, (and thus the rejection of claim 1 is incorporated). Zhan further teaches wherein: the origin and the destination each comprise binned GPS (Global Positioning System) coordinates; (Zhan, Fig. 2, “Thus a data mapping process is introduced to pre-process the raw GPS data. There are two purposes in this step: first, to map the data to nearest links in the road network to reduce GPS errors; second, to match the starting and ending points to the actual road network and transform the raw data into usable data for network level analysis.” “Fig. 2 illustrates the data mapping procedure. The raw origin and destination points (black points in Fig. 2) are mapped to the perpendicular foot of the nearest link (blue points in Fig. 2), and the new points are then used in the later analysis.”). and
Zhan further teaches the trip time comprises binned trip time. (Zhan, Fig. 4 (a), (b), “In this study, the data is split into hourly intervals, and link travel times are estimated using the data from the corresponding hour.”).
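For illustration only, the hourly split quoted from Zhan can be pictured as grouping trip records by the clock hour of pickup and then summarizing each group. The sketch below is an assumption about how such a split might look; the record fields (`pickup`, `duration_s`) and the sample values are made up and do not come from Zhan's data.

```python
from collections import defaultdict
from datetime import datetime

def split_by_hour(trips):
    """Group trip records by the clock hour (0-23) of their pickup time."""
    bins = defaultdict(list)
    for trip in trips:
        bins[trip["pickup"].hour].append(trip)
    return dict(bins)

# Hypothetical records; field names and values are made up.
trips = [
    {"pickup": datetime(2010, 3, 15, 8, 5), "duration_s": 600},
    {"pickup": datetime(2010, 3, 15, 8, 40), "duration_s": 540},
    {"pickup": datetime(2010, 3, 15, 17, 10), "duration_s": 900},
]
hourly = split_by_hour(trips)
# Mean trip duration per hourly bin, in seconds.
mean_duration = {h: sum(t["duration_s"] for t in ts) / len(ts) for h, ts in hourly.items()}
```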

Regarding claim 9, 
Commons teaches a method for estimating travel time and distance, comprising:
training a neural network model … to obtain a trained model, (Commons, claim 1, “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”; “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”).
wherein: the neural network model comprises a first module and a second module; (Commons, col. 30, ln. 1-3, “FIG. 2 is a block diagram of an embodiment of the stacked neural network of the present invention comprising three architecturally distinct, ordered neural networks.”; col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module, and elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module)
the first module comprises a first number of neuron layers; (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module).
the first module is … (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module).
the second module comprises a second number of neuron layers; (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module). and
the second module is configured to obtain information of a last layer of the first module … (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module, and elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module).
Commons does not explicitly teach 
obtaining a vehicle trip dataset comprising an origin, a destination, a time-of-day, a trip time, and a trip distance associated with each of a plurality of trips, 
wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; 
with the preprocessed vehicle trip dataset;  
… configured to obtain the origin and the destination as first inputs to estimate a travel distance; 
… the time-of-day as second inputs to estimate a travel time.
Zhan, however, teaches 
obtaining a vehicle trip dataset comprising an origin, a destination, …, a trip time, and a trip distance associated with each of a plurality of trips; (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; (Zhan, pg. 43, section 3, para. 1; “The data records each trip origin and destination GPS coordinate”). (GPS coordinates include geo-coordinates).
with the … vehicle trip dataset; (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
… configured to obtain the origin and the destination as first inputs to estimate a travel distance; (Zhan, pg. 39, section 2, para. 1, “The taxi trip origin and destination points are first mapped to the nearest links in the network. Instead of using all possible paths between each origin and destination points, we use k-shortest path algorithm to construct 20 shortest paths for each OD nodal pair of a trip, referred as the reasonable path set.”; pg. 47, section 5, para. 1, “The proposed model treats the path taken as latent, constructs a reasonable path set”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons to obtain a vehicle trip dataset as taught by Zhan in order to obtain inputs for estimating travel distance. The motivation to do so is that the generated path data sets serve as the basis for the link travel time estimation process. (“the generated reasonable path sets serve as the basis for the link travel time estimation process.” (Zhan, pg. 39, section 2, para. 1)).
Commons/Zhan does not explicitly teach
preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; 
[the] preprocessed [vehicle trip dataset];
Dong, however, teaches 
preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”). and 
converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; (Dong, [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”; 
[0041] (Formula 6; image not reproduced)).
[the] preprocessed [vehicle trip dataset]; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is to facilitate the search for similar trajectories. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
Commons/Zhan/Dong does not explicitly teach
a time-of-day;  
… the time-of-day as second inputs to estimate a travel time.
Lint, however, teaches 
a time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.)
… the time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.) as second inputs to estimate a travel time. (Lint, Fig. 2). (Context layer x(t) in Fig. 2 could act as the input for the time-of-day. Time t can be interpreted as the time-of-day). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan/Dong to use the time information to obtain a travel time as taught by Lint. The motivation to do so is that it is useful to know the start time for accurate travel time estimation (Lint, pg. 353, section 3.2, para. 1).

Regarding claim 10, 
Commons/Zhan/Dong/Lint teaches the method of claim 9, (and thus the rejection of claim 9 is incorporated). Zhan further teaches receiving a query comprising a query origin, a query destination, (Zhan, “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”) and 
Lint further teaches a query time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.); and
Zhan further teaches applying the trained model to the received query to estimate the travel time and the travel distance from the query origin to the query destination for the query time-of-day. (Zhan, “In this study, a new model is proposed to use the limited information provided in the taxi GPS data to estimate urban link travel times.”; “The proposed model treats the path taken as latent, constructs a reasonable path set, formulates an MNL model to compute the probability of a path being taken by the driver”) 

Regarding claim 11, 
Commons/Zhan/Dong/Lint teaches the method of claim 9, (and thus the rejection of claim 9 is incorporated). Zhan further teaches the origin and the destination each comprise binned GPS (Global Positioning System) coordinates; (Zhan, Fig. 2, “Thus a data mapping process is introduced to pre-process the raw GPS data. There are two purposes in this step: first, to map the data to nearest links in the road network to reduce GPS errors; second, to match the starting and ending points to the actual road network and transform the raw data into usable data for network level analysis.” “Fig. 2 illustrates the data mapping procedure. The raw origin and destination points (black points in Fig. 2) are mapped to the perpendicular foot of the nearest link (blue points in Fig. 2), and the new points are then used in the later analysis.”). and
Zhan further teaches the trip time comprises binned trip time. (Zhan, Fig. 4 (a), (b), “In this study, the data is split into hourly intervals, and link travel times are estimated using the data from the corresponding hour.”).
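For illustration only, the data mapping step quoted from Zhan (moving a raw GPS point to the perpendicular foot of the nearest link) can be sketched as a point-to-segment projection. The sketch treats coordinates as planar and clamps the foot to the segment's endpoints; both are simplifying assumptions, and the function names and sample links are hypothetical rather than taken from Zhan.

```python
def perpendicular_foot(point, seg_start, seg_end):
    """Project a planar point onto a segment, clamping to the segment's endpoints."""
    px, py = point
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return (ax, ay)  # degenerate link: both endpoints coincide
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def map_to_nearest_link(point, links):
    """Return (link, foot) for the link whose projected foot is closest to the point."""
    def squared_gap(link):
        fx, fy = perpendicular_foot(point, *link)
        return (fx - point[0]) ** 2 + (fy - point[1]) ** 2
    nearest = min(links, key=squared_gap)
    return nearest, perpendicular_foot(point, *nearest)

# Two hypothetical parallel links and one raw point between them.
links = [((0.0, 0.0), (4.0, 0.0)), ((0.0, 3.0), (4.0, 3.0))]
link, foot = map_to_nearest_link((1.0, 1.0), links)
```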

Regarding claim 17, 
Commons teaches a non-transitory computer-readable medium for estimating travel time and distance, comprising instructions stored therein, wherein the instructions, when executed by one or more processors, cause the one or more processors to perform the steps of: (Commons, col. 53, ln. 26-30, “The computer system 400 may be used to implement the techniques described herein. According to one embodiment, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406.”). 
training a neural network model … to obtain a trained model, (Commons, claim 1, “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”; “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”).
wherein: the neural network model comprises a first module and a second module; (Commons, col. 30, ln. 1-3, “FIG. 2 is a block diagram of an embodiment of the stacked neural network of the present invention comprising three architecturally distinct, ordered neural networks.”; col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module, and elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module)
the first module comprises a first number of neuron layers; (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module).
the first module is … (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module).
the second module comprises a second number of neuron layers; (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module). and
the second module is configured to obtain information of a last layer of the first module … (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module, and elements 22, 24 and 62 in Fig. 2 can be interpreted as the second module).
Commons does not explicitly teach 
obtaining a vehicle trip dataset comprising an origin, a destination, a time-of-day, a trip time, and a trip distance associated with each of a plurality of trips, 
wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; 
with the preprocessed vehicle trip dataset;  
… configured to obtain the origin and the destination as first inputs to estimate a travel distance; 
… the time-of-day as second inputs to estimate a travel time.
Zhan, however, teaches 
obtaining a vehicle trip dataset comprising an origin, a destination, …, a trip time, and a trip distance associated with each of a plurality of trips, (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; (Zhan, pg. 43, section 3, para. 1; “The data records each trip origin and destination GPS coordinate”). (GPS coordinates include geo-coordinates).
with the … vehicle trip dataset; (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
… configured to obtain the origin and the destination as first inputs to estimate a travel distance; (Zhan, pg. 39, section 2, para. 1, “The taxi trip origin and destination points are first mapped to the nearest links in the network. Instead of using all possible paths between each origin and destination points, we use k-shortest path algorithm to construct 20 shortest paths for each OD nodal pair of a trip, referred as the reasonable path set.”; pg. 47, section 5, para. 1, “The proposed model treats the path taken as latent, constructs a reasonable path set”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons with obtaining a vehicle trip dataset as taught by Zhan to obtain inputs to estimate travel distance. The motivation to do so is that the generated path data sets serve as the basis for the link travel time estimation process. (“the generated reasonable path sets serve as the basis for the link travel time estimation process.” (Zhan, pg. 39, section 2, para. 1)).
Commons/Zhan does not explicitly teach 
preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; 
[the] preprocessed [vehicle trip dataset];
Dong, however, teaches 
preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”). and 
converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; (Dong, [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”; 
[0041] (Formula 6, shown as an image in the original)).
[the] preprocessed [vehicle trip dataset]; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is that the search of similar trajectories can be facilitated. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
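For illustration only, the discretizing and converting steps discussed above can be sketched in Python. This is not Dong's Formula 6, which is not reproduced in the text of this action; the sketch assumes a plain uniform latitude/longitude grid, and the names `CELL_DEG`, `to_cell`, `cell_center`, and `preprocess_trip` are hypothetical.

```python
import math

# Assumed cell size in degrees; hypothetical, not taken from Dong or the claims.
CELL_DEG = 0.01


def to_cell(lon, lat, cell_deg=CELL_DEG):
    """Discretize a (longitude, latitude) point into integer cell indices."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def cell_center(cell, cell_deg=CELL_DEG):
    """Convert cell indices into the geo-coordinates of the cell center."""
    ix, iy = cell
    return ((ix + 0.5) * cell_deg, (iy + 0.5) * cell_deg)


def preprocess_trip(origin, destination, cell_deg=CELL_DEG):
    """Replace raw origin/destination coordinates with their cell centers."""
    o = cell_center(to_cell(*origin, cell_deg), cell_deg)
    d = cell_center(to_cell(*destination, cell_deg), cell_deg)
    return o, d
```

Under these assumptions, any two GPS points that fall in the same cell are mapped to the same pair of geo-coordinates, which is the effect of discretization the rejection relies on.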
Commons/Zhan/Dong does not explicitly teach 
a time-of-day;  
… the time-of-day as second inputs to estimate a travel time.
However, Lint teaches 
a time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.)
… the time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.) as second inputs to estimate a travel time. (Lint, Fig. 2). (Context layer x(t) in Fig. 2 could act as the input for the time-of-day. Time t can be interpreted as the time-of-day.) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan/Dong to use the time information to obtain a travel time as taught by Lint. The motivation to do so is that it is useful to know the start time for accurate travel time estimation (Lint, pg. 353, section 3.2, para. 1). 

Regarding claim 18, 
Commons/Zhan/Dong/Lint teaches the non-transitory computer-readable medium of claim 17, (and thus the rejection of claim 17 is incorporated). 
Commons teaches wherein the instructions, when executed by one or more processors, further perform the steps of: (Commons, col. 53, ln. 26-30, “The computer system 400 may be used to implement the techniques described herein. According to one embodiment, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406.”). 
Zhan further teaches receiving a query comprising a query origin, a query destination, (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”) and 
Lint further teaches a query time-of-day (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.); and
Zhan further teaches applying the trained model to the received query to estimate the travel time and the travel distance from the query origin to the query destination for the query time-of-day. (Zhan, “In this study, a new model is proposed to use the limited information provided in the taxi GPS data to estimate urban link travel times.”; “The proposed model treats the path taken as latent, constructs a reasonable path set, formulates an MNL model to compute the probability of a path being taken by the driver”).

Regarding claim 21, 
Commons teaches a method for estimating a travel time, comprising: inputting … to a first trained neural network module (Commons, Fig. 2). (Elements 60 and 20 in Fig. 2 can be interpreted as the first module). and inputting … information and information of a last layer of the first trained neural network module to a second trained neural network module (Commons, Fig. 2, col. 32, ln. 11-13, “Referring to FIG. 2, in another embodiment of the present invention, stacked neural network 10 has three architecturally distinct ordered neural networks, 20, 22, and 24.”; col. 61, ln. 40-42; “each respective neural network layer having a plurality of parameters representing training of the respective neural network layer”). (Elements 24 and 62 in Fig. 2 are interpreted as the second module).
Commons does not explicitly teach 
obtaining a vehicle trip dataset comprising time information, an origin, and a destination of a trip, wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; 
the preprocessed origin and the preprocessed destination; 
time information [and information of a last layer of the first trained neural network module to a second trained neural network module] to obtain a travel time
Zhan, however, teaches 
obtaining a vehicle trip dataset comprising … information, an origin, and a destination of a trip, (Zhan, pg. 43, section 3, para. 1; “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”). 
wherein the origin and the destination are represented as origin geo-coordinates and destination geo-coordinates, respectively; (Zhan, pg. 43, section 3, para. 1; “The data records each trip origin and destination GPS coordinate”). (GPS coordinates include geo-coordinates).
the [preprocessed] origin and the [preprocessed] destination [to a first trained neural network module] to obtain a travel distance; (Zhan, pg. 39, section 2, para. 1, “The taxi trip origin and destination points are first mapped to the nearest links in the network. Instead of using all possible paths between each origin and destination points, we use k-shortest path algorithm to construct 20 shortest paths for each OD nodal pair of a trip, referred as the reasonable path set.”; pg. 47, section 5, para. 1, “The proposed model treats the path taken as latent, constructs a reasonable path set”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons with obtaining a vehicle trip dataset as taught by Zhan to obtain inputs to estimate travel distance. The motivation to do so is that the generated path data sets serve as the basis for the link travel time estimation process. (“the generated reasonable path sets serve as the basis for the link travel time estimation process.” (Zhan, pg. 39, section 2, para. 1)).
Commons/Zhan does not explicitly teach 
preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; 
Dong, however, teaches 
preprocessing the vehicle trip dataset, wherein the preprocessing comprises: discretizing the origin geo-coordinates and the destination geo-coordinates into respective spatial cells; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”). and 
converting the origin geo-coordinates and the destination geo-coordinates into geo-coordinates of the respective spatial cells; (Dong, [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”; 
[0041] (Formula 6, shown as an image in the original)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is that the search of similar trajectories can be facilitated. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
Commons/Zhan/Dong does not explicitly teach 
time information [and information of a last layer of the first trained neural network module to a second trained neural network module] to obtain a travel time 
Lint, however, teaches 
time information (Lint, pg. 351, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.) [and information of a last layer of the first trained neural network module to a second trained neural network module] to obtain a travel time. (Lint, Fig. 2). (Context layer x(t) in Fig. 2 could act as the input for the time-of-day. Time t can be interpreted as the time-of-day).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan/Dong to use the time information to obtain a travel time as taught by Lint. The motivation to do so is that it is useful to know the start time for accurate travel time estimation (Lint, pg. 353, section 3.2, para. 1). 


Claims 4, 5, 12, 13 are rejected under 35 U.S.C. 103 as being unpatentable over Commons (US 9015093 B1) in view of Zhan (“Urban link travel time estimation using large-scale taxi data with partial information”) in view of Dong (WO2018099480A1) in view of Lint (“Accurate freeway travel time prediction with state-space neural networks under missing data”) in view of Hoppe (US 2018/0292484 A1).

Regarding claim 4, 
Commons/Zhan/Dong/Lint teaches the computing system of claim 1, (and thus the rejection of claim 1 is incorporated) wherein, to train the neural network model with the [preprocessed] vehicle trip dataset, the computing system is caused to:
Zhan further teaches feed the [converted] origin [geo-coordinates] and the [converted] destination [geo-coordinates] to the first module … (Zhan, “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”).
Dong further teaches preprocessed; converted origin geo-coordinates; converted destination geo-coordinates; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is that the search of similar trajectories can be facilitated. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
However, Commons/Zhan/Dong/Lint does not explicitly teach 
… to obtain the first number and a number of neurons in each of the first number of neuron layers; and
compare the estimated travel distance with the trip distance to tune the first number and the number of neurons in each of the first number of neuron layers.
However, Hoppe teaches … to obtain the first number and a number of neurons in each of the first number of neuron layers; (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”). (a first number of neurons and layers must exist during training before the tuning step). and
compare the estimated travel distance with the trip distance (Hoppe, [0041] “The error between the results and the expected values is then calculated and the gradient of the error function is used to iteratively change the weights in the artificial neural network 100 and to minimize the errors.”) to tune the first number and the number of neurons in each of the first number of neuron layers. (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to tune the number of neurons in each layer and the number of layers. The motivation to do so is “to improve the training results by way of a change in hyper parameters of the network 100.” (Hoppe, [0048]).
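For illustration only, tuning the number of layers and the number of neurons per layer against a validation error can be sketched in Python. Hoppe [0048] names these hyper parameters but, as quoted, does not specify a search procedure; the exhaustive grid search below, and the names `tune_architecture`, `train_and_score`, and `mean_abs_error`, are assumptions made for this sketch.

```python
from itertools import product


def mean_abs_error(estimated, recorded):
    """Compare estimated values with recorded values (a validation metric)."""
    return sum(abs(e - r) for e, r in zip(estimated, recorded)) / len(recorded)


def tune_architecture(train_and_score, layer_counts, neuron_counts):
    """Return the (num_layers, neurons_per_layer) pair with the lowest error.

    train_and_score is a caller-supplied, hypothetical function that trains a
    network of the given shape and returns its validation error, for example
    the mean absolute gap between estimated and recorded trip distance.
    """
    best_shape, best_error = None, float("inf")
    for num_layers, neurons in product(layer_counts, neuron_counts):
        error = train_and_score(num_layers, neurons)
        if error < best_error:
            best_shape, best_error = (num_layers, neurons), error
    return best_shape, best_error
```

The training routine is deliberately left to the caller, so the sketch shows only the comparison and selection step and makes no claim about how any cited reference trains its network.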

Regarding claim 5, 
Commons/Zhan/Dong/Lint/Hoppe teaches the computing system of claim 4, (and thus the rejection of claim 4 is incorporated) wherein, to train the neural network model with the [preprocessed] vehicle trip dataset, the computing system is further caused to: 
Dong further teaches preprocessed; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is that the search of similar trajectories can be facilitated. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
Lint further teaches feed the information of the last layer of the first module (Lint, Fig. 2, section 1, section m, section M) and the time-of-day to the second module (Lint, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.) Hoppe further teaches to obtain the second number and a number of neurons in each of the second number of neuron layers; (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”). and
Hoppe further teaches compare the estimated travel time with the trip time (Hoppe, [0041] “The error between the results and the expected values is then calculated and the gradient of the error function is used to iteratively change the weights in the artificial neural network 100 and to minimize the errors.”) to tune the second number and a number of neurons in each of the second number of neuron layers. (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”).

Regarding claim 12, 
Commons/Zhan/Dong/Lint teaches the method of claim 9, (and thus the rejection of claim 9 is incorporated) wherein training the neural network model with the [preprocessed] vehicle trip dataset comprises:
Zhan further teaches feeding the [converted] origin [geo-coordinates] and the [converted] destination [geo-coordinates] to the first module … (Zhan, “The data used in this research was collected by New York City Taxi and Limousine Commission on a trip by trip basis. The data records each trip origin and destination GPS coordinate, trip distance and duration, fare, payment method, and other related information. The data set contains data from February 2008 to November 2010. In this study, a week’s data (from 3/15/2010 to 3/21/2010) is selected to test the proposed method.”).
Dong further teaches preprocessed; converted origin geo-coordinates; converted destination geo-coordinates; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is that the search of similar trajectories can be facilitated. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
However, Commons/Zhan/Dong/Lint does not explicitly teach 
… to obtain the first number and a number of neurons in each of the first number of neuron layers; and
comparing the estimated travel distance with the trip distance to tune the first number and the number of neurons in each of the first number of neuron layers.
However, Hoppe teaches … to obtain the first number and a number of neurons in each of the first number of neuron layers; (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”). (a first number of neurons and layers must exist during training before the tuning step). and
comparing the estimated travel distance with the trip distance (Hoppe, [0041] “The error between the results and the expected values is then calculated and the gradient of the error function is used to iteratively change the weights in the artificial neural network 100 and to minimize the errors.”) to tune the first number and the number of neurons in each of the first number of neuron layers. (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to tune the number of neurons in each layer and the number of layers. The motivation to do so is “to improve the training results by way of a change in hyper parameters of the network 100.” (Hoppe, [0048]).

Regarding claim 13, 
Commons/Zhan/Dong/Lint/Hoppe teaches the method of claim 12, (and thus the rejection of claim 12 is incorporated) wherein training the neural network model with the [preprocessed] vehicle trip dataset comprises:
Dong further teaches preprocessed; (Dong, [0040] “Define the grid mapping function ρ(p): R^2 →G, where R is the geographic coordinate, G is the grid set after map mapping, p is a point in a two-dimensional continuous space, … In order to discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories, let the grid mapping function g=ρ(p), where p=(longitude, latitude) is the GPS coordinate point, and g=(tileX, tileY) is grid coordinates.”; [0042] “Formula 6 can be directly applied to the grid mapping (Mapping) component, enter the longitude and latitude of any GPS coordinate point within the specified valid range, and specify the map zoom level value that affects the grid size, and then the corresponding grid can be obtained. Coordinates to realize continuous domain trajectory point discretization.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Commons/Zhan to discretize and convert the origin geo-coordinates and the destination geo-coordinates into corresponding grid points as taught by Dong. The motivation to do so is that the search of similar trajectories can be facilitated. (Dong, [0040] “discretize the GPS coordinate points into grid points to facilitate the search of similar trajectories”).
Lint further teaches feeding the information of the last layer of the first module (Lint, Fig. 2, section 1, section m, section M) and the time-of-day to the second module (Lint, Fig. 2). (Time t can be interpreted as the time-of-day, X(t) in the context layer.) Hoppe further teaches to obtain the second number and a number of neurons in each of the second number of neuron layers; (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”). and
Hoppe further teaches comparing the estimated travel time with the trip time (Hoppe, [0041] “The error between the results and the expected values is then calculated and the gradient of the error function is used to iteratively change the weights in the artificial neural network 100 and to minimize the errors.”) to tune the second number and a number of neurons in each of the second number of neuron layers. (Hoppe, [0048] “Validation data sets are used to improve the training results by way of a change in hyper parameters of the network 100. The hyper parameters can be, for example, the number of hidden layers in the network, a learning rate, a size and number of the filter kernels, an optimization method and/or a number of neurons per layer.”).


Claims 6, 7, 14, 15, 19, 22 are rejected under 35 U.S.C. 103 as being unpatentable over Commons (US 9015093 B1) in view of Zhan (“Urban link travel time estimation using large-scale taxi data with partial information”) in view of Dong (WO2018099480A1) in view of Lint (“Accurate freeway travel time prediction with state-space neural networks under missing data”) in view of Ertuna (“Stock Market Prediction Using Neural Network Time Series Forecasting”).

Regarding claim 6, 
Commons/Zhan/Dong/Lint teaches the computing system of claim 1, (and thus the rejection of claim 1 is incorporated). 
However, Commons/Zhan/Dong/Lint does not explicitly teach wherein: the first module is a neural network comprising three neuron layers; and the second module is another neural network comprising three neuron layers. Ertuna, however, teaches the first module is a neural network comprising three neuron layers (Ertuna, pg. 5, section 7, para. 1, “three hidden layers that are diamond shaped (expanding from the input layer towards the middle, and shrinking from the middle towards the output layer)”) and the second module is another neural network comprising three neuron layers. (Ertuna, pg. 5, section 7, para. 1, “three hidden layers that are diamond shaped (expanding from the input layer towards the middle, and shrinking from the middle towards the output layer)”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have three hidden layers. The motivation to do so is “expanding from the input layer towards the middle, and shrinking from the middle towards the output layer” (Ertuna, section 7, para. 1).

Regarding claim 7, 
Commons/Zhan/Dong/Lint/Ertuna teaches the computing system of claim 6, (and thus the rejection of claim 6 is incorporated). 
Ertuna further teaches wherein the first module comprises: 
a first neuron layer comprising 20 neurons;
a second neuron layer comprising 100 neurons; and
a third neuron layer comprising 20 neurons. (Ertuna, pg. 8, section 7.2, Table 5, Three Diamond Layers, Hidden Structure, “20-100-20”) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the three-hidden-layer diamond structure “20-100-20”. The motivation to do so is “the best behavior in all testing data sets (1%, 10% and 20%) was demonstrated by the 5-20-100-20-1 neural network trained with resilient backpropagation method.” (Ertuna, pg. 9, para. 2).
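For illustration only, the “5-20-100-20-1” layer structure quoted from Ertuna can be written out as a small feed-forward network in Python. The weight initialization, the ReLU activation on hidden layers, and the names `LAYER_SIZES`, `init_network`, `forward`, and `count_parameters` are assumptions made for this sketch; the quoted passages do not specify them.

```python
import random

# Layer widths from the quoted "5-20-100-20-1" structure:
# 5 inputs, hidden layers of 20, 100 and 20 neurons, 1 output.
LAYER_SIZES = [5, 20, 100, 20, 1]


def init_network(sizes, seed=0):
    """Create one (weights, biases) pair per layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Forward pass: ReLU on hidden layers, linear output (both assumed)."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_last = index == len(layers) - 1
        activations = sums if is_last else [max(0.0, s) for s in sums]
    return activations


def count_parameters(layers):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

The hidden widths expand from 20 to 100 and shrink back to 20, which is the diamond shape described in the quoted passage.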

Regarding claim 14, 
Commons/Zhan/Dong/Lint teaches the method of claim 9, (and thus the rejection of claim 9 is incorporated). However, Commons/Zhan/Dong/Lint does not explicitly teach the first module is a neural network comprising three neuron layers; and the second module is another neural network comprising three neuron layers. Ertuna, however, teaches the first module is a neural network comprising three neuron layers (Ertuna, pg. 5, section 7, para. 1, “three hidden layers that are diamond shaped (expanding from the input layer towards the middle, and shrinking from the middle towards the output layer)”) and the second module is another neural network comprising three neuron layers. (Ertuna, pg. 5, section 7, para. 1, “three hidden layers that are diamond shaped (expanding from the input layer towards the middle, and shrinking from the middle towards the output layer)”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have three hidden layers. The motivation to do so is “expanding from the input layer towards the middle, and shrinking from the middle towards the output layer” (Ertuna, section 7, para. 1).

Regarding claim 15, 
Commons/Zhan/Dong/Lint/Ertuna teaches the method of claim 14, (and thus the rejection of claim 14 is incorporated).
Ertuna further teaches wherein the first module comprises: 
a first neuron layer comprising 20 neurons;
a second neuron layer comprising 100 neurons; and
a third neuron layer comprising 20 neurons. (Ertuna, pg. 8, section 7.2, Table 5, Three Diamond Layers, Hidden Structure, “20-100-20”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use three diamond-shaped hidden layers with the “20-100-20” hidden structure. The motivation to do so is “the best behavior in all testing data sets (1%, 10% and 20%) was demonstrated by the 5-20-100-20-1 neural network trained with resilient backpropagation method.” (Ertuna, pg. 9, para. 2).

Regarding claim 19, 
Commons/Zhan/Dong/Lint teaches the non-transitory computer-readable medium of claim 17, (and thus the rejection of claim 17 is incorporated). However, Commons/Zhan/Dong/Lint does not explicitly teach wherein: the first module is a neural network comprising three neuron layers; and the second module is another neural network comprising three neuron layers, but Ertuna teaches these limitations. (Ertuna, pg. 5, section 7, para. 1, “three hidden layers that are diamond shaped (expanding from the input layer towards the middle, and shrinking from the middle towards the output layer)”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use neural networks having three hidden layers. The motivation to do so is “expanding from the input layer towards the middle, and shrinking from the middle towards the output layer” (Ertuna, section 7, para. 1).

Regarding claim 22, 
Commons/Zhan/Dong/Lint teaches the method of claim 21, (and thus the rejection of claim 21 is incorporated). However, Commons/Zhan/Dong/Lint does not explicitly teach wherein: the first trained neural network module comprises three neuron layers; and the second trained neural network module comprises three neuron layers, but Ertuna teaches these limitations. (Ertuna, pg. 5, section 7, para. 1, “three hidden layers that are diamond shaped (expanding from the input layer towards the middle, and shrinking from the middle towards the output layer)”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use neural networks having three hidden layers. The motivation to do so is that it is known in the art to use three neuron layers in a neural network to yield predictable results.


Claims 8, 16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Commons (US 9015093 B1) in view of Zhan (“Urban link travel time estimation using large-scale taxi data with partial information”) in view of Dong (WO2018099480A1) in view of Lint (“Accurate freeway travel time prediction with state-space neural networks under missing data”) in view of Ertuna (“Stock Market Prediction Using Neural Network Time Series Forecasting”) in view of Nowicki (“Low-effort place recognition with WiFi fingerprints using deep learning”).

Regarding claim 8, 
Commons/Zhan/Dong/Lint/Ertuna teaches the computing system of claim 6, (and thus the rejection of claim 6 is incorporated). 
However, Commons/Zhan/Dong/Lint/Ertuna does not explicitly teach wherein the second module comprises:
a first neuron layer comprising 64 neurons;
a second neuron layer comprising 120 neurons; and
a third neuron layer comprising 20 neurons, but Nowicki teaches this limitation. (Nowicki, pg. 7, para. 1, “The influence of those parameters for floor and building classification problem with the network consisting of SAE (256-128-64) containing three hidden layers of 256, 128, and 64 neurons”)
Nowicki teaches a number of neurons that includes the particular number of neurons in the claim limitations, and it would have been obvious through routine experimentation (see MPEP §2144.05.II.A) to arrive at the specific values required by the claim language.

Regarding claim 16, 
Commons/Zhan/Dong/Lint/Ertuna teaches the method of claim 14, (and thus the rejection of claim 14 is incorporated). 
However, Commons/Zhan/Dong/Lint/Ertuna does not explicitly teach wherein the second module comprises:
a first neuron layer comprising 64 neurons;
a second neuron layer comprising 120 neurons; and
a third neuron layer comprising 20 neurons, but Nowicki teaches this limitation. (Nowicki, pg. 7, para. 1, “The influence of those parameters for floor and building classification problem with the network consisting of SAE (256-128-64) containing three hidden layers of 256, 128, and 64 neurons”)
Nowicki teaches a number of neurons that includes the particular number of neurons in the claim limitations, and it would have been obvious through routine experimentation (see MPEP §2144.05.II.A) to arrive at the specific values required by the claim language.

Regarding claim 20,
Commons/Zhan/Dong/Lint teaches the non-transitory computer-readable medium of claim 17, (and thus the rejection of claim 17 is incorporated). 
However, Commons/Zhan/Dong/Lint does not teach wherein:
the first module comprises:
a first neuron layer comprising 20 neurons;
a second neuron layer comprising 100 neurons; and
a third neuron layer comprising 20 neurons; but Ertuna teaches this limitation. (Ertuna, pg. 8, section 7.2, Table 5, Three Diamond Layers, Hidden Structure, “20-100-20”) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use three diamond-shaped hidden layers with the “20-100-20” hidden structure. The motivation to do so is “the best behavior in all testing data sets (1%, 10% and 20%) was demonstrated by the 5-20-100-20-1 neural network trained with resilient backpropagation method.” (Ertuna, pg. 9, para. 2).
However, Commons/Zhan/Dong/Lint/Ertuna does not explicitly teach wherein the second module comprises:
a first neuron layer comprising 64 neurons;
a second neuron layer comprising 120 neurons; and
a third neuron layer comprising 20 neurons, but Nowicki teaches this limitation. (Nowicki, pg. 7, para. 1, “The influence of those parameters for floor and building classification problem with the network consisting of SAE (256-128-64) containing three hidden layers of 256, 128, and 64 neurons”)
Nowicki teaches a number of neurons that includes the particular number of neurons in the claim limitations, and it would have been obvious through routine experimentation (see MPEP §2144.05.II.A) to arrive at the specific values required by the claim language.

Response to Arguments
Applicant’s arguments filed February 7, 2022 have been fully considered but they are not persuasive.

Regarding the rejection of claims 1, 4, 5, 9, 12, 13, 17, 21, and 22 under 35 U.S.C. §103:

Applicant’s arguments with respect to claims 1, 4, 5, 9, 12, 13, 17, 21, and 22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. 
Applicant’s arguments with respect to the rejections of the dependent claims have been fully considered but they are not persuasive as they rely upon the allowability of the independent claims. 

Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Deuk Lee whose telephone number is 571-272-8440.  The examiner can normally be reached on Monday-Friday 8:30am-5:30pm CDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DL/
Examiner, Art Unit 2122  

/KAKALI CHAKI/
Supervisory Patent Examiner, Art Unit 2122