DETAILED ACTION
This Action is responsive to the Amendments and Remarks filed on 11/16/2020. Claims 1-21 are pending claims.  Claims 1, 5, and 14 are written in independent form.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant’s amendments and remarks filed on 11/16/2020 have been fully considered and thus necessitated the new grounds of rejection presented herein.  Accordingly, THIS ACTION IS MADE FINAL.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-10, 13-15, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Cunningham et al. (U.S. Pre-Grant Publication No. 2014/0059552, hereinafter referred to as Cunningham) and further in view of Non-Patent Literature Bryan Liston, "Ad Hoc Big Data Processing Made Simple with Serverless MapReduce", November 4, 2016, Amazon Web Services (hereinafter referred to as Liston).

Regarding Claim 1:
Cunningham teaches a system for implementing a MapReduce programming model comprising:
a code execution system comprising a set of computing devices configured to:
obtain a request to analyze a set of data, wherein the request designates:
Cunningham teaches receiving data to be read and deserialized into a key/value sequence before being passed to Mapper 104 (Para. [0052] & Fig. 1).
a map task corresponding to code executable by the code execution system to process a portion of the set of data to result in an output; and
Cunningham teaches performing a map task for each input that results in an output from the mapper operating the map task (Para. [0036]).
a reduce task corresponding to code executable by the code execution system to process a plurality of outputs from individual executions of the map task to result in an aggregated output;
Cunningham teaches a reduce task for processing the mapper output as a result of “accumulating all of the output of that mapper” (Para. [0036] & [0039]).
initiate individual executions of the map task to process respective portions of the set of data and to publish outputs of the individual executions into a partitioned message stream, the partitioned message stream comprising a plurality of sub-streams, individual sub-streams containing messages 
Cunningham teaches “the map tasks deserialize the input data to generate a stream of key/value pairs that is passed into the mapper. The mapper outputs key/value pairs, which are immediately serialized and placed in a buffer” (Para. [0015]) and publishing outputs of mappers to a combiner which processes the outputs and sends them to a reducer (Para. [0005]).  Cunningham further teaches “if the involved data sets are large, they are automatically partitioned across multiple nodes and the operations are applied in parallel” (Para. [0004]) where “Partitioner code (P) which takes as input a key, a value, the number of partitions, (and may take other inputs) and produces a number, the partition number” (Para. [0040]) thereby teaching partitioned message sub-streams operating in parallel for combining and sending outputs to reducers using partition numbers.
a streaming data processing system comprising a set of computing devices configured to:
receive outputs of the individual executions of the map task as messages to be published into the partitioned message stream; and
Cunningham teaches receiving at a combiner outputs from map tasks (Para. [0036]) and “if the mapper then outputs keys that map to the same partition, it is guaranteed that they will be locally shuffled” (Para. [0058]).
divide the messages among a plurality of sub-streams within the partitioned message stream,
Cunningham teaches “each mapper has a tree map for each destination reducer, which stores the data output by that mapper for that reducer” and “when all local mappers have terminated, these trees are merged together (preserving their ordering) into one tree for each destination reducer” where “the result is written into the buffer for communication” (Para. [0039]).  Therefore, Cunningham teaches the data output by mappers being divided among buffers for communication to each destination reducer where ordering is preserved.
wherein dividing the messages comprises, for individual messages received to be published into the partitioned message stream,
identifying a sub-stream, of the plurality of sub-streams, according to the message’s respective values for the attribute; and
Cunningham teaches “each mapper has a tree map for each destination reducer, which stores the data output by that mapper for that reducer” and “when all local mappers have terminated, these trees are merged together (preserving their ordering) into one tree for each destination reducer” where “the result is written into the buffer for communication” (Para. [0039]). Therefore, Cunningham is teaching identifying a buffer for communication of the output of a mapper according to the output’s values for the attribute.
If the intent was to specify that the message’s respective values for the attribute are used to determine which sub-stream the individual message is enqueued onto, it is recommended that the limitation be clarified to reflect that scope.
enqueueing the individual message onto the sub-stream;
Cunningham teaches “each mapper has a tree map for each destination reducer, which stores the data output by that mapper for that reducer” and “when all local mappers have terminated, these trees are merged together (preserving their ordering) into one tree for each destination reducer” where “the result is written into the buffer for communication” (Para. [0039]). Therefore, Cunningham is teaching enqueueing the mapper output to a buffer for communication to the buffer’s respective reducer.
wherein the set of computing devices of the code execution system are further configured to:
for individual sub-streams of the plurality of sub-streams:
detect one or more messages within the individual sub-stream; and
Cunningham teaches “once map output has been pushed out to disk, reducer tasks start fetching their input data” (Para. [0016]) thereby teaching a detection of one or more messages within the buffer for communication (Para. [0039]).
responsive to detecting the one or more messages within the individual sub-stream, initiate an execution of the reduce task on the code execution system,
Cunningham teaches “once map output has been pushed out to disk, reducer tasks start fetching their input data” and “each reducer starts processing its input” (Para. [0016]) thereby teaching a detection of one or more messages within the buffer for communication (Para. [0039]).
the individual execution of the reduce task aggregating the outputs of the individual executions of the map task corresponding to the one or more messages of the individual sub-stream to result in an aggregate result; and
Cunningham teaches a reducer receiving and executing a reduce task at a reducer 106 corresponding to the results of specific map tasks (Figure 1) where “all jobs in the sequence of the same partition number is mapped to the same place” (Para. [0045]). Cunningham further teaches “each reducer outputs a…sequence of key/value pairs that is sent to the OutputFormat…for output” which “involves serializing the data and writing it out to disk” (Para. [0016]) thereby teaching the outputs from each message for a reduce task, associated with its own individual buffer, being aggregated into a sequence of key/value pairs to be sent to the OutputFormat.
output the aggregate results of the individual executions of the reduce task.
Cunningham teaches producing output data from reducer code, the output data capable of being stored in stable storage (Para. [0040]).

Cunningham teaches all of the elements of the claimed invention as recited above except:
a serverless code execution system;

However, in the related field of endeavor of data processing, Liston teaches:
a serverless code execution system;
Liston teaches using a serverless execution system “for ad hoc MapReduce workloads” (Page 2 - Section “Serverless MapReduce overview” - Paragraph 3).

Thus, it would have been obvious to one of ordinary skill in the art, having the teachings of Liston and Cunningham at the time that the claimed invention was effectively filed, to have combined the serverless architecture for MapReduce tasks, as taught by Liston, with the in-memory execution of Map Reduce job sequences, as taught by Cunningham.
One would have been motivated to make such combination because Liston teaches the serverless solution as being “much cheaper than existing solutions” where “given that the solution is serverless, customers pay only when the MapReduce job is executed” (Page 2 - Section “Serverless MapReduce overview” - Paragraph 3) and it would be obvious to a person having ordinary skill in the art that reducing price for customers would improve the customer satisfaction when executing a MapReduce job.
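
As a non-limiting illustration of the arrangement recited in Claim 1 as discussed above, the following minimal sketch shows map outputs being published as messages, divided among sub-streams according to an attribute value, and reduced once per non-empty sub-stream. It is not taken from the application, Cunningham, or Liston; the word-count tasks, the names map_task, select_sub_stream, publish, reduce_task, and run, and the sub-stream count of 3 are all assumptions chosen for illustration.

```python
from collections import defaultdict

NUM_SUB_STREAMS = 3  # hypothetical value; not taken from the claims or the references


def map_task(portion):
    """Hypothetical map task: emit one (word, 1) message per word in the portion."""
    return [(word, 1) for word in portion.split()]


def select_sub_stream(attribute, num_sub_streams=NUM_SUB_STREAMS):
    """Identify a sub-stream from the message's attribute value (here, the word)."""
    return sum(attribute.encode("utf-8")) % num_sub_streams


def publish(messages, sub_streams):
    """Stream side: enqueue each message onto the sub-stream chosen by its attribute."""
    for attribute, value in messages:
        sub_streams[select_sub_stream(attribute)].append((attribute, value))


def reduce_task(messages):
    """Hypothetical reduce task: sum the values of the messages per attribute."""
    totals = defaultdict(int)
    for attribute, value in messages:
        totals[attribute] += value
    return dict(totals)


def run(portions):
    """Run map tasks, divide their outputs, then reduce each non-empty sub-stream."""
    sub_streams = [[] for _ in range(NUM_SUB_STREAMS)]
    for portion in portions:
        publish(map_task(portion), sub_streams)
    results = {}
    for messages in sub_streams:
        if messages:  # a reduce execution starts only once messages are detected
            results.update(reduce_task(messages))
    return results
```

In this sketch the map task has no knowledge of which reduce execution will consume its output; the sub-stream is selected on the stream side from the attribute value alone.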


Regarding Claim 2:
Cunningham and Liston further teach:
wherein the set of computing devices of the serverless code execution system comprises a plurality of poller devices, individual poller devices from the plurality of poller devices corresponding to an individual sub-streams of the plurality of sub-streams, and
Cunningham teaches Hadoop having an inherent “task polling model” (Para. [0114]) thereby teaching polling devices being utilized within the task polling model.
wherein the set of computing devices of the serverless code execution system are configured to initiate executions of the reduce task for individual sub-streams of the plurality of sub-streams at least partly by:
at an individual poller device, retrieve messages from the sub-stream corresponding to the individual poller device; and
Cunningham teaches retrieving, by the reducer, map outputs (Para. [0003]) where “when all local mappers have terminated, these trees are merged together (preserving their ordering) into one tree for each destination reducer” and “the result is written into the buffer for communication” (Para. [0039]). 
Cunningham further teaches a reducer receiving and executing a reduce task at a reducer 106 corresponding to the results of specific map tasks (Figure 1) where “all jobs in the sequence of the same partition number is mapped to the same place” (Para. [0045]).
submit calls to the on-demand code execution system to initiate an execution of the reduce task to process the retrieved messages.
Cunningham teaches initiating the execution of the reduce task at reducer 106 to process the retrieved messages from the mapper 104 to produce output 110 (Figure 1 & Para. [0053]).

Regarding Claim 3:
Cunningham and Liston further teach:
wherein the individual poller device is configured to:
obtain, in response to an individual call, state information reflecting a state of the execution of the reduce task after processing a message submitted with the individual call; and
Cunningham teaches “a job controller tracks the state of execution of the job across multiple nodes” (Para. [0009]).
pass in a subsequent call the state information and an additional message from the retrieved messages.
Cunningham teaches “sharing heap state between jobs to improve metrics associated with the job such as performance metrics, e.g., minimize execution time” (Para. [0041]).
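
As a non-limiting illustration of the limitation of Claim 3 discussed above, the following minimal sketch shows a poller that obtains state information from each call and passes it, together with the next message, in the subsequent call. The names reduce_call and poll_sub_stream and the summing behavior are hypothetical and are not drawn from the application or from Cunningham.

```python
def reduce_call(state, message):
    """Hypothetical reduce invocation: takes the prior state and one message,
    and returns the updated state without retaining anything between calls."""
    new_state = dict(state or {})
    key, value = message
    new_state[key] = new_state.get(key, 0) + value
    return new_state


def poll_sub_stream(messages):
    """Hypothetical poller: submit one call per retrieved message, carrying the
    state returned by the previous call into the next call."""
    state = None
    for message in messages:
        state = reduce_call(state, message)
    return state or {}
```

Because the state travels with each call in this sketch, the reduce invocation itself can remain stateless between calls.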

Regarding Claim 4:
Cunningham and Liston further teach:
wherein the code of the map task is further executable by the serverless code execution system to select the portion of the set of data processed during an execution of the map task.
Cunningham teaches reading in data and deserializing it into a key/value sequence to be passed to mapper 104 (Para. [0052]) where “a mapper takes a small chunk of data (typically in the form of pairs of (key, value)), and produces zero or more additional key value pairs” and “multiple mappers are executed in parallel on all the available data” (Para. [0003]).

Regarding Claim 5:
All of the limitations herein are similar to some or all of the limitations of Claim 1.

Regarding Claim 6:
All of the limitations herein are similar to some or all of the limitations of Claim 1.

Regarding Claim 7:
Cunningham and Liston further teach:
initiating a plurality of invocations of the map task to process the data set at least partially in parallel, wherein the plurality of invocations results in a plurality of executions of the map task, which executions represent the applications of the map function to the data set.
Cunningham further teaches “if the involved data sets are large, they are automatically partitioned across multiple nodes and the operations are applied in parallel” (Para. [0004]) where “Partitioner code (P) which takes as input a key, a value, the number of partitions, (and may take other inputs) and produces a number, the partition number” (Para. [0040]) thereby teaching partitioned message sub-streams operating in parallel for combining and sending outputs to reducers using partition numbers.

Regarding Claim 8:
Cunningham and Liston further teach:
wherein the data set includes a set of messages published to a second data stream, and wherein initiating the plurality of invocations of the map task includes passing individual messages from the second data stream to individual invocations of the map task as payload data.
Cunningham teaches “the map tasks deserialize the input data to generate a stream of key/value pairs that is passed into the mapper. The mapper outputs key/value pairs, which are immediately serialized and placed in a buffer” (Para. [0015]) and publishing outputs of mappers to a combiner which processes the outputs and sends them to a reducer (Para. [0005]).  Cunningham further teaches “if the involved data sets are large, they are automatically partitioned across multiple nodes and the operations are applied in parallel” (Para. [0004]) where “Partitioner code (P) which takes as input a key, a value, the number of partitions, (and may take other inputs) and produces a number, the partition number” (Para. [0040]) thereby teaching partitioned message sub-streams operating in parallel for combining and sending outputs to reducers using partition numbers.

Regarding Claim 9:
Cunningham and Liston further teach:
wherein the individual executions of the map task are implemented within distinct execution environments on the serverless code execution system, and
Cunningham teaches “the processing system shown may be operational with numerous other general purpose or special purpose computing system environments or configurations” (Para. [0130]).
wherein an individual execution of the map task causes the serverless code execution system to:
select a portion of the data set to be processed by the individual execution of the map task;
Cunningham teaches reading in data and deserializing it into a key/value sequence to be passed to mapper 104 (Para. [0052]) where “a mapper takes a small chunk of data (typically in the form of pairs of (key, value)), and produces zero or more additional key value pairs” and “multiple mappers are executed in parallel on all the available data” (Para. [0003]).
apply the map function to the portion of the data set; and
Cunningham teaches reading in data and deserializing it into a key/value sequence to be passed to mapper 104 (Para. [0052]) where “a mapper takes a small chunk of data (typically in the form of pairs of (key, value)), and produces zero or more additional key value pairs” and “multiple mappers are executed in parallel on all the available data” (Para. [0003]).
publish outputs of the map function to the data stream.
Cunningham teaches publishing outputs of mappers to a combiner which processes the outputs and sends them to a reducer (Para. [0005]).  

Regarding Claim 10:
Cunningham and Liston further teach:
wherein subsequent to applying the map function to the portion of the data set, the individual execution of the map task causes the serverless code execution system to at least one of:
i) invoke an additional execution of the map task; or
ii) select an additional portion of the data set to be processed by the individual execution of the map task.
Cunningham teaches “a combiner may be run for a given mapper after accumulating all the output of that mapper” (Para. [0036]) thereby teaching a mapper having more than one output from multiple portions of data being processed by the mapper.

Regarding Claim 13:
Cunningham and Liston further teach:
 wherein initiating executions of the reduce function on the serverless code execution system for an individual sub-stream of the plurality of sub-streams comprises invoking an instance of a reduce task on the serverless code execution system, the instance of the reduce task maintaining state information regarding application of the reduce function to outputs within the individual sub-stream.
Cunningham teaches “each instance of M3R runs on a fixed number (possibly one) of multi-threaded JVMs” and “an M3R instance runs all jobs in the HMR job sequence submitted to it, potentially running multiple mappers and reduces in the same JVM (for the same job)” (Para. [0070]).

Regarding Claim 14:
Some of the limitations herein are similar to some or all of the limitations of Claim 1.

Cunningham and Liston further teach:
a non-transitory computer-readable media comprising instructions executable by a computing system (Paras. [0022] & [0139]).

Regarding Claim 15:
All of the limitations herein are similar to some or all of the limitations of Claim 2.

Regarding Claim 18:
Cunningham and Liston further teach:
wherein calling for execution of the reduce function to process the individual output comprises calling for execution of an individual instance of a reduce task on the serverless code execution system, the individual instance of the reduce task being executed by the serverless code execution system in an execution environment distinct from a second execution environment used to execute a second instance of the reduce task.
Cunningham teaches “each instance of M3R runs on a fixed number (possibly one) of multi-threaded JVMs” and “an M3R instance runs all jobs in the HMR job sequence submitted to it, potentially running multiple mappers and reduces in the same JVM (for the same job)” (Para. [0070]).

Regarding Claim 19:
Cunningham further teaches:
wherein the instructions are executable by the computing system to output the aggregate results at least partly by writing the aggregate result of each execution of the reduce function to a common data storage location.
Cunningham teaches serializing and writing output files to disk (Para. [0053]).


Claims 11, 12, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Cunningham and Liston, and further in view of Rash et al. (U.S. Pre-Grant Publication No. 2014/0214752, hereinafter referred to as Rash).

Regarding Claim 11:
Cunningham and Liston teach all of the elements of the claimed invention as recited above except:
wherein a number of sub-streams within the plurality of sub-streams is determined by a stream data processing system based at least partly on a volume of data within the data stream.

However, in the related field of endeavor of data processing, Rash teaches:
wherein a number of sub-streams within the plurality of sub-streams is determined by a stream data processing system based at least partly on a volume of data within the data stream.
Rash teaches “the total number of back end servers is determined by the estimated total data volume and data bandwidth that each back end server can handle.  The number of buckets is determined as a number larger than the number of back end servers so that the system can scale up to include more back end servers” where “this invention relates…to a data capturing and processing system capable of splitting the data into multiple data streams” (Para. [0001]) using “the category field modulo to a total number of buckets” and “the total number of buckets is a total number of the plurality of log data streams” to which the data is being assigned (Para. [0054]).

Thus, it would have been obvious to one of ordinary skill in the art, having the teachings of Rash, Liston, and Cunningham at the time that the claimed invention was effectively filed, to have combined the data stream splitting for low-latency data access, as taught by Rash, with the serverless architecture for MapReduce tasks, as taught by Liston, and the in-memory execution of Map Reduce job sequences, as taught by Cunningham.
One would have been motivated to make such combination because Rash teaches lowering the latency of processing large amounts of data by splitting the data being processed “into a plurality of data streams so that the data streams are sent to a receiving application in parallel” (Para. [0010]).

Regarding Claim 12:
Cunningham, Liston, and Rash further teach:
wherein division of the set of outputs among the plurality of sub-streams comprises applying a hashing operation to each individual output’s respective value for the attribute and according to the number of sub-streams.
Cunningham teaches applying a “hash function to map keys to partitions” (Para. [0058]).
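
As a non-limiting illustration of applying a hashing operation to an attribute value according to the number of sub-streams, as discussed above, the following minimal sketch may be considered. The function name partition_for and the choice of SHA-256 are assumptions made for illustration; neither the claims nor Cunningham specifies a particular hash function.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number in the range [0, num_partitions).

    SHA-256 is an arbitrary, hypothetical choice of stable hash; any
    deterministic hash function would serve the same illustrative purpose.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Because the hash is deterministic, every output carrying the same key is assigned the same partition number for a given number of partitions.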

Regarding Claim 20:
All of the limitations herein are similar to some or all of the limitations of Claim 11.

Regarding Claim 21:
Cunningham, Liston, and Rash further teach:
wherein the stream data processing system is configured to divide the outputs among the plurality of sub-streams at least partly by applying a modulus division operation to each individual output’s respective value for the attribute and according to the number of sub-streams.
Rash teaches “this invention relates…to a data capturing and processing system capable of splitting the data into multiple data streams” (Para. [0001]) using “the category field modulo to a total number of buckets” where “the total number of buckets is a total number of the plurality of log data streams” to which the data is being assigned (Para. [0054]).
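
As a non-limiting illustration of the teachings of Rash relied upon above, the following minimal sketch sizes the number of servers from an estimated data volume, chooses a larger number of buckets, and assigns data to a bucket by a modulo operation. The function names, the ceiling-based sizing, and the headroom factor of 4 are hypothetical and are not taken from Rash.

```python
import math


def num_servers(total_volume: float, per_server_capacity: float) -> int:
    """Estimate how many back end servers the data volume requires (assumed formula)."""
    return max(1, math.ceil(total_volume / per_server_capacity))


def num_buckets(servers: int, headroom: int = 4) -> int:
    """Choose more buckets than servers so that servers can be added later.
    The headroom factor is a hypothetical parameter."""
    return servers * headroom


def bucket_for(category_field: int, total_buckets: int) -> int:
    """Assign data to a bucket as the category field modulo the total number of buckets."""
    return category_field % total_buckets
```

For example, under these assumptions a volume of 1000 units with 300 units of capacity per server yields 4 servers and 16 buckets, and a category field of 37 is assigned to bucket 5.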

Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Cunningham and Liston, and further in view of Ko et al. (U.S. Pre-Grant Publication No. 2016/0092493, hereinafter referred to as Ko).

Regarding Claim 16:
Cunningham and Liston teach all of the elements of the claimed invention as recited above except:
wherein calling for execution of the reduce function to process the individual output comprises making a synchronous hypertext transport protocol (HTTP) call to the serverless code execution system, the synchronous HTTP call comprising the individual output.

However, in the related field of endeavor of executing map-reduce jobs, Ko teaches:
wherein calling for execution of the reduce function to process the individual output comprises making a synchronous hypertext transport protocol (HTTP) call to the serverless code execution system, the synchronous HTTP call comprising the individual output.
Ko teaches implementing “a Pull Execution model” where “an HTTP communication model, or any other communication model that provides equivalent communication functionalities, is also implemented by the MapReduce system where HTTP is utilized to name and retrieve all output data produced in any of the computation states (e.g., splitter data, mapper data, and reducer data)” (Para. [0031]).

Thus, it would have been obvious to one of ordinary skill in the art, having the teachings of Ko, Liston, and Cunningham at the time that the claimed invention was effectively filed, to have combined the executing of map-reduce jobs with named data, as taught by Ko, with the serverless architecture for MapReduce tasks, as taught by Liston, and the in-memory execution of Map Reduce job sequences, as taught by Cunningham.
One would have been motivated to make such combination because Ko teaches “the HTTP communication model in combination…reduce the I/O and network load of the MapReduce system and enable MapReduce deployments outside of data centers” (Para. [0031]).

Regarding Claim 17:
Cunningham, Liston, and Ko further teach:
wherein the synchronous HTTP call further comprises state information for a prior execution of the reduce function generated based on a prior output of the subset of outputs.
Cunningham teaches communications comprising state information for prior execution of the reduce functions generated based on prior outputs by teaching “a job controller tracks the state of execution of the job across multiple nodes” (Para. [0009]) and “sharing heap state between jobs to improve metrics associated with the job such as performance metrics, e.g., minimize execution time” (Para. [0041]).  Ko further teaches communicating using “an HTTP communication model, or any other communication model that provides equivalent communication functionalities, is also implemented by the MapReduce system where HTTP is utilized to name and retrieve all output data produced in any of the computation states (e.g., splitter data, mapper data, and reducer data)” (Para. [0031]).

Response to Amendment
Applicant’s Amendments, filed on 11/16/2020, are acknowledged and accepted.
As stated above and restated here for convenience, Applicant’s amendments and remarks filed on 11/16/2020 have been fully considered and thus necessitated the new grounds of rejection presented herein.  Accordingly, THIS ACTION IS MADE FINAL.

Response to Arguments
In the remarks filed on 11/16/2020, Applicant argues that “in contrast to the present claims, the system taught in Cunningham requires each ‘mapper’… to have knowledge of a corresponding ‘reducer’… to which result will be output” whereas “the present claims recite a different approach, in which each map task publishes its respective outputs to a particular message stream, and in which a streaming data processing system hosting the stream the divides the outputs among sub-streams”.  Applicant’s argument is not convincing because the claim language does not necessarily restrict the scope to where a mapper cannot have knowledge of a corresponding reducer to which the mapper output will be directed.  As recommended in the rejection above, if the intent was to specify that the message’s respective values for the attribute are used to determine which sub-stream the individual message is enqueued onto, it is recommended that the claims be clarified to reflect that scope.
In the remarks filed on 11/16/2020, Applicant argues that Cunningham does not teach a serverless code execution system.  Applicant’s argument is convincing and thus necessitated the new grounds of rejection presented herein where the Liston reference teaches a serverless architecture to run MapReduce jobs.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Rus et al. (U.S. Patent No. 9,613,127) teaches executing a MapReduce job on streamed data that includes an arbitrary amount of imbalance with respect to the frequency distribution of the data keys in the data set, where the map task module maps the data set to a coarse partitioning and generates a list of the top K keys with the highest frequency among the data set, and the sort task module employs a plurality of sorters to read the coarse partitioning and sort the data into buckets by data key.
Kardes et al. (U.S. Pre-Grant Publication No. 2015/0269494) teaches implementing a MapReduce framework where a MapReduce job is divided into a number of map tasks and reduce tasks, and where the task trackers periodically send heartbeat messages to the job tracker that also doubles as a vehicle for task allocation.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT F MAY whose telephone number is (571)272-3195.  The examiner can normally be reached on Monday-Friday 9:30am to 6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam can be reached on 571-272-3978.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/R. F. M./
Examiner, Art Unit 2154
2/27/2021

/HOSAIN T ALAM/
Supervisory Patent Examiner, Art Unit 2154