Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
1.	The applicant filed information disclosure statements (IDSs) on 12/14/2020 and 04/25/2022. They have been annotated and considered.

Claim Interpretation
2.	The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

3.	The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
4.	This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“processing unit configured to” in claims 7, 10, 12, and 13.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
A review of the specification (citation to US pub. No. 20220080584) shows the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitations:
“processing unit” corresponds to processing unit 1104 and can be a processor [0127]. 

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
5.	The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


6.	Claims 7, 8, and 9 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.  Specification (citation to US pub. No. 20220080584) is utilized for the description citations below. 

Regarding claim 7, the phrases “a platform comprising a conveyor configured to operate in a first operating mode” and “a robotic arm comprising an end effector configured to operate in a second operating mode” are indefinite because the applicant’s intended scope is unclear and conflicts with the specification and claims 4 and 17. The specification, along with claims 4 and 17, indicates that “In the first operating mode, the end effector of the robotic arm 115 of the robotic item handler 101 can be actuated to pick an item by grasping a portion of the item. Further, in the second operating mode, a section (e.g. a front/leading end) of the platform 118 of the robotic item handler 101 can be actuated and moved so as to sweep below a portion of a pile of items (e.g. the wall of items 125), thereby, pulling one or more of items of the pile of items on the platform 118 and further guiding the one or more items onto the conveyor 180.” See at least [0063]. The examiner is giving the claim language its plain meaning and treating the first/second operating mode designation as a typographical error where it conflicts with the specification.

Regarding claim 8, the phrase “the first operating mode, a section of the platform is configured to sweep a pile of items, thereby, guiding one or more items of the pile of items onto the conveyor” is indefinite because the applicant’s intended scope is unclear and conflicts with the specification and claims 4 and 17. The specification, along with claims 4 and 17, indicates that “In the first operating mode, the end effector of the robotic arm 115 of the robotic item handler 101 can be actuated to pick an item by grasping a portion of the item. Further, in the second operating mode, a section (e.g. a front/leading end) of the platform 118 of the robotic item handler 101 can be actuated and moved so as to sweep below a portion of a pile of items (e.g. the wall of items 125), thereby, pulling one or more of items of the pile of items on the platform 118 and further guiding the one or more items onto the conveyor 180.” See at least [0063]. The examiner is giving the claim language its plain meaning and treating the first/second operating mode designation as a typographical error where it conflicts with the specification.

Regarding claim 9, the phrase “the second operating mode, a section of the robotic arm is configured to pick an item by grasping a portion of the item using the end effector of the robotic arm” is indefinite because the applicant’s intended scope is unclear and conflicts with the specification and claims 4 and 17. The specification, along with claims 4 and 17, indicates that “In the first operating mode, the end effector of the robotic arm 115 of the robotic item handler 101 can be actuated to pick an item by grasping a portion of the item. Further, in the second operating mode, a section (e.g. a front/leading end) of the platform 118 of the robotic item handler 101 can be actuated and moved so as to sweep below a portion of a pile of items (e.g. the wall of items 125), thereby, pulling one or more of items of the pile of items on the platform 118 and further guiding the one or more items onto the conveyor 180.” See at least [0063]. The examiner is giving the claim language its plain meaning and treating the first/second operating mode designation as a typographical error where it conflicts with the specification.

	In the art rejection below, the claims have been treated as best understood by the examiner. Any claim not explicitly rejected under this heading is rejected as being dependent on an indefinite claim.

Claim Rejections - 35 USC § 103
7.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

8.	Claims 1-18 are rejected under 35 U.S.C. 103 as being unpatentable over Islam et al., “Planning, Learning and Reasoning Framework for Robot Truck Unloading” (hereinafter Islam) in view of Wicks et al. (US 20150352721, hereinafter Wicks) and Mousavian et al. (US 20200361083, hereinafter Mousavian).

Regarding claim 1, Islam teaches a method for controlling a robotic item handler (See at least: [Abstract] “autonomously unloading boxes from trucks using an industrial manipulator robot”) comprising:
obtaining first point cloud data related to a first three-dimensional image captured by a first sensor device of the robotic item handler (see at least: [Page 5012] “The robot is equipped with a wide suite of sensors, such as RGB and depth sensors, that allow it to estimate occupancy of the trailer's walls, and estimate box poses which are used to construct W.”; [Page 5015] “We extract the pile and the walls by fitting planes to the depth values from the point cloud obtained using our sensors.”). As discussed, Islam teaches utilizing point cloud data from a plurality of sensors;
outputting a decision classification associated with a first operating mode and a second operating mode (see at least: [Fig. 2 and page 5012] “Example of a decision tree, or strategy, that specifies the best actions (red arrows) along the possible action/observation histories from the initial belief bo.”; [Fig. 4 and page 5014] “The Strategy Chooser learns a mapping M offline which is used in the online phase to decide which strategy to use given current world state.”); and
operating the robotic item handler according to the first operating mode and operating the robotic item handler according to the second operating mode (see at least: [Fig. 5 page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”).
Islam fails to explicitly teach obtaining second point cloud data related to a second three-dimensional image captured by a second sensor device of the robotic item handler; and transforming the first point cloud data and the second point cloud data to generate combined point cloud data.
However, Wicks teaches a method, device, and system for a robotic carton unloader comprising obtaining second point cloud data related to a second three-dimensional image captured by a second sensor device of the robotic item handler; and transforming the first point cloud data and the second point cloud data to generate combined point cloud data (see at least: [0032] “The computing device may also perform operations to stitch the various image data from the sensor devices in order to generate a combined, singular data set. For example, the computing device may combine an image of a top portion of a wall of boxes, a bottom right portion of the wall, and a bottom left portion of the wall to create a single image of the entire wall.”; [0045] “In some embodiments, the sensor devices 102, 104, 106 may be capable of tracking real-time movement as well as scanning objects three-dimensionally (i.e., 3D scanning).”; [0072] “Such 3D point cloud information may be obtained from a combination of the output from the sensor devices 102, 104, 106 (e.g., depth data in combination with other imagery), or alternatively may be acquired from 3D data (e.g., LIDAR data) collected by the various sensors associated with the robotic carton unloader. In various embodiments, there may be an RGB point associated with each 3D point.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Wicks and provide a method of obtaining second point cloud data related to a second three-dimensional image captured by a second sensor device of the robotic item handler; and transforming the first point cloud data and the second point cloud data to generate combined point cloud data in order to provide a means of creating an image of the entire wall.  
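For illustration only, the claimed step of transforming two point clouds into combined point cloud data can be sketched as follows. This is a minimal sketch and is not taken from Islam, Wicks, or the application: it assumes each sensor has a known 4x4 homogeneous transform into a shared robot frame, and the names transform_points, combine_point_clouds, IDENTITY, and SHIFT_X are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix4 = Sequence[Sequence[float]]  # 4x4 homogeneous transform, row-major


def transform_points(points: Sequence[Point], sensor_to_robot: Matrix4) -> List[Point]:
    """Map points from one sensor's frame into the shared robot frame."""
    out: List[Point] = []
    for x, y, z in points:
        v = (x, y, z, 1.0)  # homogeneous coordinates
        out.append(tuple(
            sum(sensor_to_robot[r][c] * v[c] for c in range(4))
            for r in range(3)
        ))
    return out


def combine_point_clouds(cloud_a: Sequence[Point], tf_a: Matrix4,
                         cloud_b: Sequence[Point], tf_b: Matrix4) -> List[Point]:
    """Express both clouds in the common frame, then concatenate them."""
    return transform_points(cloud_a, tf_a) + transform_points(cloud_b, tf_b)


# Hypothetical calibrations: sensor A is at the robot origin,
# sensor B is offset 0.5 m along the x axis.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
SHIFT_X = [[1, 0, 0, 0.5], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

combined = combine_point_clouds([(1.0, 2.0, 3.0)], IDENTITY,
                                [(0.0, 0.0, 0.0)], SHIFT_X)
```

The sketch treats "combining" as plain concatenation after a rigid transform; a real system might also filter overlapping points or register the clouds, which neither reference is cited here as requiring.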

Islam fails to explicitly teach constructing a machine learning model based on the combined point cloud data as an input to a convolution neural network; outputting, via the machine learning model, a decision classification indicative of a first probability associated with a first operating mode and a second probability associated with a second operating mode; and operating the robotic item handler according to the first operating mode in response to the first probability being higher than the second probability and operating the robotic item handler according to the second operating mode in response to the second probability being higher than the first probability.  
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object, comprising constructing a machine learning model based on the combined point cloud data as an input to a convolution neural network (see at least: [0079] “Various embodiments apply deep learning to 3D point cloud data. In some examples, 3D data are represented as 3D voxels or as extracted features from 2.5 depth images and processed using convolutional neural networks.”; [0538] “In at least one embodiment, training system 4504 may be used to perform training, deployment, and implementation of machine learning models (e.g., neural networks, object detection algorithms, computer vision algorithms, etc.) for use in deployment system 4506.”); 
outputting, via the machine learning model, a decision classification indicative of a first probability associated with a first operating mode and a second probability associated with a second operating mode (see at least: [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”); and
operating the robotic item handler according to the first operating mode in response to the first probability being higher than the second probability and operating the robotic item handler according to the second operating mode in response to the second probability being higher than the first probability (see at least: [0087] “In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”; [0132]  “In at least one embodiment, the measure is a probability that the grasp pose will succeed, where success is measured by the chance that the grasp will allow performance of a defined object-manipulation task. In at least one embodiment, the measure is an proportional indication of grasp security. In at least one embodiment, the evaluation network provides a differentiable measure of the quality of the grasp.”; [0133] “In at least one embodiment, if the grasp is sufficiently refined, executed advances to block 1514 and the refined grasp is used to direct a robot to grasp the object.”).  
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method of constructing a machine learning model based on the combined point cloud data as an input to a convolution neural network; outputting, via the machine learning model, a decision classification indicative of a first probability associated with a first operating mode and a second probability associated with a second operating mode; and operating the robotic item handler according to the first operating mode in response to the first probability being higher than the second probability and operating the robotic item handler according to the second operating mode in response to the second probability being higher than the first probability in order to provide a means to decide which operation mode is effective to proceed with. 
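For illustration only, the claimed selection between operating modes based on two probabilities can be sketched as follows. This is a minimal sketch, not the method of any cited reference: it assumes the machine learning model emits two raw scores (logits) that are normalized with a softmax, and the names softmax and select_operating_mode are hypothetical. The claim language does not say what happens when the two probabilities are equal; the sketch arbitrarily falls back to the second mode in that case.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_operating_mode(logits: Sequence[float]) -> Tuple[str, float, float]:
    """Return the mode with the higher probability, plus both probabilities.

    Assumption: logits[0] scores the first operating mode and logits[1]
    scores the second. Ties go to the second mode (an arbitrary choice).
    """
    p_first, p_second = softmax(logits)
    mode = "first" if p_first > p_second else "second"
    return mode, p_first, p_second


mode, p_first, p_second = select_operating_mode([2.0, 0.5])
```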

Regarding claim 2, Islam teaches the limitations of claim 1, further wherein the sensor devices are coupled to the robotic item handler (see at least: [Page 5012] “The robot is equipped with a wide suite of sensors, such as RGB and depth sensors, that allow it to estimate occupancy of the trailer's walls, and estimate box poses which are used to construct W.”). 
Islam fails to explicitly teach the first sensor device is coupled to a robotic arm of the robotic item handler and the second sensor device is coupled to a platform of the robotic item handler.  
However, Wicks teaches a method, device, and system for a robotic carton unloader in which the first sensor device is coupled to a robotic arm of the robotic item handler and the second sensor device is coupled to a platform of the robotic item handler (see at least: [Fig. 2 and 0052] “The robotic carton unloader 201 may include various mounting options for the various sensors 202-212. In particular, the robotic carton unloader 201 may include visual sensors 202, 206, 210 placed around the robotic carton unloader frame and robotic arm 115”). As discussed, Wicks teaches a first sensor 210 located at the robotic arm and second sensors 202 and 203 around the platform. 
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Wicks and provide the first sensor device coupled to a robotic arm of the robotic item handler and the second sensor device coupled to a platform of the robotic item handler in order to provide another means of visualizing the environment and target object. 

Regarding claim 3, Islam teaches the limitations of claim 1, further wherein the robotic item handler is a robotic carton unloader configured for performing at least one of: loading and unloading of items (See at least: [Abstract] “autonomously unloading boxes from trucks using an industrial manipulator robot”).

Regarding claim 4, Islam teaches the limitations of claim 1, further wherein, the first operating mode is associated with picking an item by grasping the item using an end effector of a robotic arm of the robotic item handler; and the second operating mode is associated with sweeping a pile of items from an item docking station with a platform of the robotic item handler (see at least: [page 5011] “a custom built truck-unloading robot (Fig. 1a) equipped with a mobile omnidirectional base referred to as base and two articulated mechanisms-a manipulator-like tool with suction grippers and a scooper-like tool with conveyor belts, referred to as arm and nose, respectively.”; [Fig. 5 and page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”). As discussed, Islam teaches the first operating mode as the pick action using arm end effector and the second operating mode as the sweep action using the nose (platform). 

Regarding claim 5, Islam teaches the limitations of claim 1, further comprising: evaluating a selection of one of: the first operating mode and the second operating mode of the robotic item handler based on the decision classification outputted by a pre-defined heuristic associated with past operations of the robotic item handler (see at least: [page 5013] “The goal for this belief space planning problem is to unload all boxes in the truck. Heuristic functions are used to guide the search to the goal, so that the algorithm does not need to exhaustively evaluate all the possible action sequences.”; [page 5015] “We evaluated the Strategy Executor module in simulation and in the real world. Table II shows the simulation results for each action, Pick and Sweep, the number of executions, planning and execution times and the box unloading rate.”); and
controlling the robotic item handler based on the evaluation of the selection (see at least: [page 5015] “To execute the Pick and Sweep actions the executor makes multiple calls to these modes as detailed in Fig. 5…A total of 128 boxes were unloaded in the entire run which results in an unloading rate of 0.2 boxes/sec”).
Islam fails to explicitly teach evaluating a selection based on the machine learning model and a pre-defined heuristic.
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object, comprising evaluating a selection based on the machine learning model and a pre-defined heuristic (see at least: [0069] “In some examples, this problem is approached by geometry-inspired heuristics to select promising grasp points around an object, optionally followed by a more in-depth geometric analysis of the stability and reachability of a sampled grasp.”; [0070] “In an embodiment, deep learning techniques are used to evaluate the quality of grasps from raw point cloud data. While, in some examples, this approach provides good grasp assessments, it still uses manually designed heuristics to sample grasps for evaluation or relies on blackbox optimization techniques such as the Cross Entropy Method (“CEM”).”; [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method of evaluating a selection based on the machine learning model and a pre-defined heuristic associated with past operations of the robotic item handler in order to provide a means to decide which operation mode is effective to proceed with. 

Regarding claim 6, Islam teaches the limitations of claim 5, further comprising the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output over a period of time (see at least: [page 5013] “To this end, we chose ARA* [16] as our planner. ARA* is an anytime heuristic search-based planner which tunes solution optimality based on available search time. Specifically, it computes an initial plan quickly and refines its quality as time permits.”; [page 5015] “We also report the overall planning times for the Pick and Sweep actions (Table II) which come from accumulated planning times for all the subactions.”).  
Islam fails to explicitly teach adjusting a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time.  
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object comprising adjusting a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time (see at least: [0103] “In at least one embodiment, after closing its fingers the gripper executes a predefined shaking motion. In at least one embodiment, a grasp is labeled successful if the object is kept between both fingers.”; [0147] “In at least one embodiment, training framework 1704 adjusts weights that control untrained neural network 1706. In at least one embodiment, training framework 1704 includes tools to monitor how well untrained neural network 1706 is converging towards a model, such as trained neural network 1708, suitable to generating correct answers, such as in result 1714, based on input data such as a new dataset 1712. In at least one embodiment, training framework 1704 trains untrained neural network 1706 repeatedly while adjust weights to refine an output of untrained neural network 1706 using a loss function and adjustment algorithm, such as stochastic gradient descent. In at least one embodiment, training framework 1704 trains untrained neural network 1706 until untrained neural network 1706 achieves a desired accuracy. In at least one embodiment, trained neural network 1708 can then be deployed to implement any number of machine learning operations.”; [0158] “For example, in at least one embodiment, a machine learning model may be trained by calculating weight parameters according to a neural network architecture using software and computing resources described above with respect to data center 1800. In at least one embodiment, trained machine learning models corresponding to one or more neural networks may be used to infer or predict information using resources described above with respect to data center 1800 by using weight parameters calculated through one or more training techniques described herein.”; [0206] “In at least one embodiment, confidence may be represented or interpreted as a probability, or as providing a relative “weight” of each detection compared to other detections. In at least one embodiment, a confidence measure enables a system to make further decisions regarding which detections should be considered as true positive detections rather than false positive detections.”).  As discussed, Mousavian teaches different pre-defined heuristics (rules) that the robot needs to meet for a grasp to be treated as successful, such as keeping an object between both fingers of the robot. Mousavian also teaches a machine learning model and applying/adjusting weights to control and refine the output from the machine learning model to attain an accurate grasp. 
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method of adjusting a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time in order to provide a means to determine the importance of each selection. 
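For illustration only, the claim 6 limitation of adjusting a first weight (decision classification) and a second weight (pre-defined heuristic) based on model performance over time can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Islam, Mousavian, or the application: it assumes both the model output and the heuristic are scores in [0, 1], that the two weights sum to 1, and that performance is summarized as a recent accuracy figure. The names blended_score and adjust_weights, and the step and target values, are hypothetical.

```python
from typing import Tuple


def blended_score(p_model: float, h_score: float,
                  w_model: float, w_heuristic: float) -> float:
    """Weighted combination of the model's probability and the heuristic score."""
    return w_model * p_model + w_heuristic * h_score


def adjust_weights(w_model: float, model_accuracy: float,
                   step: float = 0.1, target: float = 0.8) -> Tuple[float, float]:
    """Shift weight toward the model when its recent accuracy meets the target,
    and toward the heuristic otherwise. Returns (w_model, w_heuristic)."""
    if model_accuracy >= target:
        w_model += step
    else:
        w_model -= step
    w_model = min(1.0, max(0.0, w_model))  # keep the weight in [0, 1]
    return w_model, 1.0 - w_model


# The model performed well over the last period, so it earns more weight.
w_m, w_h = adjust_weights(0.5, model_accuracy=0.9)
score = blended_score(p_model=0.7, h_score=0.4, w_model=w_m, w_heuristic=w_h)
```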

Regarding claim 7, Islam teaches a robotic item unloader comprising:
a vision system (see at least: [pages 5014 and 5016] Fig. 4 and Fig. 8 (e) showing real world environment image captured by the vision system of the robot system) comprising:
a sensor device positioned at a location on the robotic item unloader (see at least: [Page 5012] “The robot is equipped with a wide suite of sensors, such as RGB and depth sensors, that allow it to estimate occupancy of the trailer's walls, and estimate box poses which are used to construct W.”; [Page 5015] “We extract the pile and the walls by fitting planes to the depth values from the point cloud obtained using our sensors.”);
a platform comprising a conveyor configured to operate in a first operating mode; a robotic arm comprising an end effector configured to operate in a second operating mode (see at least: [page 5011] “a custom built truck-unloading robot (Fig. 1a) equipped with a mobile omnidirectional base referred to as base and two articulated mechanisms-a manipulator-like tool with suction grippers and a scooper-like tool with conveyor belts, referred to as arm and nose, respectively.”; [Fig. 5 and page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”). As discussed, Islam teaches the first operating mode as the pick action using arm end effector and the second operating mode as the sweep action using the nose (platform); and
a processing unit communicatively coupled to at least one of the vision system, the platform, and the robotic arm, the processing unit (see at least: [page 5015] “We used Intel Xeon Gold 3.40GHz CPU machine for all our experiments.”) configured to:
obtain first point cloud data related to a first three-dimensional image captured by the first sensor device (see at least: [Page 5012] “The robot is equipped with a wide suite of sensors, such as RGB and depth sensors, that allow it to estimate occupancy of the trailer's walls, and estimate box poses which are used to construct W.”; [Page 5015] “We extract the pile and the walls by fitting planes to the depth values from the point cloud obtained using our sensors.”). As discussed, Islam teaches utilizing point cloud data from a plurality of sensors;
control the robotic item unloader by: operating the platform of the robotic item unloader according to the first operating mode and operating the robotic arm of the robotic item unloader according to the second operating mode (see at least: [Fig. 5 page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”).  
Islam fails to explicitly teach a first sensor device positioned at a first location on the robotic item unloader; and a second sensor device positioned at a second location on the robotic item unloader; obtain first point cloud data related to a first three-dimensional image captured by the first sensor device; obtain second point cloud data related to a second three-dimensional image captured by the second sensor device; transform the first point cloud data and the second point cloud data to generate a combined point cloud data.
However, Wicks teaches a method, device, and system for a robotic carton unloader comprising a first sensor device positioned at a first location on the robotic item unloader; and a second sensor device positioned at a second location on the robotic item unloader (see at least: [0032] “The computing device may also perform operations to stitch the various image data from the sensor devices in order to generate a combined, singular data set. For example, the computing device may combine an image of a top portion of a wall of boxes, a bottom right portion of the wall, and a bottom left portion of the wall to create a single image of the entire wall.”; [0045] “In some embodiments, the sensor devices 102, 104, 106 may be capable of tracking real-time movement as well as scanning objects three-dimensionally (i.e., 3D scanning).”; [0072] “Such 3D point cloud information may be obtained from a combination of the output from the sensor devices 102, 104, 106 (e.g., depth data in combination with other imagery), or alternatively may be acquired from 3D data (e.g., LIDAR data) collected by the various sensors associated with the robotic carton unloader. In various embodiments, there may be an RGB point associated with each 3D point.”);
obtain second point cloud data related to a second three-dimensional image captured by the second sensor device; transform the first point cloud data and the second point cloud data to generate a combined point cloud data (see at least: [0032] “The computing device may also perform operations to stitch the various image data from the sensor devices in order to generate a combined, singular data set. For example, the computing device may combine an image of a top portion of a wall of boxes, a bottom right portion of the wall, and a bottom left portion of the wall to create a single image of the entire wall.”; [0045] “In some embodiments, the sensor devices 102, 104, 106 may be capable of tracking real-time movement as well as scanning objects three-dimensionally (i.e., 3D scanning).”; [0072] “Such 3D point cloud information may be obtained from a combination of the output from the sensor devices 102, 104, 106 (e.g., depth data in combination with other imagery), or alternatively may be acquired from 3D data (e.g., LIDAR data) collected by the various sensors associated with the robotic carton unloader. In various embodiments, there may be an RGB point associated with each 3D point.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Wicks and provide a first sensor device positioned at a first location on the robotic item unloader; and a second sensor device positioned at a second location on the robotic item unloader; obtain second point cloud data related to a second three-dimensional image captured by the second sensor device; transform the first point cloud data and the second point cloud data to generate a combined point cloud data in order to provide a means of creating an image of the entire wall.  
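For illustration only, and not as a characterization of the implementation disclosed in Wicks or in the instant application, the combining step discussed above can be sketched as mapping each sensor's points into a common frame and concatenating the results. The function names, the use of 4x4 homogeneous pose matrices, and the example sensor poses below are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix4 = Sequence[Sequence[float]]  # 4x4 homogeneous transform, row-major (assumed convention)


def transform_points(points: Sequence[Point], pose: Matrix4) -> List[Point]:
    """Map points from a sensor frame into a common frame using a 4x4 pose."""
    out: List[Point] = []
    for x, y, z in points:
        out.append(tuple(
            pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
            for r in range(3)
        ))
    return out


def combine_point_clouds(first: Sequence[Point], first_pose: Matrix4,
                         second: Sequence[Point], second_pose: Matrix4) -> List[Point]:
    """Transform both clouds into the common frame and concatenate them."""
    return transform_points(first, first_pose) + transform_points(second, second_pose)


# Hypothetical sensor poses: the first sensor sits at the common-frame origin,
# the second is offset 0.5 m along x.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
SHIFT_X = [[1, 0, 0, 0.5], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

combined = combine_point_clouds([(0.0, 0.0, 1.0)], IDENTITY,
                                [(0.0, 0.0, 1.0)], SHIFT_X)
```

A practical system would typically also calibrate the sensor poses and remove overlapping points; the sketch omits both.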
Islam fails to explicitly teach to construct a machine learning model by using the combined point cloud data as an input to a convolution neural network; output, by the machine learning model, a decision classification indicative of a first probability associated with the first operating mode and a second probability associated with the second operating mode; and control the robotic item unloader by: operating the platform of the robotic item unloader according to the first operating mode, in response to, the first probability being higher than the second probability; and operating the robotic arm of the robotic item unloader according to the second operating mode in response to the second probability being higher than the first probability.  
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object and construct a machine learning model by using the combined point cloud data as an input to a convolution neural network (see at least: [0079] “Various embodiments apply deep learning to 3D point cloud data. In some examples, 3D data are represented as 3D voxels or as extracted features from 2.5 depth images and processed using convolutional neural networks.”; [0538] “In at least one embodiment, training system 4504 may be used to perform training, deployment, and implementation of machine learning models (e.g., neural networks, object detection algorithms, computer vision algorithms, etc.) for use in deployment system 4506.”); 
output, by the machine learning model, a decision classification indicative of a first probability associated with the first operating mode and a second probability associated with the second operating mode (see at least: [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”); and 
control the robotic item unloader by: operating the platform of the robotic item unloader according to the first operating mode, in response to, the first probability being higher than the second probability; and operating the robotic arm of the robotic item unloader according to the second operating mode in response to the second probability being higher than the first probability (see at least: [0087] “In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”; [0132]  “In at least one embodiment, the measure is a probability that the grasp pose will succeed, where success is measured by the chance that the grasp will allow performance of a defined object-manipulation task. In at least one embodiment, the measure is an proportional indication of grasp security. In at least one embodiment, the evaluation network provides a differentiable measure of the quality of the grasp.”; [0133] “In at least one embodiment, if the grasp is sufficiently refined, executed advances to block 1514 and the refined grasp is used to direct a robot to grasp the object.”).   
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method to construct a machine learning model by using the combined point cloud data as an input to a convolution neural network; output, by the machine learning model, a decision classification indicative of a first probability associated with the first operating mode and a second probability associated with the second operating mode; and control the robotic item unloader by: operating the platform of the robotic item unloader according to the first operating mode, in response to, the first probability being higher than the second probability; and operating the robotic arm of the robotic item unloader according to the second operating mode in response to the second probability being higher than the first probability in order to provide a means to decide which operation mode is effective to proceed with. 
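For illustration only, the claimed selection between operating modes based on which probability is higher can be sketched as follows. The sketch does not reproduce the network of Mousavian or of the instant application: the two raw scores stand in for a model output, and the softmax normalization, the function names, and the handling of a tie are assumptions.

```python
import math
from typing import Optional, Tuple


def softmax2(score_first: float, score_second: float) -> Tuple[float, float]:
    """Convert two raw scores into two probabilities that sum to 1."""
    m = max(score_first, score_second)  # subtract the max for numerical stability
    e1 = math.exp(score_first - m)
    e2 = math.exp(score_second - m)
    total = e1 + e2
    return e1 / total, e2 / total


def select_mode(p_first: float, p_second: float) -> Optional[str]:
    """Return the operating mode whose probability is higher."""
    if p_first > p_second:
        return "first_operating_mode"
    if p_second > p_first:
        return "second_operating_mode"
    return None  # tie: not addressed by the claim language; left undecided here


# Hypothetical raw scores for the first and second operating modes.
p1, p2 = softmax2(2.0, 0.5)
mode = select_mode(p1, p2)
```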

Regarding claim 8, Islam teaches the limitations of claim 7, further wherein in the first operating mode, a section of the platform is configured to sweep a pile of items, thereby, guiding one or more items of the pile of items onto the conveyor (see at least: [page 5011] “a custom built truck-unloading robot (Fig. 1a) equipped with a mobile omnidirectional base referred to as base and two articulated mechanisms-a manipulator-like tool with suction grippers and a scooper-like tool with conveyor belts, referred to as arm and nose, respectively.”; [Fig. 5 and page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”). As discussed, Islam teaches the first operating mode as the pick action using arm end effector and the second operating mode as the sweep action using the nose (platform).

Regarding claim 9, Islam teaches the limitations of claim 7, further wherein in the second operating mode, a section of the robotic arm is configured to pick an item by grasping a portion of the item using the end effector of the robotic arm (see at least: [page 5011] “a custom built truck-unloading robot (Fig. 1a) equipped with a mobile omnidirectional base referred to as base and two articulated mechanisms-a manipulator-like tool with suction grippers and a scooper-like tool with conveyor belts, referred to as arm and nose, respectively.”; [Fig. 5 and page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”). As discussed, Islam teaches the first operating mode as the pick action using arm end effector and the second operating mode as the sweep action using the nose (platform).

Regarding claim 10, Islam teaches the limitations of claim 7, further wherein the processing unit is configured to further: evaluate a selection of the first operating mode and the second operating mode, based on a pre-defined heuristic associated with past operations of the robotic item unloader (see at least: [page 5013] “The goal for this belief space planning problem is to unload all boxes in the truck. Heuristic functions are used to guide the search to the goal, so that the algorithm does not need to exhaustively evaluate all the possible action sequences.”; [page 5015] “We evaluated the Strategy Executor module in simulation and in the real world. Table II shows the simulation results for each action, Pick and Sweep, the number of executions, planning and execution times and the box unloading rate.”); and
control the robotic item unloader based on the evaluation of the selection (see at least: [page 5015] “To execute the Pick and Sweep actions the executor makes multiple calls to these modes as detailed in Fig. 5…A total of 128 boxes were unloaded in the entire run which results in an unloading rate of 0.2 boxes/sec”).
Islam fails to explicitly teach to evaluate a selection of the first operating mode and the second operating mode, based on the first probability and the second probability outputted by the machine learning model.
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object and evaluate a selection of the first operating mode and the second operating mode, based on the first probability and the second probability outputted by the machine learning model (see at least: [0069] “In some examples, this problem is approached by geometry-inspired heuristics to select promising grasp points around an object, optionally followed by a more in-depth geometric analysis of the stability and reachability of a sampled grasp.”; [0070] “In an embodiment, deep learning techniques are used to evaluate the quality of grasps from raw point cloud data. While, in some examples, this approach provides good grasp assessments, it still uses manually designed heuristics to sample grasps for evaluation or relies on blackbox optimization techniques such as the Cross Entropy Method (“CEM”).”; [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method to evaluate a selection of the first operating mode and the second operating mode, based on the first probability and the second probability outputted by the machine learning model and a pre-defined heuristic associated with past operations of the robotic item unloader in order to provide a means to decide which operation mode is effective to proceed with.

Regarding claim 11, Islam teaches the limitations of claim 7, further wherein the first sensor device and the second sensor device comprises at least one depth camera and a color camera (see at least: [Page 5012]: “The robot is equipped with a wide suite of sensors, such as RGB and depth sensors, that allow it to estimate occupancy of the trailer's walls, and estimate box poses which are used to construct W.”).
Islam fails to explicitly teach wherein the first sensor device is coupled to the robotic arm of the robotic item unloader and the second sensor device is coupled to the platform of the robotic item unloader.  
However, Wicks teaches a method, device, and system for a robotic carton unloader wherein the first sensor device is coupled to the robotic arm of the robotic item unloader and the second sensor device is coupled to the platform of the robotic item unloader (see at least: [0045] “In some embodiments, the sensor devices 102, 104, 106 may be capable of tracking real-time movement as well as scanning objects three-dimensionally (i.e., 3D scanning).”; [0072] “Such 3D point cloud information may be obtained from a combination of the output from the sensor devices 102, 104, 106 (e.g., depth data in combination with other imagery), or alternatively may be acquired from 3D data (e.g., LIDAR data) collected by the various sensors associated with the robotic carton unloader. In various embodiments, there may be an RGB point associated with each 3D point.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Wicks and provide the first sensor device coupled to the robotic arm of the robotic item unloader and the second sensor device is coupled to the platform of the robotic item unloader in order to provide another means of visualizing the environment and target object.

Regarding claim 12, Islam teaches the limitations of claim 7, further wherein to perform at least one of: loading and unloading of a plurality of items from an item docking station, the processing unit is (see at least: Abstract: “autonomously unloading boxes from trucks using an industrial manipulator robot.”; [page 5015] “We used Intel Xeon Gold 3.40GHz CPU machine for all our experiments.”) configured to:
generate a first command to operate the robotic arm according to the first operating mode; and generate a second command to operate the platform according to the second operating mode (see at least: [Fig. 5 page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”).  

Regarding claim 13, Islam teaches the limitations of claim 7, further wherein the processing unit (see at least: [page 5015] “We used Intel Xeon Gold 3.40GHz CPU machine for all our experiments.”) is used for the evaluation of the selection, based on a performance over a period of time (see at least: [Fig. 2 and page 5012] “Example of a decision tree, or strategy, that specifies the best actions (red arrows) along the possible action/observation histories from the initial belief bo.”; [page 5013] “To this end, we chose ARA* [16] as our planner. ARA* is an anytime heuristic search-based planner which tunes solution optimality based on available search time. Specifically, it computes an initial plan quickly and refines its quality as time permits.”; [page 5015] “We also report the overall planning times for the Pick and Sweep actions (Table II) which come from accumulated planning times for all the subactions.”).
Islam fails to explicitly teach to further adjust a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time.  
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object configured to further adjust a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time (see at least: [0103] “In at least one embodiment, after closing its fingers the gripper executes a predefined shaking motion. In at least one embodiment, a grasp is labeled successful if the object is kept between both fingers.”; [0147] “In at least one embodiment, training framework 1704 adjusts weights that control untrained neural network 1706. In at least one embodiment, training framework 1704 includes tools to monitor how well untrained neural network 1706 is converging towards a model, such as trained neural network 1708, suitable to generating correct answers, such as in result 1714, based on input data such as a new dataset 1712. In at least one embodiment, training framework 1704 trains untrained neural network 1706 repeatedly while adjust weights to refine an output of untrained neural network 1706 using a loss function and adjustment algorithm, such as stochastic gradient descent. In at least one embodiment, training framework 1704 trains untrained neural network 1706 until untrained neural network 1706 achieves a desired accuracy. In at least one embodiment, trained neural network 1708 can then be deployed to implement any number of machine learning operations.”; [0158] “For example, in at least one embodiment, a machine learning model may be trained by calculating weight parameters according to a neural network architecture using software and computing resources described above with respect to data center 1800. 
In at least one embodiment, trained machine learning models corresponding to one or more neural networks may be used to infer or predict information using resources described above with respect to data center 1800 by using weight parameters calculated through one or more training techniques described herein.”; [0206] “In at least one embodiment, confidence may be represented or interpreted as a probability, or as providing a relative “weight” of each detection compared to other detections. In at least one embodiment, a confidence measure enables a system to make further decisions regarding which detections should be considered as true positive detections rather than false positive detections.”). As discussed, Mousavian teaches different pre-defined heuristics (rules) that the robot needs to meet for a grasp to be treated as successful, such as keeping an object between both fingers of the robot. Mousavian also teaches a machine learning model and applying/adjusting weights to control and refine the output from the machine learning model to attain an accurate grasp.
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method to further adjust a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time in order to provide a means to determine the importance of each selection.
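For illustration only, the claimed adjustment of a first weight (decision classification) and a second weight (pre-defined heuristic) based on model performance over a period of time can be sketched as follows. Neither Islam nor Mousavian is represented as disclosing this particular rule: the linear combination, the step size, the 0.5 accuracy baseline, and all names are assumptions made for this sketch.

```python
from typing import Sequence, Tuple


def evaluate_selection(p_model: float, heuristic_score: float,
                       w_model: float, w_heuristic: float) -> float:
    """Combine the model probability and the heuristic score into one score."""
    return w_model * p_model + w_heuristic * heuristic_score


def adjust_weights(w_model: float, recent_outcomes: Sequence[bool],
                   step: float = 0.1) -> Tuple[float, float]:
    """Shift weight toward the model when its recent selections succeeded.

    recent_outcomes holds one flag per past operation in the observation
    window: True if the mode preferred by the model succeeded (hypothetical
    bookkeeping, not taken from any cited reference).
    """
    if not recent_outcomes:
        return w_model, 1.0 - w_model  # no evidence: keep weights unchanged
    accuracy = sum(recent_outcomes) / len(recent_outcomes)
    # Accuracy above 0.5 raises the model weight; below 0.5 lowers it.
    w_model = min(1.0, max(0.0, w_model + step * (accuracy - 0.5)))
    return w_model, 1.0 - w_model  # weights are kept summing to 1


# Hypothetical window in which the model-preferred mode always succeeded.
w_model, w_heuristic = adjust_weights(0.5, [True, True, True, True])
score = evaluate_selection(0.8, 0.4, w_model, w_heuristic)
```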

Regarding claim 14, Islam teaches a non-transitory computer readable medium that stores thereon computer-executable instructions that in response to execution by a processor (see at least: [Fig. 4 and page 5015] “We used Intel Xeon Gold 3.40GHz CPU machine for all our experiments.”), perform operations comprising:
obtaining first point cloud data related to a first three-dimensional image captured by a first sensor device of a robotic item handler (see at least: [Page 5012] “The robot is equipped with a wide suite of sensors, such as RGB and depth sensors, that allow it to estimate occupancy of the trailer's walls, and estimate box poses which are used to construct W.”; [Page 5015] “We extract the pile and the walls by fitting planes to the depth values from the point cloud obtained using our sensors.”). As discussed, Islam teaches utilizing point cloud data from a plurality of sensors;
outputting a decision associated with an operating mode of the robotic item handler (see at least: [Fig. 2 and page 5012] “Example of a decision tree, or strategy, that specifies the best actions (red arrows) along the possible action/observation histories from the initial belief bo.”; [Fig. 4 and page 5014] “The Strategy Chooser learns a mapping M offline which is used in the online phase to decide which strategy to use given current world state.”); and
generating a first command to operate the robotic item handler (see at least: [Fig. 5 page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”).  
Islam fails to explicitly teach obtaining second point cloud data related to a second three-dimensional image captured by a second sensor device of the robotic item handler; transforming the first point cloud data and the second point cloud data to generate a combined point cloud data.
However, Wicks teaches a method, device, and system for a robotic carton unloader comprising obtaining first point cloud data related to a first three-dimensional image captured by a first sensor device of a robotic item handler; obtaining second point cloud data related to a second three-dimensional image captured by a second sensor device of the robotic item handler; transforming the first point cloud data and the second point cloud data to generate a combined point cloud data (see at least: [0032] “The computing device may also perform operations to stitch the various image data from the sensor devices in order to generate a combined, singular data set. For example, the computing device may combine an image of a top portion of a wall of boxes, a bottom right portion of the wall, and a bottom left portion of the wall to create a single image of the entire wall.”; [0045] “In some embodiments, the sensor devices 102, 104, 106 may be capable of tracking real-time movement as well as scanning objects three-dimensionally (i.e., 3D scanning).”; [0072] “Such 3D point cloud information may be obtained from a combination of the output from the sensor devices 102, 104, 106 (e.g., depth data in combination with other imagery), or alternatively may be acquired from 3D data (e.g., LIDAR data) collected by the various sensors associated with the robotic carton unloader. In various embodiments, there may be an RGB point associated with each 3D point.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Wicks and provide a method of obtaining second point cloud data related to a second three-dimensional image captured by a second sensor device of the robotic item handler; transforming the first point cloud data and the second point cloud data to generate a combined point cloud data in order to provide a means of creating an image of the entire wall.  
Islam fails to explicitly teach constructing a machine learning model by using the combined point cloud data as an input to a convolution neural network; outputting, by the machine learning model, a decision classification indicative of a probability associated with an operating mode of the robotic item handler; and generating a first command to operate the robotic item handler based on the decision classification.  
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object, comprising constructing a machine learning model by using the combined point cloud data as an input to a convolution neural network (see at least: [0079] “Various embodiments apply deep learning to 3D point cloud data. In some examples, 3D data are represented as 3D voxels or as extracted features from 2.5 depth images and processed using convolutional neural networks.”; [0538] “In at least one embodiment, training system 4504 may be used to perform training, deployment, and implementation of machine learning models (e.g., neural networks, object detection algorithms, computer vision algorithms, etc.) for use in deployment system 4506.”);
outputting, by the machine learning model, a decision classification indicative of a probability associated with an operating mode of the robotic item handler (see at least: [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”); and 
generating a first command to operate the robotic item handler based on the decision classification (see at least: [0087] “In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”; [0132]  “In at least one embodiment, the measure is a probability that the grasp pose will succeed, where success is measured by the chance that the grasp will allow performance of a defined object-manipulation task. In at least one embodiment, the measure is an proportional indication of grasp security. In at least one embodiment, the evaluation network provides a differentiable measure of the quality of the grasp.”; [0133] “In at least one embodiment, if the grasp is sufficiently refined, executed advances to block 1514 and the refined grasp is used to direct a robot to grasp the object.”).  
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method of constructing a machine learning model by using the combined point cloud data as an input to a convolution neural network; outputting, by the machine learning model, a decision classification indicative of a probability associated with an operating mode of the robotic item handler; and generating a first command to operate the robotic item handler based on the decision classification in order to provide a means to decide which operation mode is effective to proceed with.

Regarding claim 15, Islam teaches the limitations of claim 14, further comprising evaluating a selection of the operating mode of the robotic item handler based on a pre-defined heuristic associated with past operations of the robotic item handler (see at least: [page 5013] “The goal for this belief space planning problem is to unload all boxes in the truck. Heuristic functions are used to guide the search to the goal, so that the algorithm does not need to exhaustively evaluate all the possible action sequences.”; [page 5015] “We evaluated the Strategy Executor module in simulation and in the real world. Table II shows the simulation results for each action, Pick and Sweep, the number of executions, planning and execution times and the box unloading rate.”); and
generating a second command to operate the robotic item handler based on the evaluation of the selection (see at least: [page 5015] “To execute the Pick and Sweep actions the executor makes multiple calls to these modes as detailed in Fig. 5…A total of 128 boxes were unloaded in the entire run which results in an unloading rate of 0.2 boxes/sec”).  
Islam fails to explicitly teach evaluating a selection of the operating mode based on the decision classification and a pre-defined heuristic. 
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object, comprising evaluating a selection of the operating mode based on the decision classification and a pre-defined heuristic (see at least: [0069] “In some examples, this problem is approached by geometry-inspired heuristics to select promising grasp points around an object, optionally followed by a more in-depth geometric analysis of the stability and reachability of a sampled grasp.”; [0070] “In an embodiment, deep learning techniques are used to evaluate the quality of grasps from raw point cloud data. While, in some examples, this approach provides good grasp assessments, it still uses manually designed heuristics to sample grasps for evaluation or relies on blackbox optimization techniques such as the Cross Entropy Method (“CEM”).”; [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method of evaluating a selection of the operating mode based on the decision classification and a pre-defined heuristic in order to provide a means to decide which operation mode is effective to proceed with.

Regarding claim 16, Islam teaches the limitations of claim 14, further wherein the decision classification is associated with a first operating mode and a second operating mode and wherein with the first command is to: operate the robotic item handler according to the first operating mode; and operate the robotic item handler according to the second operating mode (see at least: [Fig. 2 and page 5012] “Example of a decision tree, or strategy, that specifies the best actions (red arrows) along the possible action/observation histories from the initial belief bo.”; [Fig. 4 and page 5014] “The Strategy Chooser learns a mapping M offline which is used in the online phase to decide which strategy to use given current world state.”; [Fig. 5 and page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”).
Islam fails to explicitly teach the decision classification is indicative of a first probability associated with a first operating mode and a second probability associated with a second operating mode and wherein with the first command is to: operate the robotic item handler according to the first operating mode in response to the first probability being higher than the second probability; and operate the robotic item handler according to the second operating mode in response to the second probability being higher than the first probability.
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object, in which the decision classification is indicative of a first probability associated with a first operating mode and a second probability associated with a second operating mode (see at least: [0087] “In at least one embodiment, in order to detect and refine these negative grasps, an evaluation module is trained to predict P(S|g, X), i.e., the probability of success for a grasp g and the observed point cloud X. In at least one embodiment, applied to a sampled grasp, the evaluation module predicts grasp success and propagates the success gradient back through the network to generate an improved grasp pose. In at least one embodiment, this process is repeated. In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”) and 
wherein with the first command is to: operate the robotic item handler according to the first operating mode in response to the first probability being higher than the second probability; and operate the robotic item handler according to the second operating mode in response to the second probability being higher than the first probability (see at least: [0087] “In at least one embodiment, discarding all grasps that remain below a threshold provides the final set of high quality grasps, an example of which is illustrated in FIG. 4.”; [0132]  “In at least one embodiment, the measure is a probability that the grasp pose will succeed, where success is measured by the chance that the grasp will allow performance of a defined object-manipulation task. In at least one embodiment, the measure is an proportional indication of grasp security. In at least one embodiment, the evaluation network provides a differentiable measure of the quality of the grasp.”; [0133] “In at least one embodiment, if the grasp is sufficiently refined, executed advances to block 1514 and the refined grasp is used to direct a robot to grasp the object.”).  
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide that the decision classification is indicative of a first probability associated with a first operating mode and a second probability associated with a second operating mode and wherein with the first command is to: operate the robotic item handler according to the first operating mode in response to the first probability being higher than the second probability; and operate the robotic item handler according to the second operating mode in response to the second probability being higher than the first probability in order to provide a means to decide which operation mode is effective to proceed with.
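For illustration only, the probability comparison recited in claim 16 can be sketched as follows. The function name `select_operating_mode`, the mode labels, and the handling of equal probabilities are assumptions made for this sketch; they are not taken from Islam, Mousavian, or the application, and the claim language as quoted above recites only the two strict comparisons.

```python
from typing import Optional


def select_operating_mode(p_first: float, p_second: float) -> Optional[str]:
    """Return the label of the operating mode with the higher probability.

    Hypothetical helper: returns None when the probabilities are equal,
    because the quoted claim language does not address a tie.
    """
    if p_first > p_second:
        return "first"   # e.g., a pick-type mode (assumed label)
    if p_second > p_first:
        return "second"  # e.g., a sweep-type mode (assumed label)
    return None


if __name__ == "__main__":
    # Example values chosen arbitrarily for the sketch.
    print(select_operating_mode(0.7, 0.3))
```

Under these assumptions, the first command would direct the robotic item handler to whichever mode the helper returns.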

Regarding claim 17, Islam teaches the limitations of claim 16, further wherein the first operating mode is associated with picking an item by grasping the item using an end effector of a robotic arm of the robotic item handler; and the second operating mode is associated with sweeping a pile of items from an item docking station with a platform of the robotic item handler (see at least: [page 5011] “a custom built truck-unloading robot (Fig. 1a) equipped with a mobile omnidirectional base referred to as base and two articulated mechanisms-a manipulator-like tool with suction grippers and a scooper-like tool with conveyor belts, referred to as arm and nose, respectively.”; [Fig. 5 and page 5013] “The robot is capable of executing two high level actions-Pick and Sweep, both depicted in Fig. 5. Pick action works well in unloading boxes in structured walls, whereas the Sweep action is geared towards boxes lying in unstructured piles on the trailer floor.”). As discussed, Islam teaches the first operating mode as the Pick action using the end effector of the arm and the second operating mode as the Sweep action using the nose (platform).

Regarding claim 18, Islam teaches the limitations of claim 15, further comprising the pre-defined heuristic, used for the evaluation of the selection, based on a performance over a period of time (see at least: [Fig. 2 and page 5012] “Example of a decision tree, or strategy, that specifies the best actions (red arrows) along the possible action/observation histories from the initial belief bo.”; [page 5013] “To this end, we chose ARA* [16] as our planner. ARA* is an anytime heuristic search-based planner which tunes solution optimality based on available search time. Specifically, it computes an initial plan quickly and refines its quality as time permits.”; [page 5015] “We also report the overall planning times for the Pick and Sweep actions (Table II) which come from accumulated planning times for all the subactions.”).
Islam fails to explicitly teach adjusting a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time.
However, Mousavian teaches a machine learning system that generates grasp poses that can be used by a robot to manipulate an object, comprising adjusting a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time (see at least: [0103] “In at least one embodiment, after closing its fingers the gripper executes a predefined shaking motion. In at least one embodiment, a grasp is labeled successful if the object is kept between both fingers.”; [0147] “In at least one embodiment, training framework 1704 adjusts weights that control untrained neural network 1706. In at least one embodiment, training framework 1704 includes tools to monitor how well untrained neural network 1706 is converging towards a model, such as trained neural network 1708, suitable to generating correct answers, such as in result 1714, based on input data such as a new dataset 1712. In at least one embodiment, training framework 1704 trains untrained neural network 1706 repeatedly while adjust weights to refine an output of untrained neural network 1706 using a loss function and adjustment algorithm, such as stochastic gradient descent. In at least one embodiment, training framework 1704 trains untrained neural network 1706 until untrained neural network 1706 achieves a desired accuracy. In at least one embodiment, trained neural network 1708 can then be deployed to implement any number of machine learning operations.”; [0158] “For example, in at least one embodiment, a machine learning model may be trained by calculating weight parameters according to a neural network architecture using software and computing resources described above with respect to data center 1800. In at least one embodiment, trained machine learning models corresponding to one or more neural networks may be used to infer or predict information using resources described above with respect to data center 1800 by using weight parameters calculated through one or more training techniques described herein.”; [0206] “In at least one embodiment, confidence may be represented or interpreted as a probability, or as providing a relative “weight” of each detection compared to other detections. In at least one embodiment, a confidence measure enables a system to make further decisions regarding which detections should be considered as true positive detections rather than false positive detections.”). As discussed, Mousavian teaches different pre-defined heuristics (rules) that a grasp must meet in order to be treated as a successful grasp, such as keeping an object between both fingers of the robot. Mousavian also teaches a machine learning model and applying/adjusting weights to control and refine the output from the machine learning model to attain an accurate grasp.
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Islam to incorporate the teachings of Mousavian and provide a method of adjusting a first weight associated with the decision classification and a second weight associated with the pre-defined heuristic, used for the evaluation of the selection, based on a performance associated with the output of the machine learning model over a period of time in order to provide a means to determine the importance of each selection.
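For illustration only, one way to read the weighted evaluation and weight adjustment recited in claim 18 is sketched below. Every name here (`evaluate_selection`, `adjust_weights`, `step`, `recent_success_rate`), the linear combination, and the moving-average update rule are assumptions made for this sketch; neither Islam, Mousavian, nor the claim language as quoted above specifies how the two weights are combined or updated.

```python
from typing import Tuple


def evaluate_selection(ml_score: float, heuristic_score: float,
                       w_ml: float, w_heuristic: float) -> float:
    """Combine the classifier score and the heuristic score linearly (assumed form)."""
    return w_ml * ml_score + w_heuristic * heuristic_score


def adjust_weights(w_ml: float, recent_success_rate: float,
                   step: float = 0.1) -> Tuple[float, float]:
    """Move the classifier weight toward its recent success rate.

    Hypothetical update rule: an exponential moving average, clamped to
    [0, 1]. The heuristic weight is the complement, so the two weights
    always sum to 1.
    """
    new_w_ml = (1.0 - step) * w_ml + step * recent_success_rate
    new_w_ml = min(1.0, max(0.0, new_w_ml))
    return new_w_ml, 1.0 - new_w_ml


if __name__ == "__main__":
    # Start with equal weights, then observe a period of strong model performance.
    w_ml, w_h = adjust_weights(0.5, recent_success_rate=1.0)
    print(w_ml, w_h, evaluate_selection(0.8, 0.4, w_ml, w_h))
```

In this sketch, a period in which the model's output performs well shifts weight toward the decision classification, and a period of poor performance shifts it toward the pre-defined heuristic.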

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Clucas et al. (US 20180362270) teaches an apparatus for unloading cargo from a trailer with various sensors mounted on a robotic arm and a platform having different operating modes. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIEN MINH LE, whose telephone number is (571)272-3903. The examiner can normally be reached Monday to Friday (8:30am-5:30pm eastern time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Khoi Tran, can be reached at (571)272-6919. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/T.M.L./
Examiner, Art Unit 3664

/KHOI H TRAN/
Supervisory Patent Examiner, Art Unit 3664