DETAILED ACTION
Notice of Pre-AIA or AIA Status
Claims 1-11 are pending in this application. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 


35 U.S.C. § 112 Sixth Paragraph - Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitations are: “unit” in claims 1-6 and 10-11.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
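(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.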

Claims 1-11 are rejected under 35 U.S.C. 103(a) as being unpatentable over Delp (US PGPub US 2017/0075356 A1), hereinafter referred to as “Delp”, in view of Park et al. (US PGPub US 2018/0232606 A1), hereinafter referred to as “Park”.

Consider Claims 1, 10 and 11.
Delp teaches: 
-; 1. A processing device structured to recognize an object based on image data, comprising: / 10. A processing device structured to / 11. A learning method for a processing device structured to (Delp: abstract, A method of autonomous driving includes generating, with a 3D sensor, 3D points representing objects in the environment surrounding a vehicle. The method further includes, with a computing device, identifying, from the 3D points, a temporal series of clusters of 3D points representing the same object in the environment surrounding the vehicle as a track, identifying cluster-based classifiers for the object based on identified local features for the clusters in the track, identifying track-based classifiers for the object based on identified global features for the track, combining the cluster-based classifiers and the track-based classifiers to classify the object, with the cluster-based classifiers being weighted based on an amount of information on the clusters from which they are identified, and with the weight increasing with increasing amounts of information, and driving the vehicle along a route based on the object's classification. [0017] FIG. 1 shows a vehicle 10 including an autonomous operation system 20 whose operation is supported by a LIDAR sensor 22 and one or more optional auxiliary sensors 24. The LIDAR sensor 22 and the auxiliary sensors 24 are mounted on the vehicle 10 and positioned to have fields of view in the environment surrounding the vehicle 10.)
-; 1. an object recognition unit structured to identify an object based on the image data; / 10. recognize an object based on a sensor output acquired by a sensor, comprising: / 11. recognize an object based on image data acquired by a camera, wherein the processing device comprises: (Delp: [0018] The LIDAR sensor 22 is configured to scan the environment surrounding the vehicle 10 and generate signals, including but not limited to 3D points, representing the objects in the environment surrounding the vehicle 10. [0024] The auxiliary sensors 24 may have fields of view individually, or collectively, common to the field of view of the LIDAR sensor 22 in the environment surrounding the vehicle 10. Generally, the auxiliary sensors 24 can be, or include, one or more image sensors configured for capturing light or other electromagnetic energy from the environment surrounding the vehicle 10.)
-; 1. and a conversion unit structured as a neural network / 10. a conversion unit structured as a neural network, and structured to convert the sensor output to intermediate data; /11.  a conversion unit structured to convert the image data acquired by the camera; (Delp: [0042] The operations of a process 100 for the classifier thread in the detection module 50 of the autonomous operation system 20 of the vehicle 10 are shown in FIG. 3. [0043] As described below, the process 100 culminates in the combination of cluster-based classifiers identified based on local features for a track's clusters of 3D points, or cluster features, and a track-based classifier based on the global features for the track itself, or holistic features.)
-; 1. provided as an upstream stage of the object recognition unit, / 10. and an object recognition unit structured to identify an object based on the intermediate data, / 11. (Delp: [0045] In general, the holistic features are higher level summary statistics of the object represented by the track's clusters of 3D points. The holistic features may, for example, correspond in whole or in part to the motion of the object represented by the track's clusters of 3D points. For the track, there is a single global, or holistic, feature set ω. With both the cluster feature set z_{1:T} and the single holistic feature set ω, the feature set for the track at T is x_T = z_{1:T}, ω. [0046] In operation 102, the local features for a track's clusters of 3D points, or cluster features, are identified, and in operation 104, the global features for the track itself, or holistic features, are identified.)
-; 1. and structured to convert a first image acquired by a camera into a second image, / 10.  / wherein the conversion unit converts the sensor output into the intermediate data as acquired in the same environment / 11. and wherein the learning method comprises: training the object recognition unit using an image acquired in a predetermined environment as learning data; (Delp: [0047] The local features for a track's clusters of 3D points, or cluster features, may be identified, for instance, from spin images and histogram of oriented gradients (HOG) features derived from virtual orthographic images of the track's clusters of 3D points. In general, this identification requires the track's clusters of 3D points to be oriented consistently, which can be accomplished by estimating the principle direction of each of the track's clusters of 3D points.)
-; 1. and to input the second image to the object recognition unit. / 10. wherein the conversion unit converts the sensor output into the intermediate data as acquired in the same environment as that in which learning data used for training of the object recognition unit was acquired / 11. and training the conversion unit using a set of an (Delp: [0053] The global features for a track, or holistic features, may be, or include, a velocity of the track's clusters of 3D points that represents a velocity of the object represented by the track's clusters of 3D points. Accordingly, the global features for a track may include a maximum velocity of the track's clusters of 3D points or a maximum velocity of the track's clusters of 3D points, or both, for instance. Alternatively, or additionally, the global features for a track may be, or include, an acceleration of the track's clusters of 3D points that represents an acceleration of the object represented by the track's clusters of 3D points. Accordingly, the global features for a track may include a maximum acceleration of the track's clusters of 3D points or a maximum acceleration of the track's clusters of 3D points, or both, for instance. These and other global features for a track may be identified, for example, using a Kalman filter over the centroids of the track's clusters of 3D points. [0054] In operation 106, it is learned which local features, or cluster features, and which global features, or holistic features, are predictive of objects belonging to the pedestrian (c_p), bicycle (c_b), vehicle (c_v) and background (c_bg) object classes. FIG. 7 is an example graphical model encoding the probabilistic independencies between the local features for a track's clusters of 3D points, or cluster features, and the global features for the track, or holistic features. The learning in operation 106 may implement a decision-tree-based Gentle ADABoost.)
Delp does not teach: 
1/10. a conversion unit structured as a neural network 
Park teaches: 
(Park: abstract, Disclosed is a sensory information providing apparatus. The sensory information providing apparatus may comprise a learning model database storing a plurality of learning models related to sensory effect information with respect to a plurality of videos; and a video analysis engine generating the plurality of learning models by extracting sensory effect association information by analyzing the plurality of videos and sensory effect meta information of the plurality of videos, and extracting sensory information corresponding to an input video stream by analyzing the input video stream based on the plurality of learning model. [0053] As shown in FIG. 1, the video analysis engine 100 according to the present disclosure may perform feature point extraction, training/retraining, event/object extraction, and the like based on deep learning. [0087] FIG. 5 is a block diagram illustrating a sensory information extraction unit according to an embodiment of the present disclosure.)
-; 1. an object recognition unit structured to identify an object based on the image data; / 10. recognize an object based on a sensor output acquired by a sensor, comprising: / 11. recognize an object based on image data acquired by a camera, wherein the processing device comprises: (Park: [0088] As explained referring to FIG. 2, the deep learning-based video analysis unit 110 in the video analysis engine may separate an input video stream into video frames, extract feature points for each video frame or segmented video, and the sensory effect information analysis unit 120 may extract sensory effect information by using a video analysis result and sensory effect meta information of the input video stream. [0089] The sensory information extraction unit 130 may generate the sensory information using the video analysis result and the sensory effect information in cooperation with the deep learning based video analysis unit 110 and the sensory information analysis unit 120, and may comprise an event recognition unit 131, an object recognition unit 132, and an association information extraction unit 133.)
-; 10. wherein the conversion unit converts the sensor output into the intermediate data as acquired in the same environment as that in which learning data used for training of the object recognition unit was acquired (Park: [0090] Specifically, the sensory information extraction unit 130 may comprise the event recognition unit 131 for recognizing events to be sensory effect elements from the video analysis result, the object recognition unit 132 for recognizing objects to be sensory effect elements from the video analysis result, and the association information extraction unit 133 for extracting sensory effect association information using the recognized events or objects and the sensory effect meta information. [0091] The event recognition unit 131 may recognize events included in the input video stream, that is, events to be sensory effect elements, and output information on the events (referred to as 'event information').)
-;  11. and wherein the learning method comprises: training the object recognition unit using an image acquired in a predetermined environment as learning data; and training the conversion unit using a set of an image acquired in the predetermined environment and an image acquired in an environment that differs from the predetermined environment. (Park: [0092] The object recognition unit 132 may recognize objects included in the input video stream, that is, objects to be sensory effect elements, and output information on the objects (referred to as 'object information'). [0093] The association information extraction unit 133 may extract sensory effect association information based on the event information, the object information, context information)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Delp’s method and system for object identification and classification in vehicular imaging to leverage the sensor data feature extraction algorithm and machine learning algorithms of Park as they are all directed towards the field of image analysis. The determination of obviousness is predicated upon the following findings: One skilled in the art would have been motivated to modify Delp in order to improve the overall automated image analysis algorithm to leverage a learned neural network that performs feature-based video analysis.  Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and/or programming techniques, without changing a “fundamental” operating principle of Delp, while the teaching of Park continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of feature-based video data analysis using learned neural networks. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Consider Claim 2. The combination of Park and Delp teaches: The processing device according to claim 1, wherein the second image is obtained by correcting shades of the first image such that they are an approximation of learning data used for training of the object recognition unit. (Delp: [0047] The local features for a track's clusters of 3D points, or cluster features, may be identified, for instance, from spin images and histogram of oriented gradients (HOG) features derived from virtual orthographic images of the track's clusters of 3D points. In general, this identification requires the track's clusters of 3D points to be oriented consistently, which can be accomplished by estimating the principle direction of each of the track's clusters of 3D points. Park: [0072] Among the sensory effect information, the sensory effect types may include light, flash, temperature, wind, vibration, air jet, water jet, fog, bubble, motion, scent, and the like. [0073] The sensory effect start time (i.e., presentation time stamp (pts)) may represent a time at which an object appears or an event starts. The duration may represent a time taken from the sensory effect start time to a sensory effect end time (i.e., sensory effect duration). Here, units of the sensory effect start time pts and the duration are milliseconds. [0074] The sensory effect objects or events may represent objects or events causing sensory effects, and in a motion effect, etc., they may be useful for a motion tracking. Here, a motion tracking algorithm according to types of the sensory effect objects and events may be used as a motion tracking algorithm for the motion tracking.)

Consider Claim 3. The combination of Park and Delp teaches: The processing device according to claim 1, wherein the second image of the same scene as that of the first image is generated as an image as acquired in the same environment as an environment in which the learning data used for the training of the object recognition unit was acquired. (Park: [0072] Among the sensory effect information, the sensory effect types may include light, flash, temperature, wind, vibration, air jet, water jet, fog, bubble, motion, scent, and the like. [0073] The sensory effect start time (i.e., presentation time stamp (pts)) may represent a time at which an object appears or an event starts. The duration may represent a time taken from the sensory effect start time to a sensory effect end time (i.e., sensory effect duration). Here, units of the sensory effect start time pts and the duration are milliseconds. [0074] The sensory effect objects or events may represent objects or events causing sensory effects, and in a motion effect, etc., they may be useful for a motion tracking. Here, a motion tracking algorithm according to types of the sensory effect objects and events may be used as a motion tracking algorithm for the motion tracking.)

Consider Claim 4. The combination of Park and Delp teaches: The processing device according to claim 3, wherein the learning data is acquired in daytime, and wherein the conversion unit converts a first image acquired in nighttime into a second image as acquired in daytime. (Park: [0072] Among the sensory effect information, the sensory effect types may include light, flash, temperature, wind, vibration, air jet, water jet, fog, bubble, motion, scent, and the like. [0073] The sensory effect start time (i.e., presentation time stamp (pts)) may represent a time at which an object appears or an event starts. The duration may represent a time taken from the sensory effect start time to a sensory effect end time (i.e., sensory effect duration). Here, units of the sensory effect start time pts and the duration are milliseconds. [0074] The sensory effect objects or events may represent objects or events causing sensory effects, and in a motion effect, etc., they may be useful for a motion tracking. Here, a motion tracking algorithm according to types of the sensory effect objects and events may be used as a motion tracking algorithm for the motion tracking.)

Consider Claim 5. The combination of Park and Delp teaches: The processing device according to claim 1, wherein, in the training of the conversion unit, the neural network of the conversion unit is optimized with reference to an identification rate of the object recognition unit so as to improve the identification rate. (Delp: [0046] In operation 102, the local features for a track's clusters of 3D points, or cluster features, are identified, and in operation 104, the global features for the track itself, or holistic features, are identified. In the process 100, for the track, each of the resulting feature sets corresponds to a classifier, so there will be T cluster-based classifiers and one track-based classifier incorporated into the object's ultimate classification. Park: Figure 5, [0089] The sensory information extraction unit 130 may generate the sensory information using the video analysis result and the sensory effect information in cooperation with the deep learning based video analysis unit 110 and the sensory information analysis unit 120, and may comprise an event recognition unit 131, an object recognition unit 132, and an association information extraction unit 133.)

Consider Claim 6. The combination of Park and Delp teaches: The processing device according to claim 1, wherein the conversion unit receives a plurality of consecutive frames as input. (Park: [0063] Specifically, the deep learning-based video analysis unit 110 may separate video frames from the input video stream and extract feature points for each video frame or segmented video to construct a learning data set for extracting sensory effects. [0071] The input data and the output data for the video analysis according to the present disclosure may be similar in shape but not the same. The output data, which is the result of the video analysis of the present disclosure, is the sensory information, and may additionally include position information of sensory effect objects or events in addition to the input data. That is, the positions of objects or events in video frames that cannot be expressed through a manual operation of the user may be information extracted through the video analysis engine according to the present disclosure, and thus not included in the input data input to the video analysis engine. Delp: [0023] The auxiliary sensors 24 may also be configured to scan the environment surrounding the vehicle 10 and generate signals representing objects, or the lack thereof, in the environment surrounding the vehicle 10. [0024] The auxiliary sensors 24 may have fields of view individually, or collectively, common to the field of view of the LIDAR sensor 22 in the environment surrounding the vehicle 10. Generally, the auxiliary sensors 24 can be, or include, one or more image sensors configured for capturing light or other electromagnetic energy from the environment surrounding the vehicle 10. These image sensors may be, or include, one or more photodetectors, solid state photodetectors, photodiodes or photomultipliers, or any combination of these.)

Consider Claim 7. The combination of Park and Delp teaches: An object identification system comprising: a camera; and the processing device according to claim 1. (Delp: [0023]-[0026], [0024] The auxiliary sensors 24 may have fields of view individually, or collectively, common to the field of view of the LIDAR sensor 22 in the environment surrounding the vehicle 10. [0025] The vehicle 10 includes a computing device 30 to which the LIDAR sensor 22 and the auxiliary sensors 24 are communicatively connected through one or more communication links 32. [0026] The computing device 30 may include a processor 40 communicatively coupled with a memory 42. Park: [0053] As shown in FIG. 1, the video analysis engine 100 according to the present disclosure may perform feature point extraction, training/retraining, event/object extraction, and the like based on deep learning. [0087] FIG. 5 is a block diagram illustrating a sensory information extraction unit according to an embodiment of the present disclosure.)

Consider Claim 8. The combination of Park and Delp teaches: An automotive lamp comprising the object identification system according to claim 7. (Delp: [0017]-[0021], [0017] FIG. 1 shows a vehicle 10 including an autonomous operation system 20 whose operation is supported by a LIDAR sensor 22 and one or more optional auxiliary sensors 24. The LIDAR sensor 22 and the auxiliary sensors 24 are mounted on the vehicle 10 and positioned to have fields of view in the environment surrounding the vehicle 10. [0023]-[0026], [0024] The auxiliary sensors 24 may have fields of view individually, or collectively, common to the field of view of the LIDAR sensor 22 in the environment surrounding the vehicle 10. [0025] The vehicle 10 includes a computing device 30 to which the LIDAR sensor 22 and the auxiliary sensors 24 are communicatively connected through one or more communication links 32. [0026] The computing device 30 may include a processor 40 communicatively coupled with a memory 42. Park: [0053] As shown in FIG. 1, the video analysis engine 100 according to the present disclosure may perform feature point extraction, training/retraining, event/object extraction, and the like based on deep learning. [0087] FIG. 5 is a block diagram illustrating a sensory information extraction unit according to an embodiment of the present disclosure.)

Consider Claim 9. The combination of Park and Delp teaches: An automobile comprising: a camera built into a headlamp; and the processing device according to claim 1. (Delp: [0017]-[0021], [0017] FIG. 1 shows a vehicle 10 including an autonomous operation system 20 whose operation is supported by a LIDAR sensor 22 and one or more optional auxiliary sensors 24. The LIDAR sensor 22 and the auxiliary sensors 24 are mounted on the vehicle 10 and positioned to have fields of view in the environment surrounding the vehicle 10. [0023]-[0026], [0024] The auxiliary sensors 24 may have fields of view individually, or collectively, common to the field of view of the LIDAR sensor 22 in the environment surrounding the vehicle 10. [0025] The vehicle 10 includes a computing device 30 to which the LIDAR sensor 22 and the auxiliary sensors 24 are communicatively connected through one or more communication links 32. [0026] The computing device 30 may include a processor 40 communicatively coupled with a memory 42. Park: [0053] As shown in FIG. 1, the video analysis engine 100 according to the present disclosure may perform feature point extraction, training/retraining, event/object extraction, and the like based on deep learning. [0087] FIG. 5 is a block diagram illustrating a sensory information extraction unit according to an embodiment of the present disclosure.)

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAHMINA ANSARI whose telephone number is 571-270-3379.  The examiner can normally be reached on IFP Flex - Monday through Friday 9 to 5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SUMATI LEFKOWITZ can be reached on 571-272-3638.  The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications. TC 2600’s customer service number is 571-272-2600.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2600.



2662
/Tahmina Ansari/

October 22, 2021
/TAHMINA N ANSARI/
Primary Examiner, Art Unit 2662