DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 6, 8-11, 13, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Trajkovic US PG-Pub (US 20020167537 A1) in view of Oami US-PG-Pub (US 20200285845 A1).
Regarding Claim 1, Trajkovic teaches a method for identifying human body (Fig. 1), comprising:
acquiring a first original picture captured (Fig. 1, ¶[0016], Video input, in the form of image frames is continually received, at 110, and continually processed, via the image processing loop 140-180); adjusting a resolution according to the first original picture acquired to obtain a target picture (¶[0017], the target tracking system may also be configured to adjust the camera's field of view, via 
processing the target picture based on a preset model for human body feature point detection to determine whether the target picture comprises human body information (¶[0016], a target is selected for tracking within the image frames, at 120. After the target is identified, it is modeled for efficient processing, at 130. At block 140, the current image is aligned to a prior image, taking into account any camera adjustments that may have been made, at block 180. After aligning the prior and past images in the image frames, the motion of objects within the frame is determined, at 150. Generally, a target that is being tracked is a moving target, and the identification of independently moving objects improves the efficiency of locating the target, by ignoring background detail. At 160, color matching is used to identify the portion of the image, or the portion of the moving objects in the image, corresponding to the target. Based on the color matching and/or other criteria, such as size, shape, speed of movement, etc., the target is identified in the image, at 170.);
if the target picture comprises the human body information, determining human body area information in the original picture according to the human body information and inputting the human body area information into a filter, enabling that the filter determines target human body area information according to the human body area information (¶[0037] In an automated tracking system, the identification of the target 170 provides location information that facilitates the control 180 of one or more cameras, preferably to keep the target substantially centered in the image, and to maintain a relatively constant focal length to the target. Note, however, that the image alignment technique of this invention is not dependent upon the availability of automated camera control. Once the target is identified the location of the target will be provided such that the camera can focus and center the target in the image.);
acquiring a next original picture captured (¶[0020] One or more cameras 210 provide input to a video processor 220. The video processor 220 processes the images from one or more cameras 210. The system relies on multiple cameras to capture multiple frames of the target.);
and determining a possible human body area in the next original picture according to the target human body area information(¶[0020], The video processor 220 processes the images from one or more cameras 210, and, if configured for target identification stores target characteristics in a memory 250, under the control of a system controller 240. The video processor if configured for target identification will determine whether or not the target is present in the image and store it into memory.);
and performing the step of adjusting the resolution according to the possible human body area (¶[0017] the target tracking system, determines when to "hand-off" the tracking from one camera to another, for example, when the target travels from one camera's field of view to another. In either a single or multi-camera system, the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls.);
	Trajkovic does not explicitly teach wherein the filter determines target human body area information according to the human body area information, comprises: the filter determines the target human body area information corresponding to the human body area information according to the human body area information and historical target human body area information, wherein the historical target human body area information is a target human body area information previously determined according to an original picture.
	Oami teaches wherein the filter determines target human body area information according to the human body area information, comprises: the filter determines the target human body area information corresponding to the human body area information according to the human body area information and historical target human body area information, wherein the historical target human body area information is a target human body area information previously determined according to an original picture (¶[0148] The position estimation unit 2120 estimates a position of each tracking target person at the first time point using the tracking information (S310). The position of the tracking target person shown in the tracking information is a position in the past (for example, a position in the immediately previous video frame 14). Therefore, the position estimation unit 2120 estimates the position of the tracking target person at the first time point from the position of the tracking target person in the past. The examiner interprets the prior art as determining the location of the tracked target person using the previous video frame as a reference position of the tracked person.).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to add the teaching of Oami to Trajkovic in order to determine that the current target area corresponds to the historical information determined from a previous frame. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy and to estimate the position of the person at each time point with high accuracy (Oami, ¶[0175]).
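For illustration only, the filter limitation discussed above (combining the currently detected human body area with historical target human body area information) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Trajkovic, Oami, or the application: the class name `BoxSmoother`, the `alpha` weight, and the use of exponential smoothing are all hypothetical choices, since neither the claim nor the cited paragraphs name a particular filter.

```python
from typing import Optional, Tuple

# A body area as (x, y, width, height) in original-picture pixels (assumed layout).
Box = Tuple[float, float, float, float]


class BoxSmoother:
    """Hypothetical filter: blends each detection with the previous target area."""

    def __init__(self, alpha: float = 0.5) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha                    # weight given to the new detection
        self.history: Optional[Box] = None    # historical target area, if any

    def update(self, detected: Box) -> Box:
        """Return the target area for this frame and store it as history."""
        if self.history is None:
            # No history yet: the first detection becomes the target area.
            target = detected
        else:
            # Weighted average of the new detection and the historical target.
            target = tuple(
                self.alpha * d + (1.0 - self.alpha) * h
                for d, h in zip(detected, self.history)
            )
        self.history = target
        return target
```

With `alpha = 0.5`, a first detection of `(0, 0, 10, 10)` is returned unchanged, and a second detection of `(4, 2, 10, 14)` is averaged with it, so the area moves only halfway toward the new detection.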
Regarding Claim 2, the combination of Trajkovic and Oami teaches the method according to claim 1, where Trajkovic further teaches wherein if the target picture does not comprise the human body information, performing the step of acquiring the next original picture captured (¶[0018], Alternatively, to minimize false-alarms, such a system may be configured to provide a "general" description of potential targets, such as a minimum size or a particular shape, in the target modeling block 130, and detect such a target in the target identification block 170. In like manner, a system may be configured to ignore particular targets, or target types, based on general or specific modeling parameters. The system is configured to detect a certain target and, if that target is not present, will ignore other targets and acquire another image.)
and adjusting a resolution according to the next picture to obtain a target picture (¶[0017], the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls, if any. Alternatively, or additionally, the target tracking system may be configured to notify a security person of the movements of the target, for a manual control of the camera, or selection of cameras.).
Regarding Claim 3, the combination of Trajkovic and Oami teaches the method according to claim 1, where Trajkovic further teaches wherein the adjusting the resolution comprises: performing at least one of compression, stretching, and padding processing on an original picture to obtain a target picture, so that a resolution of the target picture matches a resolution of an input image of the preset model for human body feature point detection (¶[0010] Low resolution images are computed from two consecutive image frames, and feature points are determined and matched between the two low resolution images. Statistical methods are used to estimate the motion in terms of a translation and rotation of the image plane. Corresponding feature points in the original images are matched, based on the estimated motion of the low-resolution images. Statistical techniques are then applied to determine a homography matrix that describes the motion between the corresponding feature points in the original images, and this matrix is used to align the original images. Differences between the aligned images are identified, to indicate the movement of one or more objects in the image.).
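For illustration only, the claimed resolution adjustment (compression or stretching combined with padding so that the target picture matches the model's input resolution) can be sketched as below. This is a hedged sketch, not the applicant's or Trajkovic's implementation: the function name `fit_to_model_input`, the aspect-preserving scaling, and the centered padding are assumptions introduced here.

```python
from typing import Dict


def fit_to_model_input(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Dict[str, float]:
    """Compute the scale and padding that fit a src picture into a dst-sized input.

    The picture is scaled uniformly (compressed if scale < 1, stretched if
    scale > 1) so that it fits inside the model input, and the remaining
    border is filled by padding, split as evenly as possible on both sides.
    """
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    scale = min(dst_w / src_w, dst_h / src_h)   # uniform scale factor
    new_w = round(src_w * scale)                # scaled picture width
    new_h = round(src_h * scale)                # scaled picture height
    pad_x = (dst_w - new_w) // 2                # left padding in pixels
    pad_y = (dst_h - new_h) // 2                # top padding in pixels
    return {"scale": scale, "new_w": new_w, "new_h": new_h,
            "pad_x": pad_x, "pad_y": pad_y}
```

For a 1920x1080 original and an assumed 256x256 model input, the picture is compressed to 256x144 and padded by 56 rows at the top (with the remainder at the bottom) to reach 256x256.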
Regarding Claim 4, the combination of Trajkovic and Oami teaches the method according to claim 1, where Oami further teaches wherein the determining human body area information in the original picture according to the human body information comprises: determining an area where the human body is positioned in the target picture according to the human body information (¶[0064], the detection unit 2020 includes a detector that learns an image feature of the person. The detector detects an image region matching the learned image feature from the video frame 14 as a region (hereinafter, a person region) representing the person 20. For example, a detector that performs detection based on a Histograms of Oriented Gradients (HOG) feature or a detector that uses a Convolutional Neural Network (CNN) can be used as the detector. It should be noted that the detector may be a detector trained to detect the region of the whole person 20 or a detector trained to detect a part of the region of the person 20. For example, in a case where a head part position and a foot position can be detected using the detector that has learned a head part and a foot, the person region can be determined.) and determining coordinate information in the original picture according to the area where the human body is positioned in the target picture (¶[0065] The detector outputs information (hereinafter, detection information) related to the detected person 20. For example, the detection information indicates a position and a size of the person 20. The position of the person 20 in the detection information may be represented as a position on the video frame 14 (for example, coordinates using the upper left end of the video frame 14 as an origin) or may be represented as real world coordinates. Existing technologies can be used as a technology for computing the real world coordinates of an object included in an image generated by a camera. For example, the real world coordinates of the person 20 can be computed from the position on the video frame 14 using a camera parameter.).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to add the teaching of Oami to Trajkovic in order to determine the area where the human body is located in the frame together with its coordinates. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy and to estimate the position of the person at each time point with high accuracy (Oami, ¶[0175]).
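For illustration only, the claimed step of determining coordinate information in the original picture from an area located in the target picture can be sketched as an inverse of a scale-and-pad adjustment. This is a minimal sketch under stated assumptions: the function name `to_original_coords` and the parameters `scale`, `pad_x`, and `pad_y` are hypothetical, and Oami ¶[0065] describes frame or real world coordinates rather than this particular mapping.

```python
from typing import Tuple

# A body area as (x, y, width, height); the layout is an assumption of this sketch.
Box = Tuple[float, float, float, float]


def to_original_coords(box: Box, scale: float, pad_x: float, pad_y: float) -> Box:
    """Map an area found in the target picture back to original-picture pixels.

    Assumes the target picture was produced by scaling the original uniformly
    by `scale` and then shifting it by (`pad_x`, `pad_y`) pixels of padding.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    x, y, w, h = box
    # Remove the padding offset first, then undo the scaling.
    return ((x - pad_x) / scale, (y - pad_y) / scale, w / scale, h / scale)
```

For example, with `scale = 0.5` and 56 pixels of top padding, an area at `(10, 66, 20, 40)` in the target picture corresponds to `(20.0, 20.0, 40.0, 80.0)` in the original picture.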
Regarding Claim 6, the combination Trajkovic and Oami teaches the method according to claim 4, where Oami further teaches wherein the area where the human body is positioned is a rectangular area(¶[0066], the size of the person 20 is represented by a size (for example, lengths of vertical and horizontal edges or an average value thereof) of a circumscribed rectangle (hereinafter, referred to as a person rectangle) of the person or a circumscribed rectangle of a part of the region of the person such as the head part or the foot. This size may be a size in the video frame 14 or a size in a real world.).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to add the teaching of Oami to Trajkovic so that the area where the human body is positioned is a rectangular area. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy and to estimate the position of the person at each time point with high accuracy (Oami, ¶[0175]).
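For illustration only, the notion of a circumscribed rectangle around detected body points (such as a head position and a foot position) can be sketched as below. This is a hedged sketch: the function name `circumscribed_rectangle` and the point-list input are assumptions made here, and Oami ¶[0066] does not specify how its person rectangle is computed.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (x, y) pixel position of one detected body point


def circumscribed_rectangle(points: Iterable[Point]) -> Tuple[float, float, float, float]:
    """Return the smallest axis-aligned rectangle (x, y, width, height) containing all points."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    left, top = min(xs), min(ys)
    # Width and height are the extents of the points along each axis.
    return (left, top, max(xs) - left, max(ys) - top)
```

Given the points `(3, 4)`, `(7, 1)`, and `(5, 9)`, the sketch returns the rectangle `(3, 1, 4, 8)`.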
Regarding Claim 8, Trajkovic teaches a device for identifying human body (Fig. 2), comprising: 
a memory (Fig. 2, 250); a processor (Fig. 2, 220); and a computer program(¶[0020],The tracking system is preferably embodied as a combination of hardware devices and programmed processors.); wherein the computer program is stored in the memory and is configured to be executed by the processor to (¶[0020] The tracking system is preferably embodied as a combination of hardware devices and programmed processors.): acquire a first original picture captured(Fig. 1, ¶[0016], Video input, in the form of image frames is continually received, at 110, and continually processed, via the image processing loop 140-180);
adjust a resolution according to the first original picture acquired to obtain a target picture (¶[0017], the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls, if any. Alternatively, or additionally, the target tracking system may be configured to notify a security person of the movements of the target, for a manual control of the camera, or selection of cameras.);
process the target picture based on a preset model for human body feature point detection to determine whether the target picture comprises human body information (¶[0016], a target is selected for tracking within the image frames, at 120. After the target is identified, it is modeled for efficient processing, at 130. At block 140, the current image is aligned to a prior image, taking into account any camera adjustments that may have been made, at block 180. After aligning the prior and past images in the image frames, the motion of objects within the frame is determined, at 150. Generally, a target that is being tracked is a moving target, and the identification of independently moving objects improves the efficiency of locating the target, by ignoring background detail. At 160, color matching is used to identify the portion of the image, or the portion of the moving objects in the image, corresponding to the target. Based on the color matching and/or other criteria, such as size, shape, speed of movement, etc., the target is identified in the image, at 170.); 
if the target picture comprises the human body information, determine human body area information in the original picture according to the human body information and input the human body area information into a filter, enabling that the filter determines target human body area information according to the human body area information (¶[0037] In an automated tracking system, the identification of the target 170 provides location information that facilitates the control 180 of one or more cameras, preferably to keep the target substantially centered in the image, and to maintain a relatively constant focal length to the target. Note, however, that the image alignment technique of this invention is not dependent upon the availability of automated camera control. Once the target is identified the location of the target will be provided such that the camera can focus and center the target in the image.); 
acquire a next original picture captured(¶[0020] One or more cameras 210 provide input to a video processor 220. The video processor 220 processes the images from one or more cameras 210. The system relies on multiple cameras to capture multiple frames of the target.); 
and determine a possible human body area in the next original picture according to the target human body area information(¶[0020], The video processor 220 processes the images from one or more cameras 210, and, if configured for target identification stores target characteristics in a memory 250, under the control of a system controller 240. The video processor if configured for target identification will determine whether or not the target is present in the image and store it into memory.); 
and perform the step of adjusting the resolution according to the possible human body area (¶[0017] the target tracking system, determines when to "hand-off" the tracking from one camera to another, for example, when the target travels from one camera's field of view to another. In either a single or multi-camera system, the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls.).
	Trajkovic does not explicitly teach wherein the computer program is further configured to be executed by the processor to: the filter determines the target human body area information corresponding to the human body area information according to the human body area information and historical target human body area information, wherein the historical target human body area information is a target human body area information previously determined according to an original picture.
	Oami teaches wherein the computer program is further configured to be executed by the processor to: the filter determines the target human body area information corresponding to the human body area information according to the human body area information and historical target human body area information, wherein the historical target human body area information is a target human body area information previously determined according to an original picture (¶[0148] The position estimation unit 2120 estimates a position of each tracking target person at the first time point using the tracking information (S310). The position of the tracking target person shown in the tracking information is a position in the past (for example, a position in the immediately previous video frame 14). Therefore, the position estimation unit 2120 estimates the position of the tracking target person at the first time point from the position of the tracking target person in the past. The examiner interprets the prior art as determining the location of the tracked target person using the previous video frame as a reference position of the tracked person.).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to add the teaching of Oami to Trajkovic in order to determine that the current target area corresponds to the historical information determined from a previous frame. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy and to estimate the position of the person at each time point with high accuracy (Oami, ¶[0175]).
Regarding Claim 9, the combination of Trajkovic and Oami teaches the device according to claim 8, where Trajkovic further teaches wherein the computer program is further configured to be executed by the processor to: if the target picture does not comprise the human body information, perform the step of acquiring the next original picture captured (¶[0018], Alternatively, to minimize false-alarms, such a system may be configured to provide a "general" description of potential targets, such as a minimum size or a particular shape, in the target modeling block 130, and detect such a target in the target identification block 170. In like manner, a system may be configured to ignore particular targets, or target types, based on general or specific modeling parameters. The system is configured to detect a certain target and, if that target is not present, will ignore other targets and acquire another image.)
and adjusting a resolution according to the next picture to obtain a target picture (¶[0017], the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls, if any. Alternatively, or additionally, the target tracking system may be configured to notify a security person of the movements of the target, for a manual control of the camera, or selection of cameras.).
Regarding Claim 10, the combination of Trajkovic and Oami teaches the device according to claim 8, where Trajkovic further teaches wherein the computer program is further configured to be executed by the processor to: perform at least one of compression, stretching, and padding processing on an original picture to obtain a target picture, so that a resolution of the target picture matches a resolution of an input image of the preset model for human body feature point detection (¶[0010] Low resolution images are computed from two consecutive image frames, and feature points are determined and matched between the two low resolution images. Statistical methods are used to estimate the motion in terms of a translation and rotation of the image plane. Corresponding feature points in the original images are matched, based on the estimated motion of the low-resolution images. Statistical techniques are then applied to determine a homography matrix that describes the motion between the corresponding feature points in the original images, and this matrix is used to align the original images. Differences between the aligned images are identified, to indicate the movement of one or more objects in the image.).
Regarding Claim 15, Trajkovic teaches a non-transitory computer readable storage medium(¶[0020],The tracking system is preferably embodied as a combination of hardware devices and programmed processors.) wherein the computer readable storage medium has a computer program stored thereon, wherein the computer program is executed by a processor to: 
acquire a first original picture captured(Fig. 1, ¶[0016], Video input, in the form of image frames is continually received, at 110, and continually processed, via the image processing loop 140-180);
adjust a resolution according to the first original picture acquired to obtain a target picture (¶[0017], the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls, if any. Alternatively, or additionally, the target tracking system may be configured to notify a security person of the movements of the target, for a manual control of the camera, or selection of cameras.);
process the target picture based on a preset model for human body feature point detection to determine whether the target picture comprises human body information (¶[0016], a target is selected for tracking within the image frames, at 120. After the target is identified, it is modeled for efficient processing, at 130. At block 140, the current image is aligned to a prior image, taking into account any camera adjustments that may have been made, at block 180. After aligning the prior and past images in the image frames, the motion of objects within the frame is determined, at 150. Generally, a target that is being tracked is a moving target, and the identification of independently moving objects improves the efficiency of locating the target, by ignoring background detail. At 160, color matching is used to identify the portion of the image, or the portion of the moving objects in the image, corresponding to the target. Based on the color matching and/or other criteria, such as size, shape, speed of movement, etc., the target is identified in the image, at 170.); 
if the target picture comprises the human body information, determine human body area information in the original picture according to the human body information and input the human body area information into a filter, enabling that the filter determines target human body area information according to the human body area information (¶[0037] In an automated tracking system, the identification of the target 170 provides location information that facilitates the control 180 of one or more cameras, preferably to keep the target substantially centered in the image, and to maintain a relatively constant focal length to the target. Note, however, that the image alignment technique of this invention is not dependent upon the availability of automated camera control. Once the target is identified the location of the target will be provided such that the camera can focus and center the target in the image.); 
acquire a next original picture captured (¶[0020] One or more cameras 210 provide input to a video processor 220. The video processor 220 processes the images from one or more cameras 210. The system relies on multiple cameras to capture multiple frames of the target.); 
and determine a possible human body area in the next original picture according to the target human body area information(¶[0020], The video processor 220 processes the images from one or more cameras 210, and, if configured for target identification stores target characteristics in a memory 250, under the control of a system controller 240. The video processor if configured for target identification will determine whether or not the target is present in the image and store it into memory.); 
and perform the step of adjusting the resolution according to the possible human body area (¶[0017] the target tracking system, determines when to "hand-off" the tracking from one camera to another, for example, when the target travels from one camera's field of view to another. In either a single or multi-camera system, the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls.).
	Trajkovic does not explicitly teach wherein the computer program is executed by a processor to: the filter determines the target human body area information corresponding to the human body area information according to the human body area information and historical target human body area information, wherein the historical target human body area information is a target human body area information previously determined according to an original picture.
	Oami teaches wherein the computer program is executed by a processor to: the filter determines the target human body area information corresponding to the human body area information according to the human body area information and historical target human body area information, wherein the historical target human body area information is a target human body area information previously determined according to an original picture (¶[0148] The position estimation unit 2120 estimates a position of each tracking target person at the first time point using the tracking information (S310). The position of the tracking target person shown in the tracking information is a position in the past (for example, a position in the immediately previous video frame 14). Therefore, the position estimation unit 2120 estimates the position of the tracking target person at the first time point from the position of the tracking target person in the past. The examiner interprets the prior art as determining the location of the tracked target person using the previous video frame as a reference position of the tracked person.).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to add the teaching of Oami to Trajkovic in order to determine that the current target area corresponds to the historical information determined from a previous frame. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy and to estimate the position of the person at each time point with high accuracy (Oami, ¶[0175]).
Regarding Claim 11, the combination of Trajkovic and Oami teaches the device according to claim 8, where Oami further teaches wherein the computer program is further configured to be executed by the processor to: determine an area where the human body is positioned in the target picture according to the human body information (¶[0064], the detection unit 2020 includes a detector that learns an image feature of the person. The detector detects an image region matching the learned image feature from the video frame 14 as a region (hereinafter, a person region) representing the person 20. For example, a detector that performs detection based on a Histograms of Oriented Gradients (HOG) feature or a detector that uses a Convolutional Neural Network (CNN) can be used as the detector. It should be noted that the detector may be a detector trained to detect the region of the whole person 20 or a detector trained to detect a part of the region of the person 20. For example, in a case where a head part position and a foot position can be detected using the detector that has learned a head part and a foot, the person region can be determined.); and determine coordinate information in the original picture according to the area where the human body is positioned in the target picture (¶[0065] The detector outputs information (hereinafter, detection information) related to the detected person 20. For example, the detection information indicates a position and a size of the person 20. The position of the person 20 in the detection information may be represented as a position on the video frame 14 (for example, coordinates using the upper left end of the video frame 14 as an origin) or may be represented as real world coordinates. Existing technologies can be used as a technology for computing the real world coordinates of an object included in an image generated by a camera. For example, the real world coordinates of the person 20 can be computed from the position on the video frame 14 using a camera parameter.).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to add the teaching of Oami to Trajkovic in order to determine the area where the human body is located in the frame together with its coordinates. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy and to estimate the position of the person at each time point with high accuracy (Oami, ¶[0175]).
Regarding Claim 13, the combination of Trajkovic and Oami teaches the device according to claim 11, where Oami further teaches wherein the area where the human body is positioned is a rectangular area (¶[0066], the size of the person 20 is represented by a size (for example, lengths of vertical and horizontal edges or an average value thereof) of a circumscribed rectangle (hereinafter, referred to as a person rectangle) of the person or a circumscribed rectangle of a part of the region of the person such as the head part or the foot. This size may be a size in the video frame 14 or a size in a real world.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Oami to Trajkovic in order for the area where the human body is positioned to be a rectangular area. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy, so that the position of the person at each time can be estimated with high accuracy. (Oami, ¶[0175])
Regarding Claim 16, the combination of Trajkovic and Oami teaches the non-transitory computer readable storage medium according to claim 15, where Trajkovic further teaches wherein the computer program is further configured to be executed by the processor to: if the target picture does not comprise the human body information, perform the step of acquiring the next original picture captured (¶[0018], Alternatively, to minimize false-alarms, such a system may be configured to provide a "general" description of a potential targets, such as a minimum size or a particular shape, in the target modeling block 130, and detect such a target in the target identification block 170. In like manner, a system may be configured to ignore particular targets, or target types, based on general or specific modeling parameters. The examiner interprets that the system is configured to detect a certain target and, if that target is not present, will ignore other targets and acquire another image.); and adjusting a resolution according to the next picture to obtain a target picture (¶[0017], the target tracking system may also be configured to adjust the camera's field of view, via control of the camera's pan, tilt, and zoom controls, if any. Alternatively, or additionally, the target tracking system may be configured to notify a security person of the movements of the target, for a manual control of the camera, or selection of cameras.).
Regarding Claim 17, the combination of Trajkovic and Oami teaches the non-transitory computer readable storage medium according to claim 15, where Trajkovic further teaches wherein the computer program is executed by a processor to: perform at least one of compression, stretching, and padding processing on an original picture to obtain a target picture, so that a resolution of the target picture matches a resolution of an input image of the preset model for human body feature point detection (¶[0010], Low resolution images are computed from two consecutive image frames, and feature points are determined and matched between the two low resolution images. Statistical methods are used to estimate the motion in terms of a translation and rotation of the image plane. Corresponding feature points in the original images are matched, based on the estimated motion of the low-resolution images. Statistical techniques are then applied to determine a homography matrix that describes the motion between the corresponding feature points in the original images, and this matrix is used to align the original images. Differences between the aligned images are identified, to indicate the movement of one or more objects in the image.).
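For illustration only, the claimed limitation of matching the target picture's resolution to the model's input resolution by compression, stretching, and padding can be sketched as follows. This sketch is not taken from the application, Trajkovic, or Oami; the function names, the aspect-ratio-preserving scale, and the centered padding are assumptions made for the example.

```python
from typing import Tuple


def letterbox_params(src_w: int, src_h: int,
                     dst_w: int, dst_h: int) -> Tuple[float, int, int]:
    """Return (scale, pad_x, pad_y) for fitting a src picture into a dst picture.

    Hypothetical helper: scale < 1 corresponds to compression, scale > 1 to
    stretching, and pad_x / pad_y are the padding added on the left and top.
    """
    # Use a single scale factor so the aspect ratio is preserved.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Center the scaled picture; the remaining border is padding.
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return scale, pad_x, pad_y


def target_to_original(x: float, y: float, scale: float,
                       pad_x: int, pad_y: int) -> Tuple[float, float]:
    """Map a point in the target picture back to original-picture coordinates."""
    return (x - pad_x) / scale, (y - pad_y) / scale


if __name__ == "__main__":
    # Assumed sizes: a 1920x1080 original and a 256x256 model input.
    scale, pad_x, pad_y = letterbox_params(1920, 1080, 256, 256)
    print(scale, pad_x, pad_y)
    print(target_to_original(128, 128, scale, pad_x, pad_y))
```

Under these assumptions, the same scale and padding values that produce the target picture are sufficient to convert a position found in the target picture back into the original picture.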
Regarding Claim 18, the combination of Trajkovic and Oami teaches the non-transitory computer readable storage medium according to claim 15, where Oami further teaches wherein the computer program is executed by a processor to: determine an area where the human body is positioned in the target picture according to the human body information (¶[0064], the detection unit 2020 includes a detector that learns an image feature of the person. The detector detects an image region matching the learned image feature from the video frame 14 as a region (hereinafter, a person region) representing the person 20. For example, a detector that performs detection based on a Histograms of Oriented Gradients (HOG) feature or a detector that uses a Convolutional Neural Network (CNN) can be used as the detector. It should be noted that the detector may be a detector trained to detect the region of the whole person 20 or a detector trained to detect a part of the region of the person 20. For example, in a case where a head part position and a foot position can be detected using the detector that has learned a head part and a foot, the person region can be determined.); and determine coordinate information in the original picture according to the area where the human body is positioned in the target picture (¶[0065], The detector outputs information (hereinafter, detection information) related to the detected person 20. For example, the detection information indicates a position and a size of the person 20. The position of the person 20 in the detection information may be represented as a position on the video frame 14 (for example, coordinates using the upper left end of the video frame 14 as an origin) or may be represented as real world coordinates. Existing technologies can be used as a technology for computing the real world coordinates of an object included in an image generated by a camera. For example, the real world coordinates of the person 20 can be computed from the position on the video frame 14 using a camera parameter.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Oami to Trajkovic in order to determine the area where the human body is located in the frame, together with its coordinates. One skilled in the art would have been motivated to modify Trajkovic in this manner in order to track the person with higher accuracy, so that the position of the person at each time can be estimated with high accuracy. (Oami, ¶[0175])
Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Trajkovic US PG-Pub (US 20020167537 A1) in view of Oami US PG-Pub (US 20200285845 A1), and further in view of Marman et al. US PG-Pub (US 20120062732 A1).
Regarding Claim 5, the combination of Trajkovic and Oami teaches the method according to claim 4, where Trajkovic further teaches wherein the determining coordinate information in the original picture according to the area where the human body is positioned in the target picture comprises: determining an original human body area in the original picture according to the area where the human body is positioned in the target picture (¶[0037], In an automated tracking system, the identification of the target 170 provides location information that facilitates the control 180 of one or more cameras, preferably to keep the target substantially centered in the image, and to maintain a relatively constant focal length to the target).
Trajkovic and Oami do not explicitly teach enlarging the original human body area by a preset multiple times to obtain an enlarged area; and determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information.
Marman teaches enlarging the original human body area by a preset multiple times to obtain an enlarged area (¶[0083], Video player object 1105 is operable to compute coordinates for multiple crop rectangles when the user desires to see multiple zoomed-in tracking windows as shown in FIG. 9, for example. Video player object 1105 may also be operable to determine when to combine or split crop rectangles (e.g., when objects move close to one another or when objects move away from one another). The examiner interprets that the prior art is capable of generating zoomed-in tracking windows of the target object.); and determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information (¶[0086], The ideal crop rectangle is a crop rectangle that should be reached at the end of the smoothing period. For each frame, however, video player object 1105 computes an actual crop rectangle that is used for creating the cropped close-up image. The actual crop rectangle is a linear transformation of the prior frame's actual crop rectangle. Moreover, because tracked objects are typically in a state of movement, the ideal crop rectangle is recalculated at each frame. In one example, the actual crop rectangle includes four coordinates: X coordinate of the top-left corner of the actual crop rectangle; the Y coordinate of the top-left corner of the actual crop rectangle; the X coordinate of the bottom-right corner of the actual crop rectangle; and the Y coordinate of the bottom-right corner of the actual crop rectangle. The examiner interprets that, in the enlarged area, the prior art obtains the x and y coordinates of the corners of the crop rectangle.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Marman to Trajkovic and Oami in order to enlarge the area of interest and determine the x and y coordinates of the corners. One skilled in the art would have been motivated to modify Trajkovic and Oami in this manner in order to track multiple objects individually present at a scene. (Marman, ¶[0025])
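For illustration only, the enlargement step and the three alternative forms of coordinate information recited in the claim can be sketched as follows. This sketch is not taken from the application or from Marman; the names, the choice to enlarge about the center of the area, and the clamping to the picture bounds are assumptions made for the example.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def enlarge_box(box: Box, multiple: float, img_w: int, img_h: int) -> Box:
    """Enlarge a box about its center by a preset multiple (hypothetical helper)."""
    cx = (box.left + box.right) / 2.0
    cy = (box.top + box.bottom) / 2.0
    half_w = (box.right - box.left) * multiple / 2.0
    half_h = (box.bottom - box.top) * multiple / 2.0
    # Clamp to the picture so the enlarged area stays inside it (an assumption).
    return Box(max(0.0, cx - half_w), max(0.0, cy - half_h),
               min(float(img_w), cx + half_w), min(float(img_h), cy + half_h))


def center(box: Box) -> Point:
    return (box.left + box.right) / 2.0, (box.top + box.bottom) / 2.0


def upper_left_and_lower_right(box: Box) -> Tuple[Point, Point]:
    return (box.left, box.top), (box.right, box.bottom)


def upper_left_and_center(box: Box) -> Tuple[Point, Point]:
    return (box.left, box.top), center(box)


def lower_right_and_center(box: Box) -> Tuple[Point, Point]:
    return (box.right, box.bottom), center(box)


if __name__ == "__main__":
    # Assumed values: a 100x200 area in a 1000x1000 picture, enlarged 1.5 times.
    enlarged = enlarge_box(Box(100, 100, 200, 300), 1.5, 1000, 1000)
    print(upper_left_and_lower_right(enlarged))
    print(upper_left_and_center(enlarged))
    print(lower_right_and_center(enlarged))
```

Each of the three pairs carries the same information, since any two of the upper left corner, the lower right corner, and the center point determine the rectangle.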
Regarding Claim 12, the combination of Trajkovic and Oami teaches the device according to claim 8, where Trajkovic further teaches wherein the computer program is further configured to be executed by the processor to: determine an original human body area in the original picture according to the area where the human body is positioned in the target picture (¶[0037], In an automated tracking system, the identification of the target 170 provides location information that facilitates the control 180 of one or more cameras, preferably to keep the target substantially centered in the image, and to maintain a relatively constant focal length to the target).
Trajkovic and Oami do not explicitly teach enlarging the original human body area by a preset multiple times to obtain an enlarged area; and determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information.
Marman teaches enlarging the original human body area by a preset multiple times to obtain an enlarged area (¶[0083], Video player object 1105 is operable to compute coordinates for multiple crop rectangles when the user desires to see multiple zoomed-in tracking windows as shown in FIG. 9, for example. Video player object 1105 may also be operable to determine when to combine or split crop rectangles (e.g., when objects move close to one another or when objects move away from one another). The examiner interprets that the prior art is capable of generating zoomed-in tracking windows of the target object.); and determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information (¶[0086], The ideal crop rectangle is a crop rectangle that should be reached at the end of the smoothing period. For each frame, however, video player object 1105 computes an actual crop rectangle that is used for creating the cropped close-up image. The actual crop rectangle is a linear transformation of the prior frame's actual crop rectangle. Moreover, because tracked objects are typically in a state of movement, the ideal crop rectangle is recalculated at each frame. In one example, the actual crop rectangle includes four coordinates: X coordinate of the top-left corner of the actual crop rectangle; the Y coordinate of the top-left corner of the actual crop rectangle; the X coordinate of the bottom-right corner of the actual crop rectangle; and the Y coordinate of the bottom-right corner of the actual crop rectangle. The examiner interprets that, in the enlarged area, the prior art obtains the x and y coordinates of the corners of the crop rectangle.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Marman to Trajkovic and Oami in order to enlarge the area of interest and determine the x and y coordinates of the corners. One skilled in the art would have been motivated to modify Trajkovic and Oami in this manner in order to track multiple objects individually present at a scene. (Marman, ¶[0025])
Regarding Claim 19, the combination of Trajkovic and Oami teaches the non-transitory computer readable storage medium according to claim 15, where Trajkovic further teaches wherein the computer program is executed by a processor to: determine an original human body area in the original picture according to the area where the human body is positioned in the target picture (¶[0037], In an automated tracking system, the identification of the target 170 provides location information that facilitates the control 180 of one or more cameras, preferably to keep the target substantially centered in the image, and to maintain a relatively constant focal length to the target).
Trajkovic and Oami do not explicitly teach enlarging the original human body area by a preset multiple times to obtain an enlarged area; and determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information.
Marman teaches enlarging the original human body area by a preset multiple times to obtain an enlarged area (¶[0083], Video player object 1105 is operable to compute coordinates for multiple crop rectangles when the user desires to see multiple zoomed-in tracking windows as shown in FIG. 9, for example. Video player object 1105 may also be operable to determine when to combine or split crop rectangles (e.g., when objects move close to one another or when objects move away from one another). The examiner interprets that the prior art is capable of generating zoomed-in tracking windows of the target object.); and determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of an upper left corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information, or determining a horizontal coordinate and a vertical coordinate of a lower right corner of the enlarged area and a horizontal coordinate and a vertical coordinate of a center point of the enlarged area as the coordinate information (¶[0086], The ideal crop rectangle is a crop rectangle that should be reached at the end of the smoothing period. For each frame, however, video player object 1105 computes an actual crop rectangle that is used for creating the cropped close-up image. The actual crop rectangle is a linear transformation of the prior frame's actual crop rectangle. Moreover, because tracked objects are typically in a state of movement, the ideal crop rectangle is recalculated at each frame. In one example, the actual crop rectangle includes four coordinates: X coordinate of the top-left corner of the actual crop rectangle; the Y coordinate of the top-left corner of the actual crop rectangle; the X coordinate of the bottom-right corner of the actual crop rectangle; and the Y coordinate of the bottom-right corner of the actual crop rectangle. The examiner interprets that, in the enlarged area, the prior art obtains the x and y coordinates of the corners of the crop rectangle.).
It would have been obvious at the time of filing to one of ordinary skill in the art to add the teaching of Marman to Trajkovic and Oami in order to enlarge the area of interest and determine the x and y coordinates of the corners. One skilled in the art would have been motivated to modify Trajkovic and Oami in this manner in order to track multiple objects individually present at a scene. (Marman, ¶[0025])
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAN D HOANG whose telephone number is (571)272-4344.  The examiner can normally be reached on Monday-Friday 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire X. Wang, can be reached on (571) 270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/HAN HOANG/Examiner, Art Unit 2663          

/CLAIRE X WANG/Supervisory Patent Examiner, Art Unit 2663