DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
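For illustration only, the three-prong test above can be sketched as a simple predicate. The sketch is not part of MPEP § 2181; the class and field names are hypothetical, and each flag stands for a conclusion reached by reading the claim limitation, not something the code determines.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    # Hypothetical flags set by a reader after analyzing the claim limitation.
    uses_means_or_placeholder: bool      # prong (A): "means", "step", or a generic placeholder
    modified_by_function: bool           # prong (B): placeholder is tied to functional language
    recites_sufficient_structure: bool   # prong (C) fails when this is True


def meets_three_prong_test(lim: Limitation) -> bool:
    """Return True only when prongs (A), (B), and (C) are all met."""
    return (
        lim.uses_means_or_placeholder
        and lim.modified_by_function
        and not lim.recites_sufficient_structure
    )
```

Under this sketch, a limitation flagged as a generic placeholder with functional language and no sufficient structure satisfies all three prongs; setting `recites_sufficient_structure` to `True` defeats prong (C).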
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “person storage part that stores”, “image acquisition part that acquires”, “detection region setting part that sets”, “person detection part that detects”, “person region acquisition part that acquires”, “partial image acquisition part that acquires” and “detection image generation part that generates” in claims 13-16.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-9, 11 and 13-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Cheng et al. (US 2020/0401812).
Regarding claim 1, Cheng et al. discloses a non-transitory computer readable medium storing an image processing program causing a computer to execute image processing, the image processing program causing the computer to execute: 
a person storage step of storing a person region that is a region that includes a person detected in a frame image and is smaller than the frame image (“determining the position range CX of the target object in the current frame of image according to the position range CX-1 of the target object in the previous frame of image without using the first-stage neural network” at paragraph 0090, line 2; “In this embodiment of this disclosure, a hand or a face is used just as an example of the target object for description. The target object may be alternatively any other type of object. For example, the target object may be another part such as a foot, or the target object may be an entire human body” at paragraph 0068, line 1); 
an image acquisition step of acquiring the frame image from a moving image (“Specifically, the position range CX-1 of the target object in the previous frame of image may be directly used as the position range CX of the target object in the current frame of image” at paragraph 0090, line 6; the range CX is interpreted as a frame image as a specific subset of the current frame); 
a detection region setting step of setting a detection region that is a region based on the person region stored in the person storage step in the frame image acquired in the image acquisition step (the detection region is therefore the range CX as designated for the current frame); and 
a person detection step of detecting the person from the detection region set in the detection region setting step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
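For illustration only, the four steps mapped above (storing a person region, acquiring a frame, setting a detection region from the stored region, and detecting within that region) can be sketched as follows. This is a minimal sketch and is not taken from Cheng et al. or from the application: the brightness-threshold detector is a hypothetical stand-in for the second-stage neural network, and the `margin` parameter and all function names are assumptions.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive
Frame = List[List[int]]          # grayscale frame as rows of pixel values


def detect_in_region(frame: Frame, region: Box, threshold: int = 128) -> Optional[Box]:
    """Hypothetical detector: bounding box of bright pixels found inside the region."""
    left, top, right, bottom = region
    xs, ys = [], []
    for y in range(top, bottom):
        for x in range(left, right):
            if frame[y][x] >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def expand(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a box by an assumed margin, clamped to the frame bounds."""
    left, top, right, bottom = box
    return (max(0, left - margin), max(0, top - margin),
            min(width, right + margin), min(height, bottom + margin))


def track(frames: List[Frame], initial_region: Box, margin: int = 1) -> List[Optional[Box]]:
    stored = initial_region                                   # person storage
    results = []
    for frame in frames:                                      # image acquisition
        height, width = len(frame), len(frame[0])
        detection_region = expand(stored, margin, width, height)  # detection region setting
        found = detect_in_region(frame, detection_region)     # person detection
        results.append(found)
        if found is not None:
            stored = found                                    # reuse the result for the next frame
    return results
```

With `margin=0` the stored region is used directly as the detection region, which corresponds to the cited passage in which the position range of the previous frame is used as the position range of the current frame.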
Regarding claim 2, Cheng et al. discloses a computer readable medium 
wherein the computer further executes a person region acquisition step of acquiring a person region based on the person detected in the person detection step (as noted above, the region of the hand, face, foot or entire body is detected), and 
wherein the person storage step is a step of storing the person region set in the person region acquisition step (the region is at least temporarily stored for analysis of the next frame to determine if the position range can be maintained).
Regarding claim 3, Cheng et al. discloses a computer readable medium 
wherein the computer further executes 
a partial image acquisition step of acquiring a partial image that is an image based on the person region stored in the person storage step from the frame image acquired in the image acquisition step (the range CX is interpreted as a partial image as a specific subset of the current frame that is used for subsequent detection), and 
a detection image generation step of generating a detection image that is an image based on the partial image acquired in the partial image acquisition step (as noted previously, the detection is performed on the specific range CX), and 
wherein the person detection step is a step of detecting the person from the detection image generated in the detection image generation step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
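For illustration only, acquiring a partial image from the stored person region and generating a detection image from it can be sketched as a crop followed by an enlargement. This is a minimal sketch under stated assumptions: neither Cheng et al. nor the claim language quoted above specifies how the detection image is generated, so the nearest-neighbor enlargement and the function names `crop` and `upscale` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive
Image = List[List[int]]          # grayscale image as rows of pixel values


def crop(frame: Image, box: Box) -> Image:
    """Partial image acquisition: copy the pixels of the frame that fall inside the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def upscale(image: Image, factor: int = 2) -> Image:
    """Assumed detection image generation: nearest-neighbor enlargement of the partial image."""
    out: Image = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out
```

A detector would then be run on `upscale(crop(frame, stored_region))` in place of the full frame.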
Regarding claim 4, Cheng et al. discloses a computer readable medium 
wherein the computer further executes 
a partial image acquisition step of acquiring a partial image that is an image based on the person region stored in the person storage step from the frame image acquired in the image acquisition step (the range CX is interpreted as a partial image as a specific subset of the current frame that is used for subsequent detection), and 
a detection image generation step of generating a detection image that is an image based on the partial image acquired in the partial image acquisition step (as noted previously, the detection is performed on the specific range CX), and 
wherein the person detection step is a step of detecting the person from the detection image generated in the detection image generation step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 5, Cheng et al. discloses a computer readable medium 
wherein the person detection step is a step of detecting the person from the detection image generated in the detection image generation step and detecting the person from the detection region set in the detection region setting step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 6, Cheng et al. discloses a computer readable medium 
wherein the person detection step is a step of detecting the person from the detection image generated in the detection image generation step and detecting the person from the detection region set in the detection region setting step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 7, Cheng et al. discloses a computer readable medium 
wherein the computer further executes a range setting step of setting a specific range including the person in the frame image acquired in the image acquisition step (“Specifically, the position range CX-1 of the target object in the previous frame of image may be directly used as the position range CX of the target object in the current frame of image” at paragraph 0090, line 6), and 
wherein the person detection step is a step of detecting the person present in a region that satisfies the specific range set in the range setting step and the detection region set in the detection region setting step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
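For illustration only, a region that satisfies both the specific range and the detection region can be sketched as the intersection of two rectangles. The rectangular representation and the function name `intersect` are assumptions made for this sketch and are not drawn from Cheng et al.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None when they do not overlap."""
    left = max(a[0], b[0])
    top = max(a[1], b[1])
    right = min(a[2], b[2])
    bottom = min(a[3], b[3])
    if left >= right or top >= bottom:
        return None
    return (left, top, right, bottom)
```

Detection would then be limited to the returned box; a `None` result means no part of the frame satisfies both conditions.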
Regarding claim 8, Cheng et al. discloses a computer readable medium 
wherein the range setting step is a step of setting, as the specific range, a range in which a specific body part of the person in the frame image acquired in the image acquisition step is located (“In this embodiment of this disclosure, a hand or a face is used just as an example of the target object for description. The target object may be alternatively any other type of object. For example, the target object may be another part such as a foot” at paragraph 0068, line 1), and 
wherein the person detection step is a step of detecting the person whose specific body part is present in the region that satisfies the specific range set in the range setting step and the detection region set in the detection region setting step (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 9, Cheng et al. discloses a computer readable medium 
wherein the computer further executes a non-specific range setting step of setting a non-specific range that is a range not specified as a person region in the frame image acquired in the image acquisition step (though not explicit, areas that do not fall in the range CX are not considered for detection), and 
wherein the person detection step is a step of detecting the person from a region in which the non-specific range set in the non-specific range setting step is excluded from the detection region set in the detection region setting step (detection is done only on the detection range CX as described above, so areas outside of that are considered to be excluded).
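For illustration only, excluding a non-specific range from the detection region can be sketched as a per-pixel mask. The mask representation and the names `contains` and `detection_mask` are assumptions made for this sketch; Cheng et al. does not describe an explicit mask.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def contains(box: Box, x: int, y: int) -> bool:
    """True when pixel (x, y) lies inside the box."""
    left, top, right, bottom = box
    return left <= x < right and top <= y < bottom


def detection_mask(width: int, height: int, detection_region: Box,
                   excluded_ranges: List[Box]) -> List[List[bool]]:
    """Mark pixels that are inside the detection region and outside every excluded range."""
    return [
        [
            contains(detection_region, x, y)
            and not any(contains(e, x, y) for e in excluded_ranges)
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Only pixels marked `True` would be passed to the detector, so anything inside an excluded range is never examined.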
Regarding claim 11, Cheng et al. discloses a computer readable medium wherein the range setting step is a step of setting feet of the person as the specific body part (“In this embodiment of this disclosure, a hand or a face is used just as an example of the target object for description. The target object may be alternatively any other type of object. For example, the target object may be another part such as a foot” at paragraph 0068, line 1).
Regarding claim 13, Cheng et al. discloses an image processing apparatus comprising: 
a person storage part that stores a person region that is a region that includes a person detected in a frame image and is smaller than the frame image (“determining the position range CX of the target object in the current frame of image according to the position range CX-1 of the target object in the previous frame of image without using the first-stage neural network” at paragraph 0090, line 2; “In this embodiment of this disclosure, a hand or a face is used just as an example of the target object for description. The target object may be alternatively any other type of object. For example, the target object may be another part such as a foot, or the target object may be an entire human body” at paragraph 0068, line 1); 
an image acquisition part that acquires the frame image from a moving image (“Specifically, the position range CX-1 of the target object in the previous frame of image may be directly used as the position range CX of the target object in the current frame of image” at paragraph 0090, line 6; the range CX is interpreted as a frame image as a specific subset of the current frame); 
a detection region setting part that sets a detection region that is a region based on the person region stored in the person storage part in the frame image acquired in the image acquisition part (the detection region is therefore the range CX as designated for the current frame); and 
a person detection part that detects the person from the detection region set by the detection region setting part (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 14, Cheng et al. discloses an apparatus further comprising:
a person region acquisition part that acquires a person region based on the person detected by the person detection part (as noted above, the region of the hand, face, foot or entire body is detected), and 
wherein the person storage part stores the person region set by the person region acquisition part (the region is at least temporarily stored for analysis of the next frame to determine if the position range can be maintained).
Regarding claim 15, Cheng et al. discloses an apparatus further comprising: 
a partial image acquisition part that acquires a partial image that is an image based on the person region stored in the person storage part from the frame image acquired by the image acquisition part (the range CX is interpreted as a partial image as a specific subset of the current frame that is used for subsequent detection), and 
a detection image generation part that generates a detection image that is an image based on the partial image acquired by the partial image acquisition part (as noted previously, the detection is performed on the specific range CX), and 
wherein the person detection part detects the person from the detection image generated by the detection image generation part (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 16, Cheng et al. discloses an apparatus further comprising: 
a partial image acquisition part that acquires a partial image that is an image based on the person region stored in the person storage part from the frame image acquired by the image acquisition part (the range CX is interpreted as a partial image as a specific subset of the current frame that is used for subsequent detection), and 
a detection image generation part that generates a detection image that is an image based on the partial image acquired by the partial image acquisition part (as noted previously, the detection is performed on the specific range CX), and 
wherein the person detection part detects the person from the detection image generated by the detection image generation part (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 17, Cheng et al. discloses an image processing method comprising: 
storing a person region that is a region that includes a person detected in a frame image and is smaller than the frame image (“determining the position range CX of the target object in the current frame of image according to the position range CX-1 of the target object in the previous frame of image without using the first-stage neural network” at paragraph 0090, line 2; “In this embodiment of this disclosure, a hand or a face is used just as an example of the target object for description. The target object may be alternatively any other type of object. For example, the target object may be another part such as a foot, or the target object may be an entire human body” at paragraph 0068, line 1); 
acquiring the frame image from a moving image (“Specifically, the position range CX-1 of the target object in the previous frame of image may be directly used as the position range CX of the target object in the current frame of image” at paragraph 0090, line 6; the range CX is interpreted as a frame image as a specific subset of the current frame); 
setting a detection region that is a region based on the stored person region in the acquired frame image (the detection region is therefore the range CX as designated for the current frame); and 
detecting the person from the set detection region (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 18, Cheng et al. discloses a method further comprising:
acquiring a person region based on the person detected from the set detection region (as noted above, the region of the hand, face, foot or entire body is detected), and 
wherein the storing the person region comprises storing the acquired person region (the region is at least temporarily stored for analysis of the next frame to determine if the position range can be maintained).
Regarding claim 19, Cheng et al. discloses a method further comprising: 
acquiring a partial image that is an image based on the stored person region from the acquired frame image (the range CX is interpreted as a partial image as a specific subset of the current frame that is used for subsequent detection), and 
generating a detection image that is an image based on the acquired partial image (as noted previously, the detection is performed on the specific range CX), and 
wherein the detecting the person from the detection image comprises detecting the person from the generated detection image (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).
Regarding claim 20, Cheng et al. discloses a method further comprising: 
acquiring a partial image that is an image based on the stored person region from the acquired frame image (the range CX is interpreted as a partial image as a specific subset of the current frame that is used for subsequent detection), and 
generating a detection image that is an image based on the acquired partial image (as noted previously, the detection is performed on the specific range CX), and 
wherein the detecting the person from the detection image comprises detecting the person from the generated detection image (“In S140, target object recognition is performed on the current frame of image according to the position range CX of the target object in the current frame of image by using the second-stage neural network to obtain a target object recognition result RX of the current frame of image” at paragraph 0092).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 10 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cheng et al. and Wu et al. (US 7,949,188).
Regarding claim 10, Cheng et al. discloses a computer readable medium as described in claim 7 above.
Cheng et al. does not explicitly disclose that the person is a performer, and wherein the range setting step is a step of setting, as the specific range, a stage on which the person performs in the frame image acquired in the image acquisition step.
Wu et al. teaches a computer readable medium
wherein the person is a performer (“If the images of a concert, are adopted as a moving image content, the stage of the concert may be found noteworthy and the image area corresponding to the stage may be detected as the area of interest” at col. 8, line 24; “In the example of FIG. 27, a player's face 175 is extracted as an object feature” at col. 16, line 17; as the concert stage is a region of interest and a performer is analogous to a sports player in this embodiment, the person can therefore be a performer), and 
wherein the range setting step is a step of setting, as the specific range, a stage on which the person performs in the frame image acquired in the image acquisition step (as shown above, the stage is the area of interest).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to utilize the person detection of Cheng et al. in a concert setting as taught by Wu et al., because Wu et al. demonstrates that person detection is applicable to many settings, such as sports, TV, music, etc.
Regarding claim 12, Cheng et al. discloses a computer readable medium as described in claim 9 above.
Cheng et al. does not explicitly disclose that the person is a performer, and wherein the non-specific range setting step is a step of setting, as the non-specific range, a screen provided around a stage on which the person performs in the frame image acquired in the image acquisition step.
Wu et al. teaches a computer readable medium
wherein the person is a performer (“If the images of a concert, are adopted as a moving image content, the stage of the concert may be found noteworthy and the image area corresponding to the stage may be detected as the area of interest” at col. 8, line 24; “In the example of FIG. 27, a player's face 175 is extracted as an object feature” at col. 16, line 17; as the concert stage is a region of interest and a performer is analogous to a sports player in this embodiment, the person can therefore be a performer), and 
wherein the non-specific range setting step is a step of setting, as the non-specific range, a screen provided around a stage on which the person performs in the frame image acquired in the image acquisition step (as the stage is designated as area of interest, portions outside of that are not considered for further feature extraction; therefore, if there is a screen outside of the stage area, it will be excluded from further processing).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to utilize the person detection of Cheng et al. in a concert setting as taught by Wu et al., because Wu et al. demonstrates that person detection is applicable to many settings, such as sports, TV, music, etc. Furthermore, by excluding areas outside the stage, the system is able to limit the further processing to only areas that would likely contain the persons of interest.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATRINA R FUJITA whose telephone number is (571) 270-1574. The examiner can normally be reached Monday - Friday, 9:30 am - 5:30 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KATRINA R FUJITA/Primary Examiner, Art Unit 2662