DETAILED ACTION
*Note in the following document:
1. Texts in italic bold format are limitations quoted either directly or conceptually from claims/descriptions disclosed in the instant application.
2. Texts in regular italic format are quoted directly from cited reference or Applicant’s arguments.
3. Texts with underlining are added by the Examiner for emphasis.
4. Texts with 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 16 September 2022 has been entered.

 Status of Claims
This is in response to applicant’s amendment/response filed on 16 September 2022, which has been entered and made of record.  Claims 1, 6, 11-12, 15 and 17-20 have been amended.  Claims 3-4 and 13-14 have been cancelled.  No claim has been added.  Claims 1-2, 5-12 and 15-20 are pending in the application.
	
	Response to Arguments
Applicant's arguments with respect to the 35 U.S.C. §112(f) claim interpretation, see pp. 7-8, filed on 16 September 2022, have been fully considered and are persuasive.  The previous 35 U.S.C. §112(f) claim interpretation is withdrawn in view of the amendments to the related claims.
Applicant's arguments with respect to the claim objection, see p. 8, filed on 16 September 2022, have been fully considered and are persuasive.  The previous objection to the claims is withdrawn in view of the amendments to the related claims.
Applicant’s arguments, see pp. 8-11, filed on 16 September 2022, with respect to the rejections of independent Claims 1 and 11 and their dependent claims under 35 USC §103 have been fully considered but are moot because the arguments do not apply to any of the references being used in the current rejection.  The newly amended Claims 1 and 11 are now rejected under 35 USC §103 as set forth in the detailed rejections below.
	

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 16 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 16 recites the limitation "The intelligent device of claim 14" in line 1.  There is insufficient antecedent basis for this limitation in the claim since Claim 14 has been cancelled.  For examination purposes, the examiner assumes Claim 16 is intended to depend on Claim 11.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-2, 5-7, 11-12 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Park (US 2018/0144194 A1) in view of Harada (US 2013/0289992 A1) and Fink et al. (US 2020/0065589 A1).
Regarding Claim 1, Park discloses a method ([0001]: The present invention relates to a method and an apparatus for classifying videos) of controlling an intelligent device ([0013]: intelligent multilayer classification), the method comprising:
extracting sound information from a video ([0035]: The audio extractor 100 extracts audio data from inputted AV (Audio & Video) stream and [0039]: the audio signal classifier 200 receives data corresponding to the audio signal); 
learning the obtained sound information and recognizing a sound based on a result of the learning ([0044]: the composite feature decision unit 220 can process acoustic feature information for each time interval obtained from the acoustic feature extracting unit 210 as primary feature data, and decide composite feature information on the basis of the primary feature data.  [0050]: the category determination unit 230 determines classification category of the audio data.  Note that Park does not explicitly use the phrase learning the obtained sound information.  However, it would have been obvious to a person having ordinary skill in the art (POSITA) before the effective filing date of the claimed invention that processing the acoustic feature information and deciding the composite feature information constitute a learning process, since Park teaches or suggests using a neural network, neural networks are used for artificial intelligence (AI), and machine learning is a branch of AI); and 
classifying the image based on the recognized sound (Fig.1 and [0057]: the video classifier 300 can process the detailed classification of the video on the basis of the broad category information classified by the audio signal classifier 200);
wherein the learning the obtained sound information and the recognizing the sound based on the result of the learning includes: extracting a feature value from the sound information ([0017]: According to embodiments of the present invention, videos can be primarily classified using composite features of audio signals).
But Park does not explicitly recite analyzing, based on the feature value, whether a state of the recognized sound is a clear state or an unclear state.
However, Harada discloses a voice recognition method that includes: detecting a vocal section including a vocal sound in a voice, based on a feature value of an audio signal representing the voice; identifying a word expressed by the vocal sound in the vocal section, by matching the feature value of the audio signal of the vocal section and an acoustic model of each of a plurality of words; and selecting, with a processor, the word expressed by the vocal sound in a word section based on a comparison result between a signal characteristic of the word section and a signal characteristic of the vocal section ([0009]).  Harada teaches that the selection of a word is based on finding matching words by comparing S/N values of the feature value of the audio signal and the acoustic model (Fig.1 and [0042]: the selection unit 18 may select a word expressed by the vocal sound of the word section having an SNR which is not lower than a lower limit threshold value and not higher than an upper limit threshold value with respect to the SNR of the vocal section).  Harada further discloses that a matching word may or may not be obtained, and that if the word is not selected by the selection unit 18, the output unit 19 may output nothing, and may also output a notification that a result of the voice recognition has failed to be obtained ([0039]).  Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Harada into that of Park and to add the limitation of analyzing, based on the feature value, whether a state of the recognized sound is a clear state or an unclear state (successfully identifying a word is a clear state and being unable to identify a word is an unclear state) in order to suppress the influence of noise as suggested by Harada ([0003]).
Park modified by Harada fails to directly disclose capturing, via a camera in the intelligent device, an image of surroundings of the intelligent device; and obtaining, via a sensor in the intelligent device, sound information based on sounds generated around the intelligent device when the camera is capturing the image.
However, Fink, in the same field of endeavor, discloses capturing, via a camera in the intelligent device, an image of surroundings of the intelligent device; obtaining, via a sensor in the intelligent device, sound information based on sounds generated around the intelligent device when the camera is capturing the image ([0018]: Computing devices that are typically used to capture the still images and/or video, including laptops, desktops, tablets, smartphones, and similar such devices, as well as point and shoot and DSLR cameras, are also often capable of simultaneously capturing audio.  [0020]: In embodiments, method 100 begins with operation 102, where an image and audio stream are captured. The image may be one or more still images, or a video clip. In various embodiments, the audio may be captured as part of a video clip, or as a separate stream).  Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Fink into that of Park modified by Harada and to add the limitation of capturing, via a camera in the intelligent device, an image of surroundings of the intelligent device; obtaining, via a sensor in the intelligent device, sound information based on sounds generated around the intelligent device when the camera is capturing the image, for the benefit of automatically tagging detected objects in an image or video with tags obtained from an audio signal as taught by Fink ([0002]).
As cited above, Harada teaches outputting a notification when an audio signal cannot be recognized because of its signal-to-noise ratio ([0009], [0042] and [0039]).  Fink teaches that Tagging may be performed automatically, with the identified keywords tagged or otherwise associated to the image(s) and/or video, including any appropriate identified objects ([0028]) and that In other embodiments, a user may make the final decision to tag, with the identified keywords being presented to the user as suggestions for tags, and the user being given the opportunity to confirm or reject suggested tag(s) ([0029]).  Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Harada and Fink into that of Park and to add the limitation of wherein the recognizing the sound includes: classifying, when the state of the recognized sound is determined to be the clear state, the image into predetermined categories based on the recognized sound, and feeding back, when the state of the recognized sound is determined to be the unclear state, the image to be displayed by a display of the intelligent device without classifying the image into the predetermined categories and classifying the image based on a user selection, since combining the teachings of Harada and Fink to handle SNR issues in order to accurately tag a video stream requires only routine skill for a POSITA.

Regarding Claim 2, Park teaches or suggests setting a sound tag to the classified image; and storing the set sound tag ([0148]: for filtering harmful content according to the video classification method of the present invention, the broad category information classified by the audio signal classifier 200 and the detailed category information classified by the video classifier 300 can be used.  Also, the broad category information and the detailed category information can be utilized to index specific content.  In addition, it can be applied to a new content generation that creates content grouped by the detailed category according to the video classification method).  Fink also discloses setting a sound tag to the classified image; and storing the set sound tag ([0030]: The tags may be associated with the image(s) and/or video by any method for tagging now known or later developed. The tags may be stored as part of the image metadata, e.g. in an EXIF field or similar structure, in part of a database associated with the stored image(s) and/or video, in a separate file, or in another suitable fashion).  The same reason to combine as taught in Claim 1 is incorporated herein.

Regarding Claim 5, Park discloses wherein the classifying of the image comprises classifying the image using a clustering algorithm capable of collecting and classifying recognized sounds within a threshold of a pattern sample vector ([0043]: The acoustic feature extracting unit 210 can analyze audio signals using frequency block separation method utilizing Fourier transform or pattern matching method identifying specific patterns matched with time-specific data of frequency, and it can determine the existence and occurrence interval of acoustic feature information using spectrograph, hidden Markov model, Gaussian mixture model, etc.  A skilled person would have known that pattern matching requires a threshold and Gaussian mixture model can be used for data clustering).

Regarding Claim 6, Park modified by Harada further teaches or suggests wherein the fed back image is displayed through a display (Harada [0039]: if the word is not selected by the selection unit 18, the output unit 19 may output nothing, and may also output a notification that a result of the voice recognition has failed to be obtained).  The same reason to combine as taught in Claim 1 is incorporated herein.

Regarding Claim 7, Park discloses wherein the sound information is converted to data using at least one of a pulse code modulation method, a fast Fourier transform algorithm, a Fourier transform algorithm, and a spectrogram algorithm ([0043]: The acoustic feature extracting unit 210 can analyze audio signals using frequency block separation method utilizing Fourier transform or pattern matching method identifying specific patterns matched with time-specific data of frequency, and it can determine the existence and occurrence interval of acoustic feature information using spectrograph, hidden Markov model, Gaussian mixture model, etc.  [0075]: The frequency conversion analysis module 211 analyzes the audio data based on frequency, and generates frequency spectrogram of voice signal for each time interval to provide to the pattern matching module 215).
Regarding Claim 11, Claim 11 is similar in scope to Claim 1 except that it is in “device” format and the device further includes a camera configured to capture an image of surroundings of the intelligent device; a sensor configured to obtain sound information; and a display for displaying the image.  Park discloses a display for displaying the image ([0013]: FIG.  13 shows characteristics of animation and drama videos.  For both animation and drama, a logo indicating a title of the respective video can be displayed upper left/right side).  Fink discloses a camera configured to capture an image of surroundings of the intelligent device; a sensor configured to obtain sound information (Fig.3: camera 308 and microphone 310).  Therefore the rejection of Claim 1 is also applied to Claim 11.

Regarding Claims 12 and 15-17, Claims 12 and 15-17 are similar in scope to Claims 2 and 5-7 except that they are in “device” format.  Therefore the rejections of Claims 2 and 5-7 are also applied to Claims 12 and 15-17.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Park (US 2018/0144194 A1) in view of Harada (US 2013/0289992 A1) and Fink et al. (US 2020/0065589 A1) as applied to Claims 1 and 11 above, and further in view of Guo et al. (US 2019/0141693 A1).
Regarding Claim 8, Park discloses that the SI classification system is used in 5G applications ([0002]).  But Park fails to explicitly disclose receiving, from a network, downlink control information (DCI) used for scheduling transmission of the sound information obtained from at least one sensor provided inside the intelligent device, wherein the sound information is transmitted to the network based on the DCI.
However, Guo teaches or suggests that it had been known to a POSITA before the effective filing date of the claimed invention for a user equipment (UE) to receive, from a base station (BS), a downlink control information (DCI) format to schedule a transmission over a physical uplink shared channel (PUSCH) (Abstract and Fig.19 step 1911).  Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Guo into that of Park as modified and to add the limitation of receiving, from the network, downlink control information (DCI) used for scheduling transmission of the sound information obtained from at least one sensor provided inside the intelligent device, wherein the sound information is transmitted to the network based on the DCI, in order to allow a UE to communicate with a base station through a 5G network.

Regarding Claim 18, Claim 18 is in similar scope to Claim 8 except in the format of “device”.  Therefore the rejection to Claim 8 is also applied to Claim 18.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Park (US 2018/0144194 A1) in view of Harada (US 2013/0289992 A1), Fink et al. (US 2020/0065589 A1) and Guo et al. (US 2019/0141693 A1) as applied to Claim 8 above, and further in view of Huang et al. (US 2019/0320469 A1).
Regarding Claim 9, Guo further discloses wherein the sound information is transmitted to the network through a physical uplink shared channel (PUSCH) ([0018]: The BS further comprises a transceiver operably connected to the processor, the transceiver configured to transmit, to a UE, the system configuration information identifying the set of PUCCH resources, wherein each of the set of PUCCH resources is identified via an identifier (ID) and information associated with the Tx beam, transmit, to the UE, scheduling information including a DCI format to schedule the UE with a transmission over a physical uplink shared channel (PUSCH), and receive, from the UE, data over the PUSCH based on the scheduling information using an receive (Rx) beam that corresponds to the Tx beam applied to a transmission over the PUSCH by the UE).
But Park as modified fails to explicitly recite performing an initial access procedure with the network based on a synchronization signal block (SSB).
However, Huang teaches or suggests that performing an initial access procedure with the network based on a synchronization signal block (SSB), wherein the sound information is transmitted to the network through a physical uplink shared channel (PUSCH), had been known to a POSITA before the effective filing date of the claimed invention ([0122]: The UE receives an activation command [10, TS 38.321] used to map up to 8 TCI states to the codepoints of the DCI field ‘Transmission Configuration Indication’.  After a UE receives [initial] higher layer configuration of TCI states and before reception of the activation command, the UE may assume that the antenna ports of one DM-RS port group of PDSCH of a serving cell are spatially quasi co-located with the SSB determined in the initial access procedure with respect to Doppler shift, Doppler spread, average delay, delay spread, spatial Rx parameters, where applicable).  Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Huang into that of Park as modified and to add the above missing limitation in order to use the intelligent AI device in a 5G network.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Park (US 2018/0144194 A1) in view of Harada (US 2013/0289992 A1), Fink et al. (US 2020/0065589 A1), Guo et al. (US 2019/0141693 A1) and Huang et al. (US 2019/0320469 A1) as applied to Claim 9 above, and further in view of 5GAA (“Toward fully connected vehicles: Edge computing for advanced automotive communications”, white paper by 5GAA, December 2017).
Regarding Claim 10, Park discloses classification through machine learning which is AI technology ([0017]: intelligent multilayer classification through machine learning is applicable without additional aid).  But Park as modified fails to disclose controlling a transceiver to transmit the sound information to an Artificial Intelligence (AI) processor included in the network; and controlling the transceiver to receive AI processed information from the AI processor, wherein the AI processed information is information determined to any one of the clear state in which the sound is clearly recognized or the unclear state in which the sound is unclearly recognized.
However 5GAA discloses V2X (Vehicle-to-everything) communication had already been known to a POSITA before the effective filing date of the claimed invention (p.4 Fig.1).  5GAA discloses the V2X uses a cloud server (p.4 Fig.1).  5GAA teaches or suggests the importance of artificial intelligence needed to provide the required levels of autonomy (p.12 lines 13-14) and Wireless communication is a key enabling technology for co-operative intelligent transportation systems (p.12 line 21).  The wireless communication includes 5G (p.16 second last paragraph: In particular, from a standardization perspective, some use cases targeting fully connected cars will require the fulfillment of challenging requirements, possible only with the introduction of 5G networks).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of 5GAA into that of Park as modified, to include a communication unit, and to add the limitation of controlling a transceiver to transmit the sound information to an Artificial Intelligence (AI) processor (Edge server) included in the network (5G network used in V2X); and controlling the transceiver to receive AI processed information from the AI processor (cloud server), wherein the AI processed information is information determined to any one of the clear state in which the sound is clearly recognized or the unclear state in which the sound is unclearly recognized (through using server’s processor), in order to allow vehicles to share camera images of road conditions and to make use of limitless computing power compared to any in-vehicle embedded processor environment, as taught by 5GAA (p.9 second last paragraph and p.10 third last paragraph).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Park (US 2018/0144194 A1) in view of Harada (US 2013/0289992 A1) and Fink et al. (US 2020/0065589 A1) as applied to Claim 11 above, and further in view of Huang et al. (US 2019/0320469 A1).
Regarding Claim 19, Park discloses that the SI classification system is used in 5G applications ([0002]).  But Park as modified by Harada and Fink fails to disclose wherein the processor is configured to perform an initial access procedure with a network based on a synchronization signal block (SSB), and wherein the sound information is transmitted to the network through a physical uplink shared channel (PUSCH).
However, Huang teaches or suggests that the limitation wherein the processor is configured to perform an initial access procedure with a network based on a synchronization signal block (SSB), and wherein the sound information is transmitted to the network through a physical uplink shared channel (PUSCH), had been known to a POSITA before the effective filing date of the claimed invention ([0122]: The UE receives an activation command [10, TS 38.321] used to map up to 8 TCI states to the codepoints of the DCI field ‘Transmission Configuration Indication’.  After a UE receives [initial] higher layer configuration of TCI states and before reception of the activation command, the UE may assume that the antenna ports of one DM-RS port group of PDSCH of a serving cell are spatially quasi co-located with the SSB determined in the initial access procedure with respect to Doppler shift, Doppler spread, average delay, delay spread, spatial Rx parameters, where applicable).  Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Huang into that of Park as modified and to add the above missing limitation in order to use the intelligent AI device in a 5G network.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Park (US 2018/0144194 A1) in view of Harada (US 2013/0289992 A1), Fink et al. (US 2020/0065589 A1), and Huang et al. (US 2019/0320469 A1) as applied to Claim 19 above, and further in view of 5GAA (“Toward fully connected vehicles: Edge computing for advanced automotive communications”, white paper by 5GAA, December 2017).
Regarding Claim 20, Park discloses classification through machine learning which is AI technology ([0017]: intelligent multilayer classification through machine learning is applicable without additional aid).  But Park as modified fails to disclose further comprising a transceiver configured to transmit the sound information to an Artificial Intelligence (AI) processor included in the network, wherein the processor controls the transceiver to receive AI processed information from the AI processor, and wherein the AI processed information is information determined to be any one of the clear state in which the sound is clearly recognized or the unclear state in which the sound is unclearly recognized.
However 5GAA discloses V2X (Vehicle-to-everything) communication had already been known to a POSITA before the effective filing date of the claimed invention (p.4 Fig.1).  5GAA discloses the V2X uses a cloud server (p.4 Fig.1).  5GAA teaches or suggests the importance of artificial intelligence needed to provide the required levels of autonomy (p.12 lines 13-14) and Wireless communication is a key enabling technology for co-operative intelligent transportation systems (p.12 line 21).  The wireless communication includes 5G (p.16 second last paragraph: In particular, from a standardization perspective, some use cases targeting fully connected cars will require the fulfillment of challenging requirements, possible only with the introduction of 5G networks).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of 5GAA into that of Park as modified, to include a communication unit, and to add the limitation of further comprising a transceiver configured to transmit the sound information to an Artificial Intelligence (AI) processor included in the network (5G network used in V2X), wherein the processor controls the transceiver to receive AI processed information from the AI processor (cloud server), and wherein the AI processed information is information determined to be any one of the clear state in which the sound is clearly recognized or the unclear state in which the sound is unclearly recognized (through using server’s processor), in order to allow vehicles to share camera images of road conditions and to make use of limitless computing power compared to any in-vehicle embedded processor environment, as taught by 5GAA (p.9 second last paragraph and p.10 third last paragraph).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YINGCHUN HE whose telephone number is (571)270-7218. The examiner can normally be reached M-F 8:00-5:00 MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao M Wu, can be reached at 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YINGCHUN HE/
Primary Examiner, Art Unit 2613