DETAILED ACTION

Introduction
This office action is in response to Applicant’s submission filed on 3/24/2021. Claims
1-20 are pending in the application. As such, claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings filed on 3/24/2021 are accepted and considered by the Examiner.


Claim Objections
Claim 6 is objected to because of the following informalities: Claim 6 should depend on claim 5.
Claims 13-17 should read “The device of claim …” and not “The system of claim …”.
Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5 and 7 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Casado et al. (WO 2020040745 A1) hereinafter as Casado.
Regarding claim 1, Casado discloses: A method implemented by one or more processors, comprising: executing an automated assistant at least in part on an assistant device; ([0017] In some implementations, a method performed by one or more processors is provided that includes: executing an automated assistant in a default listening state, wherein the automated assistant is executed at least in part on one or more computing devices operated by a user; …)
obtaining, at the assistant device: a stream of image frames captured by one or more cameras, ([0020] In various implementations, the one or more hardware sensors may include a camera. In various implementations, the attribute of the user may include: the user being detected by the camera, or the user being detected by the camera within a predetermined distance of one or more of the computing devices.)
and audio data detected by one or more microphones of the assistant device; ([0021] In various implementations, the one or more hardware sensors may include one or more of the microphones. In various implementations, the one or more attributes may include the user being audibly detected based on audio data generated by one or more of the microphones.)
processing the audio data using at least one audio portion of a neural network model to generate voice activity data indicating voice activity detected in the audio data; ([0044] In some implementations, one or more on-device invocation models, e.g., stored in an on-device model database 114, may be used by invocation module 113 to determine whether an utterance and/or visual cue(s) qualify as an invocation. Such an on-device invocation model may be trained to detect variations of invocation phrases/gestures. For example, in some implementations, the on-device invocation model (e.g., one or more neural networks) may be trained using training examples that each include an audio recording (or an extracted feature vector) of an utterance from a user, …)
processing the image frames of the stream using at least one vision portion of the neural network model to generate visual feature data indicating one or more visual features present in the image frames of the stream; ([0044] In some implementations, one or more on-device invocation models, e.g., stored in an on-device model database 114, may be used by invocation module 113 to determine whether an utterance and/or visual cue(s) qualify as an invocation. Such an on-device invocation model may be trained to detect variations of invocation phrases/gestures. For example, in some implementations, the on-device invocation model (e.g., one or more neural networks) may be trained using training examples that each include an audio recording (or an extracted feature vector) of an utterance from a user, as well as data indicative of one or more image frames and/or detected visual cues captured contemporaneously with the utterance.)
applying the voice activity data and the visual feature data as inputs to one or more interaction prediction layers of the neural network model, to receive, as output, indications of: one or more users determined to be present in the stream of image frames or the audio data, and a confidence level for each user, each of the confidence levels indicating a level of confidence that the corresponding user intended to interact with the automated assistant. ([0044] In some implementations, one or more on-device invocation models, e.g., stored in an on-device model database 114, may be used by invocation module 113 to determine whether an utterance and/or visual cue(s) qualify as an invocation. Such an on-device invocation model may be trained to detect variations of invocation phrases/gestures. For example, in some implementations, the on-device invocation model (e.g., one or more neural networks) may be trained using training examples that each include an audio recording (or an extracted feature vector) of an utterance from a user, as well as data indicative of one or more image frames and/or detected visual cues captured contemporaneously with the utterance.  {confidence level/threshold is mentioned in [0009] and [0075]})

	Regarding claim 2, Casado discloses: The method of claim 1, Casado further discloses: wherein the receiving occurs during or after automated assistant content is provided at the assistant device, and wherein an indication of the automated assistant content is applied as a further input to one or more of the interaction prediction layers of the neural network model.  ([0066] Additionally or alternatively, fulfillment module 124 may be configured to receive, e.g., from intent matcher 135, a user's intent and any slot values provided by the user or determined using other means (e.g., GPS coordinates of the user, user preferences, etc.) and trigger a responsive action. Responsive actions may include, for instance, ordering a good/service, starting a timer, setting a reminder, initiating a phone call, playing media, sending a message, etc. {also see [0063-0064] regarding receiving inputs and [0069] regarding timing, processing, and listening states.})

	Regarding claim 3, Casado discloses: The method of claim 2, Casado further discloses: wherein the automated assistant content comprises at least one of: audio content, image content, video content, and textual content. ([0037] In some implementations, automated assistant 120 may engage in a human-to-computer dialog session in response to user interface input, even when that user interface input is not explicitly directed to automated assistant 120. For example, automated assistant 120 may examine the contents of user interface input and engage in a dialog session in response to certain terms being present in the user interface input and/or based on other cues. In many implementations, automated assistant 120 may utilize speech recognition to convert utterances from users into text, and respond to the text accordingly, e.g., by providing search results, general information, and/or taking one or more responsive actions (e.g., playing media, launching a game, ordering food, etc.). In some implementations, the automated assistant 120 can additionally or alternatively respond to utterances without converting the utterances into text. For example, the automated assistant 120 can convert voice input into an embedding, into entity representation(s) (that indicate entity/entities present in the voice input), and/or other "non-textual" representation and operate on such non-textual representation.)

	Regarding claim 4, Casado discloses: The method of claim 1, Casado further discloses: wherein the voice activity data includes voice recognition data and wherein the visual feature data includes facial recognition data. ([0012] In some implementations, machine learning models trained to detect these dynamic hot words may be downloaded as needed. For example, suppose a particular user is identified, e.g., based on facial recognition processing performed on one or more images captured of the user or speech recognition processing performed on audio data generated from the user's speech. If not already available on-device, these models may be downloaded, e.g., from the cloud based on the user's online profile.  [0022] also discloses a method of analyzing that includes voice recognition processing.)

	Regarding claim 5, Casado discloses: The method of claim 1, Casado further discloses: wherein the visual features data indicates a change in visual features between two or more consecutive image frames of the stream. ([0040] In various implementations, speech capture module 110, which may be implemented using any combination of hardware and software, may interface with hardware such as a microphone 109 or other pressure sensor to capture an audio recording of a user's utterance(s). Various types of processing may be performed on this audio recording for various purposes. In some implementations, image capture module 111, which may be implemented using any combination of hardware or software, may be configured to interface with camera 107 to capture one or more image frames (e.g., digital photographs) that correspond to a field of view of the vision sensor 107. [0041] In various implementations, visual cue module 112₁ (and/or cloud-based visual cue module 112₂) may be implemented using any combination of hardware or software, and may be configured to analyze one or more image frames provided by image capture module 111 to detect one or more visual cues captured in and/or across the one or more image frames.)

	Regarding claim 7, Casado discloses: The method of claim 1, Casado further discloses: further comprising: comparing the confidence levels of the one or more users to a first threshold; ([0009] Or, in the alternative, the automated assistant may continue to listen for the default hot words, but may raise a confidence threshold required for invocation based on those default hot words, while activating or lowering a threshold associated with the custom hot words.)
determining, based on the comparing, that at least one user of the one or more users intended to interact with the automated assistant; ([0043] In some implementations, a threshold that is employed by invocation module 113 to determine whether to invoke automated assistant 120 in response to a vocal utterance may be lowered when particular visual cues are also detected.)
and initiating performance, by the automated assistant and based on determining that the at least one user intended to interact with the automated assistant, of at least one automated assistant function. ([0037] In many implementations, automated assistant 120 may utilize speech recognition to convert utterances from users into text, and respond to the text accordingly, e.g., by providing search results, general information, and/or taking one or more responsive actions (e.g., playing media, launching a game, ordering food, etc.).)


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Casado, in view of Maddika et al. (US Patent Application Publication No.: US 20220293125 A1) hereinafter as Maddika. 
	Regarding claim 6, Casado discloses: The method of claim 5, Casado further discloses: wherein the change in the visual features between two or more consecutive image frames of the stream is determined to correspond to: a change in proximity of at least one of the users with respect to the assistant device; ([0007] As an example, in some implementations, if a user is detected by a proximity sensor, e.g., within a predetermined proximity of a computing device usable to engage with an automated assistant, one or more additional hot words may be activated to enable the proximate user to more easily invoke the automated assistant. In some implementations, the closer the user is to the assistant device, the more dynamic hot words may be activated. For example, a user detected within three meters of the assistant device may activate a first set of dynamic hot words. If the user is detected in closer proximity, e.g., within one meter, additional or alternative hot words may be activated.)
a change in direction of gaze of at least one of the users; ([0043]  Consequently, even when a user provides a vocal utterance that is different from but somewhat phonetically similar to the proper invocation phrase, "OK assistant," that utterance may nonetheless be accepted as a proper invocation when detected in conjunction with a visual cue (e.g., hand waving by the speaker, speaker gazes directly into vision sensor 107, etc.).)
a recognized physical gesture performed by at least one of the users; ([0043] Consequently, even when a user provides a vocal utterance that is different from but somewhat phonetically similar to the proper invocation phrase, "OK assistant," that utterance may nonetheless be accepted as a proper invocation when detected in conjunction with a visual cue (e.g., hand waving by the speaker, speaker gazes directly into vision sensor 107, etc.).)
and an interaction between at least one of the users and an additional user, the additional user being one of the one or more users that are determined to be present in the image frames of the stream, or a different user. ([0052] As used herein, a "dialog session" may include a logically-self-contained exchange of one or more messages between a user and automated assistant 120 (and in some cases, other human participants).)
	Casado does not explicitly disclose, but Maddika discloses: wherein the change in the visual features between two or more consecutive image frames of the stream is determined to correspond to: lip movements of at least one of the users; ([0244] In general, the first plurality of values 806 may be based on various signals related to user gaze, user lip movements, relative attention of the user, device positional information, speech detection, and the like.)
Casado and Maddika are considered analogous art because they are in the related art of intelligent automated assistants.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Casado by combining them with the teaching of Maddika, to incorporate the above mentioned elements, because improving a digital assistant system having continuous dialog capabilities is desired (Maddika, background).



Claims 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Casado, in view of Yang et al. (CN 112188306 A) with reference to the English machine translation provided, hereinafter as Yang. 
Regarding claim 12, Casado discloses: A client device comprising: at least one vision component; ([0034] In some implementations, client device 106 may be equipped with one or more vision sensors 107 having one or more fields of view, although this is not required. Vision sensor(s) 107 may take various forms, such as digital cameras, passive infrared ("PIR") sensors, stereoscopic cameras, RGBd cameras, etc.)
at least one microphone; ([0017] monitoring audio data captured by one or more microphones for one or more of a default set of one or more hot words, …)
one or more processors; memory operably coupled with the one or more processors, wherein the memory stores instructions that, in response to execution of the instructions by one or more of the processors, cause one or more of the processors to perform the following operations: ([0023] In addition, some implementations include one or more processors of one or more computing devices, where the one or more processors are operable to execute instructions stored in associated memory, and where the instructions are configured to cause performance of any of the aforementioned methods.)
obtaining, at the client device: a stream of vision data captured by the vision component, and a stream of audio data captured by the microphone, ([0020] In various implementations, the one or more hardware sensors may include a camera. In various implementations, the attribute of the user may include: the user being detected by the camera, or the user being detected by the camera within a predetermined distance of one or more of the computing devices. [0021] In various implementations, the one or more hardware sensors may include one or more of the microphones. In various implementations, the one or more attributes may include the user being audibly detected based on audio data generated by one or more of the microphones.)
 wherein the audio data excludes a hotword used to invoke an automated assistant at the client device; ([0080] In Fig. 3B, for instance, user 101 does not need to begin his utterance with "Hey Assistant." Instead, user simply says, "set a timer for five minutes". Words such as "set" or phrases such as "set a timer for" may include hot words that are part of the enhanced set of hot words that are now usable to invoke automated assistant 120. Consequently, automated assistant 120 replies, "OK. Timer starting...now" and initiates a five-minute timer.)
applying the audio data as input to one or more layers of a first portion of a neural network model to receive first output comprising voice activity data indicating voice activity detected in the audio data; ([0044] In some implementations, one or more on-device invocation models, e.g., stored in an on-device model database 114, may be used by invocation module 113 to determine whether an utterance and/or visual cue(s) qualify as an invocation. Such an on-device invocation model may be trained to detect variations of invocation phrases/gestures. For example, in some implementations, the on-device invocation model (e.g., one or more neural networks) may be trained using training examples that each include an audio recording (or an extracted feature vector) of an utterance from a user, as well as data indicative of one or more image frames and/or detected visual cues captured contemporaneously with the utterance.) 
applying the vision data as input to one or more of layers of a second portion of the neural network model to receive second output comprising image recognition data indicating one or more objects or users detected in the vision data; ([0044] In some implementations, one or more on-device invocation models, e.g., stored in an on-device model database 114, may be used by invocation module 113 to determine whether an utterance and/or visual cue(s) qualify as an invocation. Such an on-device invocation model may be trained to detect variations of invocation phrases/gestures. For example, in some implementations, the on-device invocation model (e.g., one or more neural networks) may be trained using training examples that each include an audio recording (or an extracted feature vector) of an utterance from a user, as well as data indicative of one or more image frames and/or detected visual cues captured contemporaneously with the utterance.)
wherein the confidence level indicates a level of confidence that the user intended to invoke the automated assistant at the client device; ([0009] Or, in the alternative, the automated assistant may continue to listen for the default hot words, but may raise a confidence threshold required for invocation based on those default hot words, while activating or lowering a threshold associated with the custom hot words.)
and when the confidence level for the user satisfies one or more criteria: initiating performance of an automated assistant function at the client device. ([0037] In many implementations, automated assistant 120 may utilize speech recognition to convert utterances from users into text, and respond to the text accordingly, e.g., by providing search results, general information, and/or taking one or more responsive actions (e.g., playing media, launching a game, ordering food, etc.).  The client device is shown in Fig. 1 and is also discussed in multiple paragraphs, including [0031-0036].)
Casado does not explicitly disclose, but Yang discloses: and applying the first output and the second output to one or more fusion layers of the neural network model to receive a confidence level for a user, ([pg. 4, last para] wherein the input is K image segments and K audio segments, wherein the image convolutional neural network is used for convolution processing the image segment; the audio convolutional neural network is used for performing convolution processing to the audio segment. Further, the image segment and the audio segment before inputting the convolutional neural network needs to be pre-processed. inputting the image segment processed by the image convolutional neural network and the audio segment processed by the audio convolutional neural network into the fully-connected network layer and performing prediction fusion; …) {confidence level for user is already mentioned in the Casado reference discussed above}
Casado and Yang are considered analogous art because they are in the related art of labeling and classification.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Casado by combining them with the teaching of Yang, to incorporate the above mentioned elements, because it would perform type recognition processing (Yang, contents of invention).

Regarding claim 13, Casado in view of Yang discloses: The device of claim 12, Casado further discloses: wherein at least a portion of the first output is applied as an input to one or more of the layers of the second portion of the neural network model and/or at least a portion of the second output is applied as an input to one or more of the layers of the first portion of the neural network model. ([0043] In various implementations, invocation module 113 may be configured to determine whether to invoke automated assistant 120, e.g., based on output provided by speech capture module 110 and/or visual cue module 112₁ (which in some implementations may be combined with image capture module 111 in a single module). For example, invocation module 113 may determine whether a user's utterance qualifies as an invocation phrase that should initiate a human-to-computer dialog session with automated assistant 120. In some implementations, invocation module 113 may analyze data indicative of the user's utterance, such as an audio recording or a vector of features extracted from the audio recording (e.g., an embedding), alone or in conjunction with one or more visual cues detected by visual cue module 112₁. In some implementations, a threshold that is employed by invocation module 113 to determine whether to invoke automated assistant 120 in response to a vocal utterance may be lowered when particular visual cues are also detected. Consequently, even when a user provides a vocal utterance that is different from but somewhat phonetically similar to the proper invocation phrase, "OK assistant," that utterance may nonetheless be accepted as a proper invocation when detected in conjunction with a visual cue (e.g., hand waving by the speaker, speaker gazes directly into vision sensor 107, etc.).)

 Regarding claim 14, Casado in view of Yang discloses: The device of claim 13, Casado further discloses: wherein: at least a portion of the voice activity data corresponds to the user; ([0021] In various implementations, the one or more hardware sensors may include one or more of the microphones. In various implementations, the one or more attributes may include the user being audibly detected based on audio data generated by one or more of the microphones. In various implementations, the analyzing may include voice recognition processing. In various implementations, the attribute of the user may include an identity of the user. In various implementations, the attribute of the user may include membership of the user in a group.)
at least a portion of the image recognition data corresponds to a different user; ([0010] As another example, in some implementations, dynamic hot words may be associated with a group of people (e.g., employees, gender, age range, share a visual characteristic, etc.), and may be activated one or more of those people is detected and/or recognized. For example, a group of people may share a visual characteristic. In some implementations, when one or more of these visual characteristics are detected by one or more hardware sensors of an assistant device, one or more dynamic hot words may be activated that might not otherwise be available to non-group-members.)
an additional confidence level is received from the neural network model for the additional user: and when the additional confidence level satisfies one or more additional criteria: initiating performance of an additional automated assistant function at the client device.  ([0088] Additionally or alternatively, it may not be necessary to detect the particular identity of user 101B. In some implementations, it may suffice to recognize some visual attribute of a user to activate certain dynamic hot words. For example, in Fig. 4B, user 101B is a doctor wearing clothing typically worn by doctors. This clothing (or a badge having indicia, RFID, etc.) may be detected and used to determine that user 101B is a member of a group (e.g., medical personnel) for which certain hot words should be activated.  Also see [0090-0091] regarding how detection of others is applied.  {Maddika's disclosure also discusses an additional confidence level; see Fig. 9})

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Casado, in view of Yang, and further in view of Maddika.  
Regarding claim 15, Casado in view of Yang discloses: The device of claim 14, Casado in view of Yang does not explicitly disclose, but Maddika discloses: wherein the additional confidence level for the additional user indicates a level of confidence that the additional user intended to invoke the automated assistant at the client device. ([0248] Returning to FIG. 8, a second plurality of values 814 may also be obtained in accordance with a determination that the first confidence level exceeds a first threshold confidence level at step 812. In a subsequent step, a second confidence level corresponding to second speech input 808 is obtained. The second confidence level may be based on both the first plurality of values 806 and the second plurality of values 814, for example, determined at step 816. In general, the second plurality of values 814 may be based on additional signals indicative of whether the user's speech is intended for the digital assistant. In particular, these signals and corresponding values may be associated with device processes having higher and otherwise more robust processing capabilities as compared to the first plurality of signals, such as determinations involving speaker identity, user intent, neural networks, and the like.)
Casado, Yang and Maddika are considered analogous art because they are in the related art of intelligent automated assistants and/or labeling and classification.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Casado, in view of Yang, by combining them with the teaching of Maddika, to incorporate the above mentioned elements, because improving a digital assistant system having continuous dialog capabilities is desired (Maddika, background).



Allowable Subject Matter
Claims 8-11 and 16-17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Hoang et al. (US Patent Application Publication No.: US 20220172021 A1) hereinafter as Hoang.  Hoang discloses a method of assigning a confidence score associated with an assigned prediction of a layer of a machine learning model as the overall confidence score to be associated with the overall prediction of the machine learning model.
Quinn et al. (US Patent Application Publication No.: US 20190371327 A1) hereinafter as Quinn.  Quinn discloses a method and system for controlling a smart device using a gesture-based control system that identifies the user.
Anders et al. (US Patent Application Publication No.: US 20190348030 A1) hereinafter as Anders.  Anders teaches a method and system for enabling a digital assistant to adjust its behavior depending on the user.
Beaumont et al. (US Patent Application Publication No.: US 20150088515 A1) hereinafter as Beaumont.  Beaumont teaches a technique to identify the speaker using a combination of audio and visual features.
Lin et al. (US Patent Application Publication No.: US 20190333516 A1) hereinafter as Lin.  Lin teaches a method and device for speech recognition by chronologically linking the obtained speech and images together.  It also uses a combination of voice and hand gestures to determine whether the user’s intent is validated.
Flaks et al. (US Patent Application Publication No.: US 20110184735 A1) hereinafter as Flaks.  Flaks teaches a method and system for determining and adjusting a confidence level of speech recognition using a combination of audio and visual features.
(Kepuska, V., & Bohouta, G. (2018, January). Next-generation of virtual personal assistants (microsoft cortana, apple siri, amazon alexa and google home). In 2018 IEEE 8th annual computing and communication workshop and conference (CCWC) (pp. 99-103). IEEE.) hereinafter as Kepuska.  Kepuska discloses a multimodal dialogue system which combines speech, image, video, touch, gestures, gaze, and head and body movement in the next generation of virtual personal assistants.
(Kampman, O., Barezi, E. J., Bertero, D., & Fung, P. (2018). Investigating audio, visual, and text fusion methods for end-to-end automatic personality prediction. arXiv preprint arXiv:1805.00705.) hereinafter as Kampman.  Kampman discloses a fusion model using a Convolutional Neural Network (CNN) to leverage the combination of audio, video and textual features for automatic personality prediction.
(Jaafar, N., & Lachiri, Z. (2019, July). Audio-Visual Fusion for Aggression Detection Using Deep Neural Networks. In 2019 International Conference on Control, Automation and Diagnosis (ICCAD) (pp. 1-5). IEEE.) hereinafter as Jaafar.  Jaafar discloses a method of aggression detection using a Deep Neural Network multimodal fusion combining audio and visual features.
(Zhang, S., Zhang, S., Huang, T., Gao, W., & Tian, Q. (2017). Learning affective features with a hybrid deep model for audio–visual emotion recognition. IEEE Transactions on Circuits and Systems for Video Technology, 28(10), 3030-3043.) hereinafter as Zhang.  Zhang discloses a hybrid deep model for aggregating audio and visual features in emotion recognition.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Phillip H Lam whose telephone number is (571)272-1721. The examiner can normally be reached 10 AM-6 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at (571) 272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PHILIP H LAM/Examiner, Art Unit 2656

/HUYEN X VO/Primary Examiner, Art Unit 2656