DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
EXAMINER’S AMENDMENT
	An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

	Annie McNally authorized this examiner’s amendment in an email interview on 1/26/2022 as follows:

1. (currently amended) A conference recording method applied to a data processing device, the method comprising:
obtaining a multimedia file corresponding to a conference, wherein the multimedia file comprises video data and audio data;
recognizing posture language of [[each]]a person from the video data of the multimedia file; and
extracting facial features of [[each]]the person from the video data, and extracting voice features of [[each]]the person from the audio data of the multimedia file;
identifying personal identity information of [[each]]the person according to the facial features and the voice features of [[each]]the person;
converting the audio data corresponding to [[each]]the person into text information; and
outputting the posture language, the personal identity information, and the text information corresponding to the person;
wherein the recognizing of the posture language of [[each]]the person from the video data comprises: 
extracting [[each]]an image frame comprising the person from the video data;
identifying key points of [[each]]the person in [[each]]the image frame;
generating a connection line corresponding to [[each]]the image frame by connecting the key points in [[each]]the image frame; 
converting [[each]]the connection line into a vector distance;
determining a posture feature of the [[each]] person in [[each]]the image frame according to the vector distance; and
determining the posture language of the [[each]] person by searching a predetermined database according to the posture feature, wherein the predetermined database pre-stores a first relationship between the posture feature and the posture language.
2. (currently amended) The method according to claim 1, further comprising:
separating the video data and the audio data from the multimedia file when the multimedia file is obtained from a capture device;
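sharpening [[the]] face information when the video data comprises the face information; sharpening [[the]] posture information when the video data comprises the posture information; and enhancing [[the]]a voice comprised in the audio data when the audio data comprises the voice; and
obtaining another multimedia file from the capture device when the video data does not comprise the face information, and the audio data does not comprise the voice.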
3. (canceled). 
4. (previously presented) The method according to claim 1, wherein the determining of the posture feature comprises:
determining distance changes between two of the key points.
5. (currently amended) The method according to claim 1, wherein the identifying of personal identity information of [[each]]the person according to the facial features and the voice features of [[each]]the person comprises:
searching the predetermined database according to the facial features and the voice features of [[each]]the person, the predetermined database pre-stores a second relationship between the personal identity information of [[each]]the person and the facial features and the voice features of [[each]]the person.
6. (original) The method according to claim 5, wherein when the personal identity information of the person cannot be obtained from the predetermined database, the method further comprises:
storing the facial features and the voice features of the person in the predetermined database.
7. (currently amended) A data processing device comprising:
a storage device;
at least one processor; and
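the storage device storing one or more programs, which when executed by the at least one processor, cause the at least one processor to:
obtain a multimedia file corresponding to a conference, wherein the multimedia file comprises video data and audio data;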
recognize posture language of [[each]]a person from the video data of the multimedia file; and
extract facial features of [[each]]the person from the video data, and extract voice features of [[each]]the person from the audio data of the multimedia file;
identify personal identity information of [[each]]the person according to the facial features and the voice features of [[each]]the person;
convert the audio data corresponding to [[each]]the person into text information; and
output the posture language, the personal identity information, and the text information corresponding to [[each]]the person;
wherein the recognizing of the posture language of [[each]]the person from the video data comprises: 
extracting [[each]]an image frame comprising the person from the video data;
identifying key points of [[each]]the person in [[each]]the image frame;
generating a connection line corresponding to [[each]]the image frame by connecting the key points in [[each]]the image frame; 
converting [[each]]the connection line into a vector distance;
determining a posture feature of the [[each]] person in [[each]]the image frame according to the vector distance; and
determining the posture language of the [[each]] person by searching a predetermined database according to the posture feature, wherein the predetermined database pre-stores a first relationship between the posture feature and the posture language.
8. (currently amended) The data processing device according to claim 7, the at least one processor is further caused to:
separate the video data and the audio data from the multimedia file when the multimedia file is obtained from a capture device;
sharpen [[the]] face information when the video data comprises the face information; sharpen [[the]] posture information when the video data comprises the posture information; and enhance [[the]]a voice comprised in the audio data when the audio data comprises the voice; and
obtain another multimedia file from the capture device when the video data does not comprise the face information, and the audio data does not comprise the voice.
9. (canceled).
10. (previously presented) The data processing device according to claim 7, wherein the determining of the posture feature comprises:
determining distance changes between two of the key points.
11. (currently amended) The data processing device according to claim 7, wherein the identifying of personal identity information of [[each]]the person according to the facial features and the voice features of [[each]]the person comprises:
searching the predetermined database according to the facial features and the voice features of [[each]]the person, the predetermined database pre-stores a second relationship between the personal identity information of [[each]]the person and the facial features and the voice features of [[each]]the person.
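12. (original) The data processing device according to claim 11, wherein when the personal identity information of the person cannot be obtained from the predetermined database, the at least one processor is further caused to: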
store the facial features and the voice features of the person in the predetermined database.
13. (currently amended) A non-transitory storage medium having instructions stored thereon, when the instructions are executed by a processor of a data processing device, the processor is configured to perform a conference recording method, wherein the method comprises: 
obtaining a multimedia file corresponding to a conference, wherein the multimedia file comprises video data and audio data;
recognizing posture language of [[each]]a person from the video data of the multimedia file; and
extracting facial features of [[each]]the person from the video data, and extracting voice features of [[each]]the person from the audio data of the multimedia file;
identifying personal identity information of [[each]]the person according to the facial features and the voice features of [[each]]the person;
converting the audio data corresponding to [[each]]the person into text information; and
outputting the posture language, the personal identity information, and the text information corresponding to [[each]]the person;
wherein the recognizing of the posture language of [[each]]the person from the video data comprises: 
extracting [[each]]an image frame comprising the person from the video data;
identifying key points of [[each]]the person in [[each]]the image frame;
generating a connection line corresponding to [[each]]the image frame by connecting the key points in [[each]]the image frame; 
converting [[each]]the connection line into a vector distance;
determining a posture feature of the [[each]] person in [[each]]the image frame according to the vector distance; and
determining the posture language of the [[each]] person by searching a predetermined database according to the posture feature, wherein the predetermined database pre-stores a first relationship between the posture feature and the posture language.
14. (currently amended) The non-transitory storage medium according to claim 13, wherein the method further comprises:
separating the video data and the audio data from the multimedia file when the multimedia file is obtained from a capture device;
sharpening [[the]] face information when the video data comprises the face information; sharpening [[the]] posture information when the video data comprises the posture information; and enhancing [[the]]a voice comprised in the audio data when the audio data comprises the voice; and
obtaining another multimedia file from the capture device when the video data does not comprise the face information, and the audio data does not comprise the voice.
15. (canceled). 
16. (previously presented) The non-transitory storage medium according to claim 13, wherein the determining of the posture feature comprises:
determining distance changes between two of the key points.
17. (currently amended) The non-transitory storage medium according to claim 13, wherein the identifying of personal identity information of [[each]]the person according to the facial features and the voice features of [[each]]the person comprises:
searching the predetermined database according to the facial features and the voice features of [[each]]the person, the predetermined database pre-stores a second relationship between the personal identity information of [[each]]the person and the facial features and the voice features of [[each]]the person.
18. (original) The non-transitory storage medium according to claim 17, wherein when the personal identity information of the person cannot be obtained from the predetermined database, the method further comprises:
storing the facial features and the voice features of the person in the predetermined database.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUNG-HOANG J. NGUYEN whose telephone number is (571)270-1949. The examiner can normally be reached Reg. Sched. 6:00-3:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PHUNG-HOANG J NGUYEN/ Primary Examiner, Art Unit 2651