DETAILED ACTION
Applicant amended claims 2-4, 9, 12, 15, and 19 in the amendment filed on 7/26/2022.  Claims 2-21 are pending in this Office action.

Response to Arguments
Applicant’s arguments with respect to claim(s) 2-21 have been considered but are moot in view of the new ground of rejection.
Applicant argued that the prior art does not teach the amended claims.
In response to Applicant’s argument, the claims are rejected under the new ground of rejection.
In addition, 
a. Applicant argued that Ka does not teach “optical sensor of a device”.
Examiner respectfully disagrees.
According to the published specification of this application, an “optical sensor of a device” is defined as a camera of a device, e.g., a phone (paragraphs 45, 88).
In this case, Ka teaches a camera that is built into a cellular phone (paragraph 4) or a camera of an image capturing element that includes an image sensor (paragraphs 5, 53-54, fig. 4).  The camera of the phone is represented as the optical sensor of a device, or the camera of the image capturing element is represented as the optical sensor of a device.

b.  Applicant argued that the combination of the prior art references does not teach the claimed limitations “capturing, by an optical sensor of a device, live media content showing one or more objects; creating at least one intermediate representation of the live media content; causing a transmission of the live media content, the at least one intermediate representation, or a combination thereof via a network to a server for image recognition to identify the one or more objects with one or more tags associated one or more existing objects; and receiving and causing a presentation of at least one of the one or more existing objects on a user interface of the device” in claim 2.
Examiner respectfully disagrees.
 Ka teaches the claimed limitations:
“capturing, by an optical sensor of a mobile device, live media content showing one or more objects” as capturing, by a camera of a cellular phone as a mobile phone (paragraphs 4-5, abstract), an image that shows one or more objects, e.g., a car and a person (paragraphs 7-8, 50).  The captured image is represented as live media content.  The camera of the phone is represented as the optical sensor of a device, or the camera of the image capturing element is represented as the optical sensor;
“creating, at the mobile device, at least one intermediate representation of the live media content” as creating, at the cellular phone as the device, transformed image data of the captured image for transmission to the phone 3 (fig. 1).  The transformed image data is represented as the at least one intermediate representation of the live media content (paragraphs 7, 50-53, fig. 11).
In particular, the center apparatus 2 receives each captured image from the monitoring cameras 1, stores the captured image, adequately performs data conversion on the captured image in response to a request from the cellular phone 3, and transmits the converted image to the cellular phone 3 (paragraph 50);
“causing a transmission of the live media content, the at least one intermediate representation, or a combination thereof from the mobile device via a network to a server for image recognition to identify the one or more objects with one or more tags associated with one or more existing objects” as follows: when transmission of, for example, monitor image data is requested from the cellular phone 3, the user accesses a home page provided by the WEB function unit 18 by using a WEB browser in the cellular phone 3 (paragraph 73), causing a transmission of the captured image, the converted image, or the transformed image from the center apparatus 2 as a server via a network to the phone 3 (paragraphs 45, 50, 103, fig. 2) for image recognition; e.g., when the user recognizes the object to be displayed in enlarged form, the center apparatus 2 uses the image recognition unit 16 to extract the object to be displayed in enlarged form from the image data stored in the image database 13, which corresponds to the object (paragraph 95).  The pattern matching is processing of recognizing and extracting an object in the image based on, for example, object shape, color, or the like (paragraph 82).
The stored image data in the image database 13 that corresponds to the object is represented as an existing object.  The object shape and color are represented as one or more tags.  The cellular phone 3 is not a server.  The apparatus 2 is not a server.
In particular, when the cellular phone 3 receives the entire image, it is displayed on the display screen.  Accordingly, the user of the cellular phone 3 can check the entirety of the monitor image.  When the user determines that an object to be displayed in enlarged form is shown in the entire image displayed on the display screen, the user accesses the WEB page provided by the center apparatus 2 by utilizing the WEB browsing function of the cellular phone 3, and, after performing authentication, allows the page of the enlarged display menu in the monitor image transmission service to be displayed.  Then, an object to be displayed in enlarged form is selected and transmitted to the center apparatus 2 (paragraph 81).
The enlarged display menu will be described.  FIG. 7 shows an example of the enlarged display menu when it is displayed on the display screen of the cellular phone 3.  In this example, a list of (1) person, (2) dog, (3) cat, and (4) automobile is shown as objects to be enlarged.  As these objects to be enlarged, those that can be identified from the image by pattern matching or the like in the image recognizing unit 16 in the center apparatus 2 are shown (paragraph 82).
“receiving and causing a presentation of the at least one image of the one or more existing objects on a user interface of the mobile device” as receiving and displaying, on a user interface of a phone (fig. 6), the image data of the extracted object that is converted into a format for transmission to the phone (S17).  The object, which is extracted from image data stored in the image database 13, corresponds to the object (fig. 7, paragraphs 94-97).  The image data of the extracted object in the image database 13 is represented as one existing object.
In particular, the user selects an enlarged display menu from the displayed top menu (S12), and its contents are transmitted to the center apparatus 2.  The center apparatus 2 transmits data of the enlarged display menu (S13), and the enlarged display menu is displayed on the display screen of the cellular phone 3 (S14).  This enlarged display menu is shown as the above-described screen display shown in FIG. 7 (paragraph 94).  The user selects an object to be displayed in enlarged form from the displayed enlarged display menu (S15), and its contents are transmitted to the center apparatus 2.  When the user recognizes the object to be displayed in enlarged form, the center apparatus 2 uses the image recognition unit 16 to extract the object to be displayed in enlarged form from the image data stored in the image database 13, which corresponds to the object (paragraph 95).  After that, the extracted image data is converted into the format suitable for the transmission to the cellular phone 3 (S17), this image data is attached to the mail, and transmitted toward the mail address of the cellular phone 3 of the applicable user.  The cellular phone 3 receives this image data and displays it on the display screen (S18) (paragraph 96).
As discussed above, Ka teaches these limitations.

c.  For claim 6, Applicant argued that Barber does not teach the claimed limitation “wherein the at least one image of the one or more existing objects is determined as similar to the live media content based on the one or more tags, a similarity algorithm, or a combination thereof”.
Examiner respectfully disagrees.
Barber teaches that at least one of the images stored in a database is determined to be similar to a sample image based on characteristic values as one or more tags (Barber: col. 7, lines 1-20; col. 2, lines 1-30; col. 6, lines 55-67; col. 10, lines 1-25, figs. 5-6).  The sample image is represented as the live media content.  The images stored in the database are represented as the one or more existing objects.
In particular, when a query is assembled, an object/thumbnail procedure described below is employed to construct a description (a "sample image") of the images which a user wishes to retrieve from the image database, with the query being constructed in terms of values of the image characteristics of interest.  The query is used to find images in the database with image characteristic values that are similar to those included in the sample image (col. 7, lines 1-20).  Walker further teaches one or more objects in an image (Walker: paragraphs 26, 94).
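For illustration only, the kind of similarity query described in the cited Barber passage can be sketched as ranking stored images by the distance between their characteristic values and those of the sample image.  This is a minimal sketch under stated assumptions, not Barber's disclosed implementation: the three-value feature vectors, the Euclidean distance measure, and the names `distance` and `query_similar` are all hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two characteristic-value vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query_similar(sample, database, max_results=3):
    """Return stored image names ordered from best to worst match to the sample."""
    ranked = sorted(database.items(), key=lambda item: distance(sample, item[1]))
    return [name for name, _ in ranked[:max_results]]

# Hypothetical stored images, each described by made-up characteristic values.
database = {
    "bear_by_river": (0.55, 0.30, 0.15),
    "city_street": (0.10, 0.10, 0.80),
    "bear_in_forest": (0.60, 0.35, 0.05),
}

# Hypothetical characteristic values of the sample image used as the query.
sample = (0.56, 0.31, 0.13)
results = query_similar(sample, database, max_results=2)
print(results)
```

In this sketch the stored images play the role of the existing objects and the characteristic values play the role of the tags; the ordering from best to worst match mirrors the sorted result list that the cited passage describes.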

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 19-21 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
The limitation “capturing by a network live media content showing one or more objects” in claim 19 was not described in paragraphs 45, 52 of the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
The dependent claims of claim 19 are rejected for the same reason as discussed for claim 19.

Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

	Claims 2-3, 5, 12, 14, 16, and 19-20 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Kanayama et al. (hereinafter “Ka”) (US 20060007318) in view of Walker et al. (hereinafter “Walker”) (US 20080192129) and Barber et al. (hereinafter “Barber”) (US 5579471).
As to claim 2, Ka teaches the claimed limitations:
“capturing, by an optical sensor of a mobile device, live media content showing one or more objects” as capturing, by a camera of a cellular phone as a mobile phone (paragraphs 4-5, abstract), an image that shows one or more objects, e.g., a car and a person (paragraphs 7-8, 50).  The captured image is represented as live media content.  The camera of the phone is represented as the optical sensor of a device, or the camera of the image capturing element is represented as the optical sensor;
“creating, at the mobile device, at least one intermediate representation of the live media content” as creating, at the cellular phone as the device, transformed image data of the captured image for transmission to the phone 3 (fig. 1).  The transformed image data is represented as the at least one intermediate representation of the live media content (paragraphs 7, 50-53, fig. 11).
In particular, the center apparatus 2 receives each captured image from the monitoring cameras 1, stores the captured image, adequately performs data conversion on the captured image in response to a request from the cellular phone 3, and transmits the converted image to the cellular phone 3 (paragraph 50);
“causing a transmission of the live media content, the at least one intermediate representation, or a combination thereof from the mobile device via a network to a server for image recognition to identify the one or more objects with one or more tags associated with one or more existing objects” as follows: when transmission of, for example, monitor image data is requested from the cellular phone 3, the user accesses a home page provided by the WEB function unit 18 by using a WEB browser in the cellular phone 3 (paragraph 73), causing a transmission of the captured image, the converted image, or the transformed image from the center apparatus 2 as a server via a network to the phone 3 (paragraphs 45, 50, 103, fig. 2) for image recognition; e.g., when the user recognizes the object to be displayed in enlarged form, the center apparatus 2 uses the image recognition unit 16 to extract the object to be displayed in enlarged form from the image data stored in the image database 13, which corresponds to the object (paragraph 95).  The pattern matching is processing of recognizing and extracting an object in the image based on, for example, object shape, color, or the like (paragraph 82).
The stored image data in the image database 13 that corresponds to the object is represented as an existing object.  The object shape and color are represented as one or more tags.  The cellular phone 3 is not a server.  The apparatus 2 is not a server.
In particular, when the cellular phone 3 receives the entire image, it is displayed on the display screen.  Accordingly, the user of the cellular phone 3 can check the entirety of the monitor image.  When the user determines that an object to be displayed in enlarged form is shown in the entire image displayed on the display screen, the user accesses the WEB page provided by the center apparatus 2 by utilizing the WEB browsing function of the cellular phone 3, and, after performing authentication, allows the page of the enlarged display menu in the monitor image transmission service to be displayed.  Then, an object to be displayed in enlarged form is selected and transmitted to the center apparatus 2 (paragraph 81).
The enlarged display menu will be described.  FIG. 7 shows an example of the enlarged display menu when it is displayed on the display screen of the cellular phone 3.  In this example, a list of (1) person, (2) dog, (3) cat, and (4) automobile is shown as objects to be enlarged.  As these objects to be enlarged, those that can be identified from the image by pattern matching or the like in the image recognizing unit 16 in the center apparatus 2 are shown (paragraph 82).
“receiving and causing a presentation of the at least one image of the one or more existing objects on a user interface of the mobile device” as receiving and displaying, on a user interface of a phone (fig. 6), the image data of the extracted object that is converted into a format for transmission to the phone (S17).  The object, which is extracted from image data stored in the image database 13, corresponds to the object (fig. 7, paragraphs 94-97).  The image data of the extracted object in the image database 13 is represented as one existing object.
In particular, the user selects an enlarged display menu from the displayed top menu (S12), and its contents are transmitted to the center apparatus 2.  The center apparatus 2 transmits data of the enlarged display menu (S13), and the enlarged display menu is displayed on the display screen of the cellular phone 3 (S14).  This enlarged display menu is shown as the above-described screen display shown in FIG. 7 (paragraph 94).  The user selects an object to be displayed in enlarged form from the displayed enlarged display menu (S15), and its contents are transmitted to the center apparatus 2.  When the user recognizes the object to be displayed in enlarged form, the center apparatus 2 uses the image recognition unit 16 to extract the object to be displayed in enlarged form from the image data stored in the image database 13, which corresponds to the object (paragraph 95).  After that, the extracted image data is converted into the format suitable for the transmission to the cellular phone 3 (S17), this image data is attached to the mail, and transmitted toward the mail address of the cellular phone 3 of the applicable user.  The cellular phone 3 receives this image data and displays it on the display screen (S18) (paragraph 96).
Ka does not explicitly teach the following limitations:
“continuously”; “from the mobile device ... to a server”;
and “to retrieve from a database at least one image of the one or more existing objects similar to the live media content”.
Walker teaches the claimed limitations:
“continuously capture, by an optical sensor of a device, live media content showing one or more objects” as continuously capturing, by a camera of a cellular phone (paragraphs 78, 90), a scene or image showing an object (paragraphs 268-269), e.g., a family building a sandcastle on a beach (paragraph 26).  In particular, Walker describes automatically capturing a plurality of images (e.g., via a digital camera or other imaging device) and determining whether to stop automatically capturing images.  According to one embodiment, a camera is operable to automatically capture a plurality of images of a scene (e.g., a family building a sandcastle on a beach).  After a number of images have been taken (e.g., halfway through a predetermined set), the camera may evaluate the images already captured by rating them.  If one or more of the images already captured are determined to be of sufficient quality (e.g., by meeting a predetermined rating or other measure of quality), then the camera may determine that it should stop capturing images.  Otherwise (e.g., if the camera determines that the captured images are of insufficient quality), the camera may proceed to capture one or more additional images of the scene.  In this way, the camera may ensure that at least one image of desirable quality is captured (paragraph 26);
“causing a transmission of the live media content, the at least one intermediate representation, or a combination thereof from the mobile device via a network to a server” as, using the cellular telephone, the camera 210 transmitting one or more images via a network to a server, which stores the images (paragraphs 87, 472, 473).  For example, using the wireless capabilities of his mobile phone, a user may upload an image captured using the integrated digital camera to his personal computer, or to a personal database of images on a Web server maintained by his telecommunications company (paragraph 94);
“a server” as a server (paragraphs 87, 94);
“one or more tags” as links 1208 as tags (fig. 12, paragraphs 69-70, 76);
“a server for image recognition to identify the one or more objects with one or more tags” as an image recognition program running on the server 310 that uses the user's personalized database of images for reference in identifying people, objects, and/or scenes with a meta-tag (paragraphs 64, 160) in an image captured by the user (paragraph 96).
Ka and Walker disclose a method of extracting data from a database for transmitting and displaying retrieved data to a user device.  These references are in the same field as the application.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Walker’s teaching to Ka’s system in order to automatically capture and manage images such that users may be able to capture higher quality images easily and reliably while minimizing or otherwise managing the danger of running out of memory on their cameras (Walker: paragraphs 22-24), to upload an image captured using the integrated digital camera to the user’s personal computer or to a Web server via a network while the user is still away from home on vacation (Walker: paragraph 94), and further to direct a server to execute a facial recognition program on a captured image and to return an indication of the best matches to the camera via the communication network (Walker: paragraph 95).
Barber teaches the claimed limitations:
“to retrieve from a database at least one image of the one or more existing objects similar to the live media content” as retrieving, from an image database, at least one image of a bear as an object similar to a sample image, e.g., Bears thumbnail 106 (col. 7, lines 1-20, col. 9, lines 33-67, figs. 5-6);
“receiving and causing a presentation of the at least one image of the one or more existing objects on a user interface of the mobile device” as shown in FIG. 6, where the bears/water texture query returns pictures containing bears and water.  The query results are illustrated in a container 110 shown in FIG. 6.  The three bear pictures were returned by the query because in each a bear existed slightly off center and there was water in the picture off to the right, which corresponds to the layout information for the two thumbnails illustrated in the example image window 90 (fig. 5) of a device that is not a mobile device.  The order of the returned images is preferably sorted from best to worst match, and the number of images returned can be controlled by manipulation of the thumbnail attributes of weight and distance described above;
“to identify the one or more objects with one or more tags associated with one or more existing objects” as identifying the one or more objects with one or more tags associated with one or more existing objects (col. 7, lines 1-20, col. 9, lines 33-67, figs. 5-6).
Ka and Barber disclose a method of extracting data from a database for transmitting and displaying retrieved data to a user device.  These references are in the same field as the application.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Barber’s teaching to Ka’s system in order to find images in the database with image characteristic values that are similar to those included in the sample image, to provide, in a query-by-image-content system, a visual direct-manipulation interface for the construction of database queries based on image content, and further to construct a result list of images satisfying the query parameters.

	As to claim 3, Ka and Walker teach the claimed limitations “wherein the image recognition applied on the one or more objects includes object-recognition, face-recognition, bar-code recognition, optical character recognition, or a combination thereof” as the image recognition applied on the object including object recognition (Ka: paragraphs 16, 74, 95, 98; Walker: paragraph 95), and “wherein the at least one image of the one or more existing objects was captured before the live media content being captured” as at least one of the images already captured being captured before the additional image of the scene is captured (Walker: paragraph 26).
	In particular, after a number of images have been taken (e.g., halfway through a predetermined set), the camera may evaluate the images already captured by rating them.  If one or more of the images already captured are determined to be of sufficient quality (e.g., by meeting a predetermined rating or other measure of quality), then the camera may determine that it should stop capturing images.  Otherwise (e.g., if the camera determines that the captured images are of insufficient quality), the camera may proceed to capture one or more additional images of the scene.  The one or more additional images of the scene are thus captured after the images already captured (Walker: paragraph 26).
	Alternatively, when a query is assembled, an object/thumbnail procedure described below is employed to construct a description (a "sample image") of the images which a user wishes to retrieve from the image database, with the query being constructed in terms of values of the image characteristics of interest.  The query is used to find images in the database with image characteristic values that are similar to those included in the sample image (Barber: col. 7, lines 1-20).  In this case, the sample image is created, i.e., captured, after the images are stored in the database.
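For illustration only, the capture-and-rate behavior cited from Walker (paragraph 26) can be sketched as a loop that stops capturing once at least one image already captured meets a quality threshold.  This is a minimal sketch under stated assumptions, not Walker's disclosed implementation: the function name `capture_until_good`, the `capture` and `rate` callables, the threshold value, and the choice to re-check after every capture from the halfway point onward are all hypothetical.

```python
def capture_until_good(capture, rate, max_images=10, min_rating=0.8):
    """Capture up to max_images, stopping early once a good image exists.

    Assumption: ratings are evaluated starting halfway through the
    predetermined set and re-checked after each further capture.
    """
    images = []
    halfway = max_images // 2
    for count in range(1, max_images + 1):
        images.append(capture())
        good_enough = any(rate(image) >= min_rating for image in images)
        if count >= halfway and good_enough:
            break  # at least one image of sufficient quality was captured
    return images

# Simulated camera: each "image" is represented only by a made-up quality score.
scores = iter([0.2, 0.4, 0.9, 0.5, 0.3, 0.6, 0.7, 0.1, 0.2, 0.3])
images = capture_until_good(capture=lambda: next(scores), rate=lambda s: s)
print(len(images))
```

Every image in the returned list except the last was captured before it, which is the temporal ordering relied on above for the limitation “captured before the live media content being captured”.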

	As to claims 5 and 16, Ka and Walker teach the claimed limitation “wherein the one or more sensors include one or more audio sensors, one or more proximity sensor, one or more wireless interface sensors, one or more temperature sensors, one or more smell sensors, one or more body parameter sensors, one or more motion sensors, one or more accelerometers, one or more brightness sensors, one or more optical sensors, or a combination thereof” as a wireless cellular telephone or device (Walker: paragraphs 86-87, 89), an audio sensor (Walker: paragraph 143), and the communication network 4, which may be wireless or wired, or may have a mixed form of wireless and wired communications (Ka: paragraph 46).

Claim 12 recites the same claimed subject matter as discussed for claim 2; thus claim 12 is rejected for the same reason as discussed for claim 2.  In addition, Ka teaches an apparatus comprising:
“at least one processor; and at least one memory including computer program code for one or more programs, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus embedded in a mobile device to perform at least the following” as a CPU and a memory that includes a program and program functions configured to, with the at least one processor, cause the apparatus connected to a mobile device to perform the following operations (Ka: paragraphs 61, 63, 68);
“cause a transmission of the live media content via a network to a server for image recognition to identify the one or more objects with one or more tags associated one or more existing objects” as follows: when transmission of, for example, monitor image data is requested from the cellular phone 3, the user accesses a home page provided by the WEB function unit 18 by using a WEB browser in the cellular phone 3 (Ka: paragraph 73), causing a transmission of the captured image, the converted image, or the transformed image from the center apparatus 2 as a server via a network to the phone 3 (Ka: paragraphs 45, 50, 103, fig. 2) for image recognition; e.g., when the user recognizes the object to be displayed in enlarged form, the center apparatus 2 uses the image recognition unit 16 to extract the object to be displayed in enlarged form from the image data stored in the image database 13, which corresponds to the object (Ka: paragraph 95).  The pattern matching is processing of recognizing and extracting an object in the image based on, for example, object shape, color, or the like (Ka: paragraph 82).
The stored image data in the image database 13 that corresponds to the object is represented as an existing object.  The object shape and color are represented as one or more tags.  The cellular phone 3 is not a server.  The apparatus 2 is not a server.
In particular, when the cellular phone 3 receives the entire image, it is displayed on the display screen.  Accordingly, the user of the cellular phone 3 can check the entirety of the monitor image.  When the user determines that an object to be displayed in enlarged form is shown in the entire image displayed on the display screen, the user accesses the WEB page provided by the center apparatus 2 by utilizing the WEB browsing function of the cellular phone 3, and, after performing authentication, allows the page of the enlarged display menu in the monitor image transmission service to be displayed.  Then, an object to be displayed in enlarged form is selected and transmitted to the center apparatus 2 (Ka: paragraph 81).
The enlarged display menu will be described.  FIG. 7 shows an example of the enlarged display menu when it is displayed on the display screen of the cellular phone 3.  In this example, a list of (1) person, (2) dog, (3) cat, and (4) automobile is shown as objects to be enlarged.  As these objects to be enlarged, those that can be identified from the image by pattern matching or the like in the image recognizing unit 16 in the center apparatus 2 are shown (Ka: paragraph 82).
Walker teaches the claimed limitation:
“cause a transmission of the live media content via a network to a server for image recognition to identify the one or more objects with one or more tags associated one or more existing objects” as, using the cellular telephone, the camera 210 transmitting one or more images via a network to a server, which stores the images (paragraphs 87, 472, 473).  For example, using the wireless capabilities of his mobile phone, a user may upload an image captured using the integrated digital camera to his personal computer, or to a personal database of images on a Web server maintained by his telecommunications company (paragraph 94); an image recognition program running on the server 310 uses the user's personalized database of images for reference in identifying people, objects, and/or scenes with a meta-tag (paragraphs 64, 160) in an image captured by the user (paragraph 96).
Ka and Walker disclose a method of extracting data from a database for transmitting and displaying retrieved data to a user device.  These references are in the same field as the application.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Walker’s teaching to Ka’s system in order to automatically capture and manage images such that users may be able to capture higher quality images easily and reliably while minimizing or otherwise managing the danger of running out of memory on their cameras (Walker: paragraphs 22-24), to upload an image captured using the integrated digital camera to the user’s personal computer or to a Web server via a network while the user is still away from home on vacation (Walker: paragraph 94), and further to direct a server to execute a facial recognition program on a captured image and to return an indication of the best matches to the camera via the communication network (Walker: paragraph 95).

	As to claims 14 and 20, Ka and Walker teach the claimed limitation “wherein the image recognition applied on the one or more objects includes object-recognition, face-recognition, bar-code recognition, optical character recognition, or a combination thereof” as the image recognition applied on the object including object recognition (Ka: paragraphs 16, 74, 95, 98; Walker: paragraph 95).

Claim 19 has the same claimed limitation subject matter as discussed in claim 2; thus claim 19 is rejected under the same reason as discussed in claim 2.  In addition, Ka teaches non-transitory computer-readable storage medium carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause an apparatus to perform (memory includes instructions executed by a processor: paragraphs 61-63):
“capturing by a network live media content showing one or more objects” as capturing, by a camera of a cellular phone as a mobile phone (paragraphs 4-5, abstract), image that shows one or more object e.g., car and a person (paragraphs 7-8, 50). The captured image is represented as live media content.  The camera of a phone is represented as optical sensor of a device or the camera of image capturing element is represented as optical sensor, “wherein the live media content is captured via an optical sensor of a mobile device” as the image is captured by a camera of the phone (paragraphs 4, 7-8);
“applying image recognition on the live media content to identify the one or more objects with one or more tags” as applying pattern matching as image recognition on the captured image to identify an object with shape and color (paragraphs 82, 95, 100).
In particular, objects to be enlarged, those that can be identified from the image by pattern matching or the like in the image recognizing unit 16 in the center apparatus 2 are shown. The pattern matching is processing of recognizing and extracting an object in the image based on, for example, object shape, and color, or the like (paragraph 82);
“searching a database for at least one image of the one or more existing objects associated with the one or more tags” as searching a store image data S4 (fig. 9) or image database 13 for the image data associated with display menu (fig. 9, paragraphs 82, 94-97) or shape and color (paragraphs 94, 100).  The display menu or shape or color is the one or more tags;
“causing a presentation of the at least one image of the one or more existing objects on a user interface of the mobile device” as receiving and displaying, on a user interface of a phone (fig. 6), the image data of the extracted object that is converted into a format for transmission to the phone (S17). The object, which is extracted from image data stored in image database 13, corresponds to the object (fig. 7, paragraphs 94-97). The image data of the extracted object in image database 13 is represented as one existing object. In particular, the user selects an enlarged display menu from the displayed top menu (S12), and its contents are transmitted to the center apparatus 2. The center apparatus 2 transmits data of the enlarged display menu (S13), and the enlarged display menu is displayed on the display screen of the cellular phone 3 (S14). This enlarged display menu is shown as the above-described screen display shown in FIG. 7 (paragraph 94). The user selects an object to be displayed in enlarged form from the displayed enlarged display menu (S15), and its contents are transmitted to the center apparatus 2. When the user recognizes the object to be displayed in enlarged form, the center apparatus 2 uses the image recognition unit 16 to extract the object to be displayed in enlarged form from the image data stored in the image database 13, which corresponds to the object (paragraph 95). After that, the extracted image data is converted into the format suitable for the transmission to the cellular phone 3 (S17), this image data is attached to the mail, and transmitted toward the mail address of the cellular phone 3 of the applicable user. The cellular phone 3 receives this image data, and displays it on the display screen (S18).
Ka does not explicitly teach limitations: 
continuously; associated with the one or more tags and similar to the live media content.
Walker teaches the claimed limitations:
 “continuously capturing by a live media content showing one or more objects” as continuously capturing, by a camera of a cellular phone (paragraphs 78, 90), a scene or image showing an object (paragraphs 268-269), e.g., a family building a sandcastle on a beach (paragraph 26). In particular, a camera is operable to automatically capture a plurality of images (e.g., via a digital camera or other imaging device) and is further operable to determine whether to stop automatically capturing images. According to one embodiment, a camera is operable to automatically capture a plurality of images of a scene (e.g., a family building a sandcastle on a beach). After a number of images have been taken (e.g., halfway through a predetermined set), the camera may evaluate the images already captured by rating them. If one or more of the images already captured are determined to be of sufficient quality (e.g., by meeting a predetermined rating or other measure of quality), then the camera may determine that it should stop capturing images. Otherwise (e.g., if the camera determines that the captured images are of insufficient quality), the camera may proceed to capture one or more additional images of the scene. In this way, the camera may ensure that at least one image of desirable quality is captured (paragraph 26);
“applying image recognition on the live media content to identify the one or more objects with one or more tags” as an image recognition program running on the server 310 uses the user's personalized database of images for reference in identifying people, objects, and/or scenes with meta-tags (paragraphs 64, 160) in an image captured by the user (paragraph 96).
Ka and Walker disclose a method of extracting data from a database for transmitting and displaying retrieved data to a user device. These references are in the same field as the application. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Walker’s teaching to Ka’s system in order to automatically capture and manage images such that users may be able to capture higher quality images easily and reliably while minimizing or otherwise managing the danger of running out of memory on their cameras (Walker: paragraphs 22-24), to upload an image captured using the integrated digital camera to the user’s personal computer on a Web server via a network while the user is still away from home on vacation (Walker: paragraph 94), to easily transfer files from the camera to a personal computer, and further to direct a server to execute a facial recognition program on a captured image and to return an indication of the best matches to the camera via the communication network (Walker: paragraph 95).
Barber teaches the claimed limitations:
 “searching a database for at least one image of the one or more existing objects associated with the one or more tags and similar to the live media content” as searching and retrieving from a database an image of the object (col. 7, lines 1-20, col. 9, lines 33-67, figs. 5-6) associated with a feature, e.g., color, and similar to a sample image such as Bears thumbnail 106 (col. 2, lines 1-30; col. 6, lines 55-67; col. 10, lines 1-25);
 “applying image recognition on the live media content to identify the one or more objects with one or more tags” as 106  (col. 2, lines 1-30; col. 6, lines 55-67; col. 10, lines 1-25);
“causing a presentation of the at least one image of the one or more existing objects on a user interface of the mobile device” as causing a presentation of the at least one image of the one or more existing objects on a user interface of a device (figs. 5-6, col. 9, lines 33-67) that is not a mobile device.
Ka and Barber disclose a method of extracting data from a database for transmitting and displaying retrieved data to a user device. These references are in the same field as the application. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Barber’s teaching to Ka’s system in order to find images in the database with image characteristic values that are similar to those included in the sample image, to provide, in a query-by-image-content system, a visual direct-manipulation interface for the construction of database queries based on image content, and further to construct a result list of images satisfying the query parameters.

	Claims 4, 6, 8-9, 13, 15, 17, 21 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Ka in view of Walker and Barber and Sheha et al (or hereinafter “Sheha”) (US 20030036848).
	As to claims 4, 15, Ka does not explicitly teach the claimed limitation “determining meta-information based on sensor data from one or more sensors of the mobile device, wherein the one or more objects are identified with the one or more tags further based on the meta-information”.
Sheha teaches that a first restaurant is identified with the name of the restaurant as a tag based on the location of the restaurant, which is determined by the user’s position or the device’s location as a sensor of the device (paragraphs 63, 66, 76-77, fig. 12). Walker teaches comparing an image with stored images based on metadata (paragraphs 291-311). Barber teaches identifying the one or more objects with one or more tags associated with one or more existing objects (col. 7, lines 1-20, col. 9, lines 33-67, figs. 5-6).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Walker’s teaching, Barber’s teaching, and Sheha’s teaching to Ka’s system in order to allow most of the processing to be completed by the networked server system, where there are abundant resources, such as memory, processing power, and electrical power, since most mobile devices typically do not have an abundance of these resources relative to the networked server system, to allow a user to navigate the links to a web page in a web browser for information related to objects, to enable delivery of results in proper formats from a server to a device by using a graphical user interface (Sheha: paragraphs 18, 67), and further to allow users to conveniently retrieve the most important objects that they feel are applicable, thus reducing the time to initiate a new search (Sheha: paragraph 89).

	As to claims 6, 17, 21, Ka, Walker and Barber teach the claimed limitation “wherein the at least one image of the one or more existing objects is determined as similar to the live media content based on the one or more tags, a similarity algorithm, or a combination thereof” as at least one image of the images stored in the database is determined as similar to the sample image based on characteristic values as one or more tags (Barber: col. 7, lines 1-20; col. 2, lines 1-30; col. 6, lines 55-67; col. 10, lines 1-25, figs. 5-6). The sample image is represented as the live media content. The stored images in the database are represented as one or more existing objects.
	In particular, when a query is assembled, an object/thumbnail procedure described below is employed to construct a description (a "sample image") of the images which a user wishes to retrieve from the image database, with the query being constructed in terms of values of the image characteristics of interest. The query is used to find images in the database with image characteristic values that are similar to those included in the sample image (Barber: col. 7, lines 1-20); one or more objects in image (Walker: paragraphs 94, 26).
	Ka does not explicitly teach the limitations:
	“wherein the presentation of the at least one existing object further includes information of the at least one existing object, information for ordering the at least one existing object, or a combination thereof”; or
	“wherein the one or more existing objects include one or more products, one or more services, one or more points of interest, one or more point of interest reviews, one or more people, one or more social networking profiles associated with the one or more existing objects, or a combination thereof, wherein the at least one image of the one or more existing objects is determined as similar to the live media content based on the one or more tags, a similarity algorithm, or a combination thereof”.
	Walker teaches images may be ordered according to their ratings (e.g., highest quality images first), or stored in folders based on content or quality (e.g., a first folder for high quality image of Alice, a second folder for medium-quality images of Alice, and a third folder for high quality images of Bob) (paragraph 340).
	Sheha teaches a first search result of a restaurant that is stored in memory (fig. 4, paragraphs 65, 67) as a representation of an object that includes a ranking, e.g., 90% for ranking the restaurant (fig. 10, paragraphs 33, 68). The stored restaurant in the memory is represented as the existing object and live media content (paragraphs 78-79). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Walker’s teaching and Sheha’s teaching to Ka’s system in order to allow most of the processing to be completed by the networked server system, where there are abundant resources, such as memory, processing power, and electrical power, since most mobile devices typically do not have an abundance of these resources relative to the networked server system, to allow a user to navigate the links to a web page in a web browser for information related to objects, to enable delivery of results in proper formats from a server to a device by using a graphical user interface (Sheha: paragraphs 18, 67), and further to allow users to conveniently retrieve the most important objects that they feel are applicable, thus reducing the time to initiate a new search (Sheha: paragraph 89).

As to claim 8, Ka does not explicitly teach the claimed limitation “wherein the one or more existing objects include one or more products, one or more services, one or more points of interest, one or more point of interest reviews, one or more people, one or more social networking profiles associated with the one or more existing objects, or a combination thereof”. Sheha teaches a restaurant, e.g., the Chart House restaurant, that would also have a category of Seafood, with a rating as NULL or user ratings or user reviews (fig. 17, paragraphs 91-92) and live media content (paragraphs 78-79). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Sheha’s teaching to Ka’s system in order to allow most of the processing to be completed by the networked server system, where there are abundant resources, such as memory, processing power, and electrical power, since most mobile devices typically do not have an abundance of these resources relative to the networked server system, to allow a user to navigate the links to a web page in a web browser for information related to objects, to enable delivery of results in proper formats from a server to a device by using a graphical user interface (Sheha: paragraphs 18, 67), and further to allow users to conveniently retrieve the most important objects that they feel are applicable, thus reducing the time to initiate a new search (Sheha: paragraph 89).

As to claim 9, Ka does not explicitly teach the claimed limitation “estimating a size, a position, or a combination thereof of one of the objects as pointed by the mobile device based on the live media content, metadata associated with the live media content, or a combination thereof”. Sheha teaches determining one’s position as pointed by a navigation device based on a point of interest, e.g., the nearest gas station (Sheha: paragraphs 8-10) and live media content (paragraphs 78-79). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Sheha’s teaching to Ka’s system in order to allow most of the processing to be completed by the networked server system, where there are abundant resources, such as memory, processing power, and electrical power, since most mobile devices typically do not have an abundance of these resources relative to the networked server system, to allow a user to navigate the links to a web page in a web browser for information related to objects, to enable delivery of results in proper formats from a server to a device by using a graphical user interface (Sheha: paragraphs 18, 67), and further to allow users to conveniently retrieve the most important objects that they feel are applicable, thus reducing the time to initiate a new search (Sheha: paragraph 89).

 As to claim 13, Ka does not explicitly teach the claimed limitation “wherein the apparatus is further caused to: create at least one intermediate representation of the live media content; and cause a transmission of the at least one intermediate representation via the network to the server, wherein the one or more objects are identified with the one or more tags based on the at least one intermediate representation”.
However, Ka teaches causing a transmission of the captured image, the converted image via network to user phone or server (Ka: fig. 6, paragraphs 5, 50, 103) for image recognizing processing to identify image data that includes object (Ka: fig. 1) as an object  (Ka: paragraphs 71, 74)  with object shape, and color as one or more tags (Ka: paragraph 82).  
Sheha teaches that restaurants are identified based on search parameters, e.g., categories, sub-categories, search distances or names of restaurants as tags based on representation 903 (fig. 9, paragraphs 63-65). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Sheha’s teaching to Ka’s system in order to allow most of the processing to be completed by the networked server system, where there are abundant resources, such as memory, processing power, and electrical power, since most mobile devices typically do not have an abundance of these resources relative to the networked server system, to allow a user to navigate the links to a web page in a web browser for information related to objects, to enable delivery of results in proper formats from a server to a device by using a graphical user interface (Sheha: paragraphs 18, 67), and further to allow users to conveniently retrieve the most important objects that they feel are applicable, thus reducing the time to initiate a new search (Sheha: paragraph 89).

Claims 7, 18 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Ka in view of Walker and Barber and further in view of Boncyk et al (or hereinafter “Boncyk07”) (US 20060002607).
	As to claim 7, Ka and Walker teach the limitation “wherein the presentation of the at least one existing object further includes the translation” as an enlarged image as presentation of an object, including the object which the user wishes to check in detail, can be displayed on the display screen of the cellular phone 3 (Ka: paragraph 96; Walker: paragraphs 21, 23). The object is not a translation.
 Ka does not explicitly teach the claimed limitation: translation; causing a translation of at least one of the one or more objects into a predetermined language, wherein the presentation of the at least one existing object further includes the translation.  Boncyk07 teaches translation (paragraph 64) and imagery is captured of a person gesturing in sign language.  Image/motion recognition techniques are used to translate the sign language into text or other machine-understandable data, such as text (paragraph 64).   Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Boncyk07’s teaching to Ka’s system in order to allow for multiple independent convergent search processes of the database to occur in parallel, which greatly improves image match speed and match robustness in the database matching to achieve fast searching of large databases and further to identify a specific target object  out of many such objects that have similar appearance and differ only in the identifying marks.

As to claim 18, Ka and Walker teach the limitation “wherein the presentation of the at least one existing object further includes the translation” as an enlarged image as presentation of an object, including the object which the user wishes to check in detail, can be displayed on the display screen of the cellular phone 3 (Ka: paragraph 96; Walker: paragraphs 21, 23). The object is not a translation.
 Ka does not explicitly teach the claimed limitation: cause a translation of at least one of the one or more objects into a predetermined language.  Boncyk07 teaches translation (paragraph 64) and imagery is captured of a person gesturing in sign language.  Image/motion recognition techniques are used to translate the sign language into text or other machine-understandable data, such as text (paragraph 64).   Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Boncyk07’s teaching to Ka’s system in order to allow for multiple independent convergent search processes of the database to occur in parallel, which greatly improves image match speed and match robustness in the database matching to achieve fast searching of large databases and further to identify a specific target object  out of many such objects that have similar appearance and differ only in the identifying marks.

	Claim 10 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Ka in view of Walker and Barber and further in view of Seki et al (or hereinafter “Seki”) (US 20010048774).
As to claim 10, Ka does not explicitly teach the claimed limitation automatically zooming to the one object based on the size, the position, or a combination thereof.  Seki teaches when automatic processing instruction information (instruction for post-recording), corresponding to a selected image title in this list, indicates sending a zoomed image, the shot image data is zoomed to a specified size, which is attached to e-mail, and the data is sent to a specified destination (paragraph 332). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Seki’s teaching to Ka’s system in order to send zoomed image data in a specified size as an attachment to a specified destination via network after the data is recorded and further to prevent the user from forgetting to take required pictures, thus improving usability for the user. 

	Claim 11 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Ka in view of Walker and Barber and Vitikainen et al (US 20030065802).
	As to claim 11, Ka does not explicitly teach the claimed limitation automatically retrieving a preview associated with the one object based on a corresponding one of the tags.  Vitikainen teaches one or more size parameters associated with a multimedia preview sample to be created are provided 200.  One or more parameters concerning the composition of the multimedia preview sample are also provided 202.  Using the size and composition parameters, a preview sample is dynamically extracted 204 from the subject multimedia content.  A customized preview sample is generated 206 using the extracted preview sample (paragraph 44).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Vitikainen’s teaching to Ka’s system in order to edit images where desired for distributing the images to one or more mobile phone users at a sufficient level of quality, and further to provide high quality multimedia previewing that is optimized for a given mobile terminal.


	Claim 11 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Ka in view of Walker and Barber and further in view of O’Brien (US 20070043792).
	As to claim 11, Ka does not explicitly teach the claimed limitation automatically retrieving a preview associated with the one object based on a corresponding one of the tags. O’Brien teaches that once a master image file has been saved at 74, the software attempts to find a preview of the master image within the saved jfif file, as is shown in FIG. 5 at 76. The preview size is defined to be any image with its longest side in the range of 175 to 520 pixels. If a preview image is found, the data associated with the preview image is extracted from the master image file at 78 and sent as a preview image file in jfif format to an output folder within a queueing system at 80 ready for onward transmission at 82 to a server at the event editing facility 56 (paragraphs 47-48). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply O’Brien’s teaching to Ka’s system in order to edit the photographs where desired, to distribute the photographs to one or more clients such as mobile phone users, to forward images, and further to edit the images or to add other information such as one or more captions prior to distribution.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAM-Y T TRUONG whose telephone number is (571)272-4042.  The examiner can normally be reached on (571) 272 4042.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached on (571) 272 4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CAM Y T TRUONG/             Primary Examiner, Art Unit 2169