DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 3, 5, 7, 12, 19, 21, 23, 25, and 37 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sofer (USPG 2006/0098089, hereinafter referred to as Sofer).

Regarding claims 1, 19, and 37, Sofer discloses a system, method, and computer program product comprising a non-transitory computer-readable storage medium having program code embodied therewith (figure 2, inherently includes hardware and software components such as object synthesis and recognition and vision processing), comprising:
at least one hardware processor configured to: 
receive a digital image of a scene (paragraphs 50-51, 57-58; also see figure 2, CCD, 3DImager for capturing image of a scene); 
figure 2, Object Synthesis and Recognition module 214 responsible for analyzing captured images and identifying objects in the captured images); and 
for each identified object: 
determine values for a plurality of physical attributes of the respective identified object (paragraphs 58-63, determining object’s physical properties, such as dimension, color, texture, etc.), 
synthesize a vocalized verbal description of the respective identified object, wherein at least some of the values of said plurality of physical attributes are expressed by non-verbal audio parameters of the synthesized vocalized verbal description (paragraph 63, generating verbal description of the identified object, including determined physical properties of the object), and 
output said synthesized vocalized verbal description through a loudspeaker or an earphone (figure 2, headphone 230).  

Regarding claims 3, 5, 7, 21, 23, and 25, Sofer further discloses wherein said plurality of physical attributes are selected from the group consisting of: location in the horizontal dimension, location in the vertical dimension, location in the depth dimension, height, width, size, color, depth, weight, texture, temperature, identity of a human, sex, height of a human, weight of a human, age of a human, nationality of a human, and emotional state or mood of a human (paragraphs 58-63, determining object’s physical properties, such as dimension, color, texture, etc.); wherein said object comprises at paragraph 64, use of OCR to extract textual data from image); wherein said identification comprises retrieving information with respect to said identified object from at least one of a database of the system, an external network resource, a cloud server, and the Internet (figure 2, World Model Knowledge base 221; also see paragraph 57, searching World Model Knowledge base).  

Regarding claim 12, Sofer further discloses wherein each of the non-verbal audio parameters is associated with a unique physical attribute, based on user selection (paragraphs 66-69, user’s confirmation, rejection, or correction).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 10-11, 20, and 28-29 are rejected under 35 U.S.C. 103 as being unpatentable over Sofer in view of Cavender et al. (USPG 2009/0192785, hereinafter referred to as Cavender). 

paragraphs 35 and 71, “use of 3D audio properties to reflect the location of each object” and “a distant object on the left could be described using a low volume in the left ear”); wherein a said vocalized verbal description of said respective identified object comprises two or more concurrent vocalized verbal descriptions (paragraphs 45, 50, and 67, condense description based on priority of objects, and priority of objects is based on distance or location of objects). 
Because Sofer and Cavender are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of condensing description and generating speech using physical attributes of the object in order to improve the naturalness of the generated speech. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).

Regarding claims 11 and 29, Sofer further discloses wherein said at least one hardware processor is further configured to: slice the image into a plurality of slices (paragraphs 52 and 56, frames and/or blocks); detect, in a specified order for each slice at a time, the location of at least a portion of the object contained within the slice (paragraph 51, computing distance of objects; paragraphs 63 and 77, determining distance and/or position of objects); combine in the specified order each object-dependent signal with a respective location-dependent signal for creating a combined object-location signal (paragraph 51, computing distance of objects; paragraphs 63 and 77, determining distance and/or position of objects; paragraphs 56-58, processing all frames and/or blocks); and output the combined object-location signal concurrently with said synthesized vocalized verbal description (paragraph 51, computing distance of objects; paragraphs 63 and 77, determining distance and/or position of objects; paragraphs 56-58, processing all frames and/or blocks).
Sofer fails to explicitly disclose the following limitations; however, Cavender teaches: associate a sound or tactile object-dependent signal with the object (Cavender: paragraph 71, “a distant object on the left could be described using a low volume in the left ear”); associate a sound or tactile location-dependent signal unique for each slice (Cavender: paragraph 71, “a distant object on the left could be described using a low volume in the left ear”; audio volume and left or right speaker are determined based on location or position of objects detected in each slice of image).
Because Sofer and Cavender are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of combining and condensing multiple descriptions. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).

Claims 8 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Sofer in view of Cavender, and further in view of Siskind et al. (USPG 2014/0369596, hereinafter referred to as Siskind).

Regarding claims 8 and 26, Sofer fails to explicitly disclose, but Cavender teaches, wherein a plurality of said synthesized vocalized verbal descriptions corresponding to a plurality of objects disposed in different locations about a said image, are combined into a continuous sequence in a specified order, based on the relative locations of said plurality of objects in the image (paragraphs 45, 50, and 67, condense description based on priority of objects, and priority of objects is based on distance or location of objects).
Because Sofer and Cavender are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of combining and condensing multiple descriptions. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).
The modified Sofer fails to explicitly disclose, but Siskind teaches, wherein said specified order is selected from the group consisting of: left-to-right, right-to-left, top-to-bottom, and bottom-to-top (figures 1A-B and/or paragraphs 48-49, “The person to the left of the backpack approached the trash-can”).

Claims 13 and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Sofer in view of Shin (USPN 10175863, hereinafter referred to as Shin).

Regarding claims 13 and 31, Sofer fails to explicitly disclose, but Shin teaches, wherein at least some of the determined values for said plurality of physical attributes of a said identified object are expressed by haptic signals, and wherein said haptic signals are being output concurrently with said vocalized verbal description of the respective identified object (col. 12, line 41 to col. 14, line 36 and/or referring to figure 6; a user identifies an object in an image via touch or haptic input, and the generated vocal description is associated with the object in the image. The final output is a video generated and transmitted or outputted as a result of the user’s input and the generated description at the same time).
Since Sofer and Shin are analogous in the art because they are from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of using haptic input to .

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Cooper (USPN 6118456) teaches a method of prioritizing and streaming objects within a 3D virtual environment, which is considered pertinent to the claimed invention.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUYEN X VO whose telephone number is (571)272-7631. The examiner can normally be reached M-F, 8-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, 





/HUYEN X VO/Primary Examiner, Art Unit 2656