Prior art — US Patent 8355484

Analysis of Prior Art for U.S. Patent 8,355,484

Based on a review of the patent file for U.S. Patent 8,355,484, entitled "Methods and apparatus for masking latency in text-to-speech systems," the following examiner-cited references are considered the most relevant prior art. This analysis examines each reference's potential to anticipate the claims of the '484 patent under 35 U.S.C. § 102. Anticipation requires a single reference to disclose every element of a claim, so each entry also notes the claim elements the reference appears to lack.

The core of the '484 patent is a method for making the response delay in a text-to-speech (TTS) system feel more natural to the user. The system plays "transitional messages," such as "um" or "let me see," while it processes the user's request and before the final synthesized speech response is ready.
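
The latency-masking idea described above can be illustrated with a minimal sketch. It is not the patent's implementation: the function names, the filler phrases chosen, the threshold value, and the use of Python's asyncio are all assumptions made for illustration. A slow response is started in the background, and a filler is "played" (here, appended to a list) each time the response is still not ready after a short wait.

```python
import asyncio


async def slow_tts_response(request: str, delay: float) -> str:
    """Hypothetical stand-in for request processing plus speech synthesis."""
    await asyncio.sleep(delay)
    return f"Answer to: {request}"


async def respond(request, delay, threshold=0.05, fillers=("um", "let me see")):
    """Return the utterances played to the user, in order.

    threshold is an assumed wait (in seconds) before each filler is played.
    """
    played = []
    task = asyncio.create_task(slow_tts_response(request, delay))
    for filler in fillers:
        # Give the real response a short window to finish first.
        done, _pending = await asyncio.wait({task}, timeout=threshold)
        if done:
            break
        # Still processing: mask the delay with a transitional message.
        played.append(filler)
    # Deliver the final response once it is ready.
    played.append(await task)
    return played


if __name__ == "__main__":
    print(asyncio.run(respond("weather", delay=0.3)))
```

In this sketch a fast response produces no filler at all, while a slow one is preceded by one or more transitional messages, which is the behavior the paragraph above describes.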

Key Prior Art References and Potential Anticipation

1. U.S. Patent 5,737,393 A

Full Citation: Wolf, E. (1998). Script-based interactive voice mail and voice response system. U.S. Patent No. 5,737,393. U.S. Patent and Trademark Office.
Publication Date: April 7, 1998
Filing Date: July 31, 1995
Brief Description: This patent describes an interactive voice response (IVR) system that uses scripts to guide a caller through a series of voice prompts. It details how the system can play various pre-recorded messages and prompts based on user input. A key feature is the ability to provide "acknowledgment" messages to the user, confirming that their input was received and is being processed.
Potential Anticipation of Claims: The '393 patent arguably discloses the broader elements of claims 1, 3, 6, and 12 of the '484 patent. These claims cover the fundamental idea of receiving a communication, processing it, and providing a transitional message during processing, and the "acknowledgment" messages in the '393 patent could be interpreted as a form of "transitional message." However, the '393 patent does not explicitly mention paralinguistic events (such as "um" or a cough) used to mask latency in a natural-sounding way, which is a more specific element of the '484 patent. The case for anticipation is therefore weaker wherever that element is required.

2. U.S. Patent 6,345,250 B1

Full Citation: Martin, D. L. (2002). Developing voice response applications from pre-recorded voice and stored text-to-speech prompts. U.S. Patent No. 6,345,250. U.S. Patent and Trademark Office.
Publication Date: February 5, 2002
Filing Date: February 24, 1998
Brief Description: This patent discloses a method for creating voice response applications that can combine pre-recorded audio with dynamically generated text-to-speech prompts. It discusses the seamless integration of these different audio sources to provide a more fluid user experience. The system can play introductory or transitional phrases while fetching data or preparing a more complex, synthesized response.
Potential Anticipation of Claims: The '250 patent is relevant to claims 1, 3, 5, 6, 10, 12, and 16. It describes providing a message (which could be considered "transitional") while a request is processed, a core concept of the '484 patent. Its combination of pre-recorded and TTS-generated speech also bears on how the final response is delivered. However, like the '393 patent, the '250 patent does not focus on using transitional messages, and paralinguistic ones in particular, to mimic human-like pauses and mask processing latency for a more natural interaction.

3. U.S. Patent Application Publication 2005/0273338 A1

Full Citation: Eide, E. M., & Epstein, M. E. (2005). Generating paralinguistic phenomena via markup. U.S. Patent Application Publication No. 2005/0273338 A1.
Publication Date: December 8, 2005
Filing Date: June 4, 2004
Brief Description: This patent application is highly relevant because it directly addresses the generation of "paralinguistic phenomena" in synthesized speech. It describes a system that uses a markup language (similar to HTML or XML) to insert non-lexical sounds such as "uh," "um," coughs, and breaths into TTS output to make it sound more natural.
Potential Anticipation of Claims: This publication presents a strong case for anticipating several specific claims of the '484 patent, particularly claims 1, 3, 6, 11, 12, and 17, which explicitly mention "paralinguistic events" and specific examples such as "uh," "um," coughs, or breaths. The core idea of using these sounds to enhance naturalness is clearly described in the publication. The '484 patent builds on this by applying the idea specifically to the problem of masking latency, and that difference may provide a narrow path to novelty.
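
To make the markup approach described for this reference concrete, the following minimal sketch splits marked-up text into spoken segments and paralinguistic events. The tag syntax (`<para event="..."/>`), the function name, and the segment labels are hypothetical and are not taken from the publication; they only illustrate how inline markup can direct a synthesizer to insert non-lexical sounds.

```python
import re

# Hypothetical tag syntax, assumed for illustration only.
TAG = re.compile(r'<para\s+event="(\w+)"\s*/>')


def parse_marked_up_text(text):
    """Split text into ("speech", words) and ("event", name) segments, in order."""
    segments = []
    pos = 0
    for match in TAG.finditer(text):
        # Ordinary text before the tag is synthesized as speech.
        spoken = text[pos:match.start()].strip()
        if spoken:
            segments.append(("speech", spoken))
        # The tag itself names a paralinguistic event to render.
        segments.append(("event", match.group(1)))
        pos = match.end()
    # Any text after the last tag is also speech.
    tail = text[pos:].strip()
    if tail:
        segments.append(("speech", tail))
    return segments


if __name__ == "__main__":
    print(parse_marked_up_text('<para event="um"/> Let me check <para event="breath"/> your balance.'))
```

A synthesizer consuming these segments would render each "event" as the corresponding sound and each "speech" segment as ordinary TTS output. Note that nothing in this sketch ties the events to processing delay, which is the distinction the analysis above draws.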

4. U.S. Patent 6,546,097 B1

Full Citation: Pelletier, D. P. (2003). Automatic call distribution system with signal generator and method. U.S. Patent No. 6,546,097. U.S. Patent and Trademark Office.
Publication Date: April 8, 2003
Filing Date: May 13, 1999
Brief Description: This patent relates to automatic call distribution (ACD) systems, often used in call centers. It describes a system that can play pre-recorded announcements or music to a caller while they are on hold, waiting for an agent. The purpose of these "filler" signals is to reassure the caller that the connection is active and that their call is being handled.
Potential Anticipation of Claims: The '097 patent arguably discloses the broader elements of claims 1, 3, 6, and 12, which involve providing a message to a user during a processing delay. The on-hold music or announcements could be interpreted as a form of "transitional message." However, the '097 patent is directed to a call-queuing context, not a conversational dialog system, and the messages it describes (music, standard announcements) are not the human-like, paralinguistic fillers that are a key element of the '484 patent's approach to making latency feel natural.