Patent 12417756
Prior art
Earlier patents, publications, and products that may anticipate or render the claims unpatentable.
Analysis of Prior Art Cited in U.S. Patent 12,417,756
Based on the patent documentation for U.S. Patent 12,417,756, the following patent documents have been cited as prior art by the examiner. This analysis details the most relevant of these citations and their potential impact on the patent's claims under 35 U.S.C. § 102 (Anticipation).
A patent claim is anticipated if a single prior art reference discloses each and every element of the claim. The independent claims of the '756 patent (claims 1, 7, and 14) broadly cover a system, method, and non-transitory computer-readable medium for modifying a second user's speech to mimic a first user's accent while preserving the second user's natural voice characteristics, using machine learning models.
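The anticipation standard stated above is essentially a superset test: a single reference must disclose each and every element of the claim. As a purely illustrative sketch (the claim-element strings below are hypothetical paraphrases, not the actual claim language of the '756 patent), the logic can be expressed as:

```python
# Illustrative sketch of the 35 U.S.C. § 102 anticipation test: a claim
# is anticipated only if ONE reference discloses every claim element.
# Element labels are hypothetical paraphrases, not actual claim text.

CLAIM_1_ELEMENTS = {
    "capture second user's speech",
    "extract first user's accent features",
    "preserve second user's natural voice",
    "synthesize modified speech via ML model",
}

def anticipates(reference_disclosure: set) -> bool:
    """True only if the single reference discloses each and every element."""
    return CLAIM_1_ELEMENTS <= reference_disclosure

# A reference teaching accent mimicry but not voice preservation fails:
partial = {
    "capture second user's speech",
    "extract first user's accent features",
    "synthesize modified speech via ML model",
}
print(anticipates(partial))  # False
print(anticipates(partial | {"preserve second user's natural voice"}))  # True
```

Missing even one limitation defeats anticipation, which is why the per-reference analyses below focus on whether each document teaches the voice-preservation element in combination with the others.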
Key Prior Art and Potential Anticipation
The following references are identified as the most relevant to the claims of U.S. Patent 12,417,756.
1. U.S. Patent 9,129,602 B1: "Mimicking user speech patterns"
- Full Citation: US Patent 9,129,602 B1
- Assignee: Amazon Technologies, Inc.
- Filing Date: December 14, 2012
- Publication Date: September 8, 2015
- Brief Description: This patent describes a system that can generate synthesized speech that mimics the speech patterns of a user. It involves receiving a user's speech, analyzing it to determine speech patterns (like pitch, prosody, and accent), and then using these patterns to generate new speech in the user's voice. The goal is to make synthesized speech, such as from a digital assistant, sound more like the user it is interacting with.
- Potential Anticipation: This reference appears highly relevant. It discloses the core concept of analyzing a user's speech to extract patterns (analogous to the "first user's accent features" and "second user's natural voice") and using those to synthesize new speech.
- Claims 1, 7, and 14: The '602 patent's disclosure of analyzing user speech to determine patterns such as accent and prosody, and then using those patterns to generate new speech, could be argued to anticipate the claimed steps of extracting accent features and synthesizing a modified output. The key distinction is whether the '602 patent explicitly teaches combining a first user's accent with the preservation of a second user's distinct natural voice. If "mimicking user speech patterns" is interpreted broadly enough to cover this combination, the reference could anticipate these claims.
2. U.S. Patent Application Publication 2020/0193971 A1: "System and methods for accent and dialect modification"
- Full Citation: US 2020/0193971 A1
- Assignee: i2x GmbH
- Filing Date: December 13, 2018
- Publication Date: June 18, 2020
- Brief Description: This application describes a system for modifying a speaker's accent or dialect in real-time. It involves capturing audio, identifying accent or dialect features, and transforming the speech to a target accent or dialect while aiming to maintain the speaker's voice identity. It is particularly aimed at call centers to help agents be more clearly understood.
- Potential Anticipation: This reference is also highly relevant as it explicitly addresses real-time accent modification while preserving the speaker's voice.
- Claims 1, 7, and 14: The '971 application appears to describe all the key steps of the '756 patent's independent claims. It teaches capturing speech (second user), modifying its accent to a target (first user's accent), and importantly, "maintaining the speaker's voice identity" (preserving the natural voice). The distinction may lie in the specific techniques used for analysis and synthesis (e.g., the '756 patent's specific mention of MFCC or unique fingerprints), but the overall process seems to be disclosed.
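For context on the MFCC (mel-frequency cepstral coefficient) distinction noted above, the following is a generic textbook MFCC pipeline sketched in NumPy. The function name and parameter values are illustrative defaults, not taken from the '756 patent or the '971 application:

```python
import numpy as np

def mfcc(signal, sample_rate=16000, n_fft=512, n_mels=26, n_coeffs=13,
         frame_len=400, hop=160):
    """Textbook MFCC pipeline: frame -> window -> power spectrum ->
    mel filterbank -> log -> DCT-II. Generic sketch, not the patented method."""
    # Split the signal into overlapping frames and apply a Hamming window
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)

    # Power spectrum of each frame (rfft zero-pads frames to n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular mel-spaced filterbank
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    # Log mel energies, then DCT-II to decorrelate into cepstral coefficients
    log_mel = np.log(power @ fbank.T + 1e-10)
    k, n = np.arange(n_coeffs), np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(k, n + 0.5) / n_mels)
    return log_mel @ basis.T

# One second of noise -> one 13-coefficient vector per 10 ms frame
feats = mfcc(np.random.randn(16000))
print(feats.shape)  # (98, 13)
```

Features of this kind compactly summarize a speaker's spectral envelope, which is why a claim limitation tied to a specific feature representation (such as MFCCs or a voice "fingerprint") can distinguish over prior art that discloses the same high-level accent-conversion process.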
3. U.S. Patent 11,134,217 B1: "System that provides video conferencing with accent modification and multiple video overlaying"
- Full Citation: US Patent 11,134,217 B1
- Assignee: Surendra Goel
- Filing Date: January 11, 2021
- Publication Date: September 28, 2021
- Brief Description: This patent details a video conferencing system that includes real-time accent modification. A participant's speech is modified to an accent that is selected or deemed more understandable to other participants. The system is designed to improve clarity in multi-lingual or multi-accent conversations.
- Potential Anticipation: This reference is relevant due to its application of accent modification in a real-time communication context.
- Claims 1, 7, and 14: The '217 patent describes modifying a speaker's accent in real-time within a conferencing system. It inherently involves capturing a second user's speech and modifying it. The central question for anticipation would be whether it explicitly teaches the analysis of a first user's accent to use as the target and the specific step of preserving the second user's "natural voice" characteristics as defined in the '756 patent. If the modification is to a generic "standard" accent rather than mimicking a specific participant, it may not fully anticipate the claims.
4. U.S. Patent Application Publication 2024/0161764 A1: "Accent personalization for speakers and listeners"
- Full Citation: US 2024/0161764 A1
- Assignee: Dell Products L.P.
- Filing Date: November 9, 2022
- Publication Date: May 16, 2024
- Brief Description: This application covers a system that personalizes audio by modifying a speaker's accent to match a listener's preference or native accent. The system can detect a listener's accent and convert the speaker's audio to that accent in real-time to improve comprehension.
- Potential Anticipation: This is a very strong reference, published before the priority date of the '756 patent. It describes the core functionality of the invention.
- Claims 1, 7, and 14: The '764 application teaches modifying a speaker's accent to match a listener's accent, which directly corresponds to the '756 patent's concept of mimicking a "first user" (the listener) in the modified speech of a "second user" (the speaker). The disclosure of personalizing the accent for the listener strongly implies that the system analyzes the listener's speech characteristics, and it likely also discusses preserving the speaker's voice to avoid unnatural-sounding output. This reference has a high probability of anticipating the independent claims.
Other Notable Cited References
- US 2023/0267941 A1 ("Personalized Accent and/or Pace of Speaking Modulation for Audio/Video Streams"): Discloses modifying a speaker's accent and pace for a listener, which is conceptually similar.
- US 2024/0146560 A1 ("Participant Audio Stream Modification Within A Conference"): Describes modifying a participant's audio in a conference, which could include accent modification, to improve clarity.
Summary of Analysis
The prior art cited against U.S. Patent 12,417,756, particularly US 9,129,602 B1, US 2020/0193971 A1, and US 2024/0161764 A1, appears to be highly relevant. These documents describe the core concepts of analyzing speech to identify accent and voice characteristics, and using this analysis to modify a speaker's accent in real-time while preserving their voice identity.
The strength of an anticipation argument under 35 U.S.C. § 102 would depend on whether a single one of these references discloses every limitation of the independent claims. Based on the descriptions, the '971 and '764 applications seem to come closest to disclosing the entire claimed process. This body of prior art likely forms the basis for the pending Post-Grant Review (PGR2026-00033) filed by Krisp Technologies, Inc., which challenges the validity of the patent's claims.
Generated 4/30/2026, 4:31:53 PM