Patent 12417756
Prior art
Earlier patents, publications, and products that may anticipate or render the claims unpatentable.
Analysis of Prior Art Cited in U.S. Patent 12,417,756
Based on the patent documentation for U.S. Patent 12,417,756, the following patent documents have been cited as prior art by the examiner. This analysis details the most relevant of these citations and their potential impact on the patent's claims under 35 U.S.C. § 102 (Anticipation).
A patent claim is anticipated if a single prior art reference discloses each and every element of the claim. The independent claims of the '756 patent (claims 1, 7, and 14) broadly cover a system, method, and non-transitory computer-readable medium for modifying a second user's speech to mimic a first user's accent while preserving the second user's natural voice characteristics, using machine learning models.
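The anticipation standard stated above is essentially a superset test: a single reference must disclose each and every element of the claim. As a purely illustrative sketch (the claim-element strings below are hypothetical paraphrases, not the actual claim language of the '756 patent), the logic can be expressed as:

```python
# Illustrative sketch of the 35 U.S.C. § 102 anticipation test: a claim
# is anticipated only if ONE reference discloses every claim element.
# Element labels are hypothetical paraphrases, not actual claim text.

CLAIM_1_ELEMENTS = {
    "capture second user's speech",
    "extract first user's accent features",
    "preserve second user's natural voice",
    "synthesize modified speech via ML model",
}

def anticipates(reference_disclosure: set) -> bool:
    """True only if the single reference discloses each and every element."""
    return CLAIM_1_ELEMENTS <= reference_disclosure

# A reference teaching accent mimicry but not voice preservation fails:
partial = {
    "capture second user's speech",
    "extract first user's accent features",
    "synthesize modified speech via ML model",
}
print(anticipates(partial))  # False
print(anticipates(partial | {"preserve second user's natural voice"}))  # True
```

Missing even one limitation defeats anticipation, which is why the per-reference analyses below focus on whether each document teaches the voice-preservation element in combination with the others.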
Key Prior Art and Potential Anticipation
The following references are identified as the most relevant to the claims of U.S. Patent 12,417,756.
1. U.S. Patent 9,129,602 B1: "Mimicking user speech patterns"
- Full Citation: US Patent 9,129,602 B1
- Assignee: Amazon Technologies, Inc.
- Filing Date: December 14, 2012
- Publication Date: September 8, 2015
- Brief Description: This patent describes a system that can generate synthesized speech that mimics the speech patterns of a user. It involves receiving a user's speech, analyzing it to determine speech patterns (like pitch, prosody, and accent), and then using these patterns to generate new speech in the user's voice. The goal is to make synthesized speech, such as from a digital assistant, sound more like the user it is interacting with.
- Potential Anticipation: This reference appears highly relevant. It discloses the core concept of analyzing a user's speech to extract patterns (analogous to the "first user's accent features" and "second user's natural voice") and using those to synthesize new speech.
- Claims 1, 7, and 14: The '602 patent's disclosure of analyzing user speech to determine patterns such as accent and prosody, and then using those patterns to generate new speech, could be argued to anticipate the claimed steps of extracting accent features and synthesizing a modified output. The key distinction is whether the '602 patent explicitly teaches combining a first user's accent with the preservation of a second user's distinct natural voice. If "mimicking user speech patterns" is interpreted broadly enough to cover this combination, the reference could anticipate these claims.
2. U.S. Patent Application Publication 2020/0193971 A1: "System and methods for accent and dialect modification"
- Full Citation: US 2020/0193971 A1
- Assignee: i2x GmbH
- Filing Date: December 13, 2018
- Publication Date: June 18, 2020
- Brief Description: This application describes a system for modifying a speaker's accent or dialect in real-time. It involves capturing audio, identifying accent or dialect features, and transforming the speech to a target accent or dialect while aiming to maintain the speaker's voice identity. It is particularly aimed at call centers to help agents be more clearly understood.
- Potential Anticipation: This reference is also highly relevant as it explicitly addresses real-time accent modification while preserving the speaker's voice.
- Claims 1, 7, and 14: The '971 application appears to describe all the key steps of the '756 patent's independent claims. It teaches capturing speech (second user), modifying its accent to a target (first user's accent), and importantly, "maintaining the speaker's voice identity" (preserving the natural voice). The distinction may lie in the specific techniques used for analysis and synthesis (e.g., the '756 patent's specific mention of MFCC or unique fingerprints), but the overall process seems to be disclosed.
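For context on the MFCC (mel-frequency cepstral coefficient) distinction noted above, the following is a generic textbook MFCC pipeline sketched in NumPy. The function name and parameter values are illustrative defaults, not taken from the '756 patent or the '971 application:

```python
import numpy as np

def mfcc(signal, sample_rate=16000, n_fft=512, n_mels=26, n_coeffs=13,
         frame_len=400, hop=160):
    """Textbook MFCC pipeline: frame -> window -> power spectrum ->
    mel filterbank -> log -> DCT-II. Generic sketch, not the patented method."""
    # Split the signal into overlapping frames and apply a Hamming window
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)

    # Power spectrum of each frame (rfft zero-pads frames to n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular mel-spaced filterbank
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    # Log mel energies, then DCT-II to decorrelate into cepstral coefficients
    log_mel = np.log(power @ fbank.T + 1e-10)
    k, n = np.arange(n_coeffs), np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(k, n + 0.5) / n_mels)
    return log_mel @ basis.T

# One second of noise -> one 13-coefficient vector per 10 ms frame
feats = mfcc(np.random.randn(16000))
print(feats.shape)  # (98, 13)
```

Features of this kind compactly summarize a speaker's spectral envelope, which is why a claim limitation tied to a specific feature representation (such as MFCCs or a voice "fingerprint") can distinguish over prior art that discloses the same high-level accent-conversion process.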
3. U.S. Patent 11,134,217 B1: "System that provides video conferencing with accent modification and multiple video overlaying"
- Full Citation: US Patent 11,134,217 B1
- Assignee: Surendra Goel
- Filing Date: January 11, 2021
- Publication Date: September 28, 2021
- Brief Description: This patent details a video conferencing system that includes real-time accent modification. A participant's speech is modified to an accent that is selected or deemed more understandable to other participants. The system is designed to improve clarity in multi-lingual or multi-accent conversations.
- Potential Anticipation: This reference is relevant due to its application of accent modification in a real-time communication context.
- Claims 1, 7, and 14: The '217 patent describes modifying a speaker's accent in real-time within a conferencing system. It inherently involves capturing a second user's speech and modifying it. The central question for anticipation would be whether it explicitly teaches the analysis of a first user's accent to use as the target and the specific step of preserving the second user's "natural voice" characteristics as defined in the '756 patent. If the modification is to a generic "standard" accent rather than mimicking a specific participant, it may not fully anticipate the claims.
4. U.S. Patent Application Publication 2024/0161764 A1: "Accent personalization for speakers and listeners"
- Full Citation: US 2024/0161764 A1
- Assignee: Dell Products L.P.
- Filing Date: November 9, 2022
- Publication Date: May 16, 2024
- Brief Description: This application covers a system that personalizes audio by modifying a speaker's accent to match a listener's preference or native accent. The system can detect a listener's accent and convert the speaker's audio to that accent in real-time to improve comprehension.
- Potential Anticipation: This is a very strong reference, published before the priority date of the '756 patent. It describes the core functionality of the invention.
- Claims 1, 7, and 14: The '764 application teaches modifying a speaker's accent to match a listener's accent, which directly corresponds to the '756 patent's concept of mimicking a "first user" (the listener) in the modified speech of a "second user" (the speaker). The disclosure of personalizing the accent for the listener strongly implies that the system analyzes the listener's speech characteristics, and it likely also discusses preserving the speaker's voice to avoid unnatural-sounding output. This reference has a high probability of anticipating the independent claims.
Other Notable Cited References
- US 2023/0267941 A1 ("Personalized Accent and/or Pace of Speaking Modulation for Audio/Video Streams"): Discloses modifying a speaker's accent and pace for a listener, which is conceptually similar.
- US 2024/0146560 A1 ("Participant Audio Stream Modification Within A Conference"): Describes modifying a participant's audio in a conference, which could include accent modification, to improve clarity.
Summary of Analysis
The prior art cited against U.S. Patent 12,417,756, particularly US 9,129,602 B1, US 2020/0193971 A1, and US 2024/0161764 A1, appears to be highly relevant. These documents describe the core concepts of analyzing speech to identify accent and voice characteristics, and using this analysis to modify a speaker's accent in real-time while preserving their voice identity.
The strength of an anticipation argument under 35 U.S.C. § 102 would depend on whether a single one of these references discloses every limitation of the independent claims. Based on the descriptions, the '971 and '764 applications seem to come closest to disclosing the entire claimed process. This body of prior art likely forms the basis for the pending Post-Grant Review (PGR2026-00033) filed by Krisp Technologies, Inc., which challenges the validity of the patent's claims.
Generated 4/30/2026, 4:31:53 PM