Patent 12125496
Prior art
Earlier patents, publications, and products that may anticipate or render the claims unpatentable.
A discrepancy regarding US patent 12125496 needs to be addressed first. The previously generated section stated that the patent could not be located. However, the patent text provided for this task, titled "US12125496B1 - Methods for neural network-based voice enhancement and systems thereof," confirms that the patent exists and supplies the relevant details. The analysis below therefore relies on that provided text as the authoritative source.
The USPTO offers search tools such as Patent Public Search, but the provided patent text for US12125496B1 already includes a "Citations" section listing the prior art identified by the examiner. This analysis therefore relies on the patents cited within US12125496B1 as the most relevant prior art.
US Patent 12125496B1 Details
- Title: Methods for neural network-based voice enhancement and systems thereof
- Publication Number: US12125496B1
- Assignee: Sanas Ai Inc
- Inventors: Shawn Zhang, Lukas Pfeifenberger, Jason Wu, Piotr Dura, David Braude, Bajibabu Bollepalli, Alvaro Escudero, Gokce Keskin, Ankita Jha, Maxim Serebryakov
- Filing Date: 2024-04-24
- Issue Date (Publication Date): 2024-10-22
- Abstract: The disclosed technology relates to methods, voice enhancement systems, and non-transitory computer readable media for real-time voice enhancement. In some examples, input audio data including foreground speech content, non-content elements, and speech characteristics is fragmented into input speech frames. The input speech frames are converted to low-dimensional representations of the input speech frames. One or more of the fragmentation or the conversion is based on an application of a first trained neural network to the input audio data. The low-dimensional representations of the input speech frames omit one or more of the non-content elements. A second trained neural network is applied to the low-dimensional representations of the input speech frames to generate target speech frames. The target speech frames are combined to generate output audio data. The output audio data further includes one or more portions of the foreground speech content and one or more of the speech characteristics.
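The processing flow in the abstract (fragment, convert to a low-dimensional representation, generate target frames, combine) can be sketched roughly as follows. This is a minimal illustration only, not the patented method: `toy_encoder` and `toy_decoder` are hypothetical stand-ins for the two trained neural networks (simple average pooling and value repetition), and the frame length and latent size are arbitrary assumptions.

```python
from typing import List

Frame = List[float]


def fragment(samples: List[float], frame_len: int) -> List[Frame]:
    """Split audio samples into fixed-length frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        frames.append(frame + [0.0] * (frame_len - len(frame)))
    return frames


def toy_encoder(frame: Frame, latent_dim: int) -> List[float]:
    """Hypothetical stand-in for the first trained network: average-pool the
    frame down to latent_dim values, which discards fine detail."""
    step = len(frame) // latent_dim
    return [sum(frame[i * step:(i + 1) * step]) / step for i in range(latent_dim)]


def toy_decoder(latent: List[float], frame_len: int) -> Frame:
    """Hypothetical stand-in for the second trained network: expand each
    latent value back into a run of samples to form a target frame."""
    step = frame_len // len(latent)
    return [value for value in latent for _ in range(step)]


def enhance(samples: List[float], frame_len: int = 8, latent_dim: int = 2) -> List[float]:
    """Run the four stages in order and return audio of the original length."""
    if frame_len % latent_dim != 0:
        raise ValueError("frame_len must be a multiple of latent_dim")
    frames = fragment(list(samples), frame_len)
    latents = [toy_encoder(frame, latent_dim) for frame in frames]
    targets = [toy_decoder(latent, frame_len) for latent in latents]
    combined = [sample for frame in targets for sample in frame]
    return combined[:len(samples)]  # drop the zero-padding added by fragment()
```

In a real system the pooling and repetition steps would be replaced by trained models, and the frames would typically overlap; the sketch only shows how the stages connect.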
Most Relevant Prior Art for US12125496B1
The following patent citations were identified as prior art by the examiner for US12125496B1:
-
- Full Citation: US11410684B1: Text-to-speech (TTS) processing with transfer of vocal characteristics, assigned to Amazon Technologies, Inc.
- Publication Date: 2022-08-09
- Priority Date: 2019-06-04
- Brief Description: This patent generally relates to text-to-speech (TTS) processing and, more specifically, to methods for transferring vocal characteristics. While not directly focused on enhancing degraded speech, it deals with manipulating and transferring vocal characteristics, such as voice identity, which is a speech characteristic that US12125496B1 aims to preserve and enhance.
- Potential Anticipation of Claims under 35 U.S.C. § 102: Anticipation requires a single reference to disclose every element of a claim, so overlap on one feature is not sufficient. This patent overlaps with claims 1, 11, and 16 only in handling or preserving "speech characteristics" (e.g., voice identity). Its primary focus on text-to-speech (TTS) rather than real-time voice enhancement from noisy input, and its apparent lack of the specific two-neural-network architecture for dimensionality reduction and non-content element omission, suggest that it does not anticipate the core invention. Confirming this would require examining the specific characteristic-transfer methods described in US11410684B1.
-
- Full Citation: US11482235B2: Speech enhancement method and system, assigned to Qnap Systems, Inc.
- Publication Date: 2022-10-25
- Priority Date: 2019-04-01
- Brief Description: This patent broadly covers a speech enhancement method and system. The title directly aligns with the subject matter of US12125496B1, indicating a focus on improving the quality and clarity of speech signals.
- Potential Anticipation of Claims under 35 U.S.C. § 102: Judging by its title, US11482235B2 is relevant prior art for claims 1, 11, and 16 of US12125496B1 at the level of the broad concepts: a "voice enhancement system," a "method for real-time voice enhancement," and a "non-transitory computer-readable medium comprising instructions... to enhance speech." Covering the same field does not by itself establish anticipation, however. Without reviewing the detailed claims and description of US11482235B2, it cannot be determined whether it discloses the key distinctions of US12125496B1: the specific two-neural-network architecture, the conversion to low-dimensional representations that omit non-content elements, and the dynamic generation of target speech frames.
-
- Full Citation: US11705147B2: Mixed adaptive and fixed coefficient neural networks for speech enhancement, assigned to Qualcomm Incorporated.
- Publication Date: 2023-07-18
- Priority Date: 2020-04-29
- Brief Description: This patent describes the use of "mixed adaptive and fixed coefficient neural networks for speech enhancement." This directly addresses the application of neural networks for speech enhancement, a core aspect of US12125496B1.
- Potential Anticipation of Claims under 35 U.S.C. § 102: US11705147B2 is highly relevant because it explicitly applies neural networks to speech enhancement, the foundational technology behind claims 1, 11, and 16 of US12125496B1. That overlap would matter for anticipation only if a claim were broad enough to cover neural-network speech enhancement in general. The "mixed adaptive and fixed coefficient" design may be a different architecture from US12125496B1's two-network system, which reduces dimensionality and reconstructs speech while preserving speech characteristics and omitting non-content elements. A detailed analysis would need to establish whether US11705147B2 explicitly or inherently discloses the two-stage neural-network process with a low-dimensional representation that selectively omits non-content elements.
-
- Full Citation: US11868883B1: Intelligent control with hierarchical stacked neural networks, assigned to Michael Lamport Commons.
- Publication Date: 2024-01-09
- Priority Date: 2010-10-26
- Brief Description: This patent focuses on "intelligent control with hierarchical stacked neural networks." This describes a general neural network architecture, specifically hierarchical and stacked networks, within the context of intelligent control.
- Potential Anticipation of Claims under 35 U.S.C. § 102: This patent may be relevant for its general disclosure of "hierarchical stacked neural networks." The detailed description of US12125496B1 mentions that the low-dimensional representation "may be achieved by using a hierarchical feature extraction network," although this is not explicitly recited in the independent claims. If a claim of US12125496B1 broadly covered any hierarchical neural network, US11868883B1 could be anticipatory. As written, however, the claims define a specific application of two neural networks to voice enhancement, with noise reduction through a low-dimensional representation, which is distinct from the general "intelligent control" context of US11868883B1. It is therefore less likely to anticipate the claimed voice enhancement method and system than the patents more directly related to speech processing.
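The § 102 assessments above all turn on the same test: a reference anticipates a claim only if it discloses every element of that claim. A minimal sketch of that all-elements check is shown below. It is illustrative only: `CLAIM_ELEMENTS` paraphrases the abstract rather than quoting the actual claim language, the `anticipates` helper is hypothetical, and any chart values passed to it are placeholders for findings that would come from reading each reference in full.

```python
from typing import Dict, Optional

# Paraphrased from the abstract for illustration; a real claim chart would
# quote the exact limitations of claims 1, 11, and 16.
CLAIM_ELEMENTS = [
    "fragment input audio into input speech frames",
    "convert frames to low-dimensional representations omitting non-content elements",
    "apply a first trained neural network in fragmentation or conversion",
    "apply a second trained neural network to generate target speech frames",
    "combine target speech frames into output audio",
]


def anticipates(chart: Dict[str, Optional[bool]]) -> Optional[bool]:
    """Apply the all-elements rule to one reference.

    chart maps each claim element to True (disclosed), False (not disclosed),
    or None (not yet reviewed). Elements missing from the chart count as None.
    Returns False if any element is absent, None if the result is still
    undetermined, and True only if every element is disclosed.
    """
    statuses = [chart.get(element) for element in CLAIM_ELEMENTS]
    if any(status is False for status in statuses):
        return False
    if any(status is None for status in statuses):
        return None
    return True
```

A single `False` entry is enough to rule out anticipation, which is why a reference that merely shares the field or one feature falls short. The `None` result reflects the cases above where the reference has not yet been reviewed in detail.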
Generated 5/29/2026, 5:55:46 PM