Invalidity dossier

US 12131745

System and method for automatic alignment of phonetic content for real-time accent conversion

Current assignee: Sanas Ai Inc

Added 5/12/2026, 11:37:51 PM

Industry: Software Technology & Computing Systems (T)

⚖️ Active PTAB challenge: 1 pending proceeding against this patent

1 active proceeding at the USPTO Patent Trial and Appeal Board (counted across Inter Partes Review, Post-Grant Review, and Covered Business Method proceedings).

Patent summary

Title, assignee, inventors, filing/issue dates, abstract, and a plain-language overview of the claims.

Summary of US 12131745:

Title: System and method for automatic alignment of phonetic content for real-time accent conversion
Assignee: Sanas Ai Inc.
Inventors: Lukas Pfeifenberger, Shawn Zhang
Filing Date: 2024-06-26
Issue Date: 2024-10-29
Abstract: The disclosed technology relates to methods, accent conversion systems, and non-transitory computer readable media for real-time accent conversion. In some examples, a set of phonetic embedding vectors is obtained for phonetic content representing a source accent and obtained from input audio data. A trained machine learning model is applied to the set of phonetic embedding vectors to generate a set of transformed phonetic embedding vectors corresponding to phonetic characteristics of speech data in a target accent. An alignment is determined by maximizing a cosine distance between the set of phonetic embedding vectors and the set of transformed phonetic embedding vectors. The speech data is then aligned to the phonetic content based on the determined alignment to generate output audio data representing the target accent. The disclosed technology transforms phonetic characteristics of a source accent to match the target accent more closely for efficient and seamless accent conversion in real-time applications.

Plain-language overview of independent claims:

Claim 1 (System): This claim describes an accent conversion system comprising hardware (audio interface, memory, processors) configured to perform several steps. The system first receives audio data, then generates numerical representations of the original speech sounds (first phonetic embedding vectors) representing a source accent. A trained neural network transforms these into new numerical representations (second phonetic embedding vectors) for a target accent. The system then determines a differentiable alignment by maximizing the cosine distance between the original and transformed phonetic embedding vectors (in conventional usage, cosine distance is one minus cosine similarity, so it measures dissimilarity rather than similarity). Finally, it aligns the speech data to the phonetic content based on this alignment to produce output audio data in the target accent.
Claim 8 (Method): This claim outlines a method for automatic alignment and real-time accent conversion, implemented by an accent conversion system. It involves obtaining phonetic embedding vectors from input audio data (representing a source accent). A trained machine learning model then generates transformed phonetic embedding vectors corresponding to a target accent. An alignment is determined by maximizing the cosine distance between the original and transformed phonetic embedding vectors. Based on this alignment, the speech data is aligned to generate output audio data representing the target accent.
Claim 15 (Non-transitory computer-readable medium): This claim covers a non-transitory computer-readable medium (e.g., a hard drive) storing instructions. When executed by at least one processor, these instructions cause the processor to perform steps similar to the method of Claim 8: obtaining first phonetic embedding vectors for a source accent from input audio, applying a trained neural network to generate second phonetic embedding vectors for a target accent, determining an alignment by maximizing the cosine distance between the first and second vectors, and aligning the speech data based on this alignment to generate output audio in the target accent.
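To make the alignment step in the three claims more concrete, the following is a minimal Python sketch of comparing two sets of embedding vectors by cosine similarity and picking, for each transformed vector, the best-matching original vector. It is an illustration only and not the patented method: the function names, the toy vectors, and the hard argmax selection are all assumptions made for this example. The sketch maximizes cosine similarity (equivalently, minimizes cosine distance as conventionally defined), which is one reading of the claim language; a differentiable alignment, as mentioned for Claim 1, would likely use a soft weighting instead of the argmax shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch only
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Conventional cosine distance: one minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def hard_alignment(source_vectors, target_vectors):
    """For each target vector, return the index of the most similar source vector.

    Hypothetical helper: a hard (non-differentiable) argmax over cosine similarity.
    """
    alignment = []
    for t in target_vectors:
        scores = [cosine_similarity(s, t) for s in source_vectors]
        alignment.append(max(range(len(scores)), key=scores.__getitem__))
    return alignment


# Toy 2-dimensional "phonetic embedding vectors" (made-up values).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # stands in for the source-accent set
target = [[0.1, 0.9], [0.9, 0.8], [2.0, 0.1]]   # stands in for the transformed set

alignment = hard_alignment(source, target)
print(alignment)  # [1, 2, 0]
```

Each entry of the printed list says which source vector a given transformed vector is matched to. Real embeddings would have many more dimensions and would come from trained models rather than hand-written lists.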

CAFC 2026 Dockets:
A search of the CAFC 2026 dockets for US 12131745 returned no results.

Generated 5/29/2026, 5:51:55 PM