Obviousness — US Patent 10297249

The current date is April 26, 2026.

Obviousness Analysis of US Patent 10,297,249 under 35 U.S.C. § 103

To establish obviousness under 35 U.S.C. § 103, it must be shown that the claimed invention as a whole would have been obvious to a person having ordinary skill in the art (POSITA) at the time of the invention, in view of the prior art. Where the showing rests on a combination of references, it also requires a motivation to combine those references and a reasonable expectation of success.

The present analysis identifies potential combinations of prior art references cited within US Patent 10,297,249 itself, as these references are explicitly acknowledged by the inventors as relevant to the field.

Independent Claim 1 (Method Claim):

Claim 1 describes a method by which a computer system provides a cooperative conversational voice user interface. Key elements include:

Receiving a human utterance.
Generating one or more preliminary interpretations of the utterance.
Providing the preliminary interpretations to a conversational speech engine.
Generating one or more hypotheses of a user's intent, leveraging short-term and long-term shared knowledge.
Ranking hypotheses by certainty.
Generating an adaptive conversational response based on the ranked hypotheses.
Learning from incorrect interpretations so that the same misinterpretation is not repeated for an identical utterance.
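
The claim elements listed above can be read as a processing pipeline. The following is a minimal illustrative sketch of that reading only; it is not taken from US 10,297,249 or from any cited reference. The class and method names (ConversationalEngine, hypotheses, respond, mark_incorrect), the numeric certainty scores, the 0.1 boost, and the 0.6 confirmation threshold are all hypothetical, and the preliminary interpretations are assumed to arrive as (intent, score) pairs from a separate recognizer.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    intent: str
    certainty: float  # hypothetical score in [0, 1]


@dataclass
class ConversationalEngine:
    # Hypothetical stand-ins for short-term and long-term shared knowledge.
    short_term: list = field(default_factory=list)  # utterances in this session
    long_term: dict = field(default_factory=dict)   # intent -> prior use count
    rejected: dict = field(default_factory=dict)    # utterance -> rejected intents

    def hypotheses(self, utterance, preliminary):
        """Turn preliminary interpretations into hypotheses ranked by certainty."""
        blocked = self.rejected.get(utterance, set())
        scored = []
        for intent, base in preliminary:
            if intent in blocked:
                continue  # skip a reading already marked wrong for this utterance
            boost = 0.1 * self.long_term.get(intent, 0)  # assumed weighting
            scored.append(Hypothesis(intent, min(1.0, base + boost)))
        return sorted(scored, key=lambda h: h.certainty, reverse=True)

    def respond(self, utterance, preliminary, confirm_below=0.6):
        """Adapt the response to how certain the top-ranked hypothesis is."""
        ranked = self.hypotheses(utterance, preliminary)
        self.short_term.append(utterance)
        if not ranked:
            return None, "Sorry, could you rephrase that?"
        best = ranked[0]
        if best.certainty < confirm_below:
            return best, f"Did you mean: {best.intent}?"
        return best, f"OK: {best.intent}"

    def mark_incorrect(self, utterance, intent):
        """Record a wrong interpretation so it is not offered again."""
        self.rejected.setdefault(utterance, set()).add(intent)
```

In this sketch, calling mark_incorrect for an utterance removes that intent from later rankings of the identical utterance, which is one simple way to model the final claim element. How the patent or the cited references actually implement that step is not specified here.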

Combination 1: U.S. Pat. No. 7,634,409 (Dynamic Speech Sharpening) in combination with U.S. Pat. No. 7,640,160 (Systems and Methods for Responding to Natural Language Speech Utterance) and U.S. Pat. No. 7,949,529 (Mobile Systems and Methods of Supporting Natural Language Human-Machine Interactions).

U.S. Pat. No. 7,634,409 ("Dynamic Speech Sharpening"): This patent is cited in US 10,297,249 for its description of a speech recognition engine (ASR 110) that interprets utterances using phonetic dictation to recognize a phoneme stream. This addresses the "receiving a human utterance" and "generating one or more preliminary interpretations of the utterance" elements of Claim 1.
U.S. Pat. No. 7,640,160 ("Systems and Methods for Responding to Natural Language Speech Utterance") and U.S. Pat. No. 7,949,529 ("Mobile Systems and Methods of Supporting Natural Language Human-Machine Interactions"): These patents are incorporated by reference into US 10,297,249, which describes them as determining one or more contexts for a request by having context domain agents compete to determine the most appropriate domain for a given utterance. This maps to the "providing these to a conversational speech engine" and "generating one or more hypotheses of a user's intent" elements, particularly with respect to disambiguation and the use of context. The "shared knowledge" element (both short-term and long-term) is fundamental to the cooperative conversational model of US 10,297,249, which builds upon context determination. US 10,297,249 also describes its context determination process as inferring intended operations from previous utterances and as removing incorrect interpretations to prevent their repetition, and that process in turn builds upon these prior art references.
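
The idea of context domain agents competing for an utterance can be illustrated with a small sketch. It is an assumption-laden toy, not the method of U.S. Pat. No. 7,640,160 or U.S. Pat. No. 7,949,529: the function name pick_domain, the keyword-overlap scoring, and the 0.5 weight given to earlier utterances are all hypothetical, chosen only to show one agent winning on the current utterance and another winning through carried-over context.

```python
def pick_domain(utterance, agents, history=()):
    """Return (domain, score) for the highest-scoring domain agent.

    agents maps a domain name to a set of keywords (hypothetical scoring data).
    history holds earlier utterances, counted at half weight (assumed value).
    """
    words = set(utterance.lower().split())
    recent = set(" ".join(history).lower().split())
    scores = {}
    for domain, keywords in agents.items():
        direct = len(words & keywords)           # evidence in the current utterance
        carried = 0.5 * len(recent & keywords)   # evidence from previous utterances
        scores[domain] = direct + carried
    best = max(scores, key=scores.get)           # ties resolve to the first agent
    return best, scores[best]
```

With agents for "music" and "navigation", an utterance such as "play that song" would be claimed by the music agent directly, while a follow-up with no keywords of its own would fall to whichever domain the earlier utterances favored.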

Motivation for Combination and Expectation of Success:

A POSITA at the time of the invention (October 16, 2006, based on the priority date) would have been motivated to combine the speech recognition capabilities of U.S. Pat. No. 7,634,409 with the natural language understanding and context determination methods of U.S. Pat. No. 7,640,160 and U.S. Pat. No. 7,949,529. The motivation would be to create a more robust, "cooperative" voice user interface that moves beyond simple command-and-control systems. U.S. Pat. No. 7,634,409 provides the foundational speech-to-text conversion. The latter two patents provide mechanisms for understanding the meaning of that text within a conversation, resolving ambiguities, and leveraging past interactions, all of which are crucial for a "cooperative" interface. US 10,297,249 incorporates these patents by reference and describes in detail how its features (such as context determination) build upon them, which suggests both a clear motivation and a high expectation of success for the combination. The "learning from incorrect interpretations" feature, for example, is presented as an advancement over existing systems that repeatedly make the same errors, implying that this learning mechanism improves upon the context determination process of the earlier patents.

Independent Claim 13 (System Claim):

Claim 13 describes a computer system with a processor and memory configured to perform the method of Claim 1. This includes modules for building shared knowledge, generating and ranking hypotheses, and creating adaptive responses, with the ability to correct conversational course and not repeat incorrect interpretations.

Combination 2: A computer system implementing U.S. Pat. No. 7,634,409 in combination with U.S. Pat. No. 7,640,160 and U.S. Pat. No. 7,949,529.

The system aspects of Claim 13 are directly tied to the method of Claim 1. Therefore, to the extent the combination above renders Claim 1 obvious, a computer system configured to implement that combination would render Claim 13 obvious for the same reasons. The prior art references describe the underlying mechanisms (speech recognition engine, context domain agents) that would reside within such a computer system, or be accessible by it (e.g., databases 130). US 10,297,249 itself states that the speech recognition engine, conversational speech engine, and databases may reside locally or remotely, or in a hybrid model.

Motivation for Combination and Expectation of Success:

A POSITA would recognize that implementing the cooperative conversational methods requires a computer system. The "modules" for building shared knowledge, generating hypotheses, and creating adaptive responses are simply the software and hardware implementations of the methods described in the combined prior art. The motivation would be to provide a functional system that embodies the conversational voice interface advancements discussed above. Because the prior art already describes the functional components, building a system that integrates them would be a straightforward engineering task for a POSITA. The patent itself describes the system architecture as "exemplary" (FIG. 1) and the conversational speech engine (FIG. 2) as including "modules," implying that these are known components arranged to perform the described functions.