Patent 11929073

Obviousness

Combinations of prior art that suggest the claimed invention would have been obvious under 35 U.S.C. § 103.

Obviousness Analysis of U.S. Patent 11,929,073

I. Introduction

This analysis evaluates the obviousness of the claims of U.S. Patent 11,929,073 ("the '073 patent") under 35 U.S.C. § 103. The '073 patent, titled "Hybrid arbitration system," describes a method for selecting a speech recognition result from either a local (on-device) processor or a cloud-based service. The core of the claimed invention is a two-stage arbitration process. First, it determines whether the local result is sufficiently reliable to be used immediately, without waiting for the cloud result (a "short-circuit" decision). If not, it then compares the local and cloud results to select the better one.
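
The two-stage process described above can be sketched as follows. This is a minimal illustration only, not the implementation claimed in the '073 patent: the `AsrResult` type, the single `confidence` field, the `fetch_cloud` callable, and the 0.9 threshold are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AsrResult:
    text: str
    confidence: float  # assumed to lie in [0, 1]


# Hypothetical cutoff; the source does not state any threshold value.
SHORT_CIRCUIT_THRESHOLD = 0.9


def arbitrate(local: AsrResult,
              fetch_cloud: Callable[[], Optional[AsrResult]]) -> AsrResult:
    """Pick a result using a two-stage arbitration sketch."""
    # Stage 1: short-circuit. If the local result looks reliable enough,
    # return it without ever requesting the cloud result.
    if local.confidence >= SHORT_CIRCUIT_THRESHOLD:
        return local

    # Stage 2: obtain the cloud result and compare the two.
    cloud = fetch_cloud()
    if cloud is None:  # cloud unavailable: fall back to the local result
        return local
    return cloud if cloud.confidence > local.confidence else local
```

For example, `arbitrate(AsrResult("call mom", 0.95), fetch_cloud)` returns the local result without invoking `fetch_cloud`, while a local confidence of 0.5 defers to the comparison in stage 2.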

A claimed invention is obvious if the differences between it and the prior art are such that the invention as a whole would have been obvious, before its effective filing date, to a person having ordinary skill in the art (a "POSITA"). This analysis argues that the claims of the '073 patent would have been obvious to a POSITA in view of a combination of existing prior art references.

II. Prior Art References

The following prior art references are cited in this analysis:

  • US 2013/0346078 A1 ("Gelfenbeyn"): This application discloses a hybrid speech recognition system that uses both a local, device-based recognizer and a more powerful server-based recognizer. Gelfenbeyn teaches that the local recognizer can provide a quick initial result, while the server provides a more accurate result later. It explicitly discusses using confidence scores to determine which recognition result to use.

  • US 10,186,262 B2 ("Coon"): This patent describes a system with multiple, simultaneous speech recognizers. Coon teaches that the system can select a result from one recognizer before another has finished, based on factors like speed and confidence. This addresses the core "short-circuit" concept of acting on an early result.

  • US 2018/0342236 A1 ("Kim"): This application focuses on evaluating the performance of different speech recognizers in a hybrid system. Kim describes using various features beyond just a single confidence score, including natural language understanding (NLU) outputs, to assess the quality of a recognition result.

III. Claim Analysis and Obviousness Argument

The independent claims (1, 14, and 15) of the '073 patent broadly cover a method, system, and software for making a preliminary decision on a local speech recognition result before a cloud result is available.

Claim 1: The Method

The key steps of claim 1 are:

  1. Soliciting both a first (local) and second (cloud) speech recognition result.
  2. Receiving the first (local) result and its associated features.
  3. Determining, prior to receiving the second (cloud) result, whether to select the first result or wait.
  4. Selecting the first result based on that determination.
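
The timing relationship among the four steps above, in particular that the determination happens before the cloud result is received, can be illustrated with a short `asyncio` sketch. The recognizers `fake_local` and `fake_cloud`, their delays and return values, and the 0.9 threshold are hypothetical stand-ins and do not come from the '073 patent or the cited references.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# A recognizer takes audio bytes and yields (transcript, confidence).
Recognizer = Callable[[bytes], Awaitable[Tuple[str, float]]]


async def fake_local(audio: bytes) -> Tuple[str, float]:
    await asyncio.sleep(0.01)  # stand-in for fast on-device recognition
    return ("turn on the lights", 0.95)


async def fake_cloud(audio: bytes) -> Tuple[str, float]:
    await asyncio.sleep(0.05)  # stand-in for a slower network round trip
    return ("turn on the lights", 0.99)


async def recognize(audio: bytes,
                    local: Recognizer = fake_local,
                    cloud: Recognizer = fake_cloud,
                    threshold: float = 0.9) -> Tuple[str, str]:
    # Step 1: solicit both results; the two tasks run concurrently.
    local_task = asyncio.create_task(local(audio))
    cloud_task = asyncio.create_task(cloud(audio))

    # Step 2: receive the local result (here its only feature is a
    # confidence score).
    text, conf = await local_task

    # Step 3: decide whether to select the local result or wait,
    # without having awaited the cloud result.
    if conf >= threshold:
        # Step 4: select the local result and drop the cloud request.
        cloud_task.cancel()
        return ("local", text)

    # Otherwise wait for the cloud result and compare.
    cloud_text, cloud_conf = await cloud_task
    if cloud_conf > conf:
        return ("cloud", cloud_text)
    return ("local", text)
```

Calling `asyncio.run(recognize(b""))` with the stand-in recognizers selects the local transcript, because the assumed local confidence of 0.95 clears the assumed threshold.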

A POSITA would find this method obvious by combining Gelfenbeyn and Coon.

  • Gelfenbeyn teaches the fundamental architecture of a hybrid system with local and server-based ASR running in parallel. It also introduces the use of confidence scores to arbitrate between the results.
  • Coon teaches the specific concept of not waiting for all recognizers to finish. Coon discloses selecting a result from a faster recognizer if its confidence is high enough, which is precisely the "short-circuit" logic claimed in the '073 patent.

Motivation to Combine: A POSITA would have been motivated to combine the teachings of Gelfenbeyn and Coon to reduce latency, because responsiveness is critical to the user experience of a voice-interactive system. Gelfenbeyn provides the hybrid architecture, and Coon provides a known method for speeding up the decision-making process within such an architecture. The nature of the problem to be solved, balancing speed and accuracy in speech recognition, would naturally lead a skilled artisan to implement Coon's early-selection strategy within Gelfenbeyn's hybrid framework. The combination would yield the predictable result of a faster response whenever the local recognizer is highly confident.

Furthermore, the use of a "plurality of features" as recited in the claim would be an obvious extension in light of Kim. Kim teaches that to get a better assessment of a recognition result's quality, it is beneficial to look beyond a simple confidence score and include other data, such as NLU-derived features (e.g., recognized domain, intent, entities). A POSITA, seeking to improve the reliability of the "short-circuit" decision taught by the combination of Gelfenbeyn and Coon, would have found it obvious to incorporate the more sophisticated feature analysis described by Kim. This would simply be the application of a known technique (using more detailed features for quality assessment) to improve a known system (hybrid ASR arbitration).
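
To make the idea of a decision based on a "plurality of features" concrete, the sketch below combines an ASR confidence score with NLU-derived features in a simple logistic score. It is an assumption-laden illustration, not the model disclosed in Kim or claimed in the '073 patent: the feature names, the weights, the bias, and the 0.5 cutoff are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LocalFeatures:
    asr_confidence: float      # recognizer confidence, assumed in [0, 1]
    nlu_intent_score: float    # NLU intent confidence, assumed in [0, 1]
    domain_is_on_device: bool  # recognized domain can be handled locally
    entity_count: int          # number of entities the NLU extracted


# Hypothetical hand-set weights; a real system would likely learn these.
WEIGHTS = {
    "asr_confidence": 4.0,
    "nlu_intent_score": 3.0,
    "domain_is_on_device": 1.5,
    "entity_count": -0.5,
}
BIAS = -5.0


def short_circuit_probability(f: LocalFeatures) -> float:
    """Logistic score: estimated chance the local result is good enough."""
    z = (BIAS
         + WEIGHTS["asr_confidence"] * f.asr_confidence
         + WEIGHTS["nlu_intent_score"] * f.nlu_intent_score
         + WEIGHTS["domain_is_on_device"] * float(f.domain_is_on_device)
         + WEIGHTS["entity_count"] * f.entity_count)
    return 1.0 / (1.0 + math.exp(-z))


def should_short_circuit(f: LocalFeatures, threshold: float = 0.5) -> bool:
    """Use the local result immediately if the score clears the threshold."""
    return short_circuit_probability(f) >= threshold
```

With these illustrative weights, a confident on-device command such as `LocalFeatures(0.95, 0.9, True, 1)` short-circuits, while a low-confidence, entity-heavy query such as `LocalFeatures(0.5, 0.3, False, 3)` waits for the cloud result.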

Claim 14 (System) and Claim 15 (Software)

Independent claims 14 and 15 recite the system and software, respectively, for carrying out the method of claim 1. As the underlying method is obvious for the reasons stated above, the claims to the system and software that perform this method are also obvious. A POSITA would have been capable of programming processors and configuring hardware (inputs, outputs) to execute the combined teachings of Gelfenbeyn, Coon, and Kim. These claims do not add any non-obvious limitations beyond the process itself.

IV. Conclusion

The claims of U.S. Patent 11,929,073 are invalid as obvious under 35 U.S.C. § 103. The foundational concept of a hybrid on-device/cloud speech recognition system was well-established in the prior art, as shown by Gelfenbeyn. The strategy of making an early decision based on a preliminary, high-confidence result to reduce latency was explicitly taught by Coon. Finally, the use of a rich set of features, including NLU outputs, to improve the quality of this decision was taught by Kim. A person of ordinary skill in the art would have been motivated to combine these known elements to achieve the predictable goal of a more responsive and reliable voice interface. The combination represents a straightforward application of known techniques to solve a known problem, and therefore does not constitute a patentable invention.

Generated 5/8/2026, 10:00:36 PM