Obviousness — US Patent 11087750

This document analyzes the obviousness of the independent claims of U.S. Patent 11,087,750 under 35 U.S.C. § 103, based on the provided prior art analysis.

This analysis evaluates whether the inventions claimed in U.S. Patent 11,087,750 would have been obvious to a "person having ordinary skill in the art" (PHOSITA) at the time the invention was made, in view of the teachings of existing prior art references, taken alone or in combination.

Claims 1 & 8: Trigger-less Voice Command Detection

These claims cover the broad method and apparatus for detecting a voice command "without requiring receipt of an explicit trigger."

Conclusion: These claims are likely rendered obvious by U.S. Patent 9,070,332 B2 (Microsoft '332).
Reasoning:
- The Microsoft '332 patent discloses the central feature of these claims. Its abstract states the system can identify a command within general speech "without an explicit user action to transition the electronic device to a command mode." An explicit user action to enter a command mode is a form of explicit trigger, so this statement teaches the limitation of detecting a command "without requiring receipt of an explicit trigger."
- Microsoft '332 further teaches monitoring an acoustic environment, receiving acoustic input (i.e., a stream of speech), and initiating a response by parsing the speech to determine whether it contains an actionable command. These teachings map to the remaining steps of claims 1 and 8.
- To the extent Microsoft '332 does not expressly describe a "mobile device," a PHOSITA would have found it obvious to apply its method to one, because mobile devices were a primary platform for voice-based user interfaces and command-and-control systems at the time of the invention. No secondary reference is therefore necessary; Microsoft '332 alone, read in light of the ordinary knowledge of a PHOSITA, appears to teach or suggest every claimed element.
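
The distinction at issue in these claims, between detection gated on an explicit trigger and detection over general speech, can be sketched as follows. This is a minimal, hypothetical illustration only: the command grammar, the wake phrase "hello device," and both function names are assumptions made for this sketch and are not taken from Microsoft '332 or from the '750 patent.

```python
import re
from typing import Optional, Tuple

# Hypothetical command grammar, invented purely for illustration.
COMMAND_PATTERNS = {
    "call": re.compile(r"\bcall (?P<arg>\w+)"),
    "play": re.compile(r"\bplay (?P<arg>[\w ]+)"),
}


def detect_without_trigger(utterance: str) -> Optional[Tuple[str, str]]:
    """Trigger-less detection: scan general speech for an actionable command."""
    text = utterance.lower()
    for name, pattern in COMMAND_PATTERNS.items():
        match = pattern.search(text)
        if match:
            return (name, match.group("arg").strip())
    return None


def detect_with_trigger(
    utterance: str, wake_phrase: str = "hello device"
) -> Optional[Tuple[str, str]]:
    """Trigger-gated baseline: ignore speech that does not begin with the wake phrase."""
    text = utterance.lower().strip()
    if not text.startswith(wake_phrase):
        return None
    # Only the speech following the wake phrase is examined for a command.
    return detect_without_trigger(text[len(wake_phrase):])
```

Under these assumptions, `detect_with_trigger("call Alice")` returns nothing because no wake phrase was spoken, while `detect_without_trigger("I think we should call Alice")` finds the command inside ordinary speech. A matcher this simple would also flag conversational uses of "call" or "play," which is why a practical trigger-less system needs a more discriminating parser.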

Claims 29 & 36: Two-Processor System for Low-Power Detection

These claims cover a method and apparatus for using a first, low-power processor for an initial analysis of acoustic input, followed by a second, higher-power processor for further evaluation if needed, all while the device is in a low-power state.

Conclusion: These claims are likely rendered obvious by the combination of U.S. Patent Application Publication 2012/0101828 A1 (Qualcomm '828) in view of Microsoft '332.
Reasoning:
- What Qualcomm '828 Teaches: Qualcomm '828 explicitly discloses the two-processor architecture for power-efficient voice processing. It teaches a dedicated, low-power hardware component (a "voice trigger circuit" or low-power DSP) that performs a first processing stage (listening for a trigger) while the main "applications processor" remains in a low-power state. Upon detection, the main processor is woken for a second stage of processing. This directly teaches the central limitations of claims 29 and 36: a low-power mode, a first processor for a first stage, and a second processor for a second stage.
- What is Missing from Qualcomm '828: The Qualcomm system is designed to detect an "explicit trigger" (a specific voice phrase).
- What Microsoft '332 Teaches: As established above, Microsoft '332 teaches the method of detecting a voice command without an explicit trigger.
- Motivation to Combine: A PHOSITA would have been motivated to combine the power-saving two-processor architecture of Qualcomm '828 with the more natural, trigger-less interaction method of Microsoft '332. A known problem in the art was balancing "always-on" listening against battery life. Qualcomm '828 offered a hardware solution for power management, and Microsoft '332 offered a software solution for a more seamless user experience. A PHOSITA seeking to build a competitive voice-enabled mobile device would have found it obvious to run the trigger-less detection of Microsoft '332 on the power-efficient architecture of Qualcomm '828, for example by having the low-power first stage screen the acoustic input before the main processor is woken to evaluate it for a command. This would have been a predictable combination of a known method with a known system to achieve the known goals of improved usability and extended battery life.
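
The staged arrangement discussed in this section can be sketched as below. This is a hypothetical model, not the implementation of either reference: the class names, the energy threshold, and the keyword check are assumptions, and the transcript is passed in directly as a stand-in for speech recognition that a real second stage would perform itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LowPowerStage:
    """Stage one: a cheap energy check standing in for the low-power processor."""
    energy_threshold: float = 0.1  # assumed value, for illustration only

    def looks_like_speech(self, frame: List[float]) -> bool:
        if not frame:
            return False
        energy = sum(s * s for s in frame) / len(frame)  # mean squared amplitude
        return energy >= self.energy_threshold


@dataclass
class MainStage:
    """Stage two: a costlier evaluation standing in for the main processor."""
    wake_count: int = 0

    def evaluate(self, transcript: str) -> Optional[str]:
        self.wake_count += 1  # each call models one wake-up from the low-power state
        words = transcript.lower().split()
        return "call" if "call" in words else None


def process(frame: List[float], transcript: str,
            stage_one: LowPowerStage, stage_two: MainStage) -> Optional[str]:
    """Always run stage one; wake stage two only when stage one passes."""
    if not stage_one.looks_like_speech(frame):
        return None  # the main processor stays asleep
    return stage_two.evaluate(transcript)
```

In this sketch a quiet frame never reaches `MainStage`, so `wake_count` stays at zero, while a loud frame wakes the second stage whether or not a command is ultimately found. The power saving comes entirely from how often stage one can reject input on its own.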

Claims 15 & 22: Low-Power Detection Using Contextual Cues

These claims cover a method and apparatus for detecting a voice command in a low-power mode by using a multi-stage process that is assisted by "at least one contextual cue" (e.g., motion, location, time).

Conclusion: These claims are likely rendered obvious by the combination of U.S. Patent Application Publication 2012/0330663 A1 (Apple '663) in view of Microsoft '332.
Reasoning:
- What Apple '663 Teaches: Apple '663 is highly relevant as it explicitly discloses using "contextual information" to aid in voice detection while in a low-power state. It specifically teaches using a proximity sensor to determine if a device is near the user's ear as a "contextual cue" to enable or disable the voice trigger functionality. This directly teaches the core elements of the claims: operating in a low-power mode, performing detection, and using "at least one contextual cue" to assist.
- What is Missing from Apple '663: The Apple system uses this context to decide when to listen for an explicit voice trigger (e.g., "Siri").
- What Microsoft '332 Teaches: Microsoft '332 again supplies the missing element: detecting a command without an explicit trigger.
- Motivation to Combine: A key challenge for a trigger-less system like that of Microsoft '332 is avoiding false positives, that is, incorrectly identifying background conversation as a command. A PHOSITA would have recognized this known problem. Apple '663 provides a solution: use contextual cues to make the listening process more selective. A PHOSITA would therefore have been motivated to incorporate contextual cues (as taught by Apple '663) into the trigger-less command detection system (as taught by Microsoft '332) to improve its accuracy and reliability. For instance, an accelerometer reading indicating that the device has just been picked up (a contextual cue) would be a logical signal to pay closer attention to the subsequent audio stream for a potential command. This combination would have been a commonsense engineering step to improve the performance of a known system.
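
One way contextual cues could make trigger-less detection more selective is to vary the confidence required before speech is treated as a command. The sketch below assumes that approach; the cue names, threshold values, and the choice to treat both cues as supporting evidence are hypothetical and are not drawn from Apple '663, Microsoft '332, or the '750 patent.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical contextual cues; the field names are illustrative only."""
    just_picked_up: bool = False  # e.g., inferred from an accelerometer
    near_ear: bool = False        # e.g., inferred from a proximity sensor


BASE_THRESHOLD = 0.8  # assumed confidence required when no cue is present
CUE_DISCOUNT = 0.2    # assumed reduction for each supporting cue
MIN_THRESHOLD = 0.5   # assumed floor, so that cues alone never accept speech


def required_confidence(ctx: Context) -> float:
    """Lower the acceptance threshold as supporting cues accumulate."""
    cues = int(ctx.just_picked_up) + int(ctx.near_ear)
    return max(MIN_THRESHOLD, BASE_THRESHOLD - CUE_DISCOUNT * cues)


def accept_command(score: float, ctx: Context) -> bool:
    """Treat speech as a command only if the detector score clears the threshold."""
    return score >= required_confidence(ctx)
```

With these assumed values, a detector score of 0.7 is rejected when no cue is present but accepted once the device has just been picked up. Background conversation that scores moderately is therefore ignored unless the context suggests the user is addressing the device.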