Patent 10237067

Obviousness

Combinations of prior art that suggest the claimed invention would have been obvious under 35 U.S.C. § 103.

Active provider: Google · gemini-2.5-flash

Obviousness Analysis of US Patent 10237067 under 35 U.S.C. § 103

This analysis identifies combinations of prior art references that would render the independent claims of US Patent 10237067 (Claims 1, 6, and 13) obvious to a person having ordinary skill in the art (PHOSITA) as of the patent's effective filing date. That date is September 30, 2002, for subject matter supported by the earliest provisional application, or the actual filing date of November 28, 2017, for any new matter. The prior art references considered are primarily the earlier patents in the same family, as identified in the "Prior art" section of the provided information.

General Motivation for Combination

The patents in the US10237067 family consistently address problems in the storage and search retrieval of digital media files, including the difficulty of manual indexing, file transfer, and maintaining searchable tags. [cite: Patent text] A PHOSITA in the field of multimedia data management and security would be motivated to combine known techniques and existing systems to improve the efficiency, automation, and user experience of such systems. Later patents in the family build on earlier ones by adding or refining features such as automated tagging, voice assistance, and object recognition. This incremental development strongly suggests a continuous and obvious progression in the art, driven by evident design needs and by market pressure for enhanced functionality with predictable results.

Combination 1: US Pat. No. 9,832,017 (Primary Reference)

  • Reference: US Pat. No. 9,832,017 ("Apparatus for Personal Voice Assistant, Location Services, Multi-Media Capture, Transmission, Speech to Text Conversion, Photo/Video Image/Object Recognition, Creation of Searchable Metatag(s)/Contextual Tag(s), Storage and Search Retrieval") [cite: Patent text]
  • Relevance: US9832017 is the direct parent of US10237067 [cite: Patent text] and has a nearly identical title, which suggests that it discloses the core functionalities claimed in US10237067. Its title explicitly mentions "voice assistant," "location services," "multi-media capture," "transmission," "speech to text conversion," "photo/video image/object recognition," and "creation of searchable metatag(s)/contextual tag(s), storage and search retrieval." [cite: Patent text] These elements correspond to virtually all aspects of Claims 1, 6, and 13 of US10237067. The "Prior art" section stated that US9832017 is "extremely likely to anticipate all or almost all aspects of Claims 1, 6, and 13 of US10237067."

Analysis for Claims 1, 6, and 13:

  • Claim 1 (Remote Processing): This claim describes a system that captures an image and audio, combines them with location and time information, and encrypts and transmits the result. At a remote location, the system decrypts the data, converts the audio to text tags, creates image recognition tags, associates the tags with the digital image, and stores them. Each of these elements corresponds to a function named in the title of US9832017, including "multi-media capture," "location services," "transmission," "speech to text conversion," "photo/video image/object recognition," and "creation of searchable metatag(s)/contextual tag(s), storage and search retrieval." [cite: Patent text]

  • Claim 6 and Claim 13 (Local Processing and Storage): These claims describe a capture device with internal storage where audio-to-text conversion and image recognition tagging are performed locally on the device, and the tags are stored in internal storage. While Claim 1 focuses on remote processing, the specification of US10237067 itself describes this local tagging as a "variation" where "placing the ability to enter tags on the data capture device itself" can be done via "voice recognition software." [cite: Patent text] A PHOSITA would recognize that performing such processing locally on the capture device, rather than solely at a remote server, is an obvious design choice to reduce latency, enable offline functionality, and provide immediate user feedback. This "variation" as described in US10237067 is thus either disclosed within the scope of US9832017 or represents an obvious implementation choice known to a PHOSITA.

  • Motivation to combine/modify US9832017: Given the extensive overlap, US10237067's claims appear to be largely obvious over US9832017 alone. A PHOSITA would be motivated to implement the functionalities described in Claims 1, 6, and 13 as straightforward applications or minor architectural variations (e.g., local vs. remote processing) of the system broadly taught by US9832017 to achieve predictable improvements in media management and searchability.
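The difference between the remote arrangement of Claim 1 and the local arrangement of Claims 6 and 13 can be illustrated with a minimal Python sketch. It is not taken from any of the patents. All names (`Capture`, `StoredRecord`, `make_tags`, `remote_path`, `local_path`) are hypothetical, the speech-to-text and image recognition steps are passed in as placeholder callables, and `xor_cipher` is a toy stand-in for encryption that offers no real security. The sketch only shows that the same tagging step can run on either side of the transmission boundary.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import cycle
from typing import Callable, List, Tuple


@dataclass
class Capture:
    image: bytes
    audio: bytes
    location: Tuple[float, float]  # (latitude, longitude)
    captured_at: datetime


@dataclass
class StoredRecord:
    image: bytes
    location: Tuple[float, float]
    captured_at: datetime
    tags: List[str] = field(default_factory=list)


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric stand-in for encryption. NOT secure; illustration only."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def make_tags(capture: Capture,
              speech_to_text: Callable[[bytes], str],
              recognize_objects: Callable[[bytes], List[str]]) -> List[str]:
    """Merge words from the transcribed audio with image recognition labels."""
    words = speech_to_text(capture.audio).lower().split()
    labels = recognize_objects(capture.image)
    return sorted(set(words) | set(labels))


def remote_path(capture, key, speech_to_text, recognize_objects, remote_store):
    """Claim 1 style: encrypt on the device, then decrypt, tag, and store remotely."""
    sent_image = xor_cipher(capture.image, key)   # device side
    sent_audio = xor_cipher(capture.audio, key)
    received = Capture(xor_cipher(sent_image, key),   # remote side
                       xor_cipher(sent_audio, key),
                       capture.location, capture.captured_at)
    tags = make_tags(received, speech_to_text, recognize_objects)
    record = StoredRecord(received.image, received.location,
                          received.captured_at, tags)
    remote_store.append(record)
    return record


def local_path(capture, speech_to_text, recognize_objects, internal_storage):
    """Claims 6 and 13 style: tag and store on the capture device itself."""
    tags = make_tags(capture, speech_to_text, recognize_objects)
    record = StoredRecord(capture.image, capture.location,
                          capture.captured_at, tags)
    internal_storage.append(record)
    return record
```

In this sketch both paths call the same `make_tags` function, so they produce identical tags for the same capture; only the place where tagging and storage happen differs.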

Combination 2: US Pat. No. 6,996,251 and US Pat. No. 7,778,438

  • Primary Reference: US Pat. No. 6,996,251 ("Forensic communication apparatus and method," filed Sep. 29, 2003, issued Feb. 7, 2006). [cite: Patent text]
    • Discloses: A system for capturing information, enhancing it with embedded data (metadata, including time, date, location), storing it, and ensuring authenticity and security. It involves transmission and remote storage. [cite: Patent text] This reference provides the foundation for secure multimedia capture, metadata association, transmission, and storage.

  • Secondary Reference: US Pat. No. 7,778,438 ("Method for Multi-Media Recognition, Data Conversion, Creation of Metatags, Storage and Search Retrieval," filed Jan. 8, 2007, issued Aug. 17, 2010). [cite: Patent text]
    • Discloses: Methods for "multi-media recognition," "data conversion," and "creation of metatags." [cite: Patent text] Crucially, it specifically states that "For audio files, this may include a speech-to-text algorithm; for still or moving images, it may include image recognition and identification." [cite: Patent text] This reference explicitly teaches the generation of searchable tags via speech-to-text and image recognition.

  • Motivation to Combine: A PHOSITA, seeking to enhance the utility and search capabilities of the secure multimedia capture and storage system described in US6996251, would be motivated to incorporate automated methods for creating descriptive metadata. US7778438 directly provides such methods, including speech-to-text for audio and image recognition for visual media, to generate metatags. The background of US10237067 itself highlights the problem of manually indexing large numbers of media files [cite: Patent text]. Combining the robust capture and storage system of US6996251 with the automated metatag generation techniques of US7778438 would be an obvious solution to address this known problem, leading to a predictable system for capturing, securely storing, and efficiently retrieving multimedia based on automatically generated, searchable tags. This combination directly covers the elements of Claims 1, 6, and 13 related to multi-media capture, location/time tagging, speech-to-text conversion, image recognition, and the creation and storage of searchable context tags.

Combination 3: US Pat. No. 6,996,251 and US Pat. No. 8,983,119

  • Primary Reference: US Pat. No. 6,996,251 (as above, for core capture, storage, metadata, transmission, security) [cite: Patent text].

  • Secondary Reference: US Pat. No. 8,983,119 ("Method for Voice Command Activation, Multi-Media Capture, Transmission, Speech Conversion, Metatags Creation, Storage and Search Retrieval," filed Aug. 13, 2013, issued Mar. 17, 2015). [cite: Patent text]
    • Discloses: "Voice command activation" and "speech conversion" (i.e., speech-to-text) specifically for the purpose of "Metatags Creation." [cite: Patent text]

  • Motivation to Combine: As in Combination 2, a PHOSITA would want to improve the searchability and automation of the system from US6996251. US8983119 explicitly teaches the use of speech conversion for generating metatags, which directly addresses the audio-to-text tagging feature of US10237067. This combination would be obvious for enhancing the existing system's ability to create searchable indices from audio inputs (e.g., spoken comments about an image or video), thereby improving content discoverability. The motivation remains centered on automating metadata creation to overcome manual indexing challenges and on providing more robust retrieval capabilities, leading to a predictable outcome.
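To make the idea of a searchable index built from tags concrete, here is a minimal Python sketch of an inverted index. It is an illustration only and is not drawn from US6996251 or US8983119. The function names `build_index` and `search`, the file identifiers, and the all-terms-must-match rule are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_index(records: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each lowercased tag to the set of media file IDs that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for file_id, tags in records.items():
        for tag in tags:
            index[tag.lower()].add(file_id)
    return dict(index)


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the IDs of files whose tags include every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)
```

With tags such as words transcribed from a spoken comment, a call like `search(index, "beach dog")` would return only the files tagged with both terms, which is the kind of retrieval that manual indexing makes laborious.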

Conclusion:
Considering US Pat. No. 9,832,017 as the direct parent reference with a nearly identical scope, the independent claims (1, 6, and 13) of US10237067 appear to be obvious. Even without relying on the direct parent, combining earlier patents in the same family, such as US6996251 with US7778438 or US8983119, would render the claims obvious. These combinations give a PHOSITA a clear motivation to combine known elements for predictable improvements in automating multimedia content indexing and retrieval, and they directly address problems articulated in the patent family's own background.

Generated 5/24/2026, 12:50:39 AM