Patent 8332844

Obviousness

Combinations of prior art that suggest the claimed invention would have been obvious under 35 U.S.C. § 103.

Active provider: Google · gemini-2.5-flash

Obviousness Analysis of US Patent 8332844 under 35 U.S.C. § 103

This analysis identifies combinations of prior art references that would render the independent claims of US Patent 8332844 obvious to a person having ordinary skill in the art (POSA) at the time of the invention (priority date of December 30, 2004). It draws its prior art from the "Patent Citations" and "Non-Patent Citations" sections of the patent.

The core contributions of US Patent 8332844, as articulated in its summary, are centered on:

  1. A block-level root/leaf system for distributing application environments, allowing for on-the-fly boot image creation and streamlined updates. This foundational concept is explicitly described as prior art within the patent itself and is derived from a priority application.
  2. Caching blocks of the root image that have been accessed by at least one compute node to improve performance.
  3. Receiving indexing results pertaining to the root image from one compute node and providing these results to other compute nodes to reduce redundant operations.

The obviousness arguments below therefore focus on showing that a POSA would have found it obvious to combine the known block-level root/leaf system with well-established caching techniques and with methods for sharing indexing results.

Combination for Claims 1 and 7 (System and Method for Data Provision with Caching)

Independent Claim 1: A system for providing data to a plurality of compute nodes, the system comprising: a first storage unit configured to store blocks of a root image of said compute nodes; a plurality of second storage units configured to store leaf images of respective compute nodes, said leaf images including only additional data blocks not previously contained in said root image and changes made by respective compute nodes to the blocks of said root image, wherein said leaf images of respective compute nodes do not include blocks of said root image that are unchanged by respective compute nodes; and a cache configured to cache blocks of said root image previously accessed by at least one of said compute nodes.

Independent Claim 7: A method for providing data to a plurality of compute nodes, comprising: storing blocks of a root image of said compute nodes on a first storage unit; storing leaf images for respective compute nodes on respective second storage units, said leaf images including only additional data blocks not previously contained in said root image and changes made by respective compute nodes to the blocks of the root image, wherein said leaf images of respective compute nodes do not include blocks of said root image that are unchanged by respective compute nodes; and caching blocks of said root image that have been accessed by at least one of said compute nodes in a cache memory.

Prior Art Combination:
A combination of Nguyen et al. (U.S. Application No. 11/026,622), US7870106B1 (Panta Systems, Inc.), and general knowledge of caching principles would render Claims 1 and 7 obvious.

  • Nguyen et al. (U.S. Application No. 11/026,622), titled "Branching Store File System" and filed on December 30, 2004, serves as foundational prior art, as US8332844 is a continuation-in-part of an application claiming priority to it. The '844 patent explicitly describes this branching store file system as prior art: "In a branching store file system, a read-only base image (or “root” image) of the application environment is created. The root image is accessible by all compute nodes in the cluster. Changes made by a compute node to the root image are stored in a “leaf” image unique to that compute node." This reference teaches a root image on a first storage unit and leaf images on second storage units, where the leaf images contain only changes and new data, and so covers the first two limitations of Claims 1 and 7. Moreover, the '844 patent describes its embodiments as an "operating system-independent system and method for distributing an application environment to a compute node" that operates "at the block level."

  • US7870106B1 (Panta Systems, Inc.), titled "Client side caching in a global file system," shares the same priority date (December 30, 2004) and was originally assigned to the same entity (Panta Systems, Inc.). This patent teaches client-side caching of data from a global file system, demonstrating the well-known concept of improving access times for shared data in a distributed environment through caching.

  • General knowledge in the art: caching is a standard technique for improving performance by storing frequently or recently accessed data closer to the processing unit. The '844 patent itself highlights how often compute nodes in a cluster access the same data: "Because several compute nodes in a cluster may often access the same data (e.g., same drivers, same library files, etc.) on the root image, tremendous speed improvements can be realized by caching such data in cache 260." It further notes the significant speed difference between storage disks and cache memory.
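
The arrangement recited in Claims 1 and 7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's or Nguyen et al.'s implementation: the class names (RootImage, RootCache, UnionBlockDevice), the dictionary-backed storage, and the block contents are all hypothetical. It shows a read-only root image, per-node leaf images that hold only changed or added blocks, and a shared cache of root blocks that at least one node has already accessed.

```python
from typing import Dict


class RootImage:
    """Read-only base image shared by all nodes (stands in for the first storage unit)."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self._blocks = dict(blocks)
        self.reads = 0  # number of reads that reached the backing storage

    def read(self, block_no: int) -> bytes:
        self.reads += 1
        return self._blocks[block_no]


class RootCache:
    """Shared cache of root blocks previously accessed by at least one node."""

    def __init__(self, root: RootImage) -> None:
        self._root = root
        self._cached: Dict[int, bytes] = {}

    def read(self, block_no: int) -> bytes:
        if block_no not in self._cached:
            # Cache miss: fetch once from root storage and remember the block.
            self._cached[block_no] = self._root.read(block_no)
        return self._cached[block_no]


class UnionBlockDevice:
    """Per-node merged view: a block in the leaf overrides the same block in the root."""

    def __init__(self, cache: RootCache) -> None:
        self._cache = cache
        self.leaf: Dict[int, bytes] = {}  # only changed or newly added blocks

    def read(self, block_no: int) -> bytes:
        if block_no in self.leaf:
            return self.leaf[block_no]
        return self._cache.read(block_no)

    def write(self, block_no: int, data: bytes) -> None:
        # Writes never touch the root image; they land in this node's leaf.
        self.leaf[block_no] = data


root = RootImage({0: b"boot", 1: b"libc", 2: b"drvr"})
cache = RootCache(root)
node_a = UnionBlockDevice(cache)
node_b = UnionBlockDevice(cache)

node_a.write(1, b"LIBC")   # node A's change is stored in its own leaf only
first = node_a.read(0)     # miss: fetched from root storage, then cached
second = node_b.read(0)    # hit: served from the shared cache
```

In this sketch the second node's read of block 0 never reaches root storage, which is the performance benefit the analysis attributes to caching a shared, read-only root image.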

Motivation to Combine:
A POSA would have been motivated to combine the block-level root/leaf system of Nguyen et al. with caching techniques because of the inherent performance bottlenecks in distributed computing, especially concerning shared, read-only data. The root image in the Nguyen et al. system is explicitly described as "read-only" and "accessible by all compute nodes." A POSA would recognize that frequently accessed blocks from this common root image, particularly during cluster boot-up or application launch, would benefit significantly from caching. Caching would reduce the load on the first storage unit and the network, and drastically improve data access times for subsequent compute nodes, leading to faster "cluster bring-up time" and general access time, as acknowledged in the '844 patent. The explicit teaching of client-side caching in a global file system by US7870106B1 further reinforces this motivation, showing that applying caching to shared data in a distributed storage environment was a recognized and desirable solution for performance improvement.

Combination for Claims 14, 19, and 23 (System, Method, and Computer-Readable Medium for File System Indexing)

Independent Claim 14: A system for indexing file systems for a plurality of compute nodes, the system comprising: a first storage unit configured to store blocks of a root image of said compute nodes; a plurality of second storage units configured to store leaf images of respective compute nodes, said leaf images comprising only additional data blocks not previously contained in said root image and changes made by respective compute nodes to the blocks of said root image, wherein said leaf images of respective compute nodes do not include blocks of said root image that are unchanged by respective compute nodes; and a plurality of union block devices corresponding to said compute nodes, said union block devices configured to interface between said compute nodes and said first and second storage units to distribute said file systems to said compute nodes, wherein said union block devices are configured to create said file systems by merging the blocks of said root image stored on the first storage unit with the blocks of respective leaf images stored on respective second storage units, and wherein further at least one of said compute nodes is configured to index said root image and provide the indexing results to another of said compute nodes.

Independent Claim 19: A method for indexing file systems for a plurality of compute nodes, comprising: storing blocks of a root image of said compute nodes on a first storage unit; storing leaf images for respective compute nodes on respective second storage units, said leaf images comprising only additional data blocks not previously contained in said root image and changes made by respective compute nodes to the blocks of the root image, wherein said leaf images for respective compute nodes do not include blocks of said root image that are unchanged by respective compute nodes; merging the blocks of said root image with the blocks of respective leaf images stored on respective second storage units to create respective file systems for respective compute nodes; receiving indexing results pertaining to said root image from one of said compute nodes; and providing said indexing results to the others of said compute nodes.

Independent Claim 23: A computer-readable storage medium having instructions stored thereon that, in response to execution by at least one computing device, cause the at least one computing device to: receive data blocks of a file system, said data blocks comprising a root image portion and leaf image portion, said leaf image portion comprising only additional data blocks not previously contained in said root image portion and, changes made by said first compute node to the blocks of said root image, wherein said leaf image portion does not include blocks of said root image that are unchanged by said first compute node, wherein said file system is the result of merging said root image portion and said leaf image portion together at the block-level; index said root image portion; and provide the results of said indexing to a second compute node, wherein said logic encoded in the one or more tangible media comprise computer executable instructions executed by the first compute node.

Prior Art Combination:
A combination of Nguyen et al. (U.S. Application No. 11/026,622), US6421777B1 (International Business Machines Corporation), and general knowledge of optimizing distributed computing resources would render Claims 14, 19, and 23 obvious.

  • Nguyen et al. (U.S. Application No. 11/026,622), as discussed previously, provides the core block-level root/leaf system and the merging of these images by union block devices (UBDs) to present a cohesive file system to each compute node. This covers the first three limitations of Claims 14 and 19 (root image storage, leaf image storage, and merging of root and leaf blocks) and the data reception and merging limitations of Claim 23. The '844 patent explains that "UBDs 230a-n are effectively low-level drivers that operate as an interface between the first and second storage devices and the file system of each compute node 220a-n."

  • US6421777B1 (International Business Machines Corporation), titled "Method and apparatus for managing boot images in a distributed data processing system," teaches managing boot images in a distributed system, which implies the presence of common system components across multiple nodes. This reference addresses challenges in distributed environments where many machines share a common base.

  • General knowledge in the art for optimizing distributed computing resources includes avoiding redundant computations, especially for common and unchanging data. The '844 patent itself describes the problem solved: "In a traditional clustered computing situation, each compute node would independently index its file system. To the extent that each compute node in a cluster has similar operating environments to the others, indexing performed on common data is therefore redundant." The patent further notes that for read-only root images, re-indexing is unnecessary as contents do not change.
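
The index-once, share-with-others idea in Claims 14, 19, and 23 can be sketched as follows. This is a hedged illustration only: the toy index (block content mapped to block numbers), the SharedIndexStore class, and the ComputeNode class are hypothetical stand-ins, and the patent does not specify this index format or sharing mechanism.

```python
from typing import Dict, List, Optional

Index = Dict[bytes, List[int]]


def build_index(root_blocks: Dict[int, bytes]) -> Index:
    """Toy index: map each distinct block content to the block numbers holding it."""
    index: Index = {}
    for block_no in sorted(root_blocks):
        index.setdefault(root_blocks[block_no], []).append(block_no)
    return index


class SharedIndexStore:
    """Hypothetical shared location where one node's indexing results are kept."""

    def __init__(self) -> None:
        self._index: Optional[Index] = None

    def publish(self, index: Index) -> None:
        self._index = index

    def fetch(self) -> Optional[Index]:
        return self._index


class ComputeNode:
    def __init__(self, root_blocks: Dict[int, bytes], store: SharedIndexStore) -> None:
        self._root_blocks = root_blocks
        self._store = store
        self.indexed_locally = False

    def get_root_index(self) -> Index:
        index = self._store.fetch()
        if index is None:
            # No results shared yet: index the root image once and publish.
            index = build_index(self._root_blocks)
            self._store.publish(index)
            self.indexed_locally = True
        return index


root_blocks = {0: b"boot", 1: b"libc", 2: b"drvr", 3: b"libc"}
store = SharedIndexStore()
node_a = ComputeNode(root_blocks, store)
node_b = ComputeNode(root_blocks, store)

index_a = node_a.get_root_index()  # node A performs the indexing
index_b = node_b.get_root_index()  # node B reuses node A's results
```

Because the root image is read-only, the published results stay valid for every node; only the first caller does the indexing work in this sketch.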

Motivation to Combine:
A POSA, faced with a block-level root/leaf system (Nguyen et al.) where a common, read-only root image is shared by multiple compute nodes, would immediately recognize the inefficiency and wasted resources (CPU, disk I/O, network bandwidth) if each node independently indexed the identical root image. The goal in distributed computing, particularly High Performance Computing (HPC) clusters, is to maximize resource utilization and minimize redundant effort. Therefore, a POSA would be motivated to optimize this process by having only one compute node (or a dedicated indexing host, as the patent suggests as an alternative) perform the indexing of the unchanging root image. The results could then be shared with other compute nodes, either by providing them directly or storing them in a shared location (e.g., a shared storage unit or even the cache 260, as suggested by the patent). This would save "valuable time and resources" for the other compute nodes by eliminating redundant operations, a clear and obvious improvement in a distributed computing context. US6421777B1 highlights the general problem of managing common components in distributed boot image systems, further motivating solutions that streamline operations on these common parts.

Generated 5/31/2026, 12:48:00 PM