Prior art — US Patent 8190610

Here is an analysis of the most relevant prior art for US Patent 8,190,610, focusing on the patent citations listed in the Google Patents entry. The provided document lists 10 patent citations and 1 non-patent citation. For each, I will provide the full citation, publication/filing date, a brief description, and the claim(s) it potentially anticipates under 35 U.S.C. § 102.

Non-Patent Citation

1. Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters"

Full Citation: Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", USENIX Association OSDI '04: 6th Symposium on Operating Systems Design and Implementation, Dec. 6-8, 2004, pp. 137-149.
Publication/Filing Date: December 6-8, 2004
Brief Description: This paper describes Google's MapReduce programming model and its implementation for simplified data processing on large clusters. It details the "map" function for processing input key/value pairs into intermediate key/value pairs, and the "reduce" function for merging intermediate values associated with the same key into a single output. This foundational work is explicitly acknowledged as background in US8190610.
Potential Anticipated Claims (35 U.S.C. § 102): The patent itself states that conventional MapReduce implementations, such as the one described by Dean and Ghemawat, lack a facility for efficiently processing data from heterogeneous sources, which makes it impractical to perform joins over relational tables with different schemas. Claims 1, 17, 33, and 40 address this gap by processing multiple data groups that have different schemas but share a common key within a MapReduce framework. Dean and Ghemawat disclose the basic map, reduce, and partitioning operations and the distributed system elements common to all MapReduce implementations, so individual elements of claims 1, 17, 33, and 40 can be found in the paper. Anticipation under § 102, however, requires a single reference to disclose every element of a claim, and the paper does not appear to disclose the "grouped sets of key/value pairs" drawn from heterogeneous datasets. It therefore does not appear to anticipate any of these claims as a whole, although it remains the closest reference for the general MapReduce elements.
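
For readers less familiar with the programming model, the distinction drawn above can be pictured with a small single-process Python sketch. It is an illustration only: it is neither the distributed implementation described by Dean and Ghemawat nor the method claimed in US8190610. The helper map_reduce, the two toy tables, and the "emp"/"dept" source tags are all hypothetical names introduced here. The first half shows conventional use over one homogeneous input; the second half shows a join over two inputs with different schemas and a common key, where each row has to be tagged by hand so the reduce function can tell the sources apart.

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    """Apply map_fn, group intermediate values by key, then apply reduce_fn."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Conventional use: a single homogeneous input (lines of text).
def count_map(line):
    return [(word, 1) for word in line.split()]

def count_reduce(word, ones):
    return sum(ones)

word_counts = map_reduce(["a b a", "b c"], count_map, count_reduce)

# Two hypothetical tables with different schemas but a common key (dept_id).
# Each row carries a hand-written source tag, an assumption of this sketch.
employees = [
    ("emp", {"name": "Ana", "dept_id": 1}),
    ("emp", {"name": "Bo", "dept_id": 2}),
    ("emp", {"name": "Cy", "dept_id": 1}),
]
departments = [
    ("dept", {"dept_id": 1, "dept_name": "Search"}),
    ("dept", {"dept_id": 2, "dept_name": "Ads"}),
]

def join_map(tagged_row):
    source, row = tagged_row
    # Emit the common key so rows from both tables meet in the same group.
    return [(row["dept_id"], (source, row))]

def join_reduce(dept_id, tagged_rows):
    # Separate the mixed group back into its two sources, then pair them up.
    emps = [row for source, row in tagged_rows if source == "emp"]
    depts = [row for source, row in tagged_rows if source == "dept"]
    return [(e["name"], d["dept_name"]) for e in emps for d in depts]

joined = map_reduce(employees + departments, join_map, join_reduce)

print(word_counts)
print(joined)
```

In this sketch the single map function must understand every input schema and the reduce function must re-separate the mixed values, which is one way to see why joins over heterogeneous sources are awkward in the conventional model.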

Patent Citations

1. US6158044A - Proposal based architecture system

Full Citation: US6158044A to Epropose, Inc., published December 5, 2000.
Publication/Filing Date: Publication: 2000-12-05; Priority: 1997-05-21
Brief Description: This patent generally describes a system for generating proposals, involving retrieving and assembling data from various sources. It focuses on object-oriented architecture and database integration for proposal generation. It does not appear to describe a MapReduce framework.
Potential Anticipated Claims (35 U.S.C. § 102): Given its focus on proposal generation and object-oriented systems rather than distributed data processing with MapReduce, it is unlikely to directly anticipate the core claims of US8190610. It might be cited for general concepts of data retrieval or system architecture.

2. US6341289B1 - Object identity and partitioning for user defined extents

Full Citation: US6341289B1 to International Business Machines Corporation, published January 22, 2002.
Publication/Filing Date: Publication: 2002-01-22; Priority: 1999-05-06
Brief Description: This patent relates to object identity and data partitioning within database systems, particularly for user-defined data "extents." It addresses how objects are identified and how data can be partitioned across different storage structures.
Potential Anticipated Claims (35 U.S.C. § 102): This patent might be relevant to the "partitioning" aspects of US8190610's claims (e.g., in claims 1, 6, 7, 17, 23, 24, 33, 35, 40, 42) which discuss partitioning data into data partitions. However, it does not describe the MapReduce methodology or the handling of heterogeneous schemas with a common key in that context.

3. US6678691B1 - Method and system for generating corporate information

Full Citation: US6678691B1 to Koninklijke Kpn N.V., published January 13, 2004.
Publication/Filing Date: Publication: 2004-01-13; Priority: 1997-11-06
Brief Description: This patent describes a method and system for generating corporate information, likely involving data aggregation and processing from various sources within an enterprise. It focuses on business information systems.
Potential Anticipated Claims (35 U.S.C. § 102): This patent appears to be of general relevance to data processing for information generation rather than the specific distributed MapReduce methodology with heterogeneous grouped datasets. It is unlikely to anticipate the specific inventive steps of US8190610.

4. US20040225638A1 - Method and system for data mining in high dimensional data spaces

Full Citation: US20040225638A1 to International Business Machines Corporation, published November 11, 2004.
Publication/Filing Date: Publication: 2004-11-11; Priority: 2003-05-08
Brief Description: This publication discusses data mining techniques in high-dimensional data spaces. It likely involves algorithms and methods for extracting patterns and insights from complex datasets.
Potential Anticipated Claims (35 U.S.C. § 102): While broadly related to data processing, this patent application does not specifically describe the MapReduce framework or the inventive concept of grouped heterogeneous datasets as claimed in US8190610.

5. US20040230567A1 - Integrating intellectual capital into an intellectual capital management system

Full Citation: US20040230567A1 to Wookey Michael J., published November 18, 2004.
Publication/Filing Date: Publication: 2004-11-18; Priority: 2003-05-12
Brief Description: This patent application describes integrating intellectual capital into a management system, focusing on managing knowledge and intellectual assets.
Potential Anticipated Claims (35 U.S.C. § 102): This reference is unrelated to distributed data processing using MapReduce and would not anticipate the claims of US8190610.

6. US20060117036A1 - Method and apparatus to support bitmap filtering in a parallel system

Full Citation: US20060117036A1 to Thierry Cruanes, published June 1, 2006.
Publication/Filing Date: Publication: 2006-06-01; Priority: 2004-11-30
Brief Description: This patent application describes a method and apparatus for supporting bitmap filtering in a parallel system. This is a technique used in database systems for efficient data retrieval.
Potential Anticipated Claims (35 U.S.C. § 102): This could potentially be relevant to elements of a "distributed system" or "parallel processing" mentioned in claims 1, 17, 33, and 40. However, it does not disclose the MapReduce framework, the use of map and reduce functions for heterogeneous grouped data, or the specific iterator-based merging of intermediate results.

7. US7065618B1 - Leasing scheme for data-modifying operations

Full Citation: US7065618B1 to Google Inc., published June 20, 2006.
Publication/Filing Date: Publication: 2006-06-20; Priority: 2003-02-14
Brief Description: This patent describes a leasing scheme for controlling data-modifying operations, likely in a distributed system, to ensure data consistency and integrity.
Potential Anticipated Claims (35 U.S.C. § 102): This patent is concerned with data consistency in distributed systems, which is a different technical problem than the core MapReduce innovation for heterogeneous data in US8190610. It is unlikely to anticipate the claims.

8. US20070038659A1 - Scalable user clustering based on set similarity

Full Citation: US20070038659A1 to Google, Inc., published February 15, 2007.
Publication/Filing Date: Publication: 2007-02-15; Priority: 2005-08-15
Brief Description: This patent application describes scalable user clustering based on set similarity, a data analysis technique for grouping users with similar characteristics.
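Potential Anticipated Claims (35 U.S.C. § 102): This reference focuses on clustering algorithms and would not anticipate the core MapReduce claims of US8190610. On timing, its publication date (2007-02-15) is after US8190610's priority date of 2006-10-05, but its listed priority date (2005-08-15) is earlier. Because US8190610 predates the AIA, the pre-AIA version of § 102 applies rather than § 102(a)(1) or (2). The reference would not qualify as a printed publication under pre-AIA § 102(a) or (b), but it could still qualify under pre-AIA § 102(e) if its effective U.S. filing date precedes the date of invention.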

9. US20070255685A1 - Method and system for modelling data

Full Citation: US20070255685A1 to Boult Geoffrey M, published November 1, 2007.
Publication/Filing Date: Publication: 2007-11-01; Priority: 2006-05-01
Brief Description: This patent application describes methods and systems for modeling data, likely involving structuring and representing data for various applications.
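Potential Anticipated Claims (35 U.S.C. § 102): This reference focuses on data modeling and would not anticipate the core MapReduce claims of US8190610. On timing, its publication date (2007-11-01) is after US8190610's priority date of 2006-10-05, but its listed priority date (2006-05-01) is earlier. Because US8190610 predates the AIA, the pre-AIA version of § 102 applies rather than § 102(a)(1) or (2). The reference would not qualify as a printed publication under pre-AIA § 102(a) or (b); whether it qualifies under pre-AIA § 102(e) depends on whether its effective U.S. filing date (which may differ from the listed priority date, for example if the priority claim is to a foreign application) precedes the date of invention.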

10. US7620936B2 - Schema-oriented content management system

Full Citation: US7620936B2 to Coremedia AG, published November 17, 2009.
Publication/Filing Date: Publication: 2009-11-17; Priority: 2002-03-21
Brief Description: This patent describes a schema-oriented content management system, which deals with managing digital content based on defined schemas or structures.
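Potential Anticipated Claims (35 U.S.C. § 102): While US8190610 discusses "schemas" in the context of different data groups, this patent's focus on content management is distinct from the distributed MapReduce processing of heterogeneous datasets, so it is unlikely to anticipate the claims. The grant publication date listed here (2009-11-17) is well after US8190610's priority date, but the listed priority date (2002-03-21) is more than four years earlier, so timing alone does not rule the reference out as prior art; the distinction rests on subject matter.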
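
The date comparisons in the ten entries above can be checked mechanically. The following Python sketch uses only the priority and publication dates listed in this analysis and the 2006-10-05 priority date stated for US8190610. The function name timing and its three labels are hypothetical, and the output is a plain date comparison, not a legal determination: a listed priority date is not necessarily the effective U.S. filing date, and a granted patent may have an earlier pre-grant publication that is not listed here.

```python
from datetime import date

# Priority date of US8190610 as stated in this analysis.
CRITICAL_DATE = date(2006, 10, 5)

# (listed priority date, listed publication date) for each cited patent.
references = {
    "US6158044A": (date(1997, 5, 21), date(2000, 12, 5)),
    "US6341289B1": (date(1999, 5, 6), date(2002, 1, 22)),
    "US6678691B1": (date(1997, 11, 6), date(2004, 1, 13)),
    "US20040225638A1": (date(2003, 5, 8), date(2004, 11, 11)),
    "US20040230567A1": (date(2003, 5, 12), date(2004, 11, 18)),
    "US20060117036A1": (date(2004, 11, 30), date(2006, 6, 1)),
    "US7065618B1": (date(2003, 2, 14), date(2006, 6, 20)),
    "US20070038659A1": (date(2005, 8, 15), date(2007, 2, 15)),
    "US20070255685A1": (date(2006, 5, 1), date(2007, 11, 1)),
    "US7620936B2": (date(2002, 3, 21), date(2009, 11, 17)),
}

def timing(priority, publication, critical=CRITICAL_DATE):
    """Classify a reference by where its listed dates fall relative to the critical date."""
    if publication < critical:
        return "published before"
    if priority < critical:
        return "priority before, published after"
    return "entirely after"

results = {ref: timing(prio, pub) for ref, (prio, pub) in references.items()}

for ref, label in results.items():
    print(f"{ref}: {label}")
```

On the listed dates, the first seven references were published before the critical date, and the last three have an earlier priority date but a later publication date.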

Summary of Most Relevant Prior Art:

Based on the explicit mention in the patent and the nature of the invention, the most relevant prior art is the non-patent citation:

Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters" (December 2004): This publication is highly relevant because it describes the fundamental MapReduce programming methodology. US8190610 builds on it by extending MapReduce to handle grouped sets of key/value pairs from heterogeneous datasets, enabling operations such as distributed relational database joins that were impractical with conventional MapReduce. The paper therefore lays the groundwork but does not appear to anticipate claims 1, 17, 33, and 40, because it does not disclose the handling of heterogeneous data groups with differing schemas but a common key.

Among the patent citations, US6341289B1 (Object identity and partitioning) and US20060117036A1 (Bitmap filtering in a parallel system) touch upon concepts of data partitioning and parallel processing, which are general aspects of distributed systems that also form part of the MapReduce environment. However, none of the listed patent citations appear to anticipate the specific innovation of US8190610 regarding the enhanced MapReduce for heterogeneous grouped datasets with common keys.

It is important to note that the PTAB proceedings (IPR2024-00659 and IPR2024-00303), which were denied institution on the merits, challenged claims of US8190610 based on combinations of prior art, including Dean and Ghemawat and US7103590B2 (a patent listed under "Similar Documents" rather than "Citations"). Challenges that rely on a combination of references generally go to obviousness under 35 U.S.C. § 103 rather than anticipation under § 102, which requires a single reference. The Board found the petitioners' arguments unpersuasive at the institution stage. A denial of institution is not a final determination of patentability, but it is consistent with the view that, while Dean and Ghemawat is foundational, the specific aspects of US8190610 were found to go beyond it in those particular challenges.