Obviousness — US Patent 10713672

Obviousness Analysis of US Patent 10713672 Under 35 U.S.C. § 103

This analysis identifies combinations of prior art references that would have rendered the independent claims of US Patent 10713672 obvious to a person having ordinary skill in the art (POSA) before the patent's earliest priority date of August 30, 2012. It relies on the prior art references cited and incorporated by reference in the patent itself, together with general knowledge in related technical fields.

Independent Claims Overview

US Patent 10713672 B1 includes three independent claims:

Claim 1: A computer-implemented method for identifying geographic clusters of venues based on venue check-in data, involving generating check-in intensity vectors for venues, creating a pairwise venue similarity matrix combining geographical and social distance (based on common visitors), and identifying clusters from this matrix.
Claim 10: A computer-implemented method similar to Claim 1, but applied to geographic sub-regions (e.g., census tracts) rather than individual venues.
Claim 17: A computer system configured to perform the method of Claim 1.

Prior Art References and Their Teachings

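The patent itself cites and incorporates several relevant references. Cheng et al. (2011), Blei and Frazier (2011), and Ghosh et al. (2011) were published before the critical date of August 30, 2012. Ghosh et al. (2012) appeared in the proceedings of the December 2012 NIPS conference, after that date, and is therefore relied on only as supporting evidence of available tooling rather than as prior art: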

Cheng et al. (2011), "Exploring millions of footprints in location sharing services," AAAI ICWSM, 2011. This reference demonstrates the availability and utility of large datasets of venue check-in data from location-based social networks like Foursquare for analyzing location patterns. The patent explicitly states that a dataset of approximately 16 million Foursquare check-ins was used, with eleven million of these extracted from data released by Cheng et al. (2011).
Blei and Frazier (2011), "Distance dependent Chinese restaurant processes," J. Mach. Learn. Res., November 2011. This paper introduces the Distance Dependent Chinese Restaurant Process (ddCRP), a probabilistic model for clustering non-exchangeable data that utilizes a similarity matrix (A) to specify prior assumptions about relationships between items.
Ghosh et al. (2011), "Spatial distance dependent Chinese restaurant processes for image segmentation," Neural Information Processing Systems, 2011. This work extends the ddCRP to hierarchical modeling and applies it in a spatial context, specifically for image segmentation.
Ghosh et al. (2012), "From deformations to parts: Motion-based segmentation of 3d objects," Advances in Neural Information Processing Systems 25, pp. 2006-2014, 2012. This reference is noted in US10713672 for its MATLAB implementation of a ddCRP Gibbs sampler, indicating the availability of practical tools for implementing ddCRP.

Obviousness Argument for Independent Claims 1 and 17 (Venue Clusters)

A POSA in data science, machine learning, or urban computing, prior to August 30, 2012, would have found it obvious to combine the teachings of Cheng et al. (2011), Blei & Frazier (2011), and Ghosh et al. (2011) to arrive at the method and system of Claims 1 and 17.

Motivation for Combination:

The motivation would have been to discover meaningful geographic clusters of venues by leveraging both the physical proximity of venues and the social interactions of users reflected in location-based check-in data. This addresses the problem of understanding urban structure, which the patent itself identifies as critical for endeavors such as urban planning, real estate, and marketing.

Detailed Breakdown:

Obtaining and Representing Venue Check-in Data (Preamble & Claim 1(i)): Cheng et al. (2011) explicitly teaches the availability and utility of large-scale venue check-in data from location-sharing services. A POSA would readily understand that such data inherently contains information about which users visited which venues and when. The idea of representing user activity at venues using "check-in intensity vectors" (Claim 1(i)), where each element corresponds to a user and its value reflects check-in frequency, is a standard data representation technique analogous to "bag-of-words" models in text analysis or user-item interaction matrices in recommender systems. This is a routine step for analyzing user engagement with discrete entities.
Generating a Pairwise Venue Similarity Matrix Combining Geographical and Social Distance (Claim 1(ii)):
- Geographical Distance: The patent notes that "Almost always, the geographical proximity of venues is a factor in grouping venues into a cluster." Calculating geographical distance between venues based on coordinates (e.g., GPS) is a fundamental and well-known operation in any location-based service, as implied by the "location sharing services" discussed in Cheng et al. (2011).
- Social Distance Based on Common Visitors: Given the context of "location sharing services" (Cheng et al. 2011), a POSA would recognize that shared user check-ins between venues indicate a social connection or common patronage. Defining "social distance" (or similarity) based on whether common venue visitors frequent a pair of venues, and computing this similarity using established metrics like cosine or Jaccard similarity on user-venue interaction vectors (as described in the patent), is a conventional application of social network analysis principles.
- Combining Geographical and Social Distance: Blei & Frazier (2011) introduce the ddCRP model, which employs a "similarity matrix A"; applied to venues, that matrix is a "flexible way to specify prior assumptions about the strength of relationships between pairs of venues." Ghosh et al. (2011) further demonstrates the application of ddCRP in spatial contexts. A POSA seeking to cluster venues on both physical location and user-driven social patterns would have been motivated to combine these two natural similarity factors (geographical proximity and social interaction). Merging several factors into a single similarity metric, for example through a weighted sum or by restricting social similarity to spatial neighbors (e.g., each venue's m closest neighbors, as in patent equation (1)), is a common and obvious design choice in multivariate clustering.
Identifying Geographic Clusters (Claim 1(iii)): Once a comprehensive similarity matrix (A) is generated, identifying clusters based on this matrix is a direct application of known clustering algorithms. The patent explicitly mentions using "spectral clustering" and the "distance dependent Chinese restaurant process (ddCRP)." Both spectral clustering and ddCRP (from Blei & Frazier 2011 and Ghosh et al. 2011) are well-established clustering techniques that operate on similarity matrices. Applying these known algorithms to a similarity matrix that combines geographical and social aspects would be an obvious choice for a POSA seeking to achieve the desired clustering. The MATLAB implementation of a ddCRP Gibbs sampler referenced in Ghosh et al. (2012) further indicates that the tools for such implementations were available.
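To make the three steps above concrete, the following is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the patent's implementation: the toy check-ins, coordinates, and all function names are hypothetical; the neighbor rule is only one possible reading of the "m closest neighbors" constraint, since the exact form of patent equation (1) is not reproduced here; and step (iii) uses connected components of a thresholded similarity graph as a deliberately simple stand-in for spectral clustering or ddCRP.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (user_id, venue_id) check-in events.
CHECKINS = [
    ("u1", "cafe"), ("u1", "cafe"), ("u1", "bar"),
    ("u2", "cafe"), ("u2", "bar"), ("u2", "bar"),
    ("u3", "gym"), ("u3", "pool"), ("u4", "gym"), ("u4", "pool"),
]
# Hypothetical venue coordinates as (latitude, longitude) in degrees.
COORDS = {
    "cafe": (40.4406, -79.9959), "bar": (40.4410, -79.9950),
    "gym": (40.4600, -79.9300), "pool": (40.4605, -79.9310),
}


def intensity_vectors(checkins):
    """Step (i): per-venue sparse vector mapping user -> check-in count."""
    vectors = defaultdict(Counter)
    for user, venue in checkins:
        vectors[venue][user] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def similarity_matrix(vectors, coords, m):
    """Step (ii): social similarity, kept only between geographic neighbours.

    Assumption: a pair is kept if either venue is among the other's m
    geographically closest venues; every other entry is treated as zero.
    """
    venues = sorted(coords)
    nearest = {}
    for v in venues:
        others = [w for w in venues if w != v]
        others.sort(key=lambda w: haversine_km(coords[v], coords[w]))
        nearest[v] = set(others[:m])
    sim = {v: {} for v in venues}
    for v in venues:
        for w in venues:
            if v != w and (w in nearest[v] or v in nearest[w]):
                sim[v][w] = cosine(vectors[v], vectors[w])
    return sim


def clusters(sim, threshold):
    """Step (iii) stand-in: connected components of the thresholded graph."""
    seen, result = set(), []
    for start in sorted(sim):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(w for w, s in sim[v].items()
                         if s > threshold and w not in component)
        seen |= component
        result.append(sorted(component))
    return result
```

On this toy data, clusters(similarity_matrix(intensity_vectors(CHECKINS), COORDS, m=1), threshold=0.5) groups the cafe with the bar and the gym with the pool, because each pair is both geographically adjacent and shares visitors. Each function composes operations that were routine before the critical date (count vectors, cosine similarity, great-circle distance, nearest-neighbor filtering), which is the point the breakdown above makes.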

Therefore, Claims 1 and 17 would have been obvious over the combination of Cheng et al. (2011), which supplies the data context and the problem, with Blei & Frazier (2011) and Ghosh et al. (2011), which supply a clustering framework that explicitly accommodates a flexible similarity matrix, in view of general knowledge of how to derive and combine geographical and social similarity from location-based social network data.

Obviousness Argument for Independent Claim 10 (Sub-Region Clusters)

Claim 10 describes applying the same clustering methodology to geographic sub-regions (e.g., census tracts) instead of individual venues. The patent itself explicitly describes this as an alternative embodiment: "In other embodiments, rather than clustering venues as described above, the system could be used to cluster sub-regions in the geographic region, where the sub-regions themselves contain multiple venues."

A POSA interested in analyzing urban patterns at different spatial scales would find it obvious to extend the venue-level clustering method to a higher geographical aggregation level, such as sub-regions.

Aggregating Check-in Data to Sub-Regions (Claim 10(i)): Instead of individual venues, check-in data would be aggregated to the sub-regions (e.g., cumulative check-ins to all venues within a sub-region by a given user). This is a routine data aggregation step in geographical information systems (GIS) and urban analytics.
Generating Similarity Matrix for Sub-Regions (Claim 10(ii)): The principles for determining similarity between sub-regions would follow directly from the venue-level approach. Geographical similarity between sub-regions (e.g., distance between centroids) is well-known. Social similarity would be derived from common users checking into venues within those sub-regions. Combining these factors into a similarity matrix for sub-regions would be an obvious parallel extension of the method described for venues.
Identifying Clusters of Sub-Regions (Claim 10(iii)): Applying the same known clustering algorithms (like spectral clustering or ddCRP) to this sub-region similarity matrix to identify clusters of sub-regions is a straightforward application of the underlying methodology.
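The aggregation step described above can be sketched in a few lines. This is an illustrative sketch only: the venue-to-tract mapping, the check-in data, and the function names are hypothetical, and the centroid is a plain arithmetic mean of coordinates, which is a reasonable approximation only over small areas.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from each venue to the sub-region (e.g., census tract)
# that contains it.
VENUE_TO_TRACT = {
    "cafe": "tract_A", "bar": "tract_A",
    "gym": "tract_B", "pool": "tract_B",
}
# Hypothetical (user_id, venue_id) check-in events.
CHECKINS = [
    ("u1", "cafe"), ("u1", "bar"), ("u2", "bar"),
    ("u2", "gym"), ("u3", "gym"), ("u3", "pool"),
]


def subregion_vectors(checkins, venue_to_region):
    """Cumulative check-ins per user over all venues in each sub-region."""
    vectors = defaultdict(Counter)
    for user, venue in checkins:
        vectors[venue_to_region[venue]][user] += 1
    return vectors


def centroid(points):
    """Arithmetic mean of (lat, lon) points, used as a sub-region's location."""
    lats, lons = zip(*points)
    return (sum(lats) / len(lats), sum(lons) / len(lons))
```

The resulting per-tract vectors and centroids have the same shape as the venue-level intensity vectors and venue coordinates, so the same similarity and clustering steps apply to them unchanged.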

The motivation for a POSA to apply the clustering technique to sub-regions would have been to analyze urban patterns at different granularities, or to align the discovered clusters with existing administrative or statistical boundaries for planning and analytical purposes. This represents a known design choice in spatial analysis rather than a nonobvious advance over the venue-level clustering method.