Introduction

Line simplification is a perennial topic of research in cartography, as well as in other fields employing linear representations of various forms of signal. While most scholars in cartographic generalization recognize the importance of scale in determining appropriate degrees of simplification, surprisingly few line simplification algorithms in cartography are presented with a means of objectively relating input tolerance parameter values to target scales, with notable exceptions from Perkal (1966), Dutton (1999), and Li and Openshaw (1992). This research presents a new algorithm termed hexagonal quantization. The development of this algorithm borrows from sampling theory to produce a method which is readily calibrated to target map scales and resolutions. The hexagonal quantization algorithm is a contribution in the spirit of Li and Openshaw's (1990) natural principle for map generalization: that visual resolution at the target scale should determine the degree to which features are transformed.

Research context in cartography and elsewhere

Line simplification, like map generalization more generally, is often agreed to be a scale-driven process (Li 1996; Dutton 1999) that takes place in order to use detailed data from some larger cartographic scale on smaller-scale maps. Among the most popular conceptualizations of line simplification has been that of Peucker (1976), who modeled polylines as potentially noisy vertex-based representations of true feature lines. In Peucker's model, the frequency of vertices along a polyline bears a relationship to how efficiently, and with how much lateral displacement, the polyline communicates the position of the true line. Peucker (1976, 512) describes his theory as data-reducing, with "the objective of the theory to keep the computation at any time on the highest level of abstraction that the particular problem allows". While he did not exclude the possibility that his ideas could apply to the simplification of lines as scale reduces, it is in this manner that the Douglas-Peucker algorithm (Douglas and Peucker 1973; McMaster and Shea 1992) has become popular. Related "bandwidth" approaches to line simplification have been developed in cartography, with notable contributions focused on "strip tree" segmentation (Ballard 1981; Buttenfield 1985; Cromley 1991). Further segmentation strategies, not necessarily focused on bandwidth concepts, have also been proposed, focusing on local line characteristics (Buttenfield 1989, 1991; Cromley and Campbell 1992; Garcia and Fdez-Valdivia 1994; Plazanet 1995), as well as geometrically defined partitions such as Delaunay triangulations (Van Der Poorten and Jones 2002) and other tessellations (Li and Openshaw 1992; Zhan and Buttenfield 1996; Dutton 1999). Recent work by Park and Yu (2011) has focused on segmenting lines into portions according to which of several algorithms yields the least positional displacement on each, and simplifying each segment with the algorithm thus selected.

Some researchers (McMaster 1987; Jenks 1989; Veregin 1999) have regarded point reduction to a "characteristic" subset as necessary, if not sufficient, for an algorithm to be considered line simplification. This view has been challenged by several scholars (e.g., Dutton 1999; Li 2007). While there are grounds for thinking of certain vertices as more representative of a line's general shape (Marino 1979), Dutton (1999) counters that all vertices are representative geometric abstractions, that no particular subset of them should be considered sacrosanct when simplifying lines, and that the creation of new vertices may be appropriate for cartographic simplification.

Scale-specificity and related algorithms

Scale-specificity, a kind of map constraint, has been infrequently discussed in the literature. While it seems intuitive that the input tolerance value used for a given simplification algorithm is related to scale, an objective, mathematical relation between scales and tolerance values remains elusive in practice (Li and Openshaw 1993). Topfer and Pillewizer (1966) authored one of the only pieces of literature expressly devoted to scale-specificity in generalization, their work having become known as the Radical Law. The work provides equations to calculate how many features should remain on a map generalized to a target scale, given the number of those features present at the initial scale. The Law has also been tailored to consider the characteristics of linear features in generalization (e.g., Buttenfield et al. 2011). Perkal's ε-band (1966; Chrisman 1983) is one of the few cartographic line simplification algorithms that is explicitly scale-calibrated, involving a circular roulette rolled along the edge of a polygonal or linear feature, whose diameter is mathematically derived from target map resolution and input and output scale values.
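In its basic form, the Radical Law relates the feature count at the target scale to that at the source scale by the square root of the ratio of scale denominators. A minimal sketch (the function name is ours, and Topfer and Pillewizer's optional exaggeration and symbol constants are omitted):

```python
from math import sqrt

def radical_law(n_source: int, source_denom: float, target_denom: float) -> int:
    """Basic Radical Law: n_f = n_a * sqrt(M_a / M_f), where M_a and M_f
    are the denominators of the source and target scales."""
    return round(n_source * sqrt(source_denom / target_denom))

# Example: 1,000 features compiled at 1:25,000, generalized to 1:100,000
print(radical_law(1000, 25_000, 100_000))  # -> 500
```

Because the target scale is smaller (larger denominator), the ratio under the root is below one and the retained feature count shrinks accordingly.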

The Li--Openshaw raster-vector algorithm (Li and Openshaw 1992) is one of three variants based on the natural principle developed by the authors (1990, 1993), and forms a central part of Li's suggested paradigm for map generalization (Li and Su 1995; Li 1996). The algorithm calibrates to target scale using a ratio of input and target scales, as well as target map resolution, to define a "smallest visible size" (SVS). In the raster-vector variant, a raster is generated at SVS resolution and overlaid on the input line. Sequencing along the line, all vertices falling into a raster cell are collapsed to a single vertex using a user-defined quantization method. The vector variation of the algorithm uses the SVS to define a smallest visible object (SVO) which is translated along the line to define vertex quantizations; Li and Openshaw (1992) describe the vector mode algorithm with a circle whose diameter and peripheral intersection points with the input line define sampling loci.
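The raster-vector variant described above can be sketched compactly: overlay a grid at SVS resolution and collapse each sequential run of vertices falling in the same cell to a single vertex. This is an illustrative sketch only; the quantization method is user-defined in Li and Openshaw's formulation, and the choice of the run's mean here is ours:

```python
from math import floor

def li_openshaw_raster_vector(line, svs):
    """Sketch of the Li-Openshaw raster-vector idea: sequencing along
    the line, consecutive vertices falling into the same SVS-sized
    grid cell are collapsed to one vertex (here, their mean)."""
    out, run, run_cell = [], [], None
    for x, y in line:
        cell = (floor(x / svs), floor(y / svs))  # grid cell index of this vertex
        if run and cell != run_cell:             # run ended: emit its mean
            out.append((sum(p[0] for p in run) / len(run),
                        sum(p[1] for p in run) / len(run)))
            run = []
        run_cell = cell
        run.append((x, y))
    if run:                                      # emit the final run
        out.append((sum(p[0] for p in run) / len(run),
                    sum(p[1] for p in run) / len(run)))
    return out

line = [(0.1, 0.1), (0.4, 0.2), (1.2, 0.3), (1.4, 0.9), (2.3, 1.1)]
print(li_openshaw_raster_vector(line, 1.0))  # five vertices collapse to three
```

Note that because runs are sequential, a line re-entering a previously visited cell still produces a new output vertex, preserving line topology.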

Wang and Muller (1998) present a method, implemented in ArcGIS (ESRI Inc., Redlands, CA, USA) as "bend simplify," based on recognizing line shapes and filtering them against cartographic specifications. Veregin (2000) has suggested that popular algorithms such as the Douglas-Peucker method can be used with tolerance parameters calibrated against a priori map constraints on aggregated samples of lines. Also, constraints such as "lines must not self-cross" have inspired post-processing routines (Saalfeld 1999).

Vertex clustering and mesh simplification

Outside of cartography, computer graphics specialists have also tackled the problem of form simplification. Mesh simplification is a family of methods for reducing rendering detail (Yang and Chuang 2003), often of three-dimensional manifolds composed of vertices and triangle faces. Frequently mesh simplification reduces detail to a degree commensurate with simulated viewing distance, as in a simulation or game (Hoppe 1996). Mesh simplification is generally done by eliminating vertices from a manifold, sometimes also creating new vertices to represent those that have been collapsed, in a process known as vertex clustering (Dalmau 2004). Rossignac (2004, 1224) writes:

"Vertex clustering, among the simplest simplification techniques, is based on a crude vertex quantization, obtained by imposing a uniform, axis-aligned grid and clustering all vertices that fall in the same grid cell."

Intuitively, the degree of simplification is a function of grid cell size. Within cartography, Burghardt and Cecconi (2007) have suggested mesh simplification be used for building generalization.
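Rossignac's description maps directly onto a hash of grid-cell indices. A minimal sketch in three dimensions, with our illustrative choice of the cluster mean as the representative vertex:

```python
from collections import defaultdict
from math import floor

def vertex_cluster(vertices, cell_size):
    """Uniform-grid vertex clustering: all vertices sharing an
    axis-aligned grid cell collapse to one representative vertex
    (here, their mean)."""
    cells = defaultdict(list)
    for v in vertices:
        key = tuple(floor(c / cell_size) for c in v)  # axis-aligned cell index
        cells[key].append(v)
    # One output vertex per occupied cell: the component-wise mean.
    return [tuple(sum(c) / len(vs) for c in zip(*vs))
            for vs in cells.values()]

pts = [(0.2, 0.2, 0.1), (0.3, 0.1, 0.2), (2.5, 2.5, 2.5)]
print(len(vertex_cluster(pts, 1.0)))  # -> 2 (two occupied cells remain)
```

Unlike the sequential raster-vector case, this clusters all co-located vertices regardless of their order in the mesh, which is why it can alter manifold topology.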

Hexagonal and square tessellations applied to pattern analysis and generalization

Applications of tessellated data models are widespread in pattern analysis and computer graphics, but have rarely been used in cartographic line simplification. Inherent to a tessellated representation is a process of quantization; a single measurement is recorded in each cell of a sampling mesh, thereby generalizing what is potentially a complex signal, such as a vertex set from a map line. The quantization is a function of the input data and the sampling geometry: size, shape, topology, translation, and orientation. Positional displacement (i.e., error) caused by quantization is bounded by the sampling resolution.
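For instance, snapping a vertex to its cell's center can displace it by no more than the cell's circumradius. A quick numeric check for a unit square cell (the snap-to-center rule is our illustrative choice):

```python
from math import hypot, sqrt
import random

random.seed(1)
# Snap random points in a unit square cell to its center (0.5, 0.5)
# and measure the largest displacement observed.
worst = max(hypot(random.random() - 0.5, random.random() - 0.5)
            for _ in range(100_000))
assert worst <= sqrt(2) / 2  # bounded by the square cell's circumradius
print(round(worst, 3))
```

The same bound applies to any cell shape with its own circumradius, which is part of why cell geometry matters to quantization error.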

A large body of literature on pattern analysis examines the relative merits of the three possible regular tessellations of the plane: the triangular, square, and hexagonal. The triangular is rarely considered for sampling purposes, as it exhibits inherent orientation inconsistency. Overwhelmingly, the literature indicates that hexagonal sampling meshes perform more efficiently, with less error, and with more meaningful inter-element connectivity than square meshes (Yajima et al. 1981; Graham 1990; Carr et al. 1992; Iftekharuddin and Karim 1993; Puu 2005). Despite this virtual consensus, square pixel data models remain dominant in practices using regular grids, examples of which include common digital graphics formats, climatological and ecological models, and geographical information system raster data. Graham (1990) suggests that pioneering work in computer spatial modeling by Unger (1958) using square pixels may have set a decisive precedent. Also, the simple Cartesian coordinates with which square pixels may be easily indexed are a quality that makes them attractive (Birch et al. 2007).

[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

Connectivity between cells is one of the most convincing reasons why hexagons are frequently regarded as better suited than squares to sampling planar signal. If the neighbors of a cell are those that contact it by either an edge or a corner, then the only shape among triangles, squares, and hexagons with a consistent distance to all its neighbors is the hexagon. Connectivity to neighboring hexagons is defined exclusively by edge contact, meaning that the spatial relationship between one hexagon and its neighbor always has a consistent spatial meaning (Yajima et al. 1981)...
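The neighbor-distance contrast is easy to verify numerically. A small sketch comparing center-to-center distances in the two meshes, assuming unit spacing between adjacent cell centers:

```python
from math import cos, sin, pi, hypot

# Centers of the six edge-neighbors of a hexagonal cell (unit spacing):
# they lie at 60-degree intervals, all at the same distance.
hex_nbrs = [(cos(pi / 3 * k), sin(pi / 3 * k)) for k in range(6)]
hex_dists = {round(hypot(x, y), 9) for x, y in hex_nbrs}

# Centers of the eight edge- and corner-neighbors of a square cell:
# edge contacts at distance 1, corner contacts at distance sqrt(2).
sq_nbrs = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
           if (dx, dy) != (0, 0)]
sq_dists = {round(hypot(x, y), 9) for x, y in sq_nbrs}

print(len(hex_dists), len(sq_dists))  # -> 1 2
```

The hexagonal mesh yields a single neighbor distance, while the square mesh yields two, which is the inconsistency the passage above describes.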