THE ROLE OF SYMBOL CLOSURE IN VISUAL ENCODING: FROM PERCEPTION TO VISUAL ANALYSIS OF SYNTHETIC AND REAL-WORLD DATA
Symbols and shapes are commonly employed to represent data in visualizations such as scatterplots. Practitioners, scientists, and automated visualization tools rely on empirical analyses of visual encoding strategies, taking into account the influence of data characteristics and visual features, to produce effective charts and graphs. In pursuit of this goal, the following questions were considered: (1) Are shapes that share bounded or unbounded structures members of a feature category that influences how they are processed and perceived? (2) Do open/closed feature categories differ in how they are processed? (3) How do shape encodings interact with characteristics of the data and types of tasks in visualization contexts? In this work, I investigated the implications of perceptual categories of commonly used charting symbols sharing bounded (closed) or unbounded (open) structures, using a series of experiments ranging from low-level attentional allocation to high-level task performance in ensemble displays. Flanker and same/different tasks were used to explore the perceived similarity among open and closed symbols; participants responded to closed symbols more quickly and accurately, and discriminations within a feature category took longer than between categories, supporting the categorical distinctiveness of symbols with and without boundaries. Three relative judgment tasks (mean position, numerosity, and linear correlation) were implemented using exemplars of these shape categories as encodings in multiclass scatterplots in order to test whether performance differences due to categorical features would subsume differences among symbols. Each task was reliably harder when marks were encoded with shapes sharing open or closed features, and conditions with closed targets received more influence from distractor features, i.e., both facilitation with different-featured distractors and inhibition with same-featured distractors.
A follow-up study with a larger symbol palette and systematic variation of the level of overlap among marks in numerosity and linear correlation tasks found similar results: open target sets took significantly longer and induced significantly more errors than closed targets, regardless of overlap or distractor features. The final study incorporated more realistic displays, with chart axes and labels and with data sampled from the Toxics Release Inventory, a dataset on industrial usage of toxic chemicals. Participant performance on relative judgment tasks differed across pairs of symbols used as mark encodings, but pairs sharing open or closed bounding features always took longer than pairs differing in that feature, and displays with closed targets were always faster and less error-prone than displays with more numerous open targets, comporting with the findings from the previous studies. Overall, the categorical relationship between open and closed symbols and the perceptual preference for closed symbols were clear in all the experiments, and persisted across relative judgment tasks, when overlap among marks was systematically varied, and with palettes of symbols containing different exemplars from both feature categories. This sequence of results has implications for visualization designs in which shapes are used as categorical encodings, and also poses new questions for the vision science and visualization communities. Further studies can model the role of shape encodings with a wider variety of data types and distributions, in tandem with more extensive tasks, and can support more comprehensive encoding strategies involving redundant visual channels. Future work will also be required to understand the mechanisms underpinning shape perception and to explain the apparent salience of bounded symbols.