Representing Uncertain Clustering. The case of Social Specialization on the French Riviera
Giovanni FUSCO, Cristina Minh Thu CAO - UMR 7300 ESPACE, CNRS / Université Nice Sophia Antipolis
Analyses and data visualizations carried out within the PEPS Géo-Incertitude project (CNRS grant 2014-15).
Social specialization of residential space within an urban area is the concentration of households in different urban subspaces according to characteristics such as social status, demography or ethnicity. The metropolization process is often associated with increased social specialization in large urban areas (Lacour and Gaschet 2008). Taken to the extreme, social specialization can produce residential segregation, creating dividing lines within the metropolitan space and undermining social and territorial cohesion (Massey 1985). Nevertheless, knowledge of the social specialization of space in metropolitan areas is still incomplete, and general assumptions must be tested against empirical data in case studies. More specifically, which groups of households are most opposed in residential space? Which sociodemographic factors contribute most to these oppositions? The answers to these questions are affected by high levels of uncertainty, which calls for uncertainty-based methods to describe this knowledge. Finally, how can this uncertain geographic knowledge be represented and communicated most effectively?
To tackle these questions, we propose a new clustering approach that integrates both the sociodemographic descriptors of households and the geographic distribution of their places of residence. The case study is the metropolitan area of the French Riviera, a coastal conurbation of more than 1 million inhabitants which is particularly affected by social specialization of residential space (Fusco and Scarella 2011).
Analyses of social specialization normally start by identifying target populations, whose segregation indicators are later calculated (Apparicio 2000). We opted instead for a bottom-up, uncertainty-based approach: clusters of households were identified through data mining of selected sociodemographic variables within the 2008 Household Mobility Survey. A sample of 7539 households was analyzed through 16 variables describing social status, household structure, household demography and place in the workforce. Once the residence sector was added as a further variable, different strategies of Bayesian clustering (Korb and Nicholson 2004) were explored. This approach differs from previous research (Pallez et al. 2015): clustering is uncertain because household assignment to clusters is probabilistic, and a household can have non-zero probabilities for several clusters. Passing from the sample to the household population introduces additional uncertainties. Cluster labels are vague, too, and the sociodemographic characteristics of clusters are described through Bayesian probabilities. Social specialization of metropolitan sectors with respect to these clusters is evaluated with the classical dissimilarity index (Duncan and Duncan 1955), but different evaluations are proposed for different levels of uncertainty. More precisely, four analyses are developed with respect to uncertain knowledge:
Probabilistic description of household clusters.
Sociodemographic content of metropolitan sectors.
Socio-spatial contrasts between neighboring metropolitan sectors.
Proximity of metropolitan sectors in variable space.
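As an illustration of how a dissimilarity index can be evaluated when cluster assignment is probabilistic, the following minimal sketch sums membership probabilities per sector to obtain expected cluster counts, then applies the Duncan and Duncan formula D = 0.5 Σ |a_i/A − b_i/B|. Using expected counts is an assumption made here for illustration, not necessarily the procedure followed in this study, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def expected_counts(memberships, sectors, n_clusters):
    """Sum membership probabilities per sector (assumed way of handling soft assignment)."""
    counts = defaultdict(lambda: [0.0] * n_clusters)
    for probs, sector in zip(memberships, sectors):
        for k, p in enumerate(probs):
            counts[sector][k] += p
    return dict(counts)


def dissimilarity_index(counts, cluster):
    """Duncan and Duncan (1955) index of one cluster against all other households."""
    in_total = sum(c[cluster] for c in counts.values())
    out_total = sum(sum(c) - c[cluster] for c in counts.values())
    if in_total == 0 or out_total == 0:
        raise ValueError("index undefined when one of the two groups is empty")
    return 0.5 * sum(
        abs(c[cluster] / in_total - (sum(c) - c[cluster]) / out_total)
        for c in counts.values()
    )


# Toy data (hypothetical): four households, two clusters, two sectors.
memberships = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
sectors = ["A", "A", "B", "B"]
counts = expected_counts(memberships, sectors, n_clusters=2)
print(round(dissimilarity_index(counts, cluster=0), 3))  # 0.5
```

With hard assignments (probabilities of 0 or 1) the same functions reduce to the classical index, so the soft version can be compared directly with a crisp clustering of the same households.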
Several authors have already proposed approaches for the graphical representation of uncertain information (MacEachren 1992, MacEachren and Howard 1993, Ehlschlaeger et al. 1997, Cedilnik and Rheingans 2000, Ward 2002). Within our research, we propose an interactive online geo-data visualization solution for exploring the results of the uncertainty-based analyses. The Tableau® dataviz platform was used to create four specific dashboards, one for each analysis. Systems of dashboards for interactive visualization seem particularly useful for representing uncertain and complex phenomena like social specialization of space.
Figure 1 - Sociodemographic content of metropolitan sectors in the French Riviera.
Figure 2 - Socio-spatial contrasts between neighboring metropolitan sectors in the French Riviera.
- Apparicio P. (2000) Residential segregation indices: a tool integrated into a geographical information system. Cybergeo: European Journal of Geography, n. 134, http://cybergeo.revues.org/12063
- Cedilnik A., Rheingans P. (2000) Procedural annotation of uncertain information. In Proceedings of Visualization ’00, IEEE Computer Society Press, 77-84
- Duncan O., Duncan B. (1955) A methodological analysis of segregation indexes. American Sociological Review, 20(2), 210-217
- Ehlschlaeger C., Shortridge A., Goodchild M. (1997) Visualizing Spatial Data Uncertainty Using Animation. Computers & Geosciences, 23(4), 387-395
- Fusco G., Scarella F. (2011) Métropolisation et ségrégation sociospatiale. Les flux des migrations résidentielles en PACA. L’Espace Géographique, 40(4), 319-336
- Korb K., Nicholson A. (2004) Bayesian Artificial Intelligence. Chapman & Hall / CRC
- Lacour C., Gaschet F. (2008) Métropolisation et ségrégation. Bordeaux: PUB
- MacEachren A. (1992) Visualizing uncertain information. Cartographic Perspectives, 13, 10-19
- MacEachren A. M., Howard D., et al. (1993) Visualizing the health of Chesapeake Bay: An uncertain endeavor. In GIS/LIS Proceedings, vol. 1, Am Soc Photogram and Rem Sens/Am Congr Surv Map, Bethesda MD, 449-458
- Massey D. (1985) Ethnic residential segregation: A theoretical synthesis and empirical review. Sociology and Social Research, 69(3), 315-350
- Pallez D., Serrurier M., Da Costa Pereira C., Fusco G., Cao C. (2015) Social Specialization of Space: Clustering Households on the French Riviera. GECCO Companion 15 - Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation, ACM, New York, 1447-1448
- Ward M. O. (2002) A taxonomy of glyph placement strategies for multidimensional data visualization. Information Visualization, 194-210