Seminários em Estatística e Ciência de Dados (Seminars in Statistics and Data Science) - Spatial CART Classification Trees
Abstract: CART (Classification And Regression Trees) is a statistical method for building tree predictors for both regression and classification. We restrict our attention to the classification case with two populations. Each observation is characterized by input variables gathered in X and a binary response variable Y. The principle of CART is to recursively partition the input space using binary splits and then to determine an optimal partition for prediction. The model relating Y to X is represented by a tree that records how the model was constructed. If the explanatory variables are spatial coordinates, we get a spatial decision tree, which induces a tessellation of the input space. We propose a spatial variant of the CART method, SpatCART. While the usual CART tree considers the marginal distribution of the response variable at each node, we propose to take into account the spatial location of the observations. We introduce a dissimilarity index based on Ripley's intertype K-function, which quantifies the interaction between two populations. This index, used in the growing step of the CART strategy, leads to a heterogeneity function consistent with the original CART algorithm. Finally, the proposed procedure, SpatCART, is applied to a tropical forest example.
This is joint work with A. Bar-Hen and S. Gey, recently published in Computational Statistics: https://link.springer.com/
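
For readers unfamiliar with Ripley's intertype K-function mentioned in the abstract, the following is a minimal sketch of its standard naive estimator, with no edge correction. It is not the SpatCART dissimilarity index itself, whose exact definition is given in the paper; the function name `cross_k`, the toy coordinates, and the unit-square observation window are assumptions made purely for illustration.

```python
import math


def cross_k(points_a, points_b, r, area):
    """Naive estimate of the intertype K-function K_ab(r), without edge correction.

    points_a, points_b: lists of (x, y) coordinates for the two populations.
    r: distance at which the function is evaluated.
    area: area |A| of the observation window.
    """
    n_a, n_b = len(points_a), len(points_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both populations must be non-empty")
    # Count ordered cross-type pairs whose Euclidean distance is at most r.
    close_pairs = sum(
        1
        for (xa, ya) in points_a
        for (xb, yb) in points_b
        if math.hypot(xa - xb, ya - yb) <= r
    )
    # K_ab(r) = |A| / (n_a * n_b) * (number of close cross-type pairs)
    return area * close_pairs / (n_a * n_b)


# Hypothetical toy data in the unit square (not from the tropical forest example).
type_a = [(0.1, 0.1), (0.2, 0.2), (0.8, 0.8)]
type_b = [(0.15, 0.1), (0.9, 0.9), (0.5, 0.5)]

r = 0.2
k_hat = cross_k(type_a, type_b, r=r, area=1.0)
k_independent = math.pi * r ** 2  # reference value when the two types do not interact
```

An estimate well above pi * r^2 suggests that the two populations tend to occur close together at scale r, while an estimate well below it suggests that they tend to avoid each other. On real data an edge-corrected estimator would normally be preferred.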