Scheduled for Measurement Symposium—New Development on Compositional Data Analysis: It Becomes Easy With Excel Freeware and SAS Routines, Tuesday, March 30, 2004, 11:45 AM - 1:15 PM, Convention Center: 208


CoDaPack: An Excel and Visual Basic Based Software for Compositional Data Analysis

Santiago Thió-Henestrosa, Universitat de Girona, Girona, Spain and Yuanlong Liu, Western Michigan University, Kalamazoo, MI

Aitchison (1986) developed a new methodological approach for the statistical analysis of compositional data. This methodology was first implemented in BASIC routines grouped under the name CODA, and later in Matlab as NEWCODA (Aitchison, 1997). Since then, several other authors have published extensions to this methodology (Martín-Fernández et al., 2000; Barceló-Vidal et al., 2001; Pawlowsky-Glahn & Egozcue, 2002; Pawlowsky-Glahn et al., 2003). The methodology is not straightforward to use with standard statistical packages. For this reason the Girona Compositional Data Analysis Group has developed a new freeware package, named CoDaPack, which implements statistical methods suitable for analysing compositional data. It is written in Visual Basic and runs within Excel, and it is aimed at users with minimal computer knowledge, with the goal of being simple and easy to use. The purpose of this project is to illustrate how to conduct a compositional data analysis using CoDaPack on male and female physical activity data from kinesiology. Using menus, one can execute macros that place the numerical results on the same Excel sheet and display graphical outputs in independent windows. The present version has four menus with a total of 21 macros. The first menu, Transformations, performs several transformations of data from real space to the simplex and vice versa, namely (1) Unconstrain/Basis, (2) Raw-ALR (the additive log-ratio transformation, alr, and its inverse, the generalised additive logistic transformation, agl), (3) Raw-CLR (the centred log-ratio transformation, clr, and its inverse) and (4) Raw-ILR (the isometric log-ratio transformation, ilr, and its inverse). The second menu, Operations, performs the following operations inside the simplex: (1) Perturbation, (2) Power transformation, (3) Centering, (4) Standardization, (5) Amalgamation, (6) Subcomposition/Closure and (7) Rounded Zero Replacement. 
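As a rough illustration of what some of the Transformations and Operations macros compute, the following Python sketch implements closure, the alr and clr transformations with their inverses, perturbation and the power transformation. It is not CoDaPack code (CoDaPack is written in Visual Basic); the function names are hypothetical, and all parts are assumed to be strictly positive.

```python
import math

# Illustrative sketch only: function names are hypothetical, not CoDaPack's.
# Every composition is a list of strictly positive parts.

def closure(x, total=1.0):
    """Rescale the parts so that they sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]

def alr(x):
    """Additive log-ratio: log of each part divided by the last part."""
    return [math.log(v / x[-1]) for v in x[:-1]]

def alr_inv(y):
    """Inverse of alr (additive logistic transformation)."""
    return closure([math.exp(v) for v in y] + [1.0])

def clr(x):
    """Centred log-ratio: log of each part divided by the geometric mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def clr_inv(z):
    """Inverse of clr: exponentiate, then close."""
    return closure([math.exp(v) for v in z])

def perturb(x, p):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, p)])

def power(x, alpha):
    """Power transformation: raise each part to alpha, then close."""
    return closure([a ** alpha for a in x])
```

Perturbation and the power transformation play the roles that addition and scalar multiplication play in real space, which is why each result is closed back onto the simplex.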
The third menu, Graphs, produces two-dimensional graphs such as ternary diagrams, plots of alr- or clr-transformed data sets, biplots, principal component plots, additive logistic normal predictive regions and confidence regions, the last three drawn in the ternary diagram. In all of these graphs the user can customize the appearance and, in some cases, mark the observations according to a previous classification. Finally, the fourth menu, Descriptive Statistics, provides characteristic values for a data set, such as (1) Center, (2) Variation matrix, which contains the variances of the logarithms of the quotients of all pairs of parts, and (3) Total variance. This freeware package will be provided to the audience.
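The three descriptive statistics named above can be sketched in Python as follows. This is an illustration under stated assumptions, not CoDaPack's implementation: the function names are hypothetical, the sample variance (divisor n - 1) is an assumed convention, the total variance follows Aitchison's definition as the sum of all entries of the variation matrix divided by 2D, and all parts are assumed strictly positive.

```python
import math
import statistics

# Illustrative sketch only: function names are hypothetical, not CoDaPack's.
# `data` is a list of compositions, each a list of D strictly positive parts.

def centre(data):
    """Centre of the data set: closed vector of column-wise geometric means."""
    n = len(data)
    d = len(data[0])
    geo_means = [
        math.exp(sum(math.log(row[j]) for row in data) / n) for j in range(d)
    ]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def variation_matrix(data):
    """D x D matrix whose (i, j) entry is var(ln(x_i / x_j))."""
    d = len(data[0])
    t = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            log_ratios = [math.log(row[i] / row[j]) for row in data]
            # Assumption: sample variance with divisor n - 1.
            v = statistics.variance(log_ratios)
            t[i][j] = t[j][i] = v
    return t

def total_variance(data):
    """Sum of all variation-matrix entries divided by 2D."""
    t = variation_matrix(data)
    d = len(t)
    return sum(sum(row) for row in t) / (2 * d)
```

The variation matrix is symmetric with a zero diagonal, since ln(x_i / x_i) is always 0, and the total variance defined this way equals the sum of the variances of the clr-transformed parts.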
Keyword(s): assessment, measurement/evaluation, research

2004 AAHPERD National Convention and Exposition