Méthodes Quantitatives Avancées en Environnement Marin.
Multivariate Statistical Methods in Marine Environment.
This course is an introduction to multivariate statistical methods
for quantitative data. We review some basics of matrix algebra before
introducing the notion of inertia of a dataset and its relation to the
usual statistical descriptors (mean, variance, etc.).
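As a small illustration of how inertia relates to the usual descriptors, the sketch below computes the inertia of a point cloud about its centroid in three equivalent ways. The data are simulated and the uniform weights 1/n are an assumption made for this example; the course material may use other weights or metrics.

```r
# Simulated data: 50 observations of 3 quantitative variables (illustrative only)
set.seed(1)
X <- matrix(rnorm(150), nrow = 50, ncol = 3)
n <- nrow(X)

g  <- colMeans(X)          # centroid: the vector of the variable means
Xc <- sweep(X, 2, g)       # centred data

# Inertia about the centroid with uniform weights 1/n:
# mean squared Euclidean distance of the observations to g
inertia <- sum(Xc^2) / n

# Same quantity as the trace of the variance-covariance matrix (divisor n)
V <- crossprod(Xc) / n
inertia_trace <- sum(diag(V))

# R's var() uses the divisor n - 1, hence the rescaling
inertia_var <- sum(apply(X, 2, var)) * (n - 1) / n
```

The three values agree: the inertia is the sum of the variances of the variables.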
Rprojection.r
This R file illustrates the notions of projection and dimension reduction on simple examples.
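The content of Rprojection.r is not reproduced here. As a minimal sketch of the same idea, on simulated data and with an arbitrarily chosen axis, a two-dimensional cloud can be projected orthogonally onto a line and its inertia split into a kept part and a lost part:

```r
# Simulated, correlated 2-D cloud (illustrative only)
set.seed(2)
x1 <- rnorm(100)
X  <- cbind(x1 = x1, x2 = 0.8 * x1 + rnorm(100, sd = 0.3))
Xc <- scale(X, center = TRUE, scale = FALSE)   # centre the cloud

u <- c(1, 1) / sqrt(2)           # unit vector of the projection axis (arbitrary choice)
scores <- drop(Xc %*% u)         # coordinates on the axis: a 1-D summary of the data
Xproj  <- outer(scores, u)       # projected points, expressed back in the plane
resid  <- Xc - Xproj             # what the projection discards

# Pythagoras: total sum of squares = projected part + residual part
total <- sum(Xc^2)
kept  <- sum(Xproj^2)
lost  <- sum(resid^2)
kept / total                     # share of the inertia retained by the axis
```

PCA chooses the axis `u` that maximises `kept`; here `u` is fixed by hand only to show the decomposition.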
Introduction (PDF file)
Introduction to and motivation for the course through numerous examples from
oceanography.
Introduction to functional data analysis
This is a Beamer PDF on data that arrive as curves in oceanography. Curve-fitting methods and functional PCA on a basis are presented.
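The sketch below is one possible reading of "functional PCA on a basis", not the method of the PDF: each sampled curve is fitted by least squares on a small orthonormal polynomial basis, and an ordinary PCA is run on the coefficients. The simulated curves, the 0 to 9 grid and the choice of a degree-3 polynomial basis are all assumptions made for the example.

```r
set.seed(3)
z <- seq(0, 9, by = 1)                 # sampling grid (e.g. depth in metres)
n_obs <- 40

# Simulated smooth curves plus noise (illustrative only)
a <- rnorm(n_obs)
b <- rnorm(n_obs, sd = 0.5)
curves <- outer(a, sin(pi * z / 9)) + outer(b, z / 9) +
  matrix(rnorm(n_obs * length(z), sd = 0.1), nrow = n_obs)

# Orthonormal basis on the grid: constant + orthogonal polynomials of degree 1 to 3
B <- cbind(1 / sqrt(length(z)), poly(z, degree = 3))

# Least-squares fit of each curve; because t(B) %*% B = I,
# the coefficients are simply curves %*% B
coefs  <- curves %*% B
smooth <- coefs %*% t(B)               # fitted (smoothed) curves on the grid

# With an orthonormal basis, PCA of the coefficients is equivalent
# to PCA of the fitted curves
pca_coef   <- prcomp(coefs)
pca_smooth <- prcomp(smooth)

# Principal components expressed as curves on the grid
harmonics <- B %*% pca_coef$rotation
```

With a non-orthonormal basis (B-splines, for instance) the Gram matrix of the basis would have to enter the eigenvalue problem; the orthonormal choice here avoids that step.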
Corrected exercises
Link to interesting corrected exercises from J. F. Durand's manuscript
(link below).
Correction of TD1.
Some simple exercises, supported by R code, to understand the notions
of inertia and variance-covariance.
- Link to the R program
Correction of TD1B.
Some simple exercises, supported by basic R code, to understand the
notion of variance-covariance and its link with correlation.
- Link to the statement + R program
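The link between covariance and correlation can be checked numerically. The following sketch uses simulated data (the variables `x` and `y` are made up for the example and are not taken from TD1B):

```r
set.seed(4)
x <- rnorm(30, mean = 15, sd = 4)
y <- 0.5 * x + rnorm(30, sd = 1)
X <- cbind(x, y)

S <- cov(X)                   # variance-covariance matrix (divisor n - 1)
s <- sqrt(diag(S))            # standard deviations of the variables

# Correlation = covariance rescaled by the standard deviations:
# r_ij = s_ij / (s_i * s_j)
R_from_S <- S / outer(s, s)

# Equivalently, the covariance matrix of the standardised variables
R_scaled <- cov(scale(X))

# Base R also provides cov2cor() for the same rescaling
R_cov2cor <- cov2cor(S)
```

All three matrices coincide with `cor(X)`.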
Correction of TD2.
Some basic examples of PCA construction.
- Link to the statement + corrections.
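As a hedged illustration of what "PCA construction" involves (the exercises of TD2 themselves are not reproduced), the sketch below builds a PCA by hand on simulated data: centre the data, diagonalise the variance-covariance matrix, and project onto the eigenvectors.

```r
set.seed(5)
# Simulated data with correlated columns (illustrative only)
mix <- matrix(c(2, 0, 0,   0,
                1, 1, 0,   0,
                0, 0, 1,   0,
                0, 0, 0.5, 0.5), nrow = 4)
X <- matrix(rnorm(200), nrow = 50, ncol = 4) %*% mix
n <- nrow(X)

Xc <- scale(X, center = TRUE, scale = FALSE)   # 1. centre the variables
V  <- crossprod(Xc) / (n - 1)                  # 2. variance-covariance matrix
e  <- eigen(V, symmetric = TRUE)               # 3. eigenvalue problem
lambda <- e$values                             #    variances of the components
U      <- e$vectors                            #    principal axes (unit vectors)
scores <- Xc %*% U                             # 4. principal components

explained <- lambda / sum(lambda)              # share of inertia per axis
```

The eigenvalues match `prcomp(X)$sdev^2`; the eigenvectors match `prcomp(X)$rotation` up to the sign of each column, which is arbitrary.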
Correction of TD4.
More basic examples of PCA construction.
- Link to the statement + corrections.
Corrected R programs:
1) Functional data analysis of a set of profiles of temperature (T, °C),
salinity (S, SI) and dissolved oxygen (DO, %).
An observation in the following dataset consists of 30 values: 10
measurements of T, 10 of S and 10 of DO, sampled every 1 m
from 0 m to 9 m in the Berre lagoon. An observation can thus be seen as a
multivariate sampled profile of 3 variables.
We have a collection of such observations from 1994 to 2010, with about 24
observations per year. The objective of this work is to study the
variability of this dataset using PCA. However, some constraints must
be added to the PCA because the data are functional and
multivariate. We propose a version of PCA that takes into account the
functional structure of the dataset as well as the covariance structure
between the profiles of T, S and DO. Before solving the eigenvalue problem
associated with PCA, the data are weighted by dividing each block of
variables (T, S and DO respectively) by the square root of the trace of the
variance-covariance matrix of that block. This makes it possible to compare
profiles composed of variables that are not expressed in the same units.
- Link to the R program
- Link to the datasets:
bers.txt: contains salinity profiles
bert.txt: contains temperature profiles
bero.txt: contains dissolved oxygen profiles
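The block weighting described above can be sketched as follows. This is not the linked R program: the three blocks are simulated stand-ins for the contents of bert.txt, bers.txt and bero.txt (whose file layout is not described here), the helper `weight_block` is a hypothetical name, and the functional (basis) part of the method is left out.

```r
set.seed(6)
n <- 60                                   # number of observations (simulated)
# Simulated blocks: 10 depths each, deliberately on very different scales
temp <- 15 + matrix(rnorm(n * 10, sd = 5),  nrow = n)   # stand-in for T
sal  <- 30 + matrix(rnorm(n * 10, sd = 2),  nrow = n)   # stand-in for S
oxy  <- 80 + matrix(rnorm(n * 10, sd = 20), nrow = n)   # stand-in for DO

# Centre a block, then divide it by the square root of the trace
# of its variance-covariance matrix (hypothetical helper)
weight_block <- function(B) {
  Bc <- scale(B, center = TRUE, scale = FALSE)
  Bc / sqrt(sum(diag(cov(Bc))))
}

blocks <- list(T = temp, S = sal, DO = oxy)
W <- do.call(cbind, lapply(blocks, weight_block))   # n x 30 weighted table

# After weighting, every block carries the same inertia (equal to 1)
block_inertia <- sapply(blocks, function(B) sum(diag(cov(weight_block(B)))))

# Eigenvalue problem of the PCA on the weighted table (already centred)
pca <- prcomp(W, center = FALSE)
```

Because each block now has unit inertia, no single variable dominates the first axes merely because of its units.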
2) PCA of metal contaminants.
We have several stations in the Berre lagoon where heavy metal
contamination and organic carbon have
been measured. We propose a PCA of the data in order to construct a
pollution index and a spatial map of the contamination.
- access to contamination data
- access to main R file
- access to Lat. location and Long. location
- access to bathymetry file
- access to coast line
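One common way to build such an index, shown here only as a sketch and not as the content of the main R file, is to use the first principal component of the standardised measurements as a station score. The data below are simulated and the variable names (Pb, Zn, Cu, Cd, Corg) are hypothetical; mapping the index would additionally require the station coordinates.

```r
set.seed(7)
n_st <- 25                                 # number of stations (simulated)
level <- rexp(n_st)                        # hidden overall contamination level (simulation device)

# Hypothetical variables, each partly driven by the common level
loadings_true <- c(Pb = 1, Zn = 0.8, Cu = 0.6, Cd = 0.9, Corg = 0.5)
metals <- sapply(loadings_true, function(b) b * level + rnorm(n_st, sd = 0.3))

# Normalised PCA: variables are centred and scaled to unit variance
pca <- prcomp(metals, center = TRUE, scale. = TRUE)

# Candidate pollution index: score of each station on the first axis
index <- pca$x[, 1]

# The sign of a principal axis is arbitrary: orient the index so that
# higher values mean more contamination
if (sum(pca$rotation[, 1]) < 0) index <- -index

ranking <- order(index, decreasing = TRUE)  # stations from most to least contaminated
```

Such an index is only meaningful if the first axis carries a large share of the inertia and all variables load on it with the same sign, which should be checked on the real data.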
3) PCA R script.
Complete R source code for PCA and its interpretation tools:
acpxqd.r
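The content of acpxqd.r is not reproduced here. As an indication of what interpretation tools for PCA typically compute, the sketch below derives a few standard quantities from base R's `prcomp` on simulated data; the variable names `v1` to `v4` are made up for the example.

```r
set.seed(8)
X <- matrix(rnorm(120), nrow = 30, ncol = 4)
X[, 2] <- X[, 1] + 0.3 * X[, 2]            # make two variables correlated
colnames(X) <- paste0("v", 1:4)

pca <- prcomp(X, scale. = TRUE)            # normalised PCA
lambda <- pca$sdev^2                       # eigenvalues
pct <- 100 * lambda / sum(lambda)          # percentage of inertia per axis

# Correlations between variables and components (correlation circle):
# loading times the standard deviation of the component
var_cor <- sweep(pca$rotation, 2, pca$sdev, "*")

# Quality of representation of each individual on each axis (cos^2)
cos2 <- pca$x^2 / rowSums(pca$x^2)

# Contribution (%) of each individual to the inertia of each axis
contrib <- 100 * sweep(pca$x^2, 2, colSums(pca$x^2), "/")
```

Rows of `cos2` sum to 1 when all axes are kept, and columns of `contrib` sum to 100.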
Useful links
- To Ph. Besse's web page in Toulouse, where excellent courses on data
analysis can be found, as well as R courses and exercises.
- To J. F. Durand's manuscript, a nice course on PCA
basics and a reminder of matrix algebra.