Fourth Summer School on Advanced Statistics and Data Mining (Madrid, July 6th-17th, 2009)

Rubén Armañanzas ruben at si.ehu.es
Tue May 26 14:30:04 CEST 2009


Dear colleagues,

The Technical University of Madrid (UPM) is organizing a summer school on
"Advanced Statistics and Data Mining" in Madrid from July 6th to
July 17th. The summer school comprises 18 courses spread over two weeks.
Attendees may register for each course independently. Registrations will
be handled strictly in order of arrival.

For more information, please visit
http://www.dia.fi.upm.es/index.php?page=presentation&hl=es_ES

Best regards.

On behalf of the organizers, R. Armañanzas.

*List of courses and brief description*

Week 1 (July 6th - July 10th, 2009)

Course 1: Bayesian networks (15 h). Practical sessions: Hugin, Elvira,
Weka, LibB
    Bayesian networks basics. Inference in Bayesian networks.
    Learning Bayesian networks from data.

Course 2: Multivariate data analysis (15 h). Practical sessions: MATLAB
    Introduction. Data Examination. Principal component analysis
    (PCA). Factor Analysis. Multidimensional Scaling (MDS).
    Correspondence analysis. Multivariate Analysis of Variance
    (MANOVA). Canonical correlation.

Course 3: Dimensionality reduction (15 h). Practical sessions: MATLAB
    Introduction. Matrix factorization methods. Clustering methods.
    Projection methods. Applications.

Course 4: Supervised pattern recognition (Classification) (15 h).
Practical sessions: Weka
    Introduction. Assessing the Performance of Supervised
    Classification Algorithms. Classification techniques. Combining
    Classifiers. Comparing Supervised Classification Algorithms.

Course 5: Introduction to MATLAB (15 h)
    Overview of the MATLAB suite. Data structures and files.
    Programming in MATLAB. Visualization tools. Some applications in
    pattern recognition.

Course 6: Data mining: A practical perspective (15 h). Practical sessions:
MATLAB, R, Weka
    Introduction to Data Mining and Knowledge Discovery. Prediction
    in data mining. Classification. Association studies. Data mining
    in free-form texts: text mining.

Course 7: Time series analysis (15 h). Practical sessions: MATLAB
    Introduction. Probability models for time series. Regression and
    Fourier analysis. Forecasting and Data mining.

Course 8: Neural networks (15 h). Practical sessions: MATLAB
    Introduction to the biological models. Nomenclature. Perceptron
    networks. The Hebb rule. Foundations of multivariate
    optimization. Numerical optimization.
    Widrow-Hoff rule. Backpropagation algorithm.
    Practical data modelling with neural networks.

Course 9: Introduction to SPSS (15 h)
    Introduction. Describing data. Statistical inference. Time
    series. Sampling. Classification and regression.

Week 2 (July 13th - July 17th, 2009)

Course 10: Regression (15 h). Practical sessions: SPSS
    Introduction. Simple Linear Regression Model. Measures of model
    adequacy. Multiple Linear Regression. Regression Diagnostics and
    model violations. Polynomial regression. Variable selection.
    Indicator variables as regressors. Logistic regression.
    Nonlinear Regression.

Course 11: Practical Statistical Questions (15 h). Practical sessions:
case studies (without computer)
    I would like to know the intuitive definition and use of …: The
    basics. How do I collect the data? Experimental design.
    Now I have data, how do I extract information? Parameter
    estimation. Can I see any interesting association between two
    variables, two populations, …?
    How can I know if what I see is “true”? Hypothesis testing.
    How many samples do I need for my test? Sample size.
    Can I deduce a model for my data? Other questions?

Course 12: Missing data and outliers (15 h). Practical sessions: R
    Missing Data: Typology of missing data. Simple missing-data
    methods. Imputation Methods. Diagnostics and Overimputing.
    Outliers and robust statistics: Typology of outliers. Influence
    measures. Robust methods.

Course 13: Hidden Markov Models (15 h). Practical sessions: HTK
    Introduction. Discrete Hidden Markov Models. Basic algorithms
    for Hidden Markov Models. Semicontinuous Hidden Markov Models.
    Continuous Hidden Markov Models. Unit selection and clustering.
    Speaker and Environment Adaptation for HMMs.
    Other applications of HMMs.

Course 14: Statistical inference (15 h). Practical sessions: SPSS
    Introduction. Some basic statistical tests. Multiple testing.
    Introduction to bootstrapping.

Course 15: Feature Subset Selection (15 h). Practical sessions: MATLAB,
R, Weka
    Filter approaches. Wrapper methods. Embedded methods.

Course 16: Introduction to R (15 h)
    An introductory R session. Data in R. Importing/Exporting data.
    Programming in R. R Graphics. Statistical Functions in R.

Course 17: Unsupervised pattern recognition (clustering) (15 h).
Practical sessions: MATLAB
    Introduction. Prototype-based clustering. Density-based
    clustering. Graph-based clustering. Cluster evaluation.
    Miscellanea.

Course 18: Evolutionary computation (15 h). Practical sessions: MATLAB
    Genetic algorithms. Genetic programming. Robust and
    self-adapting intelligent systems. Introduction to Estimation of
    Distribution Algorithms. Improvements, extensions and
    applications of EDAs. Current research in EDAs.



