An interdisciplinary framework for learning methodologies spanning statistics, neural networks, and fuzzy logic, this book provides a unified treatment of the principles and methods for learning dependencies from data. It establishes a general conceptual framework in which learning methods from statistics, neural networks, and fuzzy logic can be applied, showing that a few fundamental principles underlie most of the new methods being proposed today in statistics, engineering, and computer science. Over one hundred illustrations, case studies, and examples make this an invaluable text.

Table Of Contents

Preface.

Notation.

1. Introduction.

1.1 Learning and Statistical Estimation.

1.2 Statistical Dependency and Causality.

1.3 Characterization of Variables.

1.4 Characterization of Uncertainty.

References.

2. Problem Statement, Classical Approaches, and Adaptive Learning.

2.1 Formulation of the Learning Problem.

2.1.1 Objective of Learning.

2.1.2 Common Learning Tasks.

2.1.3 Scope of the Learning Problem Formulation.

2.2 Classical Approaches.

2.2.1 Density Estimation.

2.2.2 Classification.

2.2.3 Regression.

2.2.4 Stochastic Approximation.

2.2.5 Solving Problems with Finite Data.

2.2.6 Nonparametric Methods.

2.3 Adaptive Learning: Concepts and Inductive Principles.

2.3.1 Philosophy, Major Concepts, and Issues.

2.3.2 A priori Knowledge and Model Complexity.

2.3.3 Inductive Principles.

2.3.4 Alternative Learning Formulations.

2.4 Summary.

References.

3. Regularization Framework.

3.1 Curse and Complexity of Dimensionality.

3.2 Function Approximation and Characterization of Complexity.

3.3 Penalization.

3.3.1 Parametric Penalties.

3.3.2 Nonparametric Penalties.

3.4 Model Selection (Complexity Control).

3.4.1 Analytical Model Selection Criteria.

3.4.2 Model Selection via Resampling.

3.4.3 Bias-Variance Trade-off.

3.4.4 Example of Model Selection.

3.4.5 Function Approximation vs Predictive Learning.

3.5 Summary.

References.

4. Statistical Learning Theory.

4.1 Conditions for Consistency and Convergence of ERM.

4.2 Growth Function and VC-Dimension.

4.2.1 VC-Dimension for Classification and Regression Problems.

4.2.2 Examples of Calculating VC-Dimension.

4.3 Bounds on the Generalization.

4.3.1 Classification.

4.3.2 Regression.

4.3.3 Generalization Bounds and Sampling Theorem.

4.4 Structural Risk Minimization.

4.5 Comparisons of Model Selection for Regression.

4.5.1 Model Selection for Linear Estimators.

4.5.2 Model Selection for k-Nearest Neighbors Regression.

4.5.3 Model Selection for Linear Subset Regression.

4.5.4 Discussion.

4.6 Measuring the VC-Dimension.

4.7 Summary and Discussion.

References.

5. Nonlinear Optimization Strategies.

5.1 Stochastic Approximation Methods.

5.1.1 Linear Parameter Estimation.

5.1.2 Backpropagation Training of MLP Networks.

5.2 Iterative Methods.

5.2.1 Expectation-Maximization Methods for Density Estimation.

5.2.2 Generalized Inverse Training of MLP Networks.

5.3 Greedy Optimization.

5.3.1 Neural Network Construction Algorithms.

5.3.2 Classification and Regression Trees (CART).

5.4 Feature Selection, Optimization, and Statistical Learning Theory.

5.5 Summary.

References.

6. Methods for Data Reduction and Dimensionality Reduction.

6.1 Vector Quantization.

6.1.1 Optimal Source Coding in Vector Quantization.