Author Basu, S
Title Advances in Learning Theory : Methods, Models and Applications
Published Amsterdam : IOS Press, 2003
©2003
ISBN 9781601294012 (electronic bk.)
9781586033415
Description 1 online resource (438 pages)
Content Type text txt rdacontent
Media Type computer c rdamedia
Carrier Type online resource cr rdacarrier
Notes Cover -- Title page -- Preface -- Organizing committee -- List of chapter contributors -- Contents -- 1 An Overview of Statistical Learning Theory -- 1.1 Setting of the Learning Problem -- 1.1.1 Function estimation model -- 1.1.2 Problem of risk minimization -- 1.1.3 Three main learning problems -- 1.1.4 Empirical risk minimization induction principle -- 1.1.5 Empirical risk minimization principle and the classical methods -- 1.1.6 Four parts of learning theory -- 1.2 The Theory of Consistency of Learning Processes -- 1.2.1 The key theorem of the learning theory -- 1.2.2 The necessary and sufficient conditions for uniform convergence -- 1.2.3 Three milestones in learning theory -- 1.3 Bounds on the Rate of Convergence of the Learning Processes -- 1.3.1 The structure of the growth function -- 1.3.2 Equivalent definition of the VC dimension -- 1.3.3 Two important examples -- 1.3.4 Distribution independent bounds for the rate of convergence of learning processes -- 1.3.5 Problem of constructing rigorous (distribution dependent) bounds -- 1.4 Theory for Controlling the Generalization of Learning Machines -- 1.4.1 Structural risk minimization induction principle -- 1.5 Theory of Constructing Learning Algorithms -- 1.5.1 Methods of separating hyperplanes and their generalization -- 1.5.2 Sigmoid approximation of indicator functions and neural nets -- 1.5.3 The optimal separating hyperplanes -- 1.5.4 The support vector network -- 1.5.5 Why can neural networks and support vector networks generalize? -- 1.6 Conclusion -- 2 Best Choices for Regularization Parameters in Learning Theory: On the Bias-Variance Problem -- 2.1 Introduction -- 2.2 RKHS and Regularization Parameters -- 2.3 Estimating the Confidence -- 2.4 Estimating the Sample Error -- 2.5 Choosing the optimal γ -- 2.6 Final Remarks -- 3 Cucker Smale Learning Theory in Besov Spaces
3.1 Introduction -- 3.2 Cucker Smale Functional and the Peetre K-Functional -- 3.3 Estimates for the CS-Functional in Anisotropic Besov Spaces -- 4 High-dimensional Approximation by Neural Networks -- 4.1 Introduction -- 4.2 Variable-basis Approximation and Optimization -- 4.3 Maurey-Jones-Barron's Theorem -- 4.4 Variation with respect to a Set of Functions -- 4.5 Rates of Approximate Optimization over Variable Basis Functions -- 4.6 Comparison with Linear Approximation -- 4.7 Upper Bounds on Variation -- 4.8 Lower Bounds on Variation -- 4.9 Rates of Approximation of Real-valued Boolean Functions -- 5 Functional Learning through Kernels -- 5.1 Some Questions Regarding Machine Learning -- 5.2 r.k.h.s Perspective -- 5.2.1 Positive kernels -- 5.2.2 r.k.h.s and learning in the literature -- 5.3 Three Principles on the Nature of the Hypothesis Set -- 5.3.1 The learning problem -- 5.3.2 The evaluation functional -- 5.3.3 Continuity of the evaluation functional -- 5.3.4 Important consequence -- 5.3.5 R^χ, the set of the pointwise defined functions on χ -- 5.4 Reproducing Kernel Hilbert Space (r.k.h.s) -- 5.5 Kernel and Kernel Operator -- 5.5.1 How to build r.k.h.s? -- 5.5.2 Carleman operator and the regularization operator -- 5.5.3 Generalization -- 5.6 Reproducing Kernel Spaces (r.k.k.s) -- 5.6.1 Evaluation spaces -- 5.6.2 Reproducing kernels -- 5.7 Representer Theorem -- 5.8 Examples -- 5.8.1 Examples in Hilbert space -- 5.8.2 Other examples -- 5.9 Conclusion -- 6 Leave-one-out Error and Stability of Learning Algorithms with Applications -- 6.1 Introduction -- 6.2 General Observations about the Leave-one-out Error -- 6.3 Theoretical Attempts to Justify the Use of the Leave-one-out Error -- 6.3.1 Early work in non-parametric statistics -- 6.3.2 Relation to VC-theory -- 6.3.3 Stability -- 6.3.4 Stability of averaging techniques -- 6.4 Kernel Machines
6.4.1 Background on kernel machines -- 6.4.2 Leave-one-out error for the square loss -- 6.4.3 Bounds on the leave-one-out error and stability -- 6.5 The Use of the Leave-one-out Error in Other Learning Problems -- 6.5.1 Transduction -- 6.5.2 Feature selection and rescaling -- 6.6 Discussion -- 6.6.1 Sensitivity analysis, stability, and learning -- 6.6.2 Open problems -- 7 Regularized Least-Squares Classification -- 7.1 Introduction -- 7.2 The RLSC Algorithm -- 7.3 Previous Work -- 7.4 RLSC vs. SVM -- 7.5 Empirical Performance of RLSC -- 7.6 Approximations to the RLSC Algorithm -- 7.6.1 Low-rank approximations for RLSC -- 7.6.2 Nonlinear RLSC application: image classification -- 7.7 Leave-one-out Bounds for RLSC -- 8 Support Vector Machines: Least Squares Approaches and Extensions -- 8.1 Introduction -- 8.2 Least Squares SVMs for Classification and Function Estimation -- 8.2.1 LS-SVM classifiers and link with kernel FDA -- 8.2.2 Function estimation case and equivalence to a regularization network solution -- 8.2.3 Issues of sparseness and robustness -- 8.2.4 Bayesian inference of LS-SVMs and Gaussian processes -- 8.3 Primal-dual Formulations to Kernel PCA and CCA -- 8.3.1 Kernel PCA as a one-class modelling problem and a primal-dual derivation -- 8.3.2 A support vector machine formulation to Kernel CCA -- 8.4 Large Scale Methods and On-line Learning -- 8.4.1 Nyström method -- 8.4.2 Basis construction in the feature space using fixed size LS-SVM -- 8.5 Recurrent Networks and Control -- 8.6 Conclusions -- 9 Extension of the ν-SVM Range for Classification -- 9.1 Introduction -- 9.2 ν Support Vector Classifiers -- 9.3 Limitation in the Range of ν -- 9.4 Negative Margin Minimization -- 9.5 Extended ν-SVM -- 9.5.1 Kernelization in the dual -- 9.5.2 Kernelization in the primal -- 9.6 Experiments -- 9.7 Conclusions and Further Work
10 Kernel Methods for Text Processing -- 10.1 Introduction -- 10.2 Overview of Kernel Methods -- 10.3 From Bag of Words to Semantic Space -- 10.4 Vector Space Representations -- 10.4.1 Basic vector space model -- 10.4.2 Generalised vector space model -- 10.4.3 Semantic smoothing for vector space models -- 10.4.4 Latent semantic kernels -- 10.4.5 Semantic diffusion kernels -- 10.5 Learning Semantics from Cross Language Correlations -- 10.6 Hypertext -- 10.7 String Matching Kernels -- 10.7.1 Efficient computation of SSK -- 10.7.2 n-grams - a language independent approach -- 10.8 Conclusions -- 11 An Optimization Perspective on Kernel Partial Least Squares Regression -- 11.1 Introduction -- 11.2 PLS Derivation -- 11.2.1 PCA regression review -- 11.2.2 PLS analysis -- 11.2.3 Linear PLS -- 11.2.4 Final regression components -- 11.3 Nonlinear PLS via Kernels -- 11.3.1 Feature space K-PLS -- 11.3.2 Direct kernel partial least squares -- 11.4 Computational Issues in K-PLS -- 11.5 Comparison of Kernel Regression Methods -- 11.5.1 Methods -- 11.5.2 Benchmark cases -- 11.5.3 Data preparation and parameter tuning -- 11.5.4 Results and discussion -- 11.6 Case Study for Classification with Uneven Classes -- 11.7 Feature Selection with K-PLS -- 11.8 Thoughts and Conclusions -- 12 Multiclass Learning with Output Codes -- 12.1 Introduction -- 12.2 Margin-based Learning Algorithms -- 12.3 Output Coding for Multiclass Problems -- 12.4 Training Error Bounds -- 12.5 Finding Good Output Codes -- 12.6 Conclusions -- 13 Bayesian Regression and Classification -- 13.1 Introduction -- 13.1.1 Least squares regression -- 13.1.2 Regularization -- 13.1.3 Probabilistic models -- 13.1.4 Bayesian regression -- 13.2 Support Vector Machines -- 13.3 The Relevance Vector Machine -- 13.3.1 Model specification -- 13.3.2 The effective prior -- 13.3.3 Inference -- 13.3.4 Making predictions
13.3.5 Properties of the marginal likelihood -- 13.3.6 Hyperparameter optimization -- 13.3.7 Relevance vector machines for classification -- 13.4 The Relevance Vector Machine in Action -- 13.4.1 Illustrative synthetic data: regression -- 13.4.2 Illustrative synthetic data: classification -- 13.4.3 Benchmark results -- 13.5 Discussion -- 14 Bayesian Field Theory: from Likelihood Fields to Hyperfields -- 14.1 Introduction -- 14.2 The Bayesian framework -- 14.2.1 The basic probabilistic model -- 14.2.2 Bayesian decision theory and predictive density -- 14.2.3 Bayes' theorem: from prior and likelihood to the posterior -- 14.3 Likelihood models -- 14.3.1 Log-probabilities, energies, and density estimation -- 14.3.2 Regression -- 14.3.3 Inverse quantum theory -- 14.4 Prior models -- 14.4.1 Gaussian prior factors and approximate symmetries -- 14.4.2 Hyperparameters and hyperfields -- 14.4.3 Hyperpriors for hyperfields -- 14.4.4 Auxiliary fields -- 14.5 Summary -- 15 Bayesian Smoothing and Information Geometry -- 15.1 Introduction -- 15.2 Problem Statement -- 15.3 Probability-Based Inference -- 15.4 Information-Based Inference -- 15.5 Single-Case Geometry -- 15.6 Average-Case Geometry -- 15.7 Similar-Case Modeling -- 15.8 Locally Weighted Geometry -- 15.9 Concluding Remarks -- 16 Nonparametric Prediction -- 16.1 Introduction -- 16.2 Prediction for Squared Error -- 16.3 Prediction for 0 - 1 Loss: Pattern Recognition -- 16.4 Prediction for Log Utility: Portfolio Selection -- 17 Recent Advances in Statistical Learning Theory -- 17.1 Introduction -- 17.2 Problem Formulations -- 17.2.1 Uniform convergence of empirical means -- 17.2.2 Probably approximately correct learning -- 17.3 Summary of "Classical" Results -- 17.3.1 Fixed distribution case -- 17.3.2 Distribution-free case -- 17.4 Recent Advances -- 17.4.1 Intermediate families of probability measures
17.4.2 Learning with prior information
This text details advances in learning theory that relate to problems studied in neural networks, machine learning, mathematics and statistics
Description based on publisher supplied metadata and other sources
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2020. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries
Link Print version: Basu, S. Advances in Learning Theory : Methods, Models and Applications Amsterdam : IOS Press, c2003 9781586033415
Subject Computational learning theory -- Congresses ; Machine learning -- Mathematical models -- Congresses
Electronic books
Alt Author Horvath, G
Micchelli, C
Micchelli, Charles A
Vandewalle, Joos