Author Gentleman, Robert
Title Bioconductor Case Studies
Imprint New York, NY : Springer, 2008
©2008
Edition 1st ed
Descript 1 online resource (287 pages)
text txt rdacontent
computer c rdamedia
online resource cr rdacarrier
Series Use R! Ser
Note Intro -- Preface -- Contents -- List of Contributors -- 1 The ALL Dataset -- 1.1 Introduction -- 1.2 The ALL data -- 1.3 Data subsetting -- 1.4 Nonspecific filtering -- 1.5 BCR/ABL ALL1/AF4 subset -- 2 R and Bioconductor Introduction -- 2.1 Finding help in R -- 2.2 Working with packages -- 2.3 Some basic R -- 2.3.1 Functions -- 2.3.2 The apply family of functions -- 2.3.3 Environments -- 2.4 Structures for genomic data -- 2.4.1 Building an ExpressionSet from .CEL and other files -- 2.4.2 Building an ExpressionSet from scratch -- 2.4.3 ExpressionSet basics -- 2.5 Graphics -- 3 Processing Affymetrix Expression Data -- 3.1 The input data: CEL files -- 3.1.1 The sample annotation -- 3.2 Quality assessment -- 3.3 Preprocessing -- 3.4 Ranking and filtering probe sets -- 3.4.1 Summary statistics and tests for ranking -- 3.4.2 Visualization of differential expression -- 3.4.3 Highlighting interesting genes -- 3.4.4 Selecting hit lists and the multiple testing problem -- 3.4.5 Annotation -- 3.5 Advanced preprocessing -- 3.5.1 PM and MM probes -- 3.5.2 Background-correction -- 3.5.3 Summarization -- 4 Two-Color Arrays -- 4.1 Introduction -- 4.2 Data import -- 4.3 Image plots -- 4.4 Normalization -- 4.5 Differential expression -- 5 Fold-Changes, Log-Ratios, Background Correction, Shrinkage Estimation, and Variance Stabilization -- 5.1 Fold-changes and (log-)ratios -- 5.2 Background-correction and generalized logarithm -- 5.3 Calling VSN -- 5.4 How does VSN work? -- 5.5 Robust fitting and the "most genes not differentially expressed" assumption -- 5.6 Single-color normalization -- 5.7 The interpretation of glog-ratios -- 5.8 Reference normalization -- 6 Easy Differential Expression -- 6.1 Example data -- 6.2 Nonspecific filtering -- 6.3 Differential expression -- 6.4 Multiple testing correction -- 7 Differential Expression -- 7.1 Motivation
7.1.1 The gene-by-gene approach -- 7.1.2 Nonspecific filtering -- 7.1.3 Fold-change versus t-test -- 7.2 Nonspecific filtering -- 7.3 Differential expression -- 7.4 Multiple testing -- 7.5 Moderated test statistics and the limma package -- 7.5.1 Small sample sizes -- 7.6 Gene selection by Receiver Operator Characteristic (ROC) -- 7.7 When power increases -- 8 Annotation and Metadata -- 8.1 Our data -- 8.2 Multiple probe sets per gene -- 8.3 Categories and overrepresentation -- 8.3.1 Chromosomal location -- 8.4 Working with GO -- 8.4.1 Functional analyses -- 8.5 Other annotations available -- 8.6 biomaRt -- 8.7 Database versions of annotation packages -- 8.7.1 Mapping Symbols -- 8.7.2 Other capabilities -- 9 Supervised Machine Learning -- 9.1 Introduction -- 9.1.1 Supervised machine learning check list -- 9.2 The example dataset -- 9.2.1 Nonspecific filtering of features -- 9.3 Feature selection and standardization -- 9.4 Selecting a distance -- 9.5 Machine learning -- 9.6 Cross-validation -- 9.7 Random forests -- 9.7.1 Feature selection -- 9.7.2 More exercises -- 9.8 Multigroup classification -- 10 Unsupervised Machine Learning -- 10.1 Preliminaries -- 10.1.1 Data -- 10.2 Distances -- 10.3 How many clusters? -- 10.4 Hierarchical clustering -- 10.5 Partitioning methods -- 10.5.1 PAM -- 10.6 Self-organizing maps -- 10.7 Hopach -- 10.8 Silhouette plots -- 10.9 Exploring transformations -- 10.10 Remarks -- 11 Using Graphs for Interactome Data -- 11.1 Introduction -- 11.2 Exploring the protein interaction graph -- 11.3 The co-expression graph -- 11.4 Testing the association between physical interaction and coexpression -- 11.5 Some harder problems -- 11.6 Reading PSI-25 XML files from IntAct with the Rintact package -- 11.6.1 Introduction -- 11.6.2 Loading R Packages -- 11.6.3 Obtaining the interaction information
11.6.4 Obtaining protein complex composition information -- 11.6.5 Creating graph objects with Rintact -- 12 Graph Layout -- 12.1 Introduction -- 12.2 Layout and rendering using Rgraphviz -- 12.2.1 Rendering parameters -- 12.2.2 Layout parameters -- 12.3 Directed graphs -- 12.3.1 Reciprocated edges -- 12.4 Subgraphs -- 12.5 Tooltips and hyperlinks on graphs -- 13 Gene Set Enrichment Analysis -- 13.1 Introduction -- 13.1.1 Simple GSEA -- 13.1.2 Visualization -- 13.1.3 Data representation -- 13.2 Data analysis -- 13.2.1 Preprocessing -- 13.2.2 Using KEGG -- 13.2.3 Permutation testing -- 13.2.4 Chromosome bands -- 13.3 Identifying and assessing the effects of overlapping gene sets -- 14 Hypergeometric Testing Used for Gene Set Enrichment Analysis -- 14.1 Introduction -- 14.2 The basic problem -- 14.3 Preprocessing and inputs -- 14.3.1 Nonspecific filtering -- 14.3.2 Gene selection via t-test -- 14.3.3 Inputs -- 14.4 Outputs and result summarization -- 14.4.1 Calling the hyperGTest function -- 14.4.2 Summarizing a GOHyperGResult object -- 14.4.3 Generating an HTML report of test results -- 14.4.4 Results in detail -- 14.5 The conditional hypergeometric test -- 14.6 Other collections of gene sets -- 14.6.1 Chromosome bands -- 14.6.2 KEGG -- 14.6.3 PFAM -- 15 Solutions to Exercises -- 2 R and Bioconductor Introduction -- 3 Processing Affymetrix Expression Data -- 4 Two-Color Arrays -- 5 Fold-Changes, Log-Ratios, Background Correction, Shrinkage Estimation, and Variance Stabilization -- 6 Easy Differential Expression -- 7 Differential Expression -- 8 Annotation and Metadata -- 9 Supervised Machine Learning -- 10 Unsupervised Machine Learning -- 11 Using Graphs for Interactome Data -- 12 Graph Layout -- 13 Gene Set Enrichment Analysis -- 14 Hypergeometric Testing Used for Gene Set Enrichment Analysis -- References -- Index
In this volume, the authors present a collection of case studies that apply Bioconductor tools to the analysis of microarray gene expression data. Each chapter describes an analysis of real data using a hands-on, example-driven approach. Short exercises are included.
Description based on publisher supplied metadata and other sources
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2020. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries
Link Print version: Gentleman, Robert Bioconductor Case Studies New York, NY : Springer, c2008 9780387772394
Subject Bioconductor (Computer file); Bioinformatics; R (Computer program language)
Electronic books
Alt Author Falcon, Seth
Hahne, Florian
Huber, Wolfgang