NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
The concept of utilizing big data to enable scientific discovery has generated tremendous excitement and investment from both private and public sectors over the past decade, and expectations continue to grow. Using big data analytics to identify complex patterns hidden inside volumes of data that have never been combined could accelerate the rate of scientific discovery and lead to the development of beneficial technologies and products. However, producing actionable scientific knowledge from such large, complex data sets requires statistical models that produce reliable inferences (NRC, 2013). Without careful consideration of the suitability of both available data and the statistical models applied, analysis of big data may result in misleading correlations and false discoveries, which can potentially undermine confidence in scientific research if the results are not reproducible. In June 2016 the National Academies of Sciences, Engineering, and Medicine convened a workshop to examine critical challenges and opportunities in performing scientific inference reliably when working with big data. Participants explored new methodologic developments that hold significant promise and potential research program areas for the future. This publication summarizes the presentations and discussions from the workshop.
Contents
- The National Academies of SCIENCES · ENGINEERING · MEDICINE
- PLANNING COMMITTEE ON REFINING THE CONCEPT OF SCIENTIFIC INFERENCE WHEN WORKING WITH BIG DATA
- COMMITTEE ON APPLIED AND THEORETICAL STATISTICS
- BOARD ON MATHEMATICAL SCIENCES AND THEIR APPLICATIONS
- Acknowledgment of Reviewers
- 1. Introduction
- 2. Framing the Workshop
- PERSPECTIVES FROM STAKEHOLDERS. Michelle Dunn, Nandini Kannan, and Chaitan Baru.
- INTRODUCTION TO THE SCIENTIFIC CONTENT OF THE WORKSHOP. Michael Daniels.
- 3. Inference About Discoveries Based on Integration of Diverse Data Sets
- DATA INTEGRATION WITH DIVERSE DATA SETS. Alfred Hero, III.
- DATA INTEGRATION AND ITERATIVE TESTING. Andrew Nobel.
- PANEL DISCUSSION
- STATISTICAL DATA INTEGRATION FOR LARGE-SCALE MULTIMODAL MEDICAL STUDIES. Genevera Allen.
- DISCUSSION OF STATISTICAL INTEGRATION FOR MEDICAL AND HEALTH STUDIES. Jeffrey S. Morris.
- PANEL DISCUSSION
- 4. Inference About Causal Discoveries Driven by Large Observational Data
- USING ELECTRONIC HEALTH RECORDS DATA FOR CAUSAL INFERENCES ABOUT THE HUMAN IMMUNODEFICIENCY VIRUS CARE CASCADE. Joseph Hogan.
- DISCUSSION OF CAUSAL INFERENCES ON THE HUMAN IMMUNODEFICIENCY VIRUS CARE CASCADE FROM ELECTRONIC HEALTH RECORDS DATA. Elizabeth Stuart.
- PANEL DISCUSSION
- A GENERAL FRAMEWORK FOR SELECTION BIAS DUE TO MISSING DATA IN ELECTRONIC HEALTH RECORDS-BASED RESEARCH. Sebastien Haneuse.
- DISCUSSION OF COMPARATIVE EFFECTIVENESS RESEARCH USING ELECTRONIC HEALTH RECORDS. Dylan Small.
- PANEL DISCUSSION
- 5. Inference When Regularization Is Used to Simplify Fitting of High-Dimensional Models
- LEARNING FROM TIME. Daniela Witten.
- DISCUSSION OF LEARNING FROM TIME. Michael Kosorok.
- PANEL DISCUSSION
- SELECTIVE INFERENCE IN LINEAR REGRESSION. Jonathan Taylor.
- STATISTICS AND BIG DATA CHALLENGES IN NEUROSCIENCE. Emery N. Brown.
- DISCUSSION OF STATISTICS AND BIG DATA CHALLENGES IN NEUROSCIENCE. Xihong Lin.
- PANEL DISCUSSION
- 6. Panel Discussion
- RESEARCH PRIORITIES FOR IMPROVING INFERENCES FROM BIG DATA
- INFERENCE WITHIN COMPLEXITY AND COMPUTATIONAL CONSTRAINTS
- EDUCATION AND CROSS-DISCIPLINARY COLLABORATION
- IDENTIFICATION OF QUESTIONS AND APPROPRIATE USES FOR AVAILABLE DATA
- FACILITATION OF DATA SHARING AND LINKAGE
- THE BOUNDARY BETWEEN BIOSTATISTICS AND BIOINFORMATICS
- References
- Appendixes
Rapporteur: Ben A. Wender.
Suggested citation:
National Academies of Sciences, Engineering, and Medicine. 2017. Refining the Concept of Scientific Inference When Working with Big Data: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/24654.