I studied statistics and data science for years before anyone ever suggested to me that these topics might have an ethical dimension, or that my numerical tools were products of human beings with motivations specific to their time and place. I’ve since written about the history and philosophy of mathematical probability and statistics, and I’ve come to understand how important that historical background is, and how critical it is that the next generation of data scientists understand where these ideas come from and their potential to do harm. I hope anyone who reads these books avoids getting blinkered by the ideas that data = objectivity and that science is morally neutral.
The thing you should know about science is that it’s a human enterprise. As a result, it’s dependent on human factors like social consensus and prejudice. In this series of case studies of famously expensive and difficult-to-replicate experiments probing the limits of scientific understanding from biology to theoretical physics, Collins and Pinch show how scientific knowledge gathering is rarely straightforward because there are always alternative explanations available for the data. Was the phenomenon real or was the experiment set up badly? We can never know for sure, but we decide collectively what we believe. Scientists are experts participating in human culture, they argue, not mysterious clergy issuing declarations of absolute truth.
Harry Collins and Trevor Pinch liken science to the Golem, a creature from Jewish mythology, powerful yet potentially dangerous, a gentle, helpful creature that may yet run amok at any moment. Through a series of intriguing case studies the authors debunk the traditional view that science is the straightforward result of competent theorisation, observation and experimentation. The very well-received first edition generated much debate, reflected in a substantial new Afterword in this second edition, which seeks to place the book in what have become known as 'the science wars'.
I’ve wanted to be a philosopher since I read Plato’s Phaedo when I was 17, a new immigrant in Canada. Since then, I’ve been fascinated with time, space, and quantum mechanics and involved in the great debates about their mysteries. I saw probability coming into play more and more in curious roles both in the sciences and in practical life. These five books led me on an exciting journey into the history of probability, the meaning of risk, and the use of probability to assess the possibility of harm. I was gripped, entertained, illuminated, and often amazed at what I was discovering.
I am laughing out loud, even now that I am rereading this book for the umpteenth time. Fraudsters are so clever, and so is advertising. And then there is sloppy journalism with its “wow” statistics.
I like this book enormously, not least because of its witty illustrations. It is subversive, comic, and provocative, and it makes me wise to seductive, misleading practices, and it does so with a light touch.
From distorted graphs and biased samples to misleading averages, there are countless statistical dodges that lend cover to anyone with an ax to grind or a product to sell. With abundant examples and illustrations, Darrell Huff's lively and engaging primer clarifies the basic principles of statistics and explains how they're used to present information in honest and not-so-honest ways. Now even more indispensable in our data-driven world than it was when first published, How to Lie with Statistics is the book that generations of readers have relied on to keep from being fooled.
I’m an applied statistician and academic researcher/lecturer at New Zealand’s oldest university – the University of Otago. R facilitates everything I do – research, academic publication, and teaching. It’s the last of these that motivated my own book on R. From first-year statistics students who have never seen R to my own Ph.D. students using R to implement novel and highly complex statistical methods and models, my experience is that all ultimately love the ease with which the R language permits exploration, visualisation, analysis, and inference of one’s data. The ever-growing need in today’s society for skilled statisticians and data scientists means there's never been a better time to learn this essential language.
An authoritative tome on R. This book is the ultimate reference guide, heavy on statistical methods from the simple to the advanced. Of the 29 chapters, only the first five chapters or so have R syntactical and programming skills as their main focus; the remaining content highlights the many and varied statistical techniques R is capable of. I think this is a fantastic book to have on the shelf for people who are likely to need R and its contributed packages for a variety of different statistical analyses, but might not know where to initially start for any given statistical method.
Hugely successful and popular text presenting an extensive and comprehensive guide for all R users.
The R language is recognized as one of the most powerful and flexible statistical software packages, enabling users to apply many statistical techniques that would be impossible without such software to help implement such large data sets. R has become an essential tool for understanding and carrying out research. This edition:
* Features full colour text and extensive graphics throughout.
* Introduces a clear structure with numbered section headings to help readers locate information more efficiently.
* Looks at the evolution of R over the…
A gentle yet detailed book for beginner programmers. A great book for those who know they'll be doing some programming in R but who are very new to programming in general. The book's chapters are filled with content on the syntax, usage, and 'best practice' guidelines. The examples guide the reader in a step-by-step fashion to maximise understanding. One of its stand-out features is a distinctive chapter of examples showing how to do things in R that you might otherwise have done in Excel.
Mastering R has never been easier.
Picking up R can be tough, even for seasoned statisticians and data analysts. R For Dummies, 2nd Edition provides a quick and painless way to master all the R you'll ever need. Requiring no prior programming experience and packed with tons of practical examples, step-by-step exercises, and sample code, this friendly and accessible guide shows you how to know your way around lists, data frames, and other R data structures, while learning to interact with other programs, such as Microsoft Excel. You'll learn how to reshape and manipulate data, merge data sets, split and…
Hi, I’m Neil. We need to live our tiny, precious lives with intention. I write about failure, resilience, happiness, trust, and gratitude. I’m the New York Times bestselling author of 10 books and journals that have sold over 2,000,000 copies and spent over 200 weeks on bestseller lists, including The Happiness Equation, Two-Minute Mornings, and You Are Awesome. I host the award-winning, ad-free, sponsor-free podcast 3 Books, where I’m on a 22-year quest to uncover the 1000 most formative books in the world. Guests include Brené Brown, Quentin Tarantino, and David Sedaris. I give over 50 keynote speeches a year at places like Harvard, SXSW, and Microsoft.
If I were teaching a course on life, this would be a mandatory textbook. Taleb defines black swan events as events that 1) are disproportionately huge, 2) cannot be predicted, and 3) are mistakenly explained in retrospect with hindsight and fallacies.
This book helped me leave my corporate job and strike out on my own. Why? To help unroll the canvas of myself and my life, so I was more exposed to black swan events, leading me to write more books and have more unlikely, amazing experiences.
The most influential book of the past seventy-five years: a groundbreaking exploration of everything we know about what we don’t know, now with a new section called “On Robustness and Fragility.”
A black swan is a highly improbable event with three principal characteristics: It is unpredictable; it carries a massive impact; and, after the fact, we concoct an explanation that makes it appear less random, and more predictable, than it was. The astonishing success of Google was a black swan; so was 9/11. For Nassim Nicholas Taleb, black swans underlie almost everything about our world, from the rise of religions…
I started my career as a research scientist building machine learning algorithms for weather forecasting. Twenty years later, I found myself at a precision agriculture startup creating models that provided guidance to farmers on when to plant, what to plant, etc. So, I am part of the movement from academia to industry. Now, at Google Cloud, my team builds cross-industry solutions and I see firsthand what our customers need in their data science teams. This set of books is what I suggest when a CTO asks how to upskill their workforce, or when a graduate student asks me how to break into the industry.
It is not enough for a data scientist to be able to analyze data and build ML models. You have to be able to communicate the insights to decision-makers concisely and accurately. This book shows you bad and good visualizations — you’ll be surprised by how often you would have defaulted to the bad way without the guidance provided by this book!
Effective visualization is the best way to communicate information from the increasingly large and complex datasets in the natural and social sciences. But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.
This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke…
As a professional statistician, I am naturally interested in AI and data science. However, in our current information age, everyone, in all segments of society, needs to understand the basics of AI and data science. These basics include such things as what these disciplines are, what they can contribute to society, and perhaps most importantly, what can go wrong. However, I have found that much of the literature on these topics is highly technical and beyond the reach of most readers. These books are specifically selected because they are readable by virtually everyone, and yet convey the key concepts needed to be data-literate in the 21st century. Enjoy!
This book, by Nate Silver of 538 fame, explains in a straightforward manner why so many predictions by “experts,” from weather forecasts to sports outcomes to election polling to economics, ultimately prove wrong.
Its central theme is telling the “signal,” the underlying science that is often revealed through trends and patterns in data, apart from the “noise,” the random or unpredictable variations always present in data. Silver also explains the concept of conditional probability, that is, the probability of an event given some relevant information, in an unusually clear manner.
The book reads more like a casual conversation with the author than a statistics textbook.
UPDATED FOR 2020 WITH A NEW PREFACE BY NATE SILVER
"One of the more momentous books of the decade." —The New York Times Book Review
Nate Silver built an innovative system for predicting baseball performance, predicted the 2008 election within a hair’s breadth, and became a national sensation as a blogger—all by the time he was thirty. He solidified his standing as the nation's foremost political forecaster with his near perfect prediction of the 2012 election. Silver is the founder and editor in chief of the website FiveThirtyEight.
Drawing on his own groundbreaking work, Silver examines the world of prediction,…
I am a leader in analytics and AI strategy, and have a broad range of experience in aviation, energy, financial services, and the public sector. I have worked with several major organizations to help them establish a leadership position in data science and to unlock real business value using advanced analytics.
This is a foundational book on analytics and data science as a business function and helped to shape the development of the practice. It provides a view of the discipline through a business lens and avoids deep technical examinations. Though much has changed in the 15 years since it was originally published, it is still essential reading for a leader in the field. No book since has captured as well the competitive differentiation that analytics provides.
You have more information at hand about your business environment than ever before. But are you using it to "out-think" your rivals? If not, you may be missing out on a potent competitive tool. In Competing on Analytics: The New Science of Winning, Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon? Analytics: sophisticated quantitative and statistical analysis and predictive modeling. Exemplars of analytics are using new…
Not everybody needs to be a data scientist, but everybody does need to be data literate. Without an intentional focus on evangelism and building a strong data culture in your organization, it will be an uphill battle to make meaningful change. This book helps individuals and leaders to understand what data literacy is, and how we can build it like any other skill.
In the fast-moving world of the fourth industrial revolution, not everyone needs to be a data scientist, but everyone should be data literate, with the ability to read, analyze and communicate with data.
It is not enough for a business to have the best data if those using it don't understand the right questions to ask or how to use the information generated to make decisions. Be Data Literate is the essential guide to developing the curiosity, creativity and critical thinking necessary to make anyone data literate, without retraining as a data scientist or statistician.
With learnings to show…
Data scientists and analytics specialists are great at building models and algorithms, but often wrap them in a presentation or dashboard that diminishes their value and reduces the likelihood of their work being adopted. This book encourages practitioners to always consider the last mile and to pay as much attention to presentation and aesthetics as we do to the model itself.
Master the art and science of data storytelling, with frameworks and techniques to help you craft compelling stories with data.
The ability to effectively communicate with data is no longer a luxury in today's economy; it is a necessity. Transforming data into visual communication is only one part of the picture. It is equally important to engage your audience with a narrative, to tell a story with the numbers. Effective Data Storytelling will teach you the essential skills necessary to communicate your insights through persuasive and memorable data stories.
Narratives are more powerful than raw statistics, more enduring than pretty charts. When…