Arjun Srivastava's Library

Advanced Analytics With Spark: Patterns for Learning From Data at Scale
Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills
In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together...
Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us About Our Offline Selves
Christian Rudder
A *New York Times* Bestseller. An audacious, irreverent investigation of human behavior—and a first look at a revolution in the making. Our personal data has been used to spy on us, hire and fire us, and sell us stuff we don’t need. In Dataclysm, Chr...
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Martin Kleppmann
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, includin...
Domain Modeling Made Functional: Tackle Software Complexity With Domain-Driven Design and F#
Scott Wlaschin
You want increased customer satisfaction, faster development cycles, and less wasted work. Domain-driven design (DDD) combined with functional programming is the innovative combo that will get you there. In this pragmatic, down-to-earth guide, you'll...
Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures
Claus O. Wilke
Effective visualization is the best way to communicate information from the increasingly large and complex datasets in the natural and social sciences. But with the increasing power of visualization software today, scientists, engineers, and business...
R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics
JD Long and Paul Teetor
Perform data analysis with R quickly and efficiently with more than 275 practical recipes in this expanded second edition. The R language provides everything you need to do statistical work, but its structure can be difficult to master. These task-o...
Spark: The Definitive Guide: Big Data Processing Made Simple
Bill Chambers and Matei Zaharia
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Mate...