Arjun Srivastava's Library

Advanced Analytics With Spark: Patterns for Learning From Data at Scale
Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills
In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together...
High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark
Holden Karau and Rachel Warren
Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warre...
How to Win an Indian Election: What Political Parties Don’t Want You to Know
Shivam Shankar Singh
What role do political consultants play in election campaigns? How are political parties using technological tools such as data analytics, surveys and alternative media to construct effective, micro-targeted campaigns? How does the use of money impa...
Machine Learning With R, the Tidyverse, and Mlr
Hefin I. Rhys
Summary: Machine learning (ML) is a collection of programming techniques for discovering relationships in data. With ML algorithms, you can cluster and classify data for tasks like making recommendations or fraud detection and make predictions for sa...
Spark: The Definitive Guide: Big Data Processing Made Simple
Bill Chambers and Matei Zaharia
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Mate...
Tika in Action
Chris Mattmann and Jukka Zitting
Summary: Tika in Action is a hands-on guide to content mining with Apache Tika. The book's many examples and case studies offer real-world experience from domains ranging from search engines to digital asset management and scientific data processing. A...