Wednesday, April 4, 2012

An Overview of High Performance Data Mining and Big Data Analytics

<Update May 2014>
I have written a book on this topic, due out in July 2014.
See the book site for "High-Performance Data Mining and Big Data Analytics: The Story of Insight from Big Data".
<April 04, 2012>
With the exponential growth of data comes an ever-increasing need to process and analyze so-called Big Data. High-performance computing architectures have been devised to handle Big Data processing, not only from a transaction-processing viewpoint but also from an analytics perspective. The main goal of this paper is to give the reader a historical and comprehensive view of the recent trend toward high-performance computing architectures, especially as it relates to Analytics and Data Mining.
Many separate readings exist on Big Data (and its characteristics), High Performance Computing for Analytics, Massively Parallel Processing (MPP) databases, algorithms for Big Data, in-memory databases, implementations of machine learning algorithms on Big Data platforms, the analytics environments of the future, and so on. However, none gives a historical and comprehensive view of all these topics in a single document. This paper is the author’s first attempt to bring as many of these topics together as possible and to portray an ideal analytic environment better suited to today’s analytics demands.

201201_High_Performance_Data_Mining_published_20120501.pdf by Khosrow Hassibi