GridKa School 2016 - Data Science on Modern Architectures

Name: GridKa School 2016 - Data Science on Modern Architectures
Start: 2016-08-29T08:00:00+02:00
End: 2016-09-02T12:00:00+02:00
Location: FTU

August 29, 2016 to September 2, 2016

Europe/Berlin timezone

Apache Spark in Scientific Applications

Sep 1, 2016, 1:00 PM

Room 155 (FTU)

Mirko Kämpf (Cloudera)

The workshop "Apache Spark in Scientific Applications" covers fundamental development and data analysis techniques using Apache Hadoop and Apache Spark. Besides an introduction to the theoretical background of MapReduce and Bulk Synchronous Parallel (BSP) processing, the machine learning library MLlib and the graph processing framework GraphX are also used. We work on sample data sets from Wikipedia, financial market data, and a generic data generator. During the tutorial sessions we illustrate the data science workflow and present the right tools for the right task. All practical exercises are prepared in a pre-configured virtual machine. Participants get access to the required data sets on a "one-node pseudo-distributed" cluster with all tools included. This VM is also a starting point for further experiments after the workshop.
