August 28, 2017 to September 1, 2017
KIT, Campus North, FTU
Europe/Berlin timezone

Databases for Big Data Analytics and Machine Learning

Aug 31, 2017, 1:00 PM
5h
Aula (FTU)

Speaker

Dr Mario Lassnig (CERN)

Description

In this workshop, students will learn (a) how to use relational and non-relational databases efficiently, and (b) how to create database workflows suitable for analytics and machine learning.

First, the workshop teaches efficient, safe, and fault-tolerant practices for high-volume and high-throughput database scenarios. The systems covered include, but are not limited to, PostgreSQL, Redis, and Elasticsearch. Topics include query planning and performance analysis, transactional safety, SQL injection, and competitive locking.
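As an illustration of two of the topics above, SQL injection and transactional safety, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL so that it runs without a server; the `users` table and all function names are hypothetical and are not taken from the workshop material.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    conn.commit()
    return conn


def find_user_unsafe(conn, name):
    # Anti-pattern: the input is pasted into the SQL text and can rewrite the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Placeholder binding: the value is passed separately from the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def rename_all_or_nothing(conn, renames):
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so either every rename is applied or none is.
    try:
        with conn:
            for old, new in renames:
                cur = conn.execute(
                    "UPDATE users SET name = ? WHERE name = ?", (new, old)
                )
                if cur.rowcount != 1:
                    raise ValueError(f"no unique user named {old!r}")
    except ValueError:
        return False
    return True
```

With the input `x' OR '1'='1`, the unsafe variant returns every row in the table, while the bound variant treats the whole string as a name and returns nothing. The same placeholder principle applies to PostgreSQL drivers, although the placeholder syntax differs between drivers.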

Second, the workshop covers how to prepare data from these databases for use in analytics and machine learning frameworks such as Keras. Topics include recommended workflows for data selection, data cleaning, model training, model running, error checking, and output archival.
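The first two workflow steps named above, data selection and data cleaning, could look roughly like the following sketch. It is an assumption-laden example rather than the workshop's actual workflow: the `measurements` table, its columns, and the min-max scaling step are all hypothetical, and model training is left out so that the code depends only on the Python standard library.

```python
import sqlite3


def select_rows(conn):
    # Data selection: fetch only the columns the model needs, in a stable order.
    return conn.execute(
        "SELECT sensor, value FROM measurements ORDER BY id"
    ).fetchall()


def clean_rows(rows):
    # Data cleaning: drop rows with missing or non-numeric values.
    cleaned = []
    for sensor, value in rows:
        if sensor is None or value is None:
            continue
        try:
            cleaned.append((sensor, float(value)))
        except (TypeError, ValueError):
            continue
    return cleaned


def min_max_scale(values):
    # Rescale to [0, 1]; a constant column maps to all zeros.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def prepare_features(conn):
    # Selection, then cleaning, then scaling; returns (sensor, scaled_value) pairs.
    rows = clean_rows(select_rows(conn))
    if not rows:
        return []
    scaled = min_max_scale([value for _, value in rows])
    return [(sensor, x) for (sensor, _), x in zip(rows, scaled)]
```

The resulting list of pairs can then be converted into whatever array type the chosen framework expects. Keeping each step in its own function makes it easier to check errors and archive intermediate outputs separately.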

An intermediate understanding of Python, SQL, and Linux shell scripting is recommended to follow this course. An understanding of machine learning principles is not required.
