GridKa School 2019 - The Art of Data

Name: GridKa School 2019 - The Art of Data
Start: 2019-08-26T12:00:00+02:00
End: 2019-08-30T14:00:00+02:00
Location: KIT, Campus North, FTU

Aug 26 – 30, 2019

KIT, Campus North, FTU

Europe/Berlin timezone

Contact

GridKa-School@scc.kit.edu

Scalable Scientific Analysis in Python using Pandas and Dask

Aug 29, 2019, 1:15 PM

4h 45m

Aula (FTU)

Aula

FTU

Hands-On Tutorial Tutorials

Mr Florian Jetter (Blue Yonder GmbH)Mr Sebastian Neubauer (Blue Yonder GmbH)

Pandas is a Python package that provides data structures to work with heterogenous, relational/tabular data. It provides fundamental building blocks for a powerful and flexible data analysis. Pandas provides functionality to load a wide set of data formats, manipulate the resulting data and also visualize it using various plotting frameworks. We will show in the workshop how to clean and reshape data in Pandas and use the concept of split-apply-combine to do exploratory analysis on it. Pandas provides powerful tooling to do data analysis on a single machine and is mostly mostly constrained to a single CPU. To parallelize and distribute these tasks, one can use Dask.

Dask is a flexible tool for parallelizing Python code on a single machine or across a cluster. We can think of dask at a high and a low level: Dask provides high-level Array, Bag, and DataFrame collections that mimic NumPy, lists, and Pandas but can operate in parallel on datasets that don't fit into main memory. Dask's high-level collections are alternatives to NumPy and Pandas for large datasets. In the low level, Dask provides dynamic task schedulers that execute task graphs in parallel. These execution engines power the high-level collections mentioned above but can also power custom, user-defined workloads. In the tutorial, we will cover the high-level use of dask.array and dask.dataframe.

There are no materials yet.

GridKa School 2019 - The Art of Data

Contact

Scalable Scientific Analysis in Python using Pandas and Dask

Aula

FTU

Speakers

Description

Presentation materials

Choose timezone

GridKa School 2019 - The Art of Data

Contact

Speakers

Description

Presentation materials