Abstract
The bash shell environment offers a wealth of useful programs for examining, filtering, transforming, and analyzing data. In conjunction with the underlying filter-and-pipe architecture, powerful data transformations can be performed interactively and iteratively within a very short time, which can, for example, support the knowledge discovery process before handing the data off to dedicated tools like Mathematica, R, etc. The tutorial presented here introduces the most useful command-line tools from the GNU coreutils and how they interact, on the basis of a continuous scenario, and clarifies them by means of two in-depth practical exercises in which the participants have to convict a murderer using a series of available police documents - exciting!
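As a minimal sketch of the kind of pipeline the tutorial builds up (the file name and data are hypothetical), the following one-liner reports the ten most frequent words across a collection of witness statements:

    # split into one word per line, normalize case, then count and rank
    tr -cs '[:alpha:]' '\n' < statements.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn | head -n 10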
Keywords
Filters and Pipes, Unix Shell, GNU coreutils.
Aims and Learning Objectives
After completing the tutorial, the participants will be able to successfully create their own filter pipelines for various tasks. They will have internalized the idea that composing complex programs from small, well-defined components allows rapid prototyping, incremental iteration, and easy experimentation.
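To illustrate this incremental, exploratory style (the file name and column layout are assumed for the example), a pipeline typically grows one filter at a time:

    # start broad, then refine step by step
    grep 'Berlin' suspects.csv
    grep 'Berlin' suspects.csv | cut -d',' -f3
    grep 'Berlin' suspects.csv | cut -d',' -f3 | sort | uniq -c | sort -rn

Each intermediate result can be inspected before the next stage is added, which keeps experiments cheap and mistakes easy to undo.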
Target Audience
Data analysts, database developers, and anyone interested in finding, filtering, and transforming information.
Prerequisite Knowledge of Audience
Intermediate - Participants should be familiar with (or at least interested in) using a shell such as bash, csh, or the DOS shell. A basic knowledge of regular expressions is helpful, but not strictly required.
Detailed Outline
• Introduction
• The Basic Building Blocks (text utilities from coreutils)
• Practical Exercise I
• Composition using the Filter and Pipe Architecture
• Practical Exercise II
• Summary and Outlook