An important part of any computational research is dealing with data, whether data used as input for calculations or data produced by codes during simulations. This lesson presents several strategies for processing data more efficiently, not only from the point of view of computers but also for the humans trying to make sense of the abundant data we often deal with.
Prerequisites
This tutorial requires familiarity with the command-line interface. Examples involving binary formats use interpreted languages such as Python or R; a basic knowledge of these languages lets you focus on the data-processing techniques rather than on the languages themselves.
| Time  | Episode                                                      | Question                                                            |
|-------|--------------------------------------------------------------|---------------------------------------------------------------------|
|       | Setup                                                        | Download files required for the lesson                              |
| 09:00 | 1. Processing Text Files with grep and awk                   | How to use grep and awk to extract useful information from large text files? |
| 10:00 | 2. Using Regular Expressions with Python                     | How to use regular expressions in Python?                           |
| 10:30 | 3. Structured Text (XML and JSON)                            | How to extract data from XML and JSON files?                        |
| 11:00 | 4. Binary Formats: NetCDF and HDF5                           | How to store large amounts of numerical data?                       |
| 13:00 | 5. Creating Simple Databases with SQLite                     | How to use SQLite?                                                  |
| 13:30 | 6. NoSQL Databases with MongoDB                              | How to use a MongoDB database?                                      |
| 14:00 | 7. Machine Learning (scikit-learn, Keras, and TensorFlow)    | What is Machine Learning and what is all the fuss about it?         |
| 15:00 | Finish                                                       |                                                                     |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
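As a small taste of the techniques covered, the sketch below uses Python's built-in `re` module (episode 2) to pull named numeric values out of a line of unstructured output. The sample line is invented for illustration; simulation codes often print similar `name= value` pairs.

```python
import re

# A sample line such as a simulation code might print (invented for illustration).
line = "step= 100  energy= -1.2345e+02  temperature= 300.0"

# Capture each "name= value" pair, where the value is a number in plain
# or scientific notation (optional sign, optional decimal part, optional exponent).
pattern = re.compile(r"(\w+)=\s*(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")

# Build a dictionary mapping each name to its value as a float.
data = {name: float(value) for name, value in pattern.findall(line)}
print(data)  # {'step': 100.0, 'energy': -123.45, 'temperature': 300.0}
```

The same pattern applied over thousands of lines turns a raw log into structured data ready for the later episodes on binary formats and databases.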