This lesson is still being designed and assembled (Pre-Alpha version)

Interactive Scientific Computing

Introduction

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • What is Interactive Scientific Computing (ISC)?

  • Why use High Performance Computing (HPC) clusters for ISC?

Objectives
  • First learning objective. (FIXME)

The purpose of this lesson is to introduce you to Interactive Scientific Computing using Open OnDemand access to Thorny Flat, the latest HPC cluster at West Virginia University.

Scientific Computing is the use of computers to perform science. The term is very wide in scope: it includes doing calculations with computers, but also storing data, cleaning data, and producing more data as a result of computations. It includes the creation of plots and the use of computers to write scientific papers and research proposals, increasingly as a collaborative effort among several authors. In the experimental sciences, it also includes the use of computers to control equipment and gather data from it. In modern times, we cannot think about doing science without computers.

However, in this lesson we will use the term Scientific Computing in a more restrictive way: for the particular case of doing mostly numerical operations with computers. This is the kind of usage you probably know from a spreadsheet like Excel or specialized software like Matlab.
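As a minimal sketch of this narrower sense of scientific computing, the Python snippet below performs the kind of column arithmetic you might otherwise do in a spreadsheet. The temperature readings and variable names are invented for illustration and are not data from this lesson.

```python
import statistics

# Hypothetical measurements in degrees Celsius, made up for this example.
temperatures_c = [21.5, 22.0, 20.5, 23.0, 22.5]

# Convert each reading to Fahrenheit, as a per-row spreadsheet formula would.
temperatures_f = [c * 9 / 5 + 32 for c in temperatures_c]

# Summary statistics, similar to AVERAGE and STDEV in a spreadsheet.
mean_c = statistics.mean(temperatures_c)
stdev_c = statistics.stdev(temperatures_c)  # sample standard deviation

print(f"Mean: {mean_c:.2f} C")
print(f"Sample standard deviation: {stdev_c:.2f} C")
print("Fahrenheit:", [round(f, 1) for f in temperatures_f])
```

Only the Python standard library is used here, so the snippet should run in any Python 3 session without extra packages.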

Interactive Scientific Computing means using a computer in much the same way as you use a handheld calculator: you enter some input and expect it to be processed right away to produce results. Non-interactive computing, by contrast, refers to the way people typically interact with supercomputers. Supercomputers are large computing systems, usually built as clusters of individual computers, and in many cases they are used by tens or hundreds of users at the same time. As such, computations are programmed in advance: users submit jobs, expecting them to start running at some point in the future and to produce results that will be analyzed later on.
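The contrast between the two styles can be sketched with a toy example. This is only an illustration of the idea, not how a real HPC scheduler works: the `square` function and the `job_queue` are hypothetical stand-ins, and the queue here is strictly first-in, first-out, whereas real schedulers use more elaborate policies.

```python
from collections import deque


def square(x):
    """A stand-in for any computation."""
    return x * x


# Interactive style: the input is processed right away, like a calculator.
interactive_result = square(12)

# Non-interactive (batch) style: jobs are described now and executed later,
# whenever the toy scheduler gets to them.
job_queue = deque()
job_queue.append((square, 3))
job_queue.append((square, 4))

batch_results = []
while job_queue:
    func, arg = job_queue.popleft()  # take the oldest waiting job first
    batch_results.append(func(arg))

print("Interactive:", interactive_result)
print("Batch:", batch_results)
```

In the interactive case the result is available immediately after the call, while in the batch case nothing is computed until the queue is drained.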

The idea of using a supercomputer such as an HPC cluster can be intimidating for beginners. However, you may have strong motivations to start using this kind of machine. Your research may have scaled to a point where your desktop computer or laptop is no longer capable of handling the task. Or you may need specialized software packages, prefer not to spend time compiling or installing them, and want to rely on software that is already present on the HPC cluster.

To take advantage of WVU’s High Performance Computing cluster for interactive scientific computing, all you need is a web browser. In this lesson you will not have to learn Linux commands. You only need to execute one, to download the materials for the tutorials; beyond that, your interaction will take place in a friendly web interface. You do not have to manually submit jobs or edit submission scripts. These tasks are very important for HPC, but they are left for another lesson.

We will be using a web-based client portal that hides all that complexity and allows you to start using powerful computers for your research from a web interface, with minimal effort and a gentle learning curve.

Several technologies are involved here and it is important to understand how those different pieces are interconnected.

Open OnDemand is a web-based client, based on the Ohio Supercomputer Center’s proven “OSC OnDemand” platform, that enables HPC centers to install and deploy advanced web and graphical interfaces for their users. HPC resources are accessible from a web browser without the user having to install any special software or plugin.

Open OnDemand Welcome Page

Open OnDemand manages the creation of interactive sessions on HPC clusters and the activation of Interactive Apps. At this point, we have enabled two interactive apps: Jupyter Notebook and RStudio Server.

Interactive Apps

Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. Several languages are supported, but for the purpose of this lesson we will use Jupyter with two languages, Python and Julia.

Jupyter Welcome Page

RStudio is an integrated development environment (IDE) for R, a programming language for statistical computing and graphics. RStudio Server is an interactive app that runs as a server on the compute node and presents the RStudio interface in a web browser.

RStudio Welcome Page

These two apps will be used to introduce the fundamentals of three programming languages that are currently very popular for scientific computing.

Programming Languages

Python

Julia

R

Key Points

  • First key point. Brief Answer to questions. (FIXME)


Open OnDemand: Easy web access to HPC resources

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • What is Open OnDemand?

  • Creating the first interactive session

Objectives
  • First learning objective. (FIXME)

Open OnDemand is an open-source HPC portal that provides an easy way of getting web access to HPC resources. In particular, we will demonstrate the use of Open OnDemand to create a session using Jupyter or RStudio, two popular web applications for scientific computing.

Open OnDemand has been configured at WVU to access computational resources on Thorny Flat, the biggest cluster at WVU.

To access Open OnDemand you must be on the campus network. Try to access it in your browser:

https://ondemand-tf.hpc.wvu.edu

You should get a page asking for authentication, like below.

Central Authentication Service

If you are getting an error like:

Out of Campus Error

You are outside the campus network. You need a VPN to access resources that are only available on campus. Instructions to request access to WVU’s VPN can be found here:

https://wvu.teamdynamix.com/TDClient/1976/Portal/KB/ArticleDet?ID=100867

Once you have established a secure connection to on-campus resources, access the Open OnDemand portal again. You will be greeted with the authentication page. Enter your username and password. After that, you are directed to Duo Two-Factor Authentication, where you either send a Duo push to your mobile device or enter a passcode.

Duo Two-Factor Authentication

Once you have passed the authentication step, you will be greeted by the Open OnDemand Welcome Page.

Open OnDemand Welcome Page

Key Points

  • First key point. Brief Answer to questions. (FIXME)


Python: A powerful language with a rich collection of tools for scientific computing

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • Opening a Jupyter session and starting to learn Python

Objectives
  • First learning objective. (FIXME)

FIXME

Key Points

  • First key point. Brief Answer to questions. (FIXME)


Julia: The new language for Technical Computing

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • Opening a Jupyter session and starting to learn Julia

Objectives
  • First learning objective. (FIXME)

FIXME

Key Points

  • First key point. Brief Answer to questions. (FIXME)


R: Language for statistical computing and graphics.

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • Opening an RStudio session and starting to learn R

Objectives
  • First learning objective. (FIXME)

FIXME

Key Points

  • First key point. Brief Answer to questions. (FIXME)


Globus: Research Data Management Platform

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • Using the Globus web interface to transfer files in and out of the cluster

Objectives
  • First learning objective. (FIXME)

FIXME

Key Points

  • First key point. Brief Answer to questions. (FIXME)