Introduction
Overview
Teaching: 30 min
Exercises: 30 min
Questions
What is Interactive Scientific Computing (ISC)?
Why use High Performance Computing (HPC) clusters for ISC?
Objectives
First learning objective. (FIXME)
The purpose of this lesson is to introduce you to Interactive Scientific Computing using Open OnDemand access to Thorny Flat, the latest HPC cluster at West Virginia University.
Scientific Computing is the use of computers to do science. The term is very broad in scope: it includes performing calculations, but also storing data, cleaning data, and producing more data as the result of computations. It includes creating plots and writing scientific papers and research proposals, increasingly as a collaborative effort among several authors. In the experimental sciences, it also includes using computers to control equipment and gather data from it. Today, it is hard to imagine doing science without computers.
In this lesson, however, we use the term Scientific Computing in a more restrictive way: for the particular case of performing mostly numerical operations with computers. This is the kind of usage you are probably familiar with from a spreadsheet such as Excel or specialized software such as Matlab.
Interactive Scientific Computing means using a computer much as you would use a handheld calculator: you enter some input and expect it to be processed right away to produce results. Non-interactive computing, by contrast, is the way people typically work with supercomputers. Supercomputers are large computing systems, usually built as clusters of individual computers, and in many cases they are used by tens or hundreds of users at the same time. For that reason, computations are programmed in advance: users submit jobs, expecting them to start running at some point in the future and to produce results that will be analyzed later.
Using a supercomputer such as an HPC cluster can be intimidating for beginners. However, you may have strong motivations to start using this kind of machine. Your research may have scaled to the point where your desktop computer or laptop is no longer capable of handling the task. You may need specialized software packages and prefer not to spend time compiling or installing them, relying instead on software that is already present on the HPC cluster.
To take advantage of WVU’s High Performance Computing cluster for interactive scientific computing, all you need is a web browser. In this lesson you will not have to learn Linux commands. You only need to execute one command, to download the materials for the tutorials; beyond that, your interaction will take place in a friendly web interface. You do not have to manually submit jobs or edit submission scripts. These tasks are very important in HPC, but they are left for another lesson.
We will be using a web-based client portal that hides all that complexity and allows you to start using powerful computers for your research from a web interface, with minimal effort and a short learning curve.
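The calculator-like style described above can be illustrated with a few lines of Python. This is only a sketch: the measurement values are invented for illustration, and the same lines could be typed one at a time into any interactive Python session, each producing its result immediately.

```python
import statistics

# Hypothetical measurements, invented purely for illustration
measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Each statement is evaluated as soon as it is entered
mean = statistics.mean(measurements)      # arithmetic mean
spread = statistics.pstdev(measurements)  # population standard deviation

print(mean, spread)
```

In a non-interactive workflow, the same lines would be saved in a script and submitted as a job, and the printed values would only be read later from an output file.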
Several technologies are involved here and it is important to understand how those different pieces are interconnected.
Open OnDemand is a web-based client, based on the Ohio Supercomputer Center’s proven “OSC OnDemand” platform, that enables HPC centers to install and deploy advanced web and graphical interfaces for their users. HPC resources are accessible from a web browser, without the user having to install any special software or plugin.
Open OnDemand manages the creation of interactive sessions on HPC clusters and the activation of Interactive Apps. At this point, we have enabled two interactive apps: Jupyter Notebook and RStudio Server.
Interactive Apps
Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. Several languages are supported, but for the purpose of this lesson we will use Jupyter with two languages, Python and Julia.
RStudio is an integrated development environment (IDE) for R, a programming language for statistical computing and graphics. RStudio Server is an interactive app that runs as a server on the compute node and presents the RStudio interface in a web browser.
These two apps will be used to introduce the fundamentals of three programming languages that are currently very popular for scientific computing: Python, Julia, and R.
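As an illustration of the kind of live code a Jupyter Notebook cell can hold, the following sketch approximates the integral of sin(x) from 0 to π with the trapezoid rule. The function name `trapezoid` and the choice of 1000 subintervals are assumptions made for this example, not part of the lesson materials.

```python
import math

def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using n equal trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))   # the two endpoints count half
    for i in range(1, n):
        total += f(a + i * h)     # interior points count fully
    return total * h

# The exact value of this integral is 2
area = trapezoid(math.sin, 0.0, math.pi)
print(area)
```

In a notebook, changing `n` and re-running the cell shows right away how the approximation approaches the exact value.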
Programming Languages
Key Points
First key point. Brief Answer to questions. (FIXME)
Open OnDemand: Easy web access to HPC resources
Overview
Teaching: 30 min
Exercises: 30 min
Questions
What is Open OnDemand?
Creating the first interactive session
Objectives
First learning objective. (FIXME)
Open OnDemand is an open-source HPC portal that provides an easy way of getting web access to HPC resources. In particular, we will demonstrate how to use Open OnDemand to create a session using Jupyter or RStudio, two popular web applications for scientific computing.
Open OnDemand has been configured at WVU to access computational resources on Thorny Flat, the biggest cluster at WVU.
To access Open OnDemand you must be on the campus network. Try to open it in your browser:
https://ondemand-tf.hpc.wvu.edu
You should get a page asking for authentication.
If you get an error instead, you are outside the campus network and need a VPN to access resources available on campus. Instructions to request access to WVU’s VPN can be found here:
https://wvu.teamdynamix.com/TDClient/1976/Portal/KB/ArticleDet?ID=100867
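One way to tell whether the portal can be reached from your current network is to check whether the server answers at all. The following is a minimal sketch, not an official WVU tool: the helper name `portal_reachable`, its `opener` parameter, and the 5-second timeout are assumptions made for illustration; only the portal URL comes from this lesson.

```python
import urllib.error
import urllib.request

PORTAL_URL = "https://ondemand-tf.hpc.wvu.edu"  # portal URL given in this lesson

def portal_reachable(url=PORTAL_URL, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the server answers at all, False if the connection fails.

    `opener` is a hypothetical hook that defaults to urllib.request.urlopen;
    it exists only so the function can be tried without network access.
    """
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an HTTP error status (for example, a
        # request for authentication), so the portal itself is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout: possibly a sign that
        # you are outside the campus network and need the VPN.
        return False
```

Calling `portal_reachable()` with no arguments attempts a real connection. A result of `False` only suggests a network problem; the browser remains the definitive check.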
Once you have established a secure connection to on-campus resources, access the Open OnDemand portal again. You will be greeted with the authentication page. Enter your username and password. After that you are directed to the Duo Two-Factor Authentication page, where you either send a Duo push to your mobile device or enter a passcode.
Once you have passed the authentication step, you will be greeted by the Open OnDemand Welcome Page.
Key Points
First key point. Brief Answer to questions. (FIXME)
Python: A powerful language with a rich collection of tools for scientific computing
Overview
Teaching: 30 min
Exercises: 30 min
Questions
Opening a Jupyter session and starting to learn Python
Objectives
First learning objective. (FIXME)
FIXME
Key Points
First key point. Brief Answer to questions. (FIXME)
Julia: The new language for Technical Computing
Overview
Teaching: 30 min
Exercises: 30 min
Questions
Opening a Jupyter session and starting to learn Julia
Objectives
First learning objective. (FIXME)
FIXME
Key Points
First key point. Brief Answer to questions. (FIXME)
R: Language for statistical computing and graphics
Overview
Teaching: 30 min
Exercises: 30 min
Questions
Opening an RStudio session and starting to learn R
Objectives
First learning objective. (FIXME)
FIXME
Key Points
First key point. Brief Answer to questions. (FIXME)
Globus: Research Data Management Platform
Overview
Teaching: 30 min
Exercises: 30 min
Questions
Using the Globus web interface to transfer files in and out of the cluster
Objectives
First learning objective. (FIXME)
FIXME
Key Points
First key point. Brief Answer to questions. (FIXME)