Software Setup
This class will use Jupyter Notebooks, which you will need to read the lecture notes and work on the assignments. This document provides instructions for setting up your environment for running and developing programs in Jupyter Notebooks.
We recommend one of four options:
- Local setup using Miniconda
- Local setup using Docker
- Remote setup using VS Code on CS Lab Computers
- Cloud setup using Google Colab
1) Local Setup using Miniconda
Some assignments will require access to a GPU. We recommend that you only take this route if you have an NVIDIA GPU in your local device or plan on using another setup method listed below (i.e., Google Colab or Remote Access to Lab Computers) for assignments requiring GPUs. Non-NVIDIA GPUs can be used, but they require different setup instructions.
Miniconda is a free, miniature installation of Anaconda Distribution that includes only conda, Python, the packages they both depend on, and a small number of other useful packages. We recommend using the Miniconda installation over the Anaconda installation for those more familiar with using the terminal to access their environments and files. GTAs will also be better able to troubleshoot and diagnose issues because they are more familiar with it.
Click the following link and download the Miniconda installer. Ensure you select the download compatible with your OS (Windows/Mac/Linux): https://www.anaconda.com/download/success
Once downloaded, run the installer and follow the setup wizard instructions. The install options given depend on personal preference. The default installation options are recommended for those who are unfamiliar with the software.
Restart your computer once the installation is complete. After installation, open the terminal and type:
conda --version
This will verify if the installation is successful.
It is important to use environments for a number of reasons, such as avoiding version conflicts, protecting your system installation, and ensuring reproducibility of model performance.
To create a virtual environment, we will use the following command within the terminal:
conda create --name <preferred_name> python=3.10
Replace <preferred_name> with a name of your choice.
Once created, activate the environment using the command:
conda activate <preferred_name>
This will activate the created environment. You will know you were successful if the environment name is displayed on the left of the active terminal line.
Additionally, the environment can be deactivated using the command:
conda deactivate
It is important that you activate this environment any time you work on an assignment.
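As a quick sanity check before starting an assignment, a short Python sketch like the one below can report which conda environment is active and whether the interpreter matches the Python 3.10 version requested when the environment was created. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets on activation. The function name describe_environment is illustrative only and is not part of the course materials.

```python
import os
import sys


def describe_environment(environ=None, version_info=None):
    """Return (environment name or None, True if Python is 3.10)."""
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    env_name = environ.get("CONDA_DEFAULT_ENV")
    is_310 = tuple(version_info[:2]) == (3, 10)
    return env_name, is_310


name, is_310 = describe_environment()
print("Active conda environment:", name or "none detected")
print("Python version is 3.10:", is_310)
```

If no environment name is reported, the environment was most likely not activated in the terminal that launched Python.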
The libraries we will be using consistently throughout this course include NumPy, Matplotlib, pandas, and scikit-learn. These, along with ipykernel, can be easily installed using the following command in the terminal:
conda install -y ipykernel numpy pandas scikit-learn matplotlib
We will also need to install PyTorch, which is an open-source ML framework that is frequently used for building and training neural networks.
Ensure you do not have CUDA or PyTorch installed through pip prior to this step. Having multiple installations on the same device can cause a plethora of issues that can be hard to diagnose. If you already have a CUDA and PyTorch installation, skip this step for now and finish the rest of the instructions. Then, check to see if PyTorch is working in a notebook. If not, uninstall any versions of CUDA and PyTorch you may have, then use the command below.
This package will be a larger download and may take some time depending on the connected network. To install, use the command:
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
You will also need the d2l module that was developed to encapsulate frequently used functions and classes found throughout these notebooks. This file is available on GitHub. You can download it from there, or use one of the following Linux commands to download it from the command line:
curl -L -O https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
or
wget https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
Next, students should decide whether they want to open their notes and assignments in Jupyter Notebooks or in Visual Studio Code. Feel free to set up both if you are unsure which platform you would like to use or if you like having options.
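On systems where neither curl nor wget is available, the d2l.py download described above can also be sketched in Python using only the standard library. This is a minimal sketch, not the course's official method: the function name download_file and its parameters are assumptions made for illustration, while the URL is the one given in the commands above.

```python
import urllib.request
from pathlib import Path

# URL taken from the curl/wget commands in this document.
D2L_URL = "https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py"


def download_file(url=D2L_URL, dest="d2l.py"):
    """Download url to dest and return the resulting path."""
    dest_path = Path(dest)
    # urlopen follows HTTP redirects by default, similar to curl -L.
    with urllib.request.urlopen(url) as response:
        dest_path.write_bytes(response.read())
    return dest_path


# Uncomment to download d2l.py into the current directory:
# print(download_file())
```

Run it from the directory that holds your notebooks so that the notebooks can import d2l.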
Jupyter Notebooks
Install Jupyter Notebooks using the command:
conda install notebook
With the conda environment active, navigate to the directory where you will be storing the notebook files, and use the command:
jupyter notebook
Running the command should start a local server and open your web browser automatically.
If this does not happen, you can copy the localhost URL displayed in the terminal and paste it into your preferred browser.
You should be able to see all the files within your directory listed within the webpage.
Open and run a notebook to ensure all packages and extensions are properly installed and working. If possible, run a notebook provided by the course that uses PyTorch.
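If no course notebook is at hand, a cell along the following lines can serve as a rough check that the packages installed earlier are visible to the notebook. It is a sketch under a few assumptions: the helper names are made up for illustration, sklearn is the import name of scikit-learn, and d2l is only found if d2l.py sits in the notebook's directory.

```python
import importlib.util

# Import names of the packages installed in the earlier steps.
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "torch", "d2l"]


def missing_modules(names):
    """Return the subset of names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def gpu_status():
    """Describe whether PyTorch is installed and can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch  # imported lazily so the check works without PyTorch

    if torch.cuda.is_available():
        return "CUDA GPU available: " + torch.cuda.get_device_name(0)
    return "PyTorch is installed, but no CUDA GPU is visible"


print("Missing modules:", missing_modules(REQUIRED) or "none")
print(gpu_status())
```

Any module listed as missing can be installed into the active environment and the cell re-run.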
Visual Studio Code
Download the VS Code installer if you do not already have it downloaded: https://code.visualstudio.com/download
Once downloaded, run the installer and ensure the options “Add to PATH (requires shell restart)” and “Register Code as an editor for supported file types” are checked.
When installed, restart your computer. Once restarted, open the terminal and activate your virtual environment. Navigate to the directory with your files, then type the command:
code .
Alternatively, students can open VS Code by clicking the application and opening their preferred folder within the VS Code application.
Visual Studio Code will open with all of your files. On the left-hand side, select the extensions tab. Install the extensions “Python”, “Python Debugger”, and “Jupyter”. All three will have Microsoft listed as the author/publisher.
After the extensions are installed, press Ctrl + Shift + P (Cmd + Shift + P on macOS) and a drop-down menu should appear at the top of the screen, below the search bar. Choose “Python: Select Interpreter”, then choose the interpreter in the list that contains your environment name.
Open a Jupyter Notebook from the explorer. The explorer can be found in the first tab located in the left-hand navigation bar.
In the top right-hand corner of the notebook, click “Select Kernel > Python Environments” and select the environment created in earlier steps.
Run the notebook to ensure all packages and extensions are properly installed and working.
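A common problem at this stage is that the notebook runs on a different interpreter than the one selected. The sketch below, meant to be run in a notebook cell, prints the interpreter in use and checks whether its prefix directory is named after your environment, which is how conda lays out named environments. The name cs445 is a placeholder for whatever you chose as <preferred_name>, and kernel_uses_env is a hypothetical helper.

```python
import sys
from pathlib import Path


def kernel_uses_env(env_name, prefix=None):
    """Return True if the interpreter prefix is a directory named env_name."""
    prefix = sys.prefix if prefix is None else prefix
    # A named conda environment typically lives in .../envs/<env_name>.
    return Path(prefix).name == env_name


print("Interpreter:", sys.executable)
print("Environment prefix:", sys.prefix)
# "cs445" is a placeholder; substitute your own environment name.
print("Matches 'cs445':", kernel_uses_env("cs445"))
```

If the last line prints False, reselect the kernel from the top right-hand corner of the notebook.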
2) Local Setup using Docker
Some assignments will require access to a GPU. We recommend that you only take this route if you have an NVIDIA GPU in your local device or plan on using another setup method listed below (i.e., Google Colab or Remote Access to Lab Computers) for assignments requiring GPUs. Non-NVIDIA GPUs can be used, but they require different setup instructions.
We can use Docker to set up the development environment. Please follow the instructions to set things up on your local machines.
- Download and install Docker Desktop
- Start Docker Desktop on your machine
- Create a directory where all the Jupyter notebooks for this course will live. For this case, we use cs445.
- Copy this Dockerfile in directory cs445.
- Open a terminal window and go to the directory cs445. Then run:
docker build -t cs445 .
This will create a Docker image named cs445. This Docker image will have all the relevant software installed. You only have to do this the first time.
- Start a Docker container using this image by running the following command in the terminal and under the same directory:
docker run -d --rm --name cs445-student --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8012:8012 -v <full path to cs445 directory>:/home/cs445 cs445
- Open a browser and go to http://127.0.0.1:8012/. You should see the Jupyter interface with a list of all the files and directories in the cs445 directory. Open the notebook that you would like to read or edit. If you edit, make sure to save the notebook.
- To exit, close the browser. To stop the Docker container, first, run the following command in the terminal window:
docker container ls
This should list all running containers along with their CONTAINER IDs. To stop a container, use the command:
docker container stop <CONTAINER ID>
You will need the d2l module that was developed to encapsulate frequently used functions and classes found throughout these notebooks. This file is available on GitHub. You can download it from there, or use one of the following Linux commands to download it from the command line:
curl -L -O https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
or
wget https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
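The docker run command in the steps above needs the full path to your cs445 directory in its -v option. One way to avoid typing that path by hand is to let Python resolve it, as in the sketch below. The flags are copied from the command in this section. The function docker_run_command and its parameters are assumptions made for illustration, and the printed line uses POSIX shell quoting, so it may need adjusting in Windows shells.

```python
import shlex
from pathlib import Path


def docker_run_command(notebook_dir="cs445", image="cs445", port=8012):
    """Build the docker run argument list with an absolute host path."""
    host_dir = Path(notebook_dir).expanduser().resolve()
    return [
        "docker", "run", "-d", "--rm",
        "--name", "cs445-student",
        "--shm-size=1g",
        "--ulimit", "memlock=-1",
        "--ulimit", "stack=67108864",
        "-p", f"{port}:{port}",
        # Mount the host notebook directory at /home/cs445 in the container.
        "-v", f"{host_dir}:/home/cs445",
        image,
    ]


# shlex.join applies POSIX shell quoting, e.g. for paths containing spaces.
print(shlex.join(docker_run_command()))
```

Running this from the parent of the cs445 directory prints a command that can be pasted into the terminal.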
3) Remote Setup using VS Code on CS Lab Computers
Visual Studio Code (or VS Code) is an IDE that allows you to remote into the CS Lab Computers using SSH. In this use case, Visual Studio Code is installed and running on your personal machine (laptop or desktop). You write and edit code on your laptop but connect to the CS machines to run the code. The connection is handled by a proprietary SSH extension, which you will need to install on your laptop. Here are the instructions on how to remote into one of the machines. You will need to use CSU’s GlobalProtect VPN.
Install VS Code. For installation instructions, refer to the Visual Studio Code instructions under Local Setup using Miniconda above.
Remote into one of the CS Lab computers using VS Code (for this case, we will use frankfort.cs.colostate.edu).
A list of lab terminals can be found in the following link. We highly recommend choosing a terminal that has a GPU: https://www.cs.colostate.edu/machinestats/?column=hostname&order=asc
If this is your first time connecting remotely to these machines, VS Code may ask you to choose the platform/OS. Choose Linux.
Create a folder in your Linux account where all the Jupyter notebooks for this course will live. For this case, we use cs445.
mkdir cs445
Install the Jupyter extension from Microsoft on your VS Code.
Open the interactive terminal in VS Code and run the command:
module load python/anaconda/py3.10-2023.03
Open and run a Jupyter notebook on VS Code. When asked to choose a Python version, use:
/usr/local/anaconda3/2023.03/bin/python
Install any packages that might be missing using
pip install <package-name>
To stop using the module, run
module purge
You will also need the d2l module that was developed to encapsulate frequently used functions and classes found throughout these notebooks. This file is available on GitHub. You can download it from there, or use one of the following Linux commands to download it from the command line:
curl -L -O https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
or
wget https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
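If VS Code fails to connect to a lab machine, it can help to check whether the machine is reachable at all, since an inactive VPN connection is a likely cause. The sketch below tries to open a TCP connection from your laptop. It assumes SSH listens on the default port 22, which this document does not state, and the helper name can_reach is hypothetical.

```python
import socket


def can_reach(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name lookups.
        return False


# Example (needs network access, so it is left commented out):
# print(can_reach("frankfort.cs.colostate.edu"))
```

A result of False while the VPN is connected suggests trying a different machine from the lab list.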
4) Cloud Setup using Google Colab
Google Colab is a free, cloud-based service provided by Google that allows users to write and execute Python code in their web browser. It is based on the open-source Jupyter Notebook environment and is most often used for machine learning, data analysis, and education. It is an essential resource for those who do not have GPU access. Students will need a Google account to access Google Colab.
Navigate to Google Colab and sign in with your Google account: https://colab.research.google.com/
Once signed in, click the “+ New” button in the upper left corner and then select “File Upload” in the dropdown menu to select the file you want to upload.
Once selected, the file should appear on the Google Colab home screen. Double-click the file to open it.
Navigate to Runtime > Change runtime type, ensure the runtime type is Python 3 and that T4 GPU is selected as the Hardware accelerator, then click Save.
To run uploaded files in full, click the “Run all” button at the top of the notebook. If you wish to run one cell at a time, a circle play button can be found in the top right corner of each cell.
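To confirm that the GPU runtime selection took effect, a cell like the following can query the nvidia-smi tool, which is normally present on machines with an NVIDIA driver. This is a sketch rather than an official Colab check: gpu_summary is a made-up helper, and it returns None when the tool is missing or fails.

```python
import shutil
import subprocess


def gpu_summary():
    """Return nvidia-smi's GPU name and total memory, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


print(gpu_summary() or "No NVIDIA GPU detected; check Runtime > Change runtime type")
```

The same cell also works on a local machine or a lab computer with an NVIDIA GPU.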
You will need the d2l module that was developed to encapsulate frequently used functions and classes found throughout these notebooks. This file is available on GitHub. You can download it from there, or use one of the following Linux commands to download it from the command line:
curl -L -O https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
or
wget https://github.com/asabenhur/CS545/raw/refs/heads/master/d2l.py
Be sure to upload this file to your Google Colab account.
To install specific packages, terminal commands can be run within a notebook cell by prefixing them with an exclamation mark. For example, the command to install numpy is:
!pip install numpy
If none of the options work for you, come see a TA during office hours.