Introduction to Jupyter Notebooks

This first notebook will introduce you to Google Colab/Jupyter Notebooks, a handy coding environment for learning as well as for sharing code with others.

By the end of this notebook, you’ll be able to:

  • Recognize the main features of Jupyter Notebooks

  • Use notebooks to run Python code

  • Identify and edit simple Markdown code

  • Create a list and a NumPy array

  • Use matplotlib to create line and scatter plots

Saving your work

To save a copy of this notebook for yourself in Google Colab, go to File > Save a copy in Drive. These files will show up in a folder called “Colab Notebooks” in your Drive. Otherwise, your changes here will not be saved.


Part I. About Jupyter Notebooks

Jupyter notebooks are a way to combine executable code, code outputs, and text into one connected file. They run in a web browser, but don’t require an internet connection (unless you’re running them on Colab or a JupyterHub).

The “kernel” is the thing that executes your code. It is what connects the notebook (as you see it) with the part of your computer, or the DataHub computers, that runs code.

Types of Cells

Jupyter Notebooks have two types of cells: Markdown (like this one) and Code. Most of the time you won’t need to run the Markdown cells, just read through them. However, when you get to a code cell, you need to tell Jupyter to run the lines of code that it contains.

Code cells will be read by the Python interpreter. In other words, the Python kernel will run whatever it recognizes as code within the cell.

Task: Run the cell below! You can run a cell (this box) by pressing shift-enter or by pressing the run arrow.

# In Python, anything with a "#" in front of it is code annotation,
# and is not read by the computer.
# 
# Click in this cell and then press shift and enter simultaneously.
# The print function below generates a message.
print('Nice work!')
Nice work!

Part II. Using Markdown

Markdown is useful because it can be formatted using simple symbols. Here’s a full cheatsheet of markdown for more tips, but the main syntax is below:

  • You can create bulleted lists using asterisks.

  • Similarly, you can create numbered lists using numbers.

  • You can bold with two asterisks or underscores on either side (**bold**) or italicize with one asterisk or underscore (*italicize*)

  • Pound signs (#) create headers. More pound signs means a smaller header.

Task: Edit the markdown cell below with a quick biography of yourself. You should have your name as a big header, a short quippy subtitle for yourself as a smaller header, and three bullet points that use both bold and italic.

Edit this markdown cell!

Part III: Basic Python Syntax

Like any language, Python follows a set of rules, known as the language syntax. Below, we’ll see how to write expressions in Python, create variables, and then manipulate those variables.

Python Expressions

We can perform various arithmetic operations in Python:

Symbol   Operation        Usage
+        Addition         10+2
-        Subtraction      10-2
*        Multiplication   10*2
/        Division         10/2
**       Exponent         10**2
%        Modulo           10%2

Notes:

  • The default order of operations is the same as in mathematics! (PEMDAS)

  • If you want a whole number from your division, use // instead
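As a quick illustration of the operators and notes above, here is a minimal sketch. The variable names and the numbers are ours, chosen only for this example, and it deliberately leaves 10%2 for you to predict.

```python
# Illustrative values only; these names are not part of the original notebook.
true_division = 10 / 4     # / always gives a decimal (float) result: 2.5
floor_division = 10 // 4   # // rounds down to a whole number: 2
remainder = 10 % 4         # % gives what is left over after dividing: 2
power = 10 ** 2            # ** raises to a power: 100

# Multiplication happens before addition unless parentheses say otherwise
no_parens = 2 + 3 * 4      # 14
with_parens = (2 + 3) * 4  # 20

print(true_division, floor_division, remainder, power)
print(no_parens, with_parens)
```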

Task: Try each of the operators above. Before using the modulo operator, predict what it will output, and make sure it produces what you expect. You can use print() statements to see the outcome of multiple lines of code.

# Let's play with numbers weee

Variables

Variables enable us to store a value and come back to it later. They are defined with name = value. Here the equals sign means assignment, which is not the same thing as equality in mathematics. Instead, think of it as storing a value in a jar (the variable).
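To see why assignment differs from equality, consider the short sketch below. The variable name jar is hypothetical and used only for illustration. Read as a math equation, the second line would be impossible; read as "store a new value in the jar", it works fine.

```python
# Hypothetical variable name, used only for illustration.
jar = 5          # store the value 5 in the variable jar
jar = jar + 1    # read the current value (5), add 1, and store the result back
print(jar)       # prints 6
```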

Assigning variables

Task: Create two variables: a & b. Then, use an expression that combines a and b, and assign this to c. In the end, c should be equal to 6.

# Complete the task above here
a = ...

Note: Most code cells will not give you an output unless you ask for it. You can use print( ) to output a variable or string. However, if the last line of a cell is just a variable name (or another expression), the notebook will display its value.

You can also run your cells out of order, which is useful for testing and debugging.

Task: Change your equation for c in the cell above and then use print(c) to check its value. For example, if you originally multiplied two times three, change the equation to two plus three.

Part IV: Generate some plots

Step 1. Import packages

We can take advantage of pre-packaged code for many common functions in Python. But first, we need to tell Python to import it. This is a really common step for most Python code.

Below, we’ll import a package called “matplotlib.pyplot” and nickname it “plt.” (In Step 4, we’ll also import “numpy” and nickname it “np.”)

Task: Add as plt after import matplotlib.pyplot to instruct Python to import the Matplotlib pyplot package but nickname it plt. When you see plt in our code, it’s actually calling functions from this package.

After you import it, we’ll use %whos to show all of the variables in the namespace, including our packages. Having printed messages like these is a really nice way to check that your cell actually ran, and that your packages are available.

# Add your code here
import matplotlib.pyplot

%whos
Variable     Type        Data/Info
----------------------------------
a            ellipsis    Ellipsis
matplotlib   module      <module 'matplotlib' from<...>/matplotlib/__init__.py'>
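The nickname pattern from Step 1 works for any module, not just Matplotlib. As a small sketch, the example below uses the standard library's statistics module as a stand-in; the nickname st and the values are our own choices for illustration.

```python
# The standard library's statistics module stands in here for any package.
import statistics as st    # "st" is a nickname chosen for this example

values = [2, 4, 6]         # made-up numbers
average = st.mean(values)  # same as calling statistics.mean(values)
print(average)             # prints 4
```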

Step 2: Enter some data to plot

Lists are a type of data structure in Python. They are defined using brackets [ ] with individual items separated by commas. For example, if you wanted to create a list, you could write:

my_list = [1,3,2,5]
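Once a list exists, you can inspect it and add to it. The sketch below is only an illustration using the example list above; the operations shown (len, indexing with square brackets, and append) are standard Python and are not required for the tasks in this notebook.

```python
my_list = [1, 3, 2, 5]

print(len(my_list))   # number of items in the list: 4
print(my_list[0])     # first item; Python counts positions from 0
print(my_list[-1])    # last item

my_list.append(8)     # lists can grow after they are created
print(my_list)        # prints [1, 3, 2, 5, 8]
```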

Task: Below, create a list of 10 numbers of your choosing for Python to plot. Assign it to random_list.

# Replace the ... with your list of numbers. Don't forget brackets!
random_list = ...

Task: Let’s make sure Python did what we wanted it to do – store our random list as a variable called random_list. Check by typing the variable name into the next cell, and running it. If that worked, you should see a list of values.

Step 3: Plot your data

Let’s pretend this is data from an awesome experiment we ran. We need to plot the data. Remember that we imported matplotlib.pyplot as plt. We can now use the plt.plot() function to plot our random list. plot is matplotlib pyplot’s basic plotting function.

Task: In the cell below, run plt.plot(), giving it just one argument: random_list. In the following line, use plt.show() (without any arguments) to cleanly show the plot.

Tasks:

  1. Add axis labels to your plot by adding plt.xlabel('yourxlabelhere') and plt.ylabel('yourylabel'). Add these lines before plt.show().

  2. Add some markers to your line! Do this by adding an additional argument to plt.plot(random_list), so that it says plt.plot(random_list, marker="o"). You can use various markers of your choosing!
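Putting the pieces of Step 3 together might look like the sketch below. The data and the axis label text are made up for illustration, so swap in your own random_list and labels.

```python
import matplotlib.pyplot as plt

# Made-up values standing in for your own random_list
example_list = [4, 8, 1, 9, 3]

# With a single argument, the values go on the y axis
# and the x axis is the position of each value (0, 1, 2, ...)
lines = plt.plot(example_list, marker="o")

plt.xlabel("Trial number")   # hypothetical label text
plt.ylabel("Measurement")    # hypothetical label text
plt.show()
```

Other marker strings such as "s" (square) and "^" (triangle) are also accepted by plt.plot.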

Step 4: Add some more data

Let’s create another random list and create a scatterplot of the data. To do so, we’ll take a couple of steps.

  1. Import another package, called numpy (numerical python). The convention is to import this as np.

  2. Convert our list into a numpy array. This will allow us to perform numerical operations on it.

  3. Task: Create a second array named random_array_2 where each value is twice the corresponding value in random_array. You can do this with the same operators as above. For example, if we wanted to divide all of the values in random_array by two, we would use random_array/2.

# 1 - Import numpy
import numpy as np

# 2 - Convert random_list to an array
random_array = np.array(random_list)

# 3 - Create random_array_2 below
random_array_2 = ... 
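Why convert the list to an array at all? The sketch below, using made-up values and names of our own, shows that arithmetic operators act on every value of a NumPy array, whereas the same operator can mean something different for a plain list.

```python
import numpy as np

example_list = [1, 2, 3]                 # made-up values for illustration
example_array = np.array(example_list)

# Multiplying a list repeats it instead of doing arithmetic
print(example_list * 2)     # prints [1, 2, 3, 1, 2, 3]

# Operators on an array are applied to each value separately
halved = example_array / 2   # values become 0.5, 1.0, 1.5
shifted = example_array + 10 # values become 11, 12, 13
print(halved)
print(shifted)
```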

Task: In the box below, make a scatterplot of your data using matplotlib’s scatter function. You should plot random_array on the x axis, and random_array_2 on the y axis. Be sure to label your axes as well. You can find documentation on the scatter function here or here (for more fun).
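If you are unsure how plt.scatter is called, here is a minimal sketch with made-up arrays rather than your own data. The names x_values and y_values are hypothetical, and the relationship between them is different from the one in the task.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up arrays standing in for random_array and random_array_2
x_values = np.array([1, 2, 3, 4])
y_values = x_values + 5

# The first argument goes on the x axis, the second on the y axis
points = plt.scatter(x_values, y_values)

plt.xlabel("x_values")
plt.ylabel("y_values")
plt.show()
```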

Step 5: Celebrate

That’s the Jupyter Notebook tutorial! You’re ready to tackle more complex notebooks.

from IPython.display import HTML
HTML('<img src="https://media.tenor.com/images/99cff34bdcb675975b2b0cc661f2e4ce/tenor.gif">')

Resources

For additional Jupyter Notebook information and practice, see this tutorial from DataQuest.

About this Notebook

This notebook was created by Ashley Juavinett for classes at UC San Diego.