Introduction to Jupyter Notebooks#
This first notebook will introduce you to Google Colab/Jupyter Notebooks, a handy coding environment for learning as well as for sharing code with others.
At the end of this notebook, you’ll be able to:#
Recognize the main features of Jupyter Notebooks
Use notebooks to run Python code
Identify and edit simple Markdown code
Create a list and a NumPy array
Use matplotlib to create line and scatter plots
Saving your work#
To save a copy of this notebook for yourself in Google Colab, go to File > Save a copy in Drive. These files will show up in a folder called “Colab Notebooks” in your Drive. Otherwise, your changes here will not be saved.
Part I. About Jupyter Notebooks#
Jupyter notebooks are a way to combine executable code, code outputs, and text into one connected file. They run in a web browser, but don’t require the internet (unless you’re running them on Colab or a JupyterHub).
The ’kernel’ is the thing that executes your code. It is what connects the notebook (as you see it) with the part of your computer, or the DataHub computers, that runs code.
Types of Cells#
Jupyter Notebooks have two types of cells: Markdown (like this one) and Code. Most of the time you won’t need to run the Markdown cells, just read through them. However, when we get to a code cell, you need to tell Jupyter to run the lines of code that it contains.
Code cells will be read by the Python interpreter. In other words, the Python kernel will run whatever it recognizes as code within the cell.
Task: Run the cell below! You can run a cell (this box) by pressing shift-enter or by pressing the run arrow.
# In Python, anything with a "#" in front of it is code annotation,
# and is not read by the computer.
#
# Click in this cell and then press shift and enter simultaneously.
# The print function below generates a message.
print('Nice work!')
Nice work!
Part II. Using Markdown#
Markdown is useful because it can be formatted using simple symbols. Here’s a full cheatsheet of markdown for more tips, but the main syntax is below:
You can create bulleted lists using asterisks.
Similarly, you can create numbered lists using numbers.
You can bold with two asterisks or underscores on either side (**bold**) or italicize with one asterisk or underscore (*italicize*).
Pound signs (#) create headers. More pound signs means a smaller header.
Task: Edit the markdown cell below with a quick biography of yourself. You should have your name as a big header, a short quippy subtitle for yourself as a smaller header, and three bullet points that use both bold and italic.
Edit this markdown cell!
Part III: Basic Python Syntax#
Like any language, Python follows a set of rules, known as the language syntax. Below, we’ll see how to write expressions in Python, create variables, and then manipulate those variables.
Python Expressions#
We can perform various arithmetic operations in Python:
Symbol | Operation | Usage |
---|---|---|
+ | Addition | 10+2 |
- | Subtraction | 10-2 |
* | Multiplication | 10*2 |
/ | Division | 10/2 |
** | Exponent | 10**2 |
% | Modulo | 10%2 |
Notes:
The default order of operations is the same as in mathematics! (PEMDAS)
If you want a whole number from your division, use // instead
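As a rough illustration of the table and notes above, here is a minimal sketch using the numbers 7 and 2. The variable names `x` and `y` are made up for this example and are not part of the task below.

```python
# Hypothetical example values, chosen only for illustration
x = 7
y = 2

print(x + y)   # addition: 9
print(x - y)   # subtraction: 5
print(x * y)   # multiplication: 14
print(x / y)   # division always gives a decimal (float): 3.5
print(x // y)  # whole-number (floor) division: 3
print(x ** y)  # exponent: 49
print(x % y)   # modulo, the remainder after division: 1

# Order of operations follows PEMDAS
print(2 + 3 * 4)    # multiplication happens first: 14
print((2 + 3) * 4)  # parentheses happen first: 20
```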
Task: Try each of the operators above. Before using the modulo operator, predict what it will output, and make sure it produces what you expect. You can use print() statements to see the outcome of multiple lines of code.
# Let's play with numbers weee
Variables#
Variables enable us to store a value and come back to it later. They are defined with name = value. Assignment is not the same thing as equality in mathematics. Instead, think of this as storing a value in a jar (the variable).
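To make the difference between assignment and equality concrete, here is a small sketch. The name `jar` is hypothetical and simply echoes the analogy above. The `==` comparison operator is not covered in this notebook, and appears here only to contrast with `=`.

```python
# Hypothetical variable name, chosen to echo the jar analogy
jar = 10         # store the value 10 in the variable jar
jar = jar + 5    # read the current value, add 5, and store the result back in jar

print(jar)       # 15
print(jar == 15) # == asks "are these equal?" and gives True
```

The second line would make no sense as a mathematical equation, but as an instruction to update what is stored in the jar it works fine.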
Assigning variables#
Task: Create two variables: a & b. Then, use an expression that combines a and b, and assign this to c. In the end, c should be equal to 6.
# Complete the task above here
a = ...
Note: Most code cells will not give you an output unless you ask for it. You can use print() to output a variable or string. However, if the last line of a cell is just a variable name, the notebook will display its value.
You can also run your cells out of order, which is useful for testing and debugging.
Task: Change your equation for c in the cell above and then use print(c) to check its value. For example, if you originally multiplied two times three, change the equation to two plus three.
Part IV: Generate some plots#
Step 1. Import packages#
We can take advantage of pre-packaged code for many common functions in Python. But first, we need to tell Python to import it. This is a really common step for most Python code.
Below, we’ll import a package called “numpy” and nickname it “np” and “matplotlib.pyplot” and nickname it “plt.”
Task: Add as plt after import matplotlib.pyplot to instruct Python to import the Matplotlib pyplot package but nickname it plt. When you see plt in our script, it’s actually calling functions from this package.
After you import it, we’ll use %whos to show all of the variables in the namespace, including our packages. Having printed messages like these is a really nice way to check that your cell actually ran, and that your packages are available.
# Add your code here
import matplotlib.pyplot
%whos
Variable Type Data/Info
----------------------------------
a ellipsis Ellipsis
matplotlib module <module 'matplotlib' from<...>/matplotlib/__init__.py'>
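The import-with-a-nickname pattern works the same way for any package. Here is a minimal sketch using Python's built-in `math` module, which was picked only because it needs no installation. The nickname `m` and the variable `root` are made up for this example.

```python
# Import the standard-library math module and nickname it "m"
# (the nickname is arbitrary and chosen just for illustration)
import math as m

# Functions from the package are now reached through the nickname
root = m.sqrt(16)
print(root)  # 4.0
```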
Step 2: Enter some data to plot#
Lists are a type of data structure in Python. They are defined using brackets [ ] with individual items separated by commas. For example, if you wanted to create a list, you could write:
my_list = [1,3,2,5]
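Once a list exists, you can ask Python about it. The sketch below uses a hypothetical list called `example_list` and two features this notebook does not otherwise cover: `len()` and indexing with square brackets. They are shown only as a quick way to inspect a list.

```python
# Hypothetical list, separate from the random_list you will create below
example_list = [1, 3, 2, 5]

print(example_list)       # the whole list: [1, 3, 2, 5]
print(len(example_list))  # how many items it holds: 4
print(example_list[0])    # the first item (Python counts from zero): 1
```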
Task: Below, generate a random list of 10 numbers for Python to plot. Assign it to random_list.
# Replace the ... with your list of numbers. Don't forget brackets!
random_list = ...
Task: Let’s make sure Python did what we wanted it to do – store our random list as a variable called random_list. Check by typing the variable name into the next cell, and running it. If that worked, you should see a list of values.
Step 3: Plot your data#
Let’s pretend this is data from an awesome experiment we ran. We need to plot the data. Remember that we imported matplotlib.pyplot as plt. We can now use the plt.plot() function to plot our random list. plot is matplotlib pyplot’s basic plotting function.
Task: In the cell below, run plt.plot(), giving it just one argument: random_list. In the following line, use plt.show() (without any arguments) to cleanly show the plot.
Tasks:
Add axis labels to your plot by adding plt.xlabel('yourxlabelhere') and plt.ylabel('yourylabel'). Add those above, before plt.show().
Add some markers to your line! Do this by adding an additional argument to plt.plot(random_list), so that it says plt.plot(random_list, marker="o"). You can add various markers of your choosing!
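Putting the plotting steps above together, a finished cell might look roughly like the sketch below. The data in `example_values` and the label text are invented for this example. Swap in your own random_list and labels.

```python
import matplotlib.pyplot as plt

# Hypothetical data, invented for this example
example_values = [3, 7, 1, 9, 4]

# Plot the values as a line with a circle marker at each point
plt.plot(example_values, marker="o")

# Label the axes before showing the plot
plt.xlabel("Trial number")
plt.ylabel("Measured value")

plt.show()
```

When plt.plot() gets only one list, it uses that list for the y axis and the position of each item (0, 1, 2, ...) for the x axis.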
Step 4: Add some more data#
Let’s create another random list and create a scatterplot of the data. To do so, we’ll take a couple of steps.
Import another package, called numpy (numerical python). The convention is to import this as np.
Convert our list into a numpy array. This will allow us to perform numerical operations on it.
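To see why the conversion matters, here is a small sketch comparing a plain list with a numpy array. The names `example_list` and `example_array` are hypothetical and separate from the variables used in the task.

```python
import numpy as np

# Hypothetical values, invented for this example
example_list = [1, 2, 3]
example_array = np.array(example_list)

# On a plain list, * repeats the list instead of doing arithmetic
print(example_list * 2)    # [1, 2, 3, 1, 2, 3]

# On a numpy array, arithmetic is applied to every value
print(example_array + 10)  # [11 12 13]
```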
Task: Create a second array named random_array_2 where each value is equal to the corresponding value in random_array times two. You can do this with the same operators as above. For example, if we wanted to divide all of the values in random_array by two, we would use random_array/2.
# 1 - Import numpy
import numpy as np
# 2 - Convert random_list to an array
random_array = np.array(random_list)
# 3 - Create random_array_2 below
random_array_2 = ...
Task: In the box below, make a scatterplot of your data using matplotlib’s scatter function. You should plot random_array on the x axis, and random_array_2 on the y axis. Be sure to label your axes as well. You can find documentation on the scatter function here or here (for more fun).
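If you are unsure how the scatter function is called, here is a minimal sketch with made-up data. The names `example_x` and `example_y` and the label text are assumptions for illustration. In your own cell, use random_array and random_array_2 instead.

```python
import matplotlib.pyplot as plt

# Hypothetical measurements, invented for this example
example_x = [1, 2, 3, 4, 5]
example_y = [5, 3, 8, 1, 6]

# scatter takes the x values first, then the y values
plt.scatter(example_x, example_y)
plt.xlabel("Example x values")
plt.ylabel("Example y values")
plt.show()
```

Unlike plt.plot(), scatter needs both the x and the y values, and the two must have the same number of items.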
Step 5: Celebrate#
That’s the Jupyter Notebook tutorial! You’re ready to tackle more complex notebooks.
from IPython.display import HTML
HTML('<img src="https://media.tenor.com/images/99cff34bdcb675975b2b0cc661f2e4ce/tenor.gif">')
Resources#
For additional Jupyter Notebook information and practice, see this tutorial from DataQuest.
About this Notebook#
This notebook was created by Ashley Juavinett for classes at UC San Diego.