The *NumPy* package was first introduced here. It provides most of the functionality
one needs to do numerical manipulations and handle datasets in Python. Recall that the package is
imported with the statement:

import numpy as np

after which one may start working with numerical values in Python in all sorts of ways. For instance, creating a list of eight evenly spaced numbers
from 0 to 1 is done with the `linspace` function:

x = np.linspace(0,1,8)
x

that outputs:

array([0. , 0.14285714, 0.28571429, 0.42857143, 0.57142857, 0.71428571, 0.85714286, 1. ])

Notice how it may be a little annoying to get very many decimals on the printed values. To avoid that
you may change the default printing behavior of NumPy, with this function call:

np.set_printoptions(precision=3)

after which you will get only 3 decimals:

array([0. , 0.143, 0.286, 0.429, 0.571, 0.714, 0.857, 1. ])

The content of the variable `x` remains unchanged. It is only the printing of it that is affected.
The `array(...)` part of the print reveals that the variable `x` is a *NumPy array*:

type(x)

numpy.ndarray
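
The claim that only the printing is affected can be checked directly. The following is a minimal sketch, assuming a NumPy version that provides the `np.printoptions` context manager (1.15 or later); the variable names are chosen for illustration only.

```python
import numpy as np

x = np.linspace(0, 1, 8)

# Temporarily print with 3 decimals; the previous setting is restored
# automatically when the with-block ends.
with np.printoptions(precision=3):
    short_text = str(x)

full_text = str(x)      # printed with the default precision again
stored_value = x[1]     # the stored number itself is never rounded

print(short_text)
print(full_text)
print(stored_value)
```

Using the context manager instead of `np.set_printoptions` keeps the change local, which may be convenient when only a single print needs fewer decimals.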

In many contexts, a *NumPy array* is precisely what you want, but if you want a *Python list* you may use the method `tolist`:

x.tolist()

[0.0, 0.14285714285714285, 0.2857142857142857, 0.42857142857142855, 0.5714285714285714, 0.7142857142857142, 0.8571428571428571, 1.0]
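
As a side note to the conversion above, here is a small sketch comparing `tolist` with the built-in `list` constructor. The names `as_python` and `as_wrapped` are illustrative and not part of the original text.

```python
import numpy as np

x = np.linspace(0, 1, 8)

as_python = x.tolist()   # elements become plain Python floats
as_wrapped = list(x)     # elements remain NumPy scalars (np.float64)

print(type(as_python[0]))
print(type(as_wrapped[0]))
```

Both results are *Python lists*, but only `tolist` converts the individual elements to ordinary Python numbers.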

A one-dimensional *NumPy array* is not just a *Python list*. In fact, a *NumPy array* has the properties
you would like in order to do numerical manipulations of a list of numbers. To illustrate this, assume you have the same list of numbers
both as a *Python list*, `x_list`, and as a *NumPy array*, `x_array`:

x_list = [1, 4, 9]
x_array = np.array(x_list)
print('x_list: ',x_list)
print('x_array: ',x_array)
print('x_array.tolist():',x_array.tolist())

x_list: [1, 4, 9]
x_array: [1 4 9]
x_array.tolist(): [1, 4, 9]

Then multiplying each list by 2 gives two different results:

print('2 * x_list: ',2 * x_list)
print('2 * x_array:',2 * x_array)

2 * x_list: [1, 4, 9, 1, 4, 9]
2 * x_array: [ 2 8 18]

The *Python list* is repeated twice, while the *NumPy array*
has all its elements doubled, i.e. the
operations on the *NumPy array* are done elementwise.
Another example of a numerical manipulation possible for a *NumPy array*
is adding a constant to it:

x_array + 100

array([101, 104, 109])
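
To make the contrast concrete, the sketch below does the same doubling with a plain *Python list*, where an explicit loop (here a list comprehension) is needed, and with a *NumPy array*. The squaring line is an extra illustration not taken from the text above.

```python
import numpy as np

x_list = [1, 4, 9]
x_array = np.array(x_list)

# With a Python list, each element must be handled explicitly.
doubled_list = [2 * value for value in x_list]

# With a NumPy array, arithmetic acts on every element at once.
doubled_array = 2 * x_array
shifted_array = x_array + 100
squared_array = x_array ** 2

print(doubled_list)
print(doubled_array)
print(shifted_array)
print(squared_array)
```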

If you provide a *Python list* as the argument for a *NumPy* function,
the function first converts the input to a *NumPy array* and then does its action elementwise on the array. Thus the
*NumPy* function `np.sqrt` applied to either a *Python list* or a *NumPy array*:

np.sqrt(x_list)
np.sqrt(x_array)

gives the same result:

array([1., 2., 3.])
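
The implicit conversion can also be written out by hand. This is a small sketch in which `np.asarray` stands in for the conversion step; that `np.sqrt` performs an equivalent conversion internally is how the behavior is commonly described, not something shown in the text above.

```python
import numpy as np

x_list = [1, 4, 9]

converted = np.asarray(x_list)     # explicit list -> array conversion
from_list = np.sqrt(x_list)        # the list is converted implicitly
from_array = np.sqrt(converted)

print(type(from_list))
print(np.array_equal(from_list, from_array))
```

Whatever the input type, the result is a *NumPy array*.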

Another reason to work with *NumPy arrays* rather than
with *Python lists* is that the former are much faster.
Consider that you have 10000 numbers in either type of
list and wish to evaluate the sum, then with a *NumPy array*:

an_array = np.arange(10000)
print(len(an_array))
print(np.sum(an_array))
%timeit np.sum(an_array)

10000
49995000
11.3 µs ± 575 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

it takes about 11 microseconds for such an evaluation, while with a *Python list*:

a_list = list(an_array)
print(len(a_list))
print(sum(a_list))
%timeit sum(a_list)

10000
49995000
1.02 ms ± 5.48 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

it takes 1 millisecond, i.e. about a hundred times longer. `%timeit` is a so-called *magic command* of the Jupyter Notebook. It repeats
the *Python statement* that follows it a great number of
times and evaluates the average time it takes to execute it. The output
obviously depends on your hardware.

import numpy as np
a = np.arange(6)*20
a

array([ 0, 20, 40, 60, 80, 100])

As with ordinary lists, you may pick out elements of the one-dimensional *NumPy array*:
One element is picked like this:

a[2]

40

Elements up to, but not including, some index like this:

a[:2]

array([ 0, 20])

Elements from some index and up like this:

a[3:]

array([60, 80, 100])

The last element like this:

a[-1]

100
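
The same slice notation supports a few more patterns. The sketch below collects some of them for the array `a` used above; the step and reversal examples go slightly beyond the cases shown in the text.

```python
import numpy as np

a = np.arange(6) * 20      # [0, 20, 40, 60, 80, 100]

middle = a[1:4]            # from index 1 up to, but not including, index 4
every_other = a[::2]       # every second element
last_two = a[-2:]          # the two last elements
reversed_a = a[::-1]       # all elements in reverse order

print(middle)
print(every_other)
print(last_two)
print(reversed_a)
```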

So why are they called *NumPy arrays* and not
just *NumPy lists*? Well, that is because you can add more
dimensions. If you have a one-dimensional *NumPy array*, it
may look like this:
See here

Two *NumPy arrays* can be combined into one:

a = np.arange(6)*20
a
b = (np.arange(2)+1)*30
b
np.concatenate( (a,b) )

array([ 0, 20, 40, 60, 80, 100])
array([30, 60])
array([ 0, 20, 40, 60, 80, 100, 30, 60])

If you need to delete entries in a *NumPy array*, you use the `np.delete`
function that takes as arguments the existing
*NumPy array* and indices of elements to delete.

a
np.delete(a,3)
np.delete(a,(1,2))
np.delete(a,range(len(a)-2))
a

array([ 0, 20, 40, 60, 80, 100])
array([ 0, 20, 40, 80, 100])
array([ 0, 60, 80, 100])
array([ 80, 100])
array([ 0, 20, 40, 60, 80, 100])

Note that `np.delete` returns a *new* *NumPy array*, so if you just
want to delete the element in place, you have to reassign the variable:

a = np.delete(a,range(len(a)-2))
a

array([ 80, 100])
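
The following sketch combines the two functions and confirms that neither of them modifies its input. The names `combined` and `trimmed` are illustrative only.

```python
import numpy as np

a = np.arange(6) * 20          # [0, 20, 40, 60, 80, 100]
b = (np.arange(2) + 1) * 30    # [30, 60]

combined = np.concatenate((a, b))       # a new array holding a followed by b
trimmed = np.delete(combined, [0, 1])   # a new array without the first two entries

print(a)          # unchanged
print(combined)   # unchanged by the delete call
print(trimmed)
```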

See more on the topic here.

*NumPy* has a module, `random`,
that provides random numbers.
You get a single random number by calling the
function `rand()` from the module:

import numpy as np
np.random.rand()

0.4375872112626925

Calling the function again gives a new random number:

np.random.rand()

0.8917730007820798

You may also get a whole list of random numbers by providing an integer argument while calling:

np.set_printoptions(precision=3) # only 3 decimals in print
np.random.rand(5)

array([0.964, 0.383, 0.792, 0.529, 0.568])

Often, when developing some *Python* code, it is helpful to get the *same* random numbers every time the code is run.
You can do that by providing a *seed* to the `random` module:

np.random.seed(0)
np.random.rand(2)

array([0.549, 0.715])

You still get random numbers:

np.random.rand(2)

array([0.603, 0.545])

But once you provide the seed again, you start over and get the same random numbers in the same sequence once more:

np.random.seed(0)
np.random.rand(2)

array([0.549, 0.715])
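
The reproducibility can be verified in code rather than by eye. The sketch below does this for `np.random.seed`. It also shows the same idea with `np.random.default_rng`, the generator interface available in NumPy 1.17 and later; that interface is not used in the text above and is included here only as an assumption about what a reader with a recent NumPy may prefer.

```python
import numpy as np

# Seeding the module-level generator twice gives the same sequence twice.
np.random.seed(0)
first = np.random.rand(3)
np.random.seed(0)
second = np.random.rand(3)
print(np.array_equal(first, second))

# Two independent generator objects created with the same seed agree as well.
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)
gen_first = rng_a.random(3)
gen_second = rng_b.random(3)
print(np.array_equal(gen_first, gen_second))
```

A generator object keeps its own state, so seeding it does not affect random numbers drawn elsewhere in the code.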

The random numbers that are provided by
calling `np.random.rand()` are *uniformly distributed*
between 0 and 1. It means that the likelihood of getting
a random number within a given interval depends only on the width of that interval, not on where it lies, as
long as the interval lies entirely between 0 and 1. Making histograms of 100 or 10000 random numbers
illustrates what a uniform distribution looks like. Writing this:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
r = np.random.rand(100)
ax.hist(r)

in one cell and this in another cell:

r = np.random.rand(10000)
ax.hist(r)

and we get plots like these:

With just 100 random numbers, the counts vary greatly among
the ten bins of the histogram from 0 to 1. With 10000 random numbers,
there is almost the same number of random numbers in each bin.
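
The bin counts can also be inspected numerically, without plotting. This is a minimal sketch using `np.histogram`; the helper `relative_spread` and the fixed seed are illustrative choices, not part of the original text.

```python
import numpy as np

np.random.seed(0)   # arbitrary seed, only to make the run repeatable

def relative_spread(n):
    """Count n uniform random numbers in ten bins on [0, 1] and return the
    counts together with (max - min) / mean of those counts."""
    r = np.random.rand(n)
    counts, edges = np.histogram(r, bins=10, range=(0, 1))
    return counts, (counts.max() - counts.min()) / counts.mean()

counts_small, spread_small = relative_spread(100)
counts_large, spread_large = relative_spread(10000)

print(counts_small, spread_small)
print(counts_large, spread_large)
```

The relative spread is expected to be much smaller for the larger sample, which is the flattening seen in the histograms.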

The following commands build a non-trivial, one-dimensional *NumPy array*:

a1 = np.arange(10,14)
a2 = np.arange(20,24)
a12 = np.concatenate((a1,a2))
print(a1)
print(a2)
print(a12)
a = np.concatenate((a12,20+a1))
print(a)

[10 11 12 13]
[20 21 22 23]
[10 11 12 13 20 21 22 23]
[10 11 12 13 20 21 22 23 30 31 32 33]

which can be turned into a two-dimensional *NumPy array*, e.g. with 3 lists of 4 elements each:

a = a.reshape(3,4)
a

array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])

The dimensions of a *NumPy array* can be queried with the `shape` attribute:

a.shape

(3, 4)
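
A few related details about `reshape` and `shape` are sketched below: one dimension may be given as `-1` and is then inferred, and a shape that does not match the number of elements is rejected. The variable names are illustrative.

```python
import numpy as np

a = np.arange(12)

grid = a.reshape(3, 4)     # 3 rows of 4 elements
auto = a.reshape(3, -1)    # -1 lets NumPy infer the second dimension

print(grid.shape, grid.ndim, grid.size)
print(auto.shape)

# 12 elements cannot fill a 5 x 3 array, so NumPy raises a ValueError.
try:
    a.reshape(5, 3)
    message = None
except ValueError as error:
    message = str(error)
print(message)
```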

A two-dimensional *NumPy array* can be thought of as a list of lists. Printing the first and
second (inner) list of the (outer) list supports this notion:

print(a[0])
print(a[1])

[10 11 12 13]
[20 21 22 23]

And one may loop through the (inner) lists in the (outer) list like this:

for row in a:
    print(row)

[10 11 12 13]
[20 21 22 23]
[30 31 32 33]

Consider the two-dimensional *NumPy array*:

a = np.array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])

Selecting the rows (the inner lists) of the array with an index, as in `a[0]`, `a[1]`, and `a[2]`,
one may inspect the individual members of the inner lists by adding another index in square brackets, e.g. `[2]`, which selects the 3rd element in every row:

print(a[0][2])
print(a[1][2])
print(a[2][2])

12
22
32

Together, the three `print` statements effectively select
the 3rd column of the array `a`. However, getting three
separate numbers is rather impractical. A more rational way to get the 3rd
column is the following:

print(a[:,2])

[12 22 32]

where the colon, `:`, means that all rows (i.e. elements of the outer list) should be considered, while `,2` means that the 3rd
element of each of these rows is selected. One may get inspired and use the `:` for the second coordinate, selecting all elements
in a given row:

print(a[1,:])
print(a[1])

[20 21 22 23]
[20 21 22 23]

That works fine, but as seen, the `,:` is not required to pick out a row, since `[1]` already is sufficient.
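
The comma notation extends naturally to single elements and to rectangular blocks. The sketch below gathers a few such selections for the same array; the block selection goes slightly beyond what the text covers.

```python
import numpy as np

a = np.array([[10, 11, 12, 13],
              [20, 21, 22, 23],
              [30, 31, 32, 33]])

element = a[1, 2]       # same element as a[1][2]
column = a[:, 2]        # 3rd column
row = a[1, :]           # 2nd row, same as a[1]
block = a[0:2, 1:3]     # first two rows, columns 2 and 3

print(element)
print(column)
print(row)
print(block)
```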
A *NumPy array*,

a = np.array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])
a

array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])

may be transposed, meaning that rows and columns are interchanged:

np.transpose(a)

array([[10, 20, 30], [11, 21, 31], [12, 22, 32], [13, 23, 33]])

and meaning that the first dimension is now indexing what used to be the columns:

for col in np.transpose(a):
    print(col)

[10 20 30]
[11 21 31]
[12 22 32]
[13 23 33]

These *"used to be columns"*-rows may of course be indexed. E.g. the 3rd column of `a` can be obtained like this:

np.transpose(a)[2]

array([12, 22, 32])

Note that it does not matter whether the `transpose` function from the *NumPy package* is passed an
array as done above, or the `transpose` method on a *NumPy object* is invoked as done here:

a.transpose()[2]

array([12, 22, 32])
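
A third spelling, the `.T` attribute, is a common shorthand for the same operation. It is not mentioned in the text above and is shown here only as a sketch; all three forms are compared with the direct column selection.

```python
import numpy as np

a = np.array([[10, 11, 12, 13],
              [20, 21, 22, 23],
              [30, 31, 32, 33]])

by_function = np.transpose(a)
by_method = a.transpose()
by_attribute = a.T            # shorthand for the transpose

print(by_attribute.shape)     # rows and columns have swapped
print(by_attribute[2])        # 3rd column of a
print(a[:, 2])                # the same numbers selected directly
```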

A two-dimensional *NumPy array* may be converted to a one-dimensional
one with the `flatten` method:

a = np.array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])
a.flatten()

array([10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33])

Note how neither of the methods, `transpose` and `flatten`, leads to an actual change of the *NumPy object*,
which maintains its original shape:

a

array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])
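
For `flatten`, the independence from the original goes further than the shape: the method returns a copy of the data. The sketch below illustrates this by modifying the flattened result; the variable name `flat` is illustrative.

```python
import numpy as np

a = np.array([[10, 11, 12, 13],
              [20, 21, 22, 23],
              [30, 31, 32, 33]])

flat = a.flatten()   # a one-dimensional copy of the data
flat[0] = -1         # changing the copy ...

print(flat[0])
print(a[0, 0])       # ... leaves the original array untouched
print(a.shape)
```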

Sometimes, you may encounter a two-dimensional *NumPy array* with just one column. Here we construct it deliberately:

a = a.reshape(12,1)
a

array([[10], [11], [12], [13], [20], [21], [22], [23], [30], [31], [32], [33]])

Whenever that happens, you might consider calling the `squeeze` method to reduce the dimensionality:

a.squeeze()

array([10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33])

However, you may also take direct action and invoke the `reshape` method:

a

array([[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]])

a.reshape(12)

array([10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33])
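
The difference between the two approaches can be summarized in a short sketch: `squeeze` only removes dimensions of length one, while `reshape` produces whatever shape is requested. The variable names are illustrative.

```python
import numpy as np

column = np.arange(12).reshape(12, 1)   # two-dimensional, a single column
grid = np.arange(12).reshape(3, 4)      # two-dimensional, no length-one axis

squeezed = column.squeeze()        # drops the axis of length one
reshaped = column.reshape(12)      # asks for 12 elements in one dimension
inferred = column.reshape(-1)      # -1 lets NumPy work out the length
untouched = grid.squeeze()         # nothing to remove, shape stays (3, 4)
flattened = grid.reshape(12)       # reshape flattens regardless

print(squeezed.shape, reshaped.shape, inferred.shape)
print(untouched.shape, flattened.shape)
```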
