Introduction to Python
======================

Basic data types
~~~~~~~~~~~~~~~~

- **integers** (``int``) - whole numbers, such as ``42``, ``3``, ``56409``
- **floats** (``float``) - decimal numbers, such as ``57.346``, ``0.2``, ``5.0``
- **strings** (``str``) - sequences of characters, such as ``'Hello, World.'``, ``'I\'m 37'``
- **booleans** (``bool``) - can only be either ``True`` or ``False``

Variables
~~~~~~~~~

A **variable** is simply a name for something that we need to remember.

We can **assign values** to **variables**, which are stored in the computer’s memory:

.. code:: ipython3

    age = 25
    weight_in_kg = 61.4
    name = 'Mary'
    is_male = False

Python automatically determines the **type** of the value being assigned to a variable:

.. code:: ipython3

    type(weight_in_kg)

.. parsed-literal::

    float

Variable names cannot contain spaces; **underscores** (``_``) are typically used between words, as in ``weight_in_kg``.

Maths operators
~~~~~~~~~~~~~~~

Addition and subtraction
^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: ipython3

    4 + 4 - 2

.. parsed-literal::

    6

.. code:: ipython3

    1 - 3 + 0.5

.. parsed-literal::

    -1.5

``int`` and ``float`` values can be used together in the same expression; the result is a ``float``.

Multiplication
^^^^^^^^^^^^^^

.. code:: ipython3

    4 * 2 * 2.5

.. parsed-literal::

    20.0

.. code:: ipython3

    3 ** 3

.. parsed-literal::

    27

``**`` is used for **exponentiation** (*‘to the power of’*)

Division
^^^^^^^^

.. code:: ipython3

    4 / 4

.. parsed-literal::

    1.0

Division using ``/`` always results in a value of type ``float``.

.. code:: ipython3

    4 / (2 * 3)

.. parsed-literal::

    0.6666666666666666

``(``\ Parentheses\ ``)`` can be used to control the order of operations, as expected.

Other types of division
^^^^^^^^^^^^^^^^^^^^^^^

.. code:: ipython3

    8 // 6

.. parsed-literal::

    1

``//`` is used for **floor division**: the result is rounded down to a whole number and the remainder is discarded.

.. code:: ipython3

    365 // 7

.. parsed-literal::

    52

.. code:: ipython3

    365 / 7

.. parsed-literal::

    52.142857142857146

.. code:: ipython3

    8 % 6

.. parsed-literal::

    2

``%`` is used for what is known as the **modulo** operation: the remainder is the result.

*How many complete hours in a given number of minutes?*

.. code:: ipython3

    505 // 60

.. parsed-literal::

    8

*… and how many minutes are left over?*

.. code:: ipython3

    505 % 60

.. parsed-literal::

    25

Use with strings
^^^^^^^^^^^^^^^^

These operators can sometimes be used on strings:

.. code:: ipython3

    5 * 'ab'

.. parsed-literal::

    'ababababab'

… but sometimes not:

.. code:: ipython3

    'ab' / 2

::

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    ----> 1 'ab' / 2

    TypeError: unsupported operand type(s) for /: 'str' and 'int'

Using operators with variables
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Operators work with variables as they would if used directly with the assigned values:

.. code:: ipython3

    production = 250
    defects = 6
    defect_rate = (defects / production) * 100
    defect_rate

.. code:: ipython3

    item_price = 5
    total_cost = 1000
    revenue = item_price * (production - defects)
    profit = revenue - total_cost
    profit

Assignment operators
~~~~~~~~~~~~~~~~~~~~

When performing certain operations on a single variable, we can use the following more succinct syntax:

.. code:: ipython3

    attendance = 100
    attendance += 3  # equivalent to attendance = attendance + 3
    attendance

.. code:: ipython3

    refunds = 5
    attendance -= refunds
    attendance

Comments
~~~~~~~~

We can add **comments** to our code in the following ways:

.. code:: ipython3

    '''
    Code for calculation of revenue.
    Assumes no failed payments or price changes.
    '''
    ticket_price = 10
    revenue = attendance * ticket_price  # assuming refunds are paid in full
    revenue

- **Multi-line comments** sit within ``'''``\ triple quotation marks\ ``'''`` (strictly, this is a string that Python evaluates and then ignores)
- **Single line comments** (which can share a line with working code) sit to the right of a ``#`` character

String indexing
~~~~~~~~~~~~~~~

We can access elements within certain data types (including the characters within **strings**) by their position, using **indexing**.

We can access a single character or a series of characters from a string by providing numerical values (**indexes**) within ``[``\ square brackets\ ``]``.

.. code:: ipython3

    name = 'Alison'
    name[0]

If a **single value** is provided, the character at that specific position is accessed. Positions are counted from zero, so ``name[0]`` is the first character.

.. code:: ipython3

    name[0:3]

If a **pair of values** separated by a ``:`` is provided, these are interpreted as ``start`` and ``stop`` values.

- Note how the character at the ``stop`` position is **not included**.
- The **difference** between the two values will be the **number of characters** returned (provided both positions fall within the string)

.. code:: ipython3

    name[:3]

.. code:: ipython3

    name[1:]

.. code:: ipython3

    name[:]

A ``:`` **without a value on one or both sides** is interpreted as there being **no limit in the given direction(s)**.

.. code:: ipython3

    name[::2]

If **two colons** are used, any value to the right of the second colon is interpreted as the ``step`` value.

.. code:: ipython3

    volcano = 'Eyjafjallajökull'
    volcano[0:12:3]

.. code:: ipython3

    volcano[3::3]

The values (or their absence) are interpreted as ``start:stop:step`` respectively.

.. code:: ipython3

    name[-3]

.. code:: ipython3

    name[:-3]

.. code:: ipython3

    name[-3:]

Negative values can be used to refer to position in relation to the **end** of the string.

- The ``[position]``, ``[start:stop]``, and ``[start:stop:step]`` structures continue to work in the same way.

Finding the index of a given character
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: ipython3

    name.index('s')

We have used the ``.index()`` **method** to find the index (position) of the **first occurrence** of the given character.

- Methods and functions will be explained in more detail further on.

Type conversion
~~~~~~~~~~~~~~~

Attempting operations with values of the wrong type can result in errors:

.. code:: ipython3

    meals = '4'
    meals += 1

… or mistakes (here the string is repeated eight times rather than multiplied as a number):

.. code:: ipython3

    cost = meals * 8
    cost

There are several **built-in functions** we can use to convert the **type** of a value where possible, each corresponding to one of the **data types** we saw earlier:

::

    int()
    float()
    str()
    bool()

.. code:: ipython3

    meals = int(meals)
    meals += 1
    meals

.. code:: ipython3

    float(meals)

Note that converting a ``float`` to an ``int`` using the built-in function ``int()`` simply **truncates** the value rather than rounding it:

.. code:: ipython3

    price = 5.99
    pounds = int(price)
    pounds

In this instance, we might instead want to use another built-in function, ``round()``\ …

.. code:: ipython3

    int(round(price, 0))  # The zero here sets the number of decimal places

String formatting
~~~~~~~~~~~~~~~~~

Python treats a value as a string (``str``) when it is enclosed in either ``"`` or ``'`` marks.

- The **same type of quotation mark** must be used at the start and finish
- The type of quotation mark not used at the ends can be used *within* the string

.. code:: ipython3

    "The first man said, 'I'm 50 years old today!'"

We can insert variables into strings as follows:

.. code:: ipython3

    name = 'Paul'
    age = 40
    f"{name} said, 'I'm {age} years old today!'"

- We have put an ``f`` before the string and the required variables within ``{``\ braces\ ``}``

You may also see the following syntax:

.. code:: ipython3

    "{} said, 'I'm {} years old today!'".format(name, age)

.. parsed-literal::

    "Paul said, 'I'm 40 years old today!'"

The **f-strings** functionality we saw previously is more concise, but was only introduced in Python 3.6 (December 2016).
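
As a small illustration of the two styles described here, the sketch below builds the same sentence with an f-string and with ``.format()`` and checks that the results match. The variable names ``with_fstring``, ``with_format`` and ``next_year`` are made up for this example; the last line assumes only that an f-string may contain an expression, not just a variable name, inside the braces.

```python
name = 'Paul'
age = 40

# The same sentence built in two ways
with_fstring = f"{name} said, 'I'm {age} years old today!'"
with_format = "{} said, 'I'm {} years old today!'".format(name, age)

# Both approaches produce an identical string
print(with_fstring == with_format)  # True

# Braces in an f-string can hold an expression as well as a variable
next_year = f"Next year {name} will be {age + 1}."
print(next_year)  # Next year Paul will be 41.
```
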
The older ``.format()`` method continues to be supported.

Now open the following workbook: ``intro-python-syntax-workbook.ipynb``

Data structures
~~~~~~~~~~~~~~~

The standard **built-in data structures** in Python are:

- **list**: an editable, **ordered collection** of values
- **dictionary**: a collection of ``key:value`` **pairs**
- **tuple**: similar to lists, but not editable
- **set**: a collection of **unique values**

Data structures are used to **input**, **process**, **maintain** and **retrieve** data.

- You are likely to encounter **lists** and **dictionaries** more frequently than **tuples** and **sets**
- You will use other, more **specialised data structures** in future when using **packages** such as **pandas**

Lists
~~~~~

- lists have a variable number of elements
- elements can be **added**, **removed** or **modified**
- elements are accessed using a **zero-based index**
- elements do not have to be the same data type

List indexing
^^^^^^^^^^^^^

The syntax for **accessing list elements by position** is the same as we have seen previously for strings, with the number of **numerical values** and **colons** determining the interpretation:

- ``[position]``
- ``[start:stop]``
- ``[start:stop:step]``

Again, indexing is **zero-based**, and **negative values** can be used to access by **position from the end of the list**.

.. code:: ipython3

    rain = [6.5, 0, 0, 1.2, 2.6, 1.9, 5.4]

.. code:: ipython3

    print(rain[3])
    print(rain[3:])
    print(rain[:3])

.. parsed-literal::

    1.2
    [1.2, 2.6, 1.9, 5.4]
    [6.5, 0, 0]

.. code:: ipython3

    print(rain[-2])
    print(rain[-2:])
    print(rain[:-2])

.. parsed-literal::

    1.9
    [1.9, 5.4]
    [6.5, 0, 0, 1.2, 2.6]

.. code:: ipython3

    people = ['Anna', 'Ben', 'Cynthia', 'Dennis', 'Evandro', 'Farhad']

.. code:: ipython3

    print(people[::2])
    print(people[::-2])

.. parsed-literal::

    ['Anna', 'Cynthia', 'Evandro']
    ['Farhad', 'Dennis', 'Ben']

We can use negative values for ``step`` to return values in **reverse order**, but when used with ``start`` and ``stop`` values, ``start`` must be greater than ``stop``:

.. code:: ipython3

    print(people[0:4:-2])
    print(people[4::-2])

.. parsed-literal::

    []
    ['Evandro', 'Cynthia', 'Anna']

How else can we work with lists?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- ``.append(object)``
- ``.pop()``
- ``.insert(index, object)``
- ``.remove(object)``
- ``.extend(list)``
- ``.reverse()``
- ``.sort()``
- ``.copy()``

.. code:: ipython3

    scores = [10, 25, 14]
    scores[0] = 50
    scores.append('Unknown')
    scores

.. parsed-literal::

    [50, 25, 14, 'Unknown']

- ``.append()`` allows us to **add a single item** to the end of a list

.. code:: ipython3

    scores.pop()

.. parsed-literal::

    'Unknown'

.. code:: ipython3

    scores

.. parsed-literal::

    [50, 25, 14]

- ``.pop()`` returns the **last item** in the list, and **removes it** from the list.

.. code:: ipython3

    more_scores = [77, 33, 12]
    scores.extend(more_scores)
    scores

.. parsed-literal::

    [50, 25, 14, 77, 33, 12]

- ``.extend()`` allows us to **extend** a list with values from another list

.. code:: ipython3

    scores.reverse()
    scores

.. parsed-literal::

    [12, 33, 77, 14, 25, 50]

.. code:: ipython3

    scores.sort()
    scores

.. parsed-literal::

    [12, 14, 25, 33, 50, 77]

- ``.reverse()`` and ``.sort()`` work as expected, changing the list in place
- using ``.sort()`` on a list of **strings** will sort them **alphabetically**

Dictionaries
^^^^^^^^^^^^

- Dictionaries **map keys to values**
- values are **accessed using the key** (rather than index)
- values can be modified and further ``key:value`` **pairs** added

.. code:: ipython3

    user = {'Name': 'David',
            'Occupation': 'Magician',
            'Phone': 5554267,
            'Town': 'New York',
            }
    user['Town']

.. parsed-literal::

    'New York'

We can **add** or **modify** values in the same way as we would assign or update variables:

.. code:: ipython3

    user['Town'] = 'San Francisco'
    user['Gender'] = 'Male'

.. code:: ipython3

    user

.. parsed-literal::

    {'Name': 'David',
     'Occupation': 'Magician',
     'Phone': 5554267,
     'Town': 'San Francisco',
     'Gender': 'Male'}

How else can we work with dictionaries?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- ``.get(key, default)``
- ``.items()``
- ``.keys()``
- ``.values()``
- ``.update(key=value)``
- ``.copy()``

.. code:: ipython3

    user.update({'Town': 'Washington', 'Country': 'USA'})
    user

.. parsed-literal::

    {'Name': 'David',
     'Occupation': 'Magician',
     'Phone': 5554267,
     'Town': 'Washington',
     'Gender': 'Male',
     'Country': 'USA'}

- ``.update()`` allows us to **update** an existing dictionary using another dictionary of ``key:value`` pairs.

.. code:: ipython3

    user['Age']

::

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    ----> 1 user['Age']

    KeyError: 'Age'

.. code:: ipython3

    user.get('Age', 'Not provided')

- ``.get()`` allows us to provide a **fallback value** should the given key not be found in the dictionary

.. code:: ipython3

    user.keys()

.. code:: ipython3

    user.values()

``.keys()`` and ``.values()`` return a *view* of the **keys** and **values** respectively; we can pass either view to the built-in ``list()`` function to obtain a list:

.. code:: ipython3

    list(user.values())

Tuples
^^^^^^

- tuples contain an **ordered collection** of elements
- the **number** of elements is **fixed**
- individual elements **cannot be modified**

.. code:: ipython3

    address = ('Buckingham Palace', 'London', 'SW1A 1AA')
    name, area, postcode = address
    area

Here we have **unpacked** the tuple, assigning the constituent values to several variables at the same time.
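
The properties listed for tuples can be checked directly. The following is a minimal sketch reusing the ``address`` tuple from above; the flag names ``modified`` and ``mismatch_error`` and the replacement value ``'Windsor'`` are invented for illustration. It relies on Python raising ``TypeError`` when an element of a tuple is reassigned, and ``ValueError`` when the number of names on the left does not match the number of elements being unpacked.

```python
address = ('Buckingham Palace', 'London', 'SW1A 1AA')

# Unpacking needs exactly one name per element
name, area, postcode = address
print(area)  # London

# Individual elements cannot be modified: item assignment raises TypeError
try:
    address[1] = 'Windsor'
    modified = True
except TypeError:
    modified = False

# The number of elements is fixed: unpacking three values into two names fails
try:
    name, area = address
    mismatch_error = False
except ValueError:
    mismatch_error = True

print(modified, mismatch_error)  # False True
```
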

Tuples vs lists
'''''''''''''''

In practice you can usually achieve the same functionality by using a ``list``, but:

- using a **tuple** conveys to others reading the code that the **structure** of the object is important, and that each element is likely to represent something of a different nature

  - it may be important that there are a specific number of elements
  - each element has distinct features, e.g. ``name``, ``area``, ``postcode``

- using a **list** conveys to others that **order** is important, elements are more likely to be **homogeneous**, and the **number of elements may change**

  - each element of our ``scores`` earlier represents something of the same nature
  - we may not have known how many ``scores`` there would eventually be in the list

Sets
^^^^

- All values in a set are **unique**
- Elements are **not ordered or indexed**

.. code:: ipython3

    friends = {'Ellie', 'Sarah', 'Amar'}
    friends.update(['James', 'Ellie'])
    friends.discard('Sarah')
    friends

Notice how the original **order is not retained**. Even if the items happen to be displayed alphabetically, the order in which set items are stored in memory should be considered **arbitrary**.

.. code:: ipython3

    required = ['F', 'A', 'C', 'B', 'A', 'C', 'A', 'D', 'H', 'B']
    set(required)

- The built-in ``set()`` function is useful for identifying the **unique values** in a list

.. code:: ipython3

    list(set(required))

.. code:: ipython3

    list(sorted(set(required)))

- We can use the built-in ``list()`` function to return a list
- As mentioned previously, the item order of a set is arbitrary … but this can be resolved using the built-in ``sorted()`` function

We can check for **membership** of a set as follows:

.. code:: ipython3

    'Sarah' in friends

The same ``in`` keyword can also be used to check for the **presence of a value** in a list:

.. code:: ipython3

    0 in rain

Combining data structures
^^^^^^^^^^^^^^^^^^^^^^^^^

The elements used in the examples so far have been single values, but we can also **combine data structures** in many ways:

- **lists**, **tuples**, and **dictionary values** can be of **any data type**
- structures can be **nested** within one another with **no practical limit**

.. code:: ipython3

    blog_post = {'user': ('cool_dave', 'dave@email.com'),
                 'subject': 'How to live better',
                 'tags': {'lifestyle', 'religion'},
                 'likes': ['Joe53ph', 'sara77', 'Cathy'],
                 }

.. code:: ipython3

    blog_post['likes'].append('Jonathan')
    blog_post['likes']

- We can access the **object** in the dictionary using the **key**, and then perform operations on that object according to its **type**

.. code:: ipython3

    users = {('cool_dave', 'dave@email.com'): {'name': 'David Smith',
                                               'interests': {'philosophy', 'music'},
                                               'awards': ['Best Newcomer', 'Post of the Month - July 2015'],
                                               },
             ('sara77', 'sara@email.com'): {'name': 'Sara Green',
                                            },
             }

.. code:: ipython3

    users[('sara77', 'sara@email.com')]['name']

- We can use **tuples** as **dictionary keys** … but not lists, sets or other dictionaries, as they are **mutable** (can be changed)
- We can **nest** objects within others … and access elements within the nested objects by **chaining** code statements

Now open the following workbook: ``data-structures-workbook.ipynb``
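
The points about nesting, chaining and dictionary keys can be tried out with a shortened copy of the ``users`` dictionary from above. This is a sketch rather than part of the workbook: the names ``dave_first_award``, ``sara_interests`` and ``list_key_allowed`` are made up for the example, and the empty ``set()`` fallback is simply one reasonable default for a user with no recorded interests.

```python
users = {
    ('cool_dave', 'dave@email.com'): {
        'name': 'David Smith',
        'interests': {'philosophy', 'music'},
        'awards': ['Best Newcomer', 'Post of the Month - July 2015'],
    },
    ('sara77', 'sara@email.com'): {
        'name': 'Sara Green',
    },
}

# Chaining: tuple key -> inner dictionary -> list -> first element
dave_first_award = users[('cool_dave', 'dave@email.com')]['awards'][0]
print(dave_first_award)  # Best Newcomer

# .get() on the inner dictionary supplies a fallback for a missing key
sara_interests = users[('sara77', 'sara@email.com')].get('interests', set())
print(sara_interests)  # set()

# A list cannot be used as a dictionary key because it is mutable
try:
    bad = {['cool_dave', 'dave@email.com']: 'David Smith'}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)  # False
```
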