{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "# Intro to sequences: strings, lists, and tuples" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "In Python a *sequence* is an *ordered* collection of objects. The term 'ordered' here means that we can retrieve the first object in the sequence, and the second object, and so on.\n", "\n", "There are three main sequence types in Python: strings, lists, and tuples.\n", "\n", "- A *string* is an immutable sequence of characters\n", "- A *list* is a mutable sequence of objects (of any type)\n", "- A *tuple* is an immutable sequence of objects (of any type)\n", "\n", "The term *immutable* means that a value in the sequence **cannot** be changed directly; while *mutable* means that a value can be changed. We will see examples of this below.\n", "\n", "## Finding the length of a sequence\n", "The *len* function can be used to find the length of any sequence." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word = \"hello\"\n", "len(word)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## Sequence indexing\n", "\n", "Each item in a sequence has a numbered index, which begins at 0. For example, the string \"hello\" has the following indices:\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "
| Index | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| Character | h | e | l | l | o |
\n", "\n", "\n", "You can access the item at index $i$ by typing `sequence[i]`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word[0] # returns the element at index 0 (i.e., the first element)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word[1] # returns the element at index 1 (i.e., the second element)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "A negative index, with the value $-i$, corresponds to the $i^{th}$ element from the end." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word[-1] # returns the element at index -1 (i.e., the last character)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "Using an invalid index will result in an error." 
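, "\n", "\n", "As a minimal sketch of which indices are valid and what the error looks like: the block below uses `try`/`except`, which these notes have not introduced, so treat it as an optional aside rather than part of the course material.\n", "\n", "
```python
word = 'hello'

# Valid indices run from -len(word) to len(word) - 1.
first = word[0]             # the first character
last = word[len(word) - 1]  # the same element as word[-1]

# Any index outside that range raises an IndexError.
try:
    word[10]
    message = 'no error'
except IndexError as err:
    message = 'IndexError: ' + str(err)

print(first, last)
print(message)
```
"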
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word[10]" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## Referencing consecutive elements of a sequence using *slicing*\n", "\n", "**Note**: We will skip this for CSC 180, but it is very useful so I wanted to include it for completeness.\n", "\n", "*Slicing* can be used to get consecutive elements (a slice) of a sequence.\n", "\n", "Slices are specified with the syntax\n", "``` Python\n", "sequence[start:stop:step]\n", "```\n", "\n", "where\n", "- *start* is the index where the slice begins (defaults to 0, the first element of the sequence)\n", "- *stop* marks the end of the slice; the last element included is the one at index *stop - 1* (defaults to `len(sequence)`, the end of the sequence)\n", "- *step* is the step size (or stride) between indices (defaults to 1)\n", "\n", "In other words, `sequence[a:b]` returns all elements from index *a* up to but not including index *b*.\n", "\n", "It may seem strange that elements up to but *not including* index *b* are returned, but with this convention the length of the slice is simply *b - a* (assuming 0 ≤ *a* ≤ *b* ≤ `len(sequence)`)." 
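, "\n", "\n", "The three slice parameters can be sketched as follows. The string `letters` and the variable names are illustrative assumptions, not part of the original notes.\n", "\n", "
```python
letters = 'abcdefgh'  # illustrative string (an assumption, not from the notes)

first_three = letters[0:3]  # indices 0, 1, 2
middle = letters[2:6]       # indices 2, 3, 4, 5
every_other = letters[::2]  # start and stop omitted, step of 2
backwards = letters[::-1]   # a negative step walks from the end to the start

print(first_three, middle, every_other, backwards)
print(len(middle) == 6 - 2)  # the length of letters[a:b] is b - a
```
"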
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word[0:2] # get the first 2 characters (from index 0 up to but not including index 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since the default value of the starting index is 0, we can omit it and write the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "word[:2] # get the first 2 characters (from default index 0 up to but not including index 2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "scrolled": true }, "outputs": [], "source": [ "# we can use negative index values, for example to get the last 2 characters\n", "word[-2:]" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "### Exercise\n", "\n", "In the string below, use sequence indexing to display the 1st character and the 3rd character." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "sentence = 'Today is a good day'\n", "sentence" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## Lists ##\n", "A *list* in Python is a sequence of objects (technically, it is a sequence of references to its elements; more on this below). In Python you create a list using the following syntax:\n", "\n", "```python\n", "mylist = [item1, item2, ...]\n", "```\n", "Because lists are sequences, the same concepts of length, indexing, and slicing that apply to strings also apply to lists." 
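, "\n", "\n", "A minimal sketch of those shared operations applied to a list. The list `temps` and its values are illustrative assumptions, not from the original notes.\n", "\n", "
```python
temps = [18, 21, 25, 19, 16]  # illustrative values (an assumption)

count = len(temps)       # number of items in the list
first = temps[0]         # index 0 is the first item
last = temps[-1]         # negative indices count from the end
first_two = temps[:2]    # slicing a list returns a new list
mixed = [1, 'two', 3.0]  # items may have different types

print(count, first, last, first_two)
print(type(first_two).__name__, len(mixed))
```
"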
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "numbers = [7,10,13,21]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# how many numbers are in the list?\n", "len(numbers)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# what is the first number in the list?\n", "numbers[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strings are immutable, while lists are not\n", "If a sequence is *immutable* then you cannot (directly) change any of its elements. Strings are immutable; trying to change an element will result in an error (a `TypeError`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "s = 'hello'\n", "s[0] = 'H'" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "run_control": { "frozen": true } }, "source": [ "Lists are mutable, so individual elements can be changed." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# create a list and then change the first element\n", "l = [1,2,3,4]\n", "l[0] = 7\n", "l" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## What is a list (technical answer) \n", "\n", "What happens when you assign a list to another variable?\n", "\n", "```python\n", "list1 = [1,2,3,4]\n", "list2 = list1\n", "```\n", "Technically, each element of a list is a *reference* to a value, and not the value itself. As a result, an assignment of the form `list2 = list1` does not copy the list; it makes `list2` refer to the same sequence of references as `list1`. 
In other words, both variables refer to the same list object in memory! This can have unintended consequences, as seen in the code below. We will also visualize this code using the Python Tutor at http://www.pythontutor.com/." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "list1 = [1,2,3,4]\n", "list2 = list1\n", "\n", "print('list1 =', list1)\n", "print('list2 =', list2)\n", "print()\n", "print('changing the first element of list1 changes the first element of list2!')\n", "\n", "list1[0] = 99\n", "print('list1 =', list1)\n", "print('list2 =', list2)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "This behavior can create issues when the programmer wants to copy a list. We will not worry about this. For more information on how to make copies of a list, see this link: https://www.geeksforgeeks.org/copy-python-deep-copy-shallow-copy/" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## Split and join methods\n", "\n", "If *s* is a string, then \n", "\n", "```python\n", "s.split(sep)\n", "```\n", "\n", "will split *s* into multiple strings based on the delimiter *sep*, and will return a _list_ of results.\n", "\n", "\n", "For example, splitting 'how are you today' by ' are ' can be visualized as follows:\n", "\n", "```\n", "   |     |\n", "how| are |you today\n", "   |     |\n", "```\n", "\n", "The *separator* itself is not included in the results, so we get a list containing 'how' and 'you today'." 
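, "\n", "\n", "The heading above also names `join`, which these notes do not otherwise show. A minimal sketch, assuming the standard `str.join` method, which is called on the separator string and takes a list of strings; the variable names are illustrative.\n", "\n", "
```python
sentence = 'how are you today'

parts = sentence.split(' are ')  # the separator ' are ' is dropped
words = sentence.split(' ')      # split on single spaces

# join is the reverse operation: it glues a list of strings together,
# placing the separator string between neighbouring items.
rebuilt = ' '.join(words)
dashed = '-'.join(words)

print(parts)
print(rebuilt == sentence)
print(dashed)
```
"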
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "sentence = 'how are you today'\n", "sentence.split(' are ') # returns strings before and after ' are '" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "If the separator is not specified, then the string is split on whitespace (any run of whitespace characters counts as one delimiter).\n", "Splitting 'how are you today' by the default separator can be visualized as:\n", "\n", "```\n", "   | |   | |   | |\n", "how| |are| |you| |today\n", "   | |   | |   | |\n", "```\n", "\n", "which results in a list containing the words 'how', 'are', 'you', and 'today'." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "words = sentence.split() # if the separator is not specified, the string is split on whitespace\n", "words" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise:** Use Python to output the first word of the sentence." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "sentence = 'how are you today'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## Tuples are like lists but are immutable ##\n", "A *tuple* is a sequence that is similar to a list but is immutable. A tuple is written as a comma-separated list of elements in parentheses. The above notes regarding length, indices, and slicing also apply to tuples. In general, *lists* are used to store similar values where either the number of values or the individual values might change; *tuples* are used to store structured data where the position of each value has meaning and different positions may represent different things." 
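, "\n", "\n", "A minimal sketch of these points. The values and the names `point` and `record` are illustrative assumptions, and the `try`/`except` block is used only to show the error without stopping the program.\n", "\n", "
```python
point = (3, 4)        # an (x, y) pair; the values are illustrative
x, y = point          # unpacking assigns each element to a name
record = ('Ada', 36)  # hypothetical (name, age) record

# Length, indexing, and slicing work as they do for strings and lists.
print(len(point), point[0], point[-1], record[:1])

# Item assignment on a tuple raises a TypeError.
try:
    point[0] = 10
    changed = True
except TypeError:
    changed = False

print(x + y, changed)
```
"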
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# example of a tuple storing (x,y) values\n", "p = (1,2)\n", "p" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('x =', p[0])\n", "print('y =', p[1])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# tuples are immutable, so we get an error if we try to change an element\n", "p[0] = 3" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false }, "source": [ "## Getting help in Python\n", "\n", "Python has built-in *help* that documents how to use functions or methods. The *help* function has the form `help(function)` or `help(object.method)`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "help(print)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# get help on string 'split' method. 
Note since 'split' must be called from a string object (which has type 'str'), \n", "# we use 'str.split' in the 'help' function call\n", "help(str.split)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false }, "outputs": [], "source": [ "# alternatively, if a string exists we can use that string rather than the generic 'str'\n", "s = 'how are you?'\n", "help(s.split)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(str.split)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 2 }