Data Types and Conversion

Turn off the GitHub Copilot AI assistant so you can focus on learning Python using your HI (human intelligence). Click the Deactivate Copilot button in the bottom right of VS Code, if it is currently activated.

Questions:#

What kinds of data do programs store?
How can I convert one type to another?

Learning Objectives:#

Explain key differences between integers and floating point numbers.
Explain key differences between numbers and character strings.
Use built-in functions to convert between integers, floating point numbers, and strings.

Every value has a type#

Every value in a program has a specific type
Integer (int): represents positive or negative whole numbers like 3 or -512
Floating point number (float): represents real numbers like 3.14159 or -2.5
Character: Python has no separate character (char) type. A single character such as "a", "j", "8", or "(" is a string of length 1
- Numerals placed in quotes are treated as strings, not integers or floats
Character string (usually called “string”, str): text
- Written in either single quotes or double quotes (as long as they match)
- The quote marks aren’t printed when the string is displayed

Use the built-in function `type` to find the type of a value#

Use the built-in function type to find out what type a value has
Works on variables as well
- But remember: the value has the type; the variable is just a label

print(type(52))

<class 'int'>

fitness = 'average'
print(type(fitness))

<class 'str'>

Nested function calls such as print(type(52)) are evaluated from the inside out, as in mathematics.
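As a small sketch of the inside-out rule, the nested call can be split into two steps. The variable name `inner_result` is introduced here only for illustration.

```python
# Step 1: the inner call runs first and produces a type.
inner_result = type(3.14159)

# Step 2: the outer call receives that result and displays it.
print(inner_result)   # <class 'float'>

# A numeral in quotes is a string, even when it is a single character.
print(type('8'))      # <class 'str'>
```

Writing `print(type(3.14159))` on one line gives the same first output; the two-step version only makes the order of evaluation visible.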

Types control what operations (or methods) can be performed on a given value#

A value’s type determines what the program can do to it. So we can perform subtraction on integers:

print(5 - 3)

But not on strings:

print('hello' - 'h')

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 print('hello' - 'h')

TypeError: unsupported operand type(s) for -: 'str' and 'str'

You can use the `+` and `*` operators on strings#

“Adding” character strings concatenates them; i.e., creates one long string by combining the inputs in the order you specify.

full_name = 'Ahmed' + 'Walsh'
print(full_name)

AhmedWalsh

To add spaces between strings that you concatenate, you need to explicitly include whitespace in quotes:

full_name = 'Ahmed' + ' ' + 'Walsh'
print(full_name)

Ahmed Walsh

Multiplying a character string by an integer N creates a new string that consists of that character string repeated N times (since multiplication is repeated addition).

greeting = 'hello-' * 3
print(greeting)

hello-hello-hello-
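A quick way to check the "repeated addition" idea is to compare the two forms directly. The names `repeated` and `added` are chosen here for illustration.

```python
repeated = 'hello-' * 3
added = 'hello-' + 'hello-' + 'hello-'

# Both expressions build the same string.
print(repeated == added)   # True

# Multiplying by 0 gives an empty string, which has length 0.
print(len('hello-' * 0))   # 0
```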

Strings have a length (but numbers don’t)#

The built-in function len counts the number of characters in a string.

print(len(full_name))

But numbers don’t have a length (not even zero).

print(len(52))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 print(len(52))

TypeError: object of type 'int' has no len()
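The sketch below shows the length of the name built earlier, and one possible workaround for counting the digits of a number: convert it to a string first. The names `name_length` and `digit_count` are illustrative.

```python
full_name = 'Ahmed' + ' ' + 'Walsh'
name_length = len(full_name)
print(name_length)    # 11: five letters, one space, five letters

# len(52) raises a TypeError, but the string '52' does have a length.
digit_count = len(str(52))
print(digit_count)    # 2
```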

Use an index to get a single character from a string.#

The characters (individual letters, numbers, and so on) in a string are ordered. For example, the string 'AB' is not the same as 'BA'. Because of this ordering, we can treat the string as a list of characters.
Each position in the string (first, second, etc.) is given a number. This number is called an index.
Indices are numbered from 0.
Use the position’s index in square brackets to get the character at that position.

[Figure: an illustration of indexing]

atom_name = 'helium'
print(atom_name[0])
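Because indices start at 0, the last character sits at index `len(atom_name) - 1`. The sketch below also uses a negative index, a standard Python feature that counts from the end of the string; the variable names are illustrative.

```python
atom_name = 'helium'

first = atom_name[0]                     # 'h'
last = atom_name[len(atom_name) - 1]     # 'm', at index 5
also_last = atom_name[-1]                # 'm', counting from the end

print(first, last, also_last)            # h m m
```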

Use a slice to get a substring.#

A part of a string is called a substring. A substring can be as short as a single character.
An item in a list is called an element. Whenever we treat a string as if it were a list, the string’s elements are its individual characters.
A slice is a part of a string (or, more generally, any list-like thing).
We take a slice by using [start:stop], where start is replaced with the index of the first element we want and stop is replaced with the index of the element just after the last element we want.
Mathematically, you might say that a slice selects the half-open interval [start, stop).
The difference between stop and start is the slice’s length.
Taking a slice does not change the contents of the original string. Instead, the slice is a copy of part of the original string.

atom_name = 'sodium'
print(atom_name[0:3])

sod
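As a short check of two of the points above, the following sketch confirms that the slice's length equals stop minus start and that the original string is unchanged. The names `start`, `stop`, and `prefix` are used here for illustration.

```python
atom_name = 'sodium'
start = 0
stop = 3

prefix = atom_name[start:stop]
print(prefix)                         # sod
print(len(prefix) == stop - start)    # True

# Slicing made a copy; the original string is untouched.
print(atom_name)                      # sodium
```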

Slicing numbers?#

If you assign a = 123, what happens if you try to get the second digit of a via a[1]?

a = 123
a[1]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[12], line 2
      1 a = 123
----> 2 a[1]

TypeError: 'int' object is not subscriptable

Numbers are not strings or sequences, and Python will raise an error if you try to perform an index operation on a number. Later in this lesson we will learn more about types and how to convert between them. If you want the Nth digit of a number, you can convert it into a string using the str built-in function and then perform an index operation on that string.
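A minimal sketch of that approach, assuming we want the second digit of 123 (the intermediate variable names are illustrative):

```python
a = 123

a_text = str(a)                  # '123'
second_digit_text = a_text[1]    # '2', still a string
second_digit = int(second_digit_text)

print(second_digit_text, type(second_digit_text))   # 2 <class 'str'>
print(second_digit, type(second_digit))             # 2 <class 'int'>
```

Indexing the string returns another string, so a final call to `int` is needed if the digit will be used in arithmetic.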

Slicing practice#

What does the following program print?

atom_name = 'carbon'
print('atom_name[1:3] is:', atom_name[1:3])

atom_name[1:3] is: ar

Slicing concepts#

cell_name = 'neuron'

What does cell_name[1:5] do?

cell_name[1:5]

'euro'

What does cell_name[0:5] do?

cell_name[0:5]

'neuro'

What does cell_name[0:6] do?

cell_name[0:6]

'neuron'

What does cell_name[0:] (without a value after the colon) do?

cell_name[0:]

'neuron'

What does cell_name[:5] (without a value before the colon) do?

cell_name[:5]

'neuro'

What does cell_name[:] (just a colon) do?

cell_name[:]

'neuron'

What does cell_name[1:-1] do?

cell_name[1:-1]

'euro'

What happens when you choose a high value (i.e., the value after the colon) that is out of range? (For example, try cell_name[1:99].)

cell_name[1:99]

'euron'
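The results above can be summarized in a short sketch: a negative stop counts from the end of the string, and an out-of-range stop is quietly cut back to the end. A single out-of-range index, by contrast, raises an error. The `try`/`except` construct is used here only to display that error without stopping the program.

```python
cell_name = 'neuron'

# -1 refers to the last position, so these two slices match.
print(cell_name[1:-1] == cell_name[1:5])   # True

# A stop value past the end behaves like leaving it out.
print(cell_name[1:99] == cell_name[1:])    # True

# Indexing (not slicing) out of range is an error.
try:
    cell_name[99]
except IndexError as error:
    index_error_message = str(error)
    print('IndexError:', index_error_message)
```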

You must convert numbers to strings or vice versa when operating on them#

Cannot add numbers and strings.

print(1 + '2')

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 1
----> 1 print(1 + '2')

TypeError: unsupported operand type(s) for +: 'int' and 'str'

This is not allowed because it’s ambiguous: should 1 + '2' be 3 or '12'?

Some types can be converted to other types by using the type name as a function:

print(1 + int('2'))

print(str(1) + '2')
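The two conversions resolve the ambiguity in opposite directions. In the sketch below, the names `as_number` and `as_text` are illustrative.

```python
as_number = 1 + int('2')    # convert the string, then add numbers
as_text = str(1) + '2'      # convert the number, then concatenate strings

print(as_number, type(as_number))   # 3 <class 'int'>
print(as_text, type(as_text))       # 12 <class 'str'>
```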

You can mix integers and floats freely in operations#

Python 3 automatically converts integers to floats as needed.

print('half is', 1 / 2.0)
print('three squared is', 3.0 ** 2)

half is 0.5
three squared is 9.0
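To see when the result becomes a float, it can help to print the type alongside the value. This is a small sketch with illustrative variable names.

```python
int_sum = 2 + 3          # both integers: result stays an integer
mixed_sum = 2 + 3.0      # one float: result is a float
true_division = 4 / 2    # the / operator always produces a float

print(int_sum, type(int_sum))               # 5 <class 'int'>
print(mixed_sum, type(mixed_sum))           # 5.0 <class 'float'>
print(true_division, type(true_division))   # 2.0 <class 'float'>
```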

Exercises#

Fractions#

What type of value is 3.4? How can you find out?

Automatic Type Conversion#

What type of value is 3.25 + 4?

Choose a Type#

What type of value (integer, floating point number, or character string) would you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in (1), when would counting days with a floating point variable make more sense than using an integer?

Number of days since the start of the year.
Time elapsed from the start of the year until now in days.
Serial number of a piece of lab equipment.
A lab specimen’s age.
Current population of a city.
Average population of a city over time.

Solution

The answers to the questions are:

Integer, since the number of days would lie between 1 and 365. Float would make sense if you were considering partial days (e.g., if it’s noon then today would count as 0.5)
Floating point, since fractional days are required
If serial number contains letters and numbers, then a character string. If the serial number consists only of numerals, then an integer could be used, although a character string could also be used.
This will vary! How do you define a specimen’s age? Whole days since collection (integer)? Date and time (string)?
Choose integer to represent population in units of individuals, or floating point to represent population as large aggregates (e.g., millions).
Floating point number, since an average is likely to have a fractional part.

Division Types#

In Python 3:

the // operator performs integer (whole-number) floor division
the / operator performs floating-point division
the ‘%’ (or modulo) operator calculates and returns the remainder from integer division:

print('5 // 3 =', 5 // 3)

print('5 / 3 =', 5 / 3)

print('5 % 3 =', 5 % 3)
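Floor division and modulo fit together: multiplying the quotient by the divisor and adding the remainder gives back the original number. The sketch below checks this for 5 and 3 and also uses the built-in `divmod`, which returns both results at once. The variable names are illustrative.

```python
dividend = 5
divisor = 3

quotient = dividend // divisor     # 1
remainder = dividend % divisor     # 2

print(quotient * divisor + remainder == dividend)   # True
print(divmod(dividend, divisor))                    # (1, 2)
```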

Division Challenge#

Imagine we are catering an event for 100 guests, and for dessert we want to serve each person one slice of pie. Each pie yields 8 pieces. How do we calculate the number of pies we need?

We can start by simply dividing the number of guests by the number of slices per pie:

pie_eaters = 100
slice_per_pie = 8
num_pies = pie_eaters / slice_per_pie
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')

100 guests requires 12.5 pies

However, this yields a floating point number. We can’t easily bake half a pie, so we need a whole number of pies. As a first attempt, we can use floor division, which rounds down:

num_pies = pie_eaters // slice_per_pie
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')

100 guests requires 12 pies

Of course, we actually need one more pie than that, but Python doesn’t provide an operator for rounding up (“ceiling” division). So we can simply add 1 to our answer:

num_pies = pie_eaters // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')

100 guests requires 13 pies

Note that Python uses standard order of operations, so the division will be performed before the addition. That is, we will get:

(pie_eaters // slice_per_pie) + 1

not

pie_eaters // (slice_per_pie + 1)

When writing code, it’s good to test it and think about possible cases where it won’t work as intended. In this example, if the number of guests was evenly divisible by 8, then our calculation would erroneously tell us we need one more pie than we do:

pie_eaters = 64
num_pies = pie_eaters // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')

64 guests requires 9 pies

We can make our code more robust by subtracting 1 from pie_eaters within the formula:

num_pies = (pie_eaters - 1) // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')

64 guests requires 8 pies
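One way to gain confidence in the final formula is to try it on several guest counts, including values just below, at, and just above a multiple of 8. The sketch below compares it against `math.ceil` from the standard library, a function (not an operator) that rounds up. The list of guest counts is a hypothetical set of test values.

```python
import math

slice_per_pie = 8
guest_counts = [1, 8, 63, 64, 65, 100]   # hypothetical test values

formula_results = []
ceil_results = []
for pie_eaters in guest_counts:
    # Integer-only formula from the lesson.
    formula_results.append((pie_eaters - 1) // slice_per_pie + 1)
    # Rounding up the floating point quotient.
    ceil_results.append(math.ceil(pie_eaters / slice_per_pie))

print(formula_results)                   # [1, 1, 8, 8, 9, 13]
print(formula_results == ceil_results)   # True
```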

Strings to Numbers#

Where reasonable, float() will convert a string to a floating point number, and int() will convert a floating point number to an integer:

print("string to float:", float("3.4"))
print("float to int:", int(3.4))

If the conversion doesn’t make sense, however, Python will raise an error:

print("string to float:", float("Hello world!"))

Given this information, what do you expect the following program to do?

print("fractional string to int:", int("3.4"))

What does it actually do?
Why do you think it does that?

Solution

What do you expect this program to do? It would not be unreasonable to expect the Python 3 int function to convert the string “3.4” to 3.4 and then, with an additional type conversion, to 3. After all, Python 3 performs a lot of other magic - isn’t that part of its charm?

However, Python 3 raises an error. Why? To be consistent, possibly. If you want Python to perform two consecutive type conversions, you must write both conversions explicitly in code:

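int("3.4")          # raises ValueError: the string does not look like an integer
int(float("3.4"))   # works: string to float (3.4), then float to int (3)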
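The two conversions can also be written as separate steps. In this sketch the `try`/`except` construct is used only so that the failed conversion prints its message instead of stopping the program; the variable names are illustrative.

```python
text = "3.4"

# Direct conversion fails because "3.4" is not written as a whole number.
try:
    int(text)
except ValueError as error:
    print('ValueError:', error)

# Converting in two explicit steps works.
as_float = float(text)    # 3.4
as_int = int(as_float)    # 3

print(as_float, as_int)   # 3.4 3

# int() drops the fractional part rather than rounding.
print(int(3.9), int(-3.9))   # 3 -3
```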

Arithmetic with Different Types#

Given these variable definitions:

a = 1.0
b = "1"
c = "1.1"

Which of the following will return the floating point number 2.0? Note: there may be more than one right answer.

a + float(b)
float(b) + float(c)
a + int(c)
a + int(float(c))
int(a) + int(float(c))
2.0 * b

Summary of Key Points:#

Every value has a type
Use the built-in function type to find the type of a value
Types control what operations can be done on values
Strings can be added and multiplied
Strings have a length (but numbers don’t)
Use the built-in function len to find the length of a string
Use an index to get a single character from a string
Use a slice to get a substring
Can mix integers and floats freely in operations
Must convert numbers to strings or vice versa when operating on them

This lesson is adapted from the Software Carpentry Plotting and Programming in Python workshop.
