Data Types and Conversion#


Questions:#

  • What kinds of data do programs store?

  • How can I convert one type to another?

Learning Objectives:#

  • Explain key differences between integers and floating point numbers.

  • Explain key differences between numbers and character strings.

  • Use built-in functions to convert between integers, floating point numbers, and strings.


Every value has a type#

  • Every value in a program has a specific type

  • Integer (int): represents positive or negative whole numbers like 3 or -512

  • Floating point number (float): represents real numbers like 3.14159 or -2.5

  • Character: a single character, for example "a", "j", "8", "("

    • Python has no separate char type: a single character is a string (str) of length 1

    • Characters are written in either single quotes or double quotes (as long as they match)

    • Numerals placed in quotes are treated as text, not as integers or floats

  • Character string (usually called “string”, str): text

    • Written in either single quotes or double quotes (as long as they match)

    • The quote marks aren’t printed when the string is displayed

Use the built-in function type to find the type of a value#

  • Use the built-in function type to find out what type a value has

  • Works on variables as well

    • But remember: the value has the type; the variable is just a label

print(type(52))
<class 'int'>
fitness = 'average'
print(type(fitness))
<class 'str'>

Nested function calls such as print(type(52)) are evaluated from the inside out, as in mathematics.
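As a small sketch of inside-out evaluation, the nested call can be split into two separate steps. The variable name value_type is introduced here only for illustration.

```python
# Step 1: the inner call runs first and returns the type of the value.
value_type = type(3.14159)

# Step 2: the outer call prints whatever the inner call returned.
print(value_type)

# The nested form does the same thing in one line.
print(type(3.14159))
```

Both print calls display <class 'float'>.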

Types control what operations (or methods) can be performed on a given value#

A value’s type determines what the program can do to it. So we can perform subtraction on integers:

print(5 - 3)
2

But not on strings or characters:

print('hello' - 'h')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 print('hello' - 'h')

TypeError: unsupported operand type(s) for -: 'str' and 'str'

You can use the + and * operators on strings#

“Adding” character strings concatenates them; i.e., creates one long string by combining the inputs in the order you specify.

full_name = 'Ahmed' + 'Walsh'
print(full_name)
AhmedWalsh

To add spaces between strings that you concatenate, you need to explicitly include the whitespace in quotes:

full_name = 'Ahmed' + ' ' + 'Walsh'
print(full_name)
Ahmed Walsh

Multiplying a character string by an integer N creates a new string that consists of that character string repeated N times, since multiplication is repeated addition.

greeting = 'hello-' * 3
print(greeting)
hello-hello-hello-
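To make the "repeated addition" idea concrete, the following sketch builds the same string twice, once with * and once with +, and checks that the results match. The names word, repeated, and added are illustrative only.

```python
word = 'ab'

repeated = word * 3            # repetition with *
added = word + word + word     # the same result built with +

print(repeated)                # ababab
print(repeated == added)       # True
```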

Strings have a length (but numbers don’t)#

The built-in function len counts the number of characters in a string.

print(len(full_name))
11

But numbers don’t have a length (not even zero).

print(len(52))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 print(len(52))

TypeError: object of type 'int' has no len()
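If what you actually want is the number of digits, one possible approach is to convert the number to a string first and take the length of that string. This is only a sketch; it relies on the str conversion covered later in this lesson, and the variable names are illustrative.

```python
number = 52
digits = str(number)    # '52', a string made of the characters '5' and '2'
print(len(digits))      # 2
```

Note that for a negative number the minus sign counts as a character too.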

Use an index to get a single character from a string.#

  • The characters (individual letters, numbers, and so on) in a string are ordered. For example, the string 'AB' is not the same as 'BA'. Because of this ordering, we can treat the string as a list of characters.

  • Each position in the string (first, second, etc.) is given a number. This number is called an index.

  • Indices are numbered from 0.

  • Use the position’s index in square brackets to get the character at that position.

[Figure: an illustration of indexing]

atom_name = 'helium'
print(atom_name[0])
h
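Because indices start at 0, the last character of a string sits at index length - 1. The sketch below shows this, and also shows that Python accepts negative indices, which count back from the end of the string. The names first, last, and also_last are illustrative.

```python
atom_name = 'helium'

first = atom_name[0]                      # index 0 is the first character
last = atom_name[len(atom_name) - 1]      # the last index is length - 1
also_last = atom_name[-1]                 # negative indices count from the end

print(first, last, also_last)             # h m m
```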

Use a slice to get a substring.#

  • A part of a string is called a substring. A substring can be as short as a single character.

  • An item in a list is called an element. Whenever we treat a string as if it were a list, the string’s elements are its individual characters.

  • A slice is a part of a string (or, more generally, any list-like thing).

  • We take a slice by using [start:stop], where start is replaced with the index of the first element we want and stop is replaced with the index of the element just after the last element we want.

  • Mathematically, you might say that a slice selects the half-open interval [start, stop): start is included, stop is not.

  • The difference between stop and start is the slice’s length.

  • Taking a slice does not change the contents of the original string. Instead, the slice is a copy of part of the original string.

atom_name = 'sodium'
print(atom_name[0:3])
sod
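The points above can be checked with a short sketch: the slice contains stop - start characters, and the original string is left untouched. The names start, stop, and piece are illustrative.

```python
atom_name = 'sodium'
start = 1
stop = 4

piece = atom_name[start:stop]        # characters at indices 1, 2, and 3

print(piece)                         # odi
print(len(piece) == stop - start)    # True
print(atom_name)                     # sodium (the original is unchanged)
```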

Slicing numbers?#

If you assign a = 123, what happens if you try to get the second digit of a via a[1]?

a = 123
a[1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[12], line 2
      1 a = 123
----> 2 a[1]

TypeError: 'int' object is not subscriptable

Slicing practice#

What does the following program print?

atom_name = 'carbon'
print('atom_name[1:3] is:', atom_name[1:3])
atom_name[1:3] is: ar

Slicing concepts#

cell_name = 'neuron'
  • What does cell_name[1:5] do?

cell_name[1:5]
'euro'
  • What does cell_name[0:5] do?

cell_name[0:5]
'neuro'
  • What does cell_name[0:6] do?

cell_name[0:6]
'neuron'
  • What does cell_name[0:] (without a value after the colon) do?

cell_name[0:]
'neuron'
  • What does cell_name[:5] (without a value before the colon) do?

cell_name[:5]
'neuro'
  • What does cell_name[:] (just a colon) do?

cell_name[:]
'neuron'
  • What does cell_name[1:-1] do?

cell_name[1:-1]
'euro'
  • What happens when you choose a high value (i.e., the value after the colon) which is out of range? (e.g., try cell_name[1:99])

cell_name[1:99]
'euron'

You must convert numbers to strings or vice versa when operating on them#

You cannot add numbers and strings.

print(1 + '2')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 1
----> 1 print(1 + '2')

TypeError: unsupported operand type(s) for +: 'int' and 'str'

This is not allowed because it’s ambiguous: should 1 + '2' be 3 or '12'?

Some types can be converted to other types by using the type name as a function:

print(1 + int('2'))
3
print(str(1) + '2')
12
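A typical situation where conversion is needed is when numbers arrive as text, for example from a file or from something a user typed. The sketch below assumes two such hypothetical text values and converts in both directions; all variable names are illustrative.

```python
count_text = '42'       # hypothetical text value, e.g. read from a file
ratio_text = '2.5'

count = int(count_text)       # str -> int
ratio = float(ratio_text)     # str -> float
label = 'n=' + str(count)     # int -> str, so it can be joined to other text

print(count + 1)    # 43
print(ratio * 2)    # 5.0
print(label)        # n=42
```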

You can mix integers and floats freely in operations#

Python 3 automatically converts integers to floats as needed.

print('half is', 1 / 2.0)
print('three squared is', 3.0 ** 2)
half is 0.5
three squared is 9.0
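You can use type to see which kind of number an operation produced. In this sketch (variable names are illustrative), any operation that involves a float gives a float, the / operator gives a float even for two integers, and multiplying two integers gives an integer.

```python
whole = 7
fraction = 0.5

total = whole + fraction      # int + float gives a float
quotient = whole / 2          # / gives a float, even for two ints
product = whole * 2           # int * int stays an int

print(total, type(total))         # 7.5 <class 'float'>
print(quotient, type(quotient))   # 3.5 <class 'float'>
print(product, type(product))     # 14 <class 'int'>
```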

Exercises#

Fractions#

What type of value is 3.4? How can you find out?

Automatic Type Conversion#

What type of value is 3.25 + 4?

Choose a Type#

What type of value (integer, floating point number, or character string) would you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in (1), when would counting days with a floating point variable make more sense than using an integer?

  1. Number of days since the start of the year.

  2. Time elapsed from the start of the year until now in days.

  3. Serial number of a piece of lab equipment.

  4. A lab specimen’s age.

  5. Current population of a city.

  6. Average population of a city over time.

Division Types#

In Python 3:

  • the // operator performs integer (whole-number) floor division

  • the / operator performs floating-point division

  • the ‘%’ (or modulo) operator calculates and returns the remainder from integer division:

print('5 // 3 =', 5 // 3)
print('5 / 3 =', 5 / 3)
print('5 % 3 =', 5 % 3)
5 // 3 = 1
5 / 3 = 1.6666666666666667
5 % 3 = 2
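Floor division and modulo fit together: multiplying the floor-division result by the divisor and adding the remainder gives back the number you started with. A minimal sketch with illustrative names and values:

```python
dividend = 17
divisor = 5

quotient = dividend // divisor     # whole-number part
remainder = dividend % divisor     # what is left over

print(quotient, remainder)         # 3 2

# Multiplying back and adding the remainder recovers the original number.
print(quotient * divisor + remainder == dividend)   # True
```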

Division Challenge#

Imagine we are catering an event for 100 guests, and for dessert we want to serve each person one slice of pie. Each pie yields 8 pieces. How do we calculate the number of pies we need?

We can start by simply dividing the number of guests by the number of slices per pie:

pie_eaters = 100
slice_per_pie = 8
num_pies = pie_eaters / slice_per_pie
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
100 guests requires 12.5 pies

However, this yields a floating point number. We can’t easily bake half a pie, so we need a whole number of pies. Floor division gives us a whole number, but it rounds down:

num_pies = pie_eaters // slice_per_pie
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
100 guests requires 12 pies

Of course, we actually need one more pie than that, but Python doesn’t provide an operator for rounding up (“ceiling” division). So we can simply add 1 to our answer:

num_pies = pie_eaters // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
100 guests requires 13 pies

Note that Python uses standard order of operations, so the division will be performed before the addition. That is, we will get:

(pie_eaters // slice_per_pie) + 1

not

pie_eaters // (slice_per_pie + 1)

When writing code, it’s good to test it and think about possible cases where it won’t work as intended. In this example, if the number of guests were evenly divisible by 8, then our calculation would erroneously tell us we need one more pie than we do:

pie_eaters = 64
num_pies = pie_eaters // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
64 guests requires 9 pies

We can make our code more robust by subtracting 1 from pie_eaters within the formula:

num_pies = (pie_eaters - 1) // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
64 guests requires 8 pies
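Although there is no rounding-up operator, the standard library's math module does provide a ceil function that rounds a number up to the nearest integer. The sketch below compares it with the formula above; the names by_formula and by_ceil are illustrative.

```python
import math

pie_eaters = 100
slice_per_pie = 8

by_formula = (pie_eaters - 1) // slice_per_pie + 1
by_ceil = math.ceil(pie_eaters / slice_per_pie)   # rounds 12.5 up to 13

print(by_formula, by_ceil)   # 13 13
```

Both approaches agree here. The integer-only formula avoids floating point division entirely, which can matter for very large numbers, while math.ceil may be easier to read.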

Strings to Numbers#

Where reasonable, float() will convert a string to a floating point number, and int() will convert a floating point number to an integer:

print("string to float:", float("3.4"))
print("float to int:", int(3.4))

If the conversion doesn’t make sense, however, Python raises an error:

print("string to float:", float("Hello world!"))

Given this information, what do you expect the following program to do?

print("fractional string to int:", int("3.4"))
  • What does it actually do?

  • Why do you think it does that?

Arithmetic with Different Types#

Given these variable definitions:

a = 1.0
b = "1"
c = "1.1"

Which of the following will return the floating point number 2.0? Note: there may be more than one right answer.

  1. a + float(b)

  2. float(b) + float(c)

  3. a + int(c)

  4. a + int(float(c))

  5. int(a) + int(float(c))

  6. 2.0 * b


Summary of Key Points:#

  • Every value has a type

  • Use the built-in function type to find the type of a value

  • Types control what operations can be done on values

  • Strings can be added and multiplied

  • Strings have a length (but numbers don’t)

  • Use the built-in function len to find the length of a string

  • Use an index to get a single character from a string

  • Use a slice to get a substring

  • Can mix integers and floats freely in operations

  • Must convert numbers to strings or vice versa when operating on them


This lesson is adapted from the Software Carpentry Plotting and Programming in Python workshop.