Python Notes

Posted on Fri 17 May 2019 in Tech

Introduction

Python is a high-level programming language.

Miscellaneous

  • Suppose you've got a module "binky.py" which contains a "def foo()". The fully qualified name of that foo function is "binky.foo". In this way, various Python modules can name their functions and variables whatever they want, and the variable names won't conflict — module1.foo is different from module2.foo. In the Python vocabulary, we'd say that binky, module1, and module2 each have their own "namespaces," which as you can guess are variable name-to-object bindings.

  • The input() function always returns a string, even if the user enters a number.

  • Boolean operators, from highest to lowest precedence: not, and, or.

  • An integer can be equal to a floating-point number, but never to a string:

>>> 42 == '42'
False
>>> 42 == 42.0
True
>>> 42.0 == 0042.000
True
  • In an if/elif/else statement, at most one clause is executed, so the order of the elif clauses matters.

  • When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop and reevaluates the loop’s condition.
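
A minimal sketch of three of the points above: Boolean operator precedence, the order of elif clauses, and continue. The function name classify and the sample values are made up for illustration.

```python
# Precedence: not binds tightest, then and, then or.
# So this parses as (not False) or (True and False).
result = not False or True and False

# Only the first matching clause runs, so the order of the elif clauses matters.
def classify(n):
    """Hypothetical helper: label a number by size."""
    if n > 100:
        return 'huge'
    elif n > 10:
        return 'big'
    elif n > 0:
        return 'small'
    else:
        return 'non-positive'

# continue skips the rest of the loop body and moves on to the next iteration.
odds = []
for i in range(10):
    if i % 2 == 0:
        continue
    odds.append(i)

print(result)          # True
print(classify(500))   # huge
print(odds)            # [1, 3, 5, 7, 9]
```

If the n > 10 test came first in classify, 500 would be labelled 'big' and the 'huge' clause would never run.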

Strings

Some of the methods available on str objects -

  • s.lower(), s.upper() -- returns the lowercase or uppercase version of the string

  • s.strip() -- returns a string with whitespace removed from the start and end

  • s.isalpha()/s.isdigit()/s.isspace()... -- tests if all the string chars are in the various character classes

  • s.startswith('other'), s.endswith('other') -- tests if the string starts or ends with the given other string

  • s.find('other') -- searches for the given other string (not a regular expression) within s, and returns the first index where it begins or -1 if not found

  • s.replace('old', 'new') -- returns a string where all occurrences of 'old' have been replaced by 'new'

  • s.split('delim') -- returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc']. As a convenient special case s.split() (with no arguments) splits on all whitespace chars.

  • s.join(list) -- opposite of split(), joins the elements in the given list together using the string as the delimiter. e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> 'aaa---bbb---ccc'

  • Python uses negative numbers to give easy access to the chars at the end of the string: for example, with s = 'hello', s[-1] is the last char 'o', s[-2] is 'l', the next-to-last char, and so on. Negative index numbers count back from the end of the string.

  • Since Python code does not have other syntax to remind you of types, your variable names are a key way for you to keep straight what is going on.

  • The % operator formats values into a string -
text = "%d little pigs come out, or I'll %s, and I'll %s, and I'll blow your %s down." % (3, 'huff', 'puff', 'house')
  • Character encoding -
## In Python 3 every str is a unicode string; encode() turns it into bytes
> ustring = 'A unicode \u018e string \xf1'
> s = ustring.encode('utf-8')
> s
b'A unicode \xc6\x8e string \xc3\xb1'  ## bytes of utf-8 encoding
> t = s.decode('utf-8')                 ## Convert bytes back to a unicode string
> t == ustring                          ## It's the same as the original, yay!
True
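
The string methods listed in this section can be tried together in a few lines. This is only a sketch; the sample strings and variable names are invented for the example.

```python
s = '  Hello, World  '

stripped = s.strip()                     # 'Hello, World'
lowered = stripped.lower()               # 'hello, world'
position = stripped.find('World')        # 7, index where 'World' begins
missing = stripped.find('xyz')           # -1, substring not found
replaced = stripped.replace('l', 'L')    # 'HeLLo, WorLd'
starts = stripped.startswith('Hello')    # True

parts = 'aaa,bbb,ccc'.split(',')         # ['aaa', 'bbb', 'ccc']
joined = '---'.join(parts)               # 'aaa---bbb---ccc'
words = ' one  two\tthree\n'.split()     # ['one', 'two', 'three']

word = 'hello'
last, next_to_last = word[-1], word[-2]  # 'o', 'l'

print(stripped, lowered, position, missing)
print(replaced, parts, joined, words, last, next_to_last)
```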

List

  • The del statement deletes the value at a given index in a list.

  • The for, in, and not in constructs:

squares = [1,4,9,16]
sum = 0
for num in squares:
    sum += num
print(sum)  #30
>>> spam = ['hello', 'hi', 'howdy', 'heyas']
>>> 'cat' in spam
False
>>> 'howdy' not in spam
False
list = ['larry','curly','moe']
if 'curly' in list:
    print('yay')   #yay
  • Multiple assignment trick -
>>> cat = ['fat', 'orange', 'loud']
>>> size, color, disposition = cat
>>> size
'fat'
>>> color
'orange'
  • Augmented assignment -
Augmented assignment statement    Equivalent assignment statement
spam += 1                         spam = spam + 1
spam -= 1                         spam = spam - 1
spam *= 1                         spam = spam * 1
spam /= 1                         spam = spam / 1
spam %= 1                         spam = spam % 1
  • Swapping values -
>>> a, b = 'Alice', 'Bob'
>>> a, b = b, a
>>> print(a)
Bob
>>> print(b)
Alice
  • The range(n) function yields the numbers 0, 1, ..., n-1, and range(a, b) yields a, a+1, a+2, ..., b-1.

  • Combining a for loop with range(n) gives a traditional numeric for loop:
```python
for i in range(100):
    print(i)
```
  • While loop - a standard while loop, as in C++. The break and continue statements work the same way.

  • A while loop gives total control over the index numbers.

a = range(100)
i = 0
while i < len(a):
    print(a[i])   ## prints every 3rd element: 0, 3, 6, ...
    i = i+3
  • List methods:
list = ['larry', 'curly', 'moe']
list.append('shemp')         ## append elem at end
list.insert(0, 'xxx')        ## insert elem at index 0
list.extend(['yyy', 'zzz'])  ## add list of elems at end
print(list)  ## ['xxx', 'larry', 'curly', 'moe', 'shemp', 'yyy', 'zzz']
print(list.index('curly'))   ## 2

list.remove('curly')         ## search and remove that element
list.pop(1)                  ## removes and returns 'larry'
print(list)  ## ['xxx', 'moe', 'shemp', 'yyy', 'zzz']
  • When a value appears more than once in the list, index() returns the index of its first appearance.
>>> spam = ['Zophie', 'Pooka', 'Fat-tail', 'Pooka']
>>> spam.index('Pooka')
1
  • List Slices - work just like string slices
list = ['a', 'b', 'c', 'd']
print(list[1:-1])  ## ['b', 'c']
list[0:2] = 'z'    ## replace ['a', 'b'] with ['z']
print(list)        ## ['z', 'c', 'd']
  • A string behaves much like a list of single-character strings: it supports indexing and slicing
>>> name = 'Zophie'
>>> name[0]
'Z'
>>> name[0:4]
'Zoph'
  • Lists and strings differ in that a list is a mutable data type: values can be added, removed, or changed. A string is immutable.

  • When you assign a list to a variable, you are actually assigning a list reference to the variable. A reference is a value that points to some bit of data, and a list reference is a value that points to a list.

>>> spam = [0, 1, 2, 3, 4, 5]
>>> cheese = spam
>>> cheese[1] = 'Hello!'
>>> spam
[0, 'Hello!', 2, 3, 4, 5]
>>> cheese
[0, 'Hello!', 2, 3, 4, 5]

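Python variables always store references to objects. For mutable data types such as lists or dictionaries this is visible: a change made through one variable shows up through every other variable that refers to the same object. Values of immutable data types such as strings, integers, or tuples can never be changed in place, so their variables behave as if they stored the value itself.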
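
A small sketch of the aliasing behaviour described above, using the is operator to check whether two names refer to the same object. The variable names are only illustrative.

```python
spam = [0, 1, 2]
cheese = spam              # cheese and spam now refer to the same list
cheese.append(3)           # mutate the list through one name...

print(spam)                # [0, 1, 2, 3]  ...and the other name sees it
print(cheese is spam)      # True: one object, two names

name = 'Zophie'
alias = name
alias = alias + '!'        # builds a new string and rebinds alias
print(name)                # Zophie: the original string is untouched
print(alias)               # Zophie!
```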

The copy() and deepcopy() functions in the copy module -

  • If we don't want to change the original list or dictionary -
>>> import copy
>>> spam = ['A', 'B', 'C', 'D']
>>> cheese = copy.copy(spam)
>>> cheese[1] = 42
>>> spam
['A', 'B', 'C', 'D']
>>> cheese
['A', 42, 'C', 'D']
  • If the list you need to copy contains lists, then use the copy.deepcopy() function instead of copy.copy(). The deepcopy() function will copy these inner lists as well.
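
The difference only shows up with nested lists, so here is a rough sketch of it. The names grid, shallow and deep are made up for the example.

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = copy.copy(grid)       # new outer list, same inner lists
deep = copy.deepcopy(grid)      # new outer list and new inner lists

grid[0][0] = 'changed'

print(shallow[0][0])   # changed  (inner list is shared with grid)
print(deep[0][0])      # 1        (inner list was copied)
```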

Tuples

  • Just like a list, but immutable. Written with parentheses ().
>>> eggs = ('hello',42,0.5)
>>> eggs[0]
'hello'
>>> eggs[1] = 99
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    eggs[1] = 99
TypeError: 'tuple' object does not support item assignment
  • If the tuple has only one value, indicate this by placing a trailing comma after the value inside the parentheses.
>>> type(('hello',))
<class 'tuple'>
>>> type(('hello'))
<class 'str'>
  • Converting types with the list() and tuple() functions -
```python
>>> tuple(['cat', 'dog', 5])
('cat', 'dog', 5)
>>> list(('cat', 'dog', 5))
['cat', 'dog', 5]
>>> list('hello')
['h', 'e', 'l', 'l', 'o']
```
Converting a tuple to a list is handy if you need a mutable version of a tuple value.
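
As a rough illustration of that round trip (point, as_list and updated are made-up names):

```python
point = (3, 4, 5)

as_list = list(point)      # mutable copy of the tuple's values
as_list[1] = 40            # allowed, because lists are mutable
updated = tuple(as_list)   # convert back to an immutable tuple

print(point)     # (3, 4, 5)   the original tuple is unchanged
print(updated)   # (3, 40, 5)
```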

Sorting

  • sort() method -
```python
>>> spam = [2, 5, 3.14, 1, -7]
>>> spam.sort()
>>> spam
[-7, 1, 2, 3.14, 5]
>>> spam = ['ants', 'cats', 'dogs', 'badgers', 'elephants']
>>> spam.sort()
>>> spam
['ants', 'badgers', 'cats', 'dogs', 'elephants']
>>> spam.sort(reverse=True)
>>> spam
['elephants', 'dogs', 'cats', 'badgers', 'ants']
```
  • By default sort() orders strings by character code, so uppercase letters come before lowercase ones. To sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call.
>>> spam = ['a','z','A','Z']
>>> spam.sort(key=str.lower)
>>> spam
['a', 'A', 'z', 'Z']
  • The sorted() function sorts the elements and returns a new list. The original list remains unchanged.
a = [5, 1, 4, 3]
print(sorted(a))  ## [1, 3, 4, 5]
print(a)  ## [5, 1, 4, 3]

It is also customizable with optional arguments like key and reverse=True

strs = ['aa', 'BB', 'zz', 'CC']
print(sorted(strs))  ## ['BB', 'CC', 'aa', 'zz'] (case sensitive)
print(sorted(strs, reverse=True))   ## ['zz', 'aa', 'CC', 'BB']
  • More examples on key argument
strs = ['aa', 'BB', 'zz', 'CC']

## "key" argument specifying str.lower function to use for sorting
print(sorted(strs, key=str.lower))  ## ['aa', 'BB', 'CC', 'zz']

strings = ['xc', 'zb', 'yd' ,'wa']

## Write a little function that takes a string, and returns its last letter.
## This will be the key function (takes in 1 value, returns 1 value).
def MyFn(s):
    return s[-1]

## Now pass key=MyFn to sorted() to sort by the last letter:
print(sorted(strings, key=MyFn))  ## ['wa', 'zb', 'xc', 'yd']
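
A few more key functions, as a sketch. The lists are invented sample data, and the lambda is just an inline way of writing a one-line key function like MyFn.

```python
words = ['banana', 'kiwi', 'Apple', 'fig']

# Any one-argument function can be the key; len sorts by string length.
by_length = sorted(words, key=len)
print(by_length)       # ['fig', 'kiwi', 'Apple', 'banana']

# key and reverse can be combined.
reverse_alpha = sorted(words, key=str.lower, reverse=True)
print(reverse_alpha)   # ['kiwi', 'fig', 'banana', 'Apple']

# Sort (name, number) tuples by the number in position 1.
pairs = [('larry', 3), ('curly', 1), ('moe', 2)]
by_number = sorted(pairs, key=lambda pair: pair[1])
print(by_number)       # [('curly', 1), ('moe', 2), ('larry', 3)]
```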
  • Working with tuple
tuple = (1, 2, 'hi')
print(len(tuple))  ## 3
print(tuple[2])    ## hi
## tuple[2] = 'bye'    ## NO, raises TypeError: tuples cannot be changed
tuple = (1, 2, 'bye')  ## this works

(x, y, z) = (42, 13, "hike")
print(z)  ## hike
(err_string, err_code) = Foo()  ## where Foo() is any function that returns a length-2 tuple
  • List Comprehension
nums = [1, 2, 3, 4]
squares = [ n * n for n in nums ]   ## [1, 4, 9, 16]

Common syntax - [expr for var in list]

strs = ['hello','and','goodbye']
shouting = [s.upper()+'!!!' for s in strs]
## ['HELLO!!!', 'AND!!!', 'GOODBYE!!!']
## Select values <= 2
nums = [2, 8, 1, 6]
small = [ n for n in nums if n <= 2 ]  ## [2, 1]

## Select fruits containing 'a', change to upper case
fruits = ['apple', 'cherry', 'banana', 'lemon']
afruits = [ s.upper() for s in fruits if 'a' in s ]
## ['APPLE', 'BANANA']
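
It can help to see a comprehension next to the loop it replaces. The sketch below reuses the fruits example; afruits_loop is a name added here for the comparison.

```python
fruits = ['apple', 'cherry', 'banana', 'lemon']

# Comprehension form: [expr for var in list if condition]
afruits = [s.upper() for s in fruits if 'a' in s]

# The same result built with an explicit loop
afruits_loop = []
for s in fruits:
    if 'a' in s:
        afruits_loop.append(s.upper())

print(afruits)                   # ['APPLE', 'BANANA']
print(afruits == afruits_loop)   # True
```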

Dict Hash Table

Python's efficient key/value hash table structure is called a "dict". The contents of a dict can be written as a series of key:value pairs within braces { }, e.g. dict = {key1:value1, key2:value2, ... }.

## Can build up a dict by starting with the empty dict {}
## and storing key/value pairs into the dict like this:
## dict[key] = value-for-that-key
dict = {}
dict['a'] = 'alpha'
dict['g'] = 'gamma'
dict['o'] = 'omega'

print(dict)  ## {'a': 'alpha', 'g': 'gamma', 'o': 'omega'}

print(dict['a'])    ## Simple lookup, prints alpha
dict['a'] = 6       ## Replace the value stored under key 'a'
'a' in dict         ## True
## print(dict['z'])                 ## Throws KeyError
if 'z' in dict: print(dict['z'])    ## Avoid KeyError
print(dict.get('z'))  ## None (instead of KeyError)
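
get() also takes an optional second argument that is returned when the key is missing, which makes counting easy. A minimal sketch, with an invented sample sentence:

```python
text = 'the cat and the hat and the bat'

counts = {}
for word in text.split():
    # get(word, 0) returns 0 the first time a word is seen, so no KeyError
    counts[word] = counts.get(word, 0) + 1

print(counts['the'])          # 3
print(counts['and'])          # 2
print(counts.get('dog'))      # None
print(counts.get('dog', 0))   # 0
```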
  • While the order of items matters for determining whether two lists are the same, it does not matter in what order the key-value pairs are typed in a dictionary.
>>> spam = ['cats', 'dogs', 'moose']
>>> bacon = ['dogs', 'moose', 'cats']
>>> spam == bacon
False
>>> eggs = {'name': 'Zophie', 'species': 'cat', 'age': '8'}
>>> ham = {'species': 'cat', 'age': '8', 'name': 'Zophie'}
>>> eggs == ham
True
  • Traversing dict
## By default, iterating over a dict iterates over its keys.
## Since Python 3.7 the keys come back in insertion order.
for key in dict: print(key)
## prints a g o

## Exactly the same as above
for key in dict.keys(): print(key)

## Get the .keys() view:
print(dict.keys())  ## dict_keys(['a', 'g', 'o'])

## Likewise, there's a .values() view of values
print(dict.values())  ## dict_values(['alpha', 'gamma', 'omega'])

## Common case -- loop over the keys in sorted order,
## accessing each key/value
for key in sorted(dict.keys()):
    print(key, dict[key])

## .items() is the dict expressed as (key, value) tuples
print(dict.items())  ##  dict_items([('a', 'alpha'), ('g', 'gamma'), ('o', 'omega')])

## This loop syntax accesses the whole dict by looping
## over the .items() view, accessing one (key, value)
## pair on each iteration.
for k, v in dict.items(): print(k, '>', v)
## a > alpha    g > gamma     o > omega
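
The same traversal tools combine with sorted(). The sketch below assumes Python 3.7 or later, where a dict keeps insertion order; greek is a stand-in name for the example dict, and the lambda picks the value out of each (key, value) pair.

```python
greek = {'o': 'omega', 'a': 'alpha', 'g': 'gamma'}

# Iterating gives the keys in the order they were inserted.
keys_as_inserted = list(greek)
print(keys_as_inserted)    # ['o', 'a', 'g']

# sorted() on the dict sorts the keys.
keys_sorted = sorted(greek)
print(keys_sorted)         # ['a', 'g', 'o']

# Sort the (key, value) pairs by value, largest first.
by_value_desc = sorted(greek.items(), key=lambda kv: kv[1], reverse=True)
print(by_value_desc)       # [('o', 'omega'), ('g', 'gamma'), ('a', 'alpha')]
```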
  • Dict Formatting
hash = {}
hash['word'] = 'garfield'
hash['count'] = 42
s = 'I want %(count)d copies of %(word)s' % hash  # %d for int, %s for string
# 'I want 42 copies of garfield'
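
For comparison, the same sentence can also be built with str.format() or an f-string (f-strings need Python 3.6 or later). This is an alternative to the % operator rather than something these notes rely on; values is a stand-in name for the hash dict.

```python
values = {'word': 'garfield', 'count': 42}

# % operator with named placeholders, as above
percent = 'I want %(count)d copies of %(word)s' % values

# str.format() with the dict unpacked into keyword arguments
formatted = 'I want {count:d} copies of {word}'.format(**values)

# f-string, reading plain variables from the surrounding scope
count, word = values['count'], values['word']
fstring = f'I want {count} copies of {word}'

print(percent)     # I want 42 copies of garfield
print(percent == formatted == fstring)   # True
```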
  • del operator
var = 6
del var  # var no more!

list = ['a', 'b', 'c', 'd']
del list[0]     ## Delete first element
del list[-2:]   ## Delete last two elements
print(list)     ## ['b']

dict = {'a':1, 'b':2, 'c':3}
del dict['b']   ## Delete 'b' entry
print(dict)     ## {'a': 1, 'c': 3}

Files

  • The open() function opens and returns a file handle that can be used to read or write a file in the usual way. The code f = open('name', 'r') opens the file into the variable f, ready for reading operations, and use f.close() when finished. Instead of 'r', use 'w' for writing, and 'a' for append.

  • In Python 3, text mode already uses "universal newlines": different line-endings are converted so they always come through as a simple '\n'. The old 'rU' mode from Python 2 is no longer needed and was removed in Python 3.11. The standard for-loop works for text files, iterating through the lines of the file (this works only for text files, not binary files). The for-loop technique is a simple and efficient way to look at all the lines in a text file:

# Echo the contents of a file
f = open('foo.txt', 'r')
for line in f:   ## iterates over the lines of the file
    print(line, end='')   ## end='' so print does not add an end-of-line char
                          ## since 'line' already includes the end-of-line.
f.close()
  • For writing, the f.write(string) method is the easiest way to write data to an open output file.

  • The f.readlines() method reads the whole file into memory and returns its contents as a list of its lines. The f.read() method reads the whole file into a single string.

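  • Files Unicode - The "codecs" module provides support for reading a unicode file. In Python 3 the built-in open() also accepts an encoding argument, e.g. open('foo.txt', 'r', encoding='utf-8').

import codecs

f = codecs.open('foo.txt','r','utf-8')
for line in f:
    print(line, end='')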

f.close()
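
Putting the file operations from this section together, here is a sketch for Python 3. It writes to a temporary directory so it can be run anywhere; the file name is arbitrary. It also uses the with statement, which these notes do not cover: it closes the file automatically, in place of an explicit f.close().

```python
import os
import tempfile

# Work in a throwaway directory; 'foo.txt' is just an example name.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'foo.txt')

# 'w' creates or truncates the file, then write() adds text.
with open(path, 'w', encoding='utf-8') as f:
    f.write('first line\n')
    f.write('second line with \u00f1\n')

# 'a' appends to the end instead of truncating.
with open(path, 'a', encoding='utf-8') as f:
    f.write('third line\n')

# Iterating over the file yields one line at a time, each ending in '\n'.
with open(path, 'r', encoding='utf-8') as f:
    lines = [line for line in f]

# read() returns one string; readlines() returns a list of lines.
with open(path, 'r', encoding='utf-8') as f:
    whole = f.read()
with open(path, 'r', encoding='utf-8') as f:
    all_lines = f.readlines()

print(len(lines))                # 3
print(whole == ''.join(lines))   # True

# Clean up the temporary file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```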

Regular Expressions

The Python "re" module provides regular expression support.

match = re.search(pat, str)  # returns a match object, or None if there is no match

Example:

import re

str = 'an example word:cat!!'
match = re.search(r'word:\w\w\w', str)
# If-statement after search() tests if it succeeded
if match:
    print('found', match.group())  ## found word:cat
else:
    print('did not find')

match.group() is the matching text.

  • The 'r' at the start of the pattern string designates a Python "raw" string, which passes backslashes through unchanged. This is very handy for regular expressions.
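
A short sketch of what the raw string changes, plus the None case of re.search(). The parenthesised group and match.group(1) go slightly beyond the example above and are included only for illustration; text is a made-up variable name.

```python
import re

# In a raw string the backslash is kept as an ordinary character.
print(len('\n'))         # 1  (a single newline character)
print(len(r'\n'))        # 2  (a backslash followed by 'n')
print(r'\w' == '\\w')    # True: same pattern, less escaping

text = 'an example word:cat!!'

# Parentheses capture part of the match; group(1) returns that part.
match = re.search(r'word:(\w\w\w)', text)
whole = match.group()     # 'word:cat'
animal = match.group(1)   # 'cat'

# search() returns None when nothing matches, so test before calling group().
no_match = re.search(r'word:\d\d\d', text)

print(whole, animal, no_match)   # word:cat cat None
```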