Python introduction

I just couldn't find a Python introduction as condensed and to-the-point as the excellent one for Perl by Kirrily "Skud" Robert, so I simply set out to convert that one to Python. Hope you enjoy!

(Note that this will be a constant work-in-progress for some time, and I'll probably expand it with more topics as we go. Feel free to send suggestions for improvements to samuel [dot] lampa [the a-in-an-e char you know] gmail [dot] com)

Running python programs

On Linux, just run:

python filename.py

Basic syntax overview

A Python script or program consists of one or more statements. These statements can be written in the script in a straightforward fashion. There is no need to have a main() function or anything of that kind, although one can simplify things (see below).

Python statements don't need semicolons:

print "Hello world!"

Comments start with a hash symbol and run to the end of the line:

# This is a comment

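Multi-line comments can be written as triple-quoted strings (strictly speaking these are string literals that Python evaluates and then discards), like so: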

"""This is a multi-line 
   comment"""

... or:

'''This is a multi-line 
   comment'''

Double quotes or single quotes may be used around literal strings:

print "Hello, world"
print 'Hello, world'

Numbers don't need quotes around them:

print 42

Concatenating numbers and strings needs an explicit conversion with str(), though:

print "Number: " + str(42)

Python basic data types

Python has strings, integers, floats, lists, dictionaries and tuples.

Strings

animal = "camel"

Integers

answer = 42

Floats

pi = 3.14

Some of these require conversion with str() when concatenated with strings (but not otherwise):

print animal
print "The animal is " + animal + "\n"
print "The square of " + str(answer) + " is " + str(answer * answer) + "\n"

... or, if using the inline format syntax, you can just pick the matching "placeholder" type (%s for string, %d for integer and %f for float):

print animal
print "The animal is %s \n" % animal
print "The square of %d is %d \n" % (answer, answer*answer)

... or, to make the code even more self-documenting, we can use the format() method, which is available on all strings:

print animal
print "The animal is {a} \n".format(a=animal)
print "The square of {a} is {b} \n".format(a=answer, b=answer*answer)
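As a quick check that the three styles above are interchangeable, the following sketch builds the same sentence all three ways and compares the results. The variable names (by_concat, by_percent, by_format) are made up for this illustration.

```python
answer = 42

# The same sentence built three ways
by_concat = "The square of " + str(answer) + " is " + str(answer * answer)
by_percent = "The square of %d is %d" % (answer, answer * answer)
by_format = "The square of {a} is {b}".format(a=answer, b=answer * answer)

# True if all three approaches produced the identical string
all_equal = (by_concat == by_percent == by_format)
```

Which one to use is mostly a matter of taste; the later styles tend to stay readable as the number of values grows.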

Lists:

animals = ["camel", "llama", "owl"]
numbers = [23, 42, 69]
mixed = ["camel", 42, 1.23]

Lists are zero-indexed. Here's how you get at elements in a list:

print animals[0] # prints "camel"
print animals[1] # prints "llama"

This is how to get the length of a list:

len(mixed)

... or get the last item:

mixed[-1]

To get multiple values from a list:

animals[0:2]  # gives ["camel", "llama"]
animals[0:3]  # gives ["camel", "llama", "owl"]
animals[1:]   # gives all except the first element
animals[0::2] # gives every second element, starting with the first

This is called a "list slice".
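A few more slice forms, as a small sketch (the variable names are only for illustration). The thing to remember is that the start index is included and the end index is excluded.

```python
animals = ["camel", "llama", "owl"]

first_two = animals[0:2]      # start index is included, end index is excluded
last_two = animals[-2:]       # negative indexes count from the end
every_second = animals[::2]   # a third value sets the step
copy_of_all = animals[:]      # omitting both ends copies the whole list

# A slice is always a new list, so changing it leaves the original alone
copy_of_all.append("zebra")
```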

You can do various useful things to lists:

animals.sort()    # Done in place. No need to assign the result to another variable.
animals.reverse() # Done in place. No need to assign the result to another variable.
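Because sort() and reverse() change the list itself, they return nothing useful. The sketch below contrasts this with the built-in sorted(), which is not covered above and which returns a new list instead. The variable names are made up for the example.

```python
numbers = [42, 23, 69]

# sorted() returns a new list and leaves the original untouched
ordered = sorted(numbers)
untouched = list(numbers)

# sort() and reverse() change the list itself and return None
result = numbers.sort()
numbers.reverse()
```

A common mistake is writing numbers = numbers.sort(), which replaces the list with None.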

Dictionaries (aka "Dicts"):

A dict represents a set of key/value pairs:

fruit_color = { 'apple': 'red', 'banana': 'yellow' }

You can use whitespace to lay them out more nicely:

fruit_color = {
        'apple' : 'red',
        'banana': 'yellow'
}

To get dict elements:

fruit_color["apple"] # gives "red"

You can get at lists of keys and values with keys() and values().

fruits = fruit_color.keys()
colors = fruit_color.values()

You can loop over both the keys and values with the dict's items() method:

for fruit, color in fruit_color.items():
    print "%s has color: %s" % (fruit, color)
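A few other everyday dict operations, sketched under the assumption that you want to add entries and look up keys that may be missing. The "cherry" and "kiwi" entries are invented for the example.

```python
fruit_color = {"apple": "red", "banana": "yellow"}

# Add a new pair, or overwrite an existing one, by assigning to a key
fruit_color["cherry"] = "dark red"

# "in" tests whether a key exists
has_apple = "apple" in fruit_color

# get() returns a fallback instead of raising KeyError for a missing key
kiwi_color = fruit_color.get("kiwi", "unknown")

# Loop in sorted key order (plain dicts promise no particular order in Python 2)
lines = []
for fruit in sorted(fruit_color):
    lines.append("%s has color: %s" % (fruit, fruit_color[fruit]))
```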

Conditional and looping constructs

Python has most of the usual conditional and looping constructs.

The conditions can be any Python expression. See the list of operators in the next section for information on comparison and boolean logic operators, which are commonly used in conditional statements.

If:

if condition:
    # Do something
elif other_condition:
    # Do something else
else:
    # Do something still else

While:

while condition:
    # Do something

For:

for i in range(start, stop, step): # counts from start up to, but not including, stop
    # Do something
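To make the three arguments to range() concrete, here is a small sketch. The list() calls are only there so that the results can be inspected the same way on Python 2 and Python 3.

```python
# range(start, stop, step): start is included, stop is excluded
evens = []
for i in range(0, 10, 2):
    evens.append(i)

# With one argument, range counts from 0 up to (not including) that value
first_three = list(range(3))

# A negative step counts downwards
countdown = list(range(3, 0, -1))
```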

"Foreach":

for item in my_list:
    print "This item is in my_list: " + item

To get a counter while looping, use enumerate:

for i, item in enumerate(my_list):
    print "Item nr %d is in my_list: %s" % (i, item)

There is a short form, called a list comprehension, for applying an operation to each item of a list and getting back a new list:

old_list = [1,2,3]
new_list = [i * 10 for i in old_list]

Result (the ">>>" characters indicate that this is executed in a Python shell):

>>> new_list
[10, 20, 30]
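The short form can also filter items with a trailing "if". The sketch below shows that, next to the equivalent ordinary loop. The variable names are made up for the example.

```python
old_list = [1, 2, 3, 4, 5, 6]

# An optional "if" at the end keeps only the items that pass the test
even_times_ten = [i * 10 for i in old_list if i % 2 == 0]

# The same thing written as an ordinary loop
same_result = []
for i in old_list:
    if i % 2 == 0:
        same_result.append(i * 10)
```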

Builtin operators and functions

Arithmetic:

+ # addition
- # subtraction
* # multiplication
/ # division (in Python 2, dividing two integers gives an integer: 7 / 2 is 3)

Comparison:

==      # equality
!=      # inequality
<       # less than
>       # greater than
<=      # less than or equal
>=      # greater than or equal
is      # identity (same object; use it for None checks, not to compare values)
is not  # negated identity

Logical operators:

and
or
not
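Among the comparison operators listed above, == and "is" are easy to mix up: == asks whether two values are equal, while "is" asks whether two names refer to the very same object. A minimal sketch with two lists (the variable names are only for illustration):

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a second list with the same contents
c = a           # a second name for the very same list

same_value = (a == b)    # True: the contents are equal
same_object = (a is b)   # False: they are two separate list objects
alias = (a is c)         # True: both names refer to one object

# "is" is the usual way to test for None
nothing = None
is_none = (nothing is None)
```

For numbers and strings, use == and !=. Whether "is" happens to give the same answer there depends on implementation details.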

Miscellaneous:

= # assignment

+  # string/list/tuple concatenation (dicts cannot be joined with +)

Many operators can be combined with a = as follows:

a += 1 # same as a = a + 1
a -= 1 # same as a = a - 1
a_string += "\n" # same as a_string = a_string + "\n"

Files and I/O

You can open a file for input or output using the open() function. In short:

myfile = open("filename.txt", "r") # Read-mode
myfile = open("filename.txt", "w") # Write-mode (overwrite existing content)
myfile = open("filename.txt", "a") # Append-mode (keep existing content)

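You can read from an open filehandle either into a single string or into a list containing the lines of the file (use one or the other: once read() has consumed the file, readlines() returns an empty list):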

myfile_content = myfile.read()
myfile_lines = myfile.readlines()

Reading in the whole file at one time is called slurping. It can be useful but it may be a memory hog. Most text file processing can be done a line at a time by using the file as an iterable:

for line in myfile:
    print "Just read in this line: " + line

Writing is done with the "write()" or "writelines()" methods of the filehandle:

otherfile = open("newfile.txt","w")
otherfile.write("a string")
otherfile.writelines(["a line\n", "or two...\n"])

When you're done with your filehandles, you should close() them:

myfile.close()
otherfile.close()
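Putting the pieces above together, here is a sketch of a full round trip: write a file, close it, and read it back one line at a time. The temporary directory and file name are made up so that the example cleans up after itself.

```python
import os
import tempfile

# A throwaway path for the example (made up for this illustration)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "newfile.txt")

# Write two lines; write() and writelines() do not add newlines for you
otherfile = open(path, "w")
otherfile.write("a line\n")
otherfile.writelines(["or two...\n"])
otherfile.close()

# Read it back line by line; each line still ends with its "\n"
stripped = []
myfile = open(path, "r")
for line in myfile:
    stripped.append(line.rstrip("\n"))
myfile.close()

# Remove the temporary file and directory again
os.remove(path)
os.rmdir(tmp_dir)
```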

There is an idiom, though, that makes working with files even more of a breeze: the "with" construct, which automatically closes the file when the block ends. It is used like this (note that we don't need to close the file handle):

with open("infile.txt") as infile:
    for line in infile:
        print(line)

From Python version 2.7 onwards, it is possible to open multiple file handles in one with statement, so that we can e.g. open one file for reading and another for writing:

with open("infile.txt") as infile, open("outfile.txt", "w") as outfile:
    for line in infile:
        # Just copy each line from infile to outfile:
        outfile.write(line)
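The same pattern works for any line-by-line transformation. The sketch below upper-cases each line while copying and then checks that both handles were closed automatically. The temporary paths and file contents are invented for the example.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "infile.txt")
out_path = os.path.join(tmp_dir, "outfile.txt")

# Prepare an input file with three short lines
with open(in_path, "w") as prepared:
    prepared.write("camel\nllama\nowl\n")

# Copy it, upper-casing each line on the way
with open(in_path) as infile, open(out_path, "w") as outfile:
    for line in infile:
        outfile.write(line.upper())

# Both handles were closed when the with block ended
files_closed = infile.closed and outfile.closed

# Read the result back to check it
with open(out_path) as result_file:
    result = result_file.read()

# Remove the temporary files and directory again
os.remove(in_path)
os.remove(out_path)
os.rmdir(tmp_dir)
```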

Regular expressions

Python supports Perl-style regular expression syntax through the "re" module.

Import the regular expression module at the top of any script that uses it:

import re

Matching at the start of the string

if re.match(regex_pattern, mystr): # true if regex_pattern matches at the start of mystr
    print "Found it!"
else:
    print "Did not find it!"

Finding substrings

Note, though, that re.match will not find substrings that don't start at the beginning of the string, e.g. "bar" in the string "foo bar". For this purpose you can use re.search:

if re.search("bar", "foo bar"):
    print "Yes!"
else:
    print "No!"

... which will print out "Yes!".
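The difference between re.match and re.search can be seen side by side in this small sketch. The variable names are made up for the example.

```python
import re

text = "foo bar"

at_start = re.match("foo", text)       # matches: "foo" is at the start
not_at_start = re.match("bar", text)   # None: "bar" is not at the start
anywhere = re.search("bar", text)      # matches: search scans the whole string

# Adding $ makes the pattern reach to the end of the string as well
to_the_end = re.match("foo$", text)    # None: "foo" is not the whole string

# A successful call returns a match object; start() gives its position
bar_position = anywhere.start()
```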

Finding all occurrences of a pattern:

matches = re.findall("[a-z]+", "a very long text")
for m in matches:
    print "Match: " + m

... will output:
Match: a
Match: very
Match: long
Match: text

Simple substitution

# Replaces all instances of foo with bar in mystr
mystr = re.sub("foo", "bar", mystr)
# Replaces the first instance of foo with bar in mystr
mystr = re.sub("foo", "bar", mystr, count=1)
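A short sketch of how re.sub and re.subn differ: re.sub returns the new string, while re.subn returns a tuple of the new string and the number of replacements made. The sample string is made up for the example.

```python
import re

mystr = "foo foo foo"

all_replaced = re.sub("foo", "bar", mystr)             # every occurrence
first_replaced = re.sub("foo", "bar", mystr, count=1)  # at most one occurrence

# subn() does the same replacement but also reports how many were made
new_str, how_many = re.subn("foo", "bar", mystr)
```

Assigning the result of re.subn directly to a string variable would therefore store a tuple, not a string.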

Parentheses for capturing

As well as grouping, parentheses serve a second purpose. They can be used to capture the results of parts of the regexp match for later use. The results end up in \1, \2 and so on.

Getting hold of the groups

parts = re.match("([^@]+)@(.+)", "a@b.com")
print "Part 1: " + parts.group(1) + ", Part 2: " + parts.group(2)

Replacing groups with regex

parts_str = re.sub("([^@]+)@(.+)", "Part 1: \\1, Part 2: \\2", "a@b.com")
print parts_str # Prints out: "Part 1: a, Part 2: b.com"
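A slightly fuller sketch of working with groups, including what happens when the pattern does not match: re.match then returns None, so calling group() on the result would fail. The second input string is made up for the example.

```python
import re

match = re.match("([^@]+)@(.+)", "a@b.com")

user = match.group(1)     # text captured by the first parentheses
domain = match.group(2)   # text captured by the second parentheses
whole = match.group(0)    # group 0 is the entire match
both = match.groups()     # all captured groups as a tuple

# When nothing matches, re.match returns None, so test before using it
no_match = re.match("([^@]+)@(.+)", "no at-sign here")
if no_match is None:
    status = "not an address"
else:
    status = no_match.group(1)
```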

Writing functions / subroutines

Writing a "function" is easy:

def logger(log_message):
    my_logfile = open("out.log", "a")
    my_logfile.write(log_message + "\n")
    my_logfile.close() # close the file so the message is flushed to disk

Now we can use the function just as any other built-in function:

logger("We have a logger subroutine!")

Functions can of course return values:

def square(num):
    result = num * num
    return result

Then use it like:

sq = square(8)
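Functions can also call other functions and build on their return values. In the sketch below, sum_of_squares is a made-up helper that reuses square from above.

```python
def square(num):
    result = num * num
    return result

def sum_of_squares(numbers):
    # Add up the square of every number in the list
    total = 0
    for n in numbers:
        total += square(n)
    return total

sq = square(8)
total = sum_of_squares([1, 2, 3])
```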

Overall structure of scripts

Short scripts can be written directly in the file in Python, without any main function whatsoever. One problem with this, though, is that functions have to be defined before the code that uses them, which forces the main code to the bottom of the script if you use a lot of functions.

Therefore it is common to introduce a main function, plus a little code snippet at the end that executes the main function if it detects that the current file is the one being executed (that is, not merely imported into another, larger script):

Typical structure of python script:

# some imports
import re
 
# The main function here
def main():
    # Do some stuff here
    pass # simple command to do nothing
 
# Other functions can be defined after "main()"
def other_function():
    pass
 
# Execute the main function if this file is the one that is called:
if __name__ == "__main__":
    main()
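A small runnable sketch of this structure follows. The greet function and its message are made up for the example, and print is written with parentheses so that the snippet runs on both Python 2 and Python 3.

```python
def main():
    # greet() is defined further down, but that is fine: the name is only
    # looked up when main() actually runs, after the whole file has loaded
    return greet("world")

def greet(name):
    return "Hello, %s!" % name

# Execute the main function only if this file is run directly
if __name__ == "__main__":
    print(main())
```

When the file is imported from another script, nothing is printed, but main() and greet() are still available to the importing code.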