Posts Tagged ‘python’

Improved Python Traceback Module

Published January 27th, 2010, updated January 28th, 2010.

Like any modern language, Python comes with a nice traceback module. This module gives you stack traces from the line of code where an exception is raised up to the next try-except clause. So you can easily catch exceptions and write stack traces to a debug log. This debugging technique is pretty handy for tracking down bugs and I use it a lot in prototyping.

Using the traceback module is straightforward for obvious programming mistakes. However, real bugs are context-sensitive and can hardly be reproduced without the actual data that was being processed when the exception was raised. If you can reproduce a specific bug, you can add some logging code in front of it and inspect the variables the next time the bug is triggered. But if a bug occurs once in a blue moon, you are better off logging the data the first time the exception is raised.

import traceback

def erroneous_function():
    ham = u"unicode string with umlauts äöü."
    eggs = "binary string with umlauts äöü."
    i = 23
    if i>5:
        raise Exception("it's true!")

try:
    erroneous_function()
except:
    print traceback.format_exc(with_vars=True)

Here’s my solution: an improved Python traceback module that logs variables from the local scope next to the affected code. You can find a working copy in our Mercurial repository (see below).

Traceback (most recent call last):
 File "test.py", line 16, in <module>
   Local variables:
     erroneous_function = <function erroneous_function at 0x7ff6d82b...
     __builtins__ = <module '__builtin__' (built-in)>
     __file__ = "test.py"
     traceback = <module 'traceback' from '/srv/www/vhosts/dev.teamr...
     __name__ = "__main__"
     __doc__ = None
   erroneous_function()
 File "test.py", line 13, in erroneous_function
   Local variables:
     i = 23
     eggs = "binary string with umlauts \xc3\xa4\xc3\xb6\xc3\xbc."
     ham = u"unicode string with umlauts ???."
   raise Exception("it's true!")
Exception: it's true!
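
For readers who want to see the idea without the patched module, here is a minimal sketch in Python 3 that uses only the standard library. It is not the code from the repository: the function name format_exc_with_vars and the max_len parameter are made up for this illustration. It walks the traceback frame by frame and lists each frame's local variables.

```python
import sys
import traceback


def format_exc_with_vars(max_len=60):
    """Format the exception being handled, listing local variables per frame.

    Hypothetical helper for illustration; not the patched traceback module.
    """
    exc_type, exc_value, tb = sys.exc_info()
    lines = ["Traceback (most recent call last):"]
    while tb is not None:
        frame = tb.tb_frame
        code = frame.f_code
        lines.append('  File "%s", line %d, in %s'
                     % (code.co_filename, tb.tb_lineno, code.co_name))
        lines.append("    Local variables:")
        # Copy the items first so the listing is a stable snapshot.
        for name, value in list(frame.f_locals.items()):
            try:
                text = repr(value)
            except Exception:
                text = "<unprintable>"
            if len(text) > max_len:
                text = text[:max_len] + "..."
            lines.append("      %s = %s" % (name, text))
        tb = tb.tb_next
    # Append the final "ExceptionType: message" line(s).
    for line in traceback.format_exception_only(exc_type, exc_value):
        lines.append(line.rstrip("\n"))
    return "\n".join(lines)


def erroneous_function():
    i = 23
    if i > 5:
        raise Exception("it's true!")


try:
    erroneous_function()
except Exception:
    report = format_exc_with_vars()
    print(report)
```

Truncating each repr keeps the log readable, but it does nothing about secrets that fit within the limit.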

I am not sure if it is the “right” solution as sensitive information might be logged. This might have security implications for some real-world scenarios where webapps report stack traces to the end user (e.g. by using cgitb in production).

Credit: this code was inspired by format_exc_plus by Bryn Keller.

2010-01-28: there’s an active discussion on python-dev.

get latest source code
visit mercurial repository

Today 16oo: Sickos Hack Nacht

Published October 24th, 2009.

We are going to have a Hack Nacht tonight. Sickos and other nerds are invited to exchange ideas, write code and get to know each other. I have some ideas about what to do and I’m looking forward to seeing you hackers tonight. I’ll update this article when the event is over, so stay tuned (or join us on SickosNet).

Python-MagickWand or How to Work With Icons

Published June 18th, 2009, updated December 8th, 2009.

I’m currently working on a pet project where I want to convert favicons (in Windows .ico format) to the web-standard .png format. I started using the Python Imaging Library (PIL) for this, which supports plenty of image formats and is the de facto standard for image manipulation in Python. A basic any_to_png function using PIL looks like this:

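from StringIO import StringIO
from PIL import Image
def any_to_png(buf):
    img = Image.open(StringIO(buf))
    img = img.resize((16,16),Image.ANTIALIAS)
    out = StringIO()  # write to a fresh buffer; buf is the input string
    img.save(out, format='PNG', transparency=1)
    return out.getvalue()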

This is straightforward, but I had to find out that PIL’s .ico support is pretty outdated while Microsoft has updated the specs. Modern .ico files have switched from .bmp to .png format and added alpha masks. There is a patch available that brings you .png support, but alpha masks are still broken.

So I searched for another image library that has proper support for icon files and stumbled upon my old friend ImageMagick. I found that there are Python bindings for the MagickWand interface (the C API). Yet those bindings are incomplete, ugly and not actively maintained. Then I found alternate Python bindings for the MagickWand interface, and those are pretty nice:

img = Image(StringIO(buf), 'ico')
if not img.select((16,16)):
    img.alpha(True)
    img.scale((16,16))
return img.dump('png')

You see, Ian Stevens’ CDLL bindings are a straightforward implementation of the MagickWand C API using the CDLL wrapper library. I’ve added some missing functions and documentation and clarified the licensing issues (the code is now available under a BSD license), and I think this is a clean and elegant solution for a long-standing problem. You can find a snapshot of Python MagickWand here and the latest source code there. Enjoy.

download source code
visit mercurial repository
visit original project page

2009-12-08: We’ve cloned python-magickwand and accept patches. You can find our latest work in our mercurial repository. Today, we’ve committed a magickwand6.5 patch; please look at the hg changelog for details.

Directory Snapshots

Published September 15th, 2008.

Creating backups of open files was a challenging endeavor in the past. The main problem here is inconsistency: the original data can change while it is being read by the backup software. If this happens, the backup can end up with missing references or references pointing at incorrect data. A typical example would be any kind of database. Here we have a lot of indices that are stored alongside the raw data. If an update occurs while the backup software is reading the index, the index and the data in the backup no longer match, and the restored data may not be readable anymore.

A simple strategy to deal with this type of problem is the creation of offline backups. They are fairly simple to implement and they ensure exclusive access by the backup software. However, this approach is inelegant as it requires downtime for every backup. To overcome this limitation, many applications have grown custom backup mechanisms to enable online backups. These mechanisms include simple dumps, log shipping, single- and multi-master replication and others. Although they enable online backups, they are proprietary and require application-specific know-how.

A more generic approach is the use of file system snapshots. They create a static copy of the original data within milliseconds. This copy can be backed up while the system is online and the original data keeps changing. On Linux, this snapshot functionality is part of the Logical Volume Manager (LVM). It has been included in the standard kernel since version 2.4.x and most modern Linux distributions activate it in their default installation.

dsnapshot on top of lvm

To create a file system snapshot, one basically requires the file system to be on a logical volume (LV) and some free space in the underlying volume group (VG). Then one can tell the logical volume manager to snapshot the particular volume. LVM briefly holds all I/O and creates a new copy-on-write (COW) volume within milliseconds. This new snapshot volume can be mounted and backed up safely, as it does not change when the original volume is written to.
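
As a rough illustration of these steps, the following Python 3 sketch builds the command lines one would typically run by hand. The volume group, volume and mount point names are placeholders and the 1G copy-on-write size is an arbitrary assumption. This is not how dsnapshot is implemented, and the default dry-run mode only prints the commands.

```python
import subprocess


def snapshot_commands(vg, lv, snap_name, mount_point, cow_size="1G"):
    """Return (create, remove) command lists for an LVM snapshot.

    All names passed in are placeholders chosen by the caller.
    """
    origin = "/dev/%s/%s" % (vg, lv)
    snapshot = "/dev/%s/%s" % (vg, snap_name)
    create = [
        # Reserve cow_size in the volume group for copy-on-write blocks.
        ["lvcreate", "--snapshot", "--size", cow_size,
         "--name", snap_name, origin],
        # Mount read-only so the backup cannot modify the snapshot.
        ["mount", "-o", "ro", snapshot, mount_point],
    ]
    remove = [
        ["umount", mount_point],
        ["lvremove", "--force", snapshot],
    ]
    return create, remove


def run_all(commands, dry_run=True):
    """Print the commands, or execute them (as root) when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.check_call(argv)


create, remove = snapshot_commands("vg0", "mysql", "mysql-snap", "/mnt/snap")
run_all(create)
run_all(remove)
```

The snapshot only needs enough space for the blocks that change while it exists, which is why a small COW size is often sufficient for a short backup window.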

Of course, there are good reasons to use the backup methods recommended by a software vendor. But there are also many situations where a generic approach is preferable. Think of small databases, virtual machines, mail servers and other applications that store custom index files. I’ve seen a lot of these situations in the wild and many times I wished for an easy snapshotting facility. This led to various snapshotting scripts, consolidation, adaptations and so on. Eventually, I created dsnapshot, which I want to introduce here.

The dsnapshot script provides a high-level interface to the Linux Logical Volume Manager. It uses LVM’s block-level snapshot support to create directory snapshots. In contrast to block-level snapshots, directory snapshots work at the file system layer. Thus, you can snapshot any directory that is on a logical volume and you don’t have to worry about the actual logical volumes, mount points and paths.
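
How dsnapshot maps a directory to its volume is not described here. One plausible first step, sketched in Python 3 under that assumption, is to walk up from the directory to the nearest mount point and remember the path below it. The function names are invented for this example.

```python
import os


def find_mount_point(path):
    """Walk upwards from path until a mount point is reached."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def split_at_mount_point(path):
    """Return (mount_point, path_relative_to_mount_point)."""
    real = os.path.realpath(path)
    mount = find_mount_point(real)
    return mount, os.path.relpath(real, mount)


mount, relative = split_at_mount_point(os.getcwd())
print(mount, relative)
```

With the mount point known, the backing logical volume could be looked up in the mount table, and the relative path appended to wherever the snapshot gets mounted.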

This is the actual syntax for creating…

$ dsnapshot --create /srv/mysql/test/
/var/lib/dsnapshot/srv-fdf2e6dc/mysql/test/

… and removing a directory snapshot.

$ dsnapshot --remove /var/lib/dsnapshot/srv-fdf2e6dc/mysql/test/

I’ve found this script very handy when you need to back up single directories instead of whole volumes.

download source code

301

Published July 9th, 2008, updated September 30th, 2008.

301 is a URI redirector. It allows you to create short links for complex web addresses. Just submit a longish URI at 301.sickos.org, and you will get a short link that points to the original address. You can pass this on via Twitter, IRC or wherever you want to avoid complex web addresses.

301

This service was inspired by tinyurl.com and monkey.org/sl. In contrast to their services, 301 comes with full Python source code. This gives you the freedom to run your own 301 service and adapt it to your needs. Get the source code at benjamin-schweizer.de/files/301/.
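
To give an idea of how little is needed for such a service, here is a minimal sketch in Python 3. It is not the 301 source code: the class name, the in-memory dictionary and the base-36 key scheme are all assumptions made for this example. It is written as a WSGI application that answers known keys with an HTTP 301 redirect.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase


def encode(number):
    """Encode a non-negative integer as a short base-36 key."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, rem = divmod(number, len(ALPHABET))
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Redirector:
    """Tiny WSGI app; links live in memory only (illustrative)."""

    def __init__(self):
        self.links = {}

    def shorten(self, uri):
        # The next free counter value becomes the key.
        key = encode(len(self.links))
        self.links[key] = uri
        return key

    def __call__(self, environ, start_response):
        key = environ.get("PATH_INFO", "").lstrip("/")
        target = self.links.get(key)
        if target is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"unknown link\n"]
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]


app = Redirector()
key = app.shorten("http://example.org/a/rather/longish/address?with=parameters")
print(key)
```

A real deployment would persist the link table and validate submitted URIs before storing them.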

hint: use pedit to manage the link database

use service
download source code

Htpasswd Editor

Published June 13th, 2008, updated September 18th, 2008.

User authentication on Unix systems typically relies on password files or directory services. Both contain logon names, user ids, passwords, the location of your home directory and other information. The choice of the right authentication backend typically depends on the number of users you have to manage and on your system environment.

If you have decided to use simple password files, you can create different files for various services. This gives you the opportunity to separate system users from service users. Further, this enables you to delegate administrative rights to certain people.

However, user management still requires you to fiddle with command line tools. This is fine if you are a Unix lover, but if you want somebody with little command line experience to manage your users, you will probably prefer a user interface that guides the inexperienced and reduces the risk of crashing the system.

Htpasswd Editor

This is exactly what htpasswd_editor does. It provides a text user interface for htpasswd(1) files and can easily be integrated with popular software like the Apache Web Server, VSFTP Daemon and other PAM-enabled programs (using pam_pwdfile).

update 2008-09-17: there’s a new bugfix release available

download source code

Exploiting Python’s Class Dispatcher

Published May 16th, 2008.

In object-oriented programming, the class dispatcher is a built-in mechanism that looks up member functions and executes them in the context of a given class. In Python, those lookups are conducted dynamically, enabling one to modify the behaviour of a class without the need for subclassing. Here are some unusual yet useful examples.

# class-based programming style
class Foo:
    pass

class Bar(Foo):
    def bar(self):
        print "bar"

bar = Bar()

bar.bar() # prints "bar"

So what? If you write all the code on your own, you are fine. You can subclass Foo and invoke all methods from the new class Bar. But what if the instantiation is done in code sections that you cannot modify? Imagine you are writing a plugin and you do not want to touch the code of others. They decided to instantiate Foo and you do not want to change this, nor do you want to change Foo.

# prototype-based programming style
class Foo:
    pass

foo = Foo()

def bar(self):
    print "bar"

foo.__class__.bar = bar

foo.bar() # prints "bar"

This second example shows how to “inject” a method into an already instantiated object by attaching it to the object’s class. In fact, this works because Python uses dynamic delegation. Objects and classes are inspected at runtime, so the dispatcher finds attributes even if they are added after object instantiation.
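
One consequence is worth spelling out: an attribute set on the class is visible to every instance of it. If only a single object should gain the method, it can be bound to that object with types.MethodType from the standard library. The following sketch (Python 3, with made-up object names) contrasts the two.

```python
import types


class Foo:
    pass


def bar(self):
    return "bar"


first = Foo()
second = Foo()

# Class-level injection: every instance, existing or future, gains bar().
Foo.bar = bar
both_have_it = (first.bar() == "bar" and second.bar() == "bar")
del Foo.bar

# Instance-level injection: bind the function to one object only.
first.bar = types.MethodType(bar, first)
only_first = hasattr(first, "bar") and not hasattr(second, "bar")

print(both_have_it, only_first)
```

For a plugin, the instance-level variant is the less invasive choice because other users of Foo are not affected.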

Entropy Password Generator

Published March 27th, 2008.

Entropy is a password generator. It generates two kinds of passwords: i) low entropy passwords that humans can easily remember and ii) high entropy passwords as commonly used in stored sessions. The low entropy passwords are generated from the Basic English vocabulary by C.K. Ogden. The high entropy passwords are random alphanumeric passwords from which similar-looking characters are stripped.

Basic English Passwords (low entropy / e=649,527,500)
note564still             cover624powder           box300person
discovery371spring       over425such              arm781great
daughter658advertisement woman600cushion          help695money
not750sweet              where289brain            present557see
brain787polish           sticky446change          fly679fear
body411oven              system475house           frequent497size
dog303level              cushion435boy            great870language
porter288doubt           awake847pull             hat783burn              

Mixed Alpha Numeric Passwords (high entropy / e=10^18)
6rt84tZrvUkLrtE2 AG7HQEjxQDg4Znao v9DUzzJc8X97FQqj cXTQmY3gvvkvwhTx
VJBEC4RFRtTPNgFA Z4pcMrRPMuE8a4EM EcyJArGdH2D6jZBT wr75cJdmzuF9a9LX
wce4yXfhdnwjEnU9 hGKfFYuRwQMkAnqg BEmtkbjtLEyKM3YW wVgxoX82TfGmxbuT
ho3zNKvZCBQ3wgJ6 mvKTTyy6TN9zCCZ8 fKr8eWL34XDNQyKG wCQFtYHQcaxmoAep
Mp7dMC8gDBMa9qGh TGRKnW58cT8z66a4 dZAt2ghzCbDkdmJA P2XpNxFRDjcfQG83
gch7TqT2d6RYzpGb xeZWbqDegADXoRnu xmmeJXkFdTXzcWam t9JL3DpKoMPMYrac
URcVPrCRuQETzVVe aJnw4wghHcj3jCqr 9g9pVYtGtq5RhCaG oJ4y3k8rdjmnUE6w
aTWyu76uu5TPgkCv aLeffq6MVNfAnxp7 EnqeUkjHPkgwv3AG q5Zmmc3GzJyxneHn
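
Both password styles are easy to sketch in Python 3 with the standard secrets module. This is not the Entropy source code. The word list below is a tiny placeholder for the 850-word Basic English vocabulary, and the set of ambiguous characters is a guess. The number range 100 to 998 is also an assumption, chosen because 850 × 899 × 850 = 649,527,500 matches the figure quoted for the low entropy passwords.

```python
import secrets
import string

# Placeholder; the real generator draws from the 850 Basic English words.
WORDS = ["note", "still", "cover", "powder", "box", "person", "arm", "great"]

# Assumed set of look-alike characters to leave out.
AMBIGUOUS = "0O1lI"
ALPHABET = "".join(c for c in string.ascii_letters + string.digits
                   if c not in AMBIGUOUS)


def low_entropy_password(words=WORDS):
    """word + three-digit number + word, e.g. in the style note564still."""
    number = 100 + secrets.randbelow(899)  # 100..998, i.e. 899 values
    return "%s%d%s" % (secrets.choice(words), number, secrets.choice(words))


def high_entropy_password(length=16):
    """Random alphanumeric password without look-alike characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(low_entropy_password())
print(high_entropy_password())
```

The secrets module draws from the operating system's random source, which is the appropriate choice for passwords; the random module is not.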

This application is written in Python and supports both a CGI interface for your web server and a command line interface. From a security perspective, I strongly recommend using the command line version after reading the source code.

use service
download source code

PySqlite2: Unicode Bug while Processing Non-unicode Text

Published March 5th, 2008, updated September 11th, 2008.

Pysqlite-2.3.2 accepts binary data on inserts but selects return unicode strings. This results in unicode conversion bugs when non-unicode bytes are stored in the database.

As sqlite3 accepts binary data in text fields, this seems to be a bug in pysqlite. To fix it, one could i) either restrict inserts to unicode strings or ii) change the result from unicode to binary.

However, the first would break compatibility with sqlite and the latter would break compatibility with existing code. Thus, this should be discussed with the authors.

import sqlite3

connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
cursor.execute('''CREATE TABLE test (t TEXT)''')
cursor.execute('''INSERT INTO test (t) VALUES (?)''', (chr(128),))
cursor.execute('''SELECT t FROM test''')
# Traceback (most recent call last):
#   File "pysqlite_utf8.py", line 10, in <module>
#     cursor.execute('''SELECT t FROM test''')
# sqlite3.OperationalError: Could not decode to UTF-8 column 't' with text '?'

print cursor.fetchone()
connection.close()
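
Until this is settled, the decoding can be sidestepped per connection with the text_factory attribute, which controls the type returned for TEXT columns. The sketch below is written for Python 3 and the sqlite3 module in its standard library, so the byte string and the CAST used to store a non-UTF-8 byte as TEXT are adaptations of the example above, not part of the original bug report.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Return TEXT columns as raw bytes instead of decoding them as UTF-8.
connection.text_factory = bytes

cursor = connection.cursor()
cursor.execute("CREATE TABLE test (t TEXT)")
# CAST keeps the raw byte but stores it with TEXT type, as in the bug.
cursor.execute("INSERT INTO test (t) VALUES (CAST(? AS TEXT))", (b"\x80",))
cursor.execute("SELECT t FROM test")
row = cursor.fetchone()
print(row)
connection.close()
```

This corresponds to option ii) above, but as an opt-in per connection, so existing code that expects unicode results keeps working.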

pysqlite ticket
Debian ticket
download source code

Web Watchdog

Published November 28th, 2007, updated March 7th, 2008.

The Web Watchdog notifies you when a website gets updated. Instead of continuously returning to blogs, forums or discussion boards, you can ask the Web Watchdog to do this for you.

Web Watchdog, Bookmarklet

use service
download source code

Ipmap

Published October 17th, 2007, updated April 10th, 2008.

Ipmap is a GTK-based IP address grapher, inspired by an XKCD comic and glTail. It reads data from standard input and maps IP/size pairs on a grid (see the screenshot). Due to this simple interface, it is easy to create filters for a variety of data sources. The program comes with some example filters, including ones for tcpdump output, Apache/ProFTPd access logs and Squid logs.
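
A filter for such a pipeline can be very small. The following Python 3 sketch extracts IP/size pairs from access log lines in Apache's common log format. It is not one of the bundled filters, and the whitespace-separated output format is an assumption, as the exact input format Ipmap expects is not spelled out here.

```python
import re

# host ident user [date] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (\d+|-)')


def ip_size_pairs(lines):
    """Yield (ip, size) tuples from common log format lines."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            # Apache logs "-" when no body was sent.
            size = 0 if match.group(2) == "-" else int(match.group(2))
            yield match.group(1), size


sample = [
    '192.0.2.7 - - [17/Oct/2007:10:00:00 +0200] "GET / HTTP/1.1" 200 5120',
    '198.51.100.3 - - [17/Oct/2007:10:00:01 +0200] "HEAD / HTTP/1.1" 304 -',
    'not a log line',
]
pairs = list(ip_size_pairs(sample))
for ip, size in pairs:
    print(ip, size)
```

To use it as a real filter, one would iterate over sys.stdin instead of the sample list and pipe the output into the grapher.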

download source code

pedit

Published June 6th, 2007, updated February 29th, 2008.

“The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure.” Pedit is an interactive editor for such data structures, making them handy on the command line. The code is a work in progress, yet already useful…

Here’s a sample:

# /tmp/foo.pickle - pedit

[
  "foobar",
  True,
  1,
  [
    2,
    3,
    4,
    {
      "key": "value",
    },
  ],
]

# eof.
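
For context, the round trip behind such a file looks like this in Python 3. The sketch only shows how the sample structure is serialized and restored with the standard pickle module; it does not show pedit's own code.

```python
import pickle
import pprint

# The structure from the sample above.
data = ["foobar", True, 1, [2, 3, 4, {"key": "value"}]]

blob = pickle.dumps(data)      # serialize to bytes, as stored in the file
restored = pickle.loads(blob)  # de-serialize back into Python objects

pprint.pprint(restored, width=20)
```

Keep in mind that unpickling can execute arbitrary code, so only load pickle files from sources you trust.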

download source code