Random (not so) useful stuff

Getting stdout of a subprocess in python

From my answer to a question on stackoverflow.


Use the subprocess module.

Tip: you can go to the documentation for a particular version of python via http://docs.python.org/release/$MAJOR.$MINOR/

In Python 2.7 and above:

output = subprocess.check_output(["command", "arg1"])

In Python 2.4:

process = subprocess.Popen(["command", "arg1"], stdout=subprocess.PIPE)
stdout, stderr = process.communicate()
# Not shown: how to use Popen.poll() to wait for the process to exit
# while filling an output buffer.
print stdout
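
The comment above skips over collecting output while the process is still running. Here is one way that could look, as a minimal sketch rather than the definitive recipe. It assumes Python 3, and it reads the pipe until end-of-file and then calls wait() instead of looping on poll(). run_and_collect is a made-up helper name.

```python
import subprocess
import sys


def run_and_collect(argv):
    """Run argv and return (exit_code, list_of_stdout_lines)."""
    process = subprocess.Popen(
        argv, stdout=subprocess.PIPE, universal_newlines=True
    )
    lines = []
    # Iterating over the pipe yields lines until the child closes
    # its end of stdout.
    for line in process.stdout:
        lines.append(line.rstrip("\n"))
    process.stdout.close()
    # The pipe is at end-of-file, so now collect the exit status.
    exit_code = process.wait()
    return exit_code, lines


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('hello'); print('world')"]
    print(run_and_collect(child))
```

Reading to end-of-file before wait() keeps the child from blocking on a full pipe, which is the usual reason for draining output while waiting.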

Below Python 2.4:

output = os.popen('ls').read()

Ways of exposing methods in a python module

The following is taken from my answer to a question on stackoverflow:


module foo.py:

def module_method():
    return "I am a module method"

class ModClass:
     @staticmethod
     def static_method():
         # the static method gets passed nothing
         return "I am a static method"
     @classmethod
     def class_method(cls):
         # the class method gets passed the class (in this case ModClass)
         return "I am a class method"
     def instance_method(self):
         # An instance method gets passed the instance of ModClass
         return "I am an instance method"

now, importing:

>>> import foo
>>> foo.module_method()
'I am a module method'
>>> foo.ModClass.static_method()
'I am a static method'
>>> foo.ModClass.class_method()
'I am a class method'
>>> instance = foo.ModClass()
>>> instance.instance_method()
'I am an instance method'

If you want to make class methods more convenient to call, import the class directly:

>>> from foo import ModClass
>>> ModClass.class_method()
'I am a class method'

You can also import ... as ... to make it more readable:

>>> from foo import ModClass as Foo
>>> Foo.class_method()
'I am a class method'

Which ones you should use is somewhat a matter of taste. My personal rule of thumb is:

  • Simple utility functions that act on things like collections, perform some computation, or fetch some resource should be module methods.
  • Functions related to a class that do not require either the class or an instance should be static methods.
  • Functions that are related to a class and need the class for comparison, or to access class variables, should be class methods.
  • Functions that act on an instance should be instance methods.
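
To make the rule of thumb a bit more concrete, here is a small illustrative sketch. The Celsius class, the mean() function and all of their names are invented for this example and have nothing to do with foo.py above.

```python
def mean(values):
    # Module method: a plain utility acting on a collection.
    return sum(values) / float(len(values))


class Celsius:
    # Class variable shared by every instance.
    absolute_zero = -273.15

    def __init__(self, degrees):
        self.degrees = degrees

    @staticmethod
    def to_fahrenheit(degrees):
        # Related to the class, but needs neither the class nor an instance.
        return degrees * 9.0 / 5.0 + 32

    @classmethod
    def coldest(cls):
        # Needs the class: reads a class variable and builds a cls instance.
        return cls(cls.absolute_zero)

    def is_freezing(self):
        # Needs the instance: acts on this object's own state.
        return self.degrees <= 0
```

The class method is the one that keeps working unchanged in a subclass, since cls is whatever class it was called on.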

A Post on Design

This was originally posted as an answer on gamedev.stackexchange.com to someone looking for advice on how to design a physics engine. I thought it should be reflected here too.


How to start an Architecture/Design Task: With pen and paper.

Get yourself a large sheet of paper and start drawing out the components and items that will exist within your engine. What properties each entity will need so that you can model the physical interaction.

You’ll need to figure out what entities your code will have to deal with, which entity has the responsibility of running the physics simulation (and which bit of the simulation).

Then you’ll have to figure out how the entities are interacting. Use colours.

Now go to sleep. Leave the design for a few days and do other stuff. If you have some friends you can talk to, this is a good time to ask them what they think.

Return to your design. You’ll see some obvious problems and will feel the urge to make it more elegant. Do it.

Start coding from your second design. Rinse and repeat until you feel proud of your design and implementation. Then ask for criticism from people whose technical skills you respect, and realise that you’ve barely scratched the surface and that your engine is held together with duct tape. Rinse and repeat.

Some points:

  • There is no perfect formula for architectural systems like this. It’s a matter of experience and a taste for elegance. Books may help you by exposing you to interesting ideas, but if you’re comfortable with the physics, you’re no more or less qualified than anyone else to design an engine. [Edit]: Replace physics/engine with language/system

  • Don’t worry about a good engine. Worry about getting an engine that runs, and move towards good from there (even if it means you’re hitting polynomial complexity).

  • Worry more about being able to communicate the way your engine works to someone else. Until you can explain your code to your mum, you don’t understand what you’re doing.

That last point is most important. When you get stuck, try to explain what your engine should do to a friend, on IRC, or in an email to someone whose work you respect. The act of trying to explain it to someone else will often reveal what your problem was in the first place. (You don’t actually have to hit the “send” button, if you feel shy…)

If you want a good book on architecture, try The Architecture of Open Source Applications which you can read for free online. You could also watch Rich Hickey talk about how he designed Clojure: Hammock Driven Development for inspiration.


View SSH key fingerprint

I’m sure that you’ve all been faced with the following message:

The authenticity of host 'ipdl-fleet.com (31.222.166.122)' can't be established.
ECDSA key fingerprint is 83:d6:0f:55:6f:3b:af:dc:5a:05:79:ce:3a:6e:ab:ee.
Are you sure you want to continue connecting (yes/no)?

And most people will blindly type ‘yes’ and get on with their life, merrily ignoring the fingerprint and putting themselves at risk of a man-in-the-middle attack.

What if you don’t want to? What if you want to find out your server’s fingerprint before going travelling, rather than finding yourself in Mumbai or Volgograd without the ability to tell if you’re being compromised? I thought you’d never ask!

Run the following on your server, and write down the fingerprint on an index card in your wallet. Make sure you run the command for every host key, as there may be DSA and RSA keys as well as the ECDSA one.

$ ssh-keygen -l -f /etc/ssh/ssh_host_ecdsa_key.pub
256 2a:6f:a3:e4:a3:3c:01:75:8f:bb:aa:57:c5:b9:c4:d8  root@hostname (ECDSA)
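
If you’re curious where those colon-separated hex pairs come from: they are the MD5 digest of the base64-decoded key blob, which is the second field of the .pub file. Here is a minimal sketch of that calculation. md5_fingerprint is a made-up helper name, and the key used below is a dummy blob rather than a real host key. Newer OpenSSH releases print a SHA256 fingerprint by default, and as far as I know `ssh-keygen -l -E md5 -f <file>` asks for the older MD5 form.

```python
import base64
import hashlib


def md5_fingerprint(public_key_line):
    """Return the colon-separated MD5 fingerprint of an OpenSSH public key line."""
    # A .pub line looks like: "<key-type> <base64 blob> <comment>"
    fields = public_key_line.split()
    blob = base64.b64decode(fields[1])
    digest = hashlib.md5(blob).hexdigest()
    # Group the 32 hex characters into 16 pairs joined by colons.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


if __name__ == "__main__":
    # Dummy blob for illustration only; not a valid host key.
    dummy = "ssh-ed25519 " + base64.b64encode(b"example").decode() + " root@hostname"
    print(md5_fingerprint(dummy))
```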

What if I have an old, invalid key in my known_hosts? I hear you say.

Well, in that case, remove it using ssh-keygen:

ssh-keygen -R "hostname"

Voila!

And we’re back to useful posts again, isn’t life grand?


Working Environment

Seeing an old post by Greg Wilson where he talked about his essential equipment made me want to set down some of the things that would be important in my work environment.

Having survived without a proper desk, let alone office, at home for the past 10 months, I have acquired the bad habit of fantasising about what it would be like, so this is a chance to indulge.

Since this is now the second post on this blog that isn’t useful to anyone else (Vanity posts?) I have also changed the title to Random (not so) useful stuff :–)

A good chair: Being of above average height and weight, a good chair is important to me, especially if I’m going to be spending several hours sat in it every day. What makes a good chair really depends on your morphology though. I vividly remember the backaches I endured when my new “ergonomic” office chair was brought in during my second week. I returned to the old one after a few days, as by the time work finished the pain made it hard to get up and walk back home! In the meantime, a colleague, who received the exact same chair as I did, swears by it. Go figure.

A bowl of fresh fruit and nuts: Healthy snacks while working are a great way to keep your concentration up. My workplace gets fruits delivered every morning, and it’s been a really valuable perk. Much healthier than the stereotypical cheetos, and tastier too. The fruits rarely make it to the afternoon.

A tall glass of cool water: Keeps you refreshed and hydrated. I will usually go through 4-5 pints of water every day while working. It also has the side benefit of helping you keep your breath fresh. This leads me neatly to:

A box of strong mints: Self explanatory. I like having fresh breath!

Plenty of leg room: See chair above. I keep pulling my monitor and network cable out of my tower at work, and it’s a pain. Having plenty of space for your workstation(s) and a good cable management system will save you time, and allow you to sit in a better position.

Two or more 22"+ monitors: Multihead makes you more productive. Fact. You get diminishing returns after the second monitor, but I’d argue, especially for programmers, that three is still money well spent. You’ll often need to have both your code and a debugging session/program output/logfile side by side, and even a single widescreen will feel cramped. It’s not necessary (you can do great work on almost anything with a tty) but it sure makes life more comfortable.

Good quality keyboard and mouse: These are the only physical interface to the computer for most of us (fancy graphics tablets and 3D manipulators aside). It’s important to get them right. I pretty much swear by Logitech, and love my legacy MX1000 mouse. The MX Performance is pretty good too, and the loose scroll wheel really is more practical. If you have pains on the back of your mouse hand, get yourself a mouse that has a good ergonomic fit. It made a real difference for me, and the pain disappeared overnight after changing to the MX Performance from a smaller, less ergonomic model.

Books: Having your favourite reference book beside you as you work is invaluable, and reading on screen or on paper are two very different experiences. Margin notes and diagrams are also a way to enhance your books over time.

Hand cream and lip balm: Because, while very manly and macho, bleeding over your keyboard from cracks in your hands does not improve your productivity.

Notebook and coloured pens: I don’t care how fast you type, I know that you can take handwritten notes faster. I also know that for anything conceptual, pen and paper will be much more practical. Pens are also great to relax your hands and fiddle nervously with while your code compiles (or doesn’t). Definite bonus earned for clicky pens twirled maniacally while muttering “I am invincible”, “Pah. Level 2 programmer.” or “Nobody beats Boris” in a thick Russian accent.

A comfortable couch: To take naps in. Taking a twenty minute nap when you feel tired can help you stay focused and will actually lead to more useful hours in your day. Shame this isn’t a valid proposition at work. If you’ve been banging your head against a hard problem before dozing off, you’ll also often find that the solution comes to you fully formed when waking. Invaluable.

Glass cleaner and tissues: For some reason, my glasses get dirty absurdly quickly. I also tend to cry if I remain illuminated by fluorescent light for any period of time. Hence, glass cleaner and tissues.

Comfortable headphones: I like to listen to music while working. A fast tempo track is great to enhance my state of flow when I’m typing away, and smoother tracks help to chill me out when getting stressed. I use a pair of Sennheiser HD201s at home. They fit my big head well and can be worn for long periods of time without any discomfort. They’re also darn cheap, at around £15. Plugging in some music also means that you don’t get disturbed by ambient noise.

Stacks of coloured post-it notes: Use as bookmarks in your notebooks for different projects, or stick them on your window or wall as you implement Kanban or other agile systems.

A large whiteboard within reach: I found out really young that using a whiteboard to think helps to clarify your ideas. My mum installed a whiteboard in my room when I was seven/eight and took the time to explain fractions on it. I’ve been a convert since then. I kept a flipchart in my room while at uni, and would often map out my subjects to great benefit. Putting an architecture diagram up on a wall also seems to highlight the inherent flaws in an almost magical way. I suppose it’s a more visual form of rubber ducking.


dirstack.py - A replacement for pushd and popd

So, you use the shell.

A lot.

You’ve not even installed a file browser on your computer, as your velocity in browsing via the shell is an order of magnitude higher than with a gui.

It’s still not perfect though. Switching between your different projects forces you to put together invocations like

cd ../../../some/dir/elsewhere

You’ve started to look into pushd, popd and dirs, but you feel that the user experience isn’t as pleasant as it should be. And there’s a few problems:

  • You can’t share the stack between multiple sessions
  • Your directory stack is not persistent after restarting your laptop.
  • The commands suck. Who wants to type cd $(dirs +2)?

What it should be like.

Well, OK, that sucks a bit. What should your experience look like then?

~> ds a .
~> ds 
    [1]:   /foo/bar
    [2]:   /bar/foo
    [3]: * /home/yourname
~> ds g 2
/bar/foo>

You know what, I agree. So I wrote a script that does exactly that:

Add it to your path and add the following to your .bashrc or equivalent:

function ds {
  local dir
  dir=$(path/to/dirstack.py "$@")
  test -n "$dir" && echo "cd $dir" && cd "$dir"
}

Have a look at an example session:

$ ds
  [-]:   Empty Stack
$ ds a .
$ cd /my/secret/project/path/
$ ds a .

#
# Hack Hack Hack
#

$ ds 
  [1]:   /home/brice
  [2]: * /my/secret/project/path/
$ ds g 1
cd /home/brice
$ pwd
/home/brice
$ ds
  [1]: * /home/brice
  [2]:   /my/secret/project/path/

Cool, huh. And your stack will persist between reboots and be shared amongst all your sessions. The * in the stack listing even shows you if your current working directory is in the stack. Having a numbered stack listing means you won’t even have to think when moving between dirs.
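
The script itself isn’t reproduced in this post. To give a rough idea of how such a tool can work, here is a minimal sketch. It is not the original dirstack.py, and it makes several assumptions: the stack lives in a JSON file at ~/.dirstack.json (a made-up location), only the a and g subcommands exist, and the listing goes to stderr so that the ds wrapper’s $(...) capture only ever sees a directory to cd into.

```python
#!/usr/bin/env python
"""Hypothetical sketch of a persistent directory stack (not the original script)."""
import json
import os
import sys

STACK_FILE = os.path.expanduser("~/.dirstack.json")  # assumed location


def load(path=STACK_FILE):
    # A missing or unreadable file simply means an empty stack.
    try:
        with open(path) as handle:
            return json.load(handle)
    except (IOError, ValueError):
        return []


def save(stack, path=STACK_FILE):
    with open(path, "w") as handle:
        json.dump(stack, handle)


def listing(stack, cwd):
    # Mark the entry matching the current working directory with a "*".
    if not stack:
        return ["  [-]:   Empty Stack"]
    return ["  [%d]: %s %s" % (number, "*" if entry == cwd else " ", entry)
            for number, entry in enumerate(stack, 1)]


def main(argv, path=STACK_FILE, cwd=None):
    """Return the directory to cd into, or '' when there is nothing to do."""
    cwd = cwd or os.getcwd()
    stack = load(path)
    if len(argv) == 2 and argv[0] == "a":      # ds a <dir>: add to the stack
        target = os.path.abspath(os.path.join(cwd, argv[1]))
        if target not in stack:
            stack.append(target)
            save(stack, path)
        return ""
    if len(argv) == 2 and argv[0] == "g":      # ds g <n>: go to entry n
        return stack[int(argv[1]) - 1]         # no bounds checking in this sketch
    # Anything else: show the stack on stderr, keeping stdout empty.
    sys.stderr.write("\n".join(listing(stack, cwd)) + "\n")
    return ""


if __name__ == "__main__":
    sys.stdout.write(main(sys.argv[1:]))
```

Keeping the stack in a file is what makes it survive reboots and show up in every session. The stdout/stderr split is what lets a child process ask the parent shell to change directory.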

Voila, dirstack.py, a simple replacement for pushd and popd that works the way it should. Hope it helps!


Proposed Markdown Extension: Include External file

Problem Statement

Ever since I came across markdown, I always wanted to be able to include an external file inside a markdown document. This would allow easy structuring and organisation of large markdown documents. A good example would be to put together some code level documentation, or a multi-part report.

To make it even more useful, I’d also like to be able to include the output from a script directly in my markdown file, which would be really useful for examples and dynamic reports. A good use case for this would be to include the latest results fetched from the web for a report, or to statically generate a webpage using the latest server stats.

Syntax Extension

In order to do all this, I propose two extensions to the markdown syntax. The first, “Include file verbatim”, would look like this:

...
(> filename <)
...

This directive would take one full line and replace itself with the contents of the file. Additionally, since this is markdown, I want to be able to specify by how much the included file will be indented. To do this, the directive simply needs to be indented:

...
some text
    (> filename <)
some more text
...

This additional syntax can now be extended for executable scripts too:

...
(!> script --args A B C <)
...

In this case, the script should be executed and its output included in the markdown source. Indentation has the same effect as above.

Implementation

The implementation, it turns out, is pretty straightforward. Python’s re, shlex and subprocess modules make this a breeze:
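
The original script isn’t reproduced here, so what follows is only a minimal sketch of how such a filter could be put together. It is not the actual pinc.py. The regular expression, the expand() helper and the choice to ignore stderr are all assumptions made for illustration.

```python
#!/usr/bin/env python
"""Hypothetical sketch of the include filter (not the original pinc.py)."""
import re
import shlex
import subprocess
import sys

# Optional indent, "(>" or "(!>", the payload, then "<)" ending the line.
DIRECTIVE = re.compile(
    r"^(?P<indent>[ \t]*)\((?P<run>!?)>\s*(?P<body>.*?)\s*<\)\s*$"
)


def expand(line):
    """Return the lines replacing `line`, or [line] if it holds no directive."""
    match = DIRECTIVE.match(line)
    if match is None:
        return [line]
    if match.group("run"):
        # "(!> script --args <)": run the command and capture its stdout.
        text = subprocess.check_output(
            shlex.split(match.group("body")), universal_newlines=True
        )
    else:
        # "(> filename <)": include the file verbatim.
        with open(match.group("body")) as handle:
            text = handle.read()
    if text and not text.endswith("\n"):
        text += "\n"
    indent = match.group("indent")
    # Prefix every included line with the directive's own indentation.
    return list(map(lambda included: indent + included, text.splitlines(True)))


def main(source=sys.stdin, sink=sys.stdout):
    for line in source:
        sink.writelines(expand(line))


if __name__ == "__main__":
    main()
```

Included text is passed through untouched rather than scanned again for directives, so this sketch does not do recursive includes.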

Use

Using this module is as straightforward as it gets. Simply pipe in some input and redirect the output:

$ ./pinc.py <inputfile.md >outputfile.md

A neat trick can easily turn pinc.py into a quine:

$ echo "(> ./pinc.py <)" | ./pinc.py

Easy peasy! Interestingly, since this implementation is just a shell filter, it can be used outside of markdown too. I use it to build web pages and dynamic reports. My scripts output valid markdown, and are included in a report template. You could use pinc.py for LaTeX, HTML, or any other text-based format.

Further Work

There are several limitations to this implementation, and a few possible improvements:

  • Allow recursive includes. At the moment they are not supported, yet they could be useful, especially for literate programming.
  • Create a mechanism for capturing stderr as well as stdout.
  • Create a mechanism for prefixing every included line with an arbitrary prefix instead of just whitespace.
  • Package the functionality into a python package for inclusion into python-markdown.

Conclusion

There’s a reason why I love Python. In a few minutes I put together a really useful filter that pulls together a lot of functionality: we’ve got subprocesses being spawned, regular expression parsing, and we even use the functional features to modify every line with an anonymous function. Awesome!

Hope this motivates you to go and scratch your own itch too!


Clojure: I'm in Love

I have a confession to make. I have been unfaithful. Python, I’m sorry.

For the last two years or so I’ve been eyeing Clojure, but it’s only in the past few days that I’ve started playing with it.

Having gone through a few of the 4Clojure exercises, I returned to the latest Programming Praxis problem. You can see it here with my first-shot solution:

Write a program that takes a list of integers and a target number and determines if any two integers in the list sum to the target number. If so, return the two numbers. If not, return an indication that no such integers exist.

(defn sum-two [n xs] 
  (first 
    (for [x xs y xs :when (= n (+ x y))] 
      (list x y))))

Yes, I know that this is O(n²), but look how pretty that is. Seriously. It’s the first thing that came to my head and it works exactly as expected. Everything I’ve done with Clojure so far has been exactly the same: effortless. And this is a considerably uglier example than most of the solutions to the 4Clojure problems.

Not long ago, I too thought all those parentheses wielding weirdos had stayed under the sun for too long.

I am now a blissful convert to the cult of the lambda.

Come, don your robes, start the incantations and sacrifice the rubber chicken. This is gonna be fun!


Timing Sections of C Code

There comes a time in every developer’s life when you have to time a small section of code. It could be because you’re optimising a method and need to know if you’re making progress, or maybe you’re debugging that ten year old application that seems to take forever for the simplest task, and you have clients breathing down your neck to fix it yesterday. (hint: Synchronous network callbacks.)

Well, that time for me happened this week, and I’m going to share how it went. First we’ll put together a small and simple solution using the Linux and Posix APIs, then we’ll look at generalising our platform-specific code using macros, and eventually, we’ll break out the wonderful GLib to provide a solution that is cross-platform.

Version 1

Our first naive attempt might look something like the following:

Which works well enough. Once you read up on the APIs mentioned above, this should hold no secrets for you. The accuracy can be significantly improved by using the POSIX clock_gettime() function, but for what we need, this will be good enough. Now, the downside of this is twofold:

  1. This is not generalised, so you have to do all the declarations by hand every time, which will quickly get tedious if you have more than one timer in use at a time.
  2. This is Linux only. It would be easy enough to make this work across all Posix by implementing your own timersub() function, but it won’t work on vanilla Windows.

Version 2

Let’s take care of the first problem. To do this, we can call upon macros to do the work for us:

And use them like this:

This is simpler and more general than timing everything by hand. There’s a feature worth mentioning in the timing.h header file for preprocessor newbies (and I totally include myself in this): token concatenation. This is the ## you can see in the header. It allows us to create named timers. There are a few gotchas involved, as passing in something that isn’t a valid identifier will break the code quite badly. For example, doing something like this:

TIMEIT_INIT(bob and alice);

Would be silly, as the expansion would come to:

struct timeval timeit_t1_bob and alice, timeit_t2_bob and alice, timeit_diff_bob and alice;
double timeit_interval_bob and alice ;

And will rightly give us an error about expecting “;”.

So how does this fare? Well, it could be worse, but we’re still stuck on Posix only. (In fact, as written, the timersub() function is Linux only, but that isn’t a big obstacle.)

Version 3

What if your application has to run on Windows? Surely someone somewhere must have done the work of digging through the platform APIs to create a reliable, cross-platform solution for timing in C. Well, of course!

Taking a look through the GLib API documentation will reveal the Timer module, which is used like this:

Which has all the properties that we wanted:

  1. Cross platform
  2. Simple and generic.

Conclusion

Reinventing the wheel is sometimes educational (I learned about token concatenation while looking up those macros), but really shouldn’t be needed. Next time you need to do something that you feel you shouldn’t have to, take a look at GLib. Coming from Java, I was surprised at how much stuff was included. It’s not batteries, but it should definitely be in any C programmer’s toolbox.


Timing multiple commands at the same time.

Sometimes, you want to time multiple commands at the same time, and get a combined summary at the end. Doing this is pretty simple, but I’ve found people doing it in all sorts of ways, so, for the record:

$> time { cmd1 && cmd2 && ... ; }

And you’re done. Don’t forget to add a space after the {, and to finish with a semicolon and a space before the closing curly brace.

Learn more about command grouping, and how it differs from a subshell, at The Linux Documentation Project.
