An Academic Inconvenience of Python

Sometimes Python’s roots in academia bug me. Lots of functions have a computer science feel instead of a software development feel. Here’s an example I just ran into: I wanted to fit as many sentences as possible from a long text into 255 characters. So I wrote:

s = s[:255]
s = s[:max(s.rindex('.'), s.rindex('!'), s.rindex('?')) + 1]

This snippet chops it down to the 255 max, finds the ., !, or ? marking the end of the last sentence, and chops there. Great, right? Except it doesn’t work.

Instead of returning None when it can’t match the substring, rindex throws ValueError. So unless the first 255 characters of the string contain a ., !, and ? it’ll throw an exception. OK, let’s try:

s = s[:255]
rightmost = -1
try:
    rightmost = s.rindex('.')
except ValueError:
    pass
try:
    rightmost = max(rightmost, s.rindex('!'))
except ValueError:
    pass
try:
    rightmost = max(rightmost, s.rindex('?'))
except ValueError:
    pass
s = s[:rightmost + 1]

Eww. OK, let’s encapsulate that redundancy:

def no_exception_rindex(s, substring):
    try:
        return s.rindex(substring)
    except ValueError:
        return None

s = s[:255]
s = s[:max(no_exception_rindex(s, '.'), no_exception_rindex(s, '!'), no_exception_rindex(s, '?')) + 1]

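That’s… well, it’s at least a little better. It’s lucky that max doesn’t mind seeing None; I could imagine it throwing its own ValueError. (That luck is specific to Python 2: Python 3 raises TypeError when max compares None with an integer, and None + 1 fails on either version if no punctuation is found at all.) But I wouldn’t call this code good: we’ve been forced to switch out of object-oriented code because we can’t add our no_exception_rindex to the string objects.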
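For comparison, str.rfind behaves like rindex but returns -1 instead of raising when the substring is missing, which sidesteps both the exception and the None comparison. A minimal sketch, assuming a plain -1 sentinel is acceptable here; the function name truncate_to_sentences and its limit parameter are made up for illustration:

```python
def truncate_to_sentences(s, limit=255):
    # Hypothetical helper; limit mirrors the 255-character cap used above.
    head = s[:limit]
    # str.rfind returns -1 (rather than raising) when the substring is absent.
    rightmost = max(head.rfind('.'), head.rfind('!'), head.rfind('?'))
    # With no punctuation at all, rightmost is -1 and the slice is empty.
    return head[:rightmost + 1]
```

Because -1 + 1 is 0, a text with no sentence-ending punctuation comes back as an empty string; whether that or the raw 255-character chunk is the better fallback depends on the use case.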

Here’s another approach:

def rightmost_punctuation(s):
    # Scan backward from the end; falls through to -1 if nothing matches.
    index = len(s) - 1
    while index >= 0 and s[index] not in ['.', '!', '?']:
        index -= 1
    return index

s = s[:255]
s = s[:rightmost_punctuation(s) + 1]

I’d actually call this one worse, as it’s not immediately obvious what it’s doing. And anytime I create a variable and then tinker with it inside a loop I feel like I want to rewrite that code to use map and/or reduce.
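The loop-free version hinted at here might look something like the following. This is only a sketch: it swaps the explicit loop for a generator expression fed to max, and it assumes Python 3.4 or later for the default argument. The marks parameter is an addition for illustration, not part of the original function.

```python
def rightmost_punctuation(s, marks='.!?'):
    # Collect the index of every sentence-ending mark, then take the largest.
    # default=-1 (Python 3.4+) covers the case where no mark is present.
    return max((i for i, ch in enumerate(s) if ch in marks), default=-1)

text = "First sentence. Second one! And a trailing fragment"
head = text[:255]
print(head[:rightmost_punctuation(head) + 1])  # First sentence. Second one!
```

It walks the whole string rather than stopping at the first match from the right, so it trades a little speed for having no mutable index variable.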

Tomorrow I’ll redo this example in Ruby to talk about open classes, but for today does anyone have a better approach in Python?