PDF to text in Python

PDFMiner is a suite of Python programs that help extract and analyze text data from PDF documents. Unlike other PDF-related tools, it can report the exact location of text on a page, along with extra information such as fonts and ruled lines. It includes a PDF converter that can transform PDF files into other text formats (such as HTML), and an extensible PDF parser that can be used for purposes other than text analysis.

I downloaded a PDF from work that has my call stats in tabular form, and PDFMiner converted it without trouble.

[25] 21:56:47–> pdf2txt.py -t html test.pdf > test.htm
/Library/Python/2.6/site-packages/pdfminer-20091004-py2.6.egg/pdfminer/pdfparser.py:8: DeprecationWarning: the md5 module is deprecated; use hashlib instead
  import md5, struct
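
The warning is harmless: it only says that the old md5 module still works but has been superseded by hashlib. As a rough illustration of what the warning is asking for (this is not PDFMiner's actual code, and the helper name md5_hex is made up for this example), the hashlib equivalent looks something like this:

```python
import hashlib


def md5_hex(data):
    """Return the hexadecimal MD5 digest of a bytes object.

    Hypothetical helper, shown only to illustrate the hashlib
    replacement for the deprecated md5 module.
    """
    return hashlib.md5(data).hexdigest()


# Hash some arbitrary bytes, the way a parser might fingerprint a chunk of a file.
print(md5_hex(b"example bytes from a PDF"))
```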

My project is to:
1. Email my weekly stats from work.
2. Write a Python program to read the data into a temporary text file.
3. Load the data into a SQLite database.
4. Produce graphs of my average call times.
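
Steps 2 and 3, plus the averaging part of step 4, could look roughly like the sketch below. The sample text, the column layout (day, number of calls, average seconds per call) and the table name call_stats are all made up for illustration, since the real report layout is not shown here, and the graphing itself is left out.

```python
import sqlite3

# Hypothetical sample of the extracted text; the real report layout may differ.
SAMPLE_TEXT = """\
2009-10-05 42 312
2009-10-06 38 298
2009-10-07 45 305
"""


def parse_rows(text):
    """Turn whitespace-separated lines into (day, calls, avg_seconds) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        day, calls, avg_seconds = parts
        rows.append((day, int(calls), int(avg_seconds)))
    return rows


def load_rows(conn, rows):
    """Create the (assumed) call_stats table if needed and store the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_stats "
        "(day TEXT PRIMARY KEY, calls INTEGER, avg_seconds INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO call_stats VALUES (?, ?, ?)", rows)
    conn.commit()


def weighted_average_seconds(conn):
    """Average call time over all days, weighted by the number of calls per day."""
    total_seconds, total_calls = conn.execute(
        "SELECT SUM(calls * avg_seconds), SUM(calls) FROM call_stats"
    ).fetchone()
    return float(total_seconds) / total_calls


conn = sqlite3.connect(":memory:")  # a file name would be used for the real thing
load_rows(conn, parse_rows(SAMPLE_TEXT))
print(round(weighted_average_seconds(conn), 1))
```

Weighting by the number of calls keeps a quiet day from counting as much as a busy one when the daily averages are combined.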

The Bike has arrived

A happy me after taking ownership of my new bike ... but nervous as hell, because it has been 7 years since I last rode a motorbike.

Ready to go. You can find more pictures on Flickr.

The only accessories I have purchased so far are:

1. Boots

2. New Jacket

3. Crash guard for the bike, which you can see in the photos

4. New Gloves

The owner of the shop said it was an A0 model, i.e. not A9 (2009) but A0 (2010), but I have no way to confirm this. My bike was manufactured in August 2009. I went back to the shop recently and all the 2010 models are in now; they are all dated September 2009.

V-Strom 650 ABS Info

Trying out the bike in the shop, but wrong colour and no ABS

I am getting the white version of this bike

From Wikipedia

The Suzuki V-Strom DL650 is a mid-weight dual-sport motorcycle manufactured in Japan by Suzuki and sold worldwide. It was launched in 2004. The name V-Strom combines V, referring to the bike’s engine configuration, with the German word Strom, meaning stream or current.[1]

Unlike specialized motorcycles, the V-Strom 650 trades strength in a single area for adaptability to a variety of riding conditions: commuting, cruising, adventure touring and to a lesser degree off-road riding. In this respect, the DL650 resembles a UJM with a broad character taking the place of specific strength.

Mechanicals

A 6-speed transmission mates to the fuel-injected and slightly retuned 645 cc liquid-cooled, 4-stroke V-twin engine from Suzuki’s own SV650 sport bike. An upright, standard riding posture and 427 lb (194 kg) dry weight contribute to the bike’s handling characteristics.

Engine

The engine is a 90-degree, liquid cooled, 4 stroke V-twin, with 81 mm (3.2 in) bore and a 62.6 mm (2.46 in) stroke, four valves per cylinder, and intake and exhaust valving each with their own camshaft. More relaxed cam profiles over the SV engine boost the power between 4000 and 6500 rpm, along with slight changes to the airbox and exhaust. Relative to the SV, the crank inertia (flywheel effect) is also increased by 4% via a redesigned starter clutch.[2] As well, the DL650 engine uses a plastic outer clutch cover and engine sprocket cover for reduced weight and noise.[2]

In a significant departure from the SV engine, which uses cast iron cylinder sleeves, the DL650 uses Suzuki’s proprietary SCEM (Suzuki Composite Electro-chemical Material) plated cylinders, a race-proven nickel-phosphorus-silicon-carbide coating for reduced weight and improved heat transfer, allowing for tighter and more efficient piston-to-cylinder clearance[2], similar to a Nikasil coating.

Engine electronics

The DL650 employs sophisticated engine electronics for starting and throttle control and uses Suzuki’s AFIS (Auto Fast Idle System), eliminating a fast-idle control. The engine control module (ECM) reads engine information (e.g. coolant temperature) via a 16-bit central processing unit (CPU), controlling the fuel system’s dual throttle bodies and contributing to strong acceleration up to a rev-limited 10500 rpm.

Emissions

The DL650 employs Suzuki Dual Throttle Valve (SDTV) fuel-injection and exhausts via a two-into-one exhaust system with a catalytic converter in the muffler. European models meet Euro 3 emissions specifications. In the US, a “PAIR” air injection system reduces CO and HC emissions.

Chassis

A twin-spar aluminum frame and swingarm accommodate a rear Showa mono-shock with a hydraulic preload adjuster. Front Showa shocks are preload adjustable. The DL650 uses a 19-inch front wheel and a 17-inch rear wheel.

Instruments and bodywork

The bike’s instrument cluster includes a compact analog step-motor speedometer and tachometer (both with LED illumination) and a digital LCD unit with odometer, tripmeter, coolant temperature gauge, fuel gauge, LED neutral, digital clock, turn signal and high beam lights and an oil pressure warning light.

An adjustable windshield allows movement of 50 mm. A small underseat compartment, suitable for small tools, gloves, or an owner’s manual, can be accessed by removal of the seat, via a keyed lock located at the rear of the bike, just below the built-in rack.

Global sales and manufacture

Sold in Europe, Oceania and the Americas, the DL650 competes with the Aprilia Caponord and Pegaso, the BMW F650 series and, most directly, the Kawasaki Versys. The Suzuki DL650 is manufactured at Suzuki’s ISO 14001 certified plant in Toyokawa, Japan.

European model 2004 DL650, note the lack of small round side reflectors, shown with aftermarket crashguards, belly-pan, centre-stand and windscreen.

US Model 2005 DL650, with aftermarket windshield and bracket, hand guards, Givi crash guards, Suzuki tall seat, and top case.

Awards

The V-Strom 650 was named one of the “ten best” bikes under $10,000 by Motorcyclist (USA) magazine, October, 2007—beating out, among many others, the V-Strom 1000. In a September 2006 article, Cycle World magazine wrote “the DL650 may just be the most shockingly competent machine in the world today.”[3] A 2004 article from MotorcycleUSA.com said “it was hard to imagine another machine with a competitive versatility-per-dollar ratio.”[4] Twice consecutively, the DL650 has earned the title “Alpenkoenig”, winning Motorrad magazine’s (Germany) grueling trans-alp multi-bike test in 2005 and 2006.[5]

References

  1. “2002 Suzuki DL1000 V-Strom”. Motorcycle.com.
  2. “Suzuki V-Strom 650, Sean Alexander, Mar. 21, 2004”. www.Motorcycle.org.
  3. September 2006 article from Cycle World
  4. 2004 article from MotorcycleUSA.com
  5. Alpenkoenig from MOTORRADonline.com

Learning Python

After scanning through the archived Python articles at Clark’s Tech Blog, I noticed a link to MIT 6.00 Intro to Computer Science & Programming, Fall 2008.

This subject is aimed at students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems. It also aims to help students, regardless of their major, to feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class will use the Python™ programming language.

Instructors: Prof. Eric Grimson, Prof. John Guttag
View the complete course at: http://ocw.mit.edu/6-00F08
License: Creative Commons BY-NC-SA
More information at http://ocw.mit.edu/terms
More courses at http://ocw.mit.edu

I have downloaded the episodes from iTunes U and the first lecture was quite good.
It brings back old memories of sitting in lecture theatres listening to the same concepts. The only difference is that we used Pascal, which shows my age.

Ripping CDs

I have just finished ripping my CD collection into Apple Lossless audio files.

Box of CDs

Now I will never need to rip my music again because, unlike MP3 or AAC files, Apple Lossless does not throw away any audio data when it compresses, so converting from Apple Lossless to another future lossless format means no audio quality is lost.

Converting to Apple Lossless has blown out my music folder from around 120 GB to just under 160 GB.

Screendump of my iTunes folder

As you can see, I am a huge fan of podcasts and have a few audiobooks as well.
iTunes U is pretty good, as there is a lot of good free stuff to watch and listen to.

OpenCL

After installing Snow Leopard on my MacBook I was excited to see if I could use OpenCL.
OpenCL is a technology that uses your graphics card to do tasks that are traditionally done by the CPU.

The most common example given is encoding videos. A video card's GPU is substantially faster than a CPU for some tasks.
For example, the benchmark below quotes a reference result in which a Core 2 Duo (C2D) running at 3 GHz takes 12 seconds to perform the benchmark, versus 0.93 seconds using the GPU!

My Mac, being slightly older, is slower still.

Alongside OpenCL, Snow Leopard introduces hardware decoding of H.264 streams. That is, instead of maxing out your CPU when playing Blu-ray quality video, it now offloads the work to the GPU, making your computer more responsive.

But you need an Nvidia 9600GT for that; my 8600M GT does not cut it :-(

……………… OpenCL Bench V 0.25 by mitch ………..
…… C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ……

Number of OpenCL devices found: 2

OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing – please be patient….
time used: 2.971 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
Device 1 is an: CPU with max. 2500 MHz and 2 units/cores
Now computing – please be patient….
time used: 15.817 seconds
Now checking if results are valid – please be patient….
:) Validate test passed – GPU results=CPU results :)
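
To put those numbers in perspective, here is a small Python sketch that works out how many times faster the GPU was than the CPU, both for the reference result in the benchmark header and for my MacBook. The function name speedup is just for illustration, and the timings are copied from the output above.

```python
def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run was than the CPU run."""
    return cpu_seconds / gpu_seconds


# Reference result quoted in the benchmark header: C2D 3GHz vs Nvidia 9600GT.
reference = speedup(12.0, 0.93)

# My MacBook: Core 2 Duo T9300 vs GeForce 8600M GT.
macbook = speedup(15.817, 2.971)

print("Reference speedup: %.1fx" % reference)
print("MacBook speedup:   %.1fx" % macbook)
```

So the 8600M GT is still several times faster than the CPU in the same machine, just by a smaller margin than the reference 9600GT result.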

Python on Snow Leopard

I have just installed Snow Leopard and came upon some excellent sites which have inspired me to learn the programming language Python.

The first site is Clark’s Tech Blog.

The articles listed below are excellent examples of using Python to manage your iTunes library, and well worth the read.

There is also a nice little article about the new version of Python and some useful modules to install: Upgrading to Snow Leopard Part 1: Python

Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin

The second blog, which has a plethora of automation scripts, is And now it’s all this.

Installing Snow Leopard

Snow Leopard install disk, originally uploaded by Stephen Hucker.

I am in the process of installing Snow Leopard.

But I forgot to back up my #$%$@ iTunes library file
- no fracking favorites or ratings

Hopefully I have backed up my VMware Windows XP image, or that means more work :-(

UPDATE: My VMware images are all fine, which saved me a lot of work.

QuickTime is a mixed bag: you can now save videos and make screencasts without paying for the professional version, but it is a typical Apple strategy. Make the basic stuff easy, which suits most consumers, but take out some of the useful power-user features that were in QuickTime 7.

  • In my case that would be the ability to extract H.264 video and AC3 audio from an MKV container and store it in an MP4 file that is playable on a PlayStation 3.

On the plus side:

  • The other thing I wanted, which is very basic but previously required the professional version, was to be able to go to the Apple movie trailers site and, after watching a trailer, save it! You can’t get more basic than that. Finally Apple lets you do that for free,
    e.g. http://stephenhucker.com/wp-content/uploads/2009-08-30%20GI%20Joe.png

One last thing: with Snow Leopard there is no way to do an archive and install. I liked this option because you could do a clean install of the operating system and then, when you remembered that you forgot to back up an important setting, no stress, it’s in the archive.

Harry Potter

Finished listening to all 7 Harry Potter audiobooks:

Harry Potter 1: The Sorcerer’s Stone 08h 17m

Harry Potter 2: The Chamber of Secrets 09h 02m

Harry Potter 3: The Prisoner of Azkaban 11h 47m

Harry Potter 4: The Goblet of Fire 20h 35m

Harry Potter 5: The Order of the Phoenix 26h 23m

Harry Potter 6: The Half-Blood Prince 18h 22m

Harry Potter 7: The Deathly Hallows 21h 37m

The narration by Jim Dale was excellent. Thoroughly recommended.

Listening to Harry Potter as an iPod audiobook with the pictures from each chapter showing up on the iPod

It’s magic!

Yesterday I used up my download allowance, all 200 GB for the month. I still had a whole seven hours of slow internet to go! So what to do?

- Get a life? … nah, too much trouble

I have an iPhone and a two-year contract with Vodafone Australia (AUD$69 / month), so I turned on internet tethering, something Americans aren’t allowed to do with AT&T.
Speed was OK, but nothing special. The iPhone 3GS downloaded at anywhere between 30 kb/s and 160 kb/s.

Internet Syncing on iPhone 3GS
