Using the Right Tools Part III - Text Files
In the beginning, there was ASCII text, and although it is largely forgotten or unknown by people today, it underlies almost every computer system in existence. Specifically, ASCII is a character encoding for text. Although various extensions have been developed over the years to support languages other than English, almost any file you see with the extension “.txt” is going to be an ASCII-compatible file. More important than the specific character encoding is the notion that text is basic and universal: it will work with just about any computer system. A general philosophy when working with files is to choose the most basic format that meets your needs, which has two benefits:
- Your files will be compatible with the widest variety of computing environments
- The more basic the format, the greater the number of tools available to manipulate files in that format
Consider also that virtually all programming languages have built-in support for processing text files, so if you are downloading scripts or even writing your own, text format is about as safe a choice as you can make. For example, try writing a Perl script to process lines from a Word document and you’ll quickly wish you had it in text format. The logic here is similar to the reasons I stated in an earlier post encouraging use of CSV format instead of Excel for lists. Unless you really need to format a document for printing, forget about Word. If you are storing data or structured information, use text format.
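To give a feel for how little code line-by-line text processing takes, here is a minimal sketch. It is written in Python rather than Perl purely for illustration, and the function name count_nonblank_lines and the sample file name are made up for this example.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Return the number of lines in a text file that contain non-whitespace."""
    count = 0
    # A file opened in text mode can be iterated one line at a time.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                count += 1
    return count


if __name__ == "__main__":
    # Build a small throwaway file so the sketch runs on its own.
    with tempfile.TemporaryDirectory() as folder:
        sample = os.path.join(folder, "notes.txt")  # hypothetical file name
        with open(sample, "w", encoding="utf-8") as handle:
            handle.write("first line\n\nsecond line\n")
        print(count_nonblank_lines(sample))
```

Doing the same against a Word document would first require a library that understands the document format, which is exactly the extra friction described above.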
Microsoft Word and WordPad can open and manipulate text files, but these applications are so focused on formatting documents for printing that using them for plain text is error-prone. For example, Word makes it easy to accidentally save to the richer doc format, which renders files unusable as raw text. In contrast, Microsoft Notepad is an extremely simple application included with Windows that provides basic text editing capability, but it is so feature-poor that you’ll almost certainly need a more powerful application. Don’t be fooled by the Notepad Masochists, who claim that all websites should be “Created using Notepad” - this is a joke.
Download.com lists over 200 applications in the Editors category, and Wikipedia provides a comparison of editors by feature. Also, Matt Stibbe writes about an interesting use case I hadn’t thought of, distraction-free writing, and suggests a few tools.
The text editing application I use personally is TextPad - it’s clutter-free and entirely focussed on editing text files. There are some really useful features, such as the ability to manipulate letter case and to select blocks of text as opposed to lines - see the screen capture below.
Perhaps the most important and powerful feature is macro support, which allows “recording” and “replay” of a sequence of actions against a file. Macros are a form of semi-automation, and many people use them for sequences of edits they have to perform repeatedly against one or more files. Perhaps because I know Perl, I find myself using macros more often for one-off tasks. For example, say you have a text file with 1000 lines and you want to add a comma at the end of every line. Simply record the action of pressing the End key, typing a comma, and then pressing the Down Arrow as a macro. After binding this macro to Ctrl + , you can run through the entire file in seconds by holding down Ctrl + ,.
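For comparison, the same comma-at-the-end-of-every-line edit can be scripted. The following is a minimal sketch in Python, not a description of how TextPad macros work internally; the function names append_comma and append_comma_to_file are invented for this example.

```python
def append_comma(lines):
    """Return a new list with a comma added to the end of each line."""
    result = []
    for line in lines:
        # Split the line into its text and its trailing line break (if any),
        # so the comma lands before the line break, like pressing End first.
        text = line.rstrip("\r\n")
        ending = line[len(text):]
        result.append(text + "," + ending)
    return result


def append_comma_to_file(source_path, target_path):
    """Read source_path, add a comma to every line, and write target_path."""
    # newline="" leaves the file's original line endings untouched.
    with open(source_path, encoding="utf-8", newline="") as source:
        lines = source.readlines()
    with open(target_path, "w", encoding="utf-8", newline="") as target:
        target.writelines(append_comma(lines))
```

Writing to a separate target file, rather than overwriting the source, is a deliberate choice here: a macro can be undone in the editor, while a script that overwrites its input cannot.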
Here’s a general rule: when creating files, ask yourself, “Is this for publication or processing?” If it’s data, consider saving it as text, or as CSV (a standard for layout of text files). You can always load the file into Word or Excel later if need be.
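As a small illustration of why CSV is convenient for processing, here is a sketch that uses Python's standard csv module to write a list out as CSV text and read it back. The helper names rows_to_csv and csv_to_rows and the sample data are assumptions made up for this example.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of rows (lists of values) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse CSV text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    # Made-up sample data; the second field contains a comma on purpose.
    table = [["name", "city"], ["Ada", "London, UK"]]
    text = rows_to_csv(table)
    print(text)
    print(csv_to_rows(text) == table)
```

Note that the field containing a comma is wrapped in quotes when written, so the data survives the round trip. The result is still a plain text file that any editor can open.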
January 6th, 2008 at 4:23 pm
One nice thing about applications moving to the web is that people expect data portability. Yet, it is surprising how many companies are willing to frustrate their customers by not providing these choices. The expectation is now that you should be able to move your data wherever you like. I was using Yahoo! Site Explorer the other day and I was impressed that they allow you to export certain search data to TSV.
On a related note, the Getting Things Done crowd promotes the use of text files as a way to get rid of unnecessary distraction and complication, which is probably a good argument. These simple formats are probably our best bet when it comes to archiving data for the long-now as well.
January 6th, 2008 at 4:34 pm
It’s actually somewhat timely that you write on this subject, although you’re coming at it from a different angle…
In today’s NY Times:
http://www.nytimes.com/2008/01/06/magazine/06wwln-medium-t.html?_r=1&ref=technology&oref=slogin
In terms of software complexity, I think many basic users are overwhelmed by formats, functions, interfaces, etc. As we head down this road, usability will need to be the focus, which is probably why companies like Apple are gaining market share.