Nexcess Blog

Viewing A File With All The Special & Control Characters

April 18, 2013

Sometimes I want to view a file with every tab shown as a visible marker and every line ending clearly marked. This might be because I’m trying to parse the file somehow, writing a regex to match something, or debugging some problem with special characters.

For example, I’ve had to deal with the output of a program looking like:

id	domain	name
1	John Doe
2	John Q. Public

I was trying to process it with awk but, when looking at the output in a terminal, it wasn’t clear whether spaces or tabs were separating the different fields. So, I found a few ways to print something out and show all the special characters.
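To see why the separator matters to awk, here is a minimal sketch. The row and the domain `example.org` are made up for illustration and are not the real data from the program above. It compares awk's default whitespace splitting with tab-only splitting via `-F'\t'`.

```shell
# Hypothetical row: id, domain, name separated by single tab characters.
row=$(printf '2\texample.org\tJohn Q. Public')

# Default splitting: any run of spaces or tabs separates fields,
# so the name is broken apart and $3 is only its first word.
by_whitespace=$(printf '%s\n' "$row" | awk '{ print $3 }')

# Tab-only splitting: -F'\t' makes the tab the sole separator,
# so $3 is the whole name, spaces included.
by_tab=$(printf '%s\n' "$row" | awk -F'\t' '{ print $3 }')

printf 'default FS: %s\n' "$by_whitespace"   # John
printf 'tab FS:     %s\n' "$by_tab"          # John Q. Public
```

If the file turns out to be space-separated instead, a name containing spaces cannot be recovered by field number alone, which is why it pays to check first.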

cat has flags that control how non-printable and other special characters are shown. If you want to see it all, just use ‘-A’ (in GNU cat, shorthand for ‘-vET’).

$ cat -A foo.txt
id^Idomain^Iname$
1^I^IJohn Doe$
2^I^IJohn Q. Public$
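As a rough sketch of what the individual flags contribute, assuming GNU cat and a throwaway two-column sample file (not the original foo.txt), the following compares ‘-A’ with its long form ‘-vET’ and with ‘-T’ alone:

```shell
# Build a small tab-separated sample in a temporary file (hypothetical data).
sample=$(mktemp)
printf 'id\tname\n1\tJohn Doe\n' > "$sample"

# -A is shorthand for -vET: -T shows tabs as ^I, -E marks line ends with $,
# and -v makes other control characters visible.
all=$(cat -A "$sample")
vet=$(cat -vET "$sample")

# -T alone marks the tabs but leaves line endings unmarked.
tabs_only=$(cat -T "$sample")

printf '%s\n' "$all"
rm -f "$sample"
```

With this sample, ‘-A’ and ‘-vET’ produce identical output, and ‘-T’ is enough if tabs are the only thing in question.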

I really don’t like how ‘^I’ is used for the tab character. I’m so used to writing and seeing ‘\t’ for tabs that ‘^I’ really throws me off (not to mention that it contains ‘^’, a metacharacter that means something else in a regex). This is why I prefer using sed.

sed has an ‘l’ command that prints a file in a ‘visually unambiguous’ form. See the output below:

$ sed -n l foo.txt
id\tdomain\tname$
1\t\tJohn Doe$
2\tlongest-domain-you-can-possibly-imagine-even-if-it-violates-rfc-10\
\tJohn Q. Public$

sed took it upon itself to wrap the long line, which I really hate. Luckily, GNU sed lets you specify the width to wrap at. If you tell it 0, it won’t wrap at all:

$ sed -n l0 foo.txt
id\tdomain\tname$
1\t\tJohn Doe$
2\tlongest-domain-you-can-possibly-imagine-even-if-it-violates-rfc-10\tJohn Q. Public$
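The difference is easy to check on a generated file. The sketch below assumes GNU sed, whose documented default wrap width for ‘l’ is 70, and uses a made-up 100-character field rather than a real domain. It counts the output lines with and without wrapping.

```shell
sample=$(mktemp)

# Hypothetical long field: 100 'x' characters, generated with awk.
long=$(awk 'BEGIN { for (i = 0; i < 100; i++) printf "x" }')
printf '2\t%s\tJohn Q. Public\n' "$long" > "$sample"

# Plain l wraps long output lines; count how many lines come back.
wrapped_lines=$(sed -n l "$sample" | wc -l)

# l0 (a GNU extension) disables wrapping, so one input line stays one line.
unwrapped=$(sed -n l0 "$sample")
unwrapped_lines=$(printf '%s\n' "$unwrapped" | wc -l)

printf 'wrapped: %s line(s), unwrapped: %s line(s)\n' "$wrapped_lines" "$unwrapped_lines"
rm -f "$sample"
```

Keeping each input line on a single output line makes the result much easier to feed into grep or to compare by eye against the original file.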
