Indiana University
University Information Technology Services
  

In Unix, how do I convert a PostScript file to text?

A PostScript file is a text file that describes how to place text and graphics on paper. While there are many programs available for generating PostScript files from text, converting a PostScript file back to normal text is more difficult. Extracting the raw text and graphics that originally composed the document from the special instructions in the PostScript file is not a straightforward task.

Ideally, you should try to obtain a copy of the original text from the creator of the PostScript file. If you're unable to do so, you can try using the script below, which invokes Ghostscript, a PostScript interpreter, to extract the text. Save everything between the "clip here" lines into a file, making sure that each long line in the script is stored in the file as a single line. Then, add execute permission to the file. For example, if you save the script as ps2ascii, at the Unix prompt, enter:

chmod u+x ps2ascii
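
Because the script depends on the gs command, it may be worth confirming that Ghostscript is on your PATH before running it. The following is a minimal sketch; the function name check_gs is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): report whether
# the Ghostscript executable "gs" can be found on the current PATH.
check_gs() {
    if command -v gs >/dev/null 2>&1; then
        echo "Ghostscript found: $(command -v gs)"
        return 0
    else
        echo "Ghostscript (gs) not found on PATH" >&2
        return 1
    fi
}

# Example use:
#   check_gs && ./ps2ascii myfile.ps > myfile.txt
```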

To run the script on a PostScript file named myfile.ps, change to the directory where you saved the script, and at the Unix prompt, enter:

./ps2ascii myfile.ps > myfile.txt

The output is then saved in a file named myfile.txt.
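
If you have several PostScript files, the same command can be repeated in a loop. The sketch below is one possible way to do that; the function name convert_all and the PS2ASCII variable are assumptions made for illustration, and the converter is assumed to accept an input file and write text to standard output, as the script described here does.

```shell
#!/bin/sh
# Hypothetical batch wrapper (an assumption, not part of the original
# document): run a converter on every .ps file in a directory.
# PS2ASCII names the converter; it defaults to ./ps2ascii here.
convert_all() {
    dir=${1:-.}
    converter=${PS2ASCII:-./ps2ascii}
    for ps in "$dir"/*.ps; do
        # Skip the unexpanded pattern when no .ps files exist.
        [ -e "$ps" ] || continue
        txt=${ps%.ps}.txt
        if "$converter" "$ps" > "$txt"; then
            echo "converted: $ps -> $txt"
        else
            echo "failed: $ps" >&2
        fi
    done
}

# Example use:
#   convert_all .
```

Each myfile.ps produces a myfile.txt next to it; an existing .txt file with the same name would be overwritten.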

Note: Because extracting the text is difficult, the results will likely be imperfect, and you may need to edit the output file to make it presentable.
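
Some of that editing can be automated. As a rough sketch, the filter below strips trailing whitespace and collapses runs of blank lines into a single blank line. The function name clean_text is hypothetical, and whether this cleanup suits your output is an assumption; note that it also blanks lines containing only a form feed, so page breaks are not preserved.

```shell
#!/bin/sh
# Hypothetical cleanup filter (not from the original document):
# strip trailing whitespace, then collapse runs of blank lines into one.
clean_text() {
    sed 's/[[:space:]]*$//' | awk '
        NF     { print; blank = 0; next }   # non-empty line: print unchanged
        !blank { print ""; blank = 1 }      # first blank line of a run: keep one
    '
}

# Example use:
#   ./ps2ascii myfile.ps | clean_text > myfile.txt
```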

----clip here------------------------------------------------------------
#!/bin/sh
# Extract ASCII text from a PostScript file.  Usage:
#       ps2ascii [infile.ps [outfile.txt]]
# If outfile is omitted, output goes to stdout.
# If both infile and outfile are omitted, ps2ascii acts as a filter,
# reading from stdin and writing on stdout.
if ( test $# -eq 0 ) then
        gs -q -dNODISPLAY -dNOBIND -dWRITESYSTEMDICT -dSIMPLE ps2ascii.ps - -c quit
elif ( test $# -eq 1 ) then
        gs -q -dNODISPLAY -dNOBIND -dWRITESYSTEMDICT -dSIMPLE ps2ascii.ps "$1" -c quit
else
        gs -q -dNODISPLAY -dNOBIND -dWRITESYSTEMDICT -dSIMPLE ps2ascii.ps "$1" -c quit >"$2"
fi
----clip here------------------------------------------------------------
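
Some Ghostscript releases also provide a txtwrite output device that writes extracted text directly. The sketch below is an alternative to the script above, not a replacement endorsed by this document; it assumes your installed Ghostscript includes txtwrite, and the function name ps_to_text is hypothetical.

```shell
#!/bin/sh
# Sketch of an alternative approach, assuming the installed Ghostscript
# provides the txtwrite device. ps_to_text is a hypothetical name.
ps_to_text() {
    infile=$1
    outfile=${2:--}    # "-" asks Ghostscript to write to standard output
    gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=txtwrite \
       -sOutputFile="$outfile" "$infile"
}

# Example use:
#   ps_to_text myfile.ps myfile.txt
```

As with the script above, the quality of the extracted text depends on how the PostScript file was generated.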

At Indiana University, Ghostscript is available on Quarry.

At Indiana University, to get support for personal or departmental Linux or Unix systems, see At IU, how do I get support for Linux or Unix?

This is document abcd in domain all.
Last modified on August 22, 2008.