
transcode:

A command-line utility that converts text from one code page to another.
Windows knows a large number of code pages;
search www.msdn.com for the keyword "sjis" to find a complete list.


Features:
* Can prepend or remove the prefixes written by Windows Notepad (BOM, Byte Order Mark)
* Can auto-detect UTF-8 and UTF-16 files by their Notepad prefixes
* Can auto-detect UTF-16 files without prefix by their zero bytes (both byte orders)
* Can auto-detect valid UTF-8 files without BOM
* Some often-used code page numbers are available as mnemonics
* Input and output files can be given by redirection (with "<" and ">") or as command-line arguments
* Quoted file names may contain spaces
* ERRORLEVEL is 1 for warnings and 2 for errors
* Supports in-place editing by giving the same file name twice
* Deletes the incomplete output file on disk-full and similar errors
* Command-line help (when run without any parameters)
* Very small executable size with no run-time library and no extra DLL
* ANSI (Win98/Me) and Unicode (WinNT4, 2k, XP, Vista) versions
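
The auto-detection features listed above could work roughly as follows.
This is a minimal Python sketch and not the tool's actual code: the function
name detect_encoding, the order of the checks and the "ansi" fallback are
all assumptions made for illustration.

```python
import codecs


def detect_encoding(data: bytes) -> str:
    """Guess the encoding of `data` (hypothetical helper, see text)."""
    # 1. Notepad prefixes (BOM) give a definite answer.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"

    # 2. No BOM: mostly-Latin text stored as UTF-16 has a zero byte in
    #    every other position. Zeros at odd offsets mean little endian,
    #    zeros at even offsets mean big endian.
    even_zeros = data[0::2].count(0)
    odd_zeros = data[1::2].count(0)
    if odd_zeros > even_zeros:
        return "utf-16-le"
    if even_zeros > odd_zeros:
        return "utf-16-be"

    # 3. No zero bytes: accept the file as UTF-8 if it decodes cleanly.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is treated as the ANSI code page.
        return "ansi"
```

Note that pure ASCII text is also valid UTF-8, so this sketch reports it as
"utf-8"; for such files the choice makes no difference to the converted output.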


Usage:
transcode [sourcecodepage:][?]destcodepage [inputfile [outputfile]]

Code pages are given as numbers or as abbreviations like "utf-7", "utf-8" etc.;
see the command-line help for a complete list of abbreviations.

A ? (question mark) in front of <destcodepage> suppresses
output of the Windows Notepad prefix (applies to UTF-8 and UTF-16),
which is useful for feeding "gcc" compilers.
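
To make the argument syntax [sourcecodepage:][?]destcodepage concrete, here is
a small Python sketch that splits such an argument into its parts. The names
Spec and parse_spec are hypothetical, and treating a missing source code page
as "auto-detect" is an assumption based on the feature list, not on the
tool's source code.

```python
from typing import NamedTuple, Optional


class Spec(NamedTuple):
    source: Optional[str]  # None: no source given (assumed to mean auto-detect)
    dest: str              # destination code page, number or abbreviation
    suppress_bom: bool     # True if the destination was prefixed with "?"


def parse_spec(arg: str) -> Spec:
    """Split '[sourcecodepage:][?]destcodepage' into its components."""
    # Everything before the last ':' is the source; without ':' it is empty.
    source, _sep, dest = arg.rpartition(":")
    suppress = dest.startswith("?")
    if suppress:
        dest = dest[1:]
    if not dest:
        raise ValueError("missing destination code page")
    return Spec(source or None, dest, suppress)
```

For example, parse_spec("ansi:oem") yields source "ansi" and destination "oem",
while parse_spec("?utf-8") yields no source, destination "utf-8" and
suppress_bom set.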


Examples:
1. Print a text file (either ANSI or any Unicode format) to the screen:
transcode oem testfile.txt
or (for a program that outputs ANSI characters):
misbehaving_program.exe | transcode ansi:oem

This converts umlauts etc. to the OEM (DOS) code page before output.
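
What such an ANSI-to-OEM conversion does at the byte level can be sketched in
Python. The code pages are assumptions here: CP1252 for ANSI and CP437 for
OEM, whereas the real values depend on the Windows installation. The function
name ansi_to_oem is hypothetical as well.

```python
def ansi_to_oem(data: bytes, ansi: str = "cp1252", oem: str = "cp437") -> bytes:
    """Re-encode ANSI bytes as OEM bytes, going through Unicode."""
    # Decode with the source code page; undefined bytes become U+FFFD.
    text = data.decode(ansi, errors="replace")
    # Encode with the destination code page; characters that do not
    # exist there are replaced by "?".
    return text.encode(oem, errors="replace")
```

With these code pages, "ü" changes from byte 0xFC to 0x81 and "ß" from 0xDF
to 0xE1, which is why unconverted ANSI text shows wrong umlauts in a DOS box.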

2. Remove Notepad prefixes before passing UTF-8 source code files to [avr|msp]gcc:
transcode ?utf-8 program.cpp program.1.cpp
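
The effect of the ? prefix on the output can be pictured as dropping a leading
byte order mark. The following Python sketch only illustrates that idea; the
helper name strip_bom is hypothetical and the code is not taken from transcode.

```python
import codecs

# Notepad prefixes for UTF-8 and both UTF-16 byte orders.
_BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def strip_bom(data: bytes) -> bytes:
    """Return `data` without a leading BOM; other data is returned unchanged."""
    for bom in _BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data
```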


The current implementation (April 2009) has the following limitations:
* Input file cannot be a true stream (but output can be a stream)
* File sizes must be well below 2 GB (should be sufficient for text files)
* Messages are in German
* Requires Internet Explorer 5 or higher for rich code page support
* Requires Internet Explorer 4 or higher for basic code page support

For error messages containing "Fehlerkode" (error code), type "net helpmsg <code>"
to get an (often useless) description from Windows.


Public-domain with source code, Henrik Haftmann, 090406