
mac2winvalidutf8

Python script that renames files so that their names are Windows-compatible. Useful for people running Netatalk 2.0 together with Samba 3.0 on the same shares.

Features:

  • It converts filenames from any supported encoding to UTF-8
  • If a character cannot be decoded, it is quoted URL-encoding style (special characters become %xx codes)
  • It converts CAP sequences from any supported encoding to UTF-8
  • It removes characters illegal for Windows: */\?:|", whitespace from beginning and end of filenames, dots from beginning and end of filenames (unless disabled by cmd-line parameter)
  • Filenames that would be truncated to a zero-length string are renamed to a quote from the movie "Finding Nemo"; e.g. the filename " ... " is renamed to "Shark bait, oo ha ha!"
  • Renames are also applied in Netatalk's CNID database (only the v2 format is supported)
  • Source encodings can be freely selected from whatever Python supports
  • Many new command-line options
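
The renaming rules listed above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the code of mac2winvalidutf8.py itself: the function names, the "percent_quote" error handler and the constants are all hypothetical, the CNID database update is left out, and it assumes that CAP sequences have the form ":xx" (a colon followed by two hex digits standing for one byte).

```python
import codecs
import re
import string

# Characters the feature list names as illegal on Windows.
ILLEGAL_CHARS = '*/\\?:|"'
# Replacement for names that would otherwise end up empty.
FALLBACK_NAME = "Shark bait, oo ha ha!"
# Assumption: a CAP sequence is ":" followed by two hex digits (one byte).
CAP_SEQUENCE = re.compile(rb":([0-9A-Fa-f]{2})")


def _percent_quote(err):
    """Error handler: turn each undecodable byte into a %XX code."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return "".join("%{:02X}".format(b) for b in bad), err.end


# Hypothetical handler name, registered for use with bytes.decode().
codecs.register_error("percent_quote", _percent_quote)


def expand_cap(raw):
    """Replace every ':xx' CAP sequence with the byte it stands for."""
    return CAP_SEQUENCE.sub(
        lambda m: bytes([int(m.group(1).decode("ascii"), 16)]), raw
    )


def decode_name(raw, encoding):
    """Decode raw filename bytes, quoting whatever cannot be decoded."""
    return raw.decode(encoding, errors="percent_quote")


def sanitize(name, strip_dots=True):
    """Drop illegal characters, trim the ends, fall back if nothing is left."""
    name = "".join(ch for ch in name if ch not in ILLEGAL_CHARS)
    trim = string.whitespace + ("." if strip_dots else "")
    name = name.strip(trim)
    return name or FALLBACK_NAME


def convert(raw, encoding):
    """Full pipeline: CAP expansion, decoding, cleanup, UTF-8 output."""
    return sanitize(decode_name(expand_cap(raw), encoding)).encode("utf-8")
```

For example, convert(b" caf:8e?.txt ", "mac_roman") first expands ":8e" to the byte 0x8E, decodes it as mac_roman, then removes the "?" and the surrounding whitespace. CAP expansion runs before the illegal-character pass so that the colon of a CAP sequence is not stripped first.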

Known Problems:

  • Python's (== libiconv's) character map does not know what to do with the character 0xF0 (the "Apple logo" in the mac_roman encoding). That character is a "corporate zone" Unicode character, and libiconv does not support this zone. Samba also uses libiconv and therefore does not support that character either. In practice this problem does not affect the target group of this script (people using Netatalk and Samba), because Samba won't be able to handle this character anyway. If my script comes across that character, it renames it to %F0.

Usage guide:

  • DO NOT USE IT WHILE NETATALK IS RUNNING
  • It needs the bsddb3 Python module
  • Use ./mac2winvalidutf8.py --help for information about command-line parameters
  • It only works with atalk v2 databases
  • Make backups of your database and files before running it
  • Watch the output of the test mode carefully for warnings and non-transcoded characters (search for % to find them)
  • Oh, and don't use it while Netatalk is active. I mean it: the script does not contain a single line of DB locking or other synchronized-access code
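
Searching the test-mode output for % can be automated with a few lines of Python. The sketch below is only an illustration: the exact format of the test-mode output is not described here, so each line is treated as opaque text, and the function name is hypothetical.

```python
import re

# A %xx code as produced for characters that could not be transcoded.
PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def lines_with_escapes(lines):
    """Return (line number, text) for every line containing a %xx code."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PERCENT_ESCAPE.search(line)
    ]
```

Saving the test-mode output to a file and passing the open file object to lines_with_escapes() would list the names that still need attention. A plain "%" that is not followed by two hex digits is ignored on purpose.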

Most recent version of mac2winvalidutf8.py: download
