
mac2winvalidutf8

by Howard The Duck last modified 2004-11-18 06:13 PM

Python script that renames files so their names are Windows compatible. Useful for people running Netatalk 2.0 together with Samba 3.0 on the same shares.

Features:

  • It converts filenames from any supported encoding to UTF-8
  • If decoding a character fails, that character is quoted URL-encoding style (special characters become %xx codes)
  • It converts CAP sequences from any supported encoding to UTF-8
  • It removes characters that are illegal on Windows (*/\?:|"), whitespace from the beginning and end of filenames, and dots from the beginning and end of filenames (unless disabled by a command-line parameter)
  • Filenames that would be reduced to a zero-length string are renamed to a quote from the movie "Finding Nemo", e.g. the filename " ... " is renamed to "Shark bait, oo ha ha!"
  • Renames are also applied in Netatalk's CNID database (only the v2 format is supported)
  • Source encodings can be freely selected from whatever Python supports
  • Many new command-line options
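
The cleanup rules in the list above can be sketched roughly as follows. This is not the code from mac2winvalidutf8.py, only a minimal illustration in modern Python 3. The names decode_cap, sanitize and strip_dots are hypothetical, and the sketch assumes that a CAP sequence is a colon followed by two hex digits standing for one raw byte.

```python
import re
import string

# Characters named in the feature list as illegal on Windows.
ILLEGAL = '*/\\?:|"'
FALLBACK_NAME = "Shark bait, oo ha ha!"

# Assumption: a CAP sequence is ":" followed by two hex digits.
_CAP = re.compile(rb":([0-9A-Fa-f]{2})")


def decode_cap(raw: bytes) -> bytes:
    """Replace every ":xx" sequence with the single byte it stands for."""
    return _CAP.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def sanitize(name: str, strip_dots: bool = True) -> str:
    """Drop illegal characters, trim the ends, and never return ''."""
    cleaned = "".join(ch for ch in name if ch not in ILLEGAL)
    # strip_dots is a hypothetical stand-in for the command-line switch.
    trim = string.whitespace + ("." if strip_dots else "")
    cleaned = cleaned.strip(trim)
    return cleaned or FALLBACK_NAME


print(decode_cap(b"caf:8e"))      # raw bytes, still in the source encoding
print(sanitize(" ... "))          # falls back to the movie quote
print(sanitize('a*b?.txt.'))
```

In a sketch like this, CAP sequences have to be decoded before the illegal characters are removed, because the colon that introduces a CAP sequence is itself on the illegal list.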

Known Problems:

  • Python's character map (the same as libiconv's) does not know what to do with character 0xF0 (the "Apple logo" in the mac_roman encoding). That character is a "corporate zone" Unicode character, and libiconv does not support this zone. Samba also uses libiconv and therefore does not support that character either. In practice this does not affect the target group of this script (people using Netatalk and Samba), because Samba won't be able to handle this character anyway. If the script comes across that character, it renames it to %F0.
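
The %xx fallback described here can be sketched with a custom codec error handler. This is a minimal illustration in modern Python 3, not the script's own code. The handler name "percentquote" and the function to_unicode are hypothetical. Which bytes a given codec rejects depends on the codec table and the Python version, so the example uses the ascii codec, which rejects every byte above 0x7F.

```python
import codecs


def _percent_quote(err):
    """Error handler: replace each undecodable byte with %XX."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    quoted = "".join("%%%02X" % b for b in bad)
    # Resume decoding right after the bytes that failed.
    return quoted, err.end


# "percentquote" is a made-up handler name for this sketch.
codecs.register_error("percentquote", _percent_quote)


def to_unicode(raw: bytes, encoding: str) -> str:
    """Decode raw filename bytes, quoting whatever the codec rejects."""
    return raw.decode(encoding, errors="percentquote")


print(to_unicode(b"caf\x8e", "mac_roman"))  # decodes normally
print(to_unicode(b"logo\xf0", "ascii"))     # rejected byte is quoted
```

The result is a str that can then be written out as UTF-8 with .encode("utf-8").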

Usage guide:

  • DO NOT USE IT WHILE NETATALK IS RUNNING
  • It needs the bsddb3 Python module
  • Use ./mac2winvalidutf8.py --help for information about command-line parameters
  • It only works with atalk v2 databases
  • Make backups of your database and files before running it
  • Watch the output of the test mode carefully for warnings and non-transcoded characters (search for % to find them)
  • Oh, and don't use it while Netatalk is active. I mean it: the script does not contain a single line of DB locking or other synchronized-access code
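
Searching the test-mode output for % can be automated with a few lines of Python. The sketch below makes no assumption about the script's actual output format. It only keeps lines that contain a % followed by two hex digits, and the function name lines_with_quoted_bytes is hypothetical.

```python
import re

# A quoted byte looks like %F0: a percent sign plus two hex digits.
_QUOTED = re.compile(r"%[0-9A-Fa-f]{2}")


def lines_with_quoted_bytes(lines):
    """Return the lines that contain at least one %XX sequence."""
    return [line for line in lines if _QUOTED.search(line)]


sample = [
    "report.txt",
    "logo%F0.gif",
    "100% done",  # a bare percent sign is not a quoted byte
]
for line in lines_with_quoted_bytes(sample):
    print(line)
```

Saved test-mode output could be fed in the same way, for example by passing an open file object, since iterating over a file yields its lines.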

Most recent version of mac2winvalidutf8.py: download
