KOHA
Encoding and Character Sets in Koha: Revision 7

Koha wiki

http://wiki.koha.org/doku.php?id=encodingscratchpad


MySQL

Edit /etc/mysql/my.cnf and add the following under the [mysqld] section:

default-character-set = utf8
character-set-server = utf8
skip-character-set-client-handshake # ignore the character set requested by clients, because the settings above don't work properly on their own

http://wiki.jon.geek.nz/index.php/Koha/Install#Zebra_Z39.50_Support_.28Optional.29


Notes on conversion:

Converting a mysqldump directly from Latin-1 to UTF-8 will likely make any MARC records it touches unparseable, because each record stores byte lengths and offsets in its leader and directory, and these no longer match once characters are re-encoded. I suggest that, as part of your process, you export the MARC bib and authority records separately, fix them using MARC::Record and the techniques you've already identified, then import them back into your 2.2.9 test database. Then you can fix a mysqldump of the non-MARC tables.
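
To illustrate why a blanket re-encoding damages MARC data: an ISO 2709 record stores its own length in bytes in the first five characters of the leader. The Python sketch below uses a made-up, heavily simplified record (a length prefix plus one text value, not valid MARC; the function names are hypothetical) to show the stored length going stale after a Latin-1 to UTF-8 conversion.

```python
def make_toy_record(text: str, encoding: str) -> bytes:
    """Build a toy record: 5-digit byte length followed by the encoded text.

    This mimics only the length prefix of a MARC leader; it is not real MARC.
    """
    body = text.encode(encoding)
    total = 5 + len(body)  # the length counts the prefix itself, as in MARC
    return f"{total:05d}".encode("ascii") + body


def declared_length_matches(record: bytes) -> bool:
    """Return True if the 5-digit prefix equals the actual byte length."""
    return int(record[:5].decode("ascii")) == len(record)


latin1_record = make_toy_record("Müller", "latin-1")

# What a blanket conversion of the dump does: re-encode every byte
# without updating the length stored inside the record.
converted = latin1_record.decode("latin-1").encode("utf-8")

print(declared_length_matches(latin1_record))  # True
print(declared_length_matches(converted))      # False
```

The "ü" grows from one byte to two, so the converted record is one byte longer than its prefix claims. A real MARC record has the same problem in every directory entry as well, which is why the records should be re-serialized by a MARC library rather than edited as raw text.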

Very briefly, the MarcToUTF8Record routine in Koha 3's C4::Charset module should give you some ideas. You can use it as the core of a routine that converts a file containing a mix of Latin-1 and UTF-8 records to UTF-8. As written, it will not correctly handle a single MARC record that contains both Latin-1 and UTF-8, but it could be modified to test each field and subfield to see whether it contains UTF-8 or Latin-1.
http://git.koha.org/cgi-bin/gitweb.cgi?p=Koha;a=blob;f=C4/Charset.pm
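
The per-field test suggested above could look roughly like the following Python sketch. It is not the C4::Charset code; it only shows the idea of checking each value separately, assuming that anything which is not valid UTF-8 is Latin-1. The function name to_utf8 and the sample values are hypothetical.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw as UTF-8, treating it as Latin-1 if it is not valid UTF-8."""
    try:
        raw.decode("utf-8")  # strict decoding raises on invalid sequences
        return raw           # already UTF-8 (or plain ASCII)
    except UnicodeDecodeError:
        # Assumption: the only other encoding present is Latin-1.
        return raw.decode("latin-1").encode("utf-8")


# Hypothetical subfield values taken from one mixed record.
subfields = [
    "Müller".encode("latin-1"),
    "Čakovec".encode("utf-8"),
    b"plain ascii",
]

fixed = [to_utf8(value).decode("utf-8") for value in subfields]
print(fixed)  # ['Müller', 'Čakovec', 'plain ascii']
```

This heuristic can misjudge a Latin-1 value whose bytes happen to form valid UTF-8, so the results are worth spot-checking. After the values are fixed, the record still has to be rebuilt by a MARC library so that its lengths and directory are recomputed.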


See also: Sorting Croatian characters in MySQL