Character Set
The type of a data file must be compatible with the database character set. For example, a data file of type ASCII or UTF-8 can be loaded into a UTF-8 database. However, if the data file is type ISO 8859-1 (Latin1), which is not compatible with UTF-8, column values containing multi-byte characters cannot be displayed in the result set. To determine the type of a data file, use the file command:
$ file Date_Dimension.tbl
Date_Dimension.tbl: ASCII text
The file command may report ASCII text even though the file contains multi-byte characters. To check for them, use the wc command:
$ wc Date_Dimension.tbl
1828 5484 221822 Date_Dimension.tbl
If the wc command returns an error such as "Invalid or incomplete multibyte or wide character", the data file is using an incompatible character set.
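The same check can be sketched outside the shell. The following is a minimal illustration, assuming the database character set is UTF-8; the helper names find_invalid_utf8 and check_file are hypothetical and not part of any product. It reports the offset of the first byte that does not decode as UTF-8.

```python
def find_invalid_utf8(data: bytes):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first undecodable byte.
        return err.start
    return None


def check_file(path):
    """Read a data file as raw bytes and check it (hypothetical helper).

    The whole file is read into memory, which is fine for small files
    but not for very large ones.
    """
    with open(path, "rb") as f:
        return find_invalid_utf8(f.read())
```

A result of None means the bytes are valid UTF-8 (plain ASCII always is). A numeric offset points at the first problem byte, which typically indicates a single-byte encoding such as Latin1.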
Using Quoted Characters as Literals
You can use the backslash character (\) to quote data characters that would otherwise be taken as special characters. In particular, the following characters must be preceded by a backslash if they appear as part of a column value:
COPY ... DELIMITER character (default is the tab character)
COPY ... NULL string (default is \N)
Examples
In these examples, the DELIMITER is set to a comma for visibility.
Input rows:
,1,2,3,
,1,2,3
1,2,3,
Leading and trailing delimiters are ignored. Thus, the rows all have three columns.

Input row:
123,\n,\\n,456
Using the default null string (\n), the row would be interpreted as: 123 | NULL | \n | 456
Using a non-default null string, the row would be interpreted as: 123 | newline | \n | 456

Input row:
123,this\, that\, or the other,something else,456
The row would be interpreted as: 123 | this, that, or the other | something else | 456
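The quoting rule can be illustrated with a short sketch. It is a simplified model, not the loader's actual parser: it assumes a comma delimiter, treats a backslash followed by any character as that literal character (so sequences such as \n and the NULL string are not modeled), and drops one empty leading and one empty trailing field to mimic the ignored delimiters described above. The function names escape_value and split_row are hypothetical.

```python
def escape_value(value, delimiter=","):
    """Prefix backslashes and delimiter characters with a backslash."""
    out = []
    for ch in value:
        if ch == "\\" or ch == delimiter:
            out.append("\\")
        out.append(ch)
    return "".join(out)


def split_row(line, delimiter=","):
    """Split a row on unescaped delimiters (simplified model)."""
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Escaped character: keep it literally, drop the backslash.
            current.append(line[i + 1])
            i += 2
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    # Assumption: a leading or trailing delimiter does not create a column.
    if fields and fields[0] == "":
        fields = fields[1:]
    if fields and fields[-1] == "":
        fields = fields[:-1]
    return fields
```

For example, split_row applied to the last input row above returns the four values shown in its interpretation, and escape_value("this, that, or the other") produces the escaped form used in that row.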