LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number LINES]
    [(col_name_or_user_var,...)]
    [SET col_name = expr,...]
      The LOAD DATA
      INFILE statement reads rows from a text file into a
      table at a very high speed. The file name must be given as a
      literal string.
    
      LOAD DATA
      INFILE is the complement of
      SELECT ... INTO
      OUTFILE. (See Section 12.2.9, “SELECT Syntax”.) To write data
      from a table to a file, use
      SELECT ... INTO
      OUTFILE. To read the file back into a table, use
      LOAD DATA
      INFILE. The syntax of the FIELDS and
      LINES clauses is the same for both statements.
      Both clauses are optional, but FIELDS must
      precede LINES if both are specified.
    
      For more information about the efficiency of
      INSERT versus
      LOAD DATA
      INFILE and speeding up
      LOAD DATA
      INFILE, see Section 7.2.22, “Speed of INSERT Statements”.
    
      The character set indicated by the
      character_set_database system
      variable is used to interpret the information in the file.
      SET NAMES and the setting of
      character_set_client do not
      affect interpretation of input. If the contents of the input file
      use a character set that differs from the default, it is usually
      preferable to specify the character set of the file by using the
      CHARACTER SET clause. A character set of
      binary specifies “no conversion.”
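      For example, to load a file known to be written in UTF-8 into a
      table regardless of the database's default character set, you
      might write something like the following (the table and file
      names here are illustrative):

LOAD DATA INFILE 'data.txt'
  INTO TABLE my_table
  CHARACTER SET utf8;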
    
      LOAD DATA
      INFILE interprets all fields in the file as having the
      same character set, regardless of the data types of the columns
      into which field values are loaded. For proper interpretation of
      file contents, you must ensure that the file was written with the
      correct character set. For example, if you write a data file with
      mysqldump -T or by issuing a
      SELECT ... INTO
      OUTFILE statement in mysql, be sure
      to use a --default-character-set option with
      mysqldump or mysql so that
      output is written in the character set to be used when the file is
      loaded with LOAD DATA
      INFILE.
    
      Note that it is currently not possible to load data files that use
      the ucs2, utf16, or
      utf32 character set.
    
      The character_set_filesystem
      system variable controls the interpretation of the file name.
    
      You can also load data files by using the
      mysqlimport utility; it operates by sending a
      LOAD DATA
      INFILE statement to the server. The
      --local option causes
      mysqlimport to read data files from the client
      host. You can specify the
      --compress option to get
      better performance over slow networks if the client and server
      support the compressed protocol. See
      Section 4.5.5, “mysqlimport — A Data Import Program”.
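      As an illustrative sketch (the file and database names are
      assumptions), the following invocation reads
      /tmp/my_table.txt from the client host, compresses the
      transfer, and loads it into the table my_table in database
      db1; mysqlimport derives the table name from the file name:

mysqlimport --local --compress db1 /tmp/my_table.txt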
    
      If you use LOW_PRIORITY, execution of the
      LOAD DATA statement is delayed
      until no other clients are reading from the table. This affects
      only storage engines that use only table-level locking (such as
      MyISAM, MEMORY, and
      MERGE).
    
      If you specify CONCURRENT with a
      MyISAM table that satisfies the condition for
      concurrent inserts (that is, it contains no free blocks in the
      middle), other threads can retrieve data from the table while
      LOAD DATA is executing. Using this
      option affects the performance of LOAD
      DATA a bit, even if no other thread is using the table
      at the same time.
    
      Prior to MySQL 5.5.1, CONCURRENT was not
      replicated when using statement-based replication (see Bug#34628).
      However, it is replicated when using row-based replication,
      regardless of the version. See
      Section 16.4.1.13, “Replication and LOAD DATA
        INFILE”, for more
      information.
    
      The LOCAL keyword, if specified, is interpreted
      with respect to the client end of the connection:
    
          If LOCAL is specified, the file is read by
          the client program on the client host and sent to the server.
          The file can be given as a full path name to specify its exact
          location. If given as a relative path name, the name is
          interpreted relative to the directory in which the client
          program was started.
        
          If LOCAL is not specified, the file must be
          located on the server host and is read directly by the server.
          The server uses the following rules to locate the file:
        
If the file name is an absolute path name, the server uses it as given.
If the file name is a relative path name with one or more leading components, the server searches for the file relative to the server's data directory.
If a file name with no leading components is given, the server looks for the file in the database directory of the default database.
      Note that, in the non-LOCAL case, these rules
      mean that a file named ./myfile.txt is
      read from the server's data directory, whereas a file named
      myfile.txt is read from the database
      directory of the default database. For example, if
      db1 is the default database, the following
      LOAD DATA statement reads the file
      data.txt from the database directory for
      db1, even though the statement explicitly loads
      the file into a table in the db2 database:
    
LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;
Windows path names are specified using forward slashes rather than backslashes. If you do use backslashes, you must double them.
      For security reasons, when reading text files located on the
      server, the files must either reside in the database directory or
      be readable by all. Also, to use
      LOAD DATA
      INFILE on server files, you must have the
      FILE privilege. See
      Section 5.4.1, “Privileges Provided by MySQL”. For
      non-LOCAL load operations, if the
      secure_file_priv system variable
      is set to a nonempty directory name, the file to be loaded must be
      located in that directory.
    
      Using LOCAL is a bit slower than letting the
      server access the files directly, because the contents of the file
      must be sent over the connection by the client to the server. On
      the other hand, you do not need the
      FILE privilege to load local files.
    
      With LOCAL, the default duplicate-key handling
      behavior is the same as if IGNORE is specified;
      this is because the server has no way to stop transmission of the
      file in the middle of the operation. IGNORE is
      explained further later in this section.
    
      LOCAL works only if your server and your client
      both have been enabled to allow it. For example, if
      mysqld was started with
      --local-infile=0,
      LOCAL does not work. See
      Section 5.3.5, “Security Issues with LOAD
      DATA LOCAL”.
    
      On Unix, if you need LOAD DATA to
      read from a pipe, you can use the following technique (the example
      loads a listing of the / directory into the
      table db1.t1):
    
mkfifo /mysql/data/db1/ls.dat
chmod 666 /mysql/data/db1/ls.dat
find / -ls > /mysql/data/db1/ls.dat &
mysql -e "LOAD DATA INFILE 'ls.dat' INTO TABLE t1" db1
Note that you must either run the command that generates the data to be loaded and the mysql command on separate terminals, or run the data generation process in the background (as shown in the preceding example). If you do not do this, the pipe blocks until data is read by the mysql process.
      The REPLACE and
      IGNORE keywords control handling of input rows
      that duplicate existing rows on unique key values:
    
          If you specify REPLACE, input rows replace
          existing rows; that is, an input row replaces any existing row
          that has the same value for a primary key or unique index.
          See Section 12.2.8, “REPLACE Syntax”.
        
          If you specify IGNORE, input rows that
          duplicate an existing row on a unique key value are skipped.
          If you do not specify either option, the behavior depends on
          whether the LOCAL keyword is specified.
          Without LOCAL, an error occurs when a
          duplicate key value is found, and the rest of the text file is
          ignored. With LOCAL, the default behavior
          is the same as if IGNORE is specified; this
          is because the server has no way to stop transmission of the
          file in the middle of the operation.
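      For example, to have input rows overwrite any existing rows that
      duplicate a unique key value, you might write the following (the
      table and file names are illustrative):

LOAD DATA INFILE 'data.txt'
  REPLACE
  INTO TABLE my_table;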
        
      If you want to ignore foreign key constraints during the load
      operation, you can issue a SET foreign_key_checks =
      0 statement before executing LOAD
      DATA.
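      A sketch of that pattern, re-enabling the checks afterward (the
      table and file names are illustrative):

SET foreign_key_checks = 0;
LOAD DATA INFILE 'data.txt' INTO TABLE child_table;
SET foreign_key_checks = 1;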
    
      If you use LOAD DATA
      INFILE on an empty MyISAM table, all
      nonunique indexes are created in a separate batch (as for
      REPAIR TABLE). Normally, this makes
      LOAD DATA
      INFILE much faster when you have many indexes. In some
      extreme cases, you can create the indexes even faster by turning
      them off with ALTER TABLE ... DISABLE KEYS
      before loading the file into the table and using ALTER
      TABLE ... ENABLE KEYS to re-create the indexes after
      loading the file. See Section 7.2.22, “Speed of INSERT Statements”.
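      The sequence looks like this (the table and file names are
      illustrative); ALTER TABLE ... ENABLE KEYS rebuilds
      the disabled nonunique indexes in a batch:

ALTER TABLE my_table DISABLE KEYS;
LOAD DATA INFILE 'data.txt' INTO TABLE my_table;
ALTER TABLE my_table ENABLE KEYS;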
    
      For both the LOAD DATA
      INFILE and
      SELECT ... INTO
      OUTFILE statements, the syntax of the
      FIELDS and LINES clauses is
      the same. Both clauses are optional, but FIELDS
      must precede LINES if both are specified.
    
      If you specify a FIELDS clause, each of its
      subclauses (TERMINATED BY,
      [OPTIONALLY] ENCLOSED BY, and ESCAPED
      BY) is also optional, except that you must specify at
      least one of them.
    
      If you specify no FIELDS or
      LINES clause, the defaults are the same as if
      you had written this:
    
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
      (Backslash is the MySQL escape character within strings in SQL
      statements, so to specify a literal backslash, you must specify
      two backslashes for the value to be interpreted as a single
      backslash. The escape sequences '\t' and
      '\n' specify tab and newline characters,
      respectively.)
    
      In other words, the defaults cause
      LOAD DATA
      INFILE to act as follows when reading input:
    
Look for line boundaries at newlines.
Do not skip over any line prefix.
Break lines into fields at tabs.
Do not expect fields to be enclosed within any quoting characters.
          Interpret characters preceded by the escape character
          “\” as escape sequences. For
          example, “\t”,
          “\n”, and
          “\\” signify tab, newline, and
          backslash, respectively. See the discussion of FIELDS
          ESCAPED BY later for the full list of escape
          sequences.
        
      Conversely, the defaults cause
      SELECT ... INTO
      OUTFILE to act as follows when writing output:
    
Write tabs between fields.
Do not enclose fields within any quoting characters.
          Use “\” to escape instances of
          tab, newline, or “\” that
          occur within field values.
        
Write newlines at the ends of lines.
        If you have generated the text file on a Windows system, you
        might have to use LINES TERMINATED BY '\r\n'
        to read the file properly, because Windows programs typically
        use two characters as a line terminator. Some programs, such as
        WordPad, might use \r as a
        line terminator when writing files. To read such files, use
        LINES TERMINATED BY '\r'.
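        For example, to load a file produced by a typical Windows
        program (the table and file names are illustrative):

LOAD DATA INFILE 'data.txt' INTO TABLE my_table
  LINES TERMINATED BY '\r\n';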
      
      If all the lines you want to read in have a common prefix that you
      want to ignore, you can use LINES STARTING BY
      'prefix_string' to skip over the prefix, and
      anything before it. If a line does not include the prefix, the
      entire line is skipped. Suppose that you issue the following
      statement:
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test FIELDS TERMINATED BY ',' LINES STARTING BY 'xxx';
If the data file looks like this:
xxx"abc",1
something xxx"def",2
"ghi",3
      The resulting rows will be ("abc",1) and
      ("def",2). The third row in the file is skipped
      because it does not contain the prefix.
    
      The IGNORE number LINES option can be used to
      ignore lines at the start of the file. For example, you can use
      IGNORE 1 LINES to skip over an initial header
      line containing column names:
    
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test IGNORE 1 LINES;
      When you use SELECT ...
      INTO OUTFILE in tandem with
      LOAD DATA
      INFILE to write data from a database into a file and
      then read the file back into the database later, the field- and
      line-handling options for both statements must match. Otherwise,
      LOAD DATA
      INFILE will not interpret the contents of the file
      properly. Suppose that you use
      SELECT ... INTO
      OUTFILE to write a file with fields delimited by commas:
    
SELECT * INTO OUTFILE 'data.txt' FIELDS TERMINATED BY ',' FROM table2;
To read the comma-delimited file back in, the correct statement would be:
LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY ',';
      If instead you tried to read in the file with the statement shown
      following, it wouldn't work because it instructs
      LOAD DATA
      INFILE to look for tabs between fields:
    
LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY '\t';
The likely result is that each input line would be interpreted as a single field.
      LOAD DATA
      INFILE can be used to read files obtained from external
      sources. For example, many programs can export data in
      comma-separated values (CSV) format, such that lines have fields
      separated by commas and enclosed within double quotes, with an
      initial line of column names. If the lines in such a file are
      terminated by carriage return/newline pairs, the statement shown
      here illustrates the field- and line-handling options you would
      use to load the file:
    
LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES;
      If the input values are not necessarily enclosed within quotes,
      use OPTIONALLY before the ENCLOSED
      BY keywords.
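      That is, for a CSV file in which string values are quoted but
      numeric values are not, the statement would become (the table
      name is illustrative):

LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES;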
    
      Any of the field- or line-handling options can specify an empty
      string (''). If not empty, the FIELDS
      [OPTIONALLY] ENCLOSED BY and FIELDS ESCAPED
      BY values must be a single character. The
      FIELDS TERMINATED BY, LINES STARTING
      BY, and LINES TERMINATED BY values
      can be more than one character. For example, to write lines that
      are terminated by carriage return/linefeed pairs, or to read a
      file containing such lines, specify a LINES TERMINATED BY
      '\r\n' clause.
    
      To read a file containing jokes that are separated by lines
      consisting of %%, you can do this:
    
CREATE TABLE jokes
  (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
   joke TEXT NOT NULL);
LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes
  FIELDS TERMINATED BY ''
  LINES TERMINATED BY '\n%%\n' (joke);
      FIELDS [OPTIONALLY] ENCLOSED BY controls
      quoting of fields. For output
      (SELECT ... INTO
      OUTFILE), if you omit the word
      OPTIONALLY, all fields are enclosed by the
      ENCLOSED BY character. An example of such
      output (using a comma as the field delimiter) is shown here:
    
"1","a string","100.20"
"2","a string containing a , comma","102.20"
"3","a string containing a \" quote","102.20"
"4","a string containing a \", quote and comma","102.20"
      If you specify OPTIONALLY, the
      ENCLOSED BY character is used only to enclose
      values from columns that have a string data type (such as
      CHAR,
      BINARY,
      TEXT, or
      ENUM):
    
1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a \" quote",102.20
4,"a string containing a \", quote and comma",102.20
      Note that occurrences of the ENCLOSED BY
      character within a field value are escaped by prefixing them with
      the ESCAPED BY character. Also note that if you
      specify an empty ESCAPED BY value, it is
      possible to inadvertently generate output that cannot be read
      properly by LOAD DATA
      INFILE. For example, the preceding output just shown
      would appear as follows if the escape character is empty. Observe
      that the second field in the fourth line contains a comma
      following the quote, which (erroneously) appears to terminate the
      field:
    
1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a " quote",102.20
4,"a string containing a ", quote and comma",102.20
      For input, the ENCLOSED BY character, if
      present, is stripped from the ends of field values. (This is true
      regardless of whether OPTIONALLY is specified;
      OPTIONALLY has no effect on input
      interpretation.) Occurrences of the ENCLOSED BY
      character preceded by the ESCAPED BY character
      are interpreted as part of the current field value.
    
      If the field begins with the ENCLOSED BY
      character, instances of that character are recognized as
      terminating a field value only if followed by the field or line
      TERMINATED BY sequence. To avoid ambiguity,
      occurrences of the ENCLOSED BY character within
      a field value can be doubled and are interpreted as a single
      instance of the character. For example, if ENCLOSED BY
      '"' is specified, quotes are handled as shown here:
    
"The ""BIG"" boss"  -> The "BIG" boss
The "BIG" boss      -> The "BIG" boss
The ""BIG"" boss    -> The ""BIG"" boss
      FIELDS ESCAPED BY controls how to read or write
      special characters:
    
          For input, if the FIELDS ESCAPED BY
          character is not empty, occurrences of that character are
          stripped and the following character is taken literally as
           part of a field value. Some two-character sequences are
           exceptions, where the first character is the escape character.
          These sequences are shown in the following table (using
          “\” for the escape character).
          The rules for NULL handling are described
          later in this section.
        
\0    An ASCII NUL (0x00) character
\b    A backspace character
\n    A newline (linefeed) character
\r    A carriage return character
\t    A tab character
\Z    ASCII 26 (Control-Z)
\N    NULL
          For more information about
          “\”-escape syntax, see
          Section 8.1.1, “Strings”.
        
          If the FIELDS ESCAPED BY character is
          empty, escape-sequence interpretation does not occur.
        
          For output, if the FIELDS ESCAPED BY
          character is not empty, it is used to prefix the following
          characters on output:
        
              The FIELDS ESCAPED BY character
            
              The FIELDS [OPTIONALLY] ENCLOSED BY
              character
            
              The first character of the FIELDS TERMINATED
              BY and LINES TERMINATED BY
              values
            
              ASCII 0 (what is actually written
              following the escape character is ASCII
              “0”, not a zero-valued
              byte)
            
          If the FIELDS ESCAPED BY character is
          empty, no characters are escaped and NULL
          is output as NULL, not
          \N. It is probably not a good idea to
          specify an empty escape character, particularly if field
          values in your data contain any of the characters in the list
          just given.
        
In certain cases, field- and line-handling options interact:
          If LINES TERMINATED BY is an empty string
          and FIELDS TERMINATED BY is nonempty, lines
          are also terminated with FIELDS TERMINATED
          BY.
        
          If the FIELDS TERMINATED BY and
          FIELDS ENCLOSED BY values are both empty
          (''), a fixed-row (nondelimited) format is
          used. With fixed-row format, no delimiters are used between
          fields (but you can still have a line terminator). Instead,
          column values are read and written using a field width wide
          enough to hold all values in the field. For
          TINYINT,
          SMALLINT,
          MEDIUMINT,
          INT, and
          BIGINT, the field widths are 4,
          6, 8, 11, and 20, respectively, no matter what the declared
          display width is.
        
          LINES TERMINATED BY is still used to
          separate lines. If a line does not contain all fields, the
          rest of the columns are set to their default values. If you do
          not have a line terminator, you should set this to
          ''. In this case, the text file must
          contain all fields for each row.
        
          Fixed-row format also affects handling of
          NULL values, as described later. Note that
           fixed-row format does not work if you are using a multi-byte
          character set.
        
      Handling of NULL values varies according to the
      FIELDS and LINES options in
      use:
    
          For the default FIELDS and
          LINES values, NULL is
          written as a field value of \N for output,
          and a field value of \N is read as
          NULL for input (assuming that the
          ESCAPED BY character is
          “\”).
        
          If FIELDS ENCLOSED BY is not empty, a field
          containing the literal word NULL as its
          value is read as a NULL value. This differs
          from the word NULL enclosed within
          FIELDS ENCLOSED BY characters, which is
          read as the string 'NULL'.
        
          If FIELDS ESCAPED BY is empty,
          NULL is written as the word
          NULL.
        
          With fixed-row format (which is used when FIELDS
          TERMINATED BY and FIELDS ENCLOSED
          BY are both empty), NULL is
          written as an empty string. Note that this causes both
          NULL values and empty strings in the table
          to be indistinguishable when written to the file because both
          are written as empty strings. If you need to be able to tell
          the two apart when reading the file back in, you should not
          use fixed-row format.
        
      An attempt to load NULL into a NOT
      NULL column causes assignment of the implicit default
      value for the column's data type and a warning, or an error in
      strict SQL mode. Implicit default values are discussed in
      Section 10.1.4, “Data Type Default Values”.
    
      Some cases are not supported by
      LOAD DATA
      INFILE:
    
          Fixed-size rows (FIELDS TERMINATED BY and
          FIELDS ENCLOSED BY both empty) and
          BLOB or
          TEXT columns.
        
          If you specify one separator that is the same as or a prefix
          of another, LOAD
          DATA INFILE cannot interpret the input properly. For
          example, the following FIELDS clause would
          cause problems:
        
FIELDS TERMINATED BY '"' ENCLOSED BY '"'
          If FIELDS ESCAPED BY is empty, a field
          value that contains an occurrence of FIELDS ENCLOSED
          BY or LINES TERMINATED BY
          followed by the FIELDS TERMINATED BY value
          causes LOAD DATA
          INFILE to stop reading a field or line too early.
          This happens because
          LOAD DATA
          INFILE cannot properly determine where the field or
          line value ends.
        
      The following example loads all columns of the
      persondata table:
    
LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;
      By default, when no column list is provided at the end of the
      LOAD DATA
      INFILE statement, input lines are expected to contain a
      field for each table column. If you want to load only some of a
      table's columns, specify a column list:
    
LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata (col1,col2,...);
You must also specify a column list if the order of the fields in the input file differs from the order of the columns in the table. Otherwise, MySQL cannot tell how to match input fields with table columns.
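      For example, if the fields in the file appear in the order
      (col2, col1, col3) while the table was defined with its columns
      in the order (col1, col2, col3), you would write (the column
      names are illustrative):

LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata (col2,col1,col3);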
      The column list can contain either column names or user variables.
      With user variables, the SET clause enables you
      to perform transformations on their values before assigning the
      result to columns.
    
      User variables in the SET clause can be used in
      several ways. The following example uses the first input column
      directly for the value of t1.column1, and
      assigns the second input column to a user variable that is
      subjected to a division operation before being used for the value
      of t1.column2:
    
LOAD DATA INFILE 'file.txt' INTO TABLE t1 (column1, @var1) SET column2 = @var1/100;
      The SET clause can be used to supply values not
      derived from the input file. The following statement sets
      column3 to the current date and time:
    
LOAD DATA INFILE 'file.txt' INTO TABLE t1 (column1, column2) SET column3 = CURRENT_TIMESTAMP;
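      The SET clause is also a convenient place to normalize input
      values. For example, to store empty input fields as
      NULL rather than as the empty string, route the field
      through a user variable (the table and column names are
      illustrative):

LOAD DATA INFILE 'file.txt'
  INTO TABLE t1
  (column1, @var1)
  SET column2 = NULLIF(@var1, '');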
You can also discard an input value by assigning it to a user variable and not assigning the variable to a table column:
LOAD DATA INFILE 'file.txt' INTO TABLE t1 (column1, @dummy, column2, @dummy, column3);
      Use of the column/variable list and SET clause
      is subject to the following restrictions:
    
          Assignments in the SET clause should have
          only column names on the left hand side of assignment
          operators.
        
          You can use subqueries in the right hand side of
          SET assignments. A subquery that returns a
          value to be assigned to a column may be a scalar subquery
          only. Also, you cannot use a subquery to select from the table
          that is being loaded.
        
          Lines ignored by an IGNORE clause are not
          processed for the column/variable list or
          SET clause.
        
User variables cannot be used when loading data with fixed-row format because user variables do not have a display width.
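      To illustrate the subquery restriction noted above, a scalar
      subquery against a table other than the one being loaded is
      permitted in a SET assignment (the table and column
      names here are illustrative):

LOAD DATA INFILE 'file.txt'
  INTO TABLE t1
  (column1)
  SET column2 = (SELECT MAX(id) FROM t2);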
      When processing an input line, LOAD
      DATA splits it into fields and uses the values according
      to the column/variable list and the SET clause,
      if they are present. Then the resulting row is inserted into the
      table. If there are BEFORE INSERT or
      AFTER INSERT triggers for the table, they are
      activated before or after inserting the row, respectively.
    
If an input line has too many fields, the extra fields are ignored and the number of warnings is incremented.
If an input line has too few fields, the table columns for which input fields are missing are set to their default values. Default value assignment is described in Section 10.1.4, “Data Type Default Values”.
An empty field value is interpreted differently than if the field value is missing:
For string types, the column is set to the empty string.
          For numeric types, the column is set to 0.
        
For date and time types, the column is set to the appropriate “zero” value for the type. See Section 10.3, “Date and Time Types”.
      These are the same values that result if you assign an empty
      string explicitly to a string, numeric, or date or time type in an
      INSERT or
      UPDATE statement.
    
      TIMESTAMP columns are set to the
      current date and time only if there is a NULL
      value for the column (that is, \N) and the
      column is not declared to allow NULL values, or
      if the TIMESTAMP column's default
      value is the current timestamp and it is omitted from the field
      list when a field list is specified.
    
      LOAD DATA
      INFILE regards all input as strings, so you cannot use
      numeric values for ENUM or
      SET columns the way you can with
      INSERT statements. All
      ENUM and
      SET values must be specified as
      strings.
    
      BIT values cannot be loaded using
      binary notation (for example, b'011010'). To
      work around this, specify the values as regular integers and use
      the SET clause to convert them so that MySQL
      performs a numeric type conversion and loads them into the
      BIT column properly:
    
shell> cat /tmp/bit_test.txt
2
127
shell> mysql test
mysql> LOAD DATA INFILE '/tmp/bit_test.txt'
    -> INTO TABLE bit_test (@var1) SET b = CAST(@var1 AS UNSIGNED);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql> SELECT BIN(b+0) FROM bit_test;
+----------+
| BIN(b+0) |
+----------+
| 10       |
| 1111111  |
+----------+
2 rows in set (0.00 sec)
      When the LOAD DATA
      INFILE statement finishes, it returns an information
      string in the following format:
    
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
      If you are using the C API, you can get information about the
      statement by calling the
      mysql_info() function. See
      Section 21.9.3.35, “mysql_info()”.
    
      Warnings occur under the same circumstances as when values are
      inserted via the INSERT statement
      (see Section 12.2.5, “INSERT Syntax”), except that
      LOAD DATA
      INFILE also generates warnings when there are too few or
      too many fields in the input row. The warnings are not stored
      anywhere; the number of warnings can be used only as an indication
      of whether everything went well.
    
      You can use SHOW WARNINGS to get a
      list of the first max_error_count
      warnings as information about what went wrong. See
      Section 12.4.5.41, “SHOW WARNINGS Syntax”.
    

User Comments
I just recently had to figure out how to get a database with rows that look like this, into MySQL:
0001GB000020000300050006700070008000100020040050060007008
Believe it or not, there's like 9 columns in that row. The data from our vendor comes like this.
I am hoping someone finds this post useful. HOW I GOT IT INTO MYSQL:
1. When you setup your MySQL table, make sure it's a compatible format. If your LOAD DATA INFILE is spitting back 'too long for column', then there's something wrong. First check to make sure your Fixed-Width is correct, then check your Table Format. I had to use latin1_general_ci to get the data to load. It would not accept my LOAD DATA INFILE with utf8_general_ci. So that should save some hairs on your head when you pull them out from the frustration.
Here is the code to load the data:
LOAD DATA INFILE 'C:/bobsfile.txt' INTO TABLE mydatabase.mytable
FIELDS TERMINATED BY ''
ENCLOSED BY '';
I hope this helps someone. It could have helped me, the documentation was poor on mentioning WILL NOT WORK IF YOU USE SPECIFIC TABLE FORMATS. a nice list of acceptable or prohibited formats would have been nice.
! :)
Nathan
Converting strings into dates while loading data using LOAD DATA INFILE:
In the following example, we are trying to convert the data in the file for date columns col3, col4 in formats 'mm/dd/yyyy', 'dd/mm/yyyy' into MySQL standard YYYY-mm-dd respectively.
load data infile '/tmp/xxx.dat'
into table xxx
fields terminated by '|'
lines terminated by '\n'
(col1,
col2,
@col3,
@col4,
col5)
set
col3 = str_to_date(@col3, '%m/%d/%Y'),
col4 = str_to_date(@col4, '%d/%m/%Y')
;
You can convert into any format you want by wrapping the str_to_date() call in the date_format() function.
Example:
...
set col2 = date_format(str_to_date(@col2, 'format'), 'your format')
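A concrete instance of that combination (column name and formats are illustrative): reading 'mm/dd/yyyy' from the file and storing it as 'YYYY/mm/dd' text.

```sql
-- @col2 holds e.g. '12/31/2008'; the column receives '2008/12/31'.
set col2 = date_format(str_to_date(@col2, '%m/%d/%Y'), '%Y/%m/%d')
```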
I just ran into the same problem that the first two posters (Nathan Nuebner and Robert Lee) had with fixed-width imports. I suspect that the reason for this behaviour derives from the following statement from the above documentation:
'...column values are read and written using a field width wide enough to hold all values in the field.'
If you have a VARCHAR(20) column in a multi-byte character set (e.g., UTF8), then the "field width wide enough to hold all values" in this field, measured in bytes, will actually be somewhat greater than 20. The two workarounds above worked because they both specified character sets which allocate one byte per character (latin1 and binary).
Specifying the character set in the LOAD DATA INFILE statement does not seem to work around the problem - that seems only to affect the incoming conversion from bytes to characters, it doesn't affect the number of bytes read.
The Latin1/binary examples above worked because they weren't trying to load multi-byte characters; for someone trying to import multi-byte characters (or, more specifically, character sets like UTF8 that use a variable-width encoding), they would not work. There doesn't appear to be an easy workaround that I can see, except to write an import utility in another programming language like Perl, Java or C.
Step by step guide on how to import Excel data using LOAD DATA: http://www.tjitjing.com/blog/2008/02/import-excel-data-into-mysql-in-5-easy.html
To load a text file with fixed width columns, I used the form:
LOAD DATA LOCAL INFILE '<file name>' INTO TABLE <table>
(@var1)
SET Date=str_to_date(SUBSTR(@var1,3,10),'%m/%d/%Y'),
Time=SUBSTR(@var1,14,8),
WindVelocity=SUBSTR(@var1,26,5),
WindDirection=SUBSTR(@var1,33,3),
WindCompass=SUBSTR(@var1,38,3),
WindNorth=SUBSTR(@var1,43,6),
WindEast=SUBSTR(@var1,51,6),
WindSamples=SUBSTR(@var1,61,4);
Importing floating-point numbers that use a comma as the decimal separator requires the same trick as dates:
LOAD DATA LOCAL INFILE 'C:/path/to/mytable.txt' IGNORE
INTO TABLE mytable
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\r\n'
(int_col, @float_col)
SET float_col = replace(@float_col, ',', '.');
MySQL casts the value into the column type when it reads it from the file, *before* applying the transformations described in the SET clause. If you instruct it to read it into a variable the value is handled as string from the beginning.
From what I can tell, you can change the character set with the appropriate client option:
mysql --default-character-set=utf8
It seems to work for me.
If you get "Skipped records" using "LOAD DATA LOCAL INFILE", copy the data file to the actual database server and do the load without the "LOCAL" keyword.
The load will then stop when an error occurs; 9 times out of 10 it will be an index issue, and you will know why there are skipped records.
e.g. LOAD DATA LOCAL INFILE 'myinfile.txt';
Query OK, 288168 rows affected (1 min 44.49 sec)
Records: 494522 Deleted: 0 Skipped: 206354 Warnings: 0
LOAD DATA INFILE '/data/input/myinfile.txt';
Query OK, 252243 rows affected (0.02 sec)
ERROR 1062 (23000): Duplicate entry '5935009001-2008-08-03 04:19:18' for key 1
I've looked for a way to conditionally ignore rows in an INFILE for a long time. There may be an obvious way to do this but I have never seen one.
Today I discovered a very neat way to achieve this.
Assume that you have an INFILE containing names and genders of people. In my case, the INFILE has fixed-width fields: the first column is either 1 (=Male) or 2 (=Female), and columns 2-18 contain the name of the person.
Now, if you want to load only the males (row[0] == 1), create the following table.
CREATE TABLE Names
(
name varchar(255),
gender tinyint
)
PARTITION BY LIST (gender)
(
PARTITION Male VALUES IN (1)
#,PARTITION Female VALUES IN (2)
);
Note that the Female partition is commented out.
Now load the data normally, but be sure to specify the IGNORE keyword.
LOAD DATA INFILE '/tmp/names.dmp' IGNORE INTO TABLE Names (@var)
SET
Name=Trim(SUBSTR(@var,2,17)),
Gender=SUBSTR(@var,1,1)
;
The IGNORE keyword prevents MySQL from aborting the import mid-file due to the missing partition.
Hope this is helpful to someone.
Here's a great way to use LOAD DATA for UPDATEs, reaping the performance benefits of the super fast LOAD DATA when doing massive data updates:
http://www.softwareprojects.com/resources/programming/t-how-to-use-mysql-fast-load-data-for-updates-1753.html
When the situation warrants, I will use load data infile because of its speed. In other situations I'll resort to using another method to load the data. I wish load data had the ability to output a .bad file. Without this, I must resort to writing code (PHP) to output bad records to a file.
I have an article about different methods to get data into MySQL; I prefer PHP's MySQLi prepared statements because LOAD DATA lacks .bad files.
http://chrisjohnson.blogsite.org/php-and-mysql-data-import-performance/
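In the absence of a .bad file, one SQL-only sketch (table and column names here are hypothetical) is to load into a permissive all-text staging table and then move across only the rows that validate; whatever is left behind is your bad-record set:

```sql
-- Every row loads, since a VARCHAR column rejects nothing on type grounds.
CREATE TABLE staging (raw_val VARCHAR(255));
LOAD DATA INFILE '/tmp/data.txt' INTO TABLE staging;
-- Keep rows that look numeric; anything else stays behind as the "bad" set.
INSERT INTO target (num_col)
  SELECT raw_val FROM staging WHERE raw_val REGEXP '^[0-9]+$';
SELECT raw_val FROM staging WHERE raw_val NOT REGEXP '^[0-9]+$';
```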