您好,登錄后才能下訂單哦!
對于LOAD DATA INFILE和SELECT … INTO OUTFILE語句中,FIELDS和LINES子句的語法完全相同。兩個子句在LOAD DATA INFILE和SELECT … INTO OUTFILE語句中都是可選的,但如果兩個子句都被指定,則FIELDS必須在LINES之前,否則報語法錯誤
當mysql server導出文本數據到文件時,FIELDS和LINES默認值時SELECT … INTO OUTFILE在輸出文本數據時行為如下:
- admin@localhost : xiaoboluo 03:08:34> select * from test3 into outfile "/tmp/test3.txt" FIELDS TERMINATED BY ',';
- Query OK, 4 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 03:08:37> system cat /tmp/test3.txt
- 2,a string,100.20
- 4,a string containing a \, comma,102.20
- 6,a string containing a " quote,102.20
- 8,a string containing a "\, quote and comma,102.20
- # 指定字段引用符為",不使用optionally關鍵字
- admin@localhost : xiaoboluo 03:33:33> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 03:37:21> select * from test3 into outfile "/tmp/test3.txt" FIELDS ENCLOSED BY '"';
- Query OK, 5 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 03:37:33> system cat /tmp/test3.txt
- "2" "a string" "100.20"
- "4" "a string containing a , comma" "102.20"
- "6" "a string containing a \" quote" "102.20"
- "8" "a string containing a \", quote and comma" "102.20"
- "10" "\\t" "102.20"
- # 指定字段引用符為",使用optionally關鍵字,可以看到id列的字段引用符去掉了
- admin@localhost : xiaoboluo 03:37:41> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 03:40:53> select * from test3 into outfile "/tmp/test3.txt" FIELDS optionally ENCLOSED BY '"';
- Query OK, 5 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 03:41:03> system cat /tmp/test3.txt
- 2 "a string" "100.20"
- 4 "a string containing a , comma" "102.20"
- 6 "a string containing a \" quote" "102.20"
- 8 "a string containing a \", quote and comma" "102.20"
- 10 "\\t" "102.20
1.2.6.2. LINES 關鍵字及其子句詳解
- admin@localhost : xiaoboluo 03:42:41> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 03:44:18> select * from test3 into outfile "/tmp/test3.txt" fields escaped by '.';
- Query OK, 5 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 03:44:25> system cat /tmp/test3.txt # 可以看到數據中指定的轉義符.號被轉義了,而數據\t沒有被轉義
- 2 a string 100..20
- 4 a string containing a , comma 102..20
- 6 a string containing a " quote 102..20
- 8 a string containing a ", quote and comma 102..20
- 10 \t 102..20
- admin@localhost : xiaoboluo 03:44:28> truncate test3; #清空表
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 03:45:19> load data infile "/tmp/test3.txt" into table test3 fields escaped by '.'; #導入數據時指定轉義符為.號
- Query OK, 5 rows affected (0.00 sec)
- Records: 5 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 03:45:40> select * from test3; #校驗數據,可以看到導入數據正常
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- | 10 | \t | 102.20 |
- +----+------------------------------------------+--------+
- 5 rows in set (0.00 sec)
如果您想要讀取的純文本文件中所有行都有一個您想要忽略的公用前綴,則可以使用LINES STARTING BY'prefix_string'來跳過這個前綴,以及前綴字符前面的任何內容。如果某行數據不包含前綴字符,則跳過整行內容,例
1.2.6.3. FIELDS和LINES注意事項
- # load data語句如下
- admin@localhost : xiaoboluo 03:48:04> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 03:54:54> select * from test3 into outfile "/tmp/test3.txt" LINES STARTING BY 'xxx';
- Query OK, 5 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 03:55:03> system cat /tmp/test3.txt #可以看到每行數據前面多了個行前綴字符串xxx
- xxx2 a string 100.20
- xxx4 a string containing a , comma 102.20
- xxx6 a string containing a " quote 102.20
- xxx8 a string containing a ", quote and comma 102.20
- xxx10 \\t 102.20
- # 現在,到shell命令行去修改一下,增加兩行
- admin@localhost : xiaoboluo 03:55:50> system cat /tmp/test3.txt # 最后要加載的純文本數據內容如下
- xxx2 a string 100.20
- xxx4 a string containing a , comma 102.20
- xxx6 a string containing a " quote 102.20
- xxx8 a string containing a ", quote and comma 102.20
- xxx10 \\t 102.20
- 12 \\t 102.20
- dfadsfasxxx14 \\t 102.20
- admin@localhost : xiaoboluo 03:59:03> truncate test3; #清空表
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 03:59:38> load data infile "/tmp/test3.txt" into table test3 LINES STARTING BY 'xxx'; #導入數據,指定行前綴字符為xxx
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 03:59:44> select * from test3; #校驗表數據,可以看到沒有xxx行前綴的行被忽略了,而包含xxx的最后一行,從xxx開始截斷,xxx字符本身及其之前的內容被忽略,\
- xxx之后的內容被解析為行數據導入了
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- | 10 | \t | 102.20 |
- | 14 | \t | 102.20 |
- +----+------------------------------------------+--------+
- 6 rows in set (0.00 sec)
- 行結束符(換行符),linux下默認為\n,使用子句lines terminated by 'string' 指定,其中string代表指定的換行符
- # 指定換行符為\r\n導出數據
- admin@localhost : xiaoboluo 03:59:49> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 04:02:22> select * from test3 into outfile "/tmp/test3.txt" lines terminated by '\r\n';
- Query OK, 6 rows affected (0.00 sec)
- # 由于linux的一些命令本身會解析掉這些特殊字符,所以使用python來查看這個文本文件中的換行符,從下面的結果中可以看到,列表的每一個元素代表一行數據,每一個元素的\
- 末尾的\r\n就是這行數據的換行符
- >>> f = open('/tmp/test3.txt','r')
- >>> data = f.readlines()
- >>> data
- ['2\ta string\t100.20\r\n', '4\ta string containing a , comma\t102.20\r\n', '6\ta string containing a " quote\t102.20\r\n', '8\ta string containing a ", quote and comma\t102.20\r\n', '10\t\\\\t\t102.20\r\n', \
- '14\t\\\\t\t102.20\r\n']
- >>>
- # 現在,把數據重新導入表,從下面的結果中可以看到,導入表中的數據正確
- admin@localhost : xiaoboluo 04:02:39> truncate test3;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 04:04:55> load data infile "/tmp/test3.txt" into table test3 lines terminated by '\r\n';
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 04:05:11> select * from test3;
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- | 10 | \t | 102.20 |
- | 14 | \t | 102.20 |
- +----+------------------------------------------+--------+
- 6 rows in set (0.00 sec)
眾所周知,MySQL中反斜杠是SQL語句中特殊字符的轉義字符,因此在sql語句中碰到特殊字符時,您必須指定一個或者兩個反斜杠來為特殊字符轉義(如在mysql中或者一些其他程序中,\n代表換行符,\t代表制表符,\代表轉義符,那么需要使用\t來轉義制表符,\n來轉義換行符,\來轉義轉義符本身,這樣才能正確寫入數據庫或者生成導出的數據文本,使用FIELDS ESCAPED BY子句指定轉義符
特殊字符列表如
- \0 ASCII NUL (X'00') 字符
- \b 退格字符
- \n 換行符
- \r 回車符
- \t 制表符
- \Z ASCII 26 (Control+Z)
- \N NULL值,如果轉義符值為空,則會直接導出null字符串作為數據,這在導入時將把null作為數據導入,而不是null符號
- # 字段引用符為",數據中包含",轉義符和換行符保持默認,導入數據時不會有任何問題
- admin@localhost : xiaoboluo 09:46:14> select * from test3;
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- +----+------------------------------------------+--------+
- 4 rows in set (0.00 sec)
- admin@localhost : xiaoboluo 09:46:17> select * from test3 into outfile "/tmp/test3.txt" FIELDS OPTIONALLY enclosed BY '"';
- Query OK, 4 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 09:46:23> system cat /tmp/test3.txt;
- 2 "a string" "100.20"
- 4 "a string containing a , comma" "102.20"
- 6 "a string containing a \" quote" "102.20"
- 8 "a string containing a \", quote and comma" "102.20" # 可以看到與字段引用符相同的符號數據被轉義了
- admin@localhost : xiaoboluo 09:54:41> truncate test3;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 09:58:01> load data infile '/tmp/test3.txt' into table test3 FIELDS OPTIONALLY enclosed BY '"' TERMINATED by ',';
- Query OK, 4 rows affected (0.00 sec)
- Records: 4 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 09:58:45> select * from test3;
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- +----+------------------------------------------+--------+
- 4 rows in set (0.00 sec)
- # 如果字段引用符為",字段分隔符為,且數據中包含字段引用符"和字段分隔符,,轉義符和換行符保持默認,這在導入數據時不會有任何問題
- admin@localhost : xiaoboluo 09:53:45> select * from test3 into outfile "/tmp/test3.txt" FIELDS OPTIONALLY enclosed BY '"' TERMINATED by ',';
- Query OK, 4 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 09:54:29> system cat /tmp/test3.txt;
- 2,"a string","100.20"
- 4,"a string containing a , comma","102.20"
- 6,"a string containing a \" quote","102.20"
- 8,"a string containing a \", quote and comma","102.20"
- admin@localhost : xiaoboluo 09:54:41> truncate test3;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 09:58:01> load data infile '/tmp/test3.txt' into table test3 FIELDS OPTIONALLY enclosed BY '"' TERMINATED by ',';
- Query OK, 4 rows affected (0.00 sec)
- Records: 4 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 09:58:45> select * from test3;
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- +----+------------------------------------------+--------+
- 4 rows in set (0.00 sec)
- # 但是,如果在字段引用符為",數據中包含",字段分隔符使用逗號,換行符保持默認的情況下,轉義符使用了空串,這會導致在導入數據時,第四行無法正確解析,報錯
- admin@localhost : xiaoboluo 09:58:01> load data infile '/tmp/test3.txt' into table test3 FIELDS OPTIONALLY enclosed BY '"' TERMINATED by ',';
- Query OK, 4 rows affected (0.00 sec)
- Records: 4 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 09:58:45> select * from test3;
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- +----+------------------------------------------+--------+
- 4 rows in set (0.00 sec)
- admin@localhost : xiaoboluo 09:58:49> select * from test3 into outfile "/tmp/test3_test.txt" FIELDS OPTIONALLY enclosed BY '"' TERMINATED by ',' escaped by '';
- Query OK, 4 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 10:00:42> system cat /tmp/test3_test.txt;
- 2,"a string","100.20"
- 4,"a string containing a , comma","102.20"
- 6,"a string containing a " quote","102.20" #關于這一行數據,需要說明一下ENCLOSED BY子句,該子句指定的引用符號從一個FIELDS TERMINATED BY子句指定的分隔符開始,直到碰到下一個\
- 分隔符之間且這個分隔符前面一個字符必須是字段引用符號(如果這個分隔符前面一個字符不是字段引用符,則繼續往后匹配,如第二行數據),在這之間的內容都會被當作整個列字符串處理,\
- 所以這一行數據在導入時不會發生解析錯誤
- 8,"a string containing a ", quote and comma","102.20" #這一行因為無法正確識別的字段結束位置,所以無法導入,報錯終止,前面正確的行也被回滾掉(binlog_format=row)
- admin@localhost : xiaoboluo 10:00:49> truncate test3;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 10:01:03> load data infile '/tmp/test3_test.txt' into table test3 FIELDS OPTIONALLY enclosed BY '"' TERMINATED by ',' escaped by '';
- ERROR 1262 (01000): Row 4 was truncated; it contained more data than there were input columns
- admin@localhost : xiaoboluo 10:01:33> select * from test3;
- Empty set (0.00 sec)
- # 數據中包含了默認的轉義符和指定的字段分隔符,字段引用符和行分隔符使用默認值,則在數據中的轉義符和字段分隔符會被轉義(只要不為空,則不管字段分隔符和轉義字符定義為什么值,\
- 都會被轉義)
- admin@localhost : xiaoboluo 03:08:45> insert into test3(test,test2) values('\\t','102.20');
- Query OK, 1 row affected (0.00 sec)
- admin@localhost : xiaoboluo 03:17:29> select * from test3;
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- | 10 | \t | 102.20 |
- +----+------------------------------------------+--------+
- 5 rows in set (0.00 sec)
- admin@localhost : xiaoboluo 03:17:32> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 03:17:39> select * from test3 into outfile "/tmp/test3.txt" FIELDS TERMINATED BY ',';
- Query OK, 5 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 03:17:42> system cat /tmp/test3.txt
- 2,a string,100.20
- 4,a string containing a \, comma,102.20
- 6,a string containing a " quote,102.20
- 8,a string containing a "\, quote and comma,102.20
- 10,\\t,102.20
- # 假設您執行SELECT ... INTO OUTFILE語句時使用了逗號作為列分隔符:
- SELECT * INTO OUTFILE 'data.txt'
- FIELDS TERMINATED BY ','
- FROM table2;
- # 如果您嘗試使用\t作為列分隔符,則它將無法正常工作,因為它會指示LOAD DATA INFILE在字段之間查找制表符,可能導致每個數據行整行解析時被當作單個字段:
- LOAD DATA INFILE 'data.txt' INTO TABLE table2
- FIELDS TERMINATED BY '\t';
- # 要正確讀取逗號分隔各列的文件,正確的語句是
- LOAD DATA INFILE 'data.txt' INTO TABLE table2
- FIELDS TERMINATED BY ','
- # 如果LINES TERMINATED BY換行符指定了一個空字符,并且FIELDS TERMINATED BY字段分隔符指定的是非空的一個字符(或者使用默認值\t),則行也會以字段分隔符作為行的結束符\
- (表現行為就是文本中最后一個字符就是字段分隔符),即整個文本看上去就是一整行數據了
- admin@localhost : xiaoboluo 04:48:35> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 04:53:59> select * from test3 into outfile "/tmp/test3.txt" FIELDS TERMINATED BY ',' lines terminated by '';
- Query OK, 6 rows affected (0.00 sec)
- # 使用python查看文本內容,從下面的結果中可以看到,整個表的數據由于換行符為空,所以導致都拼接為一行了,最后行結束符使用了字段分隔符逗號
- >>> f = open('/tmp/test3.txt','r')
- >>> data = f.readlines()
- >>> data
- ['2,a string,100.20,4,a string containing a \\, comma,102.20,6,a string containing a " quote,102.20,8,a string containing a "\\, quote and comma,102.20,10,\\\\t,102.20,14,\\\\t,102.20,']
- >>>
- # 導入數據到表,這里新建一張表來進行導入測試,預防清理掉了表數據之后,文本內容又無法正確導入的情況發生
- admin@localhost : xiaoboluo 04:57:52> create table test4 like test3;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 04:57:59> load data infile "/tmp/test3.txt" into table test4 FIELDS TERMINATED BY ',' lines terminated by '';
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 04:58:26> select * from test4; #從查詢結果上看,數據正確導入表test4中了
- +----+------------------------------------------+--------+
- | id | test | test2 |
- +----+------------------------------------------+--------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- | 10 | \t | 102.20 |
- | 14 | \t | 102.20 |
- +----+------------------------------------------+--------+
- 6 rows in set (0.00 sec)
- # 如果FIELDS TERMINATED BY和FIELDS ENCLOSED BY值都為空(''),則使用固定行(非限制)格式。使用固定行格式時,字段之間使用足夠寬的空格來分割各字段。對于數據類型\
- 是TINYINT,SMALLINT,MEDIUMINT,INT和BIGINT,字段寬度分別為4,6,8,11和20個空格(無論數據類型聲明的顯示寬度如何),對于varchar類型使用大約298個空格(這個空格數量是自己\
- 數的。。。,猜想這個空格數量可能與字符集,varchar定義長度有關,因為我在嘗試把varchar定義為50個字符的時候,空格少了156個左右)
- admin@localhost : xiaoboluo 04:58:31> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 05:04:05> select * from test3 into outfile "/tmp/test3.txt" FIELDS TERMINATED BY '' lines terminated by '';
- Query OK, 6 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 05:04:17> system cat /tmp/test3.txt #下面展示內容中把打斷空格使用...代替
- 2 a string ... 100.20 ... 4 a string containing a , comma ... 102.20 ... 6 a string containing a " quote ...102.20 ... 8 a string containing a ", quote and comma ... 102.20 \
- ... 10 \\t ... 102.20 ... 14 \\t ... 102.20 ... admin@localhost : xiaoboluo 05:04:35>
- # 現在,清理掉test4表,并載入數據,從下面的結果中可以看到,導入表中之后,雖然數據是對的,但是多了非常多的空格,那么也就意味著你需要使用程序正確地處理一下這些多余的空格之后,\
- 再執行導入
- admin@localhost : xiaoboluo 05:06:19> truncate test4;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 05:06:47> load data infile "/tmp/test3.txt" into table test4 FIELDS TERMINATED BY '' lines terminated by ''; # 注意:這是在sql_mode=''時導入的,如果不修改\
- sql_mode請使用local關鍵字
- Query OK, 6 rows affected, 12 warnings (0.01 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 12
- Note (Code 1265): Data truncated for column 'test' at row 1
- Note (Code 1265): Data truncated for column 'test2' at row 1
- Note (Code 1265): Data truncated for column 'test' at row 2
- Note (Code 1265): Data truncated for column 'test2' at row 2
- Note (Code 1265): Data truncated for column 'test' at row 3
- Note (Code 1265): Data truncated for column 'test2' at row 3
- Note (Code 1265): Data truncated for column 'test' at row 4
- Note (Code 1265): Data truncated for column 'test2' at row 4
- Note (Code 1265): Data truncated for column 'test' at row 5
- Note (Code 1265): Data truncated for column 'test2' at row 5
- Note (Code 1265): Data truncated for column 'test' at row 6
- Note (Code 1265): Data truncated for column 'test2' at row 6
- admin@localhost : xiaoboluo 05:07:09> select * from test4;
- +----+------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
- | id | test | test2 |
- +----+------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
- | 2 | a string | 100.20 |
- | 4 | a string containing a , comma | 102.20 |
- | 6 | a string containing a " quote | 102.20 |
- | 8 | a string containing a ", quote and comma | 102.20 |
- | 10 | \t | 102.20 |
- | 14 | \t | 102.20 |
- +----+------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
- 6 rows in set (0.00 sec)
- # 對于默認的FIELDS和LINES值,NULL值被轉義為\N輸出,字段值\N讀取時使用NULL替換并輸入(假設ESCAPED BY字符為\)
- admin@localhost : xiaoboluo 05:17:07> alter table test3 add column test3 varchar(10); #添加一個字段test3,默認值會被填充為null
- Query OK, 0 rows affected (0.04 sec)
- Records: 0 Duplicates: 0 Warnings: 0
- admin@localhost : xiaoboluo 05:17:33> select * from test3; #查看表中的test3列數據
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | NULL |
- | 4 | a string containing a , comma | 102.20 | NULL |
- | 6 | a string containing a " quote | 102.20 | NULL |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL |
- | 10 | \t | 102.20 | NULL |
- | 14 | \t | 102.20 | NULL |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
- admin@localhost : xiaoboluo 05:17:37> select * from test3 into outfile "/tmp/test3.txt"; #執行導出
- Query OK, 6 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 05:18:02> system cat /tmp/test3.txt #查看導出的文本文件,可以發現null被轉義為\N了,這是為了避免數據字符串本身包含null值時無法正確區分數據類型的null值
- 2 a string 100.20 \N
- 4 a string containing a , comma 102.20 \N
- 6 a string containing a " quote 102.20 \N
- 8 a string containing a ", quote and comma 102.20 \N
- 10 \\t 102.20 \N
- 14 \\t 102.20 \N
- # 導入數據,從結果中可以看到\N被正確解析為了數據類型的null值
- admin@localhost : xiaoboluo 05:18:06> truncate test3;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 05:20:36> load data infile '/tmp/test3.txt' into table test3;
- Query OK, 6 rows affected (0.01 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 05:20:52> select * from test3;
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | NULL |
- | 4 | a string containing a , comma | 102.20 | NULL |
- | 6 | a string containing a " quote | 102.20 | NULL |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL |
- | 10 | \t | 102.20 | NULL |
- | 14 | \t | 102.20 | NULL |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
- # 如果FIELDS ENCLOSED BY不為空,FIELDS escaped BY為空時,則將NULL值的字面字符串作為輸出字符值。這與FIELDS ENCLOSED BY引用的字符串中包含的null值不同,\
- 后者讀取為字符串'null',而前者讀取到數據庫中時被當作數據類型的null值,而不是數據的字符串null
- admin@localhost : xiaoboluo 05:20:57> update test3 set test3='null' where id=2; #更新id=2的test3列值為數據字符串的null
- Query OK, 1 row affected (0.00 sec)
- Rows matched: 1 Changed: 1 Warnings: 0
- admin@localhost : xiaoboluo 05:23:14> select * from test3;
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | null |
- | 4 | a string containing a , comma | 102.20 | NULL |
- | 6 | a string containing a " quote | 102.20 | NULL |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL |
- | 10 | \t | 102.20 | NULL |
- | 14 | \t | 102.20 | NULL |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
- admin@localhost : xiaoboluo 05:23:16> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 05:24:05> select * from test3 into outfile "/tmp/test3.txt" FIELDS ENCLOSED BY '"' escaped BY ''; #指定行引用符號為雙引號",轉義符為空導出數據
- Query OK, 6 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 05:24:51> system cat /tmp/test3.txt #查看導出的文本文件,可以看到數據字符串的null被加了雙引號,而數據類型的null沒有加雙引號
- "2" "a string" "100.20" "null"
- "4" "a string containing a , comma" "102.20" NULL
- "6" "a string containing a " quote" "102.20" NULL
- "8" "a string containing a ", quote and comma" "102.20" NULL
- "10" "\t" "102.20" NULL
- "14" "\t" "102.20" NULL
- admin@localhost : xiaoboluo 05:24:57> alter table test4 add column test3 varchar(10);
- Query OK, 0 rows affected (0.04 sec)
- Records: 0 Duplicates: 0 Warnings: 0
- admin@localhost : xiaoboluo 05:26:40> truncate test4; #這里使用test4表做測試,避免無法導入的情況發生
- Query OK, 0 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 05:26:44> load data infile '/tmp/test3.txt' into table test4 FIELDS ENCLOSED BY '"' escaped BY ''; #指定字段引用符為雙引號",轉義符為空導入數據
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 05:27:02> select * from test4; #查看表test4中的數據,從結果中可以看到,數據導入正確
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | null |
- | 4 | a string containing a , comma | 102.20 | NULL |
- | 6 | a string containing a " quote | 102.20 | NULL |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL |
- | 10 | \t | 102.20 | NULL |
- | 14 | \t | 102.20 | NULL |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
- # 使用固定行格式(當FIELDS TERMINATED BY和FIELDS ENCLOSED BY都為空時使用),將NULL寫為空字符串。這將導致表中的NULL值和空字符串在寫入文件時無法區分,\
- 因為它們都以空字符串形式寫入文本文件。如果您需要能夠在讀取文件時將其分開,則不應使用固定行格式(即不應該使用FIELDS TERMINATED BY和FIELDS ENCLOSED BY都為空)
- admin@localhost : xiaoboluo 05:29:11> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 05:29:22> select * from test3 into outfile "/tmp/test3.txt" FIELDS ENCLOSED BY '' TERMINATED BY '';
- Query OK, 6 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 05:29:43> system cat /tmp/test3.txt #從結果中看,是不是有點似曾相識呢?沒錯,前面演示過FIELDS TERMINATED BY和FIELDS ENCLOSED BY都為空的情況,\
- 使用了固定格式來導出文本,但是這里多了數據類型的null值處理,從下面的結果中已經看不到數據類型的null了,被轉換為了空值(下面展示時把大段空格使用...代替)
- 2 a string ... 100.20 ... null
- 4 a string containing a , comma ... 102.20 ...
- 6 a string containing a " quote ... 102.20 ...
- 8 a string containing a ", quote and comma ... 102.20 ...
- 10 \\t ... 102.20 ...
- 14 \\t ... 102.20 ...
- admin@localhost : xiaoboluo 05:29:46> truncate test4; #清空test4
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 05:34:15> load data infile "/tmp/test3.txt" into table test4 FIELDS ENCLOSED BY '' TERMINATED BY ''; #執行導入文本到test4表,注意:這是在sql_mode=''時導入的,\
- 如果不修改sql_mode請使用local關鍵字
- Query OK, 6 rows affected, 24 warnings (0.01 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 24
- Note (Code 1265): Data truncated for column 'test' at row 1
- Note (Code 1265): Data truncated for column 'test2' at row 1
- Note (Code 1265): Data truncated for column 'test3' at row 1
- Warning (Code 1262): Row 1 was truncated; it contained more data than there were input columns
- Note (Code 1265): Data truncated for column 'test' at row 2
- Note (Code 1265): Data truncated for column 'test2' at row 2
- Note (Code 1265): Data truncated for column 'test3' at row 2
- Warning (Code 1262): Row 2 was truncated; it contained more data than there were input columns
- Note (Code 1265): Data truncated for column 'test' at row 3
- Note (Code 1265): Data truncated for column 'test2' at row 3
- Note (Code 1265): Data truncated for column 'test3' at row 3
- Warning (Code 1262): Row 3 was truncated; it contained more data than there were input columns
- Note (Code 1265): Data truncated for column 'test' at row 4
- Note (Code 1265): Data truncated for column 'test2' at row 4
- Note (Code 1265): Data truncated for column 'test3' at row 4
- Warning (Code 1262): Row 4 was truncated; it contained more data than there were input columns
- Note (Code 1265): Data truncated for column 'test' at row 5
- Note (Code 1265): Data truncated for column 'test2' at row 5
- Note (Code 1265): Data truncated for column 'test3' at row 5
- Warning (Code 1262): Row 5 was truncated; it contained more data than there were input columns
- Note (Code 1265): Data truncated for column 'test' at row 6
- Note (Code 1265): Data truncated for column 'test2' at row 6
- Note (Code 1265): Data truncated for column 'test3
load data執行時如果表中有外鍵、輔助索引、唯一索引,那么會導致加載數據的時間變慢,因為索引也需要一同更新,可以使用對應參數關閉外鍵檢查、唯一索引檢查甚至關閉索引
如果在sql_mode設置為嚴格模式下,且不使用local和ignore關鍵字時,碰到缺少字段值會直接報錯終止,但在sql_mode設置為嚴格模式下,使用了local和ignore關鍵字時,則行為與不使用嚴格模式類似
LOAD DATA INFILE將所有輸入視為字符串,因此您不能認為load data語句會像INSERT語句那樣插入ENUM或SET列的數值。所有ENUM和SET值必須指定為字符串
LOAD DATA INFILE不支持的場景
PS:在Unix上,如果需要LOAD DATA從管道讀取數據,可以使用以下方法(該示例將/目錄的列表加載到表db1.t1中,find命令掛后臺持續查找內容并生成ls.dat文件,mysql 客戶端使用-e選項來執行load data這個文件到表):
1.2.7. IGNORE number {LINES | ROWS}子句
- mkfifo /mysql/data/db1/ls.dat
- chmod 666 /mysql/data/db1/ls.dat
- find / -ls> /mysql/data/db1/ls.dat&
- mysql -e "LOAD DATA INFILE 'ls.dat' INTO TABLE t1" db1
- admin@localhost : xiaoboluo 05:34:41> system cat /tmp/test3.txt
- id test test2 test3
- 2 a string 100.20 null
- 4 a string containing a , comma 102.20 NULL
- 6 a string containing a " quote 102.20 NULL
- 8 a string containing a ", quote and comma 102.20 NULL
- 10 \\t 102.20 NULL
- 14 \\t 102.20 NULL
- admin@localhost : xiaoboluo 05:41:35> truncate test4;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 05:41:41> load data infile "/tmp/test3.txt" into table test4 ignore 1 lines; #載入文本時指定ignore 1 lines子句忽略文本中的前1行數據
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 05:42:22> select * from test4; #查詢表test4中的數據,從下面的結果中可以看到數據正確
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | null |
- | 4 | a string containing a , comma | 102.20 | NULL |
- | 6 | a string containing a " quote | 102.20 | NULL |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL |
- | 10 | \t | 102.20 | NULL |
- | 14 | \t | 102.20 | NULL |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
1.2.8. (col_name_or_user_var,…)指定字段名稱的子句
- LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
- FIELDS TERMINATED BY ',' ENCLOSED BY '"'
- LINES TERMINATED BY '\r\n'
- IGNORE 1 LINES;
- # 如果輸入值不一定包含在引號內,請在ENCLOSED BY關鍵字之前使用OPTIONALLY,如:OPTIONALLY ENCLOSED BY '"',加上OPTIONALLY 可能會忽略數值類型的字段的引用符號,\
- 另外,如果你的csv文件第一行是數據而不是列名,那就不能使用IGNORE 1 LINES子句
如果只想加載某些列,請指定列列表,如:LOAD DATA INFILE'persondata.txt'INTO TABLE persondata(col1,col2,…); ,要注意:如果輸入文件中的字段值順序與表中列的順序不同,你需要對load data語句中的tb_name后跟的字段順序做一下調整以對應文本文件中的字段順序。否則,MySQL不能判斷如何與表中的順序對齊,列出列名時可以在tb_name后指定具體的列名,也可以使用表達式生成值指定給某個列名(使用set語句指定一個表達式,復制給一個變量,詳見1.2.9小節),如果沒有set語句,建議列名寫在tb_name表名后邊,方便理解,有set語句時就跟set語句寫在一起
如果發現文件中的列順序和表中的列順序不符,或者只想加載部分列,在命令中加上列的順序時指定的字段名也不一定非要放在緊跟著表名,可以放在語句最后面也可以,如:load data infile "/tmp/filename.txt" into table emp fields terminated by ',' enclosed by '"' ignore 2 lines (id,content,name);如果只需要導入一個字段id,則把 (id,content,name)換做(id)即可
使用示例參考1.1小節的“如果文本文件中的數據字段與表結構中的字段定義順序不同,則使用如下語句指定載入表中的字段順序”演示部分
- # 如果系統將id列的文本數據加上10以后再加載到表的test3列中,可以如下操作:
- admin@localhost : xiaoboluo 06:05:42> system rm -f /tmp/test3.txt;
- admin@localhost : xiaoboluo 06:06:00> select * from test3 into outfile "/tmp/test3.txt";
- Query OK, 6 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 06:06:04> system cat /tmp/test3.txt
- 2 a string 100.20 null
- 4 a string containing a , comma 102.20 \N
- 6 a string containing a " quote 102.20 \N
- 8 a string containing a ", quote and comma 102.20 \N
- 10 \\t 102.20 \N
- 14 \\t 102.20 \N
- admin@localhost : xiaoboluo 06:07:49> truncate test4;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 06:07:53> load data infile "/tmp/test3.txt" into table test4 (id,test,test2) set test3=id+10 ;
- ERROR 1262 (01000): Row 1 was truncated; it contained more data than there were input columns
- admin@localhost : xiaoboluo 06:08:02> select * from test4; #嚴格模式下因為文本中多了一個字段被截斷了,所以拒絕導入
- Empty set (0.00 sec)
- admin@localhost : xiaoboluo 06:08:08> load data local infile "/tmp/test3.txt" into table test4 (id,test,test2) set test3=id+10 ; #可以使用local關鍵字強制進行截斷最后一個字段的null值列進行導入,\
- 注意,如果不使用local關鍵字,那就需要修改sql_mode才能導入
- Query OK, 6 rows affected, 6 warnings (0.01 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 6
- Warning (Code 1262): Row 1 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 2 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 3 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 4 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 5 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 6 was truncated; it contained more data than there were input columns
- admin@localhost : xiaoboluo 06:10:45> select * from test4;
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | 12 |
- | 4 | a string containing a , comma | 102.20 | 14 |
- | 6 | a string containing a " quote | 102.20 | 16 |
- | 8 | a string containing a ", quote and comma | 102.20 | 18 |
- | 10 | \t | 102.20 | 20 |
- | 14 | \t | 102.20 | 24 |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
- # 或者使用txt文件中的某些列進行計算后生成新的列插入,這里演示兩個字段進行相加后導入另外一個字段中:
- admin@localhost : xiaoboluo 06:18:37> load data local infile "/tmp/test3.txt" into table test4 (id,test,test2) set test3=id+test2 ; # 注意,如果不使用local關鍵字,那就需要修改sql_mode才能導入
- Query OK, 6 rows affected, 6 warnings (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 6
- Warning (Code 1262): Row 1 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 2 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 3 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 4 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 5 was truncated; it contained more data than there were input columns
- Warning (Code 1262): Row 6 was truncated; it contained more data than there were input columns
- admin@localhost : xiaoboluo 06:19:07> select * from test4;
- +----+------------------------------------------+--------+-------+
- | id | test | test2 | test3 |
- +----+------------------------------------------+--------+-------+
- | 2 | a string | 100.20 | 102.2 |
- | 4 | a string containing a , comma | 102.20 | 106.2 |
- | 6 | a string containing a " quote | 102.20 | 108.2 |
- | 8 | a string containing a ", quote and comma | 102.20 | 110.2 |
- | 10 | \t | 102.20 | 112.2 |
- | 14 | \t | 102.20 | 116.2 |
- +----+------------------------------------------+--------+-------+
- 6 rows in set (0.00 sec)
- # 可以直接使用一個用戶變量并進行計算(計算表達式可以使用函數、運算符、子查詢等都允許),然后賦值給test4列直接導入,而不需要從文件中讀取test4列數據,該列數據也允許在文件中不存在
- admin@localhost : xiaoboluo 06:27:06> alter table test4 add column test4 varchar(20); #新建一個字段test4,用于導入set子句計算的值
- Query OK, 0 rows affected (0.01 sec)
- Records: 0 Duplicates: 0 Warnings: 0
- admin@localhost : xiaoboluo 06:27:56> truncate test4;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 06:28:02> set @test=200; #設置一個用戶變量
- Query OK, 0 rows affected (0.00 sec)
- admin@localhost : xiaoboluo 06:30:32> load data infile "/tmp/test3.txt" into table test4 (id,test,test2,test3) set test4=round(@test/100,0) ; #執行導入,使用set子句導入test4列通過表達式\
- round(@test/100,0)計算之后的值
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 06:30:52> select * from test4; #查看test4表中導入的數據,從以下結果中來看,導入數據正確
- +----+------------------------------------------+--------+-------+-------+
- | id | test | test2 | test3 | test4 |
- +----+------------------------------------------+--------+-------+-------+
- | 2 | a string | 100.20 | null | 2 |
- | 4 | a string containing a , comma | 102.20 | NULL | 2 |
- | 6 | a string containing a " quote | 102.20 | NULL | 2 |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL | 2 |
- | 10 | \t | 102.20 | NULL | 2 |
- | 14 | \t | 102.20 | NULL | 2 |
- +----+------------------------------------------+--------+-------+-------+
- 6 rows in set (0.00 sec)
- # SET子句可以將一個內部函數返回的值直接導入到一個指定列
- admin@localhost : xiaoboluo 06:31:22> truncate test4;
- Query OK, 0 rows affected (0.01 sec)
- admin@localhost : xiaoboluo 06:40:58> load data infile "/tmp/test3.txt" into table test4 (id,test,test2,test3) set test4=now() ;
- Query OK, 6 rows affected (0.00 sec)
- Records: 6 Deleted: 0 Skipped: 0 Warnings: 0
- admin@localhost : xiaoboluo 06:41:02> select * from test4;
- +----+------------------------------------------+--------+-------+---------------------+
- | id | test | test2 | test3 | test4 |
- +----+------------------------------------------+--------+-------+---------------------+
- | 2 | a string | 100.20 | null | 2017-05-03 18:41:02 |
- | 4 | a string containing a , comma | 102.20 | NULL | 2017-05-03 18:41:02 |
- | 6 | a string containing a " quote | 102.20 | NULL | 2017-05-03 18:41:02 |
- | 8 | a string containing a ", quote and comma | 102.20 | NULL | 2017-05-03 18:41:02 |
- | 10 | \t | 102.20 | NULL | 2017-05-03 18:41:02 |
- | 14 | \t | 102.20 | NULL | 2017-05-03 18:41:02 |
- +----+------------------------------------------+--------+-------+---------------------+
- 6 rows in set (0.00 sec)
使用指定列名或者變量列表時SET子句的使用受以下限制:
如果輸入行的字段太多(多過表中的字段數量),則會忽略額外的字段,并增加警告數。如果輸入行的字段太少,那么輸入字段缺少的表列被設置為其默認值,在解析文本文件時,空串字段數據與缺少字段值不同(空串會直接作為數據插入,而缺少字段時,會根據字段定義的默認值進行填充),如下:
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。