15 Linux Split and Join Command Examples to Manage Large Files
Source: http://www.thegeekstuff.com/2012/10/15-linux-split-and-join-command-examples-to-manage-large-files/

The Linux split and join commands are very helpful when you are manipulating large files. This article explains how to use both commands, with descriptive examples.

Join and split command syntax:

join [OPTION]... FILE1 FILE2
split [OPTION]... [FILE [PREFIX]]
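Before going through the individual options, it can help to see that splitting is lossless. The following is a minimal sketch, assuming GNU coreutils (`seq`, `split -d`) is available; the file names `sample.txt`, `part_` and `rebuilt.txt` are made up for illustration and do not come from the article.

```shell
#!/bin/sh
# Work in a scratch directory so that no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: 2500 numbered lines.
seq 1 2500 > sample.txt

# Split into pieces of at most 1000 lines each, using the prefix "part_"
# and 3-character numeric suffixes (part_000, part_001, ...).
split -l 1000 -d -a 3 sample.txt part_

# List the pieces that were created.
ls part_*

# The shell expands part_* in sorted order, so cat puts the pieces
# back together in their original sequence.
cat part_* > rebuilt.txt

# cmp is silent and returns 0 when the two files are identical.
cmp sample.txt rebuilt.txt && echo "round trip OK"
```

Fixed-width suffixes matter here: because every piece name has the same length, the sorted glob order matches the order in which split wrote the pieces.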
Linux Split Command Examples

1. Basic Split Example

Here is a basic example of the split command.

$ split split.zip

The file split.zip is split into smaller files named x**, where ** is the two-character suffix that is added by default. Also by default, each x** file contains 1000 lines.

$ wc -l *
 40947 split.zip
  1000 xaa
  1000 xab
  1000 xac
  1000 xad
  1000 xae
  1000 xaf
  1000 xag
  1000 xah
  1000 xai
...
...
...

The output above confirms that, by default, each x** file contains 1000 lines.

2. Change the Suffix Length using -a option

As discussed in example 1 above, the default suffix length is 2. This can be changed with the -a option. The following example uses a suffix of length 5 on the split files.

$ split -a5 split.zip
$ ls
split.zip xaaaac xaaaaf xaaaai xaaaal xaaaao xaaaar xaaaau xaaaax xaaaba xaaabd xaaabg xaaabj xaaabm
xaaaaa xaaaad xaaaag xaaaaj xaaaam xaaaap xaaaas xaaaav xaaaay xaaabb xaaabe xaaabh xaaabk xaaabn
xaaaab xaaaae xaaaah xaaaak xaaaan xaaaaq xaaaat xaaaaw xaaaaz xaaabc xaaabf xaaabi xaaabl xaaabo

3. Customize Split File Size using -b option

The size of each output split file can be controlled with the -b option. In this example, the split files are created with a size of 200000 bytes.

$ split -b200000 split.zip

4. Create Split Files with Numeric Suffix using -d option

As seen in the examples above, the output files are named x**, where ** are letters. You can change the suffix to numbers with the -d option. Here is an example that produces numeric suffixes on the split files.

$ split -d split.zip
$ ls
split.zip x01 x03 x05 x07 x09 x11 x13 x15 x17 x19 x21 x23 x25 x27 x29 x31 x33 x35 x37 x39
x00 x02 x04 x06 x08 x10 x12 x14 x16 x18 x20 x22 x24 x26 x28 x30 x32 x34 x36 x38 x40

5. Customize the Number of Split Chunks using -n option

To control the number of chunks, use the -n option. This example creates 50 chunks of split files.
$ split -n50 split.zip
$ ls
split.zip xac xaf xai xal xao xar xau xax xba xbd xbg xbj xbm xbp xbs xbv
xaa xad xag xaj xam xap xas xav xay xbb xbe xbh xbk xbn xbq xbt xbw
xab xae xah xak xan xaq xat xaw xaz xbc xbf xbi xbl xbo xbr xbu xbx

6. Avoid Zero Sized Chunks using -e option

When splitting a relatively small file into a large number of chunks, it is good to avoid zero-sized chunks, as they do not add any value. This can be done with the -e option. Here is an example:

$ split -n50 testfile

Because testfile is small, this command produces many zero-sized chunks. Now let's use the -e option and see the results:

$ split -n50 -e testfile
$ ls
split.zip testfile xaa xab xac xad xae xaf

No zero-sized chunks are produced this time.

7. Customize Number of Lines using -l option

The number of lines per output split file can be customized with the -l option. In the example below, the split files are created with 20000 lines each.

$ split -l20000 split.zip

Get Detailed Information using --verbose option

To get a diagnostic message each time a new split file is opened, use the --verbose option as shown below.

$ split -l20000 --verbose split.zip
creating file `xaa'
creating file `xab'
creating file `xac'

Linux Join Command Examples

8. Basic Join Example

By default, the join command pairs lines from its two input files whose first fields match. Here is an example:

$ cat testfile1
1 India
2 US
3 Ireland
4 UK
5 Canada

Here a file containing countries is joined with another file containing capitals on the basis of the first field.

9. Join works on Sorted List

If either of the two files supplied to the join command is not sorted, join prints a warning and the out-of-order entry is not joined. In this example, since the input file is not sorted, a warning/error message is displayed.

$ cat testfile1
1 India
2 US
3 Ireland
5 Canada
4 UK

10. Ignore Case using -i option

When comparing fields, differences in case can be ignored with the -i option, as shown below.
$ cat testfile1
a India
b US
c Ireland
d UK
e Canada

11. Verify that Input is Sorted using --check-order option

Here is an example. Since testfile1 is unsorted towards the end, an error is produced in the output.

$ cat testfile1
a India
b US
c Ireland
d UK
f Australia
e Canada

12. Do not Check the Sort Order using --nocheck-order option

This is the opposite of the previous example. The sort order is not checked, and no error message is displayed.

$ join --nocheck-order testfile1 testfile2
a India NewDelhi
b US Washington
c Ireland Dublin
d UK London

13. Print Unpairable Lines using -a option

If the two input files cannot be mapped one to one, the -a[FILENUM] option also prints the lines that could not be paired, where FILENUM is the file number (1 or 2). In the following example, -a1 prints the last line of testfile1, which has no pair in testfile2.

$ cat testfile1
a India
b US
c Ireland
d UK
e Canada
f Australia

14. Print Only Unpaired Lines using -v option

In the example above, both paired and unpaired lines appear in the output. If only the unpaired lines are desired, use the -v option as shown below.

$ join -v1 testfile1 testfile2
f Australia

15. Join Based on Different Columns from Both Files using -1 and -2 option

By default, the first column of each file is used for comparison before joining. You can change this behavior with the -1 and -2 options. In the following example, the first column of testfile1 is compared with the second column of testfile2 to produce the join output.

$ cat testfile1
a India
b US
c Ireland
d UK
e Canada
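Several of the join listings above show only testfile1, so the following is a minimal end-to-end sketch of the default join, -a, -v, and -1/-2. It assumes a POSIX shell with the standard join utility. The file names (`countries.txt`, `capitals.txt`, `capitals_key2.txt`) and their contents are hypothetical, chosen to mirror the country and capital data used in the examples; they are not the article's original files.

```shell
#!/bin/sh
# Use a fixed collation so the sort order join expects is predictable.
LC_ALL=C
export LC_ALL

# Work in a scratch directory so that no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs, both already sorted on the join field (column 1).
printf '%s\n' 'a India' 'b US' 'c Ireland' 'd UK' 'f Australia' > countries.txt
printf '%s\n' 'a NewDelhi' 'b Washington' 'c Dublin' 'd London' > capitals.txt

# The same capitals with the key in column 2 (sorted on that column).
printf '%s\n' 'NewDelhi a' 'Washington b' 'Dublin c' 'London d' > capitals_key2.txt

# Default: print only lines whose first field occurs in both files.
join countries.txt capitals.txt > paired.txt

# -a1: additionally print lines of file 1 that have no partner in file 2.
join -a1 countries.txt capitals.txt > with_unpaired.txt

# -v1: print only the lines of file 1 that have no partner in file 2.
join -v1 countries.txt capitals.txt > only_unpaired.txt

# -1 1 -2 2: match column 1 of file 1 against column 2 of file 2.
join -1 1 -2 2 countries.txt capitals_key2.txt > by_columns.txt

cat paired.txt
```

With these inputs, `paired.txt` holds the four matched lines (`a India NewDelhi` through `d UK London`), `only_unpaired.txt` holds `f Australia`, and `by_columns.txt` is identical to `paired.txt` because join prints the join field first, followed by the remaining fields of each file.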