2003/10/06 join

© October 2003 Tony Lawrence

Unix "join" is a relational database operator for text files. Suppose you have two files that have related text in them: for example, /etc/passwd and /etc/group. We sort each on its third field, which is the GID in /etc/group and the UID in /etc/passwd (sort -n -t : +2 /etc/passwd > j1; sort -n -t : +2 /etc/group > j2), and end up with files that look something like this (j1 is shown):

nobody:*:-2:-2:Unprivileged User:/dev/null:/dev/null
root:*:0:0:System Administrator:/var/root:/bin/tcsh
daemon:*:1:1:System Services:/var/root:/dev/null
smmsp:*:25:25:Sendmail User:/private/etc/mail:/dev/null
www:*:70:70:World Wide Web Server:/Library/WebServer:/dev/null
xxw:*:70:70:Inside Web Server:/Library/IWebServer:/dev/null
mysql:*:74:74:MySQL Server:/dev/null:/dev/null
sshd:*:75:75:sshd Privilege separation:/var/empty:/dev/null
unknown:*:99:99:Unknown User:/dev/null:/dev/null
apl:*:501:20:Anthony Lawrence:/Users/apl:/bin/bash

(Files shortened for clarity.) Now we use join:

bash-2.05a$ join -t : -j 3 j1 j2
-2:nobody:*:-2:Unprivileged User:/dev/null:/dev/null:nobody:*: 
0:root:*:0:System Administrator:/var/root:/bin/tcsh:wheel:*:root  
1:daemon:*:1:System Services:/var/root:/dev/null:daemon:*:root
25:smmsp:*:25:Sendmail User:/private/etc/mail:/dev/null:smmsp:*:
70:www:*:70:World Wide Web Server:/Library/WebServer:/dev/null:www:*:
70:xxw:*:70:Inside Web Server:/Library/IWebServer:/dev/null:www:*:
74:mysql:*:74:MySQL Server:/dev/null:/dev/null:mysql:*:
75:sshd:*:75:sshd Privilege separation:/var/empty:/dev/null:sshd:*: 
99:unknown:*:99:Unknown User:/dev/null:/dev/null:unknown:*:  

This gives us the matching lines from each file joined together on one line, with the join field moved to the front. Lines that have no match in the other file are left out. Rather purposeless here, of course.
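
For a slightly more purposeful use, here is a minimal sketch that lists each user's primary group name by joining the GID field of a passwd-style file (field 4) against the GID field of a group-style file (field 3). The sample data, the temporary file names, and the "staff" group are made up for illustration. Note that join expects its inputs sorted in collating order on the join field, so this sketch sorts without -n and pins the locale to C for both commands.

```shell
#!/bin/sh
# Sketch: list each user's primary group name by joining on the GID.
# The sample data below is hypothetical, not a real passwd or group file.
LC_ALL=C; export LC_ALL        # sort and join must agree on collation

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/users" <<'EOF'
root:*:0:0:System Administrator:/var/root:/bin/tcsh
www:*:70:70:World Wide Web Server:/Library/WebServer:/dev/null
apl:*:501:20:Anthony Lawrence:/Users/apl:/bin/bash
EOF

cat > "$tmp/groups" <<'EOF'
wheel:*:0:root
staff:*:20:root
www:*:70:
EOF

# join expects both inputs in collating (not numeric) order on the join field
sort -t : -k 4,4 "$tmp/users"  > "$tmp/u.sorted"
sort -t : -k 3,3 "$tmp/groups" > "$tmp/g.sorted"

# -1 4: the join field of the first file is field 4 (GID in passwd)
# -2 3: the join field of the second file is field 3 (GID in group)
# -o 1.1,2.1: print only the login name and the group name
result=$(join -t : -1 4 -2 3 -o 1.1,2.1 "$tmp/u.sorted" "$tmp/g.sorted")
printf '%s\n' "$result"
```

With this sample data the script prints root:wheel, apl:staff and www:www, one pair per line. Dropping the -o option would give the full joined lines, in the same layout as the output shown earlier.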

MS-DOS once had a "join" command. Its purpose was to operate much like Unix mount: it attached one drive to a directory on another rather than leaving the two as separate drive letters. That was of course exactly what they should have done originally; the curse of drive letters still makes Windows clumsier than it needs to be.

