I have run into this problem several times recently, and decided to finally write down the solution for myself rather than keep searching the internet for it.
This is the problem: if you want to sort a file that is tab-delimited (and some of the fields contain spaces), then you must explicitly tell sort to use a tab as the field separator; otherwise it treats runs of blanks (spaces as well as tabs) as field boundaries. Commands such as cut and paste take their delimiter through the -d flag, and in cut, -f specifies the field number.
The sort command uses the -t flag instead. So one would think that this would work:
sort -k 2 -t '\t' file
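where -k specifies the field to start sorting from and -t specifies the field separator. (Note that -k 2 sorts on everything from field 2 to the end of the line; -k 2,2 restricts the key to field 2 alone.)
Unfortunately this does not work. Inside plain single quotes, bash passes the two literal characters \ and t to sort, and sort rejects the argument because the separator must be a single character (GNU sort complains about a multi-character tab). The solution is to place a $ before it, like so: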
sort -k 2 -t $'\t' file
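To check the fix, here is a minimal sketch that builds a small tab-delimited file and sorts it on the second field. The sample rows and the variable names (tmpfile, sorted) are made up for illustration, and it assumes bash, since $'\t' is not understood by every shell.

```shell
#!/usr/bin/env bash
# Build a small tab-delimited sample; both fields contain spaces on purpose.
tmpfile=$(mktemp)
printf 'Zed Young\tbanana split\n' > "$tmpfile"
printf 'Al Brown\tcherry pie\n' >> "$tmpfile"
printf 'Al Zimmer\tapple tart\n' >> "$tmpfile"

# Use a real tab as the separator and sort on field 2 only (-k 2,2).
sorted=$(sort -t $'\t' -k 2,2 "$tmpfile")
printf '%s\n' "$sorted"

rm -f "$tmpfile"
```

With the tab as separator, the rows should come out ordered by the second column (apple tart, banana split, cherry pie), regardless of the spaces inside the names.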
The dollar sign tells bash to use ANSI-C quoting:
3.3.5. ANSI-C quoting
Words in the form "$'STRING'" are treated in a special way. The word expands to a string, with backslash-escaped characters replaced as specified by the ANSI-C standard. Backslash escape sequences can be found in the Bash documentation.
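A quick way to see the difference between the two quoting styles is to compare the length of each string. This is only a sketch for bash; the variable names plain and ansi are mine.

```shell
#!/usr/bin/env bash
plain='\t'    # ordinary single quotes: a backslash followed by the letter t
ansi=$'\t'    # ANSI-C quoting: bash replaces \t with one real tab character

echo "plain length: ${#plain}"   # prints "plain length: 2"
echo "ansi length: ${#ansi}"     # prints "ansi length: 1"
```

The first string is what sort received in the failing command, which is why it was rejected; the second is the single tab character that sort needs.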
So now I have the answer for myself the next time the problem arises. I hope someone else benefits as well.