We like to keep things simple here at Rapleaf. One small tweak we made right after we installed hadoop was to alias 'hadoop dfs' to 'hdfs'. It rolls off the fingers nicely. We are also constantly typing 'hdfs -ls this' or 'hdfs -du that'. If we are not sure what this/that is, we type 'hdfs -ls /this/what', then 'hdfs -ls /this/what/ever', followed by a copy and paste or two. Thanks to our recent HackLeaf day and Nathan’s great idea, we no longer have to go through all of that. Just type 'hdfs -ls [tab]' and it works just like bash command-line completion.
This was easy to implement once I found the programmable completion tool by Ian Macdonald. I just added the following section to bash_completion:
# Programmable completion for the 'hdfs' alias (hadoop dfs)
have hadoop &&
_hdfs()
{
    local cur prev
    COMPREPLY=()
    cur=${COMP_WORDS[COMP_CWORD]}
    prev=${COMP_WORDS[COMP_CWORD-1]}

    # Directly after 'hdfs': offer the subcommand flags
    if [[ "$prev" == hdfs ]]; then
        COMPREPLY=( $( compgen -W '-ls -lsr -du -dus -count -mv -cp -rm
            -rmr -expunge -put -copyFromLocal -moveToLocal -mkdir -setrep
            -touchz -test -stat -tail -cat -chmod -chown -chgrp -help' -- "$cur" ) )
    fi

    # After a flag that takes a path: offer HDFS paths
    if [[ "$prev" == -ls ]] || [[ "$prev" == -lsr ]] || \
       [[ "$prev" == -du ]] || [[ "$prev" == -dus ]] || \
       [[ "$prev" == -cat ]] || [[ "$prev" == -mkdir ]] || \
       [[ "$prev" == -put ]] || [[ "$prev" == -rm ]] || \
       [[ "$prev" == -rmr ]] || [[ "$prev" == -tail ]] || \
       [[ "$prev" == -cp ]]; then
        if [[ -z "$cur" ]]; then
            # Nothing typed yet: list the root instead of the home directory
            COMPREPLY=( $( compgen -W "$( hdfs -ls / 2>/dev/null | grep -v ^Found | awk '{print $8}' )" -- "$cur" ) )
        elif [[ "$cur" == */ ]]; then
            # Complete directory name: list its contents
            COMPREPLY=( $( compgen -W "$( hdfs -ls "$cur" 2>/dev/null | grep -v ^Found | awk '{print $8}' )" -- "$cur" ) )
        else
            # Partial name: let HDFS expand the glob
            COMPREPLY=( $( compgen -W "$( hdfs -ls "$cur*" 2>/dev/null | grep -v ^Found | awk '{print $8}' )" -- "$cur" ) )
        fi
    fi
} &&
complete -F _hdfs hdfs
I’m sure there are some ways to make the code more elegant, but it is called HackLeaf, after all. This bit of code builds on top of other functions in the script, but the basic idea is pretty simple. cur contains the current word you are typing, so this would be a partial command or partial path. prev contains the previous word. If the previous word is hdfs, then we present the user with valid arguments to hdfs. If the previous word is -ls (or any other command where you want a path/file), then present the user with the possibilities for that path or partial path. HDFS defaults to the user’s home directory if no path is provided, so we override that by presenting the user with the possibilities under “/”. Finally, COMPREPLY returns the possibilities to the user on the command-line.
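To see how cur, prev, and COMPREPLY fit together without a Hadoop cluster, here is a minimal sketch. The command name mytool and its three flags are made up for illustration; the script sets COMP_WORDS and COMP_CWORD by hand to simulate what bash provides when you press Tab.

```shell
#!/usr/bin/env bash
# Hypothetical toy command "mytool"; it is not part of Hadoop or bash_completion.
_mytool()
{
    local cur prev
    COMPREPLY=()
    cur=${COMP_WORDS[COMP_CWORD]}       # word being completed
    prev=${COMP_WORDS[COMP_CWORD-1]}    # word before it
    if [[ "$prev" == mytool ]]; then
        # Keep only the candidate words that start with $cur
        COMPREPLY=( $( compgen -W '-ls -du -cat' -- "$cur" ) )
    fi
}
complete -F _mytool mytool

# Simulate the user typing "mytool -l" and pressing Tab.
COMP_WORDS=( mytool -l )
COMP_CWORD=1
_mytool
printf '%s\n' "${COMPREPLY[@]}"    # prints: -ls
```

In an interactive shell you would only need the function and the complete line; bash fills in COMP_WORDS and COMP_CWORD for you.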
Be sure to check out some of the other features of bash_completion, particularly ssh and chkconfig.

4 Comments
I never considered how much time I waste typing complete hadoop commands until seeing this. Now, I can only imagine how much time I’ll save!
Grabbing this immediately! Thanks for sharing!
Thanks for the great tip! It is something I’ve always wished for, but never managed to put together.
One thing I noticed: it appears your blog software might have eaten an ampersand, because lines like the following:
Seem to need 2>&- (2>ampersand-) to properly close stderr. Without the ampersand the shell seems to get confused (at least in my case on Ubuntu 9.04 running bash 3.2.48(1)).
I was shortcutting things as:
hals
hacat
harm (actually I was too chicken to make this one)
But your approach is way cooler.
@Drew: I don’t think the blog ate my ‘&’, but you’re right. The proper way to close STDERR is 2>&-. You could also do 2> /dev/null. We’re running CentOS and 2>- works, but I’m not sure why. Good catch!
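For anyone comparing the three forms, here is a small sketch that can be run in a scratch directory. In bash, 2>- is an ordinary redirect to a file named "-" in the current directory, which is probably why it appears to work: the errors vanish from the terminal but land in that file. The missing path below is a made-up example.

```shell
#!/usr/bin/env bash
# Run in a scratch directory so the stray file is easy to spot and clean up.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical path that should not exist, so ls writes an error to stderr.
missing=/no/such/path/for/this/demo

ls "$missing" 2>-           # stderr goes to a file named "-"
ls "$missing" 2>&-          # stderr is closed
ls "$missing" 2>/dev/null   # stderr is discarded

# Report whether the first form left a non-empty file behind.
if [[ -s ./- ]]; then
    stray_file=yes
else
    stray_file=no
fi
echo "stray file created: $stray_file"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

Of the two correct forms, 2>/dev/null is the safer default, since some programs misbehave when a write to a closed stderr fails.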