We like to keep things simple here at Rapleaf. One small tweak we made right after we installed hadoop was to alias 'hadoop dfs' to 'hdfs'. It rolls off the fingers nicely. We are also constantly typing 'hdfs -ls this' or 'hdfs -du that'. If we are not sure what this/that is, we type 'hdfs -ls /this/what', then 'hdfs -ls /this/what/ever', followed by a copy and paste or two. Thanks to our recent HackLeaf day and Nathan’s great idea, we no longer have to go through all of that. Just type 'hdfs -ls [tab]' and it works just like bash command-line completion.
This was easy to implement once I found the programmable completion tool by Ian Macdonald. I just added the following section to bash_completion:
[cc lang="bash" tab_size="2"]
# hdfs(1) completion
#
have hadoop &&
_hdfs()
{
local cur prev
COMPREPLY=()
cur=${COMP_WORDS[COMP_CWORD]}
prev=${COMP_WORDS[COMP_CWORD-1]}
if [[ "$prev" == hdfs ]]; then
COMPREPLY=( $( compgen -W ‘-ls -lsr -du -dus -count -mv -cp -rm
-rmr -expunge -put -copyFromLocal -moveToLocal -mkdir -setrep
-touchz -test -stat -tail -chmod -chown -chgrp -help’ — $cur ) )
fi
if [[ "$prev" == -ls ]] || [[ "$prev" == -lsr ]] ||
[[ "$prev" == -du ]] || [[ "$prev" == -dus ]] ||
[[ "$prev" == -cat ]] || [[ "$prev" == -mkdir ]] ||
[[ "$prev" == -put ]] || [[ "$prev" == -rm ]] ||
[[ "$prev" == -rmr ]] || [[ "$prev" == -tail ]] ||
[[ "$prev" == -cp ]]; then
if [[ -z "$cur" ]]; then
COMPREPLY=( $( compgen -W “$( hdfs -ls / 2>-|grep -v ^Found|awk ‘{print $8}’ )” — “$cur” ) )
elif [[ `echo $cur | grep /$` ]]; then
COMPREPLY=( $( compgen -W “$( hdfs -ls $cur 2>-|grep -v ^Found|awk ‘{print $8}’ )” — “$cur” ) )
else
COMPREPLY=( $( compgen -W “$( hdfs -ls $cur* 2>-|grep -v ^Found|awk ‘{print $8}’ )” — “$cur” ) )
fi
fi
} &&
complete -F _hdfs hdfs
[/cc]
I’m sure there are some ways to make the code more elegant, but it is called HackLeaf, after all. This bit of code builds on top of other functions in the script, but the basic idea is pretty simple. cur contains the current word you are typing, so this would be a partial command or partial path. prev contains the previous word. If the previous word is hdfs, then we present the user with valid arguments to hdfs. If the previous word is -ls (or any other command where you want a path/file), then present the user with the possibilities for that path or partial path. HDFS defaults to the user’s home directory if no path is provided, so we override that by presenting the user with the possibilities under “/”. Finally, COMPREPLY returns the possibilities to the user on the command-line.
Be sure to check out some of the other features of bash_completion, particularly ssh and chkconfig.








15 Comments
I never considered how much time I waste typing complete hadoop commands until seeing this. Now, I can only image how much time I’ll save!
Grabbing this immediately! Thanks for sharing!
Thanks for the great tip! It is something I’ve always wished for, but never managed to put together.
One thing I noticed: it appears your blog software might have eaten an ampersand because the lines like the following:
COMPREPLY=( $( compgen -W "$( hdfs -ls / 2>-|grep -v ^Found|awk '{print $8}' )" -- "$cur" ) )
Seem to need 2>&- (2>ampersand-) to properly close stderr. Without the ampersand the shell seems gets confused (at least in my case on ubuntu 9.04 running bash 3.2.48(1)
I was shortcutting things as:
hals
hacat
harm (actually I was too chicken to make this one)
But your approach is way cooler.
@Drew: I don’t think the blog ate my ‘&’, but you’re right. The proper way to close STDERR is 2>&-. You could also do 2> /dev/null. We’re running CentOS and 2>- works, but I’m not sure why. Good catch!
FOr some reason, its not working for me. When I try to tab after the hdfs -ls command, it says permission denied.
How do I fix this ? I chmodded the file to 777 and changed the ownership to the file directory name.
Thanks..
Is the permission denied coming from bash or hadoop? Can you type ‘hdfs -ls’ without a problem?
Yes I can type hdfs -ls, it gives me the list of hadoop commands, however, when I try to enter a tab after hdfs -ls it shows ” – bash : permission denied”
I also forgot to mention that I am running it in Ubuntu hadoop 0.20
‘hdfs -ls’ should list out the contents of your home directory on hdfs, not give you a list of hadoop commands. That may actually be the problem. hdfs is also an alias, so we should eliminate that variable. Try typing ‘hadoop dfs -ls’. This should give you a listing of your home directory. If it doesn’t, I would guess that something is wrong with the hadoop configuration.
Sorry, I understand that hdfs is an alias for hadoop fs.
I think I unwittingly wrote that it spits out the list of commands. It does not. when I type hdfs -l and then tab, it shows -ls and -lsr.
Note that its after I type hdfs -ls and then press a tab, it should be accessing the contents inside the hadoop directory.
The script wrote above does not actually work at all, I found a script taken from here on this website:
http://code.google.com/p/hadoopenv/source/browse/hdenv.sh
## Bash Autocompletion for HDFS
# hdfs(1) completion
# taken from: http://blog.rapleaf.com/dev/2009/11/17/command-line-auto-completion-for-hadoop-dfs-commands/
have()
{
unset -v have
PATH=$PATH:/sbin:/usr/sbin:/usr/local/sbin type $1 &>/dev/null &&
have=”yes”
}
have hadoop &&
_hdfs()
{
local cur prev
COMPREPLY=()
cur=${COMP_WORDS[COMP_CWORD]}
prev=${COMP_WORDS[COMP_CWORD-1]}
if [[ "$prev" == hdfs ]]; then
COMPREPLY=( $( compgen -W ‘-ls -lsr -du -dus -count -mv -cp -rm \
-rmr -expunge -put -copyFromLocal -moveToLocal -mkdir -setrep \
-touchz -test -stat -tail -chmod -chown -chgrp -help’ — $cur ) )
fi
if [[ "$prev" == -ls ]] || [[ "$prev" == -lsr ]] || \
[[ "$prev" == -du ]] || [[ "$prev" == -dus ]] || \
[[ "$prev" == -cat ]] || [[ "$prev" == -mkdir ]] || \
[[ "$prev" == -put ]] || [[ "$prev" == -rm ]] || \
[[ "$prev" == -rmr ]] || [[ "$prev" == -tail ]] || \
[[ "$prev" == -cp ]]; then
if [[ -z "$cur" ]]; then
COMPREPLY=( $( compgen -W “$( hdfs -ls / 2>-|grep -v ^Found|awk ‘{print $8}’ )” — “$cur” ) )
elif [[ `echo $cur | grep \/$` ]]; then
COMPREPLY=( $( compgen -W “$( hdfs -ls $cur 2>-|grep -v ^Found|awk ‘{print $8}’ )” — “$cur” ) )
else
COMPREPLY=( $( compgen -W “$( hdfs -ls $cur* 2>-|grep -v ^Found|awk ‘{print $8}’ )” — “$cur” ) )
fi
fi
} &&
complete -F _hdfs hdfs
unset have
Glad you found a script that works for you. Not sure what’s going on, but it may be that the code above got messed up by a WordPress upgrade. For example, it looks like it’s missing the original backslashes.
Thanks for the posting the code URL from google. People should definitely use that link going forward. I found it ironic that the code that worked for you actually did originally come from this blog post (note the “taken from” line in the code you posted).
Happy auto-completing @Amol.
Excellent article… I got inspired and did more changes on it…
checkout here – http://cloudblog.8kmiles.com/2012/01/09/hadoop-autocomplete-for-hadoop-dfs-commands/
Wow, looks great @dhamu. Definitely an upgrade in functionality.
Thanks ckline. I have published another bash completion script for hadoop commands at http://cloudblog.8kmiles.com/2012/01/14/hadoop-autocomplete-for-hadoop-commands/
Finally managed to create the deb package for those scripts and uploaded those to my launchpad ppa.
– https://launchpad.net/~kernel164/+archive/ppa
Here is how to setup bash completion for hadoop commands using my ppa – http://cloudblog.8kmiles.com/2012/01/15/hadoop-setup-bash-completion-for-hadoop-commands-using-ppa/
One Trackback
[...] What I needed was an bash auto completion for those commands. A simple google search led me to rapleaf, but I wanted more… so here it is.. I made hadoop dfs autocompletion scripts.Plans: 1) To [...]