<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Engineering Rapleaf &#187; Hadoop</title>
	<atom:link href="http://blog.rapleaf.com/dev/tag/hadoop/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.rapleaf.com/dev</link>
	<description>For engineers, by engineers.</description>
	<lastBuildDate>Mon, 12 Dec 2011 08:57:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Command-line auto completion for Hadoop DFS commands</title>
		<link>http://blog.rapleaf.com/dev/2009/11/17/command-line-auto-completion-for-hadoop-dfs-commands/</link>
		<comments>http://blog.rapleaf.com/dev/2009/11/17/command-line-auto-completion-for-hadoop-dfs-commands/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 01:31:13 +0000</pubDate>
		<dc:creator>ckline</dc:creator>
				<category><![CDATA[bash]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HDFS]]></category>
		<category><![CDATA[auto-complete]]></category>
		<category><![CDATA[auto-completion]]></category>
		<category><![CDATA[bash_completion]]></category>
		<category><![CDATA[command-line]]></category>

		<guid isPermaLink="false">http://blog.rapleaf.com/dev/?p=304</guid>
		<description><![CDATA[We like to keep things simple here at Rapleaf. One small tweak we made right after we installed hadoop was to alias 'hadoop dfs' to 'hdfs'. It rolls off the fingers nicely. We are also constantly typing 'hdfs -ls this' or 'hdfs -du that'. If we are not sure what this/that is, we type 'hdfs [...]]]></description>
			<content:encoded><![CDATA[<p>We like to keep things simple here at Rapleaf.  One small tweak we made right after we installed hadoop was to alias <tt>'hadoop dfs'</tt> to <tt>'hdfs'</tt>.  It rolls off the fingers nicely.  We are also constantly typing <tt>'hdfs -ls this'</tt> or <tt>'hdfs -du that'</tt>.  If we are not sure what <tt>this/that</tt> is, we type <tt>'hdfs -ls /this/what'</tt>, then <tt>'hdfs -ls /this/what/ever'</tt>, followed by a copy and paste or two.  Thanks to our recent HackLeaf day and Nathan&#8217;s great idea, we no longer have to go through all of that.  Just type <tt>'hdfs -ls [tab]'</tt> and it works just like bash command-line completion.</p>
<p>This was easy to implement once I found the <a href="http://www.caliban.org/bash/#completion">programmable completion</a> tool by Ian Macdonald.  I just added the following section to bash_completion:</p>
<p>[cc lang="bash" tab_size="2"]<br />
# hdfs(1) completion<br />
#<br />
have hadoop &amp;&amp;<br />
_hdfs()<br />
{<br />
  local cur prev</p>
<p>  COMPREPLY=()<br />
  cur=${COMP_WORDS[COMP_CWORD]}<br />
  prev=${COMP_WORDS[COMP_CWORD-1]}</p>
<p>  if [[ "$prev" == hdfs ]]; then<br />
    COMPREPLY=( $( compgen -W '-ls -lsr -du -dus -count -mv -cp -rm<br />
      -rmr -expunge -put -copyFromLocal -moveToLocal -mkdir -setrep<br />
      -touchz -test -stat -tail -chmod -chown -chgrp -help' -- $cur ) )<br />
  fi</p>
<p>  if [[ "$prev" == -ls ]] || [[ "$prev" == -lsr ]] ||<br />
     [[ "$prev" == -du ]] || [[ "$prev" == -dus ]] ||<br />
     [[ "$prev" == -cat ]] || [[ "$prev" == -mkdir ]] ||<br />
     [[ "$prev" == -put ]] || [[ "$prev" == -rm ]] ||<br />
     [[ "$prev" == -rmr ]] || [[ "$prev" == -tail ]] ||<br />
     [[ "$prev" == -cp ]]; then<br />
    if [[ -z "$cur" ]]; then<br />
      COMPREPLY=( $( compgen -W "$( hdfs -ls / 2&gt;/dev/null|grep -v ^Found|awk '{print $8}' )" -- "$cur" ) )<br />
    elif [[ "$cur" == */ ]]; then<br />
      COMPREPLY=( $( compgen -W "$( hdfs -ls $cur 2&gt;/dev/null|grep -v ^Found|awk '{print $8}' )" -- "$cur" ) )<br />
    else<br />
      COMPREPLY=( $( compgen -W "$( hdfs -ls $cur* 2&gt;/dev/null|grep -v ^Found|awk '{print $8}' )" -- "$cur" ) )<br />
    fi<br />
  fi<br />
} &amp;&amp;<br />
complete -F _hdfs hdfs<br />
[/cc]</p>
<p>I&#8217;m sure there are some ways to make the code more elegant, but it is called HackLeaf, after all.  This bit of code builds on top of other functions in the script, but the basic idea is pretty simple.  <tt>cur</tt> contains the current word you are typing, so this would be a partial command or partial path.  <tt>prev</tt> contains the previous word.  If the previous word is <tt>hdfs</tt>, then we present the user with valid arguments to <tt>hdfs</tt>.  If the previous word is <tt>-ls</tt> (or any other command where you want a path/file), then present the user with the possibilities for that path or partial path.  HDFS defaults to the user&#8217;s home directory if no path is provided, so we override that by presenting the user with the possibilities under &#8220;/&#8221;.  Finally, <tt>COMPREPLY</tt> returns the possibilities to the user on the command-line.</p>
<p>Be sure to check out some of the other features of bash_completion, particularly <tt>ssh</tt> and <tt>chkconfig</tt>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rapleaf.com/dev/2009/11/17/command-line-auto-completion-for-hadoop-dfs-commands/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Dead Simple MapReduce Workflow Configuration</title>
		<link>http://blog.rapleaf.com/dev/2009/11/09/dead-simple-mapreduce-workflow-configuration/</link>
		<comments>http://blog.rapleaf.com/dev/2009/11/09/dead-simple-mapreduce-workflow-configuration/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 01:03:06 +0000</pubDate>
		<dc:creator>nathan marz</dc:creator>
				<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://blog.rapleaf.com/dev/?p=277</guid>
		<description><![CDATA[If you use MapReduce for any real-world application, chances are your workflow consists of more than one MapReduce job. Rapleaf has workflows consisting of over one hundred jobs. A lot of times, you need to make configurations to the workflow that should apply to every job. For example, you may want each job to run [...]]]></description>
			<content:encoded><![CDATA[<p>If you use MapReduce for any real-world application, chances are your workflow consists of more than one MapReduce job. Rapleaf has workflows consisting of over one hundred jobs. Often you need to apply settings that should hold for every job in the workflow. For example, you may want each job to run in the same <a href="http://hadoop.apache.org/common/docs/current/fair_scheduler.html">fair scheduler</a> pool or use a certain number of reducers.</p>
<p>One way to do this would be to configure each job at the code level. Unfortunately, this can be tedious and error-prone, since if you add a new job to the workflow you need to remember to add the proper configurations. Fortunately, there&#8217;s a better way.</p>
<p>Hadoop has a static method called &#8220;Configuration.addDefaultResource&#8221; that allows you to specify a file to be loaded into the configuration by default. Like <a href="http://www.w3.org/Style/CSS/">Cascading Style Sheets</a>, Hadoop will load the configurations one file at a time, with configurations from later files overriding those from earlier ones. A static initializer in the JobConf class causes Hadoop to load in &#8220;mapred-default.xml&#8221; and &#8220;mapred-site.xml&#8221;.</p>
<p>To create an application level configuration, you will want to perform the following steps:</p>
<p>1. Create an &#8220;application-site.xml&#8221; file and put it in the same directory from which you run your job jar.<br />
2. In the main method of your code, add the lines:</p>
<p>[cc lang="java"]<br />
new JobConf(); //ensure mapred-default and mapred-site get loaded in first<br />
Configuration.addDefaultResource("application-site.xml");<br />
[/cc]</p>
<p>3. Ensure that &#8220;.&#8221; is in your classpath when you run the job jar so that Hadoop can find the &#8220;application-site.xml&#8221; resource.</p>
<p>A side benefit of this approach is that if you need to tweak some settings in the middle of a workflow, you just need to edit the application-site.xml file and each subsequent job will pick up the new settings.</p>
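<p>The override order described above can be mimicked in plain Java. The following is only a sketch of the "later resource wins" rule: it uses <tt>java.util.Properties</tt> rather than Hadoop's <tt>Configuration</tt> class, the class name <tt>LayeredConfigSketch</tt> is made up for illustration, and the values assigned to <tt>mapred.reduce.tasks</tt> are hypothetical.</p>

```java
import java.util.Properties;

// Illustration only: mimics the "later resource wins" rule with
// java.util.Properties. It does not call the Hadoop Configuration API.
public class LayeredConfigSketch {

    // Merge resources in load order; a key set by a later resource
    // replaces the value set by an earlier one.
    public static Properties layer(Properties... resourcesInLoadOrder) {
        Properties merged = new Properties();
        for (Properties resource : resourcesInLoadOrder) {
            merged.putAll(resource);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for the three files.
        Properties mapredDefault = new Properties();
        mapredDefault.setProperty("mapred.reduce.tasks", "1");

        Properties mapredSite = new Properties();
        mapredSite.setProperty("mapred.reduce.tasks", "10");

        Properties applicationSite = new Properties();
        applicationSite.setProperty("mapred.reduce.tasks", "40");

        Properties effective = layer(mapredDefault, mapredSite, applicationSite);
        System.out.println(effective.getProperty("mapred.reduce.tasks"));
    }
}
```

<p>Because the application-level resource is merged last, its value is the one every job in the workflow would see, which is the behaviour the three steps above rely on.</p>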
]]></content:encoded>
			<wfw:commentRss>http://blog.rapleaf.com/dev/2009/11/09/dead-simple-mapreduce-workflow-configuration/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting the serial terminal to work over IPMI on a Dell R410</title>
		<link>http://blog.rapleaf.com/dev/2009/10/19/getting-the-serial-terminal-to-work-over-ipmi-on-a-dell-r410/</link>
		<comments>http://blog.rapleaf.com/dev/2009/10/19/getting-the-serial-terminal-to-work-over-ipmi-on-a-dell-r410/#comments</comments>
		<pubDate>Mon, 19 Oct 2009 16:00:59 +0000</pubDate>
		<dc:creator>cesar</dc:creator>
				<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[IPMI]]></category>

		<guid isPermaLink="false">http://blog.rapleaf.com/dev/?p=257</guid>
		<description><![CDATA[As avid readers of the blog know, we use Hadoop a lot and talk about it quite a bit. We are in the process of expanding our Hadoop cluster and decided to go with the new Dell R410 1U machines.  From talks with other Hadoop users the sweet-spot is one spindle (drive) for every 2 [...]]]></description>
			<content:encoded><![CDATA[<p>As avid readers of the blog know, we use <a href="http://hadoop.apache.org/">Hadoop</a> a lot and <a href="http://blog.rapleaf.com/dev/?cat=13">talk about it quite a bit</a>.  We are in the process of expanding our Hadoop cluster and decided to go with the new <a href="http://www.dell.com/us/en/highered/Servers/server-poweredge-r410/pd.aspx?refid=server-poweredge-r410&amp;cs=RC956904&amp;s=hied">Dell R410 1U</a> machines.  From talks with other Hadoop users the sweet-spot is one spindle (drive) for every 2 cores on a machine and this machine is the first 1U server from Dell that has this spindle/core ratio.</p>
<p>The one problem we encountered when we received the first test machine was getting the serial console to work through the machine&#8217;s IPMI interface.  Rapleaf has some other Dell servers and never had any problems, but this machine is new and shiny and just didn&#8217;t work with our default configuration.</p>
<p>The first change we needed to make was to the way the BIOS redirects the serial console.  To enter the BIOS, press &#8220;F2&#8221; after the machine has finished checking its memory.  Once in the BIOS, go down to &#8220;Serial Communication&#8221;; you should see the following screen.</p>
<p><img class="alignnone size-full wp-image-268" src="http://blog.rapleaf.com/dev/wp-content/uploads/2009/10/r410_serial_comm_config1.jpg" alt="r410_serial_comm_config" width="559" height="375" /></p>
<p>The options should be as follows:</p>
<ul>
<li>Serial Communication: On with Console Redirection via COM2</li>
<li>Serial Port Address: Serial Device1=COM1,Serial Device2=COM2</li>
<li>External Serial Connector: Serial Device2</li>
<li>Failsafe Baud Rate: 115200</li>
<li>Remote Terminal Type: VT100/VT220</li>
<li>Redirection After Boot: Enabled</li>
</ul>
<p>When this is done, exit the BIOS and enable IPMI on the machine: press CTRL-E when the prompt to modify the IPMI configuration appears, then give it an IP address, either static or via DHCP.</p>
<p>Now a couple of modifications need to be made to Linux for this to work.</p>
<p>1) First, modify your /boot/grub/<em>grub.conf</em> file and add &#8220;console=ttyS1,115200&#8221; at the end of your kernel parameters.  Ours looks like this:</p>
<blockquote><p>title Rapleaf Linux (2.6.22)<br />
root (hd0,0)<br />
kernel /vmlinuz-2.6.22 ro root=LABEL=/ console=ttyS1,115200<br />
initrd /initrd-2.6.22.img</p></blockquote>
<p>2) The last two lines in <em>/etc/inittab</em> should be:</p>
<blockquote><p># Run agetty COM2/ttyS1<br />
s1:2345:respawn:/sbin/agetty -L -f /etc/issueserial ttyS1 115200 vt100-nav</p></blockquote>
<p>3) The last line of <em>/etc/securetty</em> should be</p>
<blockquote><p>ttyS1</p></blockquote>
<p>Once all this is done the machine can be rebooted and you should be able to interact with the boot process through IPMI.  Good luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rapleaf.com/dev/2009/10/19/getting-the-serial-terminal-to-work-over-ipmi-on-a-dell-r410/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Glance at the Hadoop Failure Model</title>
		<link>http://blog.rapleaf.com/dev/2009/07/17/a-glance-at-the-hadoop-failure-model/</link>
		<comments>http://blog.rapleaf.com/dev/2009/07/17/a-glance-at-the-hadoop-failure-model/#comments</comments>
		<pubDate>Fri, 17 Jul 2009 20:43:04 +0000</pubDate>
		<dc:creator>nathan marz</dc:creator>
				<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://blog.rapleaf.com/dev/?p=114</guid>
		<description><![CDATA[Hadoop is designed to be a fault tolerant system. Jobs should be resilient to nodes going down and other random failures. Hadoop isn&#8217;t perfect however, as I still see jobs failing due to random causes every now and again. I decided to investigate the significance of the different factors that play into a job failing. [...]]]></description>
			<content:encoded><![CDATA[<p>Hadoop is designed to be a fault tolerant system. Jobs should be resilient to nodes going down and other random failures. Hadoop isn&#8217;t perfect however, as I still see jobs failing due to random causes every now and again. I decided to investigate the significance of the different factors that play into a job failing.</p>
<p>A Hadoop job fails if the same task fails some predetermined number of times (by default, four). This is set through the properties &#8220;mapred.map.max.attempts&#8221; and &#8220;mapred.reduce.max.attempts&#8221;. For a job to fail <em>randomly</em>, an individual task needs to fail randomly that many times. A task can fail randomly for a variety of reasons &#8211; a few of the ones we&#8217;ve seen are disks getting full, a variety of bugs in Hadoop, and hardware failures. </p>
<p>The formula for the probability of a job failing randomly can be derived as follows:</p>
<p><span id="more-114"></span></p>
<p><code><br />
Pr[individual task failing maximum #times] = Pr[task failing] ^ (max task failures)<br />
Pr[task succeeding] = 1 - Pr[individual task failing maximum #times]<br />
Pr[job succeeding] = Pr[task succeeding] ^ (num tasks)<br />
Pr[job failing] = 1 - Pr[job succeeding]</p>
<p>Pr[job failing] = 1 - (1-Pr[task failing] ^ (max task failures))^(num tasks)<br />
</code></p>
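<p>To get a feel for the numbers, the formula above can be evaluated with a few lines of Java. This is a minimal sketch and not part of Hadoop: the class and method names are made up for illustration, and it assumes, as the derivation does, that task attempts fail independently and with the same probability.</p>

```java
// Illustration only: evaluates
//   Pr[job failing] = 1 - (1 - p^maxAttempts)^numTasks
// under the assumption that attempts fail independently.
public class JobFailureModel {

    // Probability that a single task uses up all of its attempts.
    public static double taskExhaustsAttempts(double pTaskFail, int maxAttempts) {
        return Math.pow(pTaskFail, maxAttempts);
    }

    // Probability that at least one of numTasks tasks uses up all attempts.
    // log1p/expm1 keep precision when the per-task probability is tiny.
    public static double jobFailure(double pTaskFail, int maxAttempts, long numTasks) {
        double pExhaust = taskExhaustsAttempts(pTaskFail, maxAttempts);
        return -Math.expm1(numTasks * Math.log1p(-pExhaust));
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 1% task failure rate, 4 attempts, 100,000 tasks.
        System.out.printf("%.6f%n", jobFailure(0.01, 4, 100_000L));
    }
}
```

<p>With a 1% task failure rate, four attempts and 100,000 tasks, the job failure probability comes out at roughly 0.1%.</p>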
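<p>Here &#8220;max task failures&#8221; is the per-task attempt limit described above (&#8220;mapred.map.max.attempts&#8221; and &#8220;mapred.reduce.max.attempts&#8221;, 4 by default). The similarly named &#8220;mapred.max.tracker.failures&#8221; property controls something different: how many task failures a single TaskTracker may accumulate before it is blacklisted for the job.</p>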
<p>Let&#8217;s take a significant workload of 100,000 map tasks and see what the numbers look like:</p>
<p><a href="http://blog.rapleaf.com/dev/wp-content/uploads/2009/07/mapper-failures.png"><img class="aligncenter size-full wp-image-115" src="http://blog.rapleaf.com/dev/wp-content/uploads/2009/07/picture-13.png" alt="mapper-failures" width="526" height="205" /></a></p>
<p>As the probability of a task failing goes above 1%, the probability of the job failing rapidly increases. It is very important to keep the cluster stable and keep the failure rate relatively small, as these numbers show Hadoop&#8217;s failure model only goes so far. We can also see the importance of the &#8220;max task failures&#8221; parameter, as values under 4 cause the probability of job failures to rise to significant values even with a 0.5% probability of task failure.</p>
<p>Reducers run for much longer than mappers, which gives a random event more time to cause a failure. We can therefore say that the probability of a reducer failing is much higher than that of a mapper failing. This is balanced out by the fact that there are far fewer reducers. Let&#8217;s look at some numbers more representative of a job failing due to reducers failing:</p>
<p><a href="http://blog.rapleaf.com/dev/wp-content/uploads/2009/07/reducer-failures.png"><img class="aligncenter size-full wp-image-117" src="http://blog.rapleaf.com/dev/wp-content/uploads/2009/07/picture-141.png" alt="reducer-failures" width="525" height="188" /></a></p>
<p>The probability of a reducer failing needs to reach about 10% before the job has a significant chance of failing.</p>
<h3>Bad Nodes</h3>
<p>One more variable to consider in the model is bad nodes. Oftentimes nodes go bad and every task run on them fails, whether because of a disk going bad, the disk filling up, or other causes. With a bad node, you typically see a handful of mappers and reducers fail before the node gets blacklisted and no more tasks are assigned to it. In order to simplify our analysis, let&#8217;s assume that each bad node causes a fixed number of tasks to fail. Additionally, let&#8217;s assume a task can only be affected by a bad node once, which is reasonable because nodes are blacklisted fairly quickly. Let&#8217;s call the tasks which fail once due to a bad node &#8220;b-tasks&#8221; and the other tasks &#8220;n-tasks&#8221;. A &#8220;b-task&#8221; starts with one failure, so it needs to fail randomly &#8220;max task failures &#8211; 1&#8221; times to cause the job to fail. On our cluster, we typically see a bad node cause three tasks to automatically fail, so using that number the modified formula ends up looking like:</p>
<p><code><br />
#b-tasks = #bad nodes * 3<br />
Pr[all b-tasks succeeding] = (1-Pr[task failing] ^ (max task failures - 1))^(#b-tasks)<br />
Pr[all n-tasks succeeding] =  (1-Pr[task failing] ^ (max task failures))^(num tasks - #b-tasks)<br />
Pr[job succeeding] = Pr[all b-tasks succeeding] * Pr[all n-tasks succeeding]<br />
Pr[job succeeding] = (1-Pr[task failing] ^ (max task failures - 1))^(#b-tasks) * (1-Pr[task failing] ^ (max task failures))^(num tasks - #b-tasks)<br />
Pr[job failing] = 1 - Pr[job succeeding]</p>
<p>Pr[job failing] = 1 - (1-Pr[task failing] ^ (max task failures - 1))^(#b-tasks) * (1-Pr[task failing] ^ (max task failures))^(num tasks - #b-tasks)<br />
</code></p>
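<p>The modified formula can be sketched in Java as well. As before, this is an illustration under the stated assumptions and not Hadoop code: the class and method names are hypothetical, the inputs in <tt>main</tt> are made-up example values, and the sketch expects a task failure probability below 1 and at least two attempts per task.</p>

```java
// Illustration only: evaluates the bad-node variant of the formula,
//   1 - (1 - p^(maxAttempts - 1))^bTasks * (1 - p^maxAttempts)^(numTasks - bTasks)
// assuming independent attempts, 0 <= p < 1 and maxAttempts >= 2.
public class BadNodeFailureModel {

    // Probability that every one of `tasks` tasks avoids failing randomly
    // `randomFailuresNeeded` times in a row.
    public static double allSucceed(double pTaskFail, int randomFailuresNeeded, long tasks) {
        double pExhaust = Math.pow(pTaskFail, randomFailuresNeeded);
        return Math.exp(tasks * Math.log1p(-pExhaust));
    }

    public static double jobFailure(double pTaskFail, int maxAttempts, long numTasks,
                                    int badNodes, int tasksPerBadNode) {
        // b-tasks have already lost one attempt to a bad node.
        long bTasks = (long) badNodes * tasksPerBadNode;
        double bTasksOk = allSucceed(pTaskFail, maxAttempts - 1, bTasks);
        double nTasksOk = allSucceed(pTaskFail, maxAttempts, numTasks - bTasks);
        return 1.0 - bTasksOk * nTasksOk;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 10% failure rate, 4 attempts, 500 reducers,
        // 5 bad nodes that each knock out 3 tasks.
        System.out.printf("%.4f%n", jobFailure(0.10, 4, 500L, 0, 3));
        System.out.printf("%.4f%n", jobFailure(0.10, 4, 500L, 5, 3));
    }
}
```

<p>Setting the number of bad nodes to zero reduces this to the original formula, which makes it easy to compare the two cases side by side.</p>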
<p>Since there are so many mappers, the results of the formula won&#8217;t change for a handful of bad nodes. Given that the number of reducers is relatively small though, the numbers do change somewhat:</p>
<p><a href="http://blog.rapleaf.com/dev/wp-content/uploads/2009/07/bad-nodes.png"><img class="aligncenter size-full wp-image-118" src="http://blog.rapleaf.com/dev/wp-content/uploads/2009/07/picture-15.png" alt="bad-nodes" width="642" height="170" /></a></p>
<p>Happily the numbers aren&#8217;t too drastic &#8211; five bad nodes cause the failure rate to increase by 1.5x to 2x.</p>
<p>In the end, Hadoop is fairly fault tolerant as long as the probability of a task failing is kept relatively low. Based on the numbers we&#8217;ve looked at, 4 is a good value to use for &#8220;max task failures&#8221;, and you should start worrying about cluster stability when the task failure rate approaches 1%. You could always increase the &#8220;max task failures&#8221; properties to increase robustness, but if you are having that many failures you will be suffering performance penalties and would be better off making your cluster more stable. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rapleaf.com/dev/2009/07/17/a-glance-at-the-hadoop-failure-model/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

