<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Hadoop Digest, February 2010</title>
	<atom:link href="http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/</link>
	<description>Search, Text Analytics, Natural Language Processing</description>
	<lastBuildDate>Fri, 03 Feb 2012 05:00:37 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Alex Baranau</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-360</link>
		<dc:creator><![CDATA[Alex Baranau]]></dc:creator>
		<pubDate>Tue, 13 Jul 2010 05:08:41 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-360</guid>
		<description><![CDATA[I think I know what the question is about.

If a split ends partway through a record, the framework (this is actually up to the InputFormat) requests the remaining bytes, which sit at the beginning of the next split. The read is made over the network and usually transfers only a very small amount of data, so given the size of each split (usually at least 64 MB) it doesn&#039;t noticeably affect performance, even if the next split is on another physical box.

I hope I answered your question.]]></description>
		<content:encoded><![CDATA[<p>I think I know what the question is about.</p>
<p>If a split ends partway through a record, the framework (this is actually up to the InputFormat) requests the remaining bytes, which sit at the beginning of the next split. The read is made over the network and usually transfers only a very small amount of data, so given the size of each split (usually at least 64 MB) it doesn&#8217;t noticeably affect performance, even if the next split is on another physical box.</p>
<p>I hope I answered your question.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: sematext</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-359</link>
		<dc:creator><![CDATA[sematext]]></dc:creator>
		<pubDate>Tue, 13 Jul 2010 03:37:07 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-359</guid>
		<description><![CDATA[I&#039;m not sure which part of the post you are referring to, but this sounds like a question for the mailing list.
Alternatively, have you looked for the answer on http://www.search-hadoop.com/ ?]]></description>
		<content:encoded><![CDATA[<p>I&#8217;m not sure which part of the post you are referring to, but this sounds like a question for the mailing list.<br />
Alternatively, have you looked for the answer on <a href="http://www.search-hadoop.com/" rel="nofollow">http://www.search-hadoop.com/</a> ?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: yuanke</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-358</link>
		<dc:creator><![CDATA[yuanke]]></dc:creator>
		<pubDate>Tue, 13 Jul 2010 02:59:49 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-358</guid>
		<description><![CDATA[How does the framework make sure complete records are presented? Could you give a more detailed description? I already know everything you said. Thanks.]]></description>
		<content:encoded><![CDATA[<p>How does the framework make sure complete records are presented? Could you give a more detailed description? I already know everything you said. Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: slim tebourbi</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-74</link>
		<dc:creator><![CDATA[slim tebourbi]]></dc:creator>
		<pubDate>Sat, 20 Feb 2010 09:03:42 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-74</guid>
		<description><![CDATA[Great work. Thanks.]]></description>
		<content:encoded><![CDATA[<p>Great work. Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jakob Homan</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-69</link>
		<dc:creator><![CDATA[Jakob Homan]]></dc:creator>
		<pubDate>Fri, 19 Feb 2010 05:37:37 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-69</guid>
		<description><![CDATA[Otis-  
This is an excellent resource; thank you very much for compiling this information.  Very much appreciated.

Thanks,
Jakob Homan
Hadoop @ Yahoo!]]></description>
		<content:encoded><![CDATA[<p>Otis-<br />
This is an excellent resource; thank you very much for compiling this information.  Very much appreciated.</p>
<p>Thanks,<br />
Jakob Homan<br />
Hadoop @ Yahoo!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Otis Gospodnetic</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-66</link>
		<dc:creator><![CDATA[Otis Gospodnetic]]></dc:creator>
		<pubDate>Thu, 18 Feb 2010 13:06:04 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-66</guid>
		<description><![CDATA[Hi Todd.  I didn&#039;t personally put this one together.  But yes, digest posts will be monthly.  Thanks for the feedback!]]></description>
		<content:encoded><![CDATA[<p>Hi Todd.  I didn&#8217;t personally put this one together.  But yes, digest posts will be monthly.  Thanks for the feedback!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Todd Lipcon</title>
		<link>http://blog.sematext.com/2010/02/16/hadoop-digest-february-2010/#comment-63</link>
		<dc:creator><![CDATA[Todd Lipcon]]></dc:creator>
		<pubDate>Thu, 18 Feb 2010 08:14:14 +0000</pubDate>
		<guid isPermaLink="false">http://blog.sematext.com/?p=84#comment-63</guid>
		<description><![CDATA[Hey Otis

Thanks for writing this up! Will this be monthly? Should be a great resource for the community.

-Todd]]></description>
		<content:encoded><![CDATA[<p>Hey Otis</p>
<p>Thanks for writing this up! Will this be monthly? Should be a great resource for the community.</p>
<p>-Todd</p>
]]></content:encoded>
	</item>
</channel>
</rss>

