<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
  <channel>
    <title>rainsf</title>
    <description></description>
    <link>http://rainsf.javaeye.com</link>
    <language>UTF-8</language>
    <copyright>Copyright 2003-2008, JavaEye.com</copyright>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <generator>JavaEye - Building the best software development community</generator>
      <item>
        <title>(Repost) Access Control in Subversion</title>
        <author>rainsf</author>
        <description>
          <![CDATA[
          <br/>
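          Author: <a href="http://rainsf.javaeye.com">rainsf</a>&nbsp;
          Link: <a href="http://rainsf.javaeye.com/blog/144512" style="color:red;">http://rainsf.javaeye.com/blog/144512</a>&nbsp;
          Published: November 28, 2007
          <br/><br/>
          Notice: This article is an original blog post published on JavaEye. Reposting it on any website without the author's written permission is strictly prohibited, and violations will be pursued legally.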
          <br/><br/>
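          <h3>1. Authentication and Authorization</h3>
<p>These two terms often appear together. Authentication means verifying a user's identity, most commonly with a username and password. Authorization means deciding whether a user is permitted to perform a given operation. Subversion provides the &ldquo;authz-db&rdquo; file for path-based authorization, that is, for deciding whether a user may operate on a given path. Since Subversion 1.3, svnserve can use the &ldquo;authz-db&rdquo; file just as Apache can.</p>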
<h3><br />
2. Configuration files under svnserve</h3>
<p>Because this article uses svnserve as its example, here is the layout of a repository directory first:</p>
<p>D:\SVNROOT\PROJECT1<br />
├─conf<br />
├─dav<br />
├─db<br />
│&nbsp; ├─revprops<br />
│&nbsp; ├─revs<br />
│&nbsp; └─transactions<br />
├─hooks<br />
└─locks</p>
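<p>The conf directory contains three files:</p>
<p>&nbsp;&nbsp;&nbsp; authz<br />
&nbsp;&nbsp;&nbsp; passwd<br />
&nbsp;&nbsp;&nbsp; svnserve.conf</p>
<p>Of these, &ldquo;svnserve.conf&rdquo; is the repository's configuration file. When svnserve is used, it determines which authentication and authorization files are in effect:</p>
<p>&nbsp;&nbsp;&nbsp; password-db = passwd<br />
&nbsp;&nbsp;&nbsp; authz-db = authz</p>
<p>This configuration uses the passwd and authz files in the same directory as &ldquo;svnserve.conf&rdquo;. password-db names the user and password file, and authz-db names the authorization file, which is the main subject of this article.</p>
<p><font color="#ff0000">Note: when Apache is the server, &ldquo;svnserve.conf&rdquo; is not consulted at all; Apache's own configuration is used instead.</font></p>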
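<p>As a rough illustration of how these two settings select the files, the following Python sketch reads a svnserve.conf-style text with the standard configparser module. The sample text and the helper name auth_files are assumptions made for this example; it is not how svnserve itself reads the file.</p>

```python
import configparser

# Sample svnserve.conf content, assumed for illustration only.
SAMPLE_CONF = """
[general]
anon-access = none
auth-access = write
password-db = passwd
authz-db = authz
"""


def auth_files(conf_text):
    """Return the (password-db, authz-db) values from the [general] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    general = parser["general"]
    # get() returns None when an option is missing or still commented out.
    return general.get("password-db"), general.get("authz-db")
```

<p>Calling auth_files(SAMPLE_CONF) yields the two file names; a line that still starts with &ldquo;#&rdquo; is treated as a comment and therefore reported as missing.</p>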
<h3><br />
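3. Repository file layout with svnserve</h3>
<p>With svnserve, administration is easier when all repositories share the same authentication and authorization files, so every repository's svnserve.conf should point to the same password-db and authz-db files. Below is a directory holding several repositories:<br />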
D:\SVNROOT<br />
├─project1<br />
│&nbsp; ├─conf<br />
│&nbsp; ├─dav<br />
│&nbsp; ├─db<br />
│&nbsp; │&nbsp; ├─revprops<br />
│&nbsp; │&nbsp; ├─revs<br />
│&nbsp; │&nbsp; └─transactions<br />
│&nbsp; ├─hooks<br />
│&nbsp; └─locks<br />
└─project2<br />
&nbsp;&nbsp;&nbsp; ├─conf<br />
&nbsp;&nbsp;&nbsp; ├─dav<br />
&nbsp;&nbsp;&nbsp; ├─db<br />
&nbsp;&nbsp;&nbsp; │&nbsp; ├─revprops<br />
&nbsp;&nbsp;&nbsp; │&nbsp; ├─revs<br />
&nbsp;&nbsp;&nbsp; │&nbsp; └─transactions<br />
&nbsp;&nbsp;&nbsp; ├─hooks<br />
&nbsp;&nbsp;&nbsp; └─locks<br />
&nbsp;&nbsp;&nbsp; <br />
Both project1 and project2 under D:\SVNROOT are already repositories, so we edit the svnserve.conf in each conf directory to point to the same password-db and authz-db files.</p>
<pre>password-db = ..\..\passwd
authz-db = ..\..\authz</pre>
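<p>With this in place, D:\SVNROOT\passwd and D:\SVNROOT\authz control svnserve access to all repositories. The steps that follow also require anonymous access to be turned off: remove the &ldquo;#&rdquo; in front of the anon-access setting so that it reads &ldquo;anon-access = none&rdquo;, which ensures that only authenticated users have access.</p>
<p><font color="#ff0000">Note: also pay attention to svnserve's &ldquo;realm&rdquo; value. With the setup above, all repositories should use the same realm so that cached passwords can be shared across them. For details see <span class="sect2"><a href="http://www.subversion.org.cn/svn-ch-6-sect-2.html#svn-ch-6-sect-2.2"><u><font color="#0000ff">Client Credentials Caching</font></u></a></span>.</font></p>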
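<p>The relative values ..\..\passwd and ..\..\authz are interpreted relative to the directory that contains svnserve.conf. The Python sketch below checks that both repositories end up at the same file. It uses the standard ntpath module so that Windows path rules apply on any platform; the helper name resolve_db_path is hypothetical.</p>

```python
import ntpath


def resolve_db_path(conf_dir, value):
    """Resolve a password-db or authz-db value relative to the conf directory."""
    # ntpath applies Windows path rules regardless of the host platform.
    return ntpath.normpath(ntpath.join(conf_dir, value))


REPOS = ["project1", "project2"]

# Map each repository name to the passwd file its svnserve.conf points at.
resolved = {
    name: resolve_db_path(ntpath.join(r"D:\SVNROOT", name, "conf"), r"..\..\passwd")
    for name in REPOS
}
```

<p>Both entries of resolved are D:\SVNROOT\passwd, which is the shared file described above.</p>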
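<h3>4. Test users and groups</h3>
<p>The repositories deny all anonymous access and are available only to authenticated users.</p>
<p>root: the configuration management administrator, with full administrative rights over the repositories.</p>
<p>p1_admin1: administrator of project1, with full access to project1.<br />
p1_d1: developer on project1, with full access to project1's trunk but no access at all to the /trunk/admin directory.<br />
p1_t1: tester on project1, with read-only access to project1's trunk but no access at all to the /trunk/admin directory.</p>
<p>p2_admin1: administrator of project2, with full access to project2.<br />
p2_d1: developer on project2, with full access to project2's trunk but no access at all to the /trunk/admin directory.<br />
p2_t1: tester on project2, with read-only access to project2's trunk but no access at all to the /trunk/admin directory.</p>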
<p><br />
The corresponding groups and their members:<br />
p1_group_a:p1_admin1<br />
p1_group_d:p1_d1<br />
p1_group_t:p1_t1<br />
p2_group_a:p2_admin1<br />
p2_group_d:p2_d1<br />
p2_group_t:p2_t1</p>
<p><br />
5. Edit the D:\SVNROOT\passwd file</p>
<p>As noted earlier, the user and password file is D:\SVNROOT\passwd, so we set a password for each user there. The file contents are:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<pre><span style="COLOR: #66cc66">[</span>users<span style="COLOR: #66cc66">]</span>
p1_admin1 = p1_admin1
p1_d1 = p1_d1
p1_t1 = p1_t1
         
p2_admin1 = p2_admin1
p2_d1 = p2_d1
p2_t1 = p2_t1</pre>
</blockquote>
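<p>To make testing easier, every password is the same as its username. If you use a different authentication method this step may differ, but the usernames should stay the same.</p>
<p>6. Configure authorization: edit D:\SVNROOT\authz</p>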
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p>[groups]<br />
# Define the groups</p>
<p>p1_group_a = p1_admin1<br />
p1_group_d = p1_d1<br />
p1_group_t = p1_t1</p>
<p>p2_group_a = p2_admin1<br />
p2_group_d = p2_d1<br />
p2_group_t = p2_t1</p>
<p><br />
[/]<br />
# All repositories are read-only by default; root has read-write access<br />
* = r<br />
root = rw</p>
<p><br />
[project1:/]<br />
# Permissions on the root directory of repository project1<br />
@p1_group_a = rw<br />
@p1_group_d = rw<br />
@p1_group_t = r</p>
<p>[project1:/trunk/admin]<br />
# Permissions on the /trunk/admin directory of repository project1:<br />
# p1_group_a has read-write access; p1_group_d and p1_group_t have no access.<br />
@p1_group_a = rw<br />
@p1_group_d = <br />
@p1_group_t = </p>
<p>&nbsp;</p>
<p>[project2:/]<br />
# Permissions on the root directory of repository project2<br />
@p2_group_a = rw<br />
@p2_group_d = rw<br />
@p2_group_t = r</p>
<p>[project2:/trunk/admin]<br />
# Permissions on the /trunk/admin directory of repository project2<br />
@p2_group_a = rw<br />
@p2_group_d = <br />
@p2_group_t = </p>
</blockquote>
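<p>Because the passwd and authz files are maintained separately, it can help to check that every user named in authz also has an entry in passwd. The Python sketch below does this with the standard configparser module, using the file contents from this article with the comments left out. The function name users_missing_from_passwd is hypothetical, and the check is only an aid for reviewing the files, not something Subversion performs.</p>

```python
import configparser

PASSWD = """
[users]
p1_admin1 = p1_admin1
p1_d1 = p1_d1
p1_t1 = p1_t1
p2_admin1 = p2_admin1
p2_d1 = p2_d1
p2_t1 = p2_t1
"""

AUTHZ = """
[groups]
p1_group_a = p1_admin1
p1_group_d = p1_d1
p1_group_t = p1_t1
p2_group_a = p2_admin1
p2_group_d = p2_d1
p2_group_t = p2_t1

[/]
* = r
root = rw

[project1:/]
@p1_group_a = rw
@p1_group_d = rw
@p1_group_t = r

[project1:/trunk/admin]
@p1_group_a = rw
@p1_group_d =
@p1_group_t =

[project2:/]
@p2_group_a = rw
@p2_group_d = rw
@p2_group_t = r

[project2:/trunk/admin]
@p2_group_a = rw
@p2_group_d =
@p2_group_t =
"""


def load(text):
    """Parse an ini-style text, keeping names case-sensitive."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # do not lowercase user and group names
    parser.read_string(text)
    return parser


def users_missing_from_passwd(passwd_text, authz_text):
    """List users that authz refers to but passwd does not define."""
    known = set(load(passwd_text)["users"])
    authz = load(authz_text)
    referenced = set()
    for section in authz.sections():
        for key, value in authz[section].items():
            if section == "groups":
                # Group members are a comma-separated list of user names.
                referenced.update(u.strip() for u in value.split(",") if u.strip())
            elif key != "*" and not key.startswith("@"):
                # A plain key in a path section is an individual user.
                referenced.add(key)
    return sorted(referenced - known)
```

<p>With the two files shown in this article, the function returns a list containing only root: that user is granted rw in [/] but has no line in passwd, so an entry for root would presumably have to be added to [users] before that account can log in through svnserve.</p>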
<p><br />
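When these settings are in place, you will notice something interesting. When user &ldquo;p1_d1&rdquo; checks out project1's trunk, the directory is empty, as if the admin directory did not exist at all. When p1_d1 browses the repository, the admin directory is visible, but its contents cannot be seen.</p>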
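<p>The behavior described above follows from the way path-based rules are looked up: the most specific section that mentions the user decides, and otherwise the lookup moves up to the parent path. The Python sketch below is a simplified model of that lookup, written for this article's project1 rules only. The function names are hypothetical and the code is an assumption-laden illustration, not Subversion's implementation.</p>

```python
import configparser
import posixpath

AUTHZ = """
[groups]
p1_group_a = p1_admin1
p1_group_d = p1_d1
p1_group_t = p1_t1

[/]
* = r
root = rw

[project1:/]
@p1_group_a = rw
@p1_group_d = rw
@p1_group_t = r

[project1:/trunk/admin]
@p1_group_a = rw
@p1_group_d =
@p1_group_t =
"""


def parse_authz(text):
    """Return (groups, rules) parsed from an authz-style text."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep names case-sensitive
    parser.read_string(text)
    groups = {
        name: {u.strip() for u in members.split(",") if u.strip()}
        for name, members in parser["groups"].items()
    }
    rules = {s: dict(parser[s]) for s in parser.sections() if s != "groups"}
    return groups, rules


def matches(key, user, groups):
    """Check whether a rule key (*, @group or user name) applies to user."""
    if key == "*":
        return True
    if key.startswith("@"):
        return user in groups.get(key[1:], set())
    return key == user


def access(user, repo, path, groups, rules):
    """Return 'rw', 'r' or '' for user at repo:path under the simplified model."""
    current = path
    while True:
        # Assumption: a repository-specific section is consulted before a global one.
        for section in (repo + ":" + current, current):
            granted = [v for k, v in rules.get(section, {}).items()
                       if matches(k, user, groups)]
            if granted:
                # Several matching lines: take the widest grant.
                return max(granted, key=len)
        if current == "/":
            return ""
        current = posixpath.dirname(current)
```

<p>Under this model p1_d1 gets rw on /trunk but an empty result on /trunk/admin, which matches the observation above. The model also suggests that a user with no matching rule in the project1 sections falls through to &ldquo;* = r&rdquo; in [/], which is worth keeping in mind when adapting the file.</p>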
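<p>Directories with Chinese names work as well, provided the authz file is converted to UTF-8. In UltraEdit on my Windows XP machine the file format is shown as U8-DOS. To convert, open the authz file in UltraEdit, choose &ldquo;File-&gt;Conversions-&gt;ASCII to UTF-8&rdquo;, and save.</p>
<p>Even more complex setups follow the same pattern. In practice, plan the permissions first, grant each user only the minimum access required, and aim for the smallest configuration that delivers the access control you need.</p>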
          ]]>
        </description>
        <pubDate>Wed, 28 Nov 2007 14:48:51 +0800</pubDate>
        <link>http://rainsf.javaeye.com/blog/144512</link>
        <guid>http://rainsf.javaeye.com/blog/144512</guid>
      </item>
      <item>
        <title>(Repost) Subversion Quick-Start Tutorial</title>
        <author>rainsf</author>
        <description>
          <![CDATA[
          <br/>
          Author: <a href="http://rainsf.javaeye.com">rainsf</a>&nbsp;
          Link: <a href="http://rainsf.javaeye.com/blog/144492" style="color:red;">http://rainsf.javaeye.com/blog/144492</a>&nbsp;
          Published: November 28, 2007
          <br/><br/>
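          <p>How to set up a Subversion server quickly and start using it in a project is the question most people care about. Compared with CVS, Subversion offers more options and is easier to set up: a few commands are enough to get a working server environment. An accompanying <u><font color="#0000ff"><a href="http://www.subversion.org.cn/media/all.swf">animated tutorial</a></font></u> is available. <br />
This article is the quickest possible Subversion tutorial. It helps you build a usable server environment in the shortest time, one that needs only minor adjustments before being applied to a real project. <br />
The tutorial is divided into the parts listed below. It goes beyond a quick start and ends with notes on some advanced features. To keep things simple, it uses Windows, which suits projects with limited resources; the differences on UNIX are small.</p>
<p>Software download</p>
<p>Server and client installation</p>
<p>Create a repository </p>
<p>Configure users and permissions</p>
<p>Run the standalone server </p>
<p>Initial import </p>
<p>Basic client operations</p>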
<h1>1. Software download </h1>
<h3>Download the Subversion server. </h3>
<p>Go to the <a href="http://subversion.tigris.org/">official website</a> to download the binary installer. In the <a href="http://subversion.tigris.org/project_packages.html#binary-packages">binary packages section</a>, find the entry for Windows NT, 2000, XP and 2003 and choose &quot;<a href="http://subversion.tigris.org/servlets/ProjectDocumentList?folderID=91">this directory</a>&quot;. Many downloads are listed there; at the time of writing, <a href="http://subversion.tigris.org/files/documents/15/34093/svn-1.4.0-setup.exe">svn-1.4.0-setup.exe</a> is available. </p>
<h3>Download TortoiseSVN, the Subversion client for Windows. </h3>
<p>TortoiseSVN is a set of tools that extends the Windows shell and can be thought of as a plug-in for Windows Explorer. Once it is installed, Windows recognizes Subversion working directories. <br />
The official website is <a href="http://tortoisesvn.net/">TortoiseSVN</a>. Downloading works much as it does for the server: the <a href="http://tortoisesvn.net/downloads">Download</a> page lets you pick a version, and the latest stable installer at the time of writing is <a href="http://prdownloads.sourceforge.net/tortoisesvn/TortoiseSVN-1.4.0.7501-win32-svn-1.4.0.msi?download"><u><font color="#0000ff">TortoiseSVN-1.4.0.7501-win32-svn-1.4.0.msi</font></u></a>.</p>
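<h1>2. Server and client installation </h1>
<p>To install the server, run <a href="http://subversion.tigris.org/files/documents/15/34093/svn-1.4.0-setup.exe">svn-1.4.0-setup.exe</a> and follow the prompts. This gives us an environment in which the server can run. </p>
<p>To install TortoiseSVN, likewise run <a href="http://prdownloads.sourceforge.net/tortoisesvn/TortoiseSVN-1.4.0.7501-win32-svn-1.4.0.msi?download"><u><font color="#0000ff">TortoiseSVN-1.4.0.7501-win32-svn-1.4.0.msi</font></u></a> and follow the prompts. When it finishes, the installer asks whether to restart. The restart only activates the special icons that Windows shows for svn working copies and does not affect any actual functionality, but to see the full effect right away it is worth restarting the machine.<br />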
&nbsp; </p>
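<h1>3. Create a repository</h1>
<p>Before a Subversion server can run, a repository must be created; it can be thought of as the database in which the server stores its data. Once the Subversion server software is installed, this can be done directly from the command line, for example: </p>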
<pre>svnadmin create E:\svndemo\repository</pre>
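<p>This creates a repository in the directory E:\svndemo\repository. </p>
<p>The same step can be done graphically with TortoiseSVN: <br />
In the directory E:\svndemo\repository, choose &quot;right-click -&gt; TortoiseSVN -&gt; Create Repository here...&quot;, then select a repository type (the default is fine). A set of directories and files is then created. </p>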
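<p>Whichever method is used, a quick way to confirm that the repository was created is to look for the directories a repository normally contains. The Python sketch below checks for conf, db, hooks and locks; that list is an assumption based on a typical repository layout, and the function name missing_repository_dirs is hypothetical.</p>

```python
import os

# Top-level directories assumed to exist in a freshly created repository.
EXPECTED_DIRS = ("conf", "db", "hooks", "locks")


def missing_repository_dirs(repo_path):
    """Return the expected directories that are absent under repo_path."""
    return [
        name for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(repo_path, name))
    ]
```

<p>An empty result for E:\svndemo\repository suggests the creation step succeeded; any names returned point to a directory that was not created.</p>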
<h1><br />
4. Configure users and permissions </h1>
<p>Go to the E:\svndemo\repository\conf directory and edit svnserve.conf, changing: <br />
# [general] <br />
# password-db = passwd <br />
to: <br />
[general] <br />
password-db = passwd <br />
Then edit the passwd file in the same directory and uncomment the following three lines: <br />
# [users] <br />
# harry = harryssecret <br />
# sally = sallyssecret <br />
so that they become: <br />
[users] <br />
harry = harryssecret <br />
sally = sallyssecret </p>
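<p>The edits above amount to removing the leading &ldquo;#&rdquo; from a few specific lines. As a small illustration, the Python sketch below does that for a piece of text; the function name uncomment and the sample input are assumptions made for this example, and editing the files by hand works just as well.</p>

```python
def uncomment(text, wanted):
    """Remove the leading '#' from lines whose uncommented form is in wanted."""
    result = []
    for line in text.splitlines():
        # Drop leading '#' characters and spaces, plus trailing whitespace.
        stripped = line.lstrip("# ").rstrip()
        if line.lstrip().startswith("#") and stripped in wanted:
            result.append(stripped)
        else:
            result.append(line)
    return "\n".join(result)


SAMPLE = "# [general]\n# anon-access = read\n# password-db = passwd\n"
ENABLED = uncomment(SAMPLE, {"[general]", "password-db = passwd"})
```

<p>Here ENABLED keeps the anon-access line commented out while the [general] header and the password-db line become active. The activated lines start in the first column, with no leading spaces.</p>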
<p>&nbsp;</p>
<h1>5. Run the standalone server </h1>
<p>From any directory, run: <br />
svnserve -d -r E:\svndemo\repository <br />
The server is now running. <span class="ontab">Do not close the command-line window: closing it also stops svnserve. </span></p>
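<p>To confirm that the server is accepting connections, one option is to try opening a TCP connection to it. The Python sketch below does this with the standard socket module. It assumes svnserve is listening on its default port, 3690, since the command above does not choose another one; the function name is_listening is hypothetical.</p>

```python
import socket

SVNSERVE_DEFAULT_PORT = 3690  # port svnserve uses when none is specified


def is_listening(host, port=SVNSERVE_DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not listening".
        return False
```

<p>For example, is_listening("localhost") should return True while the svnserve window is open and False after it is closed. A successful connection only shows that something is listening on the port, not that it is svnserve.</p>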
<h1><br />
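6. Initial import </h1>
<p>Go to the root directory of the project to be imported, in this example E:\svndemo\initproject, which contains a readme.txt file: </p>
<p><br />
Right-click -&gt; TortoiseSVN -&gt; Import... <br />
For URL of repository, enter &ldquo;svn://localhost/&rdquo; <br />
Click OK <br />
Afterwards the directory itself is unchanged; if no error was reported, all the data has been imported into the repository we just created. </p>
<p>Note that this step can be carried out entirely from another machine that has TortoiseSVN installed. For example, if the host running svnserve has the IP address 133.96.121.22, the URL to enter is &ldquo;svn://133.96.121.22/&rdquo;.</p>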
<h1><br />
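7. Basic client operations </h1>
<p>Check out the repository into a working copy: <br />
Go to any empty directory, in this example E:\svndemo\wc1, choose right-click -&gt; Checkout, and enter svn://localhost/ for URL of repository. This gives us a working copy. <br />
Make a change in the working copy and commit it: <br />
Open readme.txt, make a change, then choose right-click -&gt; Commit... to commit the change to the repository. </p>
<p>View the changes made: <br />
Right-click readme.txt -&gt; TortoiseSVN -&gt; Show Log to see every commit made to this file. Right-click revision 1 -&gt; Compare with working copy to compare the working copy's file with revision 1. </p>
<p>Finally, everything described here has been recorded as an <a href="http://www.subversion.org.cn/media/all.swf">animation</a> for reference. </p>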
<p>&nbsp;</p>
<p>Appendix:</p>
<p><font face="Arial">Service setup for SVN 1.4</font></p>
<p><font face="Arial">Install the service<br />
sc create subversion_service binpath= &quot;c:\subversion\bin\svnserve.exe --service -r c:\svn_test\repos&quot; displayname= &quot;Subversion Repository&quot; depend= Tcpip</font></p>
<p><font face="Arial"><br />
Remove the service<br />
sc delete subversion_service</font></p>
          ]]>
        </description>
        <pubDate>Wed, 28 Nov 2007 14:01:20 +0800</pubDate>
        <link>http://rainsf.javaeye.com/blog/144492</link>
        <guid>http://rainsf.javaeye.com/blog/144492</guid>
      </item>
      <item>
        <title>Nutch 0.9 Notes</title>
        <author>rainsf</author>
        <description>
          <![CDATA[
          <br/>
          Author: <a href="http://rainsf.javaeye.com">rainsf</a>&nbsp;
          Link: <a href="http://rainsf.javaeye.com/blog/75725" style="color:red;">http://rainsf.javaeye.com/blog/75725</a>&nbsp;
          Published: April 27, 2007
          <br/><br/>
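          &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; I have been following the progress of Lucene and Nutch. Both projects have been moving quickly lately: Lucene has reached 2.1 and Nutch has reached 0.9, with many welcome improvements.<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Today I gave Nutch 0.9 a quick try; my notes follow:<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />1. Unpack the Nutch archive and create a directory named urls under the Nutch root directory. In it, create one or more text files containing URLs, such as urlt.txt, with one URL per line, for example:<br />http://www.blogjava.net<br /><font color="#000000">http://www.javaeye.com/</font><br /><br /><br />2. Edit <span style="COLOR: #ff00ff">crawl-urlfilter.txt</span> in the conf directory; the relevant fragment is:<br /># accept hosts in MY.DOMAIN.NAME<br /># +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/<br />+^http://www.blogjava.net/<br />+^http://www.javaeye.com/<br />+^http://lucene.apache.org/<br /><br />3. Edit <span style="COLOR: #ff00ff">nutch-site.xml</span> in the conf directory so that it reads as follows:<br />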
<div style="BORDER-RIGHT: #cccccc 1px solid; PADDING-RIGHT: 5px; BORDER-TOP: #cccccc 1px solid; PADDING-LEFT: 4px; FONT-SIZE: 13px; PADDING-BOTTOM: 4px; BORDER-LEFT: #cccccc 1px solid; WIDTH: 98%; WORD-BREAK: break-all; PADDING-TOP: 4px; BORDER-BOTTOM: #cccccc 1px solid; BACKGROUND-COLOR: #eeeeee"><pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;!-- Put site-specific property overrides in this file. --&gt;

&lt;configuration&gt;

    &lt;property&gt;
      &lt;name&gt;http.agent.name&lt;/name&gt;
      &lt;value&gt;Nutch&lt;/value&gt;
      &lt;description&gt;HTTP 'User-Agent' request header. MUST NOT be empty -
      please set this to a single word uniquely related to your organization.

      NOTE: You should also check other related properties:

        http.robots.agents
        http.agent.description
        http.agent.url
        http.agent.email
        http.agent.version

      and set their values appropriately.

      &lt;/description&gt;
    &lt;/property&gt;

    &lt;property&gt;
      &lt;name&gt;http.robots.agents&lt;/name&gt;
      &lt;value&gt;Nutch,*&lt;/value&gt;
      &lt;description&gt;The agent strings we'll look for in robots.txt files,
      comma-separated, in decreasing order of precedence. You should
      put the value of http.agent.name as the first agent name, and keep the
      default * at the end of the list. E.g.: BlurflDev,Blurfl,*
      &lt;/description&gt;
    &lt;/property&gt;

    &lt;property&gt;
      &lt;name&gt;http.agent.description&lt;/name&gt;
      &lt;value&gt;Nutch Search Engineer&lt;/value&gt;
      &lt;description&gt;Further description of our bot- this text is used in
      the User-Agent header.  It appears in parenthesis after the agent name.
      &lt;/description&gt;
    &lt;/property&gt;

    &lt;property&gt;
      &lt;name&gt;http.agent.url&lt;/name&gt;
      &lt;value&gt;http://lucene.apache.org/nutch/bot.html&lt;/value&gt;
      &lt;description&gt;A URL to advertise in the User-Agent header.  This will
       appear in parenthesis after the agent name. Custom dictates that this
       should be a URL of a page explaining the purpose and behavior of this
       crawler.
      &lt;/description&gt;
    &lt;/property&gt;

    &lt;property&gt;
      &lt;name&gt;http.agent.email&lt;/name&gt;
      &lt;value&gt;nutch-agent@lucene.apache.org&lt;/value&gt;
      &lt;description&gt;An email address to advertise in the HTTP 'From' request
       header and User-Agent header. A good practice is to mangle this
       address (e.g. 'info at example dot com') to avoid spamming.
      &lt;/description&gt;
    &lt;/property&gt;

&lt;/configuration&gt;</pre></div>
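<p>Since empty values in this file can keep a crawl from starting, it may be worth checking the file before running anything. The Python sketch below reads a nutch-site.xml-style text with the standard xml.etree.ElementTree module and lists the properties that are missing or empty. The REQUIRED list simply mirrors the five properties set above, the shortened SAMPLE text is made up for the example, and the function names are hypothetical; none of this is part of Nutch.</p>

```python
import xml.etree.ElementTree as ET

# Properties set in the configuration above; treated as required here by assumption.
REQUIRED = [
    "http.agent.name",
    "http.robots.agents",
    "http.agent.description",
    "http.agent.url",
    "http.agent.email",
]

# Shortened sample input, made up for illustration.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>Nutch</value>
  </property>
  <property>
    <name>http.robots.agents</name>
    <value>Nutch,*</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a dict mapping property names to their values."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def empty_or_missing(xml_text, required=REQUIRED):
    """List required property names that are absent or have an empty value."""
    props = read_properties(xml_text)
    return [name for name in required if not props.get(name)]
```

<p>For the shortened SAMPLE text, empty_or_missing reports the description, url and email properties, because only the first two are defined there.</p>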
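<br /><span style="COLOR: red">Note</span>: nutch-0.9.jar already contains a nutch-site.xml, and the files in the conf directory are all copied to the root of the classpath. When running in a web environment, the nutch-site.xml on the classpath is loaded first. When running as a standalone application, the nutch-site.xml shown above should be packaged into nutch-0.9.jar; otherwise some of the properties above are empty and Nutch cannot run.<br /><br /><br />4. Running Nutch on Windows is simple: all you need is to be able to execute the Crawl class. Write an Ant script, place it in the Nutch root directory, and run it. Its contents are as follows:<br />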
<div style="BORDER-RIGHT: #cccccc 1px solid; PADDING-RIGHT: 5px; BORDER-TOP: #cccccc 1px solid; PADDING-LEFT: 4px; FONT-SIZE: 13px; PADDING-BOTTOM: 4px; BORDER-LEFT: #cccccc 1px solid; WIDTH: 98%; WORD-BREAK: break-all; PADDING-TOP: 4px; BORDER-BOTTOM: #cccccc 1px solid; BACKGROUND-COLOR: #eeeeee">&lt;project name="nutch-crawl" default="crawl" basedir="."&gt;<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;property name="lib.dir" location="lib"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;property name="conf.dir" location="conf"/&gt;<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;path id="project.classpath"&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;fileset dir="." includes="nutch-*.jar"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;fileset dir="lib" /&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;pathelement path="."/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;pathelement path="${conf.dir}"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/path&gt;<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;target name="crawl" &gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;echo&gt;crawling starting...&lt;/echo&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;property name="JVM.extra.args" value="-Xmx512m" /&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;java classname="org.apache.nutch.crawl.Crawl" classpathref="project.classpath" fork="true"&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;jvmarg line="${JVM.extra.args}"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="C:/dev-tools/nutch-0.9/urls"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="-dir"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="C:/dev-tools/nutch-0.9/crawl"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="-depth"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="3"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="-threads"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;arg value="15"/&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/java&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;echo&gt;crawling finished...&lt;/echo&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/target&gt;<br />
<br />
&lt;/project&gt;</div>
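<br />For readers who prefer to see what the &lt;java&gt; task in the Ant script above amounts to, here is a minimal Java sketch that assembles a roughly equivalent command line. The class CrawlCommand and its build method are hypothetical helpers written for this illustration, not part of Nutch, and the classpath string is an assumption that mirrors project.classpath (the Nutch jar, every jar under lib, the current directory, and conf).<br />

```java
import java.io.File;

// Hypothetical helper (not part of Nutch): assembles the command line that the
// Ant java task in the script above roughly corresponds to.
final class CrawlCommand {

    // Builds: java -Xmx512m -cp CLASSPATH org.apache.nutch.crawl.Crawl URLS -dir DIR -depth N -threads M
    static String[] build(String classpath, String urlsDir, String crawlDir,
                          int depth, int threads) {
        return new String[] {
            "java", "-Xmx512m",
            "-cp", classpath,
            "org.apache.nutch.crawl.Crawl",
            urlsDir,
            "-dir", crawlDir,
            "-depth", String.valueOf(depth),
            "-threads", String.valueOf(threads)
        };
    }

    public static void main(String[] args) {
        // Assumed classpath entries, mirroring project.classpath in the Ant script;
        // "lib/*" is the JVM classpath wildcard for every jar in lib.
        String sep = File.pathSeparator;
        String classpath = "nutch-0.9.jar" + sep + "lib/*" + sep + "." + sep + "conf";
        String[] cmd = build(classpath, "C:/dev-tools/nutch-0.9/urls",
                "C:/dev-tools/nutch-0.9/crawl", 3, 15);
        // Print the command instead of running it, so the sketch has no side effects.
        System.out.println(String.join(" ", cmd));
    }
}
```

<br />The sketch only prints the command. Running it by hand from the Nutch root directory should behave like the Ant target, provided the assumed classpath matches your layout.<br />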
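<br />At this point, if nothing went wrong, Nutch should be happily running, and when it finishes you will find what you wanted in the crawl directory. Enjoy it!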
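<br />If the crawl instead fails because properties are empty, as the note before step 4 warns, it can help to check which nutch-site.xml is actually visible on the classpath. The following is a minimal diagnostic sketch, assuming the configuration file is resolved through the thread context class loader. ConfigLocator is a hypothetical class written for this illustration, not part of Nutch.<br />

```java
import java.net.URL;

// Hypothetical diagnostic (not part of Nutch): reports where a resource such as
// nutch-site.xml is found on the current classpath.
final class ConfigLocator {

    // Returns the resource's URL as text, or a "not found" message.
    static String locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader of this class when no context loader is set.
            loader = ConfigLocator.class.getClassLoader();
        }
        URL url = loader.getResource(resourceName);
        return url == null ? resourceName + " not found on classpath" : url.toString();
    }

    public static void main(String[] args) {
        // Run with the same classpath as the crawl to see which copy is visible first.
        System.out.println(locate("nutch-site.xml"));
    }
}
```

<br />A URL that points inside nutch-0.9.jar rather than at the conf directory suggests that the copy bundled in the jar is the one being picked up.<br />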
<br /><br /><div align="right"><a href="http://www.blogjava.net/rainsf/" target="_blank" style="text-decoration:none;">小鱼</a> 2007-04-27 11:09</div>
          ]]>
        </description>
        <pubDate>Fri, 27 Apr 2007 03:09:00 +0800</pubDate>
        <link>http://rainsf.javaeye.com/blog/75725</link>
        <guid>http://rainsf.javaeye.com/blog/75725</guid>
      </item>
  </channel>
</rss>