<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="topic" />
<meta name="DC.Title" content="Set up a URL object for the Webserver search engine on HTTP Server" />
<meta name="abstract" content="This topic provides information about how to set up a URL object file for use with the Webserver search engine with the IBM Web Administration for i5/OS interface." />
<meta name="description" content="This topic provides information about how to set up a URL object file for use with the Webserver search engine with the IBM Web Administration for i5/OS interface." />
<meta name="DC.Relation" scheme="URI" content="rzaieparsearch.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 2002,2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 2002,2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rzaieurlobjsrch" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Set up a URL object for the Webserver search engine on HTTP Server</title>
</head>
<body id="rzaieurlobjsrch"><a name="rzaieurlobjsrch"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Set up a URL object for the Webserver search engine on HTTP Server</h1>
<div><p>This topic provides information about how to set up a URL object
file for use with the Webserver search engine with the <span>IBM<sup>®</sup> Web Administration for i5/OS™ interface</span>.</p>
<div class="important"><span class="importanttitle">Important:</span> Information
for this topic supports the latest PTF levels for HTTP Server for i5/OS.
It is recommended that you install the latest PTFs to upgrade to the latest
level of the HTTP Server for i5/OS. Some of the topics documented here are
not available prior to this update. See <a href="http://www-03.ibm.com/servers/eserver/iseries/software/http/services/service.html" target="_blank">http://www.ibm.com/servers/eserver/iseries/software/http/services/service.htm</a> <img src="www.gif" alt="Link outside Information Center" /> for more information. </div>
<p>A URL object contains a list of URLs plus additional web crawling attributes.
If you choose to edit an existing URL object, the contents of the current
object are displayed. When you build a document list by crawling remote web
sites, you can select the URL object together with an options object. See <a href="rzaiedoclstsrch.htm">Set up a document list for the Webserver search engine on HTTP Server</a> for more
information.</p>
<p>To create a URL object, do the following: </p>
<ol><li>Start the <span>IBM Web Administration for i5/OS interface</span>. </li>
<li>Click the <strong>Advanced</strong> tab.</li>
<li>Click the <span class="uicontrol">Search Setup</span> subtab.</li>
<li>Expand <strong>Search Engine Setup</strong>. </li>
<li>Click <strong>Build URL object</strong>. </li>
<li>Choose URL object options:<dl><dt class="dlterm">Create a new URL object</dt>
<dd>Select this option to create a new URL object. Enter the name of the new
URL object. </dd>
</dl>
<dl><dt class="dlterm">Edit this URL object</dt>
<dd>Select this option to edit an existing URL object. Select the URL object
from the list.</dd>
</dl>
</li>
<li>Click <strong>Apply</strong>. </li>
<li>Enter document storage and language options:<dl><dt class="dlterm">Directory to store documents</dt>
<dd>Enter the directory where documents found on web sites are stored. Possible
values include any valid directory path name. </dd>
</dl>
<dl><dt class="dlterm">Document language</dt>
<dd>Select the language of the documents that are downloaded. The list provides
all valid language entries.</dd>
</dl>
</li>
<li>Enter URL list options:<dl><dt class="dlterm">Action</dt>
<dd>Click <strong>Add</strong> to add a new row. </dd>
<dd class="ddexpand">Click <strong>Remove</strong> to remove an existing row.</dd>
</dl>
<dl><dt class="dlterm">URL</dt>
<dd>Enter a URL, for example, http://www.ibm.com. If the URL requires
authentication, create a validation list using the Build
validation list form. See <a href="rzaievallstsrch.htm">Set up validation lists for the Webserver search engine on HTTP Server</a> for
more information.</dd>
</dl>
<dl><dt class="dlterm">URL domain filter</dt>
<dd>Enter a domain to limit crawling, for example, <em>ibm.com</em>. </dd>
</dl>
<dl><dt class="dlterm">Maximum crawling depth</dt>
<dd>Enter how many levels of links to follow from the starting URL. The
starting URL is at depth 0. The links on that page are at depth 1. </dd>
</dl>
<dl><dt class="dlterm">Support robot exclusion</dt>
<dd>Choose whether to support robot exclusion. If you select Yes, sites
or pages that contain robot exclusion META tags or files are not downloaded.</dd>
</dl>
</li>
<li>Click <strong>Apply</strong>. </li>
</ol>
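<p>The way the <strong>URL domain filter</strong> and <strong>Maximum crawling depth</strong> options
interact can be illustrated with a minimal sketch. It is not the search engine's actual
implementation. The function names, the in-memory link map, and the subdomain matching rule
are all assumptions made for illustration. The sketch walks links breadth-first, assigns the
starting URL depth 0, and does not follow links beyond the maximum depth or outside the filter domain.</p>
<pre>
```python
from collections import deque
from urllib.parse import urlparse


def in_domain(url, domain_filter):
    """Return True if the URL's host is the filter domain or a subdomain of it.

    Assumption: the real domain filter may match differently.
    """
    host = urlparse(url).hostname or ""
    return host == domain_filter or host.endswith("." + domain_filter)


def crawl(start_url, links, domain_filter, max_depth):
    """Breadth-first walk over a link map; returns {url: depth} for visited pages."""
    depths = {start_url: 0}  # the starting URL is at depth 0
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if depths[url] == max_depth:
            continue  # do not follow links beyond the maximum depth
        for target in links.get(url, []):
            if target in depths:
                continue  # already visited
            if not in_domain(target, domain_filter):
                continue  # outside the domain filter
            depths[target] = depths[url] + 1
            queue.append(target)
    return depths


# Hypothetical link map standing in for real pages and their links.
LINKS = {
    "http://www.ibm.com": ["http://www.ibm.com/a", "http://example.org/x"],
    "http://www.ibm.com/a": ["http://www.ibm.com/b"],
    "http://www.ibm.com/b": ["http://www.ibm.com/c"],
}

result = crawl("http://www.ibm.com", LINKS, "ibm.com", 1)
print(result)
```
</pre>
<p>With a maximum depth of 1 and a domain filter of <em>ibm.com</em>, the sketch visits only the
starting page and the in-domain page that it links to directly.</p>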
<p>Your new URL object can now be used when crawling remote web sites.</p>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rzaieparsearch.htm" title="This topic provides step-by-step tasks for the Webserver search engine.">Search tasks</a></div>
</div>
</div>
</body>
</html>