Document Settings (Content Crawler)

Note: The content crawler job does not itself perform the actions associated with the settings on this page (refreshing or deleting documents); the Document Refresh Agent does. Each time the Document Refresh job runs, it checks the settings for each document to determine whether any action is needed. Documents are therefore refreshed or deleted only as often as the Document Refresh job runs.
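As a rough illustration of the note above, the following Python sketch estimates when a due action would actually be carried out, given how often the Document Refresh job runs. It is not part of the portal: the function name, the fixed-interval schedule, and the dates are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def first_run_at_or_after(due: datetime, first_run: datetime,
                          interval: timedelta) -> datetime:
    """Return the first scheduled job run that is not earlier than `due`.

    Hypothetical model: the Document Refresh job runs at `first_run` and
    then repeats every `interval`.
    """
    if due <= first_run:
        return first_run
    elapsed = due - first_run
    # Ceiling division: number of whole intervals needed to reach `due`.
    runs = -(-elapsed // interval)
    return first_run + runs * interval


# A document that becomes due for deletion on 10 January, with a job that
# runs weekly at 02:00 starting 1 January, is handled at the next run.
print(first_run_at_or_after(datetime(2024, 1, 10),
                            datetime(2024, 1, 1, 2, 0),
                            timedelta(days=7)))
# prints 2024-01-15 02:00:00
```

With a daily job in the same scenario, the action would instead take place at 02:00 on 10 January.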

To set options that affect how the documents imported by this content crawler are refreshed or deleted:

  1. Under Document Expiration, specify whether these documents should be deleted after a specified period. Choose one of the following options:

  2. Under Link and Property Refresh, specify whether these documents and their associated properties should be periodically refreshed.

    1. Choose one of the following options:

    2. If you specified that documents should be refreshed, you can also specify whether the associated properties should be refreshed. By default, when a document is refreshed, the associated property values are also refreshed from the source document.

      To avoid updating the document properties, select Only confirm the validity of the links to these documents. With this option, the Document Refresh Agent only checks whether the source document still exists. If it exists, nothing happens. If it is missing, the settings you specify for how to handle broken links (described in step 3) are applied.

      This option is useful if you run the Document Refresh Agent every day and want broken links removed quickly; without it, running the Document Refresh Agent on an enterprise-scale portal can take more than a day.

  3. If you specified that documents should be refreshed, under Broken Links, specify what to do if source documents are missing upon refresh. Choose one of the following options:

  4. If you change the settings on this page after this content crawler has run and you want to apply these new settings to previously imported documents, select Apply these settings to existing documents created by this content crawler. These settings will be applied when you click Finish.
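The way the settings in steps 1 through 3 interact can be sketched as a small decision function. This is a minimal illustration only, not the Document Refresh Agent's actual implementation: every name below is hypothetical, and the sketch assumes that expiration is measured from the import time and is checked before refresh. The broken-link handling is left as an opaque outcome because it depends on the option chosen under Broken Links.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "nothing to do"
    DELETE = "delete document"
    REFRESH_PROPERTIES = "refresh properties from source"
    APPLY_BROKEN_LINK_SETTING = "apply Broken Links setting"


@dataclass
class DocumentSettings:
    # Hypothetical model of the settings on this page.
    expire_after: Optional[timedelta]   # None: documents never expire
    refresh_every: Optional[timedelta]  # None: documents are never refreshed
    links_only: bool = False            # only confirm validity of links


def decide(settings: DocumentSettings, imported_at: datetime,
           last_refreshed_at: datetime, source_exists: bool,
           now: datetime) -> Action:
    """Decide what one refresh pass would do for a single document."""
    # Assumption: expiration takes precedence over refresh.
    if (settings.expire_after is not None
            and now - imported_at >= settings.expire_after):
        return Action.DELETE
    # Not configured for refresh, or not yet due.
    if (settings.refresh_every is None
            or now - last_refreshed_at < settings.refresh_every):
        return Action.NONE
    # Refresh is due: a missing source triggers the Broken Links setting.
    if not source_exists:
        return Action.APPLY_BROKEN_LINK_SETTING
    # Source exists: links-only mode does nothing further.
    if settings.links_only:
        return Action.NONE
    return Action.REFRESH_PROPERTIES
```

For example, with `links_only=True`, a document whose source still exists yields `Action.NONE` even when a refresh is due, which matches the behavior described in step 2.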


To open this page:

  1. Click Administration.
  2. Open the Content Crawler Editor:
  3. On the left, under Edit Object Settings, click Document Settings.