Here are the common causes and their resolutions:
- Indexing Performance is set to Reduced - a common mistake on the configuration screen for the index service. Go to Central Administration > Operations > Services on Server > Office SharePoint Server Search Service Settings and set it to Maximum.
- Number of Connections - by default the indexer runs a limited number of simultaneous threads per host (usually 6). You can raise this manually by adding a specific Crawler Impact Rule for each host; setting a large file server up to 64 connections can really improve speed. Note that this number is only a suggestion to SharePoint, which also weighs other factors such as the number of processors (8 * #procs). Also watch your network for bottlenecks and check your logs for those pesky RPC errors - dial the setting back if you see them.
- Crawled systems are slow or hosted on remote networks - not a lot to be done here, other than moving those files closer to the indexer.
- Overlapping Crawls - SharePoint gives priority to the first running crawl, so if you are already indexing one system, the crawl of a second system will be held up and crawl times will increase.
- Solution: Schedule your crawl times so there is no overlap. Full crawls take the longest, so run those exclusively.
- IFilter Issues - the Adobe PDF IFilter can only filter one file at a time, which slows crawls down, and it has a high rejection rate for newer PDFs.
- Solution: Use a commercial PDF IFilter from pdflib.com or Foxit.
- Not enough Memory Allocated to the Filter Process - one aspect of the crawling process is that when the filter daemons (mssdmn.exe) use too much memory, they are automatically terminated and restarted. There is, of course, a wind-up time when this happens, which can slow down your crawling. The current default setting is pretty low (around 100 MB), so it is easy to trip when filtering large files. You can and should increase the memory allocation by adjusting the following registry keys:
- HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager: set DedicatedFilterProcessMemoryQuota = 200000000 (decimal)
- HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager: set FilterProcessMemoryQuota = 200000000 (decimal)
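For reference, the two quota changes above can be applied from an elevated command prompt with the built-in reg.exe (a sketch - back up the key first, and restart the Office SharePoint Server Search service afterward so mssdmn.exe picks up the new quotas):

```shell
:: Raise the filter-daemon memory quotas to ~200 MB (values are in bytes).
:: Run elevated; restart the search service afterward.
reg add "HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager" /v DedicatedFilterProcessMemoryQuota /t REG_DWORD /d 200000000 /f
reg add "HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager" /v FilterProcessMemoryQuota /t REG_DWORD /d 200000000 /f
```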
- Bad File Retries - there is a setting in the registry that controls the number of times a file is retried on error. The default is 100, which can severely slow down incremental crawls. The retry count can be adjusted with this key:
- HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager: set DeleteOnErrorInterval = 4 (decimal)
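Applied the same way, a sketch of the corresponding reg.exe command (run elevated, and restart the search service for the change to take effect):

```shell
:: Drop the bad-file retry count from the default of 100 down to 4.
reg add "HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager" /v DeleteOnErrorInterval /t REG_DWORD /d 4 /f
```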
- General Architecture Issues - ensure that you have at least 2 GB of free memory available before your crawl even starts, and that you have at least two physical processors available.
- Disk Health - the nature of the indexing process causes extensive fragmentation of the file system on both the index server and the database server. Schedule defrags routinely and after every full crawl, and always ensure you have enough free disk space.
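One way to make the routine defrag stick is a scheduled task using the built-in defrag.exe and schtasks.exe - a sketch only; the drive letter (D:) and the weekly 2 AM Sunday schedule are assumptions, so substitute the volume that actually holds your index and a window outside your crawl schedule:

```shell
:: Schedule a weekly off-hours defrag of the (assumed) index drive D:.
:: Runs as SYSTEM every Sunday at 2 AM; adjust to avoid your crawl windows.
schtasks /create /tn "Defrag Search Index Drive" /tr "defrag.exe D:" /sc weekly /d SUN /st 02:00 /ru SYSTEM
```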
- Run a 64-bit OS - the jury is still out on this one; I personally haven't seen much difference as long as there is enough memory and the processor types are the same, but Microsoft recommends it for large deployments.