This article covers the full-text search capability of SQL Server 2000 (and 2005). It is an easy-to-use, fast, and extensible way to index and search the content of various document types, for example Word, Excel, Adobe Portable Document Format (PDF), and HTML files.
Microsoft SQL Server 2000 (and 2005) can provide full-text indexing and search services for most document types by default. All of the Microsoft Office document formats, such as Word and Excel, are among the supported, searchable types. Some text-like formats are supported as well, such as plain .txt files and HTML files. The feature can also be extended by installing additional indexing filters, which were originally used by the Windows Indexing Service. Filters of this kind are usually written by third parties, such as Adobe, which created one for PDF indexing.

Windows SharePoint Services 2.0 relies on SQL Server's full-text indexing capability, so its search functionality can also be extended by installing new filters on the SQL Server (additional configuration is required on the WSS side).

The issue we found with MHT (Web archive) files is quite interesting: SQL Server/WSS does not actually index those files, even though they are simply text files, so indexing should work without installing any additional filter.

Solution

A registry modification solves the problem. Navigate to the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ContentIndexCommon\Filters\Extension key, create a new key named ".mht" under it, and assign it the value {5645C8C2-E277-11CF-8FDA-00AA00A14F93}. After the change, the full-text indexes on the SQL Server must be rebuilt and repopulated.
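The registry change can also be captured in a .reg file, so that it can be reviewed and then imported on each affected server. The following is a minimal Python sketch that only generates the file's text; it does not touch the registry itself. It assumes the value from the article is stored as the default (unnamed) value of the new ".mht" key, and the names build_reg_file, FILTER_EXTENSION_KEY, and MHT_FILTER_CLSID are hypothetical helpers introduced for illustration.

```python
# Sketch: generate the text of a .reg file for the ".mht" filter mapping.
# Assumption: the CLSID is stored as the key's default (unnamed) value.

# Registry path and filter class identifier as given in the article.
FILTER_EXTENSION_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ContentIndexCommon\Filters\Extension"
)
MHT_FILTER_CLSID = "{5645C8C2-E277-11CF-8FDA-00AA00A14F93}"


def build_reg_file(extension: str, clsid: str) -> str:
    """Return .reg file text that maps a file extension to a filter CLSID."""
    lines = [
        "Windows Registry Editor Version 5.00",  # header for Windows 2000 and later
        "",
        f"[{FILTER_EXTENSION_KEY}\\{extension}]",  # the new subkey, e.g. "...\.mht"
        f'@="{clsid}"',  # "@" denotes the key's default value
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the file content; redirect it to a file such as mht-filter.reg.
    print(build_reg_file(".mht", MHT_FILTER_CLSID))
```

Saving the output to a file and inspecting it before importing keeps the change auditable. The full-text indexes still need to be rebuilt and repopulated afterwards, as described above.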