Changelog: DocFetcher Pro

Note: How to upgrade to the latest release is explained on the DocFetcher Pro FAQ.

DocFetcher Pro 1.18 – 2023-08-06

  • While indexing certain very large files without file extension, the program would try to determine the file’s text encoding, but run out of memory and crash.
  • Fixed an indexing crash that occurred when trying to index files whose paths were longer than 32 KB. This could happen when the files resided in infinitely deep folder structures.
  • There were some PDF annotations that were indexed, but not shown in the preview pane.
  • PDF annotations on empty pages were not indexed.
  • RTF bodies of Outlook emails are now indexed. Previously, they were ignored.
  • When the “message delivery time” attribute of an Outlook email is not available, the program now falls back to indexing other attributes such as “client submit time”.
  • Fixed a “VCRUNTIME140.dll was not found” error that occurred with some of the executables the application ships with.
  • In the macOS version of DocFetcher Pro, the Readme.txt file was not included in the disk image.

DocFetcher Pro 1.17 – 2022-12-22

  • Features:
    • Indexing settings: Added an optional file size limit. This helps prevent crashes caused by the application running out of memory during indexing.
    • Preview pane: The context menu now has a “Copy With Path” entry. Clicking this entry will copy the file path of the currently shown file and the currently selected text to the clipboard.
    • Advanced setting: Added a setting HtmlEncodingOverride in the program-conf.txt file to allow forcing the application to use a particular encoding for indexing HTML files.
  • Bugfixes:
    • Indexing:
      • The program crashed when the user selected “Create Index From > Archive” or “Create Index From > Outlook PST/OST” in the Search Scope context menu, but then chose a non-container file such as PDF. Similarly, pasting such files into the Search Scope pane also crashed the program.
      • Fixed a crash related to the indexing of bzip2 archives.
      • Fixed a crash related to the indexing of PST files.
      • On computers with Arabic locale, it was impossible to create indexes due to a crash involving Arabic numbers.
    • Search Scope pane:
      • Trying to rename an index via the Search Scope context menu while the index was being rebuilt could crash the application.
      • In the Search Scope pane, the logic to enable and disable context menu entries was broken by the performance optimizations introduced in DocFetcher Pro 1.12.
    • Preview pane:
      • The page number field and the occurrence field had no tooltip.
      • If the occurrence field lost focus without the user pressing Enter beforehand, it no longer reacted to occurrence jumps.

DocFetcher Pro 1.16 – 2022-03-15

  • Fixed a crash on certain special placeholder RAR files used by Dropbox.
  • On the result table, the date format in the last-modified column is now set to “yyyy-MM-dd, HH:mm”, independent of the system locale.
  • Minor change: When the program fails to load some indexes at startup, it reports them via an info dialog. However, these indexes were presented in a counter-intuitive order.
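
Illustration of the fixed date format mentioned above: a minimal Python sketch, not DocFetcher Pro's actual code. It assumes that the pattern “yyyy-MM-dd, HH:mm” maps to the strftime directives shown below. The name format_last_modified is hypothetical.

```python
from datetime import datetime

# Assumed Python equivalent of the pattern "yyyy-MM-dd, HH:mm".
# All directives used here are purely numeric, so the output does not
# depend on the system locale.
LAST_MODIFIED_FORMAT = "%Y-%m-%d, %H:%M"


def format_last_modified(timestamp: datetime) -> str:
    """Format a last-modified timestamp in a locale-independent way."""
    return timestamp.strftime(LAST_MODIFIED_FORMAT)


print(format_last_modified(datetime(2022, 3, 15, 9, 5)))  # 2022-03-15, 09:05
```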

DocFetcher Pro 1.15 – 2021-11-09

  • tar.gz and related archive formats may contain a “.” directory at the top level, which could cause DocFetcher Pro to crash. Such archives can be created on Linux with a command like this: tar -czf test.tar.gz .
  • Fixed a NullPointerException crash that occurred during the indexing of some 7z archives.
  • Sometimes, when loading a file in the preview pane failed, a blank red bar was shown beneath the preview pane. Now a proper error message is shown there instead.
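
Illustration of the “.” directory case mentioned above: a minimal Python sketch that builds such an archive in memory and lists its entries while skipping the “.” entry. This is only one possible way to handle such archives, not necessarily how DocFetcher Pro does it. The function names and the sample file name notes.txt are hypothetical.

```python
import io
import posixpath
import tarfile


def build_sample_archive() -> bytes:
    """Build a small tar.gz whose first entry is a top-level "." directory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        root = tarfile.TarInfo(".")
        root.type = tarfile.DIRTYPE
        tar.addfile(root)

        data = b"hello"
        member = tarfile.TarInfo("./notes.txt")
        member.size = len(data)
        tar.addfile(member, io.BytesIO(data))
    return buffer.getvalue()


def list_entries(archive: bytes) -> list:
    """Return normalized entry names, ignoring the "." directory itself."""
    names = []
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar:
            name = posixpath.normpath(member.name)  # "./notes.txt" -> "notes.txt"
            if name == ".":
                continue  # the archive root carries no content of its own
            names.append(name)
    return names


print(list_entries(build_sample_archive()))  # ['notes.txt']
```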

DocFetcher Pro 1.14 – 2021-07-29

  • Bugfixes:
    • Fixed an ArrayIndexOutOfBoundsException crash that occurred when indexing certain tar.bz2 archives.
    • Windows: If portable DocFetcher Pro was located on a shared drive, there was a loophole that allowed multiple instances of the program to be run simultaneously by different users. This could lead to abnormal program behavior.
    • macOS: Fixed a NullPointerException crash related to the Search Scope pane.
  • Indexing-related changes:
    • Added a new file size limit setting on the indexing dialog that excludes from indexing any file that has no file extension and exceeds a certain size. This helps prevent crashes that occur when the program tries to index huge data dump files without file extension, such as “Trash-1”.
    • In case of a fatal indexing crash, the program now reports the path of the last file it worked on. This helps locate the file that likely caused the crash.
    • During indexing, the list of files being indexed now has a context menu with “Copy Selection” and “Copy All” entries. The “Copy Selection” entry can be invoked with Ctrl + C (Windows, Linux) or ⌘C (macOS).
    • On Windows, a warning message is shown if the user tries to index an entire drive (such as “C:\”) with index auto-updating enabled. Doing so can greatly slow down your computer.
    • During the parsing of text files, C1 control characters are now filtered out as well, not just C0 control characters as before.
    • Formerly, there was a limit on how small the indexing dialog could be made (i.e., there was a minimum window size).
    • Replaced the “jar with document” button in the top-right corner of the indexing dialog with a “cogwheels” button, in order to make clearer what the button is for: Saving and loading indexing settings.
  • Other changes:
    • Previously, indexes that could not be loaded due to a broken tree.xml file in them were silently ignored. Now such index loading failures will be reported at startup. This helps you with identifying and removing broken indexes that are uselessly occupying disk space.
    • Result table: For emails, the value in the size column now includes not only the email body size, but also the size of any attachments. Note that you must rebuild your indexes before this new size value is displayed.
    • Search Scope pane: If you right-click an index and select “Inspect Index Files”, you will now find an info.txt file in there which contains assorted system info associated with the index.
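
Illustration of the control character filtering mentioned above: a minimal Python sketch, not DocFetcher Pro's actual code. C0 control characters occupy U+0000 to U+001F and C1 control characters occupy U+0080 to U+009F. Keeping tab, line feed and carriage return, and dropping DEL (U+007F), are assumptions made for this example. The name strip_control_chars is hypothetical.

```python
# Assumption: ordinary whitespace controls are kept so that the text layout survives.
KEPT_CONTROLS = {"\t", "\n", "\r"}


def strip_control_chars(text: str) -> str:
    """Remove C0 and C1 control characters, keeping tab, LF and CR."""
    kept = []
    for ch in text:
        code_point = ord(ch)
        # DEL (U+007F) is not formally part of C0, but is treated the same here.
        is_c0 = code_point <= 0x1F or code_point == 0x7F
        is_c1 = 0x80 <= code_point <= 0x9F
        if ch in KEPT_CONTROLS or not (is_c0 or is_c1):
            kept.append(ch)
    return "".join(kept)


# U+0000 is a C0 character, U+0085 (NEL) is a C1 character.
print(repr(strip_control_chars("a\x00b\x85c\td\n")))  # 'abc\td\n'
```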

DocFetcher Pro 1.13 – 2021-05-31

  • On all platforms, the bundled Java runtime was downgraded from Java 16 to Java 11 due to some reports of stability issues.
  • After indexing millions of files, an ArrayIndexOutOfBoundsException crash could occur.
  • Excel files containing unsupported or invalid formulas could not be read. Now the program will fail on such formulas, but still read the rest of the file.
  • On Windows, the program could crash during indexing if its application data folder was replaced with an NTFS junction or NTFS symlink.
  • On Windows, the program will now by default follow NTFS junctions and NTFS symlinks during indexing, but nevertheless be smart enough to avoid getting stuck in circular folder structures. Previously, NTFS junctions and NTFS symlinks were ignored by default due to the risk of getting stuck.
  • On macOS, after the crash dialog was shown, the program’s taskbar icon changed permanently to a yellow warning triangle.
  • On macOS, an ArrayIndexOutOfBoundsException crash could occur during some index-related operations.
  • On macOS, a NullPointerException crash occurred when removing an index.
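
Illustration of following links without getting stuck in circular folder structures, as mentioned above: a minimal Python sketch of one common technique, namely remembering the resolved path of every folder already visited. Whether DocFetcher Pro uses this approach is not stated here. The name walk_without_cycles is hypothetical.

```python
import os


def walk_without_cycles(root: str):
    """Yield file paths under root, following directory links at most once."""
    visited = set()  # resolved paths of folders that were already scanned
    stack = [root]
    while stack:
        current = stack.pop()
        resolved = os.path.realpath(current)
        if resolved in visited:
            continue  # reached again through a link: skip to avoid a cycle
        visited.add(resolved)
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=True):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=True):
                    yield entry.path
```

A folder that contains a link pointing back to one of its ancestors is scanned once. The link target resolves to a path that is already in the visited set, so the traversal stops there.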

DocFetcher Pro 1.12 – 2021-05-25

  • Dramatic performance improvements with respect to the handling of large indexes. More specifically, indexes are now loaded much faster at startup, writing indexes to disk is faster, and checking and unchecking folders in the Search Scope pane is faster. For example, in one benchmark with an index comprising 2.8 million files, loading time was cut down from 63.048 s to 8.726 s, which is a reduction down to 14%. These performance improvements were achieved partly by numerous code optimizations, and partly by upgrading the bundled Java runtime from Java 8 to Java 16.
  • Index updates now finish faster due to less aggressive index optimization at the end of the indexing process (known as “merging index segments”).
  • 7z archive support: On Windows, trying to index 7z archives resulted in an AssertionError crash.
  • Fixed an indexing crash on uncompressed tar files (i.e., files ending with “.tar” instead of something like “.tar.gz”).
  • Some library upgrades that may improve PDF and JPEG indexing (POI 4.1.2 → 5.0.0, metadata-extractor 2.14.0 → 2.16.0).
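
Worked arithmetic for the benchmark figures quoted above, as a short Python sketch. The variable names are chosen for this example only.

```python
before_s = 63.048  # loading time before the optimizations, in seconds
after_s = 8.726    # loading time after the optimizations, in seconds

remaining = after_s / before_s  # fraction of the original loading time
speedup = before_s / after_s    # how many times faster loading became

print(f"{remaining:.1%}")  # 13.8%
print(f"{speedup:.1f}x")   # 7.2x
```

Rounded to whole percent, the remaining loading time is the 14% stated above, which corresponds to a speedup of roughly 7x.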

DocFetcher Pro 1.11 – 2021-05-14

  • Added 7z v0.4 archive support. This means DocFetcher Pro can now read 7z archives produced by the latest versions of 7-Zip. Previous versions of DocFetcher Pro and the current version of the free DocFetcher can only read 7z archives up to v0.3.
  • Bugfix: 7z- or rar-compressed files could not be shown in the preview pane, unless the file was directly in the root of the archive.
  • Fixed a NullPointerException crash during indexing of Outlook email attachments.
  • Added a --disable-index-auto-update command-line argument. Launching the program with this argument will disable automatic index updating on all existing indexes. When this is helpful: If you index one or more extremely large folders and these folders are being frequently modified in the background, DocFetcher Pro will get stuck in a cycle of continuous index updating and thereby slow down to a crawl. In this state, it will be difficult if not impossible to turn off automatic index updating manually, so that’s why the aforementioned command-line argument was added.
  • Automatic index updating now ignores file modification events from the entire app data folder, not just from the indexes folder in the app data folder. This means if you index the folder in which DocFetcher Pro stores its settings, any changes there will no longer trigger index updates. Thus, DocFetcher Pro can no longer self-trigger index updates.
  • Improved crash handling: If showing the crash dialog fails for any reason, the program will now fall back to reporting the crash via an HTML file.
  • Minor bugfix: The tutorial hint overlay didn’t relocate correctly when the main window was maximized or unmaximized.

DocFetcher Pro 1.10 – 2021-04-12

  • Search-related bugfix: With type-ahead search enabled, phrase search and proximity search worked incorrectly when combined with wildcards. For instance, “dog cat”* was equivalent to a match-all query that caused all words in the preview pane to be highlighted.
  • After line numbers in the preview pane were introduced in the previous 1.9 release, some users experienced performance issues and a “No more handles” crash when trying to preview very large files. This release adds a performance optimization to fix these problems.

DocFetcher Pro 1.9 – 2021-03-29

  • The preview pane now shows line numbers when displaying plain text files, such as source code. This feature was ported from a code contribution to the upcoming DocFetcher 1.1.23 release.
  • Now after each index update, the internal index files are run through an optimization process called “merging index segments”. This improves search performance and reduces disk usage, but the optimization process itself requires extra processing time and temporary additional disk space.
  • Fixed an IOException crash with “File name too long” message due to overlong file names. This time it was with Chinese characters such as “明”, which take up 3x as much space as ASCII characters.
  • Minor fix: On macOS, the context menu of the preview pane showed the incorrect keyboard shortcut “Ctrl+C” instead of the correct “⌘C”.
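
Illustration of the “3x as much space” remark above: a minimal Python sketch that measures file names in UTF-8 bytes rather than characters. The 255-byte limit is an assumption for this example (a typical per-name limit on Linux filesystems such as ext4). The limit that actually applied in the reported crash is not stated here. The function names are hypothetical.

```python
# Assumption: a per-name limit of 255 bytes, as on many Linux filesystems.
NAME_MAX_BYTES = 255


def utf8_length(name: str) -> int:
    """Return the number of bytes the name occupies when encoded as UTF-8."""
    return len(name.encode("utf-8"))


def fits_name_limit(name: str) -> bool:
    """Check the name against the assumed byte limit."""
    return utf8_length(name) <= NAME_MAX_BYTES


print(utf8_length("a"))             # 1
print(utf8_length("明"))            # 3
print(fits_name_limit("a" * 100))   # True  (100 bytes)
print(fits_name_limit("明" * 100))  # False (300 bytes)
```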

DocFetcher Pro 1.8 – 2021-03-18

  • First macOS version of DocFetcher Pro.
  • Critical bugfix: Outlook email bodies were no longer indexed, and also no longer displayed in the preview pane. This bug was caused by an earlier bugfix.
  • Fixed a crash that occurred after dismissing the tutorial hint overlay and then moving or resizing the main window.
  • Added default exclusion rules on the indexing dialog for various common system files, such as “desktop.ini” and “Thumbs.db”.

DocFetcher Pro 1.7 – 2021-02-27

  • A proper user manual for DocFetcher Pro is now included, replacing the user manual copied from the free DocFetcher.
  • Fixed a NullPointerException crash that occurred during the indexing of certain tar archives.
  • Fixed an IOException crash during indexing. The crash came with a “File name too long” error message, and occurred when the program encountered files with very long file extensions (e.g., “filename.aaaaaa…”).
  • Fixed a NoSuchElementException crash that occurred during search when a file with a linebreak in its name was included in the search results.

DocFetcher Pro 1.6 – 2021-02-16

  • The DocFetcher Pro daemon is now fully functional.
  • During index updates, HTML files with associated folders (e.g., foo.html and foo_files) were always incorrectly identified as new, and were therefore always reindexed.
  • Fixed an IOException crash during indexing. The crash came with a “File name too long” error message, and occurred, as the error message indicates, when the program encountered files with very long filenames.

DocFetcher Pro 1.5 – 2021-02-08

  • After indexing files with invalid characters in the filename (e.g., docum�nt.txt) and then restarting, the program silently failed to load the new index. The invalid characters are typically a result of filename encoding issues.
  • If the program silently fails to load an index, it will now at least write some error information to an “error.txt” file amid the index files.
  • Fixed a PSTException crash during Outlook indexing.
  • Type-ahead search suffered from slowdown and excessive memory usage (up to 8-9 GB compared to at most 1-2 GB at present).
  • With type-ahead search enabled, results from an earlier search could displace the current results.

DocFetcher Pro 1.4 – 2021-02-02

  • The default value for the program’s memory limit was increased from 2 GB to 16 GB. If your computer has less than 16 GB of RAM, there’s no need to lower the memory limit; the program will start anyway.
  • The program could crash when its index files somehow got corrupted. Now it will only display an error message instead.
  • The index success sound is no longer played in the following cases: (1) Index updates from the command line via the “--update-indexes” parameter. (2) Index updates triggered by file modification events.
  • In the preview pane, the highlighting of matches was sometimes missing. This happened only when selecting a Word Segmentation algorithm other than “Standard” in the preferences.
  • The program version is now displayed at the bottom of the preferences dialog. (Previously, the only way to determine the program version from within the program was to open the program manual.)

DocFetcher Pro 1.3 – 2021-01-25

  • Fixed an IndexNotFoundException crash caused by a broken index.
  • Fixed a PSTException crash during indexing of Outlook files.

DocFetcher Pro 1.2 – 2021-01-18

  • Indexing of RTF files didn’t work.
  • Fixed a crash that occurred when the program tried to play a success sound after indexing, but no audio device was available on the machine.
  • Fixed a ClassCastException crash that occurred when some other crash occurred during indexing and the user then shut down the program.
  • Sorting the search results by last-modification time didn’t work.

DocFetcher Pro 1.1 – 2021-01-13

  • The default value for the program’s memory limit was increased from 1 GB to 2 GB.
  • On Windows, the highlighting of matches in the preview pane did not work correctly with multi-page documents (e.g., PDF files). Starting with the second page, highlighting ranges were shifted downstream. It was also possible to crash the program by viewing certain pages.
  • The indexing process could get stuck on certain mobi files. Now there’s a timeout of 10 seconds for mobi indexing, after which the program gives up and moves on to the next file. You can change the timeout in the Advanced Settings file, via the entry MobiExtractionTimeout.
  • Sometimes it was possible to click on the close button of the indexing dialog twice, causing the program to crash.
  • Fixed an IndexOutOfBoundsException crash occurring during Outlook indexing.

DocFetcher Pro 1.0 – 2021-01-09

First release.