‘How to increase mining speed?’ was one of the questions our users asked most often. In previous versions, the main limitation was that when links had to be followed from the starting page to get the details of each listing, the miner took longer to scrape a page full of listings. This was because WebHarvy loaded the links sequentially, one after the other, to scrape data.
Instead of processing the links to be followed one after the other, the latest update of WebHarvy processes them in bulk, in parallel, using multiple mining threads. You can set the maximum number of parallel mining threads that WebHarvy uses in the Advanced Miner Options window.
Providing a higher value for the ‘Maximum number of parallel mining threads’ option in the Advanced Miner Options window will increase mining speed. However, running more threads in parallel requires more memory, processing power and internet bandwidth. We therefore recommend increasing this setting only as far as your system’s CPU, installed physical memory (RAM) and internet speed allow.
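The difference between sequential and parallel link processing can be sketched in a few lines of Python. This is only an illustration of the general idea, not WebHarvy's actual implementation: `scrape_detail`, `mine_sequential`, `mine_parallel` and the `max_threads` parameter are hypothetical names, and the page load is simulated with a short sleep rather than a real network request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_detail(link):
    """Hypothetical stand-in for loading one followed link (no real network access)."""
    time.sleep(0.05)  # simulate the time spent waiting for the page to load
    return f"details for {link}"


def mine_sequential(links):
    """Load the links one after the other."""
    return [scrape_detail(link) for link in links]


def mine_parallel(links, max_threads):
    """Load the links using at most max_threads worker threads."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map returns results in the same order as the input links
        return list(pool.map(scrape_detail, links))


links = [f"listing-{i}" for i in range(8)]

start = time.perf_counter()
sequential = mine_sequential(links)
sequential_time = time.perf_counter() - start

start = time.perf_counter()
parallel = mine_parallel(links, max_threads=4)
parallel_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Both functions return the same data; only the waiting time overlaps in the parallel version. Each extra thread in this sketch holds one more simulated page load in flight at the same time, which mirrors why a higher thread count needs more memory, CPU and bandwidth.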
Chrome Developer Tools
More Accurate Automatic Sub-Text Selection
To scrape only a portion of the text displayed in the Capture window, you can highlight the required portion with the mouse. We have improved the accuracy of this method, especially when the selected text lies between delimiter characters such as currency symbols, punctuation or special characters, new lines and spaces.
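For readers who want to see what extracting text between delimiters looks like in practice, here is a minimal Python sketch using regular expressions. It is not how WebHarvy performs sub-text selection internally; the helper `extract_between`, the sample text and both patterns are assumptions made for illustration.

```python
import re


def extract_between(text, pattern):
    """Return the first capture group of pattern in text, or None if there is no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


# Hypothetical captured text containing a currency symbol, spaces and a new line
captured = "Price: $1,249.99 (incl. tax)\nIn stock"

# The number that follows the currency symbol, up to the next space
price = extract_between(captured, r"\$([\d,]+(?:\.\d+)?)")

# Everything after the new line
availability = extract_between(captured, r"\n(.+)$")

print(price)
print(availability)
```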
Improvements And Bug Fixes
- The miner now scrolls the page before clicking on Load More links. This ensures that the ‘load more’ link is visible and loaded before the miner tries to click it.
- When text scaling in Windows was set to a value other than 100% (100% is the recommended setting on most systems), it was not possible to click and correctly select the required data items during configuration. This issue is fixed in this version. Data selection during configuration now works irrespective of text scaling.
- Fixed issue related to downloading images behind SSL.
- Fixed an issue where the miner window was not visible on multi-monitor systems after the monitor configuration changed.
- Earlier, the Capture window would become unresponsive for a second or two after applying a Regular Expression on HTML. This no longer happens.
- Added the browser zoom level and the number of parallel mining threads to the status bar of the configuration browser.
- Fixed an issue with loading and displaying the upgrade purchase page when the user’s license has expired.
- Disabled the ‘Mine all pages/Number of pages to mine’ controls while mining is in progress.
- Updated internal browser to a more recent version of Chromium.