WebHarvy Version 3.0 Released !

We are happy to announce the release of WebHarvy 3.0. We have added many new features in this major update, and its list of features and changes is the longest of any product update we have released to date. Here we go.

  • Added the following options in the Capture Window (grouped under ‘More Options’)
    • Capture following text: Improved by using brute force search for all elements in the page
    • Capture HTML: Option to scrape HTML of selected element
    • Capture Text as File: Option to scrape text and save it as a local file (useful while scraping articles and blog posts)
    • Click: Ability to scrape hidden (partially displayed) fields in webpages which require a click from the user to be displayed in full. For example phone numbers or email addresses which are displayed completely only if you click them.
    • Apply Regular Expression: Option to apply Regular Expressions (RegEx) on captured text. RegEx can be applied even after applying ‘Capture following text’, ‘Capture HTML’ & ‘Capture More Content’ options.
    • Capture More Content: Option to capture more text than the selected text by capturing the parent element’s text. For example, this captures the entire article if you apply the option after selecting the first paragraph.
  • Option to individually select categories/links (one by one) for Category Scraping (Mine menu – Scrape a list of similar links)
  • Export captured data as JSON
  • Ability to mine data from tables (row-column / grid layout)
  • Ability to mine pages which have few (fewer than 10) data items
  • Option to test proxies before using them (Edit menu – Settings – Proxy Settings)
  • Non-responsive proxies are skipped during mining. Mining does not stop because of a bad/non-responsive proxy in the list.
  • Option to manually add URLs to an existing configuration (Edit menu – Add URLs to configuration)
  • Option to remove duplicates while mining (Edit menu – Settings – Miner)
  • Added ‘Hourly’ frequency option in Scheduler (Mine menu – Scheduler)
  • Added option to export data directly to database for scheduled mining tasks & command line
  • Added ‘Clear’ option in Edit menu which will clear both the browser and data preview pane
  • Language encoding defaulted to ‘utf-8’ for file exports (XML, CSV etc)
  • CSV/Database export : handles delimiters (comma, quotes etc) in captured data
  • Keyword/Category scraping allowed for 2 entries in evaluation version
  • Rendering issues with in-built browser fixed – defaults to IE 9 rendering
  • New Installer built with InstallShield
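
Two of the items above, applying a regular expression to captured text and handling delimiters during CSV export, can be illustrated with a short Python sketch. This is not WebHarvy's code. The function names `apply_regex` and `rows_to_csv` and the sample text are assumptions made for illustration only.

```python
import csv
import io
import re


def apply_regex(captured_text, pattern):
    """Return the first capture group (or the whole match) found in the text, else ''."""
    match = re.search(pattern, captured_text)
    if match is None:
        return ""
    return match.group(1) if match.groups() else match.group(0)


def rows_to_csv(rows):
    """Serialise rows to CSV text.

    The csv module quotes any field that contains a comma or a quote,
    so delimiters inside captured data do not break the column layout.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical captured text: keep only the numeric part of the price.
price = apply_regex("Price: $1,299.00 (incl. tax)", r"\$([\d,]+\.\d{2})")

# Both data fields contain a delimiter (a quote and a comma respectively).
csv_text = rows_to_csv([["name", "price"], ['Monitor 27" HD', price]])
print(csv_text)
```

Reading `csv_text` back with `csv.reader` yields the original two fields per row, which is the property a delimiter-safe export needs.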

Download the latest installation of WebHarvy Web Scraper from https://www.webharvy.com/download.html.

USBTrace version 2.8 Released !

We’ve just released the latest update of USBTrace, the USB analyzer software for Windows. The changes in this update are :-

  • Added option to timestamp captured requests in ‘system’ time (HH:MM:SS:milliSeconds)
  • Added decoding of bConfigurationValue in Configuration Descriptor
  • Captured Data Export (HTML, CSV, XML) made faster
  • Added headers for HTML and XML export files
  • Updated USB device list for VID/PID decoding
  • Layout of Search/Filter/Trigger windows changed
  • More support for Windows 8 and USB 3.0 (SuperSpeed USB)
  • Minor bug fixes
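
As a small illustration of the ‘system’ time format mentioned in the first item (HH:MM:SS:milliSeconds), the Python sketch below formats a timestamp that way. It is not USBTrace code. The function name `format_system_time` and the zero padding of milliseconds to three digits are assumptions.

```python
from datetime import datetime


def format_system_time(moment):
    """Format a datetime as HH:MM:SS:milliseconds (milliseconds zero-padded, an assumption)."""
    millis = moment.microsecond // 1000
    return moment.strftime("%H:%M:%S") + ":" + format(millis, "03d")


# Example with a fixed, made-up capture time.
print(format_system_time(datetime(2013, 2, 20, 9, 5, 7, 123456)))
```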

You may download the 15-day free trial version of USBTrace USB Analyzer from http://www.sysnucleus.com/usbtrace_download.html.

To all customers of UniBlue’s software who have incorrectly installed our drivers

This post is for all customers of UniBlue’s driver update software who have incorrectly installed our software’s drivers.

First off, please note that the driver which has been incorrectly installed on your system (and which has rendered most of your USB devices unusable) is a software component used exclusively by our software, USBDeviceShare (http://www.sysnucleus.com/usbshare/).

We own this driver and have NOT authorised any third-party company (including UniBlue) to redistribute it. Our drivers are meant to be used only with our software, and to be installed only as part of its installation. The problem which you are currently facing is due to UniBlue’s software. We have contacted them regarding this and have requested them to remove our drivers from their updates. We are yet to receive a reply from them.

Update (Feb 20, 2013) : Uniblue Technical Support has informed us that they will investigate this issue and take immediate action.

Since our drivers are involved and since many of UniBlue’s customers are contacting us regarding a solution to this problem, we are listing below the steps to be followed to remove our drivers from your system. For more assistance we request you to contact UniBlue itself. We have our own customers to support and it is beyond us to attend to each of you, mainly because the problem was initiated by UniBlue.

The solution is to remove the USBDeviceShare driver completely from your system. For this you may need to plug a non-USB mouse/keyboard into your system, if the current ones are not working. Please follow the steps below :

1. Go to c:\windows\inf directory
2. Search for all files (*.*) containing the text ‘udsstub.sys’
3. Delete all files displayed in the search result
4. Delete the file ‘udsstub.sys’ from c:\windows\system32\drivers folder

5. Open RegEdit (Run ‘regedit’) – in administrator mode
6. Delete the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\udsstub (including subkeys)
7. Restart your system and see if the problem is solved.
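
If you prefer to locate the files for steps 1 to 3 with a script, the following Python sketch lists the files in a directory that mention ‘udsstub.sys’. It only reads files and deletes nothing. The function name `find_files_mentioning` is an assumption, and the check for both ASCII and UTF-16 text is a precaution we have added because INF files may be stored in either encoding.

```python
from pathlib import Path


def find_files_mentioning(directory, needle="udsstub.sys"):
    """Return the files directly inside `directory` whose contents mention `needle`.

    The comparison is case-insensitive and covers ASCII and UTF-16-LE text.
    Nothing is modified or deleted; unreadable files are skipped.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    variants = [needle.lower().encode("ascii"), needle.lower().encode("utf-16-le")]
    matches = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes().lower()  # lowercases ASCII letters only
        except OSError:
            continue
        if any(variant in data for variant in variants):
            matches.append(path)
    return matches
```

For example, `find_files_mentioning(r"C:\Windows\inf")` returns the candidate files from step 2. Review the list before deleting anything by hand.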

You may also try to update the device driver directly :-

1. Open device manager
2. Locate the node corresponding to non working devices
3. Right click and update driver
4. Select the ‘Browse for driver on my computer’ option and then the ‘Let me pick from a list of drivers on my computer’ option
5. Select the correct driver from the list displayed

We hope that the above instructions will help you remove our drivers. If they do not, please contact UniBlue support. If UniBlue does not remove our drivers from its updates, the same issue may happen again in the future.

Web Scraping from Command Line

WebHarvy supports command line arguments so that you can run the software directly from the command line. This allows you to run WebHarvy from script or batch files, or to invoke it via code from your own applications.

To know more, read : Running WebHarvy Web Scraper from Command Line
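
As a rough illustration of invoking a command line program from your own code, here is a minimal Python sketch. This post does not list WebHarvy's command line arguments, so the sketch leaves both the executable path and the argument list to the caller. The function name `run_scraper` and the timeout value are assumptions.

```python
import subprocess


def run_scraper(executable, arguments, timeout_seconds=3600):
    """Run a command line program and return (exit_code, stdout, stderr).

    `executable` and `arguments` are supplied by the caller; take the real
    argument names from the product documentation, not from this sketch.
    """
    completed = subprocess.run(
        [executable, *arguments],
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=False,  # report the exit code instead of raising on failure
    )
    return completed.returncode, completed.stdout, completed.stderr
```

Passing the arguments as a list (rather than a single string) avoids shell quoting problems when paths contain spaces.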

Schedule scraping tasks

WebHarvy comes with a built-in scheduler with which you can schedule your scraping tasks. The scheduler window can be opened from the Mine menu.

WebHarvy Scheduler

The scheduler enables you to run scraping tasks periodically – daily, weekly or monthly.

Know More about WebHarvy Scheduler

Download and try the free 15-day evaluation version of WebHarvy Web Data Extraction Software.

WebHarvy v2.0 Released !

The new features in the 2.0 update are :

  • Built-in scheduler for running scraping tasks – (know more)
  • Command Line Options – (know more)
  • MySQL Support for exporting scraped data – (know more)
  • Option to scrape sub text of selected text – (know more)
  • Updated proxy settings – (know more)
    • Supports proxies which require authentication
    • Supports importing proxies from CSV/Text files
  • Option to resume mining from where it stopped/aborted
  • Option to auto-save captured data at regular intervals – (know more)
  • Option to automatically inject pauses while mining (prevents IP blocking) – (know more)
  • Major improvements in mining
  • Minor changes
    • Number of pages & records mined are always displayed in Miner window’s status strip
    • Fixed bug related to capturing images where image text is empty
    • Updated capturing email addresses
    • Record numbers displayed inside captured data grid view in Miner window
    • Option to cancel preview generation for large index page data
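
The idea behind injecting pauses while mining can be sketched in a few lines of Python. This is a generic illustration and not WebHarvy's implementation. The function name `mine_with_pauses`, the `fetch` callback and the pause range are all assumptions.

```python
import random
import time


def mine_with_pauses(pages, fetch, min_pause=1.0, max_pause=3.0, sleep=time.sleep):
    """Fetch each page in turn, waiting a random interval between requests.

    A randomised gap makes the request pattern less regular, which lowers
    the chance of the target site blocking the client's IP address.
    `sleep` is injectable so the pause can be replaced in tests.
    """
    results = []
    for index, page in enumerate(pages):
        if index > 0:  # no pause before the very first request
            sleep(random.uniform(min_pause, max_pause))
        results.append(fetch(page))
    return results
```

A call such as `mine_with_pauses(urls, download_page, 2.0, 5.0)` (where `download_page` is your own function) waits between 2 and 5 seconds between consecutive pages.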

You may download the latest version of WebHarvy Web Scraper from http://www.webharvy.com/download.html.

Sniff HTTP from iPhone, iPad

HTTP traffic exchanged between browsers and web servers can be sniffed and analyzed using the HTTP Trace app for iOS.

The app captures and decodes the following HTTP data :

  1. HTTP Headers and Cookies
  2. URLs and Methods (GET, POST etc)
  3. Parameters sent in Query String and Post Data
  4. HTTP request and response body (content)
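
To show what decoding these four kinds of data involves, the Python sketch below splits a raw HTTP request into its parts using only the standard library. It is not the app's code. The function name `decode_request` and the sample request are assumptions, and the sketch handles only simple, well-formed requests with form-encoded post data.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def decode_request(raw):
    """Split a raw HTTP/1.x request into method, URL, headers, cookies and parameters."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    jar = SimpleCookie()
    jar.load(headers.get("cookie", ""))
    cookies = {name: morsel.value for name, morsel in jar.items()}

    # Post data is decoded only when the body is declared as form-encoded.
    is_form = headers.get("content-type", "").startswith("application/x-www-form-urlencoded")
    return {
        "method": method,
        "url": target,
        "headers": headers,
        "cookies": cookies,
        "query": parse_qs(urlsplit(target).query),
        "post": parse_qs(body) if is_form else {},
        "body": body,
    }


sample = (
    "POST /search?q=usb&page=2 HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Cookie: session=abc123; theme=dark\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "name=trace&mode=full"
)
decoded = decode_request(sample)
print(decoded["method"], decoded["url"], decoded["cookies"])
```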

App website :- http://www.sysnucleus.com/httptrace/index.html
