AutoWeber

AutoWeber is a proof of concept for automatically extracting the structure of a webpage for web scraping purposes.

The program requires the user to provide a file or URL, along with data that the program uses as a reference. From there, AutoWeber determines the most common structure and generates a JSON representation of that structure for further processing. The program currently only writes the structure to a JSON file. However, it can be adapted for other purposes, such as generating web scraping code.
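
The idea of finding the most common structure from reference data can be illustrated with a minimal sketch. This is not AutoWeber's actual code: the names `PathCollector` and `most_common_structure`, the sample HTML, and the shape of the JSON output are all hypothetical. It also uses Python's standard-library `html.parser` instead of BeautifulSoup 4 so that it runs without extra dependencies. The sketch records the tag path leading to each piece of reference data, counts the paths, and reports the most frequent one as JSON.

```python
import json
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Hypothetical helper: counts the tag path of every reference value found."""

    def __init__(self, references):
        super().__init__()
        self.references = set(references)
        self.stack = []          # currently open tags, outermost first
        self.paths = Counter()   # tag path -> number of reference matches

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # A text node counts as a match when it equals a reference value exactly.
        if data.strip() in self.references:
            self.paths["/".join(self.stack)] += 1


def most_common_structure(html, references):
    """Return the most frequent tag path containing reference data, or None."""
    collector = PathCollector(references)
    collector.feed(html)
    collector.close()
    if not collector.paths:
        return None
    path, count = collector.paths.most_common(1)[0]
    return {"path": path.split("/"), "matches": count}


sample_html = """
<html><body>
  <ul>
    <li><span>Apple</span></li>
    <li><span>Banana</span></li>
  </ul>
  <p>Apple</p>
</body></html>
"""

structure = most_common_structure(sample_html, ["Apple", "Banana"])
print(json.dumps(structure, indent=2))
```

In this sample, two of the three matches sit under the same list-item path, so that path is chosen over the lone paragraph match. A real implementation would likely also compare attributes such as classes and ids, which this sketch ignores.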

Programming Languages

The program is written in Python.

Frameworks

The following frameworks are used for this program:

  • BeautifulSoup 4

References

Here’s the link to this project on GitHub.

I also mention this project in the following posts: