JavaScript Capable Webpage Download Tool?

Karl Semich 0xloem at gmail.com
Sun Sep 5 02:49:12 PDT 2021


Thinking about different tools.  Content scraping is also missing, in
addition to spidering.

What kind of tool would be helpful here?  Assuming that we need to build it.

Maybe a local HTTP server that provides access to a Selenium session as if
it were non-JS HTML?
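
A minimal sketch of that idea, under stated assumptions: the "?url="
query parameter, the make_handler / selenium_renderer / serve names, and
port 8000 are all made up for illustration, and Selenium with a headless
Firefox is only one possible backend.  The server is single-threaded on
purpose, since one WebDriver session should not be shared between threads.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


def make_handler(render):
    """Build a request handler around render(url) -> HTML string."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Hypothetical convention: the page to fetch is passed as ?url=...
            query = parse_qs(urlparse(self.path).query)
            target = query.get("url", [None])[0]
            if target is None:
                self.send_error(400, "missing ?url= parameter")
                return
            body = render(target).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            # Keep the console quiet.
            pass

    return ProxyHandler


def selenium_renderer():
    """Return a render(url) function backed by one headless Firefox session."""
    # Third-party dependency, imported lazily so the rest works without it.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)

    def render(url):
        driver.get(url)
        # page_source is the DOM serialized after scripts have run so far.
        return driver.page_source

    return render


def serve(port=8000):
    """Serve rendered pages on localhost until interrupted."""
    server = HTTPServer(("127.0.0.1", port), make_handler(selenium_renderer()))
    server.serve_forever()
```

With serve() running, a non-JS client such as wget or lynx could request
http://127.0.0.1:8000/?url=https://example.com/ and get back the rendered
markup.  Pages that load content late would probably need an explicit wait
before page_source is read; that is left out here.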

Elements with mouse events could be converted to links.
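
One rough way to sketch that conversion, assuming only inline handler
attributes (onclick and friends) are considered; listeners attached with
addEventListener do not appear in the markup and would need a different
approach.  The /click?id=N route is hypothetical: a real proxy would have
to map each id back to the live element and click it in the browser
session.

```python
from html import escape
from html.parser import HTMLParser

# Inline mouse handlers treated as "clickable" (an illustrative subset).
MOUSE_ATTRS = {"onclick", "ondblclick", "onmousedown", "onmouseup"}
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def format_attr(name, value):
    """Serialize one attribute; valueless attributes get an empty value."""
    return ' %s="%s"' % (name, escape(value or ""))


class ClickToLink(HTMLParser):
    """Rewrite elements that carry mouse handlers into plain <a> links."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # output fragments
        self.stack = []    # (tag, was_converted) for each open element
        self.counter = 0   # id handed out to each converted element

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if names & MOUSE_ATTRS and tag not in VOID_TAGS:
            self.counter += 1
            # Hypothetical route the proxy would translate into a click.
            self.out.append('<a href="/click?id=%d">' % self.counter)
            self.stack.append((tag, True))
            return
        # Drop every on* attribute, keep the rest.
        kept = "".join(format_attr(k, v) for k, v in attrs
                       if not k.startswith("on"))
        self.out.append("<%s%s>" % (tag, kept))
        if tag not in VOID_TAGS:
            self.stack.append((tag, False))

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                converted = self.stack[i][1]
                del self.stack[i:]
                self.out.append("</a>" if converted else "</%s>" % tag)
                return

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def convert(html_text):
    """Return html_text with mouse-handler elements turned into links."""
    parser = ClickToLink()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, convert('<div onclick="buy(1)">Buy</div>') gives
'<a href="/click?id=1">Buy</a>'.  Comments, doctype declarations, and
script contents are not handled specially in this sketch.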

My interest in this missing area relates to ecommerce scraping.  I'd like
for people to be able to work around the product search engines that use
marketing algorithms to order all their results.  I'd like for it to be
easy for resellers to seed the inventory of decentralized marketplaces from
existing platforms.

The larger corps have been winning the scraping tech race recently, I
believe.  PhantomJS, which itself was another cultural polyfill for an
ongoing situation, had a number of libraries and tools, but Selenium
WebDriver has now replaced it.

A web search for "phantomjs OR casperjs spider" turns up some hits.
"selenium spider" appears to as well.  For example:
https://scalpel.readthedocs.io/en/latest/selenium-spider/