Ferret is a web scraping system. It aims to simplify data extraction from the web for UI testing, machine learning, analytics and more.
Ferret allows users to focus on the data. It abstracts away the technical details and complexity of underlying technologies using its own declarative language.
It is extremely portable, extensible and fast.
Read the introductory blog post about Ferret here!
What is this container?
This container deploys an instance that bundles an API with headless Chrome. You can query the API by sending FQL (Ferret Query Language) instructions in a POST request. The request is executed by Ferret and Chrome, and the result is returned in the response.
Show me some code
curl -d "{\"text\": \"LET doc = DOCUMENT('https://weareopensource.me', true) LET btn = ELEMENT(doc, '.nav-hobbies') CLICK(btn) WAIT_NAVIGATION(doc) FOR el IN ELEMENTS(doc, '.post-card-title') RETURN TRIM(el.innerText)\"}" -H "Content-Type: application/json" -X POST http://localhost:8080/
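The same request can also be sent from a script. Below is a minimal Python sketch using only the standard library. It assumes the container is published on localhost:8080 and that the body is a JSON object with a single "text" field, as in the curl command above. The names FERRET_API_URL, build_request and run_query are hypothetical helpers, not part of this project. The response format is not described here, so the sketch returns the raw body.

```python
import json
import urllib.request

# Assumption: the container is published on localhost:8080, as in the curl example.
FERRET_API_URL = "http://localhost:8080/"


def build_request(fql: str, url: str = FERRET_API_URL) -> urllib.request.Request:
    """Wrap an FQL query in the JSON body used by the curl example: {"text": "<query>"}."""
    body = json.dumps({"text": fql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(fql: str, url: str = FERRET_API_URL, timeout: float = 60.0) -> str:
    """Send the query and return the raw response body as text.

    The response format is not documented here, so no parsing is attempted.
    """
    with urllib.request.urlopen(build_request(fql, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (requires a running container):
#   query = (
#       "LET doc = DOCUMENT('https://weareopensource.me', true) "
#       "FOR el IN ELEMENTS(doc, '.post-card-title') "
#       "RETURN TRIM(el.innerText)"
#   )
#   print(run_query(query))
```

Letting json.dumps build the body avoids the manual quote escaping needed in the curl command. The generous default timeout is a guess, on the assumption that browser-driven queries may take a while.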
Installation
Docker Hub
docker run --rm -p 8080:8080 pierrebrisorgueil/ferretapi
Build
git clone && cd ferretApi
docker build -t ferretapi .
docker run --rm -p 8080:8080 ferretapi
Dev
Pierre