Commit Graph

8 Commits

Author SHA1 Message Date
e34edf0b21
fix: keep media queries in ref styles
Previously, media queries weren't kept when downloading styles from ref
tags.

This has been fixed so that the media attribute is kept when creating style
tags from ref tags.
2021-05-22 04:41:08 +02:00
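The behaviour described in e34edf0b21 can be illustrated with a minimal sketch. It assumes the downloaded CSS and the optional `media` value of the originating tag are already available as strings; the function name `inline_style_tag` is hypothetical and not taken from the repository.

```rust
/// Builds an inline `<style>` tag from downloaded CSS.
///
/// Hypothetical helper: `media` is the value of the `media` attribute found
/// on the originating tag, if any. It is carried over so that media queries
/// keep applying once the stylesheet is inlined.
/// Note: this sketch does not escape the attribute value.
fn inline_style_tag(media: Option<&str>, css: &str) -> String {
    match media {
        // Keep the media query on the generated tag.
        Some(media) => format!("<style media=\"{}\">{}</style>", media, css),
        // No media attribute on the source tag: emit a plain style tag.
        None => format!("<style>{}</style>", css),
    }
}
```

With this sketch, a print-only stylesheet stays print-only after inlining, while a stylesheet without a `media` attribute becomes a plain `<style>` tag.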
40ebc1ddea
feat: allow injecting styles
2021-05-22 04:41:08 +02:00
6e091a32fc
chore: use a config struct for self_contained_html
Previously, self_html_function was a function taking all parameters as
arguments.
As new optional parameters are being added, the function had too many
arguments, and every call site would have to be modified each
time an argument was added.

Therefore, it has been moved to a configuration structure with a `run`
function taking only one argument, the html string.
2021-05-22 04:41:08 +02:00
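The refactoring described in 6e091a32fc might look roughly like the sketch below. The struct name `Config`, the field `styles_to_add`, and the body of `run` are assumptions made for illustration; only the idea of a configuration structure with a `run` function taking the html string comes from the commit message.

```rust
/// Hypothetical configuration structure; the field name is illustrative.
/// Deriving `Default` lets further optional fields be added later without
/// touching call sites that rely on `Config::default()`.
#[derive(Default)]
struct Config {
    /// Extra CSS snippets to inject into the document.
    styles_to_add: Vec<String>,
}

impl Config {
    /// Takes only the html string; every other parameter lives in `self`.
    /// In this sketch, `run` only injects the configured styles before
    /// `</head>`, or prepends them when the document has no head.
    fn run(&self, html: &str) -> String {
        let styles: String = self
            .styles_to_add
            .iter()
            .map(|css| format!("<style>{}</style>", css))
            .collect();
        match html.find("</head>") {
            Some(pos) => format!("{}{}{}", &html[..pos], styles, &html[pos..]),
            None => format!("{}{}", styles, html),
        }
    }
}
```

A caller would build the structure once, for example `Config { styles_to_add: vec!["p{color:red}".to_string()] }`, and then call `run(html)` for each document. Adding an option then changes the struct definition rather than every call.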
5d0872b4d9
feat: add retrieval from courrier international
Retrieval of articles from Courrier international has been added.
2021-05-22 04:41:08 +02:00
cee0af6c3c
fix: only select images that have non-data src
Previously, when an image url contained data, the code tried to parse it as
a url and failed, instead of keeping the data.

This has been fixed so that images whose url starts with 'data' are
not modified.
2021-05-22 04:41:08 +02:00
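The check described in cee0af6c3c could be sketched as follows. The function name `needs_download` is hypothetical, and matching on the `data:` scheme prefix (case-insensitively, ignoring leading whitespace) is an assumption; the commit message only says that urls starting with 'data' are left untouched.

```rust
/// Returns true when an image `src` should be fetched and rewritten.
///
/// Hypothetical helper: sources that already embed their content through
/// the `data:` scheme are skipped, so they are never parsed as a
/// remote url.
fn needs_download(src: &str) -> bool {
    !src.trim_start()
        .to_ascii_lowercase()
        .starts_with("data:")
}
```

An image selection step could then filter on this predicate and keep every `data:` source exactly as it appears in the document.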
970f510cd1
feat: add retrieval from le monde diplomatique
Add retrieval from le monde diplomatique

Previously, 404 pages were injected into the document when downloading
styles.
Now, the downloader returns None when documents are not found.
2021-05-22 04:41:01 +02:00
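The downloader change mentioned in 970f510cd1 can be sketched as below. The `Response` type and both function names are hypothetical stand-ins, since the commit message does not name the HTTP client or the downloader's signature. Only the "return None when the document is not found" behaviour comes from the message.

```rust
/// Hypothetical stand-in for an HTTP response.
struct Response {
    status: u16,
    body: String,
}

/// Returns the document body, or `None` when the server answered 404,
/// so that an error page is never mistaken for a stylesheet.
fn document_from_response(response: Response) -> Option<String> {
    if response.status == 404 {
        None
    } else {
        Some(response.body)
    }
}

/// Builds style tags from the responses, silently skipping missing documents.
fn style_tags(responses: Vec<Response>) -> Vec<String> {
    responses
        .into_iter()
        .filter_map(document_from_response)
        .map(|css| format!("<style>{}</style>", css))
        .collect()
}
```

Returning an `Option` leaves the decision to the caller. Here missing stylesheets are dropped, but a caller could equally log them or keep the original tag.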
756b1592b7
feat: allow removing elements of html pages
A feature to remove elements of html pages based on css selectors has
been added.

The removal of link elements that load external js has been added.
2021-04-24 03:45:13 +02:00
c4ab210c4d
feat: add retrieval application and one newspaper
A first example and some documentation have been added.

The first example builds an article location and downloads the article as
an html String.

The documentation explains how the application has been designed, what its
goal is, and its intended architecture.
2021-04-23 22:12:02 +02:00