crieur/documentation/design/scope.md
koalp 6e091a32fc
2021-05-22 04:41:08 +02:00

Scope of the project

This project mainly aims to provide a unified interface for several newspapers. Secondary objectives are to provide a web API and several clients, such as a web UI or chatbots.

Several major components are planned for this project (this is an initial draft and may change later):

@startuml

frame "backend" {
    [Retrieval tools] as retrieval_tools
    [Article representation] as article_repr
    [Automatic retrieval] as auto_retrieve
    [Atom/RSS adapters] as rss
    [Cache DB] as cache

    [Newspaper\n(Mediapart, …)] as newspaper
    () "Newspaper" as np_i
    newspaper -up- np_i


    [Article location] as article_location

    [API] as api
    () "API" as api_i
    api -up- api_i

    article_location ..> np_i

    api -> article_location
    api -> rss

    newspaper -> retrieval_tools: uses to implement

    article_location --> article_repr: uses
    retrieval_tools -up-> article_repr: uses

    auto_retrieve --> rss: watches
    auto_retrieve --> article_location
    auto_retrieve --> cache: stores in

}

frame "Web ui" {
    [Web UI] as webui
    [HTML renderer] as html_rend
    [Pdf exporter] as pdf_rend
    [Articles] as articles
    webui --> html_rend
    webui --> pdf_rend
    webui -> articles
    articles ..> api_i
}

[Chatbot] as chatbot

chatbot ..> api_i

actor User
User ..> webui
User ..> chatbot

actor "Newspaper programmer" as newspaper_programmer
newspaper_programmer ..> newspaper: implements
@enduml

A task queue could be added later to space out requests.
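As a rough illustration of what such a queue could look like, here is a minimal sketch, assuming Rust as the implementation language. `SpacedQueue`, `push` and `pop_ready` are hypothetical names that do not come from the project. The queue releases at most one pending URL per `min_interval`.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical queue that releases at most one request per `min_interval`.
pub struct SpacedQueue {
    pending: VecDeque<String>,
    min_interval: Duration,
    last_release: Option<Instant>,
}

impl SpacedQueue {
    pub fn new(min_interval: Duration) -> Self {
        Self {
            pending: VecDeque::new(),
            min_interval,
            last_release: None,
        }
    }

    /// Queue a URL to be fetched later.
    pub fn push(&mut self, url: &str) {
        self.pending.push_back(url.to_string());
    }

    /// Return the next URL if the spacing delay has elapsed at `now`.
    pub fn pop_ready(&mut self, now: Instant) -> Option<String> {
        if let Some(last) = self.last_release {
            if now.duration_since(last) < self.min_interval {
                return None; // too soon since the previous request
            }
        }
        let url = self.pending.pop_front()?;
        self.last_release = Some(now);
        Some(url)
    }
}
```

Passing `now` explicitly keeps the sketch free of sleeping and easy to test. A real worker would call `pop_ready(Instant::now())` in a loop or from a timer.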

Implementation plan

Phase I

  • Newspaper interface: used to retrieve articles from newspaper websites
  • Minimal chatbot (uses the libraries directly)
  • ArticleLocation: library for managing several Newspaper implementations and retrieving an article from a given URL
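The relationship between the Newspaper interface and ArticleLocation could be sketched as follows, assuming Rust as the implementation language. The method names (`handles`, `retrieve`), the `StubNewspaper` type and the string-based error are illustrative assumptions, not the project's actual API. The stub performs no network access.

```rust
/// Hypothetical interface that each newspaper backend implements.
pub trait Newspaper {
    /// Human-readable name of the newspaper.
    fn name(&self) -> &str;
    /// Whether this newspaper can handle the given URL.
    fn handles(&self, url: &str) -> bool;
    /// Retrieve the article; a real implementation would perform HTTP requests.
    fn retrieve(&self, url: &str) -> Result<String, String>;
}

/// Toy implementation that matches URLs on a host name and fetches nothing.
pub struct StubNewspaper {
    pub name: String,
    pub host: String,
}

impl Newspaper for StubNewspaper {
    fn name(&self) -> &str {
        &self.name
    }

    fn handles(&self, url: &str) -> bool {
        // Naive host check: compare the first path segment after the scheme.
        url.strip_prefix("https://")
            .map(|rest| rest.split('/').next() == Some(self.host.as_str()))
            .unwrap_or(false)
    }

    fn retrieve(&self, url: &str) -> Result<String, String> {
        Ok(format!("placeholder content for {} from {}", url, self.name))
    }
}

/// Holds several newspapers and picks the one matching a URL.
pub struct ArticleLocation {
    newspapers: Vec<Box<dyn Newspaper>>,
}

impl ArticleLocation {
    pub fn new(newspapers: Vec<Box<dyn Newspaper>>) -> Self {
        Self { newspapers }
    }

    /// Find the newspaper that handles `url` and delegate retrieval to it.
    pub fn retrieve(&self, url: &str) -> Result<String, String> {
        let newspaper = self
            .newspapers
            .iter()
            .find(|n| n.handles(url))
            .ok_or_else(|| format!("no newspaper handles {}", url))?;
        newspaper.retrieve(url)
    }
}
```

With this shape, a client such as the minimal chatbot only needs an `ArticleLocation` and a URL. Supporting a new newspaper means adding one more `Newspaper` implementation to the list.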

Phase II

  • Article Representation: a (beta) unified representation for downloaded articles
    • adding this representation to Newspaper
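One possible shape for such a representation is sketched below, assuming Rust as the implementation language. The field names, the `headline` helper and the stub are hypothetical and only show how a Newspaper could return a shared type instead of raw content.

```rust
/// Hypothetical unified representation of a downloaded article.
#[derive(Debug, Clone, PartialEq)]
pub struct Article {
    pub title: String,
    pub authors: Vec<String>,
    pub source_url: String,
    /// Article body, assumed here to be kept as HTML.
    pub content_html: String,
}

impl Article {
    /// One-line description usable by any client (chatbot, web UI, ...).
    pub fn headline(&self) -> String {
        if self.authors.is_empty() {
            self.title.clone()
        } else {
            format!("{} ({})", self.title, self.authors.join(", "))
        }
    }
}

/// Hypothetical Newspaper interface once it returns the shared representation.
pub trait Newspaper {
    fn retrieve(&self, url: &str) -> Result<Article, String>;
}

/// Stub used only to show the shape of an implementation; it fetches nothing.
pub struct StubNewspaper;

impl Newspaper for StubNewspaper {
    fn retrieve(&self, url: &str) -> Result<Article, String> {
        Ok(Article {
            title: "Placeholder title".to_string(),
            authors: vec!["A. Writer".to_string()],
            source_url: url.to_string(),
            content_html: "<p>Placeholder body</p>".to_string(),
        })
    }
}
```

Because every newspaper returns the same `Article` type, renderers and exporters can be written once against that type rather than per newspaper.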

Phase III

  • Cache
  • Atom/RSS adapters
  • Automatic retrieval
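How these three pieces might fit together is sketched below, assuming Rust as the implementation language. `Cache`, `retrieve_new` and the `fetch` callback are hypothetical names. The feed adapter is assumed to have already turned an Atom/RSS feed into a list of article URLs, and the cache is an in-memory map standing in for the planned cache database.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory cache keyed by article URL.
#[derive(Default)]
pub struct Cache {
    entries: HashMap<String, String>,
}

impl Cache {
    pub fn contains(&self, url: &str) -> bool {
        self.entries.contains_key(url)
    }

    pub fn store(&mut self, url: &str, content: String) {
        self.entries.insert(url.to_string(), content);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

/// One pass of automatic retrieval: fetch every feed entry that is not
/// cached yet and store it. Returns the number of newly stored articles.
pub fn retrieve_new<F>(feed_urls: &[&str], cache: &mut Cache, mut fetch: F) -> usize
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut stored = 0;
    for &url in feed_urls {
        if cache.contains(url) {
            continue; // already retrieved during an earlier pass
        }
        // Failed fetches are skipped so that a later pass can retry them.
        if let Ok(content) = fetch(url) {
            cache.store(url, content);
            stored += 1;
        }
    }
    stored
}
```

Taking `fetch` as a closure keeps the sketch independent of any HTTP client. In the planned architecture this role would be played by the article location component.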

Phase IV

  • API
  • Chatbot (uses the API)

Phase V

  • Web UI