Add docs into the repo #756
Conversation
Force-pushed the branch from 5ab8ab6 to 27f5f39 (compare).
Great work so far. 👍 What do you think about having the website on an orphan branch? We should also put the name change of Sam on the todo list (#543 (comment)).
Using a different repo would be even cleaner, but then it is harder to do coordinated changes (e.g. updating the tutorial to a new API), as Git & GitHub do not really support this. Plus “out of sight, out of mind” applies. Since the goal of this effort is to make updating the docs simpler, I think using the same branch is probably the best choice here. Zola is essentially zero-config (it only really requires setting the site URL). Opened #757 for the todos.
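To illustrate the zero-config claim, here is a minimal sketch of such a setup; the URL and the two flags are illustrative, not copied from this PR:

```sh
# Scaffold the site, then write a minimal config.toml.
zola init docs
cat > docs/config.toml <<'EOF'
# The site URL is the only value Zola really needs.
base_url = "https://simplepie.org"
# Everything else can stay disabled for now.
compile_sass = false
build_search_index = false
EOF
```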
Force-pushed the branch from 8094df0 to ef48853 (compare).
@mblaney The last remaining question content-wise is what to do with the demo:
We should also decide on the host we want to use:
Thanks @jtojnar. I wouldn't go for the subdomain option because that involves finding someone who can do DNS changes. It looks like the domain is owned by Automattic; if someone here wants to help make that happen, hopefully they will jump in, otherwise I would say continue with the other options. By process of elimination that means staying with the current host and either removing the demo or possibly using the GitHub Action you've suggested. Happy for you to decide on that.
@mblaney If we stay with the current host, keeping the demo working is not that hard. Do you have FTP or SSH credentials for the host? Depending on that, we will need to choose either https://github.com/marketplace/actions/ftp-deploy or https://github.com/marketplace/actions/web-deploy-anything. And either way, we will need to set up credentials on GitHub: https://docs.github.com/en/actions/security-guides/encrypted-secrets#creating-encrypted-secrets-for-a-repository
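For the secrets part, one way to do it from the command line is with the `gh` CLI; a sketch, where the secret names are placeholders that would have to match whatever the chosen deploy action expects:

```sh
# Store the FTP credentials as encrypted repository secrets.
# Secret names are placeholders; use the names the deploy action expects.
gh secret set FTP_SERVER --repo simplepie/simplepie --body 'ftp.example.org'
gh secret set FTP_USERNAME --repo simplepie/simplepie --body 'deploy'
gh secret set FTP_PASSWORD --repo simplepie/simplepie  # prompts for the value
```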
I have FTP credentials but I'm not a project owner, so not sure how far I will get. Let me know when you're ready and I can try adding the credentials.
Force-pushed the branch from 9559027 to 3ed0fc1 (compare).
I have tested this on my repo and it seems to work well, including the demo: http://simplepie.ogion.cz/. So it should be ready now. @mblaney Now you should:
Nice one @jtojnar, your demo site looks great! I get a 404 for that Actions URL though?
@mblaney This is what I see in my fork: (screenshot) Or maybe we need someone with member status on the repo?
@jtojnar I noticed some styling errors on the API Docs page. Take a look at the left sidebar: http://simplepie.ogion.cz/api/ This is how it looks atm: http://simplepie.org/api/
Force-pushed the branch from 2ba363f to 5feb86f (compare).
@Art4 Tweaked the style, should be fixed now.
Thank you @jtojnar. I also noted some other things:
Force-pushed the branch from bb9a312 to 9818cd8 (compare).
One concern would be increased repo size:
Methodology
Then I ran … For the compressed sizes, I ran … This is not that drastic, but it would be a permanent cost going forward, so perhaps we should store the backup somewhere else. Also I noticed just …
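The exact commands were lost above; for anyone repeating the measurement, a generic sketch using standard Git plumbing (not necessarily what was used here):

```sh
# Working-tree size of the docs, and repository object size.
du -sh docs/
git count-objects -vH             # loose + packed object sizes
git gc --aggressive --prune=now   # repack so the numbers reflect compression
git count-objects -vH
```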
@mblaney Actually, it looks like WordPress supports export without the need to access the database. Are you able to log into the administration and get the export from http://simplepie.org/blog/wp-admin/export.php? And are you able to download the contents of the FTP server and upload it as an archive here?
Working on this now. Sorry for the delay.
Backing up the …
@skyzyx You can also try uploading the following PHP script and running it to create an archive on the server. It should be much faster than downloading individual files:

```php
<?php

set_time_limit(0);
error_reporting(E_ALL);

$archive = __DIR__ . '/simplepie_website_backup.zip';
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== true) {
    exit('Could not create the zip archive.');
}
echo 'Creating zip archive<br>';
$directory = new \RecursiveDirectoryIterator(
    // Or change the directory path.
    __DIR__,
    FilesystemIterator::KEY_AS_PATHNAME | FilesystemIterator::CURRENT_AS_FILEINFO | FilesystemIterator::SKIP_DOTS
);
$iterator = new \RecursiveIteratorIterator($directory);
foreach ($iterator as $info) {
    // Skip the archive itself so it is not added to the zip it is writing.
    if ($info->getPathname() === $archive) {
        continue;
    }
    echo 'Adding ' . $info->getPathname() . '<br>';
    // Store entries under paths relative to this directory.
    $zip->addFile($info->getPathname(), substr($info->getPathname(), strlen(__DIR__) + 1));
}
$zip->close();
echo 'Finished<br>';
```
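Assuming the script was uploaded to the web root, the archive ends up next to it and can be pulled over plain HTTP; both files should then be deleted over FTP so they do not stay publicly reachable:

```sh
# Download the archive created by the script above (path assumed from
# the script; adjust if it was uploaded elsewhere).
curl -LO http://simplepie.org/simplepie_website_backup.zip
```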
It took some time to finish the download, but it finally completed at 7.4 GiB. I removed the cache files, then tarred and gzipped the directory.
@mblaney I wanted to use the files from the backup as a base for generating the content, since the scraping is not perfect. But since I do not have FTP access, I cannot get at the backup. Could you please re-upload it somewhere publicly available? Also, do you have access to the WordPress administration? The export would be helpful for a similar reason. Otherwise, if the WordPress installation is too broken, could you try getting a database dump, e.g. by uploading a tool like Adminer and exporting the database using the credentials from the wp-config.php file?
hi @jtojnar the backup is just the WordPress install, so there's nothing usable like that, is there? It appears to be too broken to log in, and I don't want to re-upload it because it contains login credentials (even though I can't use them). I can try the database dump if you like, but I'm not sure that will provide anything better than scraping?
@mblaney IIRC the wiki system stores the content in the data directory, so that is the main thing I am after. The issue with scraping is that it is incomplete – there are some pages missing or returning error 500. I managed to get some of them out of the Internet Archive, but a DB dump would be preferred, since we can never be certain the scrape is complete.
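If shell access turns out to be available, a dump can be taken directly instead of uploading Adminer; a sketch, where host, user and database name are placeholders to be read off the `DB_*` constants in wp-config.php:

```sh
# Dump the WordPress database and compress it (placeholder credentials;
# --password with no value prompts interactively).
mysqldump --host=localhost --user=wp_user --password wp_database > simplepie_blog_dump.sql
gzip -9 simplepie_blog_dump.sql
```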
This was changed in simplepie#745 but without any rationale. The only HTML file is in tests and that should not be manually edited at all.
Ran the following within `nix-shell -I 'nixpkgs=channel:nixos-unstable' -p zola` to create the website tree:

```sh
zola init docs
```

Filled in the website URL and disabled everything for now. Then created templates based on the successive commits.
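For orientation, this is the standard tree `zola init` scaffolds (Zola's default layout, not specific to this PR):

```sh
# docs/
# ├── config.toml   # site configuration
# ├── content/      # pages and sections
# ├── sass/         # optional Sass sources
# ├── static/       # copied verbatim into the output
# ├── templates/    # Tera templates
# └── themes/       # optional themes
```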
There are only two markdown files and both use 2 spaces.
Ran the following within `nix-shell -I 'nixpkgs=channel:nixos-unstable' -p wget2 yq-go dos2unix 'python3.withPackages (ps: with ps; [ beautifulsoup4 pypandoc ])' nodePackages.prettier`:

```sh
# Download the website contents from the web, and the pages that fail
# with error 500 from Internet Archive.
wget2 --user-agent 'Mozilla/5.0 (X11; Linux x86_64; rv:106.0) Gecko/20100101 Firefox/106.0' \
    --mirror --force-directories --no-robots --retry-on-http-error=403 \
    --http2-request-window=1 --random-wait \
    --exclude-directories=/wiki/lib/exe/ http://simplepie.org
rm simplepie.org/blog/2006/03/06/forums-powered-by-punbb/index.html
wget2 https://web.archive.org/web/20190404091911/simplepie.org/blog/2006/03/06/forums-powered-by-punbb/ --directory-prefix=simplepie.org/blog/2006/03/06/forums-powered-by-punbb
rm simplepie.org/blog/2012/10/30/simplepie-1-3-1-is-now-available/index.html
sed -i 's/\xbb//' docs/content/blog/2006-03-06-forums-powered-by-punbb.html # fix encoding
wget2 https://web.archive.org/web/20210812123158/https://simplepie.org/blog/2012/10/30/simplepie-1-3-1-is-now-available/ --directory-prefix=simplepie.org/blog/2012/10/30/simplepie-1-3-1-is-now-available

# Copy the downloaded contents into the website tree.
cp -r simplepie.org/* docs/content
mkdir -p docs/static
mv docs/content/{scripts,favicon.ico,images,css,robots.txt} docs/static

# Standardize line endings.
dos2unix docs/**

# Drop API docs, we will generate them later.
rm -r docs/content/api
# Drop mint (analytics), it is abandoned.
rm -r docs/content/mint
# Drop dynamically generated demo pages.
rm -r docs/content/demo/newsblocks
# Drop downloads – we will just link GitHub.
rm docs/content/downloads/*\?* docs/content/downloads/*.zip
# Drop ancient scripts, no more font replacement using Flash,
# or tricks to make PNGs transparent in IE (Sleight).
rm -r docs/static/css/sIFR-* docs/static/scripts/

# Download headers explicitly since they are currently rotated by PHP
# and wget was not able to find them.
rm docs/static/images/headers/rotate-old.php
wget http://simplepie.org/images/headers/rotate-xspf.xml --directory-prefix docs/static/images/headers/
cat docs/static/images/headers/rotate-xspf.xml | yq -p=xml '"http://simplepie.org" + .playlist.trackList.track[].location' | xargs wget --directory-prefix docs/static/images/headers/

# Drop WordPress plug-in clutter.
rm -r docs/content/blog/wp-{content,includes,json}
# Drop feeds.
rm docs/content/blog/**/feed/index.html
rmdir docs/content/blog/**/feed

# Drop wiki noise.
rm docs/content/wiki/lib/exe/css.php?*
mv docs/content/wiki/lib/tpl/simplepie/wikistyles.css docs/static/css/
mv docs/content/wiki/lib/images/smileys/icon_exclaim.gif docs/static/images/
rm -r docs/content/wiki/lib
rm docs/content/wiki/feed.php*
rm -r docs/content/wiki/{_detail,_export}
find docs/content/wiki/ -name '*\?idx=*' -exec rm '{}' \;
find docs/content/wiki/ -name '*\?do=*' -exec rm '{}' \;
rm docs/content/wiki/_media/wiki/dokuwiki-128.png docs/content/wiki/wiki/dokuwiki
rmdir docs/content/wiki/wiki
mv 'docs/content/wiki/_media/tutorial/update_simplepie_cache.jpg?cache=' 'docs/content/wiki/_media/tutorial/update_simplepie_cache.jpg'
rm docs/content/wiki/_media/tutorial/update_simplepie_cache.jpg\?*

# Add extension to wiki pages.
find docs/content/wiki -type f ! -name '*.jpg' ! -name '*.html' -print0 | xargs -0 -I '{}' mv '{}' '{}.html'

# Remove duplicate wiki page.
rm docs/content/wiki/faq/Supported_Character_Encodings.html
echo /wiki/faq/Supported_Character_Encodings /wiki/faq/supported_character_encodings >> docs/static/_redirects
rm docs/content/wiki/plugins/wordpress/simplepie_plugin_for_wordpress.1.html

# Rename start files (used as directory index in DokuWiki) to _index.html used by Zola.
find docs/content/wiki/ -name start.html | sed -E 's#(docs/content/(.*))/start.html#mv "\0" "\1/_index.html"; echo "/\2/start /\2/" >> docs/static/_redirects#g' | sh -

# Simplify blog structure.
rm -r docs/content/blog/page docs/content/blog/index.html
find docs/content/blog/2* -name index.html | sed -E 's#docs/content/blog/(....)/(..)/(..)/(.*)/index.html#mv "\0" "docs/content/blog/\1-\2-\3-\4.html"#g' | sh -
rmdir docs/content/blog/*/*/*/*
rmdir docs/content/blog/*/*/*
rmdir docs/content/blog/*/*
rmdir docs/content/blog/????
ls docs/content/blog/*.html | sed -E 's#docs/content/blog/(....)-(..)-(..)-(.+)\.html#/blog/\1/\2/\3/\4/ /blog/\4/#g' >> docs/static/_redirects

# Prepare redirects for Apache.
sed -i 's/^/Redirect 302 /' docs/static/_redirects
mv docs/static/{_redirects,.htaccess}

# Manually extracted main template into templates/.
```
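Not part of the migration script itself, but a quick way to sanity-check the result after all the moves and renames (assuming a Zola version that ships the `check` subcommand):

```sh
# Build the site without rendering and verify that links still resolve.
cd docs
zola check
```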
Produced by markdownify.py
They are big and many are outdated.
So that Zola does not complain about being broken once we remove the wiki.
I downloaded the HTML from the live site and created a script (`markdownify.py`) that converts it to Markdown. This PR attempts to make everything work; we will remove/update the outdated content in a follow-up PR.

To regenerate the Markdown files, run `python3 markdownify.py` in the SimplePie branch after installing `python3-beautifulsoup4`, `python3-pypandoc` and `prettier`. Or, if you have Nix, you can just run `nix-shell -I 'nixpkgs=channel:nixos-unstable' -p 'python3.withPackages (ps: with ps; [ beautifulsoup4 pypandoc ])' nodePackages.prettier --run 'python3 markdownify.py'`.

Currently I am using Zola as the generator, as I am the most familiar with it. A different SSG can be used if preferred. To preview, run `zola serve` in the `docs` subdirectory.

TODO:
- Move the `h1` out of individual pages into the template.
- … the `docs/` directory.

Fixes: #543